docs: Document min_cuda_version parameter for Flash GPU endpoints #593

Draft
promptless[bot] wants to merge 1 commit into main from promptless/flash-min-cuda-version

promptless bot commented Mar 24, 2026

Open this suggestion in Promptless to view citations and reasoning process

Documents the new min_cuda_version parameter for Flash endpoints. GPU endpoints now default to CUDA 12.8 to ensure workers run on hosts with recent drivers. Users can override this value to allow older hosts if needed. CPU endpoints are unaffected.
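To illustrate what a minimum CUDA version means for host selection, here is a minimal sketch. It is not Flash's actual scheduler: the `eligible_hosts` helper and the host inventory are hypothetical, and only the `min_cuda_version` name and the `"12.8"` default come from this PR.

```python
from typing import Dict, List, Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "12.8" into (12, 8) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def eligible_hosts(hosts: Dict[str, str], min_cuda_version: Optional[str]) -> List[str]:
    """Return the names of hosts that meet the minimum (hypothetical helper)."""
    if min_cuda_version is None:  # e.g. a CPU endpoint: no CUDA constraint
        return sorted(hosts)
    floor = parse_version(min_cuda_version)
    return sorted(
        name for name, version in hosts.items() if parse_version(version) >= floor
    )


# Hypothetical inventory: host name -> highest CUDA version its driver supports.
hosts = {"host-a": "12.4", "host-b": "12.8", "host-c": "13.0"}

default_pool = eligible_hosts(hosts, "12.8")  # the new GPU default
relaxed_pool = eligible_hosts(hosts, "12.4")  # override that allows older hosts
```

Lowering the minimum can only widen the pool of candidate hosts, which is the trade-off behind overriding the default. Comparing parsed tuples rather than raw strings avoids ordering mistakes such as treating "12.10" as older than "12.8".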

Trigger Events

runpod/flash PR #277: feat: default GPU endpoints to minCudaVersion 12.8

Promptless Research (5 files, 1 GitHub PR)

  • Reviewed runpod/flash#277
  • Viewed .long_term_context/product_knowledge/product_overview.md
  • Viewed .long_term_context/doc_workflow/client_instructions.md
  • Viewed .long_term_context/style/client_style_guide.md
  • Viewed flash/configuration/parameters.mdx
  • Viewed flash/create-endpoints.mdx

Agent Response

I've created a documentation suggestion for PR #277. Here's a summary of the changes:

Documentation Updates

Updated flash/configuration/parameters.mdx to document the new min_cuda_version parameter:

  1. Parameter overview table - Added the new parameter with its type, description, and default values ("12.8" for GPU endpoints, None for CPU endpoints)

  2. Detailed parameter section - Added a new section explaining:

    • What the parameter does (specifies minimum CUDA driver version for host selection)
    • Code examples showing default usage and how to override it
    • Note that CPU endpoints are unaffected
    • Valid CUDA version values with validation behavior
  3. Configuration change behavior - Updated the "Changes that recreate workers" list to include min_cuda_version, since changing this value triggers worker recreation
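The worker-recreation rule in item 3 can be pictured as a comparison over a set of sensitive fields. This is a sketch under stated assumptions: `RECREATE_FIELDS`, `needs_worker_recreation`, and the `label` field are hypothetical, and the only field this PR confirms as triggering recreation is `min_cuda_version`.

```python
from typing import Any, Dict

# Assumption: the real "Changes that recreate workers" list has more entries;
# only min_cuda_version is confirmed by this PR.
RECREATE_FIELDS = frozenset({"min_cuda_version"})


def needs_worker_recreation(old: Dict[str, Any], new: Dict[str, Any]) -> bool:
    """Return True if any recreation-sensitive field differs between configs."""
    return any(old.get(field) != new.get(field) for field in RECREATE_FIELDS)


# "label" is a made-up field used only to show a change that is ignored here.
current = {"label": "demo", "min_cuda_version": "12.8"}
relabeled = {"label": "demo-v2", "min_cuda_version": "12.8"}
relaxed = {"label": "demo", "min_cuda_version": "12.4"}

recreate_on_relabel = needs_worker_recreation(current, relabeled)
recreate_on_cuda_change = needs_worker_recreation(current, relaxed)
```

In this sketch only the second comparison reports a recreation, mirroring the documented behavior that changing `min_cuda_version` replaces running workers.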

Suggestion URL: https://app.gopromptless.ai/suggestions/9221ecfe-41a2-4b11-a339-b07ab547d343


(Line 474)

Citation: New min_cuda_version parameter added to Endpoint class in src/runpod_flash/endpoint.py. Default value of "12.8" set in src/runpod_flash/core/resources/serverless.py. CPU endpoints clear this value via _sync_cpu_fields() in serverless_cpu.py.
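The defaulting behavior described in this citation might look roughly like the following. This is a self-contained model, not the `runpod_flash` source: `EndpointSketch` and its `is_cpu` flag are hypothetical stand-ins, and the CPU branch only imitates what `_sync_cpu_fields()` is described as doing.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_MIN_CUDA_VERSION = "12.8"  # default value stated in the PR


@dataclass
class EndpointSketch:
    """Hypothetical stand-in for an endpoint configuration object."""

    name: str
    is_cpu: bool = False  # assumption: the real classes distinguish CPU differently
    min_cuda_version: Optional[str] = DEFAULT_MIN_CUDA_VERSION

    def __post_init__(self) -> None:
        # CPU endpoints carry no CUDA requirement, so the value is cleared.
        if self.is_cpu:
            self.min_cuda_version = None


gpu_default = EndpointSketch(name="gpu-default")
gpu_override = EndpointSketch(name="gpu-older-hosts", min_cuda_version="12.4")
cpu_only = EndpointSketch(name="cpu-only", is_cpu=True)
```

The three instances cover the cases the PR description names: the GPU default, an explicit override, and a CPU endpoint where the value is cleared.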

(Line 500)

Citation: Valid CUDA versions are validated against the CudaVersion enum via validate_min_cuda_version() in serverless.py. The error message format and validation logic are defined in the PR.
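Enum-backed validation of this kind could be sketched as below. The enum members, the function name `check_min_cuda_version`, and the error message wording are all assumptions for illustration; the actual `CudaVersion` members and message format are defined in the PR and are not reproduced here.

```python
from enum import Enum
from typing import Optional


class CudaVersionSketch(str, Enum):
    """Hypothetical subset of accepted versions; the real enum may differ."""

    V12_4 = "12.4"
    V12_8 = "12.8"


def check_min_cuda_version(value: Optional[str]) -> Optional[str]:
    """Accept None or a known version string; reject anything else."""
    if value is None:  # CPU endpoints carry no CUDA requirement
        return None
    allowed = [member.value for member in CudaVersionSketch]
    if value not in allowed:
        # Assumed message format, not the one used by the library.
        raise ValueError(
            f"Invalid min_cuda_version {value!r}; expected one of: {', '.join(allowed)}"
        )
    return value
```

Validating against a closed set of values means a typo such as `"12.80"` fails when the endpoint is configured rather than silently matching no hosts at deploy time.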

mintlify bot commented Mar 24, 2026

Preview deployment for your docs.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| runpod-docs | 🟢 Ready | View Preview | Mar 24, 2026, 5:01 PM |
