DOC-1581 inference provider production path #2348
Fix awkward phrasing, informal terms, and unclear word choices across inference-provider.mdx, model-serving.mdx, and gpu-hpa-dcgm.mdx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
✅ Deploy Preview for vcluster-docs-site ready!
Remaining work and known gaps from DOC-1581
This PR covers Phases 1–3 of the issue. The items below are either out of scope for this PR or deferred.
Done in this PR
Not done — cross-repo (separate PRs needed)
Not done — vcluster-docs (future issues)
Reciprocal dependency: vmetal-docs#29 (DOC-1581 Phase 4)
The vMetal half of this inference provider journey is now up in loft-sh/vmetal-docs#29 (DOC-1581 Phase 4). It adds an "Inference provider capacity" section to the GPU fleet ops page and cross-links the vMetal docs back to the inference provider production path created here.
Merge order: vmetal-docs#29 links to This PR already links into the vMetal pages (
Retarget the Day 0 and Day 2 vMetal cross-links to the new #inference-provider-capacity anchor added in vmetal-docs#29, and tie the endpoint readiness warm-pool guidance to vMetal's hardware-layer warm pool against on-demand provisioning. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Content Description
Adds the Inference Provider production path for teams building managed model-serving endpoints on GPU infrastructure. Includes a new production guide, a model-serving runtimes integration page, GPU and inference autoscaling patterns (KEDA and Prometheus), a GPU Operator install guide inside tenant clusters, and a readability pass on the new content.
Preview Link
Internal Reference
Partially addresses DOC-1581
AI review: mention @claude in a comment to request a review or changes. See CONTRIBUTING.md for available commands.
@netlify /docs