fix(vram-display): clarify VRAM UX#892
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughIntroduces a unified VRAM accounting model distinguishing three concepts: rated VRAM (user-facing capacity class), system-reported VRAM (raw driver bytes), and allocatable VRAM (system minus reserved). New utility modules in Rust ( ChangesVRAM Accounting Model
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
2b6d186 to
f1245a6
Compare
f1245a6 to
3bcb389
Compare
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (1)
crates/mesh-llm-host-runtime/src/runtime/mod.rs (1)
10369-10387: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winCover non-zero pinned-GPU reserves in this integration test.
This assertion now includes
reserved_bytes, but it only verifies theNonepath. Setting one synthetic GPU reserve toSome(...)would lock the newGpuFacts.reserved_bytes→StartupPinnedGpuTargetpropagation and allocatable-capacity behavior.Proposed test strengthening
fn pinned_gpu_startup_preflight_uses_config_gpu_id() { + const RESERVED_BYTES: u64 = 1024 * 1024 * 1024; + let options = runtime_options_for_test(&["mesh-llm"]); let config = plugin::MeshConfig { gpu: plugin::GpuConfig { assignment: plugin::GpuAssignment::Pinned, parallel: None, @@ - let gpus = vec![ + let mut gpus = vec![ synthetic_gpu(0, Some("pci:0000:65:00.0"), Some("CUDA0")), synthetic_gpu(1, Some("pci:0000:b3:00.0"), Some("CUDA1")), ]; + gpus[0].reserved_bytes = Some(RESERVED_BYTES); preflight_config_owned_startup_models_with_gpus(&config, &specs, &mut plans, &gpus, None) .unwrap(); @@ stable_id: "pci:0000:65:00.0".into(), backend_device: "CUDA0".into(), vram_bytes: 24_000_000_000, - reserved_bytes: None, + reserved_bytes: Some(RESERVED_BYTES), }) ); + assert_eq!( + plans[0] + .pinned_gpu + .as_ref() + .expect("pinned GPU should be populated") + .allocatable_vram_bytes(), + 24_000_000_000 - RESERVED_BYTES + ); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/mesh-llm-host-runtime/src/runtime/mod.rs` around lines 10369 - 10387, The test in the preflight_config_owned_startup_models_with_gpus integration test currently only verifies the case where reserved_bytes is None. To strengthen the test coverage, modify one of the synthetic_gpu calls to include a non-zero reserved value (using Some(...) instead of None for the reserve parameter), and then add a corresponding assertion to verify that the reserved_bytes field in the StartupPinnedGpuTarget is correctly propagated from the GpuFacts.reserved_bytes field. This will ensure the integration test covers both the None and Some paths for GPU reserves.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/mesh-llm-ui/src/features/configuration/components/NodeSection.tsx`:
- Line 59: The VRAMBar component has been updated to use system totals via
nodeSystemTotalGB(node), but containerFreeGB and related capacity calculations
still use rated totals, causing inconsistency between what the bar displays and
what the free capacity math shows. Replace all instances where rated total
values are used for calculating containerFreeGB and other capacity-related
fields with the system total value obtained from nodeSystemTotalGB(node). This
needs to be done in multiple locations throughout the NodeSection component (at
lines 59, 85-89, 211, and 251 as indicated) to ensure all capacity calculations
are consistent with the bar's system-based display.
In `@crates/mesh-llm-ui/src/features/configuration/lib/config-math.ts`:
- Around line 84-88: The newly added containerAllocatableGB function properly
handles allocatable capacity calculation including the pooled placement case,
but the existing containerAvailableGB function still uses the old calculation of
containerTotalGB minus containerReservedGB, which can overstate available
capacity. Update the containerAvailableGB function to use containerAllocatableGB
instead of the subtraction-based calculation to ensure accurate capacity
reporting. This fix should also be applied to any other related functions
mentioned in the review that perform similar capacity calculations.
- Around line 35-37: The gpuAllocatableGB function treats an explicit
allocatableGB value of 0 as missing by checking if explicit is greater than 0,
causing it to fall back to the computed fallback value. Instead of checking if
explicit is greater than 0, check whether allocatableGB was actually provided
(by verifying it is not NaN or checking for explicit presence) to preserve the
API contract that an explicit 0 is a valid value that should be respected and
not overridden with the computed systemTotalGB minus reservedGB fallback.
In
`@crates/mesh-llm-ui/src/features/dashboard/components/details/NodeSidebar.tsx`:
- Line 48: The React key generation for GPU rows in NodeSidebar is producing
duplicate keys when multiple GPUs share the same name and bandwidth but have
different or missing vram_bytes values. Locate the places where GPU row keys are
being generated (around the gpus array rendering and any other related GPU row
renders mentioned in the comment), and update the key generation logic to
include the array index or another unique identifier alongside the existing
properties (name, bandwidth, etc.) to ensure each GPU row has a stable, unique
key even when vram_bytes is optional or absent.
In `@crates/mesh-llm-ui/src/lib/vram.ts`:
- Around line 70-74: The gpuAllocatableVramGB function currently loses explicit
allocatable_vram_bytes values of 0 because decimalVramGBFromBytes treats 0 as
missing. Instead of relying on the return value of decimalVramGBFromBytes to
determine if an explicit value was provided, check whether
gpu.allocatable_vram_bytes is explicitly set (not null or undefined) before
conversion, and only fall back to the computed allocatable value when the
explicit bytes value is not provided.
---
Nitpick comments:
In `@crates/mesh-llm-host-runtime/src/runtime/mod.rs`:
- Around line 10369-10387: The test in the
preflight_config_owned_startup_models_with_gpus integration test currently only
verifies the case where reserved_bytes is None. To strengthen the test coverage,
modify one of the synthetic_gpu calls to include a non-zero reserved value
(using Some(...) instead of None for the reserve parameter), and then add a
corresponding assertion to verify that the reserved_bytes field in the
StartupPinnedGpuTarget is correctly propagated from the GpuFacts.reserved_bytes
field. This will ensure the integration test covers both the None and Some paths
for GPU reserves.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 63991583-73fd-4fd6-bdf6-1c20cfc7f662
⛔ Files ignored due to path filters (2)
docs/specs/assets/vram-configuration-rated.pngis excluded by!**/*.pngdocs/specs/assets/vram-dashboard-rated.pngis excluded by!**/*.png
📒 Files selected for processing (25)
crates/mesh-llm-commands/src/gpus.rscrates/mesh-llm-host-runtime/src/api/status.rscrates/mesh-llm-host-runtime/src/runtime/local.rscrates/mesh-llm-host-runtime/src/runtime/mod.rscrates/mesh-llm-system/src/lib.rscrates/mesh-llm-system/src/vram.rscrates/mesh-llm-ui/src/features/app-shell/lib/status-helpers.test.tscrates/mesh-llm-ui/src/features/app-shell/lib/status-helpers.tscrates/mesh-llm-ui/src/features/app-shell/lib/status-types.tscrates/mesh-llm-ui/src/features/app-shell/lib/topology-types.tscrates/mesh-llm-ui/src/features/app-tabs/types.tscrates/mesh-llm-ui/src/features/configuration/api/config-adapter.test.tscrates/mesh-llm-ui/src/features/configuration/api/config-adapter.tscrates/mesh-llm-ui/src/features/configuration/components/NodeSection.tsxcrates/mesh-llm-ui/src/features/configuration/lib/config-math.test.tscrates/mesh-llm-ui/src/features/configuration/lib/config-math.tscrates/mesh-llm-ui/src/features/dashboard/components/details/NodeSidebar.tsxcrates/mesh-llm-ui/src/features/network/api/status-adapter.test.tscrates/mesh-llm-ui/src/features/network/api/status-adapter.tscrates/mesh-llm-ui/src/features/reserves/pages/ReservesPage.test.tsxcrates/mesh-llm-ui/src/features/reserves/pages/ReservesPage.tsxcrates/mesh-llm-ui/src/lib/api/types.tscrates/mesh-llm-ui/src/lib/vram.test.tscrates/mesh-llm-ui/src/lib/vram.tsdocs/specs/vram-accounting.md
| const [dragKey, setDragKey] = useState<string | null>(null) | ||
| const open = !collapsed | ||
| const totalNodeGB = nodeTotalGB(node) | ||
| const systemTotalNodeGB = nodeSystemTotalGB(node) |
There was a problem hiding this comment.
containerFreeGB still uses rated totals after switching bars to system totals.
VRAMBar now uses system capacity, but selectedTotalGB still uses rated totals, so the selected-model free-capacity path can disagree with what the bar and fit math show.
Suggested patch
- const selectedTotalGB = node.placement === 'pooled' ? totalNodeGB : (selectedGpu?.totalGB ?? 0)
+ const selectedTotalGB =
+ node.placement === 'pooled'
+ ? systemTotalNodeGB
+ : (selectedGpu?.systemTotalGB ?? selectedGpu?.totalGB ?? 0)Also applies to: 85-89, 211-211, 251-251
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/mesh-llm-ui/src/features/configuration/components/NodeSection.tsx` at
line 59, The VRAMBar component has been updated to use system totals via
nodeSystemTotalGB(node), but containerFreeGB and related capacity calculations
still use rated totals, causing inconsistency between what the bar displays and
what the free capacity math shows. Replace all instances where rated total
values are used for calculating containerFreeGB and other capacity-related
fields with the system total value obtained from nodeSystemTotalGB(node). This
needs to be done in multiple locations throughout the NodeSection component (at
lines 59, 85-89, 211, and 251 as indicated) to ensure all capacity calculations
are consistent with the bar's system-based display.
Summary
rated_vram_gbandallocatable_vram_bytesin GPU JSON/status payloadsdocs/specs/vram-accounting.mdRoot cause
Driver/runtime byte counts were being formatted directly on user-facing surfaces, so rated 32 GB class hardware could appear as smaller-looking values such as 30.15. The code also lacked one shared vocabulary for rated, system-reported, reserved, and allocatable VRAM.
Display examples
CLI human output now shows rated VRAM:
CLI/API JSON keeps separate fields for display and fit math:
{ "vram_bytes": 40200896512, "rated_vram_gb": 40, "reserved_bytes": null, "allocatable_vram_bytes": 40200896512 }A 32 GB-class card with a 1 GiB reserve is represented distinctly:
{ "vram_bytes": 32000000000, "rated_vram_gb": 32, "reserved_bytes": 1073741824, "allocatable_vram_bytes": 30926258176 }Console display examples from a mocked 32 GB card whose legacy UI value would have been
30.15:Mesh VRAM 32.0 GB; the raw30.15value is absent.32 GB VRAM,30.9 GB usable, andreserved 1.1; fit/reserve math still uses allocatable capacity.Screenshots
Dashboard rated VRAM:
Configuration allocatable/reserved display:
Validation
just test-allSummary by CodeRabbit
New Features
Improvements
Documentation