
fix(vram-display): clarify VRAM UX #892

Open
ndizazzo wants to merge 3 commits into main from codex/vram-display-clarify-vram-ux

Conversation

ndizazzo commented Jun 21, 2026 (Collaborator)

Summary

  • add shared Rust and UI helpers for rated, system-reported, reserved, and allocatable VRAM
  • display rated VRAM on CLI and console surfaces while keeping raw/system-reported bytes for fit math
  • expose rated_vram_gb and allocatable_vram_bytes in GPU JSON/status payloads
  • document VRAM accounting usage locations in docs/specs/vram-accounting.md

Root cause

Driver/runtime byte counts were formatted directly on user-facing surfaces, so 32 GB-class hardware could appear with a smaller-looking value such as 30.15. The code also lacked a single shared vocabulary for rated, system-reported, reserved, and allocatable VRAM.

Display examples

CLI human output now shows rated VRAM:

VRAM: 40 GB

CLI/API JSON keeps separate fields for display and fit math:

{
  "vram_bytes": 40200896512,
  "rated_vram_gb": 40,
  "reserved_bytes": null,
  "allocatable_vram_bytes": 40200896512
}

A 32 GB-class card with a 1 GiB reserve is represented distinctly:

{
  "vram_bytes": 32000000000,
  "rated_vram_gb": 32,
  "reserved_bytes": 1073741824,
  "allocatable_vram_bytes": 30926258176
}
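The relationship between the fields in the two payloads above can be sketched as follows. This is a minimal illustration, not the PR's code: the function name `allocatableBytesSketch` is hypothetical, and treating a `null` reserve as zero and clamping at zero are assumptions inferred from the examples.

```typescript
// Hypothetical sketch: allocatable = system-reported bytes minus reserved bytes.
// Assumption: a null/undefined reserve counts as 0, and the result never goes negative.
function allocatableBytesSketch(
  vramBytes: number,
  reservedBytes: number | null | undefined,
): number {
  const reserved = reservedBytes ?? 0;
  return Math.max(0, vramBytes - reserved);
}

// 40 GB-class card without a reserve: allocatable equals the system-reported bytes.
const noReserve = allocatableBytesSketch(40200896512, null);

// 32 GB-class card with a 1 GiB reserve.
const withReserve = allocatableBytesSketch(32000000000, 1073741824);

console.log(noReserve, withReserve);
```

Both results match the `allocatable_vram_bytes` values shown in the JSON examples (40200896512 and 30926258176).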

Console display examples from a mocked 32 GB card whose legacy UI value would have been 30.15:

  • Dashboard shows Mesh VRAM 32.0 GB; the raw 30.15 value is absent.
  • Configuration shows 32 GB VRAM, 30.9 GB usable, and reserved 1.1; fit/reserve math still uses allocatable capacity.

Screenshots

Dashboard rated VRAM:

[Screenshot: Dashboard showing rated 32.0 GB VRAM]

Configuration allocatable/reserved display:

[Screenshot: Configuration showing 32 GB rated VRAM with 30.9 GB usable and 1.1 reserved]

Validation

  • just test-all

Summary by CodeRabbit

  • New Features

    • Added rated VRAM reporting to GPU inventory/status and benchmark outputs, including allocatable VRAM after reservations.
    • Introduced a formal VRAM accounting model surfaced across the UI and runtime capacity logic.
  • Improvements

    • VRAM calculations now consistently use system-reported totals minus reserved capacity for “usable/allocatable” decisions.
    • VRAM display and network/mesh metrics now prefer rated VRAM when available (with updated formatting for cleaner values).
    • Configuration and topology capacity math updated to reflect system vs allocatable semantics.
  • Documentation

    • Added a VRAM accounting specification covering rated, system-reported, and reserved VRAM and their propagation.

coderabbitai Bot commented Jun 21, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: a5cedb2d-5e28-4374-a529-12ac7e3dcb17

📥 Commits

Reviewing files that changed from the base of the PR and between 5491cd5 and 9334d54.

📒 Files selected for processing (1)
  • crates/mesh-llm-ui/src/features/configuration/lib/config-math.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/mesh-llm-ui/src/features/configuration/lib/config-math.ts

📝 Walkthrough


Introduces a unified VRAM accounting model distinguishing three concepts: rated VRAM (user-facing capacity class), system-reported VRAM (raw driver bytes), and allocatable VRAM (system minus reserved). New utility modules in Rust (mesh-llm-system/src/vram.rs) and TypeScript (src/lib/vram.ts) implement rated-capacity selection via relative-error comparison. GPU JSON/API payloads gain rated_vram_gb and allocatable_vram_bytes fields. Runtime planning universally switches to allocatable VRAM for fit/split decisions. UI GPU types consolidate to GpuInfo[], display helpers prefer rated VRAM, and configuration math gains system-total and allocatable-GB functions.
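One way to read "rated-capacity selection via relative-error comparison" is sketched below. The walkthrough does not show the real `ratedVramGBFromBytes`, so everything here is an assumption: the name `ratedGbSketch`, rounding to the nearest whole unit, and preferring the decimal candidate on ties are all illustrative choices.

```typescript
const DECIMAL_GB = 1e9; // 1 GB = 10^9 bytes
const BINARY_GIB = 1024 * 1024 * 1024; // 1 GiB = 2^30 bytes

interface RatedCandidate {
  rated: number; // nearest whole capacity class in this unit
  relativeError: number; // distance from the system-reported bytes, relative to the class
}

// Hypothetical helper: nearest whole-unit class and how far off it is.
function nearestClass(systemBytes: number, unit: number): RatedCandidate {
  const rated = Math.max(1, Math.round(systemBytes / unit));
  const relativeError = Math.abs(systemBytes - rated * unit) / (rated * unit);
  return { rated, relativeError };
}

// Hypothetical sketch: pick whichever interpretation (decimal or binary) fits better.
function ratedGbSketch(systemBytes: number): number | null {
  if (!Number.isFinite(systemBytes) || systemBytes <= 0) return null;
  const decimal = nearestClass(systemBytes, DECIMAL_GB);
  const binary = nearestClass(systemBytes, BINARY_GIB);
  // Assumption: ties go to the decimal class.
  return binary.relativeError < decimal.relativeError ? binary.rated : decimal.rated;
}

console.log(ratedGbSketch(40200896512)); // decimal candidate 40 fits better than binary 37
console.log(ratedGbSketch(34359738368)); // exactly 32 GiB, so the binary candidate wins
```

Under these assumptions the PR's own example values, 40200896512 and 32000000000 bytes, resolve to 40 and 32, matching the `rated_vram_gb` fields in the description.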

Changes

VRAM Accounting Model

| Layer | File(s) | Summary |
| --- | --- | --- |
| VRAM utility core: Rust and TypeScript libraries | crates/mesh-llm-system/src/lib.rs, crates/mesh-llm-system/src/vram.rs, crates/mesh-llm-ui/src/lib/vram.ts, crates/mesh-llm-ui/src/lib/vram.test.ts, docs/specs/vram-accounting.md | New VramCapacity struct and free functions in Rust compute allocatable bytes and select the nearest rated capacity class via decimal vs binary relative-error comparison, with string formatting. TypeScript mirrors the same logic with ratedVramGBFromBytes, per-GPU field selectors (gpuRatedVramGB, gpuSystemReportedVramGB, gpuReservedVramGB, gpuAllocatableVramGB), and formatting helpers. Both include unit tests. Spec doc defines the three VRAM concepts and cross-repo mapping. |
| GPU API/JSON enrichment | crates/mesh-llm-ui/src/lib/api/types.ts, crates/mesh-llm-host-runtime/src/api/status.rs, crates/mesh-llm-commands/src/gpus.rs, crates/mesh-llm-ui/src/features/app-tabs/types.ts | GpuInfo gains optional rated_vram_gb, allocatable_vram_bytes, makes idx/total_vram_gb optional, and StatusPayload gains my_is_soc. GpuEntry.rated_vram_gb is added and build_gpus populates both rated and allocatable fields. CLI gpu_json/gpu_benchmark_json emit the same new fields; format_vram delegates to format_rated_capacity. ConfigGpu gains systemTotalGB and allocatableGB. |
| Runtime capacity planning with allocatable VRAM | crates/mesh-llm-host-runtime/src/runtime/mod.rs, crates/mesh-llm-host-runtime/src/runtime/local.rs | StartupPinnedGpuTarget gains reserved_bytes: Option<u64> and allocatable_vram_bytes() delegating to mesh_llm_system::vram::allocatable_bytes. Preflight populates reserved_bytes from the resolved GPU. All six capacity-sensitive call sites in local.rs (Skippy config, local model start, split participant wait, split load settings, replan, and fit check) switch from vram_bytes() to allocatable_vram_bytes(). Test fixtures updated with reserved_bytes: None. |
| UI GPU type consolidation | crates/mesh-llm-ui/src/features/app-shell/lib/status-types.ts, .../topology-types.ts, crates/mesh-llm-ui/src/features/dashboard/components/details/NodeSidebar.tsx | Peer.gpus, StatusPayload.gpus, and TopologyNode.gpus replace inline object-array shapes with GpuInfo[]. NodeSidebarRecord.gpus changes to GpuInfo[] and formatGpuMemory receives the full gpu object instead of gpu.vram_bytes. |
| UI VRAM display helpers | crates/mesh-llm-ui/src/features/app-shell/lib/status-helpers.ts, .../status-helpers.test.ts, crates/mesh-llm-ui/src/features/network/api/status-adapter.ts, .../status-adapter.test.ts, crates/mesh-llm-ui/src/features/reserves/pages/ReservesPage.tsx, .../ReservesPage.test.tsx | gpuInventoryVramGb and gpuTotalVramGb switch to summing gpuRatedVramGB. formatGpuMemory becomes polymorphic. VRAM source precedence across peerVramGb, meshTotalVramGb, and nodeVramGB changes to prefer GPU-derived rated totals. Tests updated to assert rated values. |
| Configuration UI GPU adaptation and VRAM math | crates/mesh-llm-ui/src/features/configuration/api/config-adapter.ts, .../config-adapter.test.ts, crates/mesh-llm-ui/src/features/configuration/lib/config-math.ts, .../config-math.test.ts, crates/mesh-llm-ui/src/features/configuration/components/NodeSection.tsx | config-adapter.ts adds adaptGpuToConfigGpu, adaptLocalStatusToConfigNode (mapping my_is_soc → memoryTopology: 'unified'), and prepends the local node to configuration.nodes. config-math.ts adds nodeSystemTotalGB, modelWeightsGB, containerAllocatableGB; rewrites nodeUsableGB to sum per-GPU allocatable GB; updates containerTotalGB and containerAvailableGB to use system totals and allocatable capacity. NodeSection.tsx passes systemTotalNodeGB and gpu.systemTotalGB to VRAMBar. Tests cover local-node inclusion, SOC topology, and system-total capacity. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 10.47% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(vram-display): clarify VRAM UX' directly addresses the main objective of the changeset, which is to improve VRAM display clarity and user experience across CLI, JSON, and console interfaces. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


ndizazzo force-pushed the codex/vram-display-clarify-vram-ux branch from 2b6d186 to f1245a6 on June 21, 2026 20:59
ndizazzo force-pushed the codex/vram-display-clarify-vram-ux branch from f1245a6 to 3bcb389 on June 21, 2026 21:41
ndizazzo marked this pull request as ready for review June 21, 2026 21:47
ndizazzo self-assigned this Jun 21, 2026
github-actions Bot requested a review from i386 June 21, 2026 21:47

coderabbitai Bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (1)
crates/mesh-llm-host-runtime/src/runtime/mod.rs (1)

10369-10387: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Cover non-zero pinned-GPU reserves in this integration test.

This assertion now includes reserved_bytes, but it only verifies the None path. Setting one synthetic GPU reserve to Some(...) would lock in the new GpuFacts.reserved_bytes → StartupPinnedGpuTarget propagation and the allocatable-capacity behavior.

Proposed test strengthening
     fn pinned_gpu_startup_preflight_uses_config_gpu_id() {
+        const RESERVED_BYTES: u64 = 1024 * 1024 * 1024;
+
         let options = runtime_options_for_test(&["mesh-llm"]);
         let config = plugin::MeshConfig {
             gpu: plugin::GpuConfig {
                 assignment: plugin::GpuAssignment::Pinned,
                 parallel: None,
@@
-        let gpus = vec![
+        let mut gpus = vec![
             synthetic_gpu(0, Some("pci:0000:65:00.0"), Some("CUDA0")),
             synthetic_gpu(1, Some("pci:0000:b3:00.0"), Some("CUDA1")),
         ];
+        gpus[0].reserved_bytes = Some(RESERVED_BYTES);
 
         preflight_config_owned_startup_models_with_gpus(&config, &specs, &mut plans, &gpus, None)
             .unwrap();
@@
                 stable_id: "pci:0000:65:00.0".into(),
                 backend_device: "CUDA0".into(),
                 vram_bytes: 24_000_000_000,
-                reserved_bytes: None,
+                reserved_bytes: Some(RESERVED_BYTES),
             })
         );
+        assert_eq!(
+            plans[0]
+                .pinned_gpu
+                .as_ref()
+                .expect("pinned GPU should be populated")
+                .allocatable_vram_bytes(),
+            24_000_000_000 - RESERVED_BYTES
+        );
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/mesh-llm-host-runtime/src/runtime/mod.rs` around lines 10369 - 10387,
The test in the preflight_config_owned_startup_models_with_gpus integration test
currently only verifies the case where reserved_bytes is None. To strengthen the
test coverage, modify one of the synthetic_gpu calls to include a non-zero
reserved value (using Some(...) instead of None for the reserve parameter), and
then add a corresponding assertion to verify that the reserved_bytes field in
the StartupPinnedGpuTarget is correctly propagated from the
GpuFacts.reserved_bytes field. This will ensure the integration test covers both
the None and Some paths for GPU reserves.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/mesh-llm-ui/src/features/configuration/components/NodeSection.tsx`:
- Line 59: The VRAMBar component has been updated to use system totals via
nodeSystemTotalGB(node), but containerFreeGB and related capacity calculations
still use rated totals, causing inconsistency between what the bar displays and
what the free capacity math shows. Replace all instances where rated total
values are used for calculating containerFreeGB and other capacity-related
fields with the system total value obtained from nodeSystemTotalGB(node). This
needs to be done in multiple locations throughout the NodeSection component (at
lines 59, 85-89, 211, and 251 as indicated) to ensure all capacity calculations
are consistent with the bar's system-based display.

In `@crates/mesh-llm-ui/src/features/configuration/lib/config-math.ts`:
- Around line 84-88: The newly added containerAllocatableGB function properly
handles allocatable capacity calculation including the pooled placement case,
but the existing containerAvailableGB function still uses the old calculation of
containerTotalGB minus containerReservedGB, which can overstate available
capacity. Update the containerAvailableGB function to use containerAllocatableGB
instead of the subtraction-based calculation to ensure accurate capacity
reporting. This fix should also be applied to any other related functions
mentioned in the review that perform similar capacity calculations.
- Around line 35-37: The gpuAllocatableGB function treats an explicit
allocatableGB value of 0 as missing by checking if explicit is greater than 0,
causing it to fall back to the computed fallback value. Instead of checking if
explicit is greater than 0, check whether allocatableGB was actually provided
(by verifying it is not NaN or checking for explicit presence) to preserve the
API contract that an explicit 0 is a valid value that should be respected and
not overridden with the computed systemTotalGB minus reservedGB fallback.

In
`@crates/mesh-llm-ui/src/features/dashboard/components/details/NodeSidebar.tsx`:
- Line 48: The React key generation for GPU rows in NodeSidebar is producing
duplicate keys when multiple GPUs share the same name and bandwidth but have
different or missing vram_bytes values. Locate the places where GPU row keys are
being generated (around the gpus array rendering and any other related GPU row
renders mentioned in the comment), and update the key generation logic to
include the array index or another unique identifier alongside the existing
properties (name, bandwidth, etc.) to ensure each GPU row has a stable, unique
key even when vram_bytes is optional or absent.

In `@crates/mesh-llm-ui/src/lib/vram.ts`:
- Around line 70-74: The gpuAllocatableVramGB function currently loses explicit
allocatable_vram_bytes values of 0 because decimalVramGBFromBytes treats 0 as
missing. Instead of relying on the return value of decimalVramGBFromBytes to
determine if an explicit value was provided, check whether
gpu.allocatable_vram_bytes is explicitly set (not null or undefined) before
conversion, and only fall back to the computed allocatable value when the
explicit bytes value is not provided.
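
The explicit-zero issue raised for gpuAllocatableGB and gpuAllocatableVramGB comes down to testing for presence rather than for a positive value. The sketch below illustrates that pattern only. `GpuLike` and `allocatableGbSketch` are hypothetical names, the field names are taken from the JSON payloads in the PR description, and the decimal GB conversion is an assumption.

```typescript
// Hypothetical, trimmed-down GPU shape using field names from the PR's JSON examples.
interface GpuLike {
  vram_bytes?: number | null;
  reserved_bytes?: number | null;
  allocatable_vram_bytes?: number | null;
}

const BYTES_PER_GB = 1e9; // assumption: decimal GB for display math

// Sketch: an explicit allocatable value (including 0) wins; fall back only when absent.
function allocatableGbSketch(gpu: GpuLike): number {
  const explicit = gpu.allocatable_vram_bytes;
  if (explicit !== null && explicit !== undefined && Number.isFinite(explicit)) {
    // Presence check, not `explicit > 0`, so a reported 0 is respected.
    return Math.max(0, explicit) / BYTES_PER_GB;
  }
  const total = gpu.vram_bytes ?? 0;
  const reserved = gpu.reserved_bytes ?? 0;
  return Math.max(0, total - reserved) / BYTES_PER_GB;
}

// A fully reserved GPU reports 0 allocatable bytes and stays at 0.
console.log(allocatableGbSketch({ vram_bytes: 32e9, reserved_bytes: 1e9, allocatable_vram_bytes: 0 }));
// When the field is absent, the computed fallback (total minus reserved) is used.
console.log(allocatableGbSketch({ vram_bytes: 32e9, reserved_bytes: 1e9 }));
```

With a `> 0` check, the first call would fall through to the computed fallback and report capacity that the backend explicitly said is unavailable.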


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 63991583-73fd-4fd6-bdf6-1c20cfc7f662

📥 Commits

Reviewing files that changed from the base of the PR and between 0401bb0 and 3bcb389.

⛔ Files ignored due to path filters (2)
  • docs/specs/assets/vram-configuration-rated.png is excluded by !**/*.png
  • docs/specs/assets/vram-dashboard-rated.png is excluded by !**/*.png
📒 Files selected for processing (25)
  • crates/mesh-llm-commands/src/gpus.rs
  • crates/mesh-llm-host-runtime/src/api/status.rs
  • crates/mesh-llm-host-runtime/src/runtime/local.rs
  • crates/mesh-llm-host-runtime/src/runtime/mod.rs
  • crates/mesh-llm-system/src/lib.rs
  • crates/mesh-llm-system/src/vram.rs
  • crates/mesh-llm-ui/src/features/app-shell/lib/status-helpers.test.ts
  • crates/mesh-llm-ui/src/features/app-shell/lib/status-helpers.ts
  • crates/mesh-llm-ui/src/features/app-shell/lib/status-types.ts
  • crates/mesh-llm-ui/src/features/app-shell/lib/topology-types.ts
  • crates/mesh-llm-ui/src/features/app-tabs/types.ts
  • crates/mesh-llm-ui/src/features/configuration/api/config-adapter.test.ts
  • crates/mesh-llm-ui/src/features/configuration/api/config-adapter.ts
  • crates/mesh-llm-ui/src/features/configuration/components/NodeSection.tsx
  • crates/mesh-llm-ui/src/features/configuration/lib/config-math.test.ts
  • crates/mesh-llm-ui/src/features/configuration/lib/config-math.ts
  • crates/mesh-llm-ui/src/features/dashboard/components/details/NodeSidebar.tsx
  • crates/mesh-llm-ui/src/features/network/api/status-adapter.test.ts
  • crates/mesh-llm-ui/src/features/network/api/status-adapter.ts
  • crates/mesh-llm-ui/src/features/reserves/pages/ReservesPage.test.tsx
  • crates/mesh-llm-ui/src/features/reserves/pages/ReservesPage.tsx
  • crates/mesh-llm-ui/src/lib/api/types.ts
  • crates/mesh-llm-ui/src/lib/vram.test.ts
  • crates/mesh-llm-ui/src/lib/vram.ts
  • docs/specs/vram-accounting.md

const [dragKey, setDragKey] = useState<string | null>(null)
const open = !collapsed
const totalNodeGB = nodeTotalGB(node)
const systemTotalNodeGB = nodeSystemTotalGB(node)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

containerFreeGB still uses rated totals after switching bars to system totals.

VRAMBar now uses system capacity, but selectedTotalGB still uses rated totals, so the selected-model free-capacity path can disagree with what the bar and fit math show.

Suggested patch
-  const selectedTotalGB = node.placement === 'pooled' ? totalNodeGB : (selectedGpu?.totalGB ?? 0)
+  const selectedTotalGB =
+    node.placement === 'pooled'
+      ? systemTotalNodeGB
+      : (selectedGpu?.systemTotalGB ?? selectedGpu?.totalGB ?? 0)

Also applies to: 85-89, 211-211, 251-251
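
The fallback chain in the suggested patch can be exercised outside the component as a pure function. This is only a sketch: `ConfigGpuLike`, `ConfigNodeLike`, `nodeSystemTotalSketch`, and `selectedTotalGbSketch` are hypothetical stand-ins, and summing per-GPU system totals is an assumption about what nodeSystemTotalGB does.

```typescript
// Hypothetical minimal shapes; the real ConfigGpu/ConfigNode types are not shown in the review.
interface ConfigGpuLike {
  totalGB: number; // rated total
  systemTotalGB?: number; // system-reported total, when known
}

interface ConfigNodeLike {
  placement: string; // 'pooled' is the only value named in the patch
  gpus: ConfigGpuLike[];
}

// Assumption: the node's system total is the sum of per-GPU system totals,
// falling back to the rated total for GPUs that lack one.
function nodeSystemTotalSketch(node: ConfigNodeLike): number {
  return node.gpus.reduce((sum, gpu) => sum + (gpu.systemTotalGB ?? gpu.totalGB), 0);
}

// Mirrors the patch: pooled nodes use the node-wide system total,
// otherwise the selected GPU's system total, then its rated total, then 0.
function selectedTotalGbSketch(node: ConfigNodeLike, selectedGpu?: ConfigGpuLike): number {
  return node.placement === 'pooled'
    ? nodeSystemTotalSketch(node)
    : (selectedGpu?.systemTotalGB ?? selectedGpu?.totalGB ?? 0);
}

const pooled: ConfigNodeLike = {
  placement: 'pooled',
  gpus: [
    { totalGB: 32, systemTotalGB: 30 },
    { totalGB: 32, systemTotalGB: 31 },
  ],
};
console.log(selectedTotalGbSketch(pooled)); // node-wide system total
```

Keeping the rated `totalGB` as the second fallback means GPUs without a system-reported total still contribute a capacity instead of dropping to zero.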


Comment thread crates/mesh-llm-ui/src/features/configuration/lib/config-math.ts Outdated
Comment thread crates/mesh-llm-ui/src/features/configuration/lib/config-math.ts
Comment thread crates/mesh-llm-ui/src/lib/vram.ts