
feat: defaulting CPU offload + GPU OOM fix + viewer PLY export #82

Open
nvandamme wants to merge 1 commit into Robbyant:main from nvandamme:ply-test

Improvements for long-sequence inference stability and visualization flexibility: CPU offloading is enabled by default, GPU OOM is prevented via KV cache and memory-management fixes, the point cloud viewer gains PLY export, and viewer UI fixes avoid browser freezes when loading long-sequence inference results (3000+ frames). Additionally, pyproject.toml is now compatible with uv tooling as an alternative to conda.


demo.py — Default CPU Offload + CPU-Friendly Warmup

  • Default --offload_to_cpu to True (was False) — per-frame predictions offload to CPU by default to cut GPU peak memory
  • Add CUDA OOM warning when --no-offload_to_cpu is used with windowed mode or >512 frames
  • CPU-friendly warmup: _warm_streaming() now moves only the warmup slices (not entire video) to device using non_blocking=True, preventing long videos from becoming persistent GPU residents before inference starts
  • Images are no longer moved to the GPU before the model pipeline starts, which frees GPU memory ahead of inference
  • Cross-platform path handling via pathlib.Path; deprecated Image.ROTATE_270 replaced with Image.Transpose.ROTATE_270
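
The flag behaviour listed above could be wired up roughly as follows. This is a minimal sketch, not the code in demo.py: it assumes the flag is declared with argparse.BooleanOptionalAction (which is what produces a --no-offload_to_cpu form), and the --mode option and the oom_warning() helper are hypothetical names used only for illustration. The 512-frame threshold is the one stated in the description.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # BooleanOptionalAction also registers --no-offload_to_cpu automatically.
    parser.add_argument(
        "--offload_to_cpu",
        action=argparse.BooleanOptionalAction,
        default=True,
    )
    # Hypothetical option; the real script may select windowed mode differently.
    parser.add_argument(
        "--mode", choices=["streaming", "windowed"], default="streaming"
    )
    return parser


def oom_warning(offload_to_cpu: bool, mode: str, num_frames: int) -> Optional[str]:
    """Return a warning string when keeping predictions on the GPU looks risky."""
    if offload_to_cpu:
        return None
    if mode == "windowed" or num_frames > 512:
        return (
            "--no-offload_to_cpu with windowed mode or more than 512 frames "
            "may cause CUDA OOM"
        )
    return None
```

With this layout, running without any flag offloads to CPU, and passing --no-offload_to_cpu opts back into GPU-resident predictions at the cost of a warning on long inputs.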

lingbot_map/models/gct_stream_window.py — KV Cache + Memory Fixes

  • Refactor aggregator/camera head access through local cast variables to ensure correct attribute resolution when the model is offloaded (fixes hasattr(self.aggregator, ...) failures on CPU-offloaded models)
  • Delete unused tensor parameters (view_graphs, causal_graphs, ordered_video) from forward_aggregator immediately after receiving them to prevent GPU memory retention during streaming inference
  • Fix _rollback_last_frame(): apply cast variables consistently across FlashInfer rollback, SDPA cache trimming, camera head rollback, and frame counter decrement — fixes corruption when rolling back on CPU-offloaded models
  • Fix _execute_deferred_eviction(): same cast pattern for deferred eviction execution
  • Add explicit memory deletion in windowed inference loop: del scale_images, del window_pred, del window_images after each window to prevent accumulation
  • Replace old _align_and_stitch_windows() with incremental _merge_window_pred(): merges predictions on-the-fly instead of storing all windows then aligning, reducing peak memory from O(n×S) to O(window_size)
  • Add alignment metadata (chunk_scales, chunk_transforms, alignment_mode="scaled") when merged window count > 1
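
The incremental merge idea can be illustrated without tensors. The sketch below is an assumption-laden toy, not the PR's _merge_window_pred(): per-frame predictions are plain floats standing in for per-frame depth maps, consecutive windows are assumed to share a fixed number of overlapping frames, and the scale is estimated as the median ratio over that overlap. The function signature and the metadata dictionary are hypothetical.

```python
from statistics import median


def merge_window_pred(merged, window_pred, overlap, chunk_scales):
    """Append one window of per-frame values to `merged`, in place.

    merged:       already-merged per-frame values (floats in this toy)
    window_pred:  per-frame values for the new window
    overlap:      leading frames of window_pred that repeat the tail of merged
    chunk_scales: list collecting the scale applied to each window
    """
    if not merged or overlap == 0:
        # The first window (or a window without overlap) keeps its own scale.
        scale = 1.0
        new_frames = window_pred
    else:
        # Estimate the scale from frames predicted by both sides.
        pairs = zip(merged[-overlap:], window_pred[:overlap])
        ratios = [m / w for m, w in pairs if w != 0]
        scale = median(ratios) if ratios else 1.0
        new_frames = window_pred[overlap:]
    merged.extend(scale * v for v in new_frames)
    chunk_scales.append(scale)
    return merged


merged, chunk_scales = [], []
windows = [[1.0, 2.0, 3.0], [4.0, 6.0, 8.0], [6.0, 8.0, 10.0]]
for window_pred in windows:
    merge_window_pred(merged, window_pred, overlap=2, chunk_scales=chunk_scales)
    # Drop the raw window right away, mirroring the explicit `del` calls above.
    del window_pred

# Alignment metadata is only attached when more than one window was merged.
metadata = (
    {"chunk_scales": chunk_scales, "alignment_mode": "scaled"}
    if len(chunk_scales) > 1
    else {}
)
```

Because each window is folded into the running result and then released, only one window's raw predictions are alive at any time, which is the property the description attributes to the new merge path.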

lingbot_map/vis/point_cloud_viewer.py — Rendering Overhaul + PLY Export

  • Replace deprecated matplotlib.cm usage (cm.get_cmap, matplotlib.cm.ScalarMappable) with the colormaps registry (from matplotlib import colormaps, then colormaps.get_cmap())
  • Input validation: raise ValueError when depth predictions are required (use_point_map=False) but missing, and when confidence is required for sky masking but missing
  • Replace per-frame point cloud handles (self.pc_handles, self.vis_pts_list) with single active handle (self.active_point_handle) + render caching — reduces browser scene graph size from O(n_frames) to O(history_window)
  • New frame visibility system: "Current Frame Only" (history=0) / "Recent Frame History" (default=20 frames) instead of old 4D/3D toggle buttons
  • Add gui_history_frames slider (0–200) and gui_max_render_points slider (50k–1M) for performance control on large sequences
  • Start paused (Playing=False) to prevent immediate browser-side rendering of large point clouds
  • PLY export: full .ply file export with normals, per-point confidence/frame-index/UV coordinates, camera metadata as extra elements, and scene alignment — triggered via new "Export PLY" button alongside existing GLB export under renamed "Export 3D" section
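
The two performance controls above reduce to a small amount of index arithmetic. The following is a hedged sketch of that arithmetic only; visible_frames() and subsample_stride() are hypothetical helpers, and it assumes that a history of N means the current frame plus up to N earlier frames, which the description does not spell out.

```python
def visible_frames(current: int, history: int) -> range:
    """Frames to render: the current frame plus up to `history` earlier ones."""
    start = max(0, current - history)
    return range(start, current + 1)


def subsample_stride(num_points: int, max_render_points: int) -> int:
    """Smallest stride that keeps at most `max_render_points` points."""
    if max_render_points <= 0:
        raise ValueError("max_render_points must be positive")
    # Ceiling division without going through floats.
    return max(1, -(-num_points // max_render_points))


# "Current Frame Only" corresponds to history=0.
current_only = list(visible_frames(current=5, history=0))

# Taking every `stride`-th point keeps the rendered cloud under the budget.
points = list(range(1_000_000))
stride = subsample_stride(len(points), max_render_points=50_000)
rendered = points[::stride]
```

Bounding both the number of visible frames and the points per render is what keeps the browser scene graph proportional to the history window rather than to the full sequence length.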

lingbot_map/vis/glb_export.py — Refactoring + PLY Export Functions

  • Soft-import trimesh via importlib.util.find_spec() for cleaner dependency handling (still prints warning on missing)
  • Extract _prepare_export_point_cloud(): shared point-cloud preparation logic used by both GLB and PLY exporters — handles filtering, sky masking, confidence thresholding, background masking, color coercion (_coerce_colors_to_uint8), and scene scale computation
  • Add predictions_to_ply(): high-level export function mirroring predictions_to_glb() API with same filtering options (confidence threshold, frame filter, sky mask, background masking)
  • Add save_point_cloud_to_ply(): writes binary little-endian PLY files supporting normals, per-point alpha, and arbitrary extra vertex properties/elements
  • Add scene alignment helpers for standalone use: apply_scene_alignment_to_vertices(), apply_scene_alignment_to_directions(), get_scene_alignment_transform() — enables aligning vertices/normals outside of a trimesh Scene context (needed for PLY export)
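
For readers unfamiliar with the format, a binary little-endian PLY file is an ASCII header followed by fixed-size packed vertex records. The sketch below shows that layout using only the standard library; it is not save_point_cloud_to_ply(), it covers only xyz positions and RGB colors (no normals, alpha, or extra elements), and ply_bytes() is a hypothetical name.

```python
import struct


def ply_bytes(vertices, colors) -> bytes:
    """Encode xyz float32 + rgb uint8 vertices as a binary little-endian PLY.

    vertices: sequence of (x, y, z) floats
    colors:   sequence of (r, g, b) ints in 0..255, same length as vertices
    """
    if len(vertices) != len(colors):
        raise ValueError("vertices and colors must have the same length")
    header = "\n".join([
        "ply",
        "format binary_little_endian 1.0",
        f"element vertex {len(vertices)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]) + "\n"
    # "<" forces little-endian with no padding: 3 float32 + 3 uint8 = 15 bytes.
    record = struct.Struct("<fffBBB")
    body = b"".join(
        record.pack(x, y, z, r, g, b)
        for (x, y, z), (r, g, b) in zip(vertices, colors)
    )
    return header.encode("ascii") + body
```

Writing the result is then a matter of opening the target path in "wb" mode and writing the returned bytes. Additional per-vertex properties such as normals or confidence would add matching property lines to the header and fields to the struct format, in the same order.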

Configuration Files

  • .gitignore: Add uv.lock, *.ply, *.glb, *.7z, export_data.bin, lingbot-map-long.pt
  • pyproject.toml: Python version pinned to ~=3.12.0; add torch>=2.12.1, torchvision>=0.27.1, flashinfer-python, pip>=26.1.2, flashinfer-cubin>=0.6.13; switch from onnxruntime to onnxruntime-gpu
  • README.md: Added uv tooling alternative section mirroring conda step-by-step structure (uv venv --python 3.12, uv sync, optional --group vis / --group demo)
