feat: defaulting CPU offload + GPU OOM fix + viewer PLY export #82
Open
nvandamme wants to merge 1 commit into
Improvements for long-sequence inference stability and visualization flexibility: CPU offloading enabled by default, GPU OOM prevention via KV cache/memory management fixes, new PLY export in the point cloud viewer, and viewer UI fixes to avoid browser freezes when loading long-sequence inference results (3000+ frames). Additionally, `pyproject.toml` is now compatible with `uv` tooling as an alternative to conda.

### `demo.py`: Default CPU Offload + CPU-Friendly Warmup

- `--offload_to_cpu` now defaults to `True` (was `False`): per-frame predictions offload to CPU by default to cut GPU peak memory
- [...] `--no-offload_to_cpu` is used with windowed mode or >512 frames
- `_warm_streaming()` now moves only the warmup slices (not the entire video) to the device using `non_blocking=True`, preventing long videos from becoming persistent GPU residents before inference starts
- [...] `pathlib.Path`; deprecated `Image.ROTATE_270` → `Image.Transpose.ROTATE_270`

### `lingbot_map/models/gct_stream_window.py`: KV Cache + Memory Fixes

- [...] `cast` variables to ensure correct attribute resolution when the model is offloaded (fixes `hasattr(self.aggregator, ...)` failures on CPU-offloaded models)
- [...] (`view_graphs`, `causal_graphs`, `ordered_video`) from `forward_aggregator` immediately after receiving them to prevent GPU memory retention during streaming inference
- `_rollback_last_frame()`: apply `cast` variables consistently across FlashInfer rollback, SDPA cache trimming, camera head rollback, and frame counter decrement; fixes corruption when rolling back on CPU-offloaded models
- `_execute_deferred_eviction()`: same `cast` pattern for deferred eviction execution
- `del scale_images`, `del window_pred`, `del window_images` after each window to prevent accumulation
- Replaced `_align_and_stitch_windows()` with incremental `_merge_window_pred()`: merges predictions on the fly instead of storing all windows and then aligning, reducing peak memory from O(n×S) to O(window_size)
- [...] (`chunk_scales`, `chunk_transforms`, `alignment_mode="scaled"`) when merged window count > 1

### `lingbot_map/vis/point_cloud_viewer.py`: Rendering Overhaul + PLY Export

- Replaced [...] `matplotlib.cm` (`cm.get_cmap`, `matplotlib.cm.ScalarMappable`) with modern `from matplotlib import colormaps` and `colormaps.get_cmap()`
- [...] `ValueError` when depth predictions are required but missing (`use_point_map=False`) and when confidence is needed for sky masking
- Replaced [...] (`self.pc_handles`, `self.vis_pts_list`) with a single active handle (`self.active_point_handle`) + render caching; reduces browser scene graph size from O(n_frames) to O(history_window)
- [...] `gui_history_frames` slider (0–200) and `gui_max_render_points` slider (50k–1M) for performance control on large sequences
- [...] (`Playing=False`) to prevent immediate browser-side rendering of large point clouds
- [...] `.ply` file export with normals, per-point confidence/frame-index/UV coordinates, camera metadata as extra elements, and scene alignment; triggered via new "Export PLY" button alongside existing GLB export under renamed "Export 3D" section

### `lingbot_map/vis/glb_export.py`: Refactoring + PLY Export Functions

- [...] `trimesh` via `importlib.util.find_spec()` for cleaner dependency handling (still prints warning on missing)
- `_prepare_export_point_cloud()`: shared point-cloud preparation logic used by both GLB and PLY exporters; handles filtering, sky masking, confidence thresholding, background masking, color coercion (`_coerce_colors_to_uint8`), and scene scale computation
- `predictions_to_ply()`: high-level export function mirroring the `predictions_to_glb()` API with the same filtering options (confidence threshold, frame filter, sky mask, background masking)
- `save_point_cloud_to_ply()`: writes binary little-endian PLY files supporting normals, per-point alpha, and arbitrary extra vertex properties/elements
- `apply_scene_alignment_to_vertices()`, `apply_scene_alignment_to_directions()`, `get_scene_alignment_transform()`: enable aligning vertices/normals outside of a trimesh Scene context (needed for PLY export)

### Configuration Files
- `.gitignore`: Add `uv.lock`, `*.ply`, `*.glb`, `*.7z`, `export_data.bin`, `lingbot-map-long.pt`
- `pyproject.toml`: Python version pinned to `~=3.12.0`; add `torch>=2.12.1`, `torchvision>=0.27.1`, `flashinfer-python`, `pip>=26.1.2`, `flashinfer-cubin>=0.6.13`; switch from `onnxruntime` to `onnxruntime-gpu`
- `README.md`: Added `uv` tooling alternative section mirroring the conda step-by-step structure (`uv venv --python 3.12`, `uv sync`, optional `--group vis` / `--group demo`)
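
Illustration for the `demo.py` flag change: the `--offload_to_cpu` / `--no-offload_to_cpu` pair has the shape that `argparse.BooleanOptionalAction` produces. The sketch below assumes that mechanism; the description does not say how the flag is actually declared, and `build_parser` is a hypothetical helper name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser showing a default-on boolean flag with a --no- variant."""
    parser = argparse.ArgumentParser(description="Sketch of a default-on offload flag")
    parser.add_argument(
        "--offload_to_cpu",
        # BooleanOptionalAction (Python 3.9+) auto-generates --no-offload_to_cpu.
        action=argparse.BooleanOptionalAction,
        default=True,  # assumption: mirrors the new default described above
        help="Offload per-frame predictions to CPU to reduce GPU peak memory",
    )
    return parser
```

With this shape, omitting the flag yields `offload_to_cpu=True`, and passing `--no-offload_to_cpu` is the explicit opt-out.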
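
Illustration for the `trimesh` detection in `glb_export.py`: a minimal sketch of probing for an optional dependency with `importlib.util.find_spec()` without importing it. `has_module` and `TRIMESH_AVAILABLE` are hypothetical names, and the warning text is a placeholder rather than the message used in this PR.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be located, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for dotted names with a missing parent package,
        # or when a module in sys.modules has no __spec__.
        return False


# Hypothetical module-level flag; export code would check it before using trimesh.
TRIMESH_AVAILABLE = has_module("trimesh")
if not TRIMESH_AVAILABLE:
    # Placeholder message, not the PR's actual wording.
    print("Warning: trimesh is not installed; trimesh-based export is unavailable.")
```

Compared with a `try: import trimesh` guard, this keeps the availability check separate from the import itself, so the heavy import only happens where the library is used.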
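
Illustration for `save_point_cloud_to_ply()`: a minimal sketch of the binary little-endian PLY layout, using only the standard library. It covers positions and RGB colors only; the normals, per-point alpha, and extra properties/elements listed above are omitted. `encode_binary_ply` is a hypothetical name, not a function from this PR.

```python
import struct
from typing import Sequence, Tuple

Point = Tuple[float, float, float]
Color = Tuple[int, int, int]


def encode_binary_ply(points: Sequence[Point], colors: Sequence[Color]) -> bytes:
    """Encode XYZ float32 positions and RGB uint8 colors as a binary PLY payload."""
    if len(points) != len(colors):
        raise ValueError("points and colors must have the same length")

    # ASCII header declaring the vertex count and the per-vertex property layout.
    header = "\n".join([
        "ply",
        "format binary_little_endian 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]) + "\n"

    # "<" selects little-endian with no padding: 3 * 4 bytes + 3 * 1 byte = 15 bytes per vertex.
    record = struct.Struct("<fffBBB")
    body = b"".join(
        record.pack(x, y, z, r, g, b)
        for (x, y, z), (r, g, b) in zip(points, colors)
    )
    return header.encode("ascii") + body
```

Writing the result is then a single `open(path, "wb").write(...)` call; the property order in the header must match the packing order of each record exactly, or viewers will misread the file.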