LabelAny3D: Label Any Object 3D in the Wild

Jin Yao, Radowan Mahmud Redoy, Sebastian Elbaum, Matthew B. Dwyer, Zezhou Cheng

Website Paper

Samples from the COCO3D dataset

COCO3D Dataset

The COCO3D evaluation set and the pseudo-labeled training set are available on Hugging Face.

3D BBox Human Refinement Interface

We release the source code for the refinement interface at https://github.com/UVA-Computer-Vision-Lab/3d_annotator.

Getting Started

📦 Installation Guide - Setup instructions and external dependencies

📖 COCO Pipeline Guide - Run the pipeline on COCO dataset

🔧 OVMono3D Fine-tuning - Code for fine-tuning OVMono3D on LabelAny3D pseudo annotations

Citing

If you find this work useful for your research, please cite:

@inproceedings{yao2025labelany3d,
  title={LabelAny3D: Label Any Object 3D in the Wild},
  author={Jin Yao and Radowan Mahmud Redoy and Sebastian Elbaum and Matthew B. Dwyer and Zezhou Cheng},
  booktitle={Neural Information Processing Systems (NeurIPS)},
  year={2025}
}

@inproceedings{yao2025open,
  title={Open Vocabulary Monocular 3D Object Detection},
  author={Yao, Jin and Gu, Hao and Chen, Xuweiyi and Wang, Jiayun and Cheng, Zezhou},
  booktitle={Proceedings of the International Conference on 3D Vision (3DV)},
  year={2026}
}

Acknowledgements

This work builds on many open-source projects:

  • Gen3DSR - 3D reconstruction framework
  • TRELLIS - 3D asset generation
  • MoGe - Monocular geometry estimation
  • DepthPro - Metric depth estimation
  • MASt3R - Dense matching
  • InvSR - Image super-resolution
  • COCONUT - COCO segmentation annotations
  • OVMono3D - Open vocabulary monocular 3D detection

License

This project is licensed under the MIT License.