Ranran Huang · Weixun Luo · Ye Mao · Krystian Mikolajczyk
NAS3R is a self-supervised feed-forward framework that jointly learns explicit 3D geometry and camera parameters with no ground-truth annotations and no pretrained priors.
- Clone NAS3R.

```bash
git clone --recurse-submodules git@github.com:ranrhuang/NAS3R.git
cd NAS3R
```

- Create the environment; here we show an example using conda.

```bash
conda create -n nas3r python=3.11 -y
conda activate nas3r
pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt --no-build-isolation
pip install -e submodules/diff-gaussian-rasterization --no-build-isolation
```

Our models are hosted on Hugging Face 🤗
| Model name | Training resolutions | Training data | Training settings |
|---|---|---|---|
| re10k_nas3r.ckpt | 256x256 | re10k | 2 views |
| re10k_nas3r_multiview.ckpt | 256x256 | re10k | 2-10 views |
| re10k_nas3r_pretrained.ckpt | 256x256 | re10k | 2 views, initialized by VGGT |
| re10k_nas3r_pretrained-I.ckpt | 256x256 | re10k | 2 views, initialized by VGGT, incorporates GT intrinsics |
We assume the downloaded weights are located in the `checkpoints` directory.
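Before launching training or evaluation, it can help to confirm that the expected weights are actually in place. The sketch below is a small, hypothetical helper (not part of the NAS3R codebase) that checks the `checkpoints` directory against the model table above:

```python
from pathlib import Path

# Checkpoint filenames from the model table above.
EXPECTED = [
    "re10k_nas3r.ckpt",
    "re10k_nas3r_multiview.ckpt",
    "re10k_nas3r_pretrained.ckpt",
    "re10k_nas3r_pretrained-I.ckpt",
]

def missing_checkpoints(ckpt_dir="checkpoints"):
    """Return the expected checkpoint files not yet present in ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in EXPECTED if not (root / name).exists()]

if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All checkpoints found.")
```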
Please refer to DATASETS.md for dataset preparation.
```bash
# 2-view training on NAS3R (VGGT-based architecture); for multi-view training, modify view_sampler.num_context_views
python -m src.main +experiment=nas3r/random/re10k wandb.mode=online wandb.name=nas3r_re10k

# Initialized from pretrained VGGT weights for better performance and stability.
python -m src.main +experiment=nas3r/pretrained/re10k wandb.mode=online wandb.name=nas3r_re10k_pretrained

# Initialized from pretrained VGGT weights and incorporating GT intrinsics for better performance and stability.
python -m src.main +experiment=nas3r/pretrained/re10k-I wandb.mode=online wandb.name=nas3r_re10k_pretrained-I
```
```bash
# RealEstate10K on NAS3R
python -m src.main +experiment=nas3r/random/re10k mode=test wandb.name=re10k \
  dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
  dataset.re10k.view_sampler.index_path=assets/evaluation_index_re10k.json \
  checkpointing.load=./checkpoints/re10k_nas3r.ckpt \
  test.save_image=false

# RealEstate10K on NAS3R, 10 views
python -m src.main +experiment=nas3r/random/re10k mode=test wandb.name=re10k \
  dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
  dataset.re10k.view_sampler.index_path=assets/evaluation_index_re10k.json \
  dataset.re10k.view_sampler.num_context_views=10 \
  checkpointing.load=./checkpoints/re10k_nas3r_multiview.ckpt \
  test.save_image=false

# RealEstate10K on NAS3R pretrained from VGGT
python -m src.main +experiment=nas3r/random/re10k mode=test wandb.name=re10k \
  dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
  dataset.re10k.view_sampler.index_path=assets/evaluation_index_re10k.json \
  checkpointing.load=./checkpoints/re10k_nas3r_pretrained.ckpt \
  test.save_image=false

# RealEstate10K on NAS3R pretrained from VGGT, incorporating GT intrinsics
python -m src.main +experiment=nas3r/random/re10k-I mode=test wandb.name=re10k \
  dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
  dataset.re10k.view_sampler.index_path=assets/evaluation_index_re10k.json \
  checkpointing.load=./checkpoints/re10k_nas3r_pretrained-I.ckpt \
  test.save_image=false
```
We follow the pixelSplat camera system. The camera intrinsic matrices are normalized (the first row is divided by the image width, and the second row by the image height). The camera extrinsic matrices are OpenCV-style camera-to-world matrices (+X right, +Y down, +Z pointing into the screen).
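The convention above can be sketched concretely. The snippet below uses hypothetical intrinsics for a 256x256 image (the values are illustrative, not from any NAS3R checkpoint) and shows the normalization applied to the intrinsic matrix:

```python
import numpy as np

# Hypothetical pinhole intrinsics for a 256x256 image (fx, fy, cx, cy in pixels).
W, H = 256, 256
K = np.array([
    [200.0,   0.0, 128.0],
    [  0.0, 200.0, 128.0],
    [  0.0,   0.0,   1.0],
])

# pixelSplat-style normalization: first row divided by width, second row by
# height, so fx and cx end up in units of image width, fy and cy in units
# of image height. This makes the intrinsics resolution-independent.
K_norm = K.copy()
K_norm[0] /= W
K_norm[1] /= H

# OpenCV-style camera-to-world extrinsics: the rotation columns are the
# camera's +X (right), +Y (down), and +Z (forward, into the screen) axes
# expressed in world coordinates; the last column is the camera center.
# The identity matrix is a camera at the world origin looking along +Z.
c2w = np.eye(4)

print(K_norm[0, 0], K_norm[1, 2])  # 0.78125 0.5
```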
This project is built upon these excellent repositories: SPFSplatV2, SPFSplat, NoPoSplat, pixelSplat, DUSt3R, and CroCo. We thank the original authors for their great work.
```bibtex
@article{huang2026nas3r,
  title={From None to All: Self-Supervised 3D Reconstruction via Novel View Synthesis},
  author={Ranran Huang and Weixun Luo and Ye Mao and Krystian Mikolajczyk},
  journal={arXiv preprint arXiv:2603.27455},
  year={2026}
}
```