Humanoid-GPT is the first GPT-style humanoid motion Transformer trained with causal attention on a billion-scale motion corpus for whole-body control. Unlike prior shallow MLP trackers constrained by scarce data and an agility–generalization trade-off, Humanoid-GPT is pre-trained on a 2B-frame retargeted corpus that unifies all major mocap datasets with large-scale in-house recordings.
🔬 Key Contributions
- Billion-Scale Pre-Training: First to scale humanoid motion learning to 2B frames
- GPT-Style Architecture: Causal Transformer with Rotary Position Embeddings (RoPE)
- Zero-Shot Generalization: Track arbitrary unseen motions without fine-tuning
| Feature | Description |
|---|---|
| 🧠 Architecture | Causal Transformer with RoPE, supporting variable-length motion sequences |
| 📊 Scale | Pre-trained on 2B motion frames from unified mocap datasets |
| 🎯 Zero-Shot | Generalizes to unseen motions and tasks without fine-tuning |
| 🤖 Platform | Optimized for Unitree G1 humanoid robot (29 DOF whole-body) |
| ⚡ Speed | GPU-accelerated simulation with MuJoCo-MJX |
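To make the "Causal Transformer with RoPE" row concrete, here is a minimal NumPy sketch of single-head causal attention with rotary position embeddings. It is an illustration of the general technique only, not the Humanoid-GPT implementation: the function names `rope` and `causal_attention`, the half-split pairing of features, and the frequency base of 10000 are all assumptions chosen for the example.

```python
import numpy as np


def rope(x, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, dim), dim even."""
    seq_len, dim = x.shape
    half = dim // 2
    # One rotation frequency per feature pair (pairing i with i + half is an assumption).
    freqs = base ** (-np.arange(half) / half)
    angles = np.arange(seq_len)[:, None] * freqs[None, :]  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each (x1, x2) pair by an angle proportional to its position.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)


def causal_attention(q, k, v):
    """Single-head attention in which frame t attends only to frames <= t."""
    seq_len, dim = q.shape
    scores = rope(q) @ rope(k).T / np.sqrt(dim)
    # Lower-triangular mask: future frames are excluded from the softmax.
    mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

Because the rotation angle depends only on position, the query-key score depends on the relative offset between two frames rather than on their absolute indices, which is one reason RoPE is commonly paired with variable-length sequences.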
Release status (✅ marks items already released; the rest are planned):
- ✅ Inference & deployment code.
- ✅ Pre-trained model checkpoints ([storage/ckpts/pns_wo_priv216.onnx](storage/ckpts/pns_wo_priv216.onnx)).
- Training code.
- Training data.
- NVIDIA GPU with CUDA 12.x
- macOS is also supported for testing if you skip jax[cuda12] and use mjpython (e.g. mjpython -m scripts.app).
- Conda / Miniconda
git clone https://github.com/qizekun/Humanoid-GPT.git
cd Humanoid-GPT
conda create -n h-gpt python=3.12 -y
conda activate h-gpt
pip install -e ".[cuda]" # or ".[cpu]" on macOS, or "." for real robot deploy-only
On macOS, use mjpython instead of python for the MuJoCo viewer (e.g. mjpython -m scripts.app).
We support multiple Unitree G1 hardware versions via the G1_VERSION env var (default 5010). The asset folder storage/assets/unitree_g1_${G1_VERSION}/ is selected automatically:
G1_VERSION=5010 python -m scripts.inference ... # default: 5010
A pre-trained tracking policy (.onnx) and a sample trajectory under storage/test/ are all you need to get started.
# Interactive Gradio demo
python -m scripts.app
# Track a single motion / a folder of motions
python -m scripts.inference --load_path storage/ckpts/pns_wo_priv216.onnx --mocap_path storage/test
# Parallel evaluation over a folder of trajectories
python -m scripts.eval_parallel --load_path storage/ckpts/pns_wo_priv216.onnx \
--mocap_path storage/test --workers 32 --privileged
# Visualize a reference trajectory
python -m scripts.vis --mocap_path storage/test
The expected motion format is a .npz containing either qpos directly, or root_pos / root_rot / dof_pos arrays. To convert retargeted mocap into the keypoint representation the policy consumes:
python tracking/convert_qpos2kpt.py --mocap_npz <mocap_path.npz> --debug # single file (debug viz)
python tracking/convert_parallel.py --src_dir <in_dir> --save_dir <out_dir> --num_workers 32
Deployment on Unitree G1 is split into sub-modules under deploy/. Start with deploy/DEPLOY.md for install / SDK setup, then:
# Simulation
python -m deploy.play_track --track-dir storage/test
# Real robot
python -m deploy.play_track --real --net <nic_name>
- 🖥️ onboard_deploy/: on-board (Jetson Orin) deploy.
- 🖥️ onboard_deploy_wo_GMR/: on-board variant that streams retargeting from a host.
- ✋ brainco/: BrainCo dexterous-hand tracking variant.
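Before tracking or deploying your own motions, it can help to check that a file matches the .npz motion format described in the quick-start section (either qpos directly, or root_pos / root_rot / dof_pos). The sketch below is a hypothetical helper, not part of the repository: `describe_motion` and `write_dummy_split_motion` are names invented for this example, and the array shapes (3-vector root position, 4-element quaternion with w first, 29 joint values per frame) are assumptions that should be checked against the sample trajectory in storage/test.

```python
import numpy as np

SPLIT_KEYS = ("root_pos", "root_rot", "dof_pos")
NUM_DOF = 29  # Unitree G1 whole-body DOF count stated in this README


def write_dummy_split_motion(path, num_frames=10):
    """Save a placeholder motion in the split layout (all shapes are assumptions)."""
    root_rot = np.zeros((num_frames, 4))
    root_rot[:, 0] = 1.0  # identity quaternion, assuming w-first ordering
    np.savez(
        path,
        root_pos=np.zeros((num_frames, 3)),
        root_rot=root_rot,
        dof_pos=np.zeros((num_frames, NUM_DOF)),
    )


def describe_motion(path):
    """Return (layout, shapes) for an .npz motion file, or raise ValueError."""
    with np.load(path) as data:
        keys = set(data.files)
        if "qpos" in keys:
            return "qpos", {"qpos": data["qpos"].shape}
        if all(name in keys for name in SPLIT_KEYS):
            return "split", {name: data[name].shape for name in SPLIT_KEYS}
    raise ValueError(f"{path}: expected 'qpos' or all of {SPLIT_KEYS}, found {sorted(keys)}")
```

Calling `describe_motion` on each file in a folder before running the conversion scripts gives an early, readable error for files that use neither layout.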
Humanoid-GPT/
├── 📂 tracking/ # Inference core: constants, infer_utils, ONNX policy wrapper (policy.py),
│ # keypoint conversion (convert_qpos2kpt.py) and tracking metrics
├── 📂 scripts/ # inference.py · eval_parallel.py · vis.py · app.py (gradio demo)
├── 📂 deploy/ # Real-robot deployment — see deploy/DEPLOY.md
│ ├── onboard_deploy/ # On-board (Jetson) SSH deployment
│ ├── onboard_deploy_wo_GMR/ # On-board variant with host-side retargeting
│ └── brainco/ # BrainCo dexterous-hand tracking variant
├── 📂 projects/ # Optional side modules
│ ├── hme/ # Harmonic Motion Encoder (Periodic Autoencoder)
│ ├── gqs/ # General Quality Selection (physics + diversity scoring)
│ └── tracking_transformer/ # Transformer tracking policy (inference / deploy)
├── 📂 utils/ # MuJoCo / MJX simulation, transforms, video rendering
└── 📂 storage/ # Assets, configs, sample trajectory, released checkpoints
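The storage/ folder in the tree above also holds the per-version robot assets that the G1_VERSION env var selects. As a rough illustration of that lookup, the sketch below resolves storage/assets/unitree_g1_${G1_VERSION}/ with the documented default of 5010. The helper name `g1_asset_dir` and its `repo_root` / `env` parameters are hypothetical; the repository's own resolution logic may differ.

```python
import os
from pathlib import Path

DEFAULT_G1_VERSION = "5010"  # default stated in this README


def g1_asset_dir(repo_root=".", env=None):
    """Resolve the asset folder for the G1 hardware version named in G1_VERSION."""
    env = os.environ if env is None else env
    version = env.get("G1_VERSION", DEFAULT_G1_VERSION)
    return Path(repo_root) / "storage" / "assets" / f"unitree_g1_{version}"


if __name__ == "__main__":
    asset_dir = g1_asset_dir()
    # Report whether the folder for the selected version is present.
    status = "found" if asset_dir.is_dir() else "missing"
    print(f"{asset_dir} ({status})")
```

Passing `env` explicitly keeps the function easy to test without mutating the process environment.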
@article{humanoid-gpt26,
title = {Humanoid-GPT: Humanoid Generative Pre-Training for Zero-Shot Motion Tracking},
author = {Qi, Zekun and Chen, Xuchuan and others},
journal = {arXiv preprint arXiv:2606.03985},
year = {2026}
}
Licensed under Apache 2.0. Built on top of MuJoCo, Brax and the Unitree G1 platform.
