This repository contains the BAGEL-World-Model code used by VLA-MBPO.
This repository is intended to use its own uv environment, independent of the VLARLKit root environment. Make sure VLARLKit is already installed before you continue.
cd VLARLKit/third_party/
git submodule update --init "BAGEL"
uv sync
uv pip install flash_attn==2.5.8 --no-build-isolation

If you encounter any compilation errors when installing flash_attn, you can install a precompiled wheel instead of building from source.
- Go to the official release page: https://github.com/Dao-AILab/flash-attention/releases
- Download the wheel that matches your environment, typically:
  - CUDA 12.4
  - PyTorch 2.5
  - Python 3.10 (cp310)
- After downloading the wheel, install it with:

uv pip install <wheel_file>

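
To pick a matching wheel, it can help to print what the active environment actually reports. The following is a minimal sketch, not part of this repository: the helper names `python_tag` and `describe_environment` are hypothetical, and it assumes PyTorch, if installed, exposes `torch.__version__` and `torch.version.cuda` as usual.

```python
import importlib.util
import platform
import sys


def python_tag(version_info=sys.version_info):
    """Return the CPython wheel tag, e.g. 'cp310' for Python 3.10."""
    return f"cp{version_info[0]}{version_info[1]}"


def describe_environment():
    """Collect the values that usually decide which flash_attn wheel fits."""
    info = {
        "python_tag": python_tag(),
        "machine": platform.machine(),
        # True if a flash_attn package is already importable in this environment.
        "flash_attn_installed": importlib.util.find_spec("flash_attn") is not None,
        "torch": None,
        "cuda": None,
    }
    try:
        import torch  # optional: only reported when PyTorch is installed

        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        pass
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run it with the interpreter of this repository's uv environment, then compare the printed values with the wheel file names on the release page.
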
The model checkpoints and datasets are not bundled in this repository yet. Before running the full training/inference workflow, prepare the following local paths:
| Artifact | Description | Download Link |
|---|---|---|
| Bagel-WM-ckpt | Fine-tuned world model checkpoints | Coming soon |
| Datasets for finetuning BAGEL | LIBERO and LeRobot datasets for finetuning BAGEL as a world model | Coming soon |
| Datasets for branch rollouts | Datasets for performing branch rollouts to train MBRL | Coming soon |
World-model training scripts are under scripts/:
bash scripts/train_libero.sh

First, set each loading path in the config file VLARLKit/examples/configs/libero_goal_vla_mbpo.yaml:
branch_dataset_root: <your download data dir>
model:
  model_path: "<your download path>/RLinf-Pi05-LIBERO-SFT"
data:
  assets_dir: "<your download path>/RLinf-Pi05-LIBERO-SFT"
world_model:
  load_model_path: <your download world model ckpt path>

Now you can launch the script:
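
Before launching, it may be worth confirming that every configured path exists. The sketch below is not part of VLARLKit; `PATH_KEYS`, `lookup`, and `find_missing_paths` are hypothetical names. It takes the config as an already-loaded dictionary, and it assumes the keys are nested as shown above (`model_path` under `model`, and so on).

```python
from pathlib import Path

# Config entries that should point at existing local paths.
# The nesting is an assumption based on the fragment above.
PATH_KEYS = [
    ("branch_dataset_root",),
    ("model", "model_path"),
    ("data", "assets_dir"),
    ("world_model", "load_model_path"),
]


def lookup(config, keys):
    """Follow a chain of keys through nested dicts; return None if any is missing."""
    node = config
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def find_missing_paths(config):
    """Return a list of human-readable problems; an empty list means all paths exist."""
    problems = []
    for keys in PATH_KEYS:
        name = ".".join(keys)
        value = lookup(config, keys)
        if value is None:
            problems.append(f"{name}: not set")
        elif not Path(str(value)).expanduser().exists():
            problems.append(f"{name}: path does not exist ({value})")
    return problems
```

The dictionary could come from any YAML loader, for example `yaml.safe_load` from PyYAML if it is available in your environment. Printing the returned list before starting a run catches placeholder values that were never replaced.
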
# assume you are at VLARLKit root path
bash examples/run_vla_mbpo.sh

If you find this code useful, please cite:
@article{zhang2026vlambpo,
title={Towards Practical World Model-based Reinforcement Learning for Vision-Language-Action Models},
author={Zhang, Zhilong and Ren, Haoxiang and Sun, Yihao and Sheng, Yifei and Wang, Haonan and Lin, Haoxin and Wu, Zhichao and Bacon, Pierre-Luc and Yu, Yang},
journal={arXiv preprint arXiv:2603.20607},
year={2026}
}

This codebase builds on Bagel and UniPlan. We thank the authors and maintainers of these projects.

