RoboEngine: Plug-and-Play Robot Data Augmentation with Semantic Robot Segmentation and Background Generation
Chengbo Yuan*, Suraj Joshi*, Shaoting Zhu*, Hang Su, Hang Zhao, Yang Gao.
[Project Website] [Arxiv] [Dataset] [BibTex]
RoboEngine is the first plug-and-play visual robot data augmentation toolkit (covering both installation and usage). For the first time, users can effortlessly generate physics- and task-aware robot scenes with just a few lines of code. With RoboEngine, we achieve visual generalization and robustness in entirely out-of-distribution scenes, using data collected in only a single environment. Have fun!
We package RoboEngine as a plug-and-play toolkit; you can install it simply by:
conda create -n roboengine python=3.10
conda activate roboengine
pip install -e .
pip install en_core_web_sm-3.8.0-py3-none-any.whl
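To verify the installation, you can try importing one of the main wrappers (a minimal sanity check; the module path robo_engine.infer_engine is taken from the repository layout described below):
python -c "from robo_engine.infer_engine import RoboEngineRobotSegmentation"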
The model checkpoints are hosted on Hugging Face. They will be downloaded automatically the first time you run RoboEngine. You can also download the checkpoints and datasets manually:
| Segmentation Dataset | Segmentation Model | Inpainting Model |
|---|---|---|
| dataset link | checkpoint link | checkpoint link |
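If you prefer to fetch a checkpoint ahead of time (e.g., on a machine without internet access at run time), a minimal sketch using the huggingface_hub client is shown below. The repository ID here is a placeholder; substitute the IDs from the links in the table above:

from huggingface_hub import snapshot_download

# Download a checkpoint repository to a local folder ahead of time.
# "your-org/roboengine-seg" is a placeholder; use the repo IDs from the table above.
snapshot_download(repo_id="your-org/roboengine-seg", local_dir="./checkpoints/roboengine-seg")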
If you want to use the "inpainting" baseline method mentioned in the paper, you will also need to install SAM2 from Meta's official repository.
For flexibility, we split RoboEngine into two segmentation wrappers and one augmentation wrapper. RoboEngine enables generative visual augmentation for robot videos with only a few lines of code. An example is shown below:
import imageio
import numpy as np
from robo_engine.infer_engine import RoboEngineRobotSegmentation, RoboEngineObjectSegmentation, RoboEngineAugmentation

# Instantiate the two segmentation wrappers and the augmentation wrapper
engine_robo_seg = RoboEngineRobotSegmentation()
engine_obj_seg = RoboEngineObjectSegmentation()
engine_bg_aug = RoboEngineAugmentation(aug_method='engine')

# Read the input video and the task instruction
video_filepath = "example_video.mp4"  # placeholder path to your robot video
video = imageio.get_reader(video_filepath)
instruction = "fold the towels"
image_np_list = [frame for frame in video]

# Segment the robot and the task-relevant objects, then merge the two masks
robo_masks = engine_robo_seg.gen_video(image_np_list)
obj_masks = engine_obj_seg.gen_video(image_np_list, instruction)
masks = ((robo_masks + obj_masks) > 0).astype(np.float32)
masks_np_list = [mask for mask in masks]

# Generate augmented backgrounds for the whole video
aug_video = engine_bg_aug.gen_image_batch(image_np_list, masks_np_list)
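To save the augmented frames back to a video file, you can, for example, use imageio (a minimal sketch; the output filename and frame rate are placeholders):

# Write the augmented frames out as a video (filename and fps are placeholders)
imageio.mimsave("aug_video.mp4", aug_video, fps=10)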
Run examples/examples.py for more details. The generated results should look like this:
You can check the details of the input parameters of RoboEngineRobotSegmentation, RoboEngineObjectSegmentation and RoboEngineAugmentation in robo_engine/infer_engine.py. Here are some examples:
# The API of object segmentation is the same as that of robot segmentation
engine_robo_seg = RoboEngineRobotSegmentation()
engine_robo_seg.gen_image(image_np, prompt="robot", preset_mask=None)
engine_robo_seg.gen_video(image_np_list, prompt="robot", anchor_frequency=8)
# For augmentation, we recommend the 'engine' and 'texture' modes.
# The 'imagenet', 'background', 'inpainting' and 'black' modes are also supported.
engine_bg_aug = RoboEngineAugmentation(aug_method='engine')
engine_bg_aug.gen_image(image_np, mask_np,
                        tabletop=False,
                        prompt=None,
                        num_inference_steps=5)
engine_bg_aug.gen_image_batch(image_np_list, mask_np_list,
                              batch_size=16,
                              tabletop=False,
                              prompt=None,
                              num_inference_steps=5)
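As a quick way to compare the different augmentation modes on a single frame, you could loop over them (a sketch, assuming image_np and mask_np are prepared as in the example above):

# Compare all supported augmentation modes on one frame
for method in ['engine', 'texture', 'imagenet', 'background', 'inpainting', 'black']:
    # Note: 'inpainting' additionally requires SAM2 (see the installation notes above)
    engine = RoboEngineAugmentation(aug_method=method)
    aug_image = engine.gen_image(image_np, mask_np)
    imageio.imwrite(f"aug_{method}.png", aug_image)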
(1) To download and use the RoboSeg dataset, please follow this link.
(2) For segmentation model training and evaluation, first place the RoboSeg dataset in a folder ${DATASET_DIR}, then:
cd robo_engine/robo_sam
python finetune.py --mode robot --dataset_dir ${DATASET_DIR} --save_dir ${SAVE_DIR}
python eval.py --mode robot --eval_type ${EVAL_SET} --version ${SAVE_CKPT_DIR}
where ${EVAL_SET} can be "test" or "zero", i.e., evaluation on the test set or on the zero-shot set collected from the internet.
(3) For policy training and deployment, please refer to Diffusion Policy and DROID Platform.
This repository builds on code from EVF-SAM, PBG-Diffusion, SKIL, ISAT-SegTool, GreenAug, DIL-ScalingLaw, GeneralFlow, the DROID Platform, and SAM2. We sincerely appreciate their contributions to the open-source community, which have significantly supported this project.
If you find this repository useful, please kindly cite our work:
@article{yuan2025roboengine,
title={RoboEngine: Plug-and-Play Robot Data Augmentation with Semantic Robot Segmentation and Background Generation},
author={Yuan, Chengbo and Joshi, Suraj and Zhu, Shaoting and Su, Hang and Zhao, Hang and Gao, Yang},
journal={arXiv preprint arXiv:2503.18738},
year={2025}
}

