ROP-RAS3 is a fast online POMDP planner that searches for approximately optimal robot actions under transition and observation uncertainties. Its two key ingredients are:
- A transformation of the POMDP value function that allows the optimal solution, with respect to a reference policy, to be computed partially analytically. This transformation reduces the numerical computation in solving POMDPs to estimating expectations, which in turn enables the use of fast Monte Carlo techniques for estimation (rather than a combination of estimation and optimisation techniques) in POMDP planning.
- Vector-accelerated sampling-based motion planning (VAMP), which can compute sampling-based deterministic motion plans extremely quickly. This speed allows the rapid generation of diverse sets of macro-actions based on state-space information, forming reference policies that bias the search in the belief space.
The videos above demonstrate ROP-RAS3 handling complex planning problems that require a lookahead of more than 3000 steps. The robot does not fully observe its own configuration or the poses of the cylinders. It initially aims to gather information about the cylinders. Once it is more certain about the environment, as the belief particles (shown as yellow ghost objects) sharpen around the true states, it proceeds to remove only the necessary obstacles, leaving a valid path to retrieve the target cylinder (initially at the back) and place it at the intended location at the top of the shelf.
ROP-RAS3 is built on the pomdp-py and vamp repos.
We recommend using python==3.10 and uv for managing virtual environments and Python packages.
These installation instructions assume the following packages are already installed: cmake, gcc/g++, and git.
Once you have uv installed, create a virtual environment in the project root directory:
uv venv --python 3.10
Install the Python development packages:
sudo apt install python3-dev libeigen3-dev
To install the POMDP planning components:
uv pip install Cython
cd pomdp-py
uv pip install -e .
To install vamp (starting again from the project root):
cd vamp
uv pip install -e .
Finally, make sure your virtual environment is activated before running anything:
source .venv/bin/activate
For manipulation tasks, make sure you have compiled the Panda arm for vamp:
python vamp/resources/problem_tar_to_pkl_json.py --robot panda
To run an experiment with visualisation turned on:
python scripts/run_scripts.py --visualise --runs 1 --problem_name maze2d
Available problems include maze2d, light_dark, random3d, sphere_search, capture, table_ray_pick, shelf_search.
As another example, the following runs shelf_search once with 100 simulations per planning step and saves the video:
python scripts/run_scripts.py --problem_name shelf_search --runs 1 --visualise --file_logging --num_sims 100 --video_recording
Warning: --video_recording currently saves every video under the same name, so it overwrites existing ones.
The available command-line arguments include:
- --file_logging: record experiment statistics in the problem folder (e.g. pomdp-py/pomdp_py/problems/shelf_search/log)
- --runs: number of runs for this problem
- --problem_name: which problem to run
- --steps: maximum number of episode steps for the problem
- --planner_name: which planner to use (default: ROP-RAS3)
- --time_out: maximum planning time per step; use -1 for unlimited, in which case --num_sims must be set (see below)
- --no_pomdp: when ROP-RAS3 is selected, this flag turns off the POMDP planning, which yields belief VAMP
- --ref_policy_heuristic: select between "uniform" and "entropy"
- --finite_ref_actions: only useful when the planner is POMCP; turns on the RPOMCP baseline
- --video_recording: whether to record a video of each run
- --num_sims: number of simulated episodes to sample
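The dependency between --time_out and --num_sims described above can be checked before launching a run. The following is a minimal sketch of a launcher helper, not part of the repository: `build_run_command` is a hypothetical name, and it only assembles flags that are listed above.

```python
import shlex


def build_run_command(problem_name, runs=1, num_sims=None, time_out=None, extra_flags=()):
    """Assemble a scripts/run_scripts.py command line (hypothetical helper)."""
    if time_out == -1 and num_sims is None:
        # With unlimited planning time, a simulation budget must be given.
        raise ValueError("--time_out -1 requires --num_sims to be set")
    cmd = ["python", "scripts/run_scripts.py",
           "--problem_name", problem_name, "--runs", str(runs)]
    if time_out is not None:
        cmd += ["--time_out", str(time_out)]
    if num_sims is not None:
        cmd += ["--num_sims", str(num_sims)]
    cmd += list(extra_flags)
    # shlex.join quotes arguments where needed so the string is shell-safe.
    return shlex.join(cmd)


print(build_run_command("light_dark", runs=5, num_sims=200, time_out=-1,
                        extra_flags=["--file_logging"]))
```

The printed string can be pasted into a shell with the virtual environment activated.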
All vamp-related code is organised under vamp/ and all POMDP planning code is organised under pomdp-py/.
Within pomdp-py the file structure is,
pomdp_py/
├── __init__.py
├── __main__.py
├── algorithms/
│ ├── ... # POMDP planning algorithms, ROP-RAS3 is implemented here
├── framework/
│ ├── ... # base templates
├── problems/
│ ├── ... # problem definitions, each in its own subdirectory
├── representations/
│ ├── ... # particle beliefs and distribution functions
└── utils/
  ├── ... # auxiliary helper functions
There are two separate run pipelines, one for navigation tasks and one for manipulation tasks. They are located in pomdp-py/pomdp_py/problems/motion_planning/runner.py and pomdp-py/pomdp_py/problems/sphere_search/runner/runner.py, respectively.
You can create a new environment by creating a new folder in the pomdp-py/pomdp_py/problems/ directory.
We recommend the following layout. Your problem folder needs these components:
- domain/: all the files that define your POMDP models, including the actions, states, observations, observation model, transition model and reward classes. It should also contain a reference_policy_model that builds your own reference policy by querying vamp.
- env/: the actual physical environment, including obstacles, robots, walls, etc.
- problem.py: the entry point for running the experiment. It should put everything together, including the planning and the experiment pipeline.
- utils/: any helper classes and functions, typically the particle reinvigorator.
ROP-RAS3 uses particle beliefs and sampling importance resampling (SIR) to update the belief between planning steps. The key trade-off is the number of particles: more particles bring the approximation closer to the true distribution, but the SIR update takes much longer. We found that 500 to 1000 particles typically work well. Particle depletion may still occur. In the examples we provide, it most likely happens when the true state of the robot is near a terminal state (danger zone, goal, etc.), so that all particles have terminated while the true state has not. We therefore implement particle reinvigoration in each problem to generate new particles when necessary.
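As an illustration of the belief update described above, here is a minimal, generic SIR step with a reinvigoration fallback. It is a sketch under stated assumptions, not the repository's implementation: `sir_update`, `reinvigorate`, and the toy 1D models `move` and `gaussian_likelihood` are hypothetical names chosen for this example.

```python
import math
import random


def sir_update(particles, transition, obs_likelihood, action, observation, rng):
    """One sampling importance resampling (SIR) step over a particle belief."""
    # 1. Propagate every particle through the stochastic transition model.
    proposed = [transition(s, action, rng) for s in particles]
    # 2. Weight each propagated particle by the observation likelihood.
    weights = [obs_likelihood(observation, s, action) for s in proposed]
    if sum(weights) <= 0.0:
        # Particle depletion: no particle explains the observation.
        return []
    # 3. Resample with replacement, proportionally to the weights.
    return rng.choices(proposed, weights=weights, k=len(particles))


def reinvigorate(particles, n_target, sample_fresh, rng):
    """Top the belief back up to n_target particles with fresh samples."""
    missing = max(0, n_target - len(particles))
    return particles + [sample_fresh(rng) for _ in range(missing)]


# Toy 1D problem (assumed for illustration): the state is a position.
def move(state, action, rng):
    return state + action + rng.gauss(0.0, 0.1)


def gaussian_likelihood(observation, state, action):
    return math.exp(-0.5 * ((observation - state) / 0.2) ** 2)


rng = random.Random(0)
belief = [rng.uniform(-1.0, 1.0) for _ in range(500)]
belief = sir_update(belief, move, gaussian_likelihood,
                    action=1.0, observation=1.3, rng=rng)
belief = reinvigorate(belief, 500, lambda r: r.uniform(0.0, 2.0), rng)
mean = sum(belief) / len(belief)
print(f"{len(belief)} particles, mean position {mean:.2f}")
```

Each update costs time linear in the number of particles, which is the trade-off mentioned above. The `sample_fresh` callback stands in for a problem-specific reinvigorator.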
In general, we found that setting the search depth to roughly the number of actions needed to reach the goal in the deterministic setting is a good starting point. The optimal search depth may need to be somewhat larger to account for uncertainty.
For environments with very long horizons, a discount factor of 0.99 is generally not enough. Set the discount to 0.999 or even higher, but keep it below 1.
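The advice on the discount factor follows from simple arithmetic: a reward received at step t is scaled by the discount raised to the power t. The sketch below evaluates that weight at step 3000, the lookahead scale mentioned earlier; `discount_weight` is a hypothetical helper name.

```python
def discount_weight(gamma, step):
    """Weight applied to a reward received `step` steps in the future."""
    return gamma ** step


for gamma in (0.99, 0.999, 0.9999):
    weight = discount_weight(gamma, 3000)
    print(f"gamma={gamma}: weight at step 3000 = {weight:.3e}")
```

With 0.99 the weight is on the order of 1e-13, so a goal reward 3000 steps away is effectively invisible to the planner. With 0.999 it is roughly 0.05, and with 0.9999 roughly 0.74.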
We also support reading log files and replaying them in pybullet:
python scripts/log_replay.py --problem [problem for the log] --log [the intended log file path] --video_recording
If you found this research useful for your own work, please use the following citations:
@inproceedings{Liang2026Thinking,
title = {Think Fast and Far: Long-Horizon Online POMDP Planning via Rapid State Sampling},
author = {Yuanchu Liang and Edward Kim and J.Arden Knoll and Wil Thomason and Zachary Kingston and Lydia E. Kavraki and Hanna Kurniawati},
year = 2026,
booktitle = {International Journal of Robotics Research (to appear)}
}
@inproceedings{Liang2024Scaling,
title = {Scaling Long-Horizon Online {POMDP} Planning via Rapid State Space Sampling},
author = {Yuanchu Liang and Edward Kim and Wil Thomason and Zachary Kingston and Hanna Kurniawati and Lydia E. Kavraki},
year = 2024,
booktitle = {International Symposium on Robotics Research}
}