A collection of algorithms and experiment tools for safe sim-to-real transfer and learning in robotics.
- Three different CMDP solvers, CRPO, Saute-RL and primal-dual, compatible with (variants of) Brax's SAC, MBPO and PPO.
- Algorithm implementations are interchangeable between training in simulation and training on real robots via OnlineEpisodeOrchestrator. Check out rccar_experiments for a full example. Support for training online on any real robot supported by MuJoCo Playground, including Unitree Go1/2.
- Fast training. Full compatibility with MuJoCo Playground. Reimplementation of OpenAI's Safety Gym in MJX and safety tasks from the Real-World RL suite.
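To make the primal-dual idea mentioned above concrete, here is a minimal, generic sketch of a Lagrangian update for a CMDP. It is not taken from this repository: the function names, the step size, and the cost budget are illustrative assumptions, and the actual solvers here may differ in detail.

```python
def penalized_reward(reward: float, cost: float, lagrange_multiplier: float) -> float:
    """Reward seen by the policy: the task reward minus the weighted cost."""
    return reward - lagrange_multiplier * cost


def update_multiplier(
    lagrange_multiplier: float,
    mean_episode_cost: float,
    cost_budget: float,
    step_size: float = 0.05,
) -> float:
    """Dual ascent step: raise the multiplier when the budget is exceeded.

    The result is clipped at zero so that a satisfied constraint never
    turns into a bonus for incurring cost.
    """
    violation = mean_episode_cost - cost_budget
    return max(0.0, lagrange_multiplier + step_size * violation)


# Illustrative loop with made-up cost measurements (hypothetical numbers).
lam = 0.0
for mean_cost in [30.0, 28.0, 26.0, 24.0]:
    lam = update_multiplier(lam, mean_cost, cost_budget=25.0, step_size=0.1)
```

The multiplier grows while the measured cost stays above the budget and shrinks once it drops below, which is the basic feedback mechanism that primal-dual methods rely on.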
- Python == 3.11.6
- uv (recommended) or the built-in venv
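Because the Python requirement above is pinned to an exact version, a quick check before installing can save a failed setup. The snippet below is a small sketch; `python_version_matches` is a hypothetical helper and not part of this repository.

```python
import sys

# Version pinned in the requirements list above.
REQUIRED_PYTHON = (3, 11, 6)


def python_version_matches(actual=None, required=REQUIRED_PYTHON) -> bool:
    """Return True if the (major, minor, micro) version equals the pinned one."""
    if actual is None:
        actual = sys.version_info[:3]
    return tuple(actual) == tuple(required)


if __name__ == "__main__":
    if python_version_matches():
        print("Python version matches the pinned requirement.")
    else:
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in REQUIRED_PYTHON)
        print(f"Found Python {found}, but {wanted} is required.")
```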
To install with the built-in venv:

    git clone https://github.com/yardenas/safe-learning
    cd safe-learning
    python3 -m venv venv
    source venv/bin/activate
    pip install -e .

Install uv if it is not already available:

    curl -LsSf https://astral.sh/uv/install.sh | sh

Create a project environment and install dependencies:
    git clone https://github.com/yardenas/safe-learning
    cd safe-learning
    uv sync
    uv run python --version  # sanity check, optional

Some benchmarks (e.g., the MJX-based pick-and-place tasks) require the custom madrona_mjx fork. Build and install it inside the uv environment you created above:
- From the parent directory of safe-sim2real, clone the repository and check out the tested commit:

      git clone https://github.com/shacklettbp/madrona_mjx.git
      cd madrona_mjx
      git checkout c34f3cf6d95148dba50ffeb981aea033b8a4d225
      git submodule update --init --recursive

- Configure and build (disable Vulkan if you do not have it available):

      mkdir -p build
      cd build
      cmake -DLOAD_VULKAN=OFF ..
      cmake --build . -j
      cd ..

- With your environment activated, install the Python bindings into your uv environment:

      uv pip install -e .
Refer to the upstream repository for platform-specific prerequisites (CUDA,
Vulkan, compiler versions). Re-run uv pip install -e . whenever you rebuild the
library.
Troubleshooting tips
- If you see CUDA out-of-memory (OOM) errors immediately after the build, try export MADRONA_DISABLE_CUDA_HEAP_SIZE=1 before launching training.
- Populate the kernel caches to avoid recompilation on every run:

      export MADRONA_MWGPU_KERNEL_CACHE=/path/to/cache/mwgpu
      export MADRONA_BVH_KERNEL_CACHE=/path/to/cache/bvh
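If you launch training from a Python script instead of a shell, the two kernel-cache variables above can be set programmatically. The sketch below assumes the variables must be in the environment before Madrona is loaded, and that each variable points to a path inside an existing directory. `configure_madrona_caches` is a hypothetical helper, not part of this repository or of Madrona.

```python
import os
from pathlib import Path


def configure_madrona_caches(cache_root) -> dict:
    """Point both Madrona kernel caches at paths under `cache_root`.

    Only the root directory is created; the cache entries themselves are
    left for Madrona to populate (assumption: it writes them on first run).
    """
    root = Path(cache_root).expanduser().resolve()
    root.mkdir(parents=True, exist_ok=True)
    settings = {
        "MADRONA_MWGPU_KERNEL_CACHE": str(root / "mwgpu"),
        "MADRONA_BVH_KERNEL_CACHE": str(root / "bvh"),
    }
    os.environ.update(settings)
    return settings


if __name__ == "__main__":
    # Replace /path/to/cache with a directory of your choice.
    for name, value in configure_madrona_caches("/path/to/cache").items():
        print(f"{name}={value}")
```

Call the helper at the top of your entry point, before any import that might initialize Madrona, so the cached kernels can be reused on later runs.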
Our code uses Hydra to configure experiments. Each experiment is defined as a YAML file in ss2r/configs/experiments. For example, to train a Unitree Go1 policy with a constraint on joint limits:

    python train_brax.py +experiment=go1_sim_to_real

- Policies (in ONNX format) used for the Unitree Go1 experiments can be found in ss2r/docs/policies.
- In ss2r/docs/videos you can find videos of 5 trials for each policy, marked by its policy id.
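Since each experiment is a YAML file selected with a `+experiment=<name>` override, the available experiments can be listed and turned into training commands with a few lines of Python. This is a sketch only: `list_experiments` and `build_train_command` are hypothetical helpers, and the sketch assumes that the experiment name is the file name without its `.yaml` extension.

```python
from pathlib import Path


def list_experiments(config_dir) -> list:
    """Return experiment names (YAML file stems) found in `config_dir`."""
    return sorted(path.stem for path in Path(config_dir).glob("*.yaml"))


def build_train_command(experiment: str, extra_overrides=()) -> list:
    """Build the argv list for one training run.

    `extra_overrides` holds any additional Hydra overrides you want to
    pass through unchanged; none are required.
    """
    return ["python", "train_brax.py", f"+experiment={experiment}", *extra_overrides]


if __name__ == "__main__":
    # Run from the repository root so the relative path resolves.
    for name in list_experiments("ss2r/configs/experiments"):
        print(" ".join(build_train_command(name)))
```

The resulting list can be passed to `subprocess.run` if you prefer to launch runs from Python instead of printing them.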
If you find our repository useful in your work, please consider citing:
@inproceedings{
as2025spidrsimpleapproachzeroshot,
title={{SP}i{DR}: A Simple Approach for Zero-Shot Safety in Sim-to-Real Transfer},
author={Yarden As and Chengrui Qu and Benjamin Unger and Dongho Kang and Max van der Hart and Laixi Shi and Stelian Coros and Adam Wierman and Andreas Krause},
booktitle={International Conference on Neural Information Processing Systems},
year={2025},
}