Distributed Distributional DDPG (D4PG) training for a one-legged hopper robot (Monoped-V0) in Gazebo/ROS, using Acme and Reverb
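For context on the "distributional" part of D4PG: the critic predicts a categorical probability distribution over returns on a fixed support of atoms, and the Bellman target must be projected back onto that support. A minimal, illustrative sketch of that categorical projection in plain Python (no Acme/TensorFlow; this is the textbook C51-style projection, not code from this repo):

```python
# Illustrative sketch of the categorical projection used by D4PG's
# distributional critic. The Bellman backup r + gamma * z shifts each
# support atom; the resulting distribution is projected back onto the
# fixed support by splitting probability mass between neighbouring atoms.

def project_distribution(probs, support, reward, discount, v_min, v_max):
    """Project the Bellman-backed-up categorical distribution onto `support`."""
    n = len(support)
    delta_z = (v_max - v_min) / (n - 1)
    projected = [0.0] * n
    for p, z in zip(probs, support):
        # Apply the Bellman backup to this atom, then clip to the support range.
        tz = min(max(reward + discount * z, v_min), v_max)
        b = (tz - v_min) / delta_z          # fractional index into the support
        lower = int(b)
        upper = min(lower + 1, n - 1)
        if lower == upper:                  # tz landed exactly on the last atom
            projected[lower] += p
        else:                               # split mass between the neighbours
            projected[lower] += p * (upper - b)
            projected[upper] += p * (b - lower)
    return projected
```

With `reward=0` and `discount=1` the projection is the identity; a nonzero reward shifts mass toward higher atoms while conserving total probability.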
Pull the Docker image:

```shell
docker pull oceanthunder/hoppip-v0:latest
```

Run the image:
```shell
xhost +local:root
docker run -it \
    --rm \
    --gpus all \
    --name hoppip-v0 \
    -p 6006:6006 \
    --privileged \
    --env=DISPLAY \
    --env=QT_X11_NO_MITSHM=1 \
    -v /tmp/.X11-unix:/tmp/.X11-unix \
    oceanthunder/hoppip-v0 \
    /bin/bash
```

Launch the Gazebo simulation:
```shell
export LD_LIBRARY_PATH=/opt/ros/noetic/lib:/opt/ros/noetic/lib/x86_64-linux-gnu
roslaunch my_legged_robots_sims main.launch
```

In a new terminal:
```shell
docker exec -it hoppip-v0 /bin/bash
```

Train using D4PG:
```shell
roslaunch my_hopper_training d4pg.launch
```

If you want to use TD3, SAC, or A2C instead, modify the start_training_v2.py file inside my_hopper_training/src and then run:
```shell
roslaunch my_hopper_training main.launch
```

If you want to tweak the rewards, modify the file at /root/monoped_ws/src/my_hopper_training/config/learn_params.yaml.
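The actual keys in learn_params.yaml are specific to this repo; purely as a hypothetical illustration, a reward/learning config of this kind typically exposes knobs like these:

```yaml
# Hypothetical example only -- the real keys live in
# /root/monoped_ws/src/my_hopper_training/config/learn_params.yaml
# and may differ. Shown to indicate the kind of parameters usually tuned.
monoped:
  alive_reward: 5.0        # bonus per step the hopper stays upright
  distance_reward: 1.0     # weight on forward progress
  fall_penalty: -100.0     # one-off penalty when the robot falls
  gamma: 0.99              # discount factor
  max_episode_steps: 1000  # episode length cap
```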
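One way the algorithm switch inside start_training_v2.py could be structured is a small registry keyed by algorithm name, so swapping D4PG for TD3/SAC/A2C is a one-line edit. This is a hedged sketch, not the script's actual code: the constructors below are stand-in stubs where the real script would build the corresponding Acme agents.

```python
# Illustrative sketch only, NOT the real start_training_v2.py.
# A registry makes the choice of agent a single-string change; the
# lambdas here are stubs standing in for real agent constructors.

def make_agent(algorithm, env_spec):
    """Build the requested agent (stub) for the given environment spec."""
    registry = {
        "d4pg": lambda spec: ("D4PG", spec),
        "td3": lambda spec: ("TD3", spec),
        "sac": lambda spec: ("SAC", spec),
        "a2c": lambda spec: ("A2C", spec),
    }
    try:
        return registry[algorithm.lower()](env_spec)
    except KeyError:
        raise ValueError(f"unsupported algorithm: {algorithm!r}") from None

# Switching the trained agent is then a one-line edit:
agent = make_agent("td3", "monoped-v0")
```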
