This repository contains the code for our IROS'20 paper *Efficient Exploration in Constrained Environments with Goal-Oriented Reference Path*.
```bash
# safety-gym from keiohta's forked version
$ git clone git@github.com:keiohta/safety-gym.git
$ cd safety-gym
$ pip install -e .
```
```bash
$ mkdir -p ~/.mujoco
$ cd ~/.mujoco
$ wget https://www.roboti.us/download/mujoco200_linux.zip
$ unzip mujoco200_linux.zip
$ mv mujoco200_linux mujoco200

# Extend LD_LIBRARY_PATH with mujoco:
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.mujoco/mujoco200/bin

# Now install mujoco-py via pip:
$ pip install mujoco-py
```
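If the installation succeeded, importing the bindings compiles them on first use. A minimal sanity check (run it in the same shell, since `LD_LIBRARY_PATH` must be set; to make the setting persistent, append the `export` line above to your shell profile):

```python
# Sanity check: importing mujoco_py triggers its one-time compilation step.
# A failure here usually means LD_LIBRARY_PATH does not include
# ~/.mujoco/mujoco200/bin in the current shell.
import mujoco_py
print(mujoco_py.__version__)
```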
```bash
$ cd safety_rl
$ pip install -r requirements.txt
$ export PYTHONPATH=$PYTHONPATH:$PWD
```
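A quick smoke test that both safety-gym and the `PYTHONPATH` setup work. `Safexp-PointGoal1-v0` is one of safety-gym's standard environment IDs; any registered safety-gym environment works here:

```python
# Smoke test: importing safety_gym registers the Safexp-* environments,
# then we build one and take a single random step.
import gym
import safety_gym  # noqa: F401

env = gym.make('Safexp-PointGoal1-v0')
obs = env.reset()
obs, reward, done, info = env.step(env.action_space.sample())
print(obs.shape, reward, done, info.get('cost'))
```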
The path-generation script accepts the following options:

- `--hazards-num`: number of hazards to place.
- `--field-size`: defines the min/max size of the field. If you specify `1`, the size becomes `config["placements_extents"] = [-1, -1, 1, 1]` (see the sketch after the command below). The default is `2`.
- `--resolution`: resolution used to plan a path. The default is `0.1`.
```bash
$ python examples/generate_optimal_path.py
```
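As a rough illustration of the `--field-size` option above (a minimal sketch; `placements_extents` is the safety-gym config key quoted in the option description, while the helper function itself is hypothetical and not part of this repository):

```python
# Hypothetical illustration of --field-size: a size s yields a square
# placement area [-s, -s, s, s] in the safety-gym engine config.
def field_size_to_extents(field_size):
    return [-field_size, -field_size, field_size, field_size]

config = {"placements_extents": field_size_to_extents(1)}
assert config["placements_extents"] == [-1, -1, 1, 1]  # matches the example above
```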
Generate a dataset for training the waypoints generator.

```bash
$ python examples/generate_dataset.py --save-data --dataset-size 50000
$ python examples/generate_dataset.py --save-data --dataset-size 10000 --evaluate
```
Train a model with the dataset generated above.

```bash
$ python examples/train_cnn.py --epochs 100 --lr 0.0001

# Evaluate the trained model
$ python examples/train_cnn.py --rollout-only --show-test-progress

$ python examples/train_cnn_rl_like.py --n-warm-up=10000 --show-test-progress --test-env-interval 10000
```
Train the SAC agent with the waypoints generator; pass `--evaluate` to test a trained model.

```bash
$ python examples/rl/run_sac_waypoints_generator.py
$ python examples/rl/run_sac_waypoints_generator.py --evaluate --model-dir /path/to/results --test-episodes 10 --show-test-progress
```
Run the unit tests.

```bash
$ python -m unittest discover -v
```
Generate the following datasets for training the waypoints generators:

- `pillars_2_10`: for Exp. 6.A and 6.B
- `pillars_3_25`: for Exp. 6.C
- `pillars_4_40`: for Exp. 6.C
- `gremlins_2_10`: for Exp. 6.C
- `two_room`: for Exp. 6.C
- `four_room`: for Exp. 6.C
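The dataset names appear to encode an obstacle type, a field size, and an obstacle count (e.g. `pillars_3_25` lines up with `--field-size 3 --pillars-num 25` in the evaluation commands further below). A hypothetical decoding sketch, not a helper from this repository:

```python
# Hypothetical: decode "pillars_3_25" -> ("pillars", 3, 25). The room
# layouts ("two_room", "four_room") carry no size/count suffix.
def parse_dataset_name(name):
    parts = name.split('_')
    if len(parts) >= 3 and parts[-2].isdigit() and parts[-1].isdigit():
        return '_'.join(parts[:-2]), int(parts[-2]), int(parts[-1])
    return name, None, None

assert parse_dataset_name('pillars_3_25') == ('pillars', 3, 25)
assert parse_dataset_name('two_room') == ('two_room', None, None)
```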
Generate the datasets and train the CNN models for all environments:

```bash
$ python examples/all_generate_dataset.py --run
$ python examples/all_train_cnn.py --run
```
# "ours" on MCS
$ python examples/rl/all_envs_ours.py --run
# "baseline" on MCS
$ python examples/rl/all_envs_baseline.py --run
Make graphs that show the learning curves.

```bash
$ python examples/rl/make_compare_graph.py -i ../safetyrl_results/dataset/pillars_2_10 --legend --color
```
Visually evaluate the trained model.

```bash
$ python examples/rl/run_sac_waypoints_generator.py --evaluate --root-dir ../safetyrl_results/ --show-test-progress --robot-type doggo
```
Qualitatively evaluate the trained model.

```bash
# Evaluate the performance of the trained model on various environments on MCS
$ python examples/rl/evaluate_generalization.py
```
Visually evaluate the fine-tuned model in each environment.

```bash
# pillars (3, 3, 25)
$ python examples/rl/run_sac_waypoints_generator.py --evaluate --root-dir ../safetyrl_results/ --show-test-progress --robot-type doggo --fine-tuning --field-size 3 --pillars-num 25

# pillars (4, 4, 40)
$ python examples/rl/run_sac_waypoints_generator.py --evaluate --root-dir ../safetyrl_results/ --show-test-progress --robot-type doggo --fine-tuning --field-size 4 --pillars-num 40

# two-room
$ python examples/rl/run_sac_waypoints_generator.py --evaluate --root-dir ../safetyrl_results/ --show-test-progress --robot-type doggo --fine-tuning --place-room --room-type 0

# gremlins
$ python examples/rl/run_sac_waypoints_generator.py --evaluate --root-dir ../safetyrl_results/ --show-test-progress --robot-type doggo --fine-tuning --dummy-gremlins --gremlins-num 10
```
If you use the software, please cite the following (TR2020-141):
```bibtex
@inproceedings{ota2020efficient,
  author    = {Ota, Kei and Sasaki, Yoko and Jha, Devesh K. and Yoshiyasu, Yusuke and Kanezaki, Asako},
  title     = {Efficient Exploration in Constrained Environments with Goal-Oriented Reference Path},
  booktitle = {2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year      = {2020},
  pages     = {6061--6068},
  publisher = {IEEE},
  doi       = {10.1109/IROS45743.2020.9341620},
  url       = {https://ieeexplore.ieee.org/abstract/document/9341620}
}
```
Please contact Devesh Jha at jha@merl.com.
See CONTRIBUTING.md for our policy on contributions.
Released under the `AGPL-3.0-or-later` license, as found in the LICENSE.md file.
All files:
Copyright (C) 2021, 2023 Mitsubishi Electric Research Laboratories (MERL).
SPDX-License-Identifier: AGPL-3.0-or-later