This implementation requires Python 3 (>= 3.5).
We recommend creating a virtual environment for an easy installation of the dependencies, e.g. via virtualenv:
pip install virtualenv
Alternatively, create a new conda environment with:
conda create -n env_name python=3.6.7
Activate the environment and install the project dependencies listed in the requirements.txt file:
conda activate env_name
pip3 install -r requirements.txt
To check whether the installation worked, run one of the scripts in the examples directory, e.g.:
python3 examples/ppo/cartpole_swing_up/execute_model.py
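If the example fails, you can separate environment problems from model problems with a bare random-action rollout. The sketch below is a hypothetical smoke test, not code from this repo; it assumes that importing the quanser_robots package registers the environments (such as Qube-v0) with gym:

import gym
import quanser_robots  # assumption: importing this registers Qube-v0 etc. with gym

env = gym.make('Qube-v0')
obs = env.reset()
done, total_reward = False, 0.0
while not done:
    action = env.action_space.sample()       # random actions: smoke test only
    obs, reward, done, _ = env.step(action)
    total_reward += reward
    env.render()
env.close()
print('episode return:', total_reward)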
The algorithms are used as follows:
python <algorithm>_runner.py --env=Qube-v0 [additional arguments]
python ppo_runner.py --env=Qube-v0 --ppoepochs=5 --training_steps=1000 --horizon=1024 --hneurons=[64,64] --std=1.0 --minibatches=32 --lam=0.97 --gamma=0.95 --cliprange=0.2 --vfc=0.5 --lr=1e-3
python rs_runner.py --env=CartpoleSwingShort-v0 --alg=ars_v2 --ndeltas=8 --training_steps=100 --lr=0.015 --bbest=4 --horizon=1024 --snoise=0.025
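Since all runners share this command-line interface, small hyperparameter sweeps can be scripted. The snippet below is one hypothetical way to do it with the Python standard library; the flag names are taken from the examples above:

import subprocess

# Hypothetical sweep over the learning rate using the CLI shown above.
for lr in ['1e-3', '3e-4', '1e-4']:
    subprocess.run(['python3', 'ppo_runner.py',
                    '--env=Qube-v0',
                    '--training_steps=1000',
                    '--lr=' + lr],
                   check=True)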
Every implementation has its own model handler that provides saving and loading of models.
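The exact handler API differs per implementation; as a rough idea of what such a handler does, here is a minimal sketch. The class name and the assumption that models are PyTorch modules are ours, not the repo's:

import torch

class ModelHandler:
    """Minimal save/load helper (hypothetical; assumes PyTorch models)."""

    def __init__(self, path):
        self.path = path

    def save(self, model):
        # Persist only the weights rather than the full pickled object.
        torch.save(model.state_dict(), self.path)

    def load(self, model):
        # Restore weights into an already constructed model of the same shape.
        model.load_state_dict(torch.load(self.path))
        model.eval()
        return model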
The following command is an example of benchmarking PPO on Qube-v0 ten times, with the benchmarking visualized. A model path has to be provided so that a trained model can be loaded:
python3 ppo_runner.py --env=Qube-v0 --path=<model_path> --benchmark=True --vis=True --benchsteps=10
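Conceptually, benchmarking amounts to running the loaded policy for a number of evaluation episodes and aggregating the returns. The sketch below illustrates this with a random policy standing in for the loaded model; it is an assumption-laden illustration, not the repo's benchmark code:

import gym
import numpy as np
import quanser_robots  # assumption: registers Qube-v0 with gym

env = gym.make('Qube-v0')
returns = []
for _ in range(10):                           # analogous to --benchsteps=10
    obs, done, ret = env.reset(), False, 0.0
    while not done:
        action = env.action_space.sample()    # stand-in for the loaded policy
        obs, reward, done, _ = env.step(action)
        ret += reward
    returns.append(ret)
print('mean return: %.2f +/- %.2f' % (np.mean(returns), np.std(returns)))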
If numpy causes trouble, run pip3 uninstall numpy repeatedly until no version remains in your environment, then reinstall it with:
pip3 install numpy==1.16.0
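You can verify which numpy version ended up in the environment with:
python3 -c "import numpy; print(numpy.__version__)"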
- Thomas Lautenschläger
- Jan Rathjens