Implementation of the paper "Real-time Policy Distillation in Deep Reinforcement Learning"
Python 3.8+ and Ray RLlib with the PyTorch backend are the two main requirements. The full conda environment specification (with the specific library versions) can be found in the environment.yml file.
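For example, the environment can typically be created from that file with:
conda env create -f environment.yml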
In file scripts/models.py the policy classes that encapsulate both the teacher and all student networks are defined. The same file also defines the custom loss (which combines the Q-loss, both versions of the KL loss, and the imitation Q-loss) and the custom evaluation function (which evaluates both the teacher and all student networks on each iteration). A hedged sketch of such a combined loss is given below.
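To give a rough idea of such a combined objective, here is a minimal PyTorch sketch, not the repository's actual code: the tensor names, the temperature tau, the loss weights, and the use of an MSE imitation term are illustrative assumptions, and only one of the two KL variants is shown.

```python
# Hedged sketch of the kind of combined loss described above; names, weighting
# and temperature handling are illustrative, not the repository's exact code.
import torch
import torch.nn.functional as F


def distillation_loss(student_q, teacher_q, td_target, actions,
                      tau=0.01, kl_weight=1.0, imitation_weight=1.0):
    """Combine the standard Q-loss with distillation terms.

    student_q, teacher_q: [batch, num_actions] Q-value estimates.
    td_target:            [batch] bootstrapped targets for the taken actions.
    actions:              [batch] actions taken by the behaviour policy.
    """
    # Standard (Huber) Q-loss on the student's Q-value of the taken action.
    q_taken = student_q.gather(1, actions.unsqueeze(1)).squeeze(1)
    q_loss = F.smooth_l1_loss(q_taken, td_target)

    # KL divergence between softened teacher and student action distributions
    # (one of the two KL variants; the other variant is analogous).
    teacher_dist = F.softmax(teacher_q / tau, dim=1)
    student_log_dist = F.log_softmax(student_q / tau, dim=1)
    kl_loss = F.kl_div(student_log_dist, teacher_dist, reduction="batchmean")

    # Imitation Q-loss: regress the student's Q-values towards the teacher's.
    imitation_loss = F.mse_loss(student_q, teacher_q)

    return q_loss + kl_weight * kl_loss + imitation_weight * imitation_loss
```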
In file scripts/trainer.py the trainer is defined; it inherits from the Ray implementation of the APEX algorithm (see the sketch below for one way such a trainer can be derived).
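As orientation only, the following is a minimal sketch of deriving such a trainer, assuming a Ray 1.x-era RLlib where policies and trainers built with build_torch_policy/build_trainer expose with_updates; the class names, the loss function body, and the exact keyword arguments are illustrative, not the repository's actual code.

```python
# Hedged sketch, assuming a Ray 1.x-era RLlib API; names are illustrative.
from ray.rllib.agents.dqn import ApexTrainer
from ray.rllib.agents.dqn.dqn_torch_policy import DQNTorchPolicy, build_q_losses


def distillation_loss_fn(policy, model, dist_class, train_batch):
    # Start from RLlib's standard DQN Q-loss and add the distillation terms
    # (KL and imitation Q-losses between teacher and student outputs) on top.
    loss = build_q_losses(policy, model, dist_class, train_batch)
    # ...distillation terms computed from teacher/student Q-values would go here...
    return loss


# A DQN torch policy with the custom loss plugged in.
DistilledPolicy = DQNTorchPolicy.with_updates(
    name="DistilledPolicy",
    loss_fn=distillation_loss_fn,
)

# A trainer that reuses the APEX execution plan but with the custom policy.
DistilledApexTrainer = ApexTrainer.with_updates(
    name="DistilledApexTrainer",
    default_policy=DistilledPolicy,
    get_policy_class=lambda config: DistilledPolicy,
)
```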
File scripts/main.py is the entry point to the training process; a rough sketch of what such an entry point looks like is given below.
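The following is a hedged sketch of a typical entry point of this kind, not the actual scripts/main.py: it parses the --config flag used in the command further below, loads the YAML config, and hands it to an APEX-style trainer (the repository would use its own trainer class from scripts/trainer.py instead of the stock ApexTrainer).

```python
# Hedged sketch of a typical entry point; the real scripts/main.py may differ.
import argparse

import ray
import yaml
from ray import tune
from ray.rllib.agents.dqn import ApexTrainer  # stand-in for the custom APEX-derived trainer


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", required=True, help="path to the YAML RLlib config file")
    args = parser.parse_args()

    # The config file is a plain RLlib/APEX config serialized as YAML.
    with open(args.config) as f:
        config = yaml.safe_load(f)

    ray.init()
    # Launch training; Ray writes checkpoints and TensorBoard event files
    # to ~/ray_results by default.
    tune.run(ApexTrainer, config=config)
```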
File plots.ipynb contains the reproduction of all tables and figures from the report (and the code that can be used to compute these values for any other trained model/game).
To run a training process, first create a config file (the three config files that were used for the project can be found in the configs folder). It should be a YAML file containing a common ray.rllib config, so it accepts any field understood by ray.rllib and used to tune the behaviour of the ray.rllib.agents.dqn.ApexTrainer trainer; an illustrative sketch of typical fields is given after the command below. Run training with the following command from the root of the repository:
python -m scripts.main --config path_to_config_file
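The YAML file is simply a serialized RLlib config. As a rough, hedged illustration only (the key names are standard RLlib/APEX options, but the particular keys and values shown here are placeholders, not the project's actual settings), the loaded config corresponds to a Python dict such as:

```python
# Illustrative only: standard RLlib/APEX config keys with placeholder values.
config = {
    "env": "PongNoFrameskip-v4",      # any Gym/Atari environment id
    "framework": "torch",             # the project uses the PyTorch backend
    "num_workers": 8,                 # number of APEX rollout workers
    "num_gpus": 1,
    "lr": 1e-4,
    "n_step": 3,
    "target_network_update_freq": 50000,
}
```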
File plots.ipynb contains the bare minimum of code required to calculate the results presented in the report. It works with the TensorBoard log files that are generated during the training process. We attach the TensorBoard event files of our experiments, which can be accessed via this link (these files are too large for the git repo); to use them in plots.ipynb, download them to your computer and specify the variable EVENT_FILE_PATH in the second cell of the notebook. A hedged sketch of reading such event files is given below.
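For orientation, TensorBoard event files of this kind can be read with TensorBoard's EventAccumulator. The sketch below is an assumption about how one might do this outside the notebook; in particular, the scalar tag "episode_reward_mean" is only an example and the available tags depend on what RLlib logged.

```python
# Hedged sketch of reading scalars from a TensorBoard event file.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

EVENT_FILE_PATH = "path/to/downloaded/events.out.tfevents.<...>"  # set to the downloaded file

acc = EventAccumulator(EVENT_FILE_PATH)
acc.Reload()  # parse the event file

print(acc.Tags()["scalars"])  # list the available scalar tags

# "episode_reward_mean" is an example tag; pick one from the printed list.
for event in acc.Scalars("episode_reward_mean"):
    print(event.step, event.value)
```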