- merged discrete and continuous algorithms
- added linear decaying for the continuous action space `action_std`, to make training more stable for complex environments
- added different learning rates for actor and critic
- episodes, timesteps and rewards are now logged in `.csv` files
- utils to plot graphs from log files
- utils to test and make gifs from preTrained networks
- `PPO_colab.ipynb` combining all the files to train / test / plot graphs / make gifs on Google Colab in a convenient jupyter-notebook
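The linear `action_std` decay mentioned above can be sketched as follows. This is a minimal illustration, not the repository's exact code; the function name and default values are assumptions.

```python
def decay_action_std(current_std, decay_rate, min_std):
    """Linearly decay the exploration std, clamped at a floor.

    Rounding each step keeps the schedule free of float drift.
    """
    new_std = round(current_std - decay_rate, 4)
    return max(new_std, min_std)

# e.g. called every N timesteps during training:
std = 0.6
for _ in range(10):
    std = decay_action_std(std, decay_rate=0.05, min_std=0.1)
# std has reached the floor of 0.1
```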
This repository provides a minimal PyTorch implementation of Proximal Policy Optimization (PPO) with a clipped objective for OpenAI Gym environments. It is primarily intended for beginners learning the PPO algorithm. It can still be used for complex environments, but may require some hyperparameter tuning or changes to the code.
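The clipped surrogate objective at the heart of PPO can be sketched as below. This is an illustrative sketch, not the repository's exact code; tensor names and the learning-rate values are assumptions.

```python
import torch
import torch.nn as nn

def ppo_clip_loss(log_probs, old_log_probs, advantages, clip_eps=0.2):
    """PPO clipped surrogate loss (to be minimized)."""
    ratios = torch.exp(log_probs - old_log_probs)  # pi_theta / pi_theta_old
    surr1 = ratios * advantages
    surr2 = torch.clamp(ratios, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # take the pessimistic (clipped) surrogate, negate for gradient descent
    return -torch.min(surr1, surr2).mean()

# Different learning rates for actor and critic can be set via optimizer
# param groups (the networks and lr values here are placeholders):
actor, critic = nn.Linear(4, 2), nn.Linear(4, 1)
optimizer = torch.optim.Adam([
    {"params": actor.parameters(), "lr": 3e-4},
    {"params": critic.parameters(), "lr": 1e-3},
])
```

Clipping the probability ratio keeps each policy update close to the old policy, which is what makes PPO stable without a trust-region constraint.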
To keep the training procedure simple:
- It uses a constant standard deviation for the output action distribution (a multivariate normal with diagonal covariance matrix) in continuous environments, i.e. `action_std` is a hyperparameter and NOT a trainable parameter. However, it is linearly decayed over training. (`action_std` significantly affects performance.)
- It uses a simple Monte Carlo estimate for calculating returns and NOT Generalized Advantage Estimation (check out the OpenAI Spinning Up implementation for that).
- It is a single-threaded implementation, i.e. only one worker collects experience. One of the older forks of this repository has been modified to have parallel workers.
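The Monte Carlo return estimate mentioned above can be sketched as a discounted reward-to-go computed backwards over the collected rollout, resetting at episode boundaries. A minimal sketch (function and variable names are illustrative):

```python
def monte_carlo_returns(rewards, is_terminals, gamma=0.99):
    """Discounted reward-to-go, resetting at episode boundaries.

    Iterates backwards so each return accumulates only future rewards
    within the same episode.
    """
    returns = []
    discounted = 0.0
    for reward, done in zip(reversed(rewards), reversed(is_terminals)):
        if done:
            discounted = 0.0  # new episode starts here (going backwards)
        discounted = reward + gamma * discounted
        returns.insert(0, discounted)
    return returns
```

In practice these returns are usually normalized before being used as advantage targets.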
A concise explanation of the PPO algorithm can be found here
- To train a new network: run `train.py`
- To test a preTrained network: run `test.py`
- To plot graphs using log files: run `plot_graph.py`
- To save images for a gif and make the gif using a preTrained network: run `make_gif.py`
- All parameters and hyperparameters to control training / testing / graphs / gifs are in their respective `.py` files
- `PPO_colab.ipynb` combines all the files in a jupyter-notebook
- All the hyperparameters used for training are listed in the `README.md` in the PPO_preTrained directory
Please use this BibTeX if you want to cite this repository in your publications:
@misc{pytorch_minimal_ppo,
author = {Barhate, Nikhil},
title = {Minimal PyTorch Implementation of Proximal Policy Optimization},
year = {2021},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/nikhilbarhate99/PPO-PyTorch}},
}
Result gifs of trained agents (side-by-side animations in the original tables):

- PPO Continuous RoboschoolWalker2d-v1
- PPO Continuous BipedalWalker-v2
- PPO Discrete CartPole-v1
- PPO Discrete LunarLander-v2
Trained and Tested on:
- Python 3
- PyTorch
- NumPy
- gym
- Pillow

Training Environments:
- Roboschool
- pybullet

Graphs and gifs:
- pandas
- matplotlib
- Pillow