
Proximal Policy Optimization (PPO) with PyTorch

Overview

This repository provides a clean, modular implementation of Proximal Policy Optimization (PPO) in PyTorch, designed to help beginners understand and experiment with reinforcement learning algorithms. It supports both continuous and discrete action spaces, demonstrated on environments from OpenAI Gym, and its structure makes it straightforward to adapt to custom environments. A minimal sketch of PPO's core update follows the feature list below.

Key Features:

  • Modular and easy-to-understand code
  • Supports both continuous and discrete action spaces
  • YAML-based configuration for managing hyperparameters
  • Out-of-the-box compatibility with OpenAI Gym environments
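
For reference, the core training update in PPO maximizes a clipped surrogate objective. The helper below is a minimal PyTorch sketch of that loss; the function name and signature are illustrative and not taken from this repository's code.

import torch

def ppo_clipped_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    # Probability ratio r = pi_new(a|s) / pi_old(a|s), computed in log space for stability
    ratios = torch.exp(new_log_probs - old_log_probs)
    # Unclipped and clipped surrogate terms
    surr1 = ratios * advantages
    surr2 = torch.clamp(ratios, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # PPO maximizes the minimum of the two; negate so it can be minimized with gradient descent
    return -torch.min(surr1, surr2).mean()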

Getting Started

Installation

Clone the repository

git clone https://github.com/saqib1707/RL-PPO-PyTorch.git
cd RL-PPO-PyTorch

Dependencies

To run this code, you need the following dependencies:

  • torch
  • numpy
  • gym
  • pygame
  • box2d
  • box2d-py

Create a virtual environment and install the dependencies:

  1. Create a virtual environment:
python -m venv /path/to/venv/directory
  2. Activate the virtual environment:
source /path/to/venv/directory/bin/activate
  3. Install the required dependencies:
pip install -r requirements.txt

Note: For environments like LunarLander-v2 and BipedalWalker, make sure you have swig and box2d installed.

  1. Install swig on macOS or Linux:

For macOS:

brew install swig

For Linux:

apt-get install swig

  2. Install box2d:

pip install box2d
pip install box2d-py

Note: Gymnasium (the successor to OpenAI Gym) supports Python versions up to 3.11. There have been issues reported with installing gym[box2d] on Python 3.8, 3.9, and 3.10.
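
After installation, a quick sanity check like the one below should run without errors. CartPole-v1 is used only as an example and the classic gym API is assumed; newer gym versions return (obs, info) from reset().

import torch
import gym

env = gym.make("CartPole-v1")
obs = env.reset()  # on gym >= 0.26 this returns (obs, info)
print("torch:", torch.__version__, "| observation space:", env.observation_space.shape)
env.close()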

Usage

Environments

Supported environments include:

  • CartPole
  • LunarLander
  • Walker2d
  • HalfCheetah
  • BipedalWalker

Configuration files for each environment are located in the configs/ directory. These can be customized to adjust hyperparameters for each run.
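
As an illustration, a CartPole configuration could look roughly like the following. The mode, hidden_dim, and gamma keys mirror the override example further below; every other key and value shown here is an assumption about typical PPO hyperparameters, not the repository's actual file.

env_name: CartPole-v1
mode: train          # train or test
hidden_dim: 64       # width of the policy/value hidden layers
gamma: 0.99          # discount factor
lr: 0.0003           # learning rate (illustrative)
clip_eps: 0.2        # PPO clipping parameter (illustrative)
max_episodes: 1000   # training length (illustrative)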

Training

Run the training script with the desired environment configuration:

python launcher.py --config_path="../configs/config_cartpole.yaml"

For other environments, simply modify the config path, for example:

python launcher.py --config_path="../configs/config_lunarlander.yaml"

To run experiments with modified hyperparameters, you can override the default settings from the YAML file using the --override flag:

python launcher.py --config_path="../configs/config_cartpole.yaml" --override "mode=test" "hidden_dim=256" "gamma=0.95"
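
Conceptually, the override mechanism amounts to merging key=value pairs on top of the loaded YAML dictionary. The snippet below is a minimal sketch of that idea, not the actual launcher.py code.

import yaml

def load_config(config_path, overrides=()):
    # Load the base YAML configuration
    with open(config_path) as f:
        config = yaml.safe_load(f)
    # Apply "key=value" overrides on top of the defaults
    for item in overrides:
        key, value = item.split("=", 1)
        config[key] = yaml.safe_load(value)  # parses numbers, booleans, and strings
    return config

# e.g. load_config("configs/config_cartpole.yaml", ["gamma=0.95", "hidden_dim=256"])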

Results

To be updated soon...

Contributing

Contributions are welcome! If you have suggestions for improving the code or adding new features, feel free to submit a pull request or open an issue.

Citing

If you use this repository in your research, please consider citing it:

@misc{ppo_pytorch,
    author = {Azim, Saqib},
    title = {Proximal Policy Optimization using PyTorch},
    year = {2024},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/saqib1707/RL-PPO-PyTorch}},
}

Contact

Feel free to reach out with any questions or suggestions:

Email: azimsaqib10@gmail.com
