
Car Racing PPO

This repository provides an implementation of the Proximal Policy Optimization (PPO) algorithm for training an agent to master the CarRacing-v3 environment from Gymnasium, using Stable-Baselines3. It aims to offer a straightforward yet powerful setup for both training and evaluating reinforcement learning agents in continuous control tasks.

Prerequisites

  • Conda (or any other virtual environment tool)
  • Python 3.11
  • A CUDA-capable GPU, optional (the included model was trained on an RTX 4060 Laptop GPU)

Installation

Set up your environment and install dependencies by running:

conda create -n CRP python=3.11
conda activate CRP
pip install -r requirements.txt

Training the Agent

To begin training the PPO agent on the CarRacing environment, execute:

python3 src/train.py

This script will initialize the training process, log performance metrics, and save model checkpoints periodically. You can customize hyperparameters (like learning rate, discount factor, and clip range) directly within the script.

Evaluating the Agent

After training—or if you have a pre-trained model—you can evaluate your agent’s performance with:

python3 src/eval.py

This evaluation script loads the saved model and runs it in the environment, providing insights into its racing capabilities.

Demo

View a demonstration of the trained agent in action:

Screencast.from.01-11-2025.09.41.24.PM.webm

Project Structure

├── src
│   ├── train.py        # Script for training the PPO agent
│   ├── eval.py         # Script for evaluating the trained model
│   └── ...             # Additional modules and utilities
├── requirements.txt    # List of Python dependencies
└── README.md           # This file

Customization and Hyperparameters

Modify the training settings within src/train.py to experiment with different configurations. Parameters such as learning rate, batch size, discount factor, and clip range can be adjusted to suit your needs.
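One convenient way to keep those settings in one place is a hyperparameter dictionary unpacked into the PPO constructor. The values below are assumptions for illustration, not the repository's defaults:

```python
# Illustrative hyperparameter set (values are assumptions, not the
# repository's actual defaults from src/train.py).
hyperparams = {
    "learning_rate": 3e-4,  # step size for gradient updates
    "batch_size": 64,       # minibatch size per gradient step
    "gamma": 0.99,          # discount factor
    "clip_range": 0.2,      # PPO surrogate clipping range
    "n_steps": 2048,        # rollout length collected per update
}

# In src/train.py these could then be applied in one place:
# model = PPO("CnnPolicy", env, **hyperparams)
```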

Acknowledgements
