Project_RL

Welcome to Project_RL! This repository showcases our exploration of reinforcement learning techniques in three Atari environments: Freeway, Tennis, and Pong. We implemented value-based and policy gradient algorithms, tested their performance, and analyzed their behavior.

🚗 Freeway

A classic Atari game where the agent controls a chicken trying to cross a multi-lane highway while avoiding cars.

🛠️ What We Did

Environment Preprocessing: Resized, grayscaled, stacked frames, and normalized inputs for efficient model training.
Algorithms: Implemented Deep Q-Networks (DQN) with extensions like Double DQN, Dueling Networks, and Prioritized Replay Buffer. Also implemented Reinforce, a policy gradient algorithm, for comparison.
Results: Reached the optimal reward of 21 with both DQN (using basic preprocessing) and Reinforce.
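The preprocessing steps listed above (grayscale, resize, normalize, stack) can be sketched roughly as follows. This is a minimal illustration that assumes NumPy only; the 84x84 target size, the stack depth of 4, and the names preprocess_frame and FrameStack are assumptions made for this example and may differ from what Freeway.ipynb actually does.

```python
from collections import deque

import numpy as np


def preprocess_frame(frame: np.ndarray, size: int = 84) -> np.ndarray:
    """Turn an RGB frame (H, W, 3) uint8 into a (size, size) float32 array in [0, 1]."""
    # Luminance-weighted grayscale conversion.
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = frame.astype(np.float32) @ weights
    # Nearest-neighbour resize by index sampling (a simple stand-in for an image library).
    rows = np.linspace(0, gray.shape[0] - 1, size).astype(int)
    cols = np.linspace(0, gray.shape[1] - 1, size).astype(int)
    small = gray[np.ix_(rows, cols)]
    # Scale pixel intensities to roughly [0, 1].
    return small / 255.0


class FrameStack:
    """Keeps the last k preprocessed frames so the agent can infer motion."""

    def __init__(self, k: int = 4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, frame: np.ndarray) -> np.ndarray:
        # At episode start, fill the stack with copies of the first frame.
        processed = preprocess_frame(frame)
        for _ in range(self.k):
            self.frames.append(processed)
        return self.state()

    def step(self, frame: np.ndarray) -> np.ndarray:
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(preprocess_frame(frame))
        return self.state()

    def state(self) -> np.ndarray:
        # Shape (k, size, size), oldest frame first.
        return np.stack(list(self.frames), axis=0)
```

Stacking consecutive frames matters in Freeway because a single frame does not show which way or how fast the cars are moving.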

📂 Files

Freeway.ipynb: Notebook demonstrating the preprocessing pipeline.
TabularMethod.py: Code for tabular methods (state transformations and Monte Carlo integration).
Rainbow_dwn.py: DQN implementation with extensions.

🎾 Tennis

A more complex Atari environment requiring the agent to learn hitting, positioning, and serving strategies in a tennis game.

🛠️ What We Did

Environment Simplification: Cropped frames, extracted ball positions, and reduced the action space from 18 to 6 essential moves.
Algorithms: Implemented PPO, Maskable PPO, and A2C. Enhanced Maskable PPO with wrappers for action masking, ball tracking, and intermediate rewards.
Results: Maskable PPO significantly outperformed other models but could not win a full game due to environment complexity.
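To illustrate the two ideas above, a reduced action space and action masking, here is a small NumPy sketch. It does not use the Maskable PPO implementation itself. The choice of kept actions (the first six indices of the full Atari action set) and the function name masked_policy are assumptions for illustration only; the README does not say which six moves the project kept.

```python
import numpy as np

# Hypothetical reduced action set: indices into the full 18-action Atari set
# (NOOP, FIRE, UP, RIGHT, LEFT, DOWN). The project's actual subset may differ.
KEPT_ACTIONS = [0, 1, 2, 3, 4, 5]


def masked_policy(logits: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Softmax over logits where actions with mask == False get probability 0."""
    if not mask.any():
        raise ValueError("at least one action must be valid")
    # Invalid actions get a logit of -inf, so exp() maps them to exactly 0.
    masked = np.where(mask, logits, -np.inf)
    # Subtract the maximum for numerical stability before exponentiating.
    masked = masked - masked.max()
    exp = np.exp(masked)
    return exp / exp.sum()


def to_full_action(reduced_index: int) -> int:
    """Map an index in the reduced action space back to the full Atari action id."""
    return KEPT_ACTIONS[reduced_index]
```

For example, with equal logits and a mask that forbids the second action, the remaining five actions each receive probability 0.2:

```python
logits = np.zeros(len(KEPT_ACTIONS))
mask = np.array([True, False, True, True, True, True])
probs = masked_policy(logits, mask)
```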

📂 Files

SB3_Tennis_A2C.py: Train a single agent with A2C.
SB3_Tennis_ppo.py: Train a single agent with PPO.
SB3_TennisDetectPilotA2C.py: Train Maskable A2C with advanced wrappers.
SB3_TennisDetectPilotPPO.py: Train Maskable PPO with action masking and ball tracking.

🏓 Pong

A two-player table tennis game where agents compete to score points.

🛠️ What We Did

Single-Agent Approach: Trained a right paddle agent using PPO in a simplified setup.
Multi-Agent Approach: Extended the Gym environment to PettingZoo, enabling two agents to train against each other.
Challenges: Despite promising setups, agents converged to suboptimal strategies, focusing on serving rather than rallying.

📂 Files

MAS_tournament_ppo.py: Train two agents with PettingZoo and PPO.
MAS_tournament_dqn.py: Train two agents with DQN.
MAS_tournament_reinforce.py: Train two agents with Reinforce.
LoadModels.py: Utilities to load pre-trained models for inference.
single_agent_ppo.py: Train a single agent with PPO.
Config.py: Configuration settings.
wrappers.py: Wrapper implementations.

🚀 Highlights

Algorithms Implemented: DQN, Reinforce, PPO, Maskable PPO, A2C.
Preprocessing Innovations: Simplified observations using cropping, grayscale, red channel filtering, and frame stacking.
Insights Gained: Highlighted the strengths and limitations of value-based vs. policy gradient methods across varying complexities.
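The contrast between value-based and policy gradient methods comes down to the learning signal each one uses. As a rough sketch, not the repository's code: Reinforce weights log-probabilities by full-episode discounted returns, while Double DQN regresses toward a one-step bootstrapped target in which the online network picks the next action and the target network evaluates it. The function names and the default gamma below are assumptions for this example.

```python
import numpy as np


def discounted_returns(rewards, gamma: float = 0.99) -> np.ndarray:
    """Monte Carlo returns G_t = r_t + gamma * G_{t+1}, as used by Reinforce."""
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma: float = 0.99):
    """One-step Double DQN target for a single transition."""
    # The online network selects the greedy next action...
    best_action = int(np.argmax(q_online_next))
    # ...and the target network supplies its value (no bootstrap at episode end).
    bootstrap = 0.0 if done else gamma * float(q_target_next[best_action])
    return reward + bootstrap
```

The Monte Carlo return needs a finished episode but is unbiased, whereas the bootstrapped target can be computed from a single stored transition, which is what makes replay buffers usable with DQN.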

📧 Questions or Contributions?

Feel free to open an issue or submit a pull request for improvements or suggestions.
