This repository implements a solution to David Silver's Easy21 assignment from his 2015 Reinforcement Learning course. The project builds the Easy21 game as an environment and implements several reinforcement learning agents that learn to play it.
Easy21 is a simplified version of Blackjack with the following rules:
- Objective: Get as close to 21 as possible without going over
- Cards: Values 1-10; black cards (drawn with probability 2/3) add their value, red cards (probability 1/3) subtract it
- Actions: Hit (draw a card) or Stick (end turn)
- State: [dealer's showing card, player's current sum]
- Rewards: +1 for winning, -1 for losing, 0 for tie
- Terminal conditions: The player busts (sum goes below 1 or above 21) or sticks; after a stick, the dealer draws until reaching 17 or busting, and the game is scored (a minimal sketch of these dynamics follows)
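A minimal sketch of the environment dynamics described above, assuming illustrative function names (the repository's actual implementation lives in `game/game.py` and may differ):

```python
import random

def draw_card():
    """Draw a card: value 1-10 uniform; red (negative) with prob 1/3, black (positive) with prob 2/3."""
    value = random.randint(1, 10)
    return -value if random.random() < 1 / 3 else value

def step(state, action):
    """One environment transition. `state` is (dealer_showing, player_sum).

    Returns (next_state, reward, done). Illustrative only -- the repository's
    Game class may use different names and conventions.
    """
    dealer, player = state
    if action == "hit":
        player += draw_card()
        if player < 1 or player > 21:            # player busts
            return (dealer, player), -1, True
        return (dealer, player), 0, False
    # action == "stick": the dealer hits until reaching 17 or busting
    while 1 <= dealer < 17:
        dealer += draw_card()
    if dealer < 1 or dealer > 21 or player > dealer:
        return (dealer, player), +1, True
    if player == dealer:
        return (dealer, player), 0, True
    return (dealer, player), -1, True
```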
- **Monte Carlo Agent**
  - Algorithm: Monte Carlo Control with exploring starts (update rule sketched below)
  - Features:
    - Learns from complete episodes
    - Uses first-visit MC for state-action value estimation
    - Implements an ε-greedy exploration strategy
  - File: `ai/monte_carlo.py`
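A hedged sketch of the first-visit Monte Carlo control update described above; `Q`, `N`, `N0`, and the 1/N step size are assumptions in the spirit of the assignment, not necessarily the exact code in `ai/monte_carlo.py`:

```python
from collections import defaultdict
import random

ACTIONS = ("hit", "stick")
N0 = 100.0                      # exploration constant (assumed value)
Q = defaultdict(float)          # action values, keyed by (state, action)
N = defaultdict(int)            # visit counts for the decaying step size

def epsilon_greedy(state, state_visits):
    """Pick an action with epsilon = N0 / (N0 + N(s))."""
    eps = N0 / (N0 + state_visits)
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def mc_update(episode, total_return):
    """First-visit MC control: move each (s, a) toward the episode return."""
    seen = set()
    for state, action in episode:
        if (state, action) in seen:              # first visit only
            continue
        seen.add((state, action))
        N[(state, action)] += 1
        alpha = 1.0 / N[(state, action)]         # step size 1 / N(s, a)
        Q[(state, action)] += alpha * (total_return - Q[(state, action)])
```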
- **Sarsa(λ) Agent**
  - Algorithm: Backward-view Sarsa(λ) with eligibility traces (update sketched below)
  - Features:
    - Temporal-difference learning with eligibility traces
    - Configurable λ parameter (0-1)
    - Online learning with immediate updates
  - File: `ai/sarsa_lambda.py`
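A sketch of a single backward-view Sarsa(λ) update with accumulating eligibility traces; the names and dict-based tables are illustrative rather than the repository's exact API:

```python
def sarsa_lambda_step(Q, E, s, a, r, s_next, a_next, done,
                      alpha=0.05, gamma=1.0, lam=0.3):
    """One backward-view Sarsa(lambda) update over dict-based Q and trace tables."""
    target = r if done else r + gamma * Q.get((s_next, a_next), 0.0)
    delta = target - Q.get((s, a), 0.0)           # TD error
    E[(s, a)] = E.get((s, a), 0.0) + 1.0          # accumulating eligibility trace

    # Every visited pair is nudged in proportion to its trace,
    # then all traces decay by gamma * lambda.
    for key in list(E):
        Q[key] = Q.get(key, 0.0) + alpha * delta * E[key]
        E[key] *= gamma * lam
    # E should be reset to an empty dict at the start of each episode.
```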
- **Sarsa(λ) with Linear Function Approximation**
  - Algorithm: Sarsa(λ) with linear function approximation (feature scheme sketched below)
  - Features:
    - Uses feature vectors to represent state-action pairs
    - 36-dimensional binary feature space
    - Gradient-descent weight updates
    - Handles large state spaces efficiently
  - File: `ai/sarsa_lambda.py`
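The 36 features presumably follow the coarse coding from the original assignment (3 overlapping dealer intervals × 6 overlapping player intervals × 2 actions), with Q(s, a) approximated as the dot product of the binary feature vector and a weight vector. A sketch under that assumption, with illustrative names:

```python
import numpy as np

DEALER_RANGES = [(1, 4), (4, 7), (7, 10)]
PLAYER_RANGES = [(1, 6), (4, 9), (7, 12), (10, 15), (13, 18), (16, 21)]
ACTIONS = ("hit", "stick")

def features(state, action):
    """Binary coarse-coded feature vector phi(s, a) of length 3 * 6 * 2 = 36."""
    dealer, player = state
    phi = np.zeros((3, 6, 2))
    for i, (d_lo, d_hi) in enumerate(DEALER_RANGES):
        for j, (p_lo, p_hi) in enumerate(PLAYER_RANGES):
            if d_lo <= dealer <= d_hi and p_lo <= player <= p_hi:
                phi[i, j, ACTIONS.index(action)] = 1.0
    return phi.flatten()

def q_value(w, state, action):
    """Linear approximation: Q(s, a) = phi(s, a) . w"""
    return features(state, action) @ w

def update(w, E, s, a, r, s_next, a_next, done, alpha=0.01, gamma=1.0, lam=0.3):
    """Gradient-descent Sarsa(lambda) weight update for one step (illustrative)."""
    delta = (r if done else r + gamma * q_value(w, s_next, a_next)) - q_value(w, s, a)
    E = gamma * lam * E + features(s, a)          # eligibility trace over the weights
    w = w + alpha * delta * E
    return w, E
```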
- **Random Agent**
  - Algorithm: Random action selection
  - Purpose: Baseline for performance comparison
  - File: `ai/random_agent.py`
- **Human Agent**
  - Algorithm: Human input with optional AI assistance (see the sketch after this list)
  - Features:
    - Interactive gameplay
    - Can receive recommendations from trained agents
    - Input validation and error handling
  - File: `ai/human_agent.py`
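One plausible way the assistance works is to query the assisting agent's learned value function for the current state and suggest the greedy action. The sketch below assumes a dict-style Q table keyed by (state, action); the actual `HumanAgent` internals may differ:

```python
def recommend(assist_q, state, actions=("hit", "stick")):
    """Suggest the action the assisting agent's Q-function rates highest."""
    best = max(actions, key=lambda a: assist_q.get((state, a), 0.0))
    print(f"AI suggestion for state {state}: {best}")
    return best

# Example: call recommend(sarsa_q, (dealer_showing, player_sum)) before asking
# the human player to type 'hit' or 'stick'.
```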
```
easy21/
├── ai/                            # Reinforcement learning agents
│   ├── __init__.py
│   ├── ai.py                      # Main AI framework and Learner class
│   ├── general_agent.py           # Base Agent class with ε-greedy policy
│   ├── monte_carlo.py             # Monte Carlo agent implementation
│   ├── sarsa_lambda.py            # Sarsa(λ) and Sarsa(λ) with approximation
│   ├── human_agent.py             # Human player agent
│   ├── random_agent.py            # Random baseline agent
│   └── results/                   # Generated plots and saved models
│       ├── *.png                  # Value function visualizations
│       └── *.pickle               # Saved value functions and weights
├── game/                          # Game environment
│   ├── __init__.py
│   └── game.py                    # Easy21 game implementation
├── utils/                         # Utility functions
│   ├── __init__.py
│   └── display.py                 # Plotting and visualization utilities
├── README.md                      # This file
└── Reference_Instructions.pdf     # Original assignment instructions
```
Install the dependencies:

```bash
pip install numpy matplotlib
```
```python
# Train a Monte Carlo agent
from ai.ai import Learner
from ai.monte_carlo import MCAgent

mc_agent = MCAgent()
mc_learner = Learner(mc_agent)
mc_learner.learn(num_train_epochs=1000000, save=True)
```
```python
# Human vs. AI
from ai.ai import Learner
from game.game import Game
from ai.human_agent import HumanAgent
from ai.sarsa_lambda import SarsaAgent

# Load a trained agent
sarsa_agent = SarsaAgent(_lambda=0.3)
sarsa_learner = Learner(sarsa_agent)
sarsa_learner.load_val_func()

# Play with AI assistance
human_agent = HumanAgent(assist_agent=sarsa_agent)
game = Game(ai=human_agent, stdout=True)
game.play_game()
```
To run the full training and evaluation pipeline:

```bash
cd ai
python ai.py
```
This will:
- Train all agents (Monte Carlo, Sarsa, Sarsa with approximation)
- Generate performance comparisons
- Create value function visualizations
- Plot MSE vs λ for different algorithms (see the evaluation sketch below)
- Optionally allow human gameplay
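The MSE-vs-λ evaluation referenced above follows the original assignment: the Monte Carlo agent's Q-values (after many episodes) serve as the "ground truth" Q*, and each Sarsa(λ) agent's Q-table is scored against them with a mean squared error over all state-action pairs, MSE = (1/|S×A|) Σ (Q(s,a) − Q*(s,a))². A hedged sketch assuming dict-style Q tables:

```python
def mean_squared_error(q_estimate, q_star):
    """MSE between an agent's Q and the Monte Carlo 'ground truth' over all (s, a) pairs."""
    keys = q_star.keys()
    return sum((q_estimate.get(k, 0.0) - q_star[k]) ** 2 for k in keys) / len(keys)

# Typical use: train MC for many episodes to obtain q_star, then for each lambda
# in [0.0, 0.1, ..., 1.0] train a Sarsa(lambda) agent and plot mean_squared_error(...).
```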
The project generates several types of outputs:
- Value function plots:
  - 3D surface plots showing optimal action values
  - Dealer showing card vs. player sum axes
  - Generated for each agent type
- Learning curves:
  - Mean squared error vs. the λ parameter for the Sarsa agents
  - Learning curves showing convergence
  - Performance comparisons between agents
- Saved models:
  - Pickle files containing trained value functions
  - Weight vectors for function approximation
  - Can be loaded for evaluation or human gameplay
- Modular Design: Clean separation between game environment and agents
- Extensible Framework: Easy to add new RL algorithms
- Comprehensive Evaluation: Multiple metrics and visualizations
- Interactive Play: Human can play with AI assistance
- Persistent Storage: Save and load trained models (see the pickle sketch after this list)
- Visualization Tools: 3D plots and learning curves
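Persistence is plain pickling of the learned value function or weight vector; the function names and file path below are illustrative, since the repository's `results/` filenames are not specified here:

```python
import pickle

def save_value_function(Q, path="ai/results/value_function.pickle"):
    """Persist a learned Q table or weight vector (path is illustrative)."""
    with open(path, "wb") as f:
        pickle.dump(Q, f)

def load_value_function(path="ai/results/value_function.pickle"):
    """Reload a previously saved Q table or weight vector."""
    with open(path, "rb") as f:
        return pickle.load(f)
```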
This project demonstrates:
- Monte Carlo Methods: Learning from complete episodes
- Temporal Difference Learning: Online learning with bootstrapping
- Eligibility Traces: Efficient credit assignment in Sarsa(λ)
- Function Approximation: Handling large state spaces
- Exploration vs Exploitation: ε-greedy policies
- Value Function Visualization: Understanding learned policies
Feel free to extend this project by:
- Implementing additional RL algorithms (Q-learning, DQN, etc.)
- Adding new game environments
- Improving visualization capabilities
- Optimizing performance
This project is based on David Silver's Reinforcement Learning course materials. Please refer to the original course for licensing information.