AI_Tag is a small multi-agent "tag" environment implemented with Pygame and PyTorch. Two types of agents interact: Seekers (which try to catch) and Hiders (which try to avoid being caught). Agents are controlled by small neural networks trained with Q-learning inside the play.py training loop.
- agent_hider.py, agent_seeker.py — trainer/controller classes for each agent (policy, memory, training loop calls)
- model_hider.py, model_seeker.py — simple feed-forward Linear_QNet and QTrainer (MSE loss, Adam); a minimal sketch of the network appears after this list
- game.py — environment (map, agents, rewards, collision detection, drawing)
- raycast.py — agent class that builds radars and draws/rotates sprites
- play.py — top-level training loop
- settings.py — constants and hyperparameters
- assets/ — images and map files (map1.png, seeker.png, hider.png)
- model/ — saved model weights (created at runtime)
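The exact layer sizes of Linear_QNet live in model_hider.py / model_seeker.py and are not reproduced here; the following is only a sketch of that kind of network, with a save() helper that writes the state_dict into ./model/ as described later. Layer sizes and the default filename are illustrative assumptions.

```python
import os
import torch
import torch.nn as nn

class Linear_QNet(nn.Module):
    """Minimal sketch of a feed-forward Q-network (layer sizes are illustrative)."""

    def __init__(self, input_size: int, hidden_size: int, output_size: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, output_size),  # one Q-value per discrete action
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

    def save(self, file_name: str = "model_sketch.pth"):
        # Save only the weights (state_dict) into ./model/, creating the folder if needed.
        os.makedirs("./model", exist_ok=True)
        torch.save(self.state_dict(), os.path.join("./model", file_name))
```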
Use Python 3 (3.8+ tested). Create a virtual environment and install dependencies:
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Running

Graphical (open a window):
python3 ./play.py

Headless (no display) and log output to a file (good for long training runs on servers):
SDL_VIDEODRIVER=dummy MAX_EPISODES=2000 python3 ./play.py | tee train_2000.log

Explanation of that command:
- SDL_VIDEODRIVER=dummy: run Pygame without opening a window
- MAX_EPISODES=2000: stop after 2000 episodes (an optional environment variable read by play.py; see the sketch after this list)
- | tee train_2000.log: write terminal output to train_2000.log while also displaying it
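The exact way play.py consumes MAX_EPISODES is not shown in this README; the snippet below is only a plausible sketch of that pattern (the fallback behaviour is an assumption):

```python
import os

# Hypothetical sketch of an env-var episode cap: unset means "train until interrupted".
_raw = os.environ.get("MAX_EPISODES")
MAX_EPISODES = int(_raw) if _raw else None

episode = 0
while MAX_EPISODES is None or episode < MAX_EPISODES:
    # ... run one episode of the tag environment here ...
    episode += 1
```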
Useful variants
- Run in background and save stdout/stderr to file:
SDL_VIDEODRIVER=dummy MAX_EPISODES=2000 nohup python3 ./play.py > train_2000.log 2>&1 &

- Watch logs in real time:
tail -f train_2000.log

Configuration

Most runtime hyperparameters live in settings.py:
- Screen / visuals: SCREEN_WIDTH, SCREEN_HEIGHT, AGENT_SCALE
- Agent spawn: HIDERS_POS, SEEKERS_POS (each entry is (x, y) or (x, y, angle))
- Epsilon schedule for exploration: EPS_START (default 1.0), EPS_MIN (default 0.05), EPS_TARGET_EPISODES (default 5000) — the number of episodes over which epsilon decays from EPS_START to EPS_MIN (the per-episode decay is computed automatically; see the sketch after this list)
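As an illustration of how these values fit together, here is a settings.py-style sketch with a linear epsilon schedule. The variable names match the list above, but the concrete values and the exact decay formula used by the project are assumptions:

```python
# Illustrative values only — the real settings.py may differ.
SCREEN_WIDTH, SCREEN_HEIGHT = 800, 600
AGENT_SCALE = 0.5

# Spawn points: (x, y) or (x, y, angle)
SEEKERS_POS = [(100, 100), (700, 100, 180)]
HIDERS_POS = [(400, 500)]

# Exploration schedule
EPS_START = 1.0
EPS_MIN = 0.05
EPS_TARGET_EPISODES = 5000

# One way to compute the decay automatically: a linear schedule that
# reaches EPS_MIN after EPS_TARGET_EPISODES episodes.
EPS_DECAY = (EPS_START - EPS_MIN) / EPS_TARGET_EPISODES

def epsilon_for_episode(episode: int) -> float:
    """Exploration rate used for the given episode, clipped at EPS_MIN."""
    return max(EPS_MIN, EPS_START - episode * EPS_DECAY)
```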
Model saving/loading
- Models are saved into ./model/ by calling model.save() (this saves the PyTorch state_dict).
- Current saving policy (in play.py):
  - Seeker model is saved when the seeker achieves a new record score (filename model_seeker.pth by default).
  - Hider model is saved every 10 episodes (filename model_hider.pth by default).
- At trainer initialization the code attempts to load existing weights automatically from these filenames (or common alternates). If a model file exists, you'll see a Loaded ... message in the console.
- If you prefer full checkpoints (model + optimizer + metadata), modify the saving/loading in model_*.py and play.py to store a dict with 'model_state' and 'optimizer_state'; a minimal sketch is shown below.
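A minimal sketch of such a checkpoint, assuming standard PyTorch model and optimizer objects. The filename and the extra 'episode' field are illustrative; only the 'model_state' / 'optimizer_state' keys come from the note above:

```python
import torch

CKPT_PATH = "./model/checkpoint_seeker.pth"  # hypothetical filename

def save_checkpoint(model, optimizer, episode, path=CKPT_PATH):
    """Store weights plus optimizer state (and a counter) so training can resume exactly."""
    torch.save({
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
        "episode": episode,
    }, path)

def load_checkpoint(model, optimizer, path=CKPT_PATH):
    """Restore weights and optimizer state; returns the stored episode counter."""
    ckpt = torch.load(path)
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt.get("episode", 0)
```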
Reward & training logic (what agents are rewarded/penalized for)
Key reward rules are implemented in game.py (summary of current setup):
- Seekers:
  - Wall collision: -10 and done
  - Catch (dist < 30 px): +100 and done (increments score)
  - Near a hider (dist < 200 px): +0.1 (small positive)
  - Else: 0
- Hiders:
  - Wall collision: -10 and done
  - Caught (dist < 30 px): -10 and done
  - Near a seeker (dist < 200 px): -1 (penalty to discourage approaching)
  - Else: +0.1 (small positive for staying far away)
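For reference, here is a minimal sketch of the seeker side of these rules. This is not the actual game.py code; the function, constants, and argument names are assumptions, only the thresholds and reward values come from the list above:

```python
import math

CATCH_DIST = 30    # px — "catch" radius from the rules above
NEAR_DIST = 200    # px — "near a hider" radius from the rules above

def seeker_step_reward(seeker_pos, hider_pos, hit_wall):
    """Return (reward, done) for one seeker step, following the rules listed above."""
    if hit_wall:
        return -10.0, True
    dist = math.dist(seeker_pos, hider_pos)
    if dist < CATCH_DIST:
        return 100.0, True    # catch: episode ends (score is incremented elsewhere)
    if dist < NEAR_DIST:
        return 0.1, False     # small shaping reward for getting close
    return 0.0, False
```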
Training uses the standard Q-learning target: Q_new = reward if the episode is done, otherwise reward + gamma * max_a Q(next_state, a); the loss is the MSE between the predicted Q-value and Q_new (sketched below).
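A sketch of what that update looks like in PyTorch for a single transition; the real QTrainer in model_*.py may batch transitions and choose gamma differently (the value below is illustrative):

```python
import torch
import torch.nn as nn

def q_update(model, optimizer, state, action, reward, next_state, done, gamma=0.9):
    """One Q-learning step: move Q(state, action) toward the target with an MSE loss."""
    state = torch.as_tensor(state, dtype=torch.float32)
    next_state = torch.as_tensor(next_state, dtype=torch.float32)

    pred = model(state)                 # Q-values for every action in this state
    target = pred.detach().clone()

    q_new = reward
    if not done:
        q_new = reward + gamma * torch.max(model(next_state)).item()
    target[action] = q_new              # only the taken action's target changes

    loss = nn.functional.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```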
Tips

- If agents repeatedly die on walls, check the spawn positions in settings.py and assets/map1.png for overlaps.
- For longer exploration, use a larger EPS_TARGET_EPISODES or a higher EPS_MIN.
- Use headless mode for faster, unattended training.
- Consider curriculum training: start on empty maps and progressively add obstacles.
Track moving averages (e.g. over 100 episodes) for:
- Seeker score per episode
- Hider average survival time
- Average reward per step
- Percentage of episodes with at least one capture
Add logging or save intermediate metrics to CSV for plotting.
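A small sketch of that kind of CSV logging with a 100-episode moving average of the seeker score; the filename and column set are assumptions, not something play.py writes today:

```python
import csv
from collections import deque

class MetricsLogger:
    """Append per-episode metrics to a CSV and track a moving average of the seeker score."""

    def __init__(self, path="metrics.csv", window=100):
        self.path = path
        self.recent_scores = deque(maxlen=window)
        with open(self.path, "w", newline="") as f:
            csv.writer(f).writerow(
                ["episode", "seeker_score", "hider_survival_steps", "avg_reward", "score_ma"]
            )

    def log(self, episode, seeker_score, hider_survival_steps, avg_reward):
        self.recent_scores.append(seeker_score)
        score_ma = sum(self.recent_scores) / len(self.recent_scores)
        with open(self.path, "a", newline="") as f:
            csv.writer(f).writerow(
                [episode, seeker_score, hider_survival_steps, avg_reward, round(score_ma, 3)]
            )
```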
Possible improvements

- Add checkpointing with optimizer state for exact resume.
- Implement multiple maps and domain randomization.
- Improve agent networks, replay strategy, or use actor-critic algorithms for smoother training.
Quick commands

# GUI run
python3 ./play.py
# Headless 2000 episodes, save log
SDL_VIDEODRIVER=dummy MAX_EPISODES=2000 python3 ./play.py | tee train_2000.log
# Background
SDL_VIDEODRIVER=dummy MAX_EPISODES=2000 nohup python3 ./play.py > train_2000.log 2>&1 &
# Tail log
tail -f train_2000.log