Train a PPO agent to play Super Mario Bros using Stable-Baselines3 on stable-retro. This project targets Linux (Ubuntu 22.04 LTS on WSL) and uses uv for a fast, reproducible Python environment.
A single training entrypoint (main.py) is provided along with a separate runner (play.py) to play the full game with a trained policy.
Overview • Requirements • Important Links • Quickstart • ROM Import • Train • Play • Monitoring • Tips • Legal
- Stable-Baselines3 (PPO) with stable-retro
- Linux-first workflow (Ubuntu 22.04 LTS on WSL)
- ROM import instructions
- CNN Policy applied
- TensorBoard monitoring to analyze the performance
- Custom game window using OpenCV [Optional]
- Runner script to play the full game using a trained policy
- The compiled understandings can be found in this PDF
- All resources are available on TLDRAW board
- Ubuntu 22.04 LTS (native or WSL)
- Python 3.10
- uv, FFmpeg, and OpenGL
- A legally obtained Super Mario Bros NES ROM
Note
Ubuntu 22.04 LTS includes Python 3.10 by default.
- Install Ubuntu 22.04 (WSL):
wsl --install --distribution Ubuntu-22.04Open Ubuntu from the Start menu and continue in that terminal.
- Clone and open:
git clone https://github.com/amugoodbad229/MarioRL.git
cd MarioRL
code .- System dependencies:
sudo apt-get update -y && sudo apt-get upgrade -y
sudo apt-get install -y git curl zlib1g-dev libssl-dev ffmpeg python3-opengl- Install
uv:
curl -LsSf https://astral.sh/uv/install.sh | sh
uv --version- Sync the environment (creates .venv and installs dependencies):
uv sync- Activate the environment:
source .venv/bin/activate- Place your ROM:
mkdir -p ROM
# Copy your legally obtained Super Mario Bros.nes file into ./ROM- Import into Retro’s database:
python3 -m retro.import ./ROMEnsure the game name (e.g., SuperMarioBros-Nes) matches what your scripts expect.
Run a quick environment sanity check:
python3 randomAgent.pyImportant
Terminate the process with Ctrl+C.
OPTIONAL: check CUDA availability for PyTorch
python3 -c "import torch; print(torch.cuda.is_available())"Warning
Set device = "cuda", as we are dealing with images
Run the training entrypoint:
python3 main.pyOPTIONAL: You can select the path file to your desired folder. For now, I have set it up for you.
After training, run the separate runner to play:
python3 play.pyUse the same observation wrappers in play.py that you used in training.
Launch TensorBoard:
python -m tensorboard.main --logdir tensorboardOpen the printed URL in your browser to view training curves and metrics.
- Ensure the environment is active:
source .venv/bin/activate- Re-run sync if dependencies change:
uv sync- Re-import ROMs if not detected:
python3 -m retro.import ./ROMImportant
If Imported 0 games shows up, it could mean you downloaded a corrupted copy.
- Clean stop vs suspend:
# Clean stop
# Press Ctrl+C
# If you pressed Ctrl+Z by mistake:
# enter fg on the terminal to bring it to foreground, then press Ctrl+CIf this repository helps you learn or prototype faster:
- Star the repo
- Share feedback and ideas
- Open issues for bugs or improvements
ROMs are not included and must be obtained legally. Do not commit or distribute ROMs in this repository.