A streamlined chess move prediction model using GPT-2 and LoRA fine-tuning, specifically optimized for Apple Silicon (M1/M2/M3) Macs.
This project fine-tunes GPT-2 to predict chess moves using parameter-efficient LoRA (Low-Rank Adaptation) training. The implementation is designed to run efficiently on Apple Silicon using Metal Performance Shaders (MPS) for GPU acceleration.
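Training runs on the Apple GPU through PyTorch's MPS backend; as a rough sketch (not the project's actual `train.py` code), device selection looks like this:

```python
import torch

# Use the Apple Silicon GPU (MPS backend) when available, otherwise fall back to CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
print(f"Training on: {device}")
```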
Key Features:
- ⚡ Fast Training: Complete fine-tuning in ~15 seconds
- 🍎 Apple Silicon Optimized: Native MPS support for M1/M2/M3 chips
- 📦 Lightweight: Only 3.1MB of fine-tuned weights
- 🔧 Parameter Efficient: Trains only 0.65% of model parameters using LoRA
- ♟️ Chess Focused: Learns common chess opening patterns and moves
Requirements:
- Apple Silicon Mac (M1/M2/M3)
- Python 3.8+
- pyenv (recommended) or conda
Quick start:
- Clone and setup:

  ```bash
  git clone <your-repo>
  cd chess-llm

  # Create virtual environment (using pyenv)
  pyenv virtualenv 3.13.3 chess
  echo "chess" > .python-version
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Train the model:

  ```bash
  ./train.sh
  ```

- View results:

  ```bash
  python demo.py
  ```
The model is trained on chess opening patterns like:
- Spanish Opening: `1. e4 e5 2. Nf3 Nc6 3. Bb5 ?` → `a6`
- Queen's Gambit: `1. d4 Nf6 2. c4 e6 3. Nc3 d5 4. Bg5 ?` → `Be7`
- Sicilian Defense: `1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 ?` → `g6`
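As a hypothetical sketch of how such prompt/answer pairs could be assembled into plain-text training examples (the names and format are assumptions, not the project's actual data-generation code):

```python
# Hypothetical sketch of how prompt/answer pairs could be turned into plain-text
# training examples; the project's actual data-generation code may differ.
openings = [
    ("1. e4 e5 2. Nf3 Nc6 3. Bb5 ?", "a6"),                        # Spanish Opening
    ("1. d4 Nf6 2. c4 e6 3. Nc3 d5 4. Bg5 ?", "Be7"),              # Queen's Gambit
    ("1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 ?", "g6"),  # Sicilian Defense
]

# Each example is a single string: the move sequence ending in "?" followed by the answer.
examples = [f"{moves} {answer}" for moves, answer in openings]
```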
Model details:
- Base Model: GPT-2 (124M parameters)
- Fine-tuning: LoRA with rank 8
- Target Modules: `c_attn`, `c_proj` (attention layers)
- Training Data: 250 chess sequence examples
- Device: Apple Silicon GPU (MPS)
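A minimal sketch of how this setup can be expressed with `transformers` and `peft` (the `lora_alpha` value is an assumption; it is not stated in this README):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Load the 124M-parameter GPT-2 base model and its tokenizer.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Rank-8 LoRA adapters on the c_attn / c_proj projections described above.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,  # assumed value, not stated in this README
    target_modules=["c_attn", "c_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Train on the Apple Silicon GPU when available.
device = "mps" if torch.backends.mps.is_available() else "cpu"
model.to(device)
```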
```
chess-llm/
├── src/
│   └── train.py             # Main training script
├── models/
│   └── chess-gpt2-final/    # Trained model weights
├── data/                    # Training data (generated)
├── train.sh                 # Training script runner
├── demo.py                  # Results demonstration
├── requirements.txt         # Python dependencies
└── README.md                # This file
```
Training configuration:
- LoRA Rank: 8 (efficient parameter usage)
- Learning Rate: 5e-4
- Batch Size: 1 (with gradient accumulation)
- Training Steps: 50
- Optimizer: AdamW
- Device: Apple Metal Performance Shaders (MPS)
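In `transformers` terms, the configuration above corresponds roughly to the following training arguments (the output directory, gradient-accumulation step count, and logging interval are assumptions, not values stated in this README):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="models/chess-gpt2-final",  # where the adapter is saved
    learning_rate=5e-4,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,         # "with gradient accumulation"; exact value assumed
    max_steps=50,
    optim="adamw_torch",                   # AdamW
    logging_steps=10,
)
```

These arguments would then be handed to a `transformers.Trainer` along with the LoRA-wrapped model and the tokenized chess examples.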
After training, you'll find:
- `adapter_config.json` - LoRA configuration
- `adapter_model.safetensors` - Fine-tuned weights (3.1MB)
- Tokenizer files - GPT-2 tokenizer components
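These files are everything needed to load the adapter back onto the frozen GPT-2 base for inference. A hypothetical sketch (not the actual `demo.py`):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the frozen GPT-2 base and apply the 3.1MB LoRA adapter saved above.
base = AutoModelForCausalLM.from_pretrained("gpt2")
model = PeftModel.from_pretrained(base, "models/chess-gpt2-final")
tokenizer = AutoTokenizer.from_pretrained("models/chess-gpt2-final")

device = "mps" if torch.backends.mps.is_available() else "cpu"
model.to(device)
model.eval()

# Ask for the next move in the Spanish Opening.
prompt = "1. e4 e5 2. Nf3 Nc6 3. Bb5 ?"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
with torch.no_grad():
    output = model.generate(
        **inputs, max_new_tokens=5, pad_token_id=tokenizer.eos_token_id
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```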
Training:

```bash
./train.sh
```

View Results:

```bash
python demo.py
```

Check Model:

```bash
ls -la models/chess-gpt2-final/
```
- Training Time: ~13 seconds on M2 MacBook Pro
- Model Size: 3.1MB fine-tuned weights
- Trainable Parameters: 811,008 (0.65% of total)
- Base Parameters: 124,439,808 (frozen)
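The parameter counts can be reproduced with `peft`'s built-in helper (here `model` is the LoRA-wrapped GPT-2 from the setup sketch above):

```python
# Reports the trainable vs. frozen parameter split of the PEFT model.
model.print_trainable_parameters()
```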
This project was inspired by a Medium article about fine-tuning small language models. Our implementation is a simplified, Apple Silicon-native version that achieves similar results using GPT-2 and standard PyTorch libraries.
See `requirements.txt` for exact versions. Key dependencies:
- `torch>=2.6.0` (with MPS support)
- `transformers>=4.40.0`
- `peft>=0.10.0` (for LoRA)
- `datasets>=2.16.0`
- `accelerate>=0.28.0`
Feel free to submit issues and enhancement requests! This project demonstrates efficient fine-tuning on Apple Silicon and can be extended to other chess-related NLP tasks.
This project is open source. The trained model uses GPT-2 (MIT License) as its foundation.