Chess LLM - Apple Silicon Optimized

A streamlined chess move prediction model using GPT-2 and LoRA fine-tuning, specifically optimized for Apple Silicon (M1/M2/M3) Macs.

🎯 Overview

This project fine-tunes GPT-2 to predict chess moves using parameter-efficient LoRA (Low-Rank Adaptation) training. The implementation is designed to run efficiently on Apple Silicon using Metal Performance Shaders (MPS) for GPU acceleration.
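
The training script moves everything onto the MPS backend when it is available. A minimal sketch of that device selection (the exact logic in src/train.py may differ):

    import torch

    # Prefer Apple's Metal Performance Shaders (MPS) backend when available,
    # otherwise fall back to CPU.
    device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
    print(f"Using device: {device}")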

Key Features:

  • ⚡ Fast Training: Complete fine-tuning in ~15 seconds
  • 🍎 Apple Silicon Optimized: Native MPS support for M1/M2/M3 chips
  • 📦 Lightweight: Only 3.1MB of fine-tuned weights
  • 🔧 Parameter Efficient: Trains only 0.65% of model parameters using LoRA
  • ♟️ Chess Focused: Learns common chess opening patterns and moves

🚀 Quick Start

Prerequisites

  • Apple Silicon Mac (M1/M2/M3)
  • Python 3.8+
  • pyenv (recommended) or conda

Installation

  1. Clone and setup:

    git clone <your-repo>
    cd chess-llm
    
    # Create virtual environment (using pyenv)
    pyenv virtualenv 3.13.3 chess
    echo "chess" > .python-version
  2. Install dependencies:

    pip install -r requirements.txt
  3. Train the model:

    ./train.sh
  4. View results:

    python demo.py
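
Once training has finished, you can also query the fine-tuned model directly. The sketch below is illustrative: it assumes the tokenizer was saved alongside the adapter in models/chess-gpt2-final and that moves are fed in as plain PGN-style text; see demo.py for the project's actual inference code.

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer
    from peft import PeftModel

    device = "mps" if torch.backends.mps.is_available() else "cpu"

    tokenizer = GPT2Tokenizer.from_pretrained("models/chess-gpt2-final")
    base = GPT2LMHeadModel.from_pretrained("gpt2")
    model = PeftModel.from_pretrained(base, "models/chess-gpt2-final").to(device)

    # Ask the model to continue a well-known opening.
    prompt = "1. e4 e5 2. Nf3"
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output = model.generate(
        **inputs,
        max_new_tokens=5,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(tokenizer.decode(output[0], skip_special_tokens=True))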

📊 What It Learns

The model is trained on chess opening patterns and learns to predict the next move in a sequence (the move after the arrow):

  • Spanish Opening: 1. e4 e5 2. Nf3 Nc6 3. Bb5 → a6
  • Queen's Gambit: 1. d4 Nf6 2. c4 e6 3. Nc3 d5 4. Bg5 → Be7
  • Sicilian Defense: 1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 → g6
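
The data-generation step lives in src/train.py; purely as an illustration, the training set could be assembled from move strings like these (the sequences and Dataset wrapper below are illustrative, not the project's actual data pipeline):

    from datasets import Dataset

    # Illustrative opening lines; the real script generates ~250 such sequences.
    openings = [
        "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6",
        "1. d4 Nf6 2. c4 e6 3. Nc3 d5 4. Bg5 Be7",
        "1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 g6",
    ]
    train_dataset = Dataset.from_dict({"text": openings})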

🏗 Architecture

  • Base Model: GPT-2 (124M parameters)
  • Fine-tuning: LoRA with rank 8
  • Target Modules: c_attn, c_proj (attention layers)
  • Training Data: 250 chess sequence examples
  • Device: Apple Silicon GPU (MPS)
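
A minimal sketch of a LoRA setup matching this architecture (lora_alpha, dropout, and other values not listed above are illustrative; see src/train.py for the real settings):

    from transformers import GPT2LMHeadModel
    from peft import LoraConfig, get_peft_model

    base = GPT2LMHeadModel.from_pretrained("gpt2")   # 124M-parameter base model

    lora_config = LoraConfig(
        r=8,                                   # LoRA rank
        lora_alpha=16,                         # illustrative scaling factor
        target_modules=["c_attn", "c_proj"],   # GPT-2 attention projections
        lora_dropout=0.05,                     # illustrative
        task_type="CAUSAL_LM",
    )

    model = get_peft_model(base, lora_config)
    model.print_trainable_parameters()         # ~0.65% of parameters are trainable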

📁 Project Structure

chess-llm/
├── src/
│   └── train.py              # Main training script
├── models/
│   └── chess-gpt2-final/     # Trained model weights
├── data/                     # Training data (generated)
├── train.sh                  # Training script runner
├── demo.py                   # Results demonstration  
├── requirements.txt          # Python dependencies
└── README.md                 # This file

🔧 Technical Details

Training Configuration

  • LoRA Rank: 8 (efficient parameter usage)
  • Learning Rate: 5e-4
  • Batch Size: 1 (with gradient accumulation)
  • Training Steps: 50
  • Optimizer: AdamW
  • Device: Apple Metal Performance Shaders (MPS)
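
As a rough sketch, the equivalent Hugging Face TrainingArguments would look something like this (output_dir, gradient_accumulation_steps, and logging_steps are illustrative; the MPS device is picked up automatically when available):

    from transformers import TrainingArguments

    training_args = TrainingArguments(
        output_dir="models/chess-gpt2-final",   # illustrative path
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,          # illustrative
        learning_rate=5e-4,
        max_steps=50,
        optim="adamw_torch",                    # AdamW optimizer
        logging_steps=10,                       # illustrative
    )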

Model Files

After training, you'll find:

  • adapter_config.json - LoRA configuration
  • adapter_model.safetensors - Fine-tuned weights (3.1MB)
  • Tokenizer files - GPT-2 tokenizer components
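
If you prefer a single standalone checkpoint instead of base model plus adapter, the LoRA weights can be merged back into GPT-2. This is a sketch using peft's merge utility, not part of the project's scripts; the output path is illustrative.

    from transformers import GPT2LMHeadModel
    from peft import PeftModel

    base = GPT2LMHeadModel.from_pretrained("gpt2")
    model = PeftModel.from_pretrained(base, "models/chess-gpt2-final")

    merged = model.merge_and_unload()                     # fold LoRA deltas into the base weights
    merged.save_pretrained("models/chess-gpt2-merged")    # illustrative output path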

🎮 Usage Examples

Training:

./train.sh

View Results:

python demo.py

Check Model:

ls -la models/chess-gpt2-final/

📈 Performance

  • Training Time: ~13 seconds on M2 MacBook Pro
  • Model Size: 3.1MB fine-tuned weights
  • Trainable Parameters: 811,008 (0.65% of total)
  • Base Parameters: 124,439,808 (frozen)
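
These counts can be reproduced from the PEFT-wrapped model (here model is the get_peft_model output from the Architecture sketch above):

    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"{trainable:,} trainable / {total:,} total ({100 * trainable / total:.2f}%)")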

🔗 Inspiration

This project was inspired by a Medium article on fine-tuning small language models. This implementation is a simplified, Apple Silicon-native version that achieves similar results using GPT-2 and standard PyTorch libraries.

🛠 Requirements

See requirements.txt for exact versions. Key dependencies:

  • torch>=2.6.0 (with MPS support)
  • transformers>=4.40.0
  • peft>=0.10.0 (for LoRA)
  • datasets>=2.16.0
  • accelerate>=0.28.0

🤝 Contributing

Feel free to submit issues and enhancement requests! This project demonstrates efficient fine-tuning on Apple Silicon and can be extended to other chess-related NLP tasks.

📄 License

This project is open source. The trained model uses GPT-2 (MIT License) as its foundation.
