# Robust Deep Monte Carlo Counterfactual Regret Minimization: Addressing Theoretical Risks in Neural Fictitious Self-Play
A PyTorch implementation of Monte Carlo Counterfactual Regret Minimization (MCCFR) with deep neural networks for learning optimal strategies in imperfect information games.
This library implements state-of-the-art algorithms for solving imperfect information games using deep learning. It combines the theoretical foundations of Counterfactual Regret Minimization (CFR) with modern deep learning techniques to learn near-optimal strategies in complex game environments.
This is an implementation of the paper *Robust Deep Monte Carlo Counterfactual Regret Minimization: Addressing Theoretical Risks in Neural Fictitious Self-Play*.
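CFR's core update rule, regret matching, picks each action in proportion to its positive cumulative regret; deep variants replace the per-infoset regret table with a neural network. A minimal, dependency-free sketch of the rule (illustrative only, not this package's internal API):

```python
def regret_matching(cumulative_regrets):
    """Convert cumulative regrets into a strategy: probabilities are
    proportional to positive regrets; fall back to uniform if none are positive."""
    positives = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positives)
    if total > 0:
        return [p / total for p in positives]
    n = len(cumulative_regrets)
    return [1.0 / n] * n

# Example: cumulative regrets for three actions (e.g. fold, call, raise)
print(regret_matching([4.0, 0.0, 1.0]))    # -> [0.8, 0.0, 0.2]
print(regret_matching([-2.0, -1.0, -3.0])) # all non-positive -> uniform
```

Averaging the strategies produced by this rule over many iterations is what drives CFR toward a Nash equilibrium.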
- Multiple Neural Network Architectures: From simple feedforward networks to advanced transformer-based architectures
- Robust Training: Includes importance weight clipping, target networks, and variance reduction techniques
- Comprehensive Evaluation: Built-in exploitability calculation and strategy analysis tools
- Modular Design: Easy to extend to new games and network architectures
- Research-Ready: Includes experimental frameworks and diagnostic tools
- Kuhn Poker: A simplified poker variant, ideal for testing and research
- Leduc Poker: A more complex variant than Kuhn Poker, suited to testing scalability and robustness
- Extensible Framework: Easy to add new imperfect information games
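For a sense of scale, Kuhn poker is small enough to enumerate exhaustively, which is what makes it a useful correctness benchmark. A standalone sketch (not the package's `KuhnGame` API):

```python
from itertools import permutations

CARDS = ("J", "Q", "K")           # Kuhn poker's full 3-card deck
HISTORIES = ("", "c", "b", "cb")  # betting sequences at which a player still acts
                                  # (c = check/call, b = bet)

# Each deal gives one distinct private card to each of the two players.
deals = list(permutations(CARDS, 2))

# An information set is a player's private card plus the public betting history.
info_sets = {(card, h) for card in CARDS for h in HISTORIES}

print(len(deals))      # 6 possible deals
print(len(info_sets))  # 12 information sets in total
```

With only 12 information sets, exact exploitability can be computed analytically, so any learned strategy can be checked against ground truth.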
```bash
git clone https://github.com/your-username/robust-deep-mccfr.git
cd robust-deep-mccfr
pip install -e .
```
```python
from deep_mccfr import DeepMCCFR, KuhnGame

# Initialize the algorithm
mccfr = DeepMCCFR(
    network_type='ultra_deep',
    learning_rate=0.00003,
    batch_size=384,
)

# Train on Kuhn Poker
results = mccfr.train(num_iterations=10000)

print(f"Final exploitability: {results['final_exploitability']:.6f}")
print(f"Training time: {results['training_time']:.1f}s")
```
```python
from deep_mccfr import RobustDeepMCCFR, RobustMCCFRConfig

# Configure robust training
config = RobustMCCFRConfig(
    network_type='mega_transformer',
    exploration_epsilon=0.1,
    importance_weight_clip=10.0,
    use_target_networks=True,
    prioritized_replay=True,
    num_iterations=20000,
)

# Initialize robust MCCFR
robust_mccfr = RobustDeepMCCFR(config)

# Train with advanced features
results = robust_mccfr.train(config.num_iterations)
```
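The `exploration_epsilon` option above corresponds to the standard trick of mixing the sampling policy with a uniform distribution, so every action keeps a nonzero probability of being sampled and importance weights stay finite. A minimal sketch of that mixing (illustrative, not the library's exact code):

```python
def mix_with_uniform(strategy, epsilon):
    """Blend a strategy with the uniform distribution:
    pi'(a) = (1 - eps) * pi(a) + eps / |A|."""
    n = len(strategy)
    return [(1 - epsilon) * p + epsilon / n for p in strategy]

# Even an action the network assigns zero probability is sampled
# with probability at least eps / |A| under the mixed policy.
sampling = mix_with_uniform([0.9, 0.1, 0.0], epsilon=0.1)
print(sampling)
```

With `epsilon=0.1` and three actions, no action is sampled with probability below 0.1/3, which directly bounds the largest importance weight a sample can produce.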
The library includes several neural network architectures optimized for strategy learning:
- BaseNN: Simple feedforward network with dropout
- DeepResidualNN: Deep residual network with skip connections
- FeatureAttentionNN: Self-attention mechanism for feature interactions
- HybridAdvancedNN: Combines attention and residual processing
- MegaTransformerNN: Large-scale transformer architecture
- UltraDeepNN: Ultra-deep network with bottleneck residual blocks
- Feature Extraction: Sophisticated state representation for game states
- Experience Replay: Prioritized sampling for stable learning
- Risk Mitigation: Multiple techniques to ensure robust training
- Diagnostic Tools: Comprehensive monitoring and analysis
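As one concrete example of risk mitigation, importance weight clipping (the `importance_weight_clip` option) bounds the ratio between the target and sampling policies before it scales a regret estimate, trading a small bias for much lower variance. A hedged sketch of the idea, not the library's exact implementation:

```python
def clipped_importance_weight(target_prob, sampling_prob, clip=10.0):
    """Ratio pi(a)/q(a) used to de-bias sampled regret estimates,
    clipped so a single rare sample cannot inject unbounded variance."""
    return min(target_prob / sampling_prob, clip)

print(clipped_importance_weight(0.5, 0.01))  # raw ratio 50 -> clipped to 10.0
print(clipped_importance_weight(0.5, 0.25))  # raw ratio 2 -> unchanged
```

Without the clip, a state sampled with probability 0.01 but played with probability 0.5 would weight its regret estimate by 50x, dominating the training batch.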
The library includes extensive experimental frameworks for comparing different approaches:
```python
from deep_mccfr.experiments import ExperimentRunner, get_ablation_configs

# Run systematic ablation study
runner = ExperimentRunner()
configs = get_ablation_configs()

for config in configs:
    results = runner.run_experiment(config)

# Analyze results
runner.analyze_results()
```
If you use this library in your research, please cite:
```bibtex
@software{eljaafari2024dlmccfr,
  author  = {El Jaafari, Zakaria},
  title   = {Deep Learning Monte Carlo Counterfactual Regret Minimization},
  url     = {https://github.com/nier2kirito/robust-deep-mccfr},
  version = {1.0.0},
  year    = {2024}
}
```
```bash
# Clone the repository
git clone https://github.com/your-username/robust-deep-mccfr.git
cd robust-deep-mccfr

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

# Format code
black src/

# Type checking
mypy src/
```
```text
robust-deep-mccfr/
├── src/deep_mccfr/        # Main package
│   ├── games/             # Game implementations
│   ├── networks.py        # Neural network architectures
│   ├── mccfr.py           # Core MCCFR algorithms
│   ├── features.py        # Feature extraction
│   ├── utils.py           # Utility functions
│   └── __init__.py        # Package initialization
├── examples/              # Example scripts
├── tests/                 # Unit tests
├── docs/                  # Documentation
├── requirements.txt       # Dependencies
├── setup.py               # Package setup
└── README.md              # This file
```
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
```python
# Available network architectures
NETWORK_TYPES = [
    'simple',             # Basic feedforward
    'deep_residual',      # Deep residual network
    'feature_attention',  # Attention-based
    'hybrid_advanced',    # Hybrid architecture
    'mega_transformer',   # Large transformer
    'ultra_deep',         # Ultra-deep network
]
```
```python
# Common training configurations
CONFIGS = {
    'fast': {
        'batch_size': 128,
        'learning_rate': 0.001,
        'train_every': 50,
    },
    'stable': {
        'batch_size': 384,
        'learning_rate': 0.00003,
        'train_every': 25,
    },
    'research': {
        'batch_size': 512,
        'learning_rate': 0.00001,
        'train_every': 10,
    },
}
```
- CUDA Out of Memory: Reduce batch size or use a smaller network
- Slow Training: Enable GPU acceleration or use simpler architectures
- Numerical Instability: Adjust learning rate or enable gradient clipping
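On the last point, gradient clipping rescales the gradient vector so its global L2 norm never exceeds a threshold (in PyTorch, `torch.nn.utils.clip_grad_norm_`). The underlying arithmetic in a dependency-free sketch:

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so the overall
    L2 norm is at most max_norm (a no-op if already smaller)."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return grads
    scale = max_norm / norm
    return [g * scale for g in grads]

clipped = clip_by_global_norm([6.0, 8.0], max_norm=5.0)
print(clipped)  # norm 10 rescaled to 5 -> [3.0, 4.0]
```

Because the whole vector is scaled by one factor, the gradient's direction is preserved; only its magnitude is capped.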
- Use GPU acceleration for large networks
- Adjust batch size based on available memory
- Use mixed precision training for faster computation
- Enable experience replay for sample efficiency
This project is licensed under the MIT License - see the LICENSE file for details.
- Original MCCFR algorithm by Lanctot et al.
- Deep CFR extensions by Brown et al.
- PyTorch team for the excellent deep learning framework
- Game theory research community
- Author: Zakaria El Jaafari
- Email: zakariaeljaafari0@gmail.com
- GitHub: https://github.com/nier2kirito
⭐ If you find this project helpful, please consider giving it a star on GitHub!