Robust Deep Monte Carlo Counterfactual Regret Minimization: Addressing Theoretical Risks in Neural Fictitious Self-Play

License: MIT Python 3.8+ PyTorch

A PyTorch implementation of Monte Carlo Counterfactual Regret Minimization (MCCFR) with deep neural networks for learning optimal strategies in imperfect information games.

🎯 Overview

This library implements state-of-the-art algorithms for solving imperfect information games using deep learning. It combines the theoretical foundations of Counterfactual Regret Minimization (CFR) with modern deep learning techniques to learn near-optimal strategies in complex game environments.

👉 This is an implementation of the paper: Robust Deep Monte Carlo Counterfactual Regret Minimization: Addressing Theoretical Risks in Neural Fictitious Self-Play.

Key Features

  • Multiple Neural Network Architectures: From simple feedforward networks to advanced transformer-based architectures
  • Robust Training: Includes importance weight clipping, target networks, and variance reduction techniques
  • Comprehensive Evaluation: Built-in exploitability calculation and strategy analysis tools
  • Modular Design: Easy to extend to new games and network architectures
  • Research-Ready: Includes experimental frameworks and diagnostic tools
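The importance weight clipping mentioned above can be illustrated with a short sketch. The function name and guard behavior here are illustrative, not the library's API; the threshold mirrors the `importance_weight_clip=10.0` setting shown in the configuration example later in this README:

```python
# Hedged sketch: importance weight clipping for off-policy (sampled) CFR updates.
# Clipping the ratio target_prob / sampling_prob bounds the variance introduced
# by rarely-sampled actions, at the cost of some bias.
def clip_importance_weight(target_prob, sampling_prob, clip=10.0):
    """Return min(target_prob / sampling_prob, clip), guarding against
    a zero or negative sampling probability."""
    if sampling_prob <= 0.0:
        return clip  # degenerate sample; cap at the clip value
    return min(target_prob / sampling_prob, clip)
```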

Supported Games

  • Kuhn Poker: A simplified poker variant, perfect for testing and research
  • Leduc Poker: A poker variant more complex than Kuhn Poker, perfect for testing scalability and robustness
  • Extensible Framework: Easy to add new imperfect information games

🚀 Quick Start

Installation

From Source (Recommended)

git clone https://github.com/your-username/robust-deep-mccfr.git
cd robust-deep-mccfr
pip install -e .

Basic Usage

from deep_mccfr import DeepMCCFR, KuhnGame

# Initialize the algorithm
mccfr = DeepMCCFR(
    network_type='ultra_deep',
    learning_rate=0.00003,
    batch_size=384
)

# Train on Kuhn Poker
results = mccfr.train(num_iterations=10000)

print(f"Final exploitability: {results['final_exploitability']:.6f}")
print(f"Training time: {results['training_time']:.1f}s")

Advanced Usage with Robust Features

from deep_mccfr import RobustDeepMCCFR, RobustMCCFRConfig

# Configure robust training
config = RobustMCCFRConfig(
    network_type='mega_transformer',
    exploration_epsilon=0.1,
    importance_weight_clip=10.0,
    use_target_networks=True,
    prioritized_replay=True,
    num_iterations=20000
)

# Initialize robust MCCFR
robust_mccfr = RobustDeepMCCFR(config)

# Train with advanced features
results = robust_mccfr.train(config.num_iterations)

πŸ—οΈ Architecture

Neural Network Architectures

The library includes several neural network architectures optimized for strategy learning:

  1. BaseNN: Simple feedforward network with dropout
  2. DeepResidualNN: Deep residual network with skip connections
  3. FeatureAttentionNN: Self-attention mechanism for feature interactions
  4. HybridAdvancedNN: Combines attention and residual processing
  5. MegaTransformerNN: Large-scale transformer architecture
  6. UltraDeepNN: Ultra-deep network with bottleneck residual blocks
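As a rough sketch of the bottleneck-residual idea behind UltraDeepNN, the block below down-projects, processes, and up-projects with a skip connection. Layer sizes, layer choices, and the class name are assumptions for illustration, not the library's implementation:

```python
import torch
import torch.nn as nn

class BottleneckResidualBlock(nn.Module):
    """Illustrative bottleneck residual block: project down to a narrow
    hidden size, process, project back up, and add a skip connection."""
    def __init__(self, dim, bottleneck_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, bottleneck_dim),  # down-project
            nn.ReLU(),
            nn.Linear(bottleneck_dim, bottleneck_dim),
            nn.ReLU(),
            nn.Linear(bottleneck_dim, dim),  # up-project
        )
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        # residual connection keeps gradients flowing through deep stacks
        return self.norm(x + self.net(x))

block = BottleneckResidualBlock(dim=128, bottleneck_dim=32)
out = block(torch.randn(4, 128))  # input shape is preserved
```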

Key Components

  • Feature Extraction: Sophisticated state representation for game states
  • Experience Replay: Prioritized sampling for stable learning
  • Risk Mitigation: Multiple techniques to ensure robust training
  • Diagnostic Tools: Comprehensive monitoring and analysis
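The prioritized replay component can be sketched generically: sample replay indices with probability proportional to each transition's priority. This is an illustration only (production buffers typically use a sum-tree for O(log n) sampling), not the library's implementation:

```python
import random

def sample_prioritized(priorities, k, rng=random):
    """Draw k indices with probability proportional to priorities."""
    total = sum(priorities)
    cum, cdf = 0.0, []
    for p in priorities:
        cum += p
        cdf.append(cum)
    idxs = []
    for _ in range(k):
        r = rng.random() * total
        # linear scan is fine for a sketch; real buffers use a sum-tree
        idxs.append(next(i for i, c in enumerate(cdf) if c >= r))
    return idxs
```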

📊 Experimental Results

The library includes extensive experimental frameworks for comparing different approaches:

from deep_mccfr.experiments import ExperimentRunner, get_ablation_configs

# Run systematic ablation study
runner = ExperimentRunner()
configs = get_ablation_configs()

for config in configs:
    results = runner.run_experiment(config)
    
# Analyze results
runner.analyze_results()

Citation

If you use this library in your research, please cite:

@software{eljaafari2024dlmccfr,
  author = {El Jaafari, Zakaria},
  title = {Deep Learning Monte Carlo Counterfactual Regret Minimization},
  url = {https://github.com/nier2kirito/robust-deep-mccfr},
  version = {1.0.0},
  year = {2024}
}

πŸ› οΈ Development

Setting up Development Environment

# Clone the repository
git clone https://github.com/your-username/robust-deep-mccfr.git
cd robust-deep-mccfr

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

# Format code
black src/

# Type checking
mypy src/

Project Structure

robust-deep-mccfr/
├── src/robust-deep-mccfr/  # Main package
│   ├── games/              # Game implementations
│   ├── networks.py         # Neural network architectures
│   ├── mccfr.py            # Core MCCFR algorithms
│   ├── features.py         # Feature extraction
│   ├── utils.py            # Utility functions
│   └── __init__.py         # Package initialization
├── examples/               # Example scripts
├── tests/                  # Unit tests
├── docs/                   # Documentation
├── requirements.txt        # Dependencies
├── setup.py                # Package setup
└── README.md               # This file

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

🔧 Configuration

Network Types

# Available network architectures
NETWORK_TYPES = [
    'simple',           # Basic feedforward
    'deep_residual',    # Deep residual network
    'feature_attention', # Attention-based
    'hybrid_advanced',  # Hybrid architecture
    'mega_transformer', # Large transformer
    'ultra_deep'        # Ultra-deep network
]

Training Parameters

# Common training configurations
CONFIGS = {
    'fast': {
        'batch_size': 128,
        'learning_rate': 0.001,
        'train_every': 50
    },
    'stable': {
        'batch_size': 384,
        'learning_rate': 0.00003,
        'train_every': 25
    },
    'research': {
        'batch_size': 512,
        'learning_rate': 0.00001,
        'train_every': 10
    }
}

πŸ› Troubleshooting

Common Issues

  1. CUDA Out of Memory: Reduce batch size or use a smaller network
  2. Slow Training: Enable GPU acceleration or use simpler architectures
  3. Numerical Instability: Adjust learning rate or enable gradient clipping
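For issue 3, gradient clipping in PyTorch is a one-liner via the standard utility; the model, loss, and threshold below are illustrative:

```python
import torch
import torch.nn as nn

# Minimal sketch of gradient clipping for numerical instability.
model = nn.Linear(8, 3)
loss = model(torch.randn(4, 8)).pow(2).mean()
loss.backward()

# Rescale gradients so their total norm is at most 1.0; returns the
# pre-clipping norm, which is useful for logging instability.
total_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```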

Performance Optimization

  • Use GPU acceleration for large networks
  • Adjust batch size based on available memory
  • Use mixed precision training for faster computation
  • Enable experience replay for sample efficiency
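Mixed precision training can be enabled with PyTorch's autocast context. The sketch below runs on CPU with bfloat16 so it works anywhere; on GPU you would use `device_type='cuda'` (typically float16 together with a `GradScaler`). The model and data are illustrative:

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 4)
x = torch.randn(8, 16)

# Under autocast, eligible ops (e.g. linear layers) run in lower precision,
# trading a little accuracy for speed and memory.
with torch.autocast(device_type='cpu', dtype=torch.bfloat16):
    out = model(x)
```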

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Original MCCFR algorithm by Lanctot et al.
  • Deep CFR extensions by Brown et al.
  • PyTorch team for the excellent deep learning framework
  • Game theory research community

📞 Contact


⭐ If you find this project helpful, please consider giving it a star on GitHub!