Colver

Fast Belote Contree game environment for reinforcement learning. Rust core with Python bindings.

Live demo: avok.me/colver/ — running on a Raspberry Pi.

Features

~1.4M rollouts/sec single-threaded (play phase), ~895K rollouts/sec on a full deal
56-byte Copy game state for fast MCTS cloning
Six AI agents — DMC Q-network, IS-DD with belief network, DD oracle, Smart/Naive IS-MCTS, and heuristic
NN bidding — "Bid a Dede" (v2), a Dueling DQN trained on DD-solved deals with suit augmentation, used by all agents
Belief network — NN-based card location prediction for IS-DD search
Web interface — play against AI, spectate, analyze, and solve problems (FastAPI + WebSocket)
Python bindings via PyO3 — Env class with full type stubs, installable from PyPI
Zero dependencies in the core (only rand behind a feature flag)

Web Interface

Play against AI agents directly in your browser at avok.me/colver/, or run it locally:

uv run python -m colver.web
# Or: uv run colver-web
# Open http://localhost:8000

Humain vs IA (Human vs AI): Play as South against AI opponents. Choose the agent for your opponents (East/West) and your partner (North) independently. The game follows official FFB Belote Contree rules: bidding with coinche/surcoinche, then 8 tricks. Cards are played instantly on click; the pause slider controls the AI thinking delay.

IA vs IA (AI vs AI): Spectate AI vs AI matches with all hands visible. Assign a different agent to each of the 4 seats. Step through actions, play full tricks, or use auto-play. The stats panel shows Q-values, DD scores, or hand evaluations for each decision. Paste a CFN string to load a specific position.

Rejouer (Replay): Browse and replay past games (played or spectated). Click an entry to step through it with navigation controls.

Annonces (Bidding): Compose an 8-card hand, choose your position in the bidding round, and see what Bid a Dede (the NN bidder) would bid, with Q-values for every legal action.

Croyances (Beliefs): Visualize how the belief network and the heuristic model predict card locations as a game progresses. Generate a random game, step through it, and see per-card probability bars with a ground-truth overlay and accuracy stats. Switch observer perspective (N/E/S/W) and compare NN and heuristic predictions side by side.

Problemes d'annonce (Bidding problems): Bidding practice problems. See a hand and the bidding history, then find the right bid. The AI evaluates your answer against the NN bidder's recommendation.

Problemes de jeu (Card play problems): Card play practice problems. See a mid-game position and find the best card. Compare your choice to the DD solver's optimal play.

Build & Run

Requires Rust 1.70+ and Python 3.10+.

# Tests (357 tests)
cargo test -p colver-core

# Performance benchmark
cargo run -p colver-core --bin bench --release

# MCTS vs random demo
cargo run -p colver-core --bin mcts_demo --release -- 100

# Smart IS-MCTS vs random + vs naive demo
cargo run -p colver-core --bin smart_ismcts_demo --release -- 100

# Python bindings (via uv)
uv sync
uv run python3 -c "import colver; env = colver.Env(); print(env.reset())"

# Web interface (play against AI)
uv run python -m colver.web

# DMC training (Q-network)
PYTHONPATH=scripts/training uv run python scripts/training/train_dmc.py --num-envs 256 --steps 20000000

# DMC evaluation vs IS-MCTS
uv run python scripts/analysis/eval_dmc.py models/dmc_final.pt --baseline smart --time-ms 20 --both-sides

AI Agents

Oracle — DD Solver (`solver.rs`)

Perfect-information double-dummy solver that sees all 4 hands — it cheats. Alpha-beta with transposition tables, PVS, killer moves, and card equivalence pruning. Computes the exact optimal card in ~7ms (median). Useful as an upper bound.
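
To illustrate the core idea behind the solver, here is a minimal alpha-beta sketch on a toy game tree. It is not the code in solver.rs: it omits transposition tables, PVS, killer moves and card equivalence pruning, and the nested-list tree format and the `visited` parameter are assumptions made for this example only.

```python
import math
from typing import List, Optional, Union

Tree = Union[int, float, list]  # a leaf score, or a list of child subtrees


def alphabeta(node: Tree, alpha: float, beta: float, maximizing: bool,
              visited: Optional[List[float]] = None) -> float:
    """Return the minimax value of `node`, skipping branches that cannot matter."""
    if isinstance(node, (int, float)):
        if visited is not None:
            visited.append(node)  # record which leaves were actually evaluated
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, best)
            if alpha >= beta:  # the minimizing side already has a better option
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, best)
        if alpha >= beta:  # the maximizing side already has a better option
            break
    return best


# Toy tree: the root maximizes, its two children minimize.
leaves_seen: List[float] = []
value = alphabeta([[3, 5], [2, 9]], -math.inf, math.inf, True, leaves_seen)
```

In this toy tree the leaf 9 is never evaluated: once the second branch is known to be worth at most 2, it cannot beat the 3 already secured by the first branch.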

Dede — IS-DD (`is_dd.rs`)

Information Set Double-Dummy search. Maintains a probabilistic belief model over hidden cards — updated after every action via hard constraints (voids, trump ceiling) and soft inference (bidding signals, play patterns). Optionally augmented with a belief network (NN-based card location prediction, 330→512→512→128, ~2MB). Samples plausible opponent hands weighted by these beliefs, then solves each world exactly with the alpha-beta DD solver. IS-DD sounds like "is Dede" — hence the name.
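
The sampling step described above can be sketched as follows. This is a simplified illustration, not the sampler in is_dd.rs: the `beliefs` and `hand_sizes` data layouts and the function name are assumptions, and the sequential draw with retries only approximates sampling from the belief distribution. A weight of 0 stands for a hard constraint such as a known void.

```python
import random
from typing import Dict, List


def sample_world(beliefs: Dict[int, Dict[str, float]],
                 hand_sizes: Dict[str, int],
                 rng: random.Random,
                 max_tries: int = 1000) -> Dict[str, List[int]]:
    """Assign every hidden card to one player, weighted by per-card beliefs.

    beliefs[card][player] is a non-negative weight; 0 means "impossible".
    hand_sizes[player] is the number of hidden cards that player holds.
    """
    cards = list(beliefs)
    if sum(hand_sizes.values()) != len(cards):
        raise ValueError("hand sizes must add up to the number of hidden cards")
    for _ in range(max_tries):
        remaining = dict(hand_sizes)
        world: Dict[str, List[int]] = {p: [] for p in hand_sizes}
        order = cards[:]
        rng.shuffle(order)  # vary which cards get first pick of the free slots
        feasible = True
        for card in order:
            candidates = [p for p in remaining
                          if remaining[p] > 0 and beliefs[card].get(p, 0.0) > 0.0]
            if not candidates:  # dead end: discard this attempt and retry
                feasible = False
                break
            weights = [beliefs[card][p] for p in candidates]
            owner = rng.choices(candidates, weights=weights, k=1)[0]
            world[owner].append(card)
            remaining[owner] -= 1
        if feasible:
            return {p: sorted(cs) for p, cs in world.items()}
    raise RuntimeError("no consistent world found")


# Six hidden cards, two per player; West is known to be void in cards 0 and 1.
demo_beliefs = {c: {"W": 1.0, "N": 1.0, "E": 1.0} for c in range(6)}
demo_beliefs[0]["W"] = 0.0
demo_beliefs[1]["W"] = 0.0
demo_world = sample_world(demo_beliefs, {"W": 2, "N": 2, "E": 2}, random.Random(0))
```

Each sampled world would then be handed to the DD solver, and the per-card results aggregated across worlds.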

DouDou50 — DMC Q-Network (`dmc_net.rs`)

DouZero-style reinforcement learning agent. A Q-network picks card plays with a single forward pass — no search tree. Default play model, trained 50M steps with Bid a Dede frozen (triforge play-only phase).

Architecture: ResNet Dueling DQN 411→1024→1024→1024→32 with LayerNorm and skip connections (~2.6M parameters). Uses canonical suit encoding (no augmentation needed). Inference in pure Rust (~1ms/decision, no PyTorch needed). Strongest overall agent.
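
As a rough illustration of how a dueling head turns one forward pass into a card choice, the sketch below combines a state value and per-action advantages with the standard dueling aggregation, then picks the best legal action. The function names are hypothetical, and whether dmc_net.rs averages advantages over all 32 outputs or only over legal cards is not stated here; this sketch assumes all outputs.

```python
from typing import List, Sequence


def dueling_q_values(value: float, advantages: Sequence[float]) -> List[float]:
    """Standard dueling aggregation: Q(a) = V + A(a) - mean(A)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def greedy_legal_action(q_values: Sequence[float], legal_mask: Sequence[int]) -> int:
    """Return the index of the highest Q-value among legal actions only."""
    legal = [i for i, ok in enumerate(legal_mask) if ok]
    if not legal:
        raise ValueError("no legal action available")
    return max(legal, key=lambda i: q_values[i])


# Four-action toy example: action 2 has the best Q-value but is illegal.
q = dueling_q_values(1.0, [0.0, 2.0, 4.0, -2.0])
best = greedy_legal_action(q, [1, 1, 0, 1])
```

Masking before the argmax matters because the network scores all 32 card slots, including cards the player does not hold or may not legally play.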

The previous model DouDou35 (415→1024→1024→1024→32, legacy observation format, 35M steps) is still supported for backward compatibility. The name DouDou is a reference to DouZero.

Older search agents

Smart IS-MCTS (smart_ismcts.rs): Belief-weighted Information Set MCTS with heuristic card beliefs.
Naive IS-MCTS (naive_ismcts.rs): Ensemble determinization without beliefs.

Both are configurable and documented in docs/SMART_ISMCTS.md.

Bid a Dede — NN Bidder (`bid_net.rs`)

Dueling DQN (108→512→512→512→43) trained on DD-solved deals with 24x suit augmentation. Default bidder for all agents. It beats the best heuristic bidder with a 70-76% win rate, whichever play engine is used. BidNet::load auto-detects the hidden size (tries 256, 512, 1024).
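
The 24x factor presumably corresponds to the 4! = 24 ways of relabeling the four suits. The sketch below applies every suit permutation to a 32-bit card mask laid out as 8 bits per suit; the function names are hypothetical, and how the training pipeline applies the same permutation to observations and to the suit of each bid is not described here.

```python
from itertools import permutations
from typing import List, Sequence


def permute_suits(card_set: int, perm: Sequence[int]) -> int:
    """Move the 8 cards of suit s to suit slot perm[s] in a 32-bit card mask."""
    out = 0
    for suit, target in enumerate(perm):
        bits = (card_set >> (8 * suit)) & 0xFF  # the 8 rank bits of this suit
        out |= bits << (8 * target)
    return out


def augment(card_set: int) -> List[int]:
    """Return the images of a hand under all 24 suit permutations."""
    return [permute_suits(card_set, p) for p in permutations(range(4))]


# A hand with a different holding in each suit gives 24 distinct images.
hand = 0x01 | (0x03 << 8) | (0x07 << 16) | (0x0F << 24)
images = augment(hand)
```

Any bid or contract suit attached to a training sample has to go through the same permutation, otherwise the label no longer matches the hand.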

The previous model Bid a Doudou (v1, 114→256→256→43, trained with DouZero self-play) is still supported.

Agent Comparison

Agent	Type	Speed/move	Notes
Oracle (DD)	DD solver (cheats)	~7ms	Perfect info upper bound
Dede (IS-DD)	DD solver + beliefs	~20ms	Strongest search-based
DouDou50	Q-network (ResNet)	<1ms	Strongest overall, no search
Smart IS-MCTS	Search + beliefs	~9ms	Configurable budget
Naive IS-MCTS	Search	~8ms	Configurable budget

Note: Search-based agents get stronger with more time budget. The DMC agent uses no search — one forward pass per decision.

Architecture

Workspace: colver-core (pure Rust) + colver-py (PyO3/NumPy FFI) + colver-web (FastAPI/WebSocket)

Card Representation

Bitmask system: Card = u8 (0-31), CardSet = u32 (bitmask). Layout: Spades[0-7], Hearts[8-15], Diamonds[16-23], Clubs[24-31]. Within each suit: 7, 8, 9, J, Q, K, 10, A (plain strength order). Trump strength: J > 9 > A > 10 > K > Q > 8 > 7.
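
The layout above can be mirrored in a few lines of Python. This is an illustrative sketch of the encoding only, not the colver-core API: all function names are hypothetical.

```python
from typing import List, Optional

SUITS = ("Spades", "Hearts", "Diamonds", "Clubs")
RANKS = ("7", "8", "9", "J", "Q", "K", "10", "A")        # index = plain strength
TRUMP_ORDER = ("7", "8", "Q", "K", "10", "A", "9", "J")  # weakest to strongest


def card_index(suit: int, rank: int) -> int:
    """Card index 0-31 from suit (0-3) and rank position (0-7)."""
    return suit * 8 + rank


def suit_of(card: int) -> int:
    return card >> 3


def rank_of(card: int) -> int:
    return card & 7


def card_name(card: int) -> str:
    return f"{RANKS[rank_of(card)]} of {SUITS[suit_of(card)]}"


def strength(card: int, trump_suit: Optional[int]) -> int:
    """Strength of a card within its own suit (higher beats lower)."""
    if trump_suit is not None and suit_of(card) == trump_suit:
        return TRUMP_ORDER.index(RANKS[rank_of(card)])
    return rank_of(card)


def suit_mask(suit: int) -> int:
    """Bitmask selecting the 8 cards of one suit."""
    return 0xFF << (suit * 8)


def cards_in(card_set: int) -> List[int]:
    """List the card indices present in a 32-bit card set."""
    return [c for c in range(32) if (card_set >> c) & 1]


# A three-card set: 7 of Spades, J of Hearts, A of Clubs.
jack_hearts = card_index(1, 3)
hand = (1 << 0) | (1 << jack_hearts) | (1 << 31)
hearts_held = hand & suit_mask(1)
```

With this encoding, set operations such as "cards of the led suit in hand" reduce to a single bitwise AND.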

Game State

GameState is Copy and ≤96 bytes (compile-time enforced) for fast MCTS cloning. Contains hands, current trick, contract, points/tricks per team, bidding state, played cards bitmask, void tracking, and belote tracking.

Action Encoding

Phase	Actions	Encoding
Bidding	43 total	0=PASS, 1-36=bids (9 values x 4 suits), 37-40=capot x 4, 41=COINCHE, 42=SURCOINCHE
Playing	32 total	Card index 0-31 directly
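
The bidding ranges in the table can be decoded with a small helper. The sketch below only classifies an action and reports its offset inside its range; how the 36 bid offsets map to a (value, suit) pair is not stated in the table, so it is deliberately left undecoded. The function name is hypothetical.

```python
from typing import Optional, Tuple

NUM_BID_ACTIONS = 43


def describe_bid_action(action: int) -> Tuple[str, Optional[int]]:
    """Return (kind, offset) for a bidding action index in 0..42."""
    if not 0 <= action < NUM_BID_ACTIONS:
        raise ValueError(f"bidding action out of range: {action}")
    if action == 0:
        return ("pass", None)
    if action <= 36:
        return ("bid", action - 1)     # offset 0-35: 9 values x 4 suits
    if action <= 40:
        return ("capot", action - 37)  # offset 0-3: one per suit
    if action == 41:
        return ("coinche", None)
    return ("surcoinche", None)


kinds = [describe_bid_action(a)[0] for a in range(NUM_BID_ACTIONS)]
```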

Game Flow

Bidding → Playing → Done. Bidding ends after 3 consecutive passes following a bid, after a surcoinche, or after 4 passes with no bid (void deal). Playing runs 8 tricks of 4 cards. Card points total 152; adding the 10-point dix de der (last-trick bonus) gives 162 for a normal deal, and a capot is worth 252.
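
The 152 and 162 totals follow from the card values. The check below uses the standard Belote point values, which are general game knowledge rather than constants taken from this repository.

```python
# Standard Belote card values (trump suit vs. the three plain suits).
TRUMP_POINTS = {"J": 20, "9": 14, "A": 11, "10": 10, "K": 4, "Q": 3, "8": 0, "7": 0}
PLAIN_POINTS = {"A": 11, "10": 10, "K": 4, "Q": 3, "J": 2, "9": 0, "8": 0, "7": 0}
DIX_DE_DER = 10  # bonus for winning the last trick

trump_total = sum(TRUMP_POINTS.values())         # one trump suit
plain_total = sum(PLAIN_POINTS.values())         # each of the three plain suits
card_points = trump_total + 3 * plain_total      # all 32 cards
deal_total = card_points + DIX_DE_DER            # normal deal
```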

Python API

import colver

print(colver.__version__)  # "0.3.2"

# Single environment
env = colver.Env()
obs, legal_actions = env.reset()
# `action` must be a currently legal action index (see Action Encoding)
obs, reward, done, legal_actions = env.step(action)

env.current_player()       # 0-3
env.phase()                # 0=Bidding, 1=Playing, 2=Done
env.legal_action_mask()    # numpy array (43,)
env.rewards()              # [NS_score, EW_score]
env.bid_improved()         # improved_bid action
env.deal_outcome()         # [NS_outcome, EW_outcome] binary
env.get_observation()      # 415-float observation vector
env.action_naive_ismcts(20)  # naive IS-MCTS action (20ms)
env.action_smart_ismcts(20)  # smart IS-MCTS action (20ms)

# DMC Q-network (if model weights downloaded)
model = colver.model_path()  # ~/.cache/colver/models/dmc_final.bin
if model:
    env.load_dmc_model(str(model))
    result = env.action_dmc_with_stats()  # {"best_action": 5, "q_values": [...]}
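
Putting these calls together, a random playout loop might look like the sketch below. It assumes that the non-zero entries of legal_action_mask() mark the legal action indices in both the bidding and playing phases, and that step() accepts a plain int; neither is confirmed above. The helper name is hypothetical.

```python
import random


def play_random_episode(env, rng: random.Random):
    """Play one deal with uniformly random legal actions.

    Returns (number of steps taken, final rewards as reported by env.rewards()).
    """
    env.reset()
    done = False
    steps = 0
    while not done:
        mask = env.legal_action_mask()
        legal = [i for i, ok in enumerate(mask) if ok]  # assumed: non-zero = legal
        action = rng.choice(legal)
        _obs, _reward, done, _legal_actions = env.step(action)
        steps += 1
    return steps, env.rewards()


# Usage with the real bindings (not executed here):
#   import colver
#   steps, rewards = play_random_episode(colver.Env(), random.Random(0))
```

The function only relies on reset(), legal_action_mask(), step() and rewards(), so it can also be exercised against a stub environment in tests.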

Performance

Workload	Throughput	Latency
Play-phase rollout	1.4M/sec	~720 ns
Full-deal rollout	895K/sec	~1118 ns
MCTS game (1000 iter) vs random	—	8 ms
Smart IS-MCTS game (20x50) vs random	—	9 ms
DMC Q-Network inference	—	<1 ms

Docker

The Docker image lets you deploy the web interface on any machine, including a Raspberry Pi (ARM64).

# Build and run
docker build -t colver .
docker run -p 8000:8000 colver

# Or with Docker Compose
docker compose up -d

# Cross-build for Raspberry Pi (ARM64)
docker buildx build --platform linux/arm64 -t colver .

The image is ~257 MB (no PyTorch dependency). All agents run in pure Rust and work on all architectures.

Rules

Implements Belote Contree with 4 suits (Spades, Hearts, Diamonds, Clubs). Scoring mode: "points faits + points demandes" (points made plus points bid). See REGLES-DE-LA-BELOTE-CONTREE.pdf for the full FFB rulebook.

References

Kocsis, L. & Szepesvari, C. (2006). Bandit Based Monte-Carlo Planning. ECML.
Cowling, P.I., Powley, E.J. & Whitehouse, D. (2012). Information Set Monte Carlo Tree Search. IEEE Transactions on Computational Intelligence and AI in Games.
Zha, D. et al. (2021). DouZero: Mastering DouDiZhu with Self-Play Deep Reinforcement Learning. ICML.
Auer, P., Cesa-Bianchi, N. & Fischer, P. (2002). Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning.

Acknowledgments

Thanks to Ronan Guillou, seasoned coinche player, for his advice on the game and for being the first tester — his good sense guided many UI decisions.
