@AlexisOlson
Contributor

Summary

Implements node priors for MCTS by initializing each edge with virtual visits at expansion time. This biases PUCT selection toward the policy priors while adding only arithmetic to the selection hot path.

Mechanism

At node expansion, each edge is initialized with:

  • n₀(a) = K·P(a) virtual visits, where K = α·num_legal_moves
  • W₀(a) = n₀(a)·WL_parent virtual win-loss value
  • D₀(a) = n₀(a)·D_parent virtual draw value

These virtual statistics are set once and never updated. Real visits accumulate on top.
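The initialization above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `VirtualStats` struct and `InitVirtualStats` function are hypothetical names, and treating α as the value of the NodePrior option is an assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical container for the per-edge virtual statistics. The fields
// mirror n0_, w0_, d0_ from the description; the struct itself is assumed.
struct VirtualStats {
  float n0;  // virtual visits, n0(a) = K * P(a)
  float w0;  // virtual win-loss sum, n0(a) * WL_parent
  float d0;  // virtual draw sum, n0(a) * D_parent
};

// Computes virtual statistics for every legal move of a newly expanded node.
// `priors` holds P(a) for each legal move; `alpha` is assumed to be the
// NodePrior option value.
std::vector<VirtualStats> InitVirtualStats(const std::vector<float>& priors,
                                           float alpha, float wl_parent,
                                           float d_parent) {
  assert(alpha >= 0.0f);
  // K = alpha * num_legal_moves
  const float k = alpha * static_cast<float>(priors.size());
  std::vector<VirtualStats> out;
  out.reserve(priors.size());
  for (float p : priors) {
    const float n0 = k * p;
    out.push_back({n0, n0 * wl_parent, n0 * d_parent});
  }
  return out;
}
```

Because the priors sum to 1, the virtual visits across all edges of a node sum to K, so the total virtual weight scales with the number of legal moves.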

Modified PUCT Formula

U(a) = C_puct · P(a) · √N / (1 + n₀(a) + n_real(a) + n_in_flight(a))

Q(a) = (W_real(a) + W₀(a)) / (n_real(a) + n₀(a))
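A minimal sketch of how the two terms could be evaluated for one edge. The `EdgeStats` struct and the function names are hypothetical, and the fallback value returned when an edge has neither real nor virtual visits is an assumption; the description does not say how that case is handled.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical per-edge view combining real and virtual statistics.
struct EdgeStats {
  float p;            // policy prior P(a)
  float n0;           // virtual visits
  float w0;           // virtual win-loss sum
  float n_real;       // real visits
  float w_real;       // real win-loss sum
  float n_in_flight;  // visits currently in flight
};

// Q(a) = (W_real + W0) / (n_real + n0). `fallback` is returned when the
// denominator is zero (assumed behavior, e.g. for P(a) = 0).
float QValue(const EdgeStats& e, float fallback) {
  const float n = e.n_real + e.n0;
  return n > 0.0f ? (e.w_real + e.w0) / n : fallback;
}

// U(a) = C_puct * P(a) * sqrt(N) / (1 + n0 + n_real + n_in_flight),
// where `parent_n` is N from the formula.
float UValue(const EdgeStats& e, float c_puct, float parent_n) {
  return c_puct * e.p * std::sqrt(parent_n) /
         (1.0f + e.n0 + e.n_real + e.n_in_flight);
}
```

With no real visits, Q(a) reduces to W₀(a) / n₀(a) = WL_parent, so an unvisited edge starts at the parent's evaluation and moves away from it as real visits accumulate.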

Implementation Approach

  • Storage: Added n0_, w0_, d0_ as floats in Edge struct (~12 bytes/edge)
  • Overlay scope: Virtual visits combined with real visits locally in selection loops only
  • Hot path impact: Minimal - no new function calls, only arithmetic operations
  • Parent values: Uses fresh NN evaluation (robust, always available)
  • Time management: GetNStarted() remains real-only (virtual visits excluded)

Usage

Command line: lc0 --node-prior=0.5

UCI: setoption name NodePrior value 0.5

@AlexisOlson AlexisOlson marked this pull request as draft October 23, 2025 23:58