base WDL model on material count and normalize evals dynamically #5121


robertnurnberg
Contributor

This PR proposes to change the parameter dependence of Stockfish's internal WDL model from full move counter to material count. In addition, it ensures that an evaluation of 100 centipawns always corresponds to a 50% win probability at fishtest LTC, whereas for master this holds only at move number 32. See also #4920 and the discussion therein.
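As a rough illustration of the dynamic normalization described above, here is a minimal Python sketch. The coefficients are copied from the fitting output reported in this PR. The helper names (`poly`, `wdl_params`, `to_cp`, `win_probability`) are hypothetical, and the clamp of the material count to [10, 78] is an assumption based on `--materialMin 10` and on the starting position's material count under the usual 1/3/3/5/9 piece values. It is not the code in `src/uci.cpp`.

```python
import math

# Coefficients copied from the fitting output reported in this PR.
AS = (-185.71965483, 504.85014385, -438.58295743, 474.04604627)
BS = (89.23542728, -137.02141296, 73.28669021, 47.53376190)
ANCHOR = 58  # value passed as --momTarget


def poly(coeffs, m):
    """Evaluate the cubic in Horner form, highest-order coefficient first."""
    c3, c2, c1, c0 = coeffs
    return ((c3 * m + c2) * m + c1) * m + c0


def wdl_params(material):
    """Return (a, b) in internal value units for a given material count."""
    # Assumption: clamp to [10, 78] before rescaling by the anchor.
    m = min(max(material, 10), 78) / ANCHOR
    return poly(AS, m), poly(BS, m)


def to_cp(value, material):
    """Normalize an internal value so that 100 cp is the 50% win point."""
    a, _ = wdl_params(material)
    return round(100 * value / a)


def win_probability(value, material):
    """Logistic win-rate model with material-dependent a and b."""
    a, b = wdl_params(material)
    return 1.0 / (1.0 + math.exp((a - value) / b))


if __name__ == "__main__":
    for material in (20, 43, 58, 78):
        a, _ = wdl_params(material)
        print(material, round(a, 1), to_cp(a, material), win_probability(a, material))
```

Because the internal value is divided by `a` evaluated at the current material count, an internal value equal to `a` is displayed as 100 cp and maps to a 50% win probability at every material count, not only at the anchor.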

The new model was fitted based on about 340M positions extracted from 5.6M fishtest LTC games from the last three weeks, involving SF versions from e67cc97 (SF 16.1) to current master.

The commands used, from WDL_model (https://github.com/official-stockfish/WDL_model), are:

./updateWDL.sh --firstrev e67cc979fd2c0e66dfc2b2f2daa0117458cfc462
python scoreWDL.py updateWDL.json --plot save --pgnName update_material.png --momType "material" --momTarget 58 --materialMin 10 --modelFitting optimizeProbability

The anchor 58 for the material count value was chosen to be as close as possible to the observed average material count of fishtest LTC games at move 32 (43), while not changing the value of NormalizeToPawnValue compared to the move-based WDL model by more than 1.

The patch only affects the displayed cp and wdl values.

No functional change.

@robertnurnberg
Contributor Author

The output from the fitting script is:

> python scoreWDL.py updateWDL.json --plot save --pgnName update_material.png --momType "material" --momTarget 58 --materialMin 10 --modelFitting optimizeProbability
Converting evals with NormalizeToPawnValue = 356.
Reading eval stats from updateWDL.json.
Retained (W,D,L) = (79332445, 174784865, 81470532) positions.
Fit WDL model based on material.
Initial objective function:  0.33966379558305415
Final objective function:    0.33966029804753994
Optimization terminated successfully.
const int NormalizeToPawnValue = 355;
Corresponding spread = 73;
Corresponding normalized spread = 0.20596669257972489;
Draw rate at 0.0 eval at move 58 = 0.9845441039596996;
Parameters in internal value units: 
p_a = ((-185.720 * x / 58 + 504.850) * x / 58 + -438.583) * x / 58 + 474.046
p_b = ((89.235 * x / 58 + -137.021) * x / 58 + 73.287) * x / 58 + 47.534
    constexpr double as[] = {-185.71965483, 504.85014385, -438.58295743, 474.04604627};
    constexpr double bs[] = {89.23542728, -137.02141296, 73.28669021, 47.53376190};
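The summary values in this output can be cross-checked from the printed coefficients alone. The sketch below assumes the usual logistic form of the model, with the loss probability taken as the win probability at the negated evaluation. The function names are hypothetical, and this is not the fitting script itself.

```python
import math

# Coefficients copied from the fitting output above.
AS = (-185.71965483, 504.85014385, -438.58295743, 474.04604627)
BS = (89.23542728, -137.02141296, 73.28669021, 47.53376190)


def poly(coeffs, m):
    """Evaluate the cubic in Horner form, highest-order coefficient first."""
    c3, c2, c1, c0 = coeffs
    return ((c3 * m + c2) * m + c1) * m + c0


def summary_at_anchor():
    """Recompute the reported summary at the anchor, i.e. x / 58 == 1."""
    a = poly(AS, 1.0)  # compare with NormalizeToPawnValue
    b = poly(BS, 1.0)  # compare with the reported spread
    # Assumption: win(v) = 1 / (1 + exp((a - v) / b)) and loss(v) = win(-v),
    # so the draw rate at v = 0 is 1 - 2 * win(0).
    win_at_zero = 1.0 / (1.0 + math.exp(a / b))
    return {
        "a": a,
        "b": b,
        "normalized_spread": b / a,
        "draw_rate_at_zero": 1.0 - 2.0 * win_at_zero,
    }


if __name__ == "__main__":
    for name, value in summary_at_anchor().items():
        print(f"{name} = {value}")
```

Rounding `a` and `b` should give the reported `NormalizeToPawnValue = 355` and spread of 73, and the last two entries should agree with the reported normalized spread and draw rate to within the precision of the printed coefficients.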

[Image: update_material]

@robertnurnberg
Contributor Author

The lower limit of 10 on the material count acts as a safeguard. Here is the output of the fitting command without a lower limit on material count.

[Image: update_material]

And here is the distribution of the raw WDL data:
[Image: distro]

Looking at this plot, I guess we could use 8 as the lower limit for material count. Here I would like to await feedback from @vondele.

@robertnurnberg
Contributor Author

For completeness, here is the raw data in terms of full move counters.

[Image: distro_move]

@vondele
Member

vondele commented Mar 18, 2024

Based on these graphs, I would say 10 is a good choice. Extending the fit too far towards small material counts degrades the quality of the fit for the more relevant material counts.
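To make the lower limit concrete, here is a small Python sketch of how a material count could be derived from a FEN string and clamped at 10. The piece weights 1/3/3/5/9, the upper bound of 78, and the function names are assumptions for illustration, not taken from the patch.

```python
# Assumed piece weights: pawn 1, knight 3, bishop 3, rook 5, queen 9.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}


def material_count(fen):
    """Sum the piece weights for both sides from the board field of a FEN."""
    board = fen.split()[0]
    # Kings, digits and '/' are not in PIECE_VALUES and contribute 0.
    return sum(PIECE_VALUES.get(ch.lower(), 0) for ch in board)


def clamped_material(fen, lower=10, upper=78):
    """Clamp the count so the model is never evaluated below the fitted range."""
    return min(max(material_count(fen), lower), upper)


if __name__ == "__main__":
    start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
    rook_ending = "8/8/8/4k3/8/8/8/R3K3 w - - 0 1"
    print(material_count(start), clamped_material(start))
    print(material_count(rook_ending), clamped_material(rook_ending))
```

With such a clamp, sparse endgame positions reuse the model parameters fitted at material count 10 rather than extrapolating the polynomial into a region with little data.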

@Disservin Disservin closed this in 9b92ada Mar 20, 2024
@robertnurnberg robertnurnberg deleted the wdl-material-dynamic branch March 20, 2024 16:26
linrock pushed a commit to linrock/Stockfish that referenced this pull request Mar 27, 2024