Another book with unbalanced human openings (#39)
A new opening book derived from Lichess games, with a model-predicted draw rate between 48% and 52%.

It attempts to address the following points, relative to the currently used book:

* about 10x larger (2.6M positions), giving more variety when testing on fishtest and no repeated openings within any single test.
* both White and Black advantages, around ±1.0
* positions at all game plies between 1 and 16

The construction process involved:

1) Parse all 15B Lichess games in the database https://database.lichess.org/ for the period Jan - Sept 2023.
   Extract from these the popular positions, i.e. those seen at least twice within the first 16 plies played, exploring newly added games to at most 8 previously unseen plies.
```
$ ./fastpopular --dir /mnt/md0/chess/lichessgames/2023/ --minCount 2 --stopEarly --countStopEarly 8 --maxPlies 16 --concurrency 9 -o popular_Lichess_JanSept_maxPlies16_stopEarly8.epd
Looking for pgn files in /mnt/md0/chess/lichessgames/2023/
Found 9 .pgn(.gz) files, creating 9 chunks for processing.
Processed 9 files
Retained 296993424 positions from 1127228493 unique visited in 15251265926 games.
Total time for processing: 7374.5 s
```
   fastpopular is available at https://github.com/vondele/fastpopular

2) Score all these 296M positions with a modified stockfish, based on master, that analyses positions up to depth 24 for as long as the predicted draw rate (UCI_ShowWDL) stays near 50%.
   Positions are analysed only to low depth if the draw rate is already very different from 50% at low depth.
   From these scored positions, extract those with a draw rate in the range 48 - 52%.
   That modified branch is available at https://github.com/vondele/Stockfish/tree/createUHO
```
   ./stockfish.createUHO bench 128 1 24 popular_Lichess_JanSept_maxPlies16_stopEarly8.epd > popular_Lichess_JanSept_maxPlies16_stopEarly8_scored.epd
   awk '{if ($15>480 && $15<520) print $0}' popular_Lichess_JanSept_maxPlies16_stopEarly8_scored.epd | cut -d';' -f1 | sed "s/ $//g" > UHO_Lichess_4852_v1.epd
```
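For readers less familiar with awk, the filter above can be sketched in Python. This is only an illustration under assumptions read off the command itself: the 15th whitespace-separated field of each scored line holds the draw rate in permille, and everything before the first `;` is the position to keep. The function name `filter_uho` and its parameters are hypothetical, and the exact layout of the scored file is not shown in this commit.

```python
def filter_uho(lines, lo=480, hi=520):
    """Keep positions whose draw rate (assumed: field 15, in permille)
    lies strictly between lo and hi, mirroring the awk | cut | sed pipeline."""
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        fields = line.split()
        if len(fields) < 15:
            continue  # awk would see an empty $15 and reject the line
        try:
            draw = float(fields[14])  # awk's $15 is index 14 here
        except ValueError:
            continue  # non-numeric field: skip rather than guess
        if lo < draw < hi:
            epd = line.split(";", 1)[0]  # cut -d';' -f1
            if epd.endswith(" "):
                epd = epd[:-1]  # sed "s/ $//g" removes one trailing space
            kept.append(epd)
    return kept


if __name__ == "__main__":
    import sys

    # Usage: python filter_uho.py < scored.epd > book.epd
    for epd in filter_uho(sys.stdin):
        print(epd)
```

Both bounds are exclusive, as in the awk condition, so a position scored at exactly 480 or 520 is dropped.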

Short initial testing at STC shows the draw rate is, as expected, close to 50% for self-play games:
```
Score of master1 vs master2: 1048 - 1031 - 1921 [] 4000
Elo difference: 1.48 +/- 7.75, LOS: 64.54 %, DrawRatio: 48.02 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:        21    473   1026    462     18
```
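The summary numbers in this output can be cross-checked with a few lines of arithmetic. The sketch below assumes the usual logistic Elo formula and the usual point values per game pair (2, 1.5, 1, 0.5, 0 for WW, WD, DD/WL, LD, LL); the error bar and LOS are not recomputed.

```python
import math

# Values copied from the match output above.
wins, losses, draws = 1048, 1031, 1921
ptnml = {"WW": 21, "WD": 473, "DD/WL": 1026, "LD": 462, "LL": 18}

games = wins + losses + draws
draw_ratio = draws / games
score = (wins + 0.5 * draws) / games

# Logistic Elo difference implied by the score fraction.
elo = 400 * math.log10(score / (1 - score))

# Each pentanomial entry is a pair of games; points are per pair.
pair_points = {"WW": 2.0, "WD": 1.5, "DD/WL": 1.0, "LD": 0.5, "LL": 0.0}
pairs = sum(ptnml.values())
ptnml_points = sum(ptnml[k] * pair_points[k] for k in ptnml)

print(f"games={games}, pairs={pairs}")
print(f"draw ratio={100 * draw_ratio:.3f} %")
print(f"score={score:.6f}, Elo={elo:.2f}")
print(f"points from W/L/D={wins + 0.5 * draws}, from Ptnml={ptnml_points}")
```

The pair count is half the game count, the points total from the pentanomial distribution equals the one from wins and draws, and the Elo and draw ratio agree with the reported values to rounding.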
vondele authored Oct 21, 2023
1 parent abb48fd commit 426eca4
Showing 1 changed file with 0 additions and 0 deletions.
Binary file added UHO_Lichess_4852_v1.epd.zip