About • Network Architecture • Training • Results • Sources
This project aims to build a 9x9 Go agent using the methods the Google DeepMind team implemented in both AlphaGo and AlphaGo Zero. The first stage trains a neural network with supervised learning for board evaluation and move prediction, then moves into a self-improvement stage using reinforcement learning. The next step is to train a policy and value network tabula rasa, using pure reinforcement learning and self-play.
This model uses a convolutional neural network with residual blocks and either a policy head or a value head to make predictions about a given board state. The policy head uses a 1x1 convolution, batch normalization, and a fully connected layer, and the value head...
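As a rough illustration of the architecture described above, here is a minimal sketch of a residual tower with a policy head. It assumes PyTorch, and every concrete number in it (32 channels, 2 residual blocks, 2 head channels, 81 board points plus one pass move as outputs, the 3x3 kernels in the residual blocks) is a placeholder chosen for the example rather than a setting taken from this project. The class names are hypothetical, and the value head is left out because its layout is not spelled out above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualBlock(nn.Module):
    """Two 3x3 conv + batch norm layers with a skip connection (assumed layout)."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # add the input back before the final activation


class PolicyHead(nn.Module):
    """1x1 convolution, batch norm, then a fully connected layer over moves."""

    def __init__(self, channels, board_size=9, head_channels=2):
        super().__init__()
        self.conv = nn.Conv2d(channels, head_channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(head_channels)
        # Assumption: one logit per board point plus one for passing.
        self.fc = nn.Linear(head_channels * board_size * board_size,
                            board_size * board_size + 1)

    def forward(self, x):
        out = F.relu(self.bn(self.conv(x)))
        return self.fc(out.flatten(start_dim=1))  # raw logits, no softmax


class PolicyNet(nn.Module):
    """Input conv, a stack of residual blocks, then the policy head."""

    def __init__(self, in_planes=11, channels=32, num_blocks=2):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_planes, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
        )
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(num_blocks)])
        self.head = PolicyHead(channels)

    def forward(self, x):
        return self.head(self.blocks(self.stem(x)))


# Usage: a batch of 4 encoded board states with 11 feature planes each.
net = PolicyNet()
net.eval()
with torch.no_grad():
    logits = net(torch.zeros(4, 11, 9, 9))
```

Returning raw logits keeps the head compatible with `torch.nn.functional.cross_entropy`, which applies the softmax internally; whether this project trains the policy that way is not stated here.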
These value network models were trained on a 2,000-game subset of the 40,000 9x9 Go games collected from the OGS website. Each game has at least one dan-level player, which ensures a reasonable standard of play. The games were converted from the standard game format to an (n, 11, 9, 9) tensor, where n is the number of states in the game and 11 is the number of feature planes: two sets of 5 planes hold the white and black stone positions for the past 5 moves, and the final plane encodes the player to move at that state. Each model was trained for 50 iterations, with seeded random shuffling of the data, to predict the winner of the game. The output uses a tanh activation function, where 1 represents the current player winning and -1 represents the opposing player winning. A prediction p counts as decisive only when |p| > 1/3; anything with |p| below 1/3 is treated as an uncertainty interval. All models tested converged to an accuracy of about 55%-60%, which is relatively good considering the ambiguity of early game states. Although accuracy remained relatively consistent for most of training, the model versions with the lowest validation and training loss proved to make the most reasonable predictions in user testing.
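The input encoding and the 1/3 decision boundary described above can be sketched in plain Python. This is an illustration under stated assumptions, not the project's actual preprocessing. The plane order (black history first, then white history, then the turn plane), the most-recent-first ordering within each set, the convention that the turn plane is all ones when black is to move, and the names `encode_state` and `classify_prediction` are all hypothetical. Nested lists are used only to keep the example dependency-free.

```python
BOARD_SIZE = 9
HISTORY = 5                    # past positions kept per colour
NUM_PLANES = 2 * HISTORY + 1   # 5 black + 5 white + 1 turn plane = 11


def encode_state(history, black_to_move):
    """Encode one game state as an 11 x 9 x 9 nested list of floats.

    history: list of board positions, oldest first. Each position is a
             list of 9 strings over the characters 'B', 'W' and '.'.
    """
    planes = [[[0.0] * BOARD_SIZE for _ in range(BOARD_SIZE)]
              for _ in range(NUM_PLANES)]

    # Keep at most the last 5 positions, most recent first (assumed order).
    recent = history[-HISTORY:][::-1]
    for t, board in enumerate(recent):
        for row in range(BOARD_SIZE):
            for col in range(BOARD_SIZE):
                stone = board[row][col]
                if stone == "B":
                    planes[t][row][col] = 1.0            # black planes 0..4
                elif stone == "W":
                    planes[HISTORY + t][row][col] = 1.0  # white planes 5..9

    # Final plane marks the player to move (assumed: 1.0 everywhere for black).
    if black_to_move:
        planes[-1] = [[1.0] * BOARD_SIZE for _ in range(BOARD_SIZE)]
    return planes


def classify_prediction(p, threshold=1 / 3):
    """Map a tanh output p in [-1, 1] to a verdict using the |p| > 1/3 rule."""
    if p > threshold:
        return "current player wins"
    if p < -threshold:
        return "opponent wins"
    return "uncertain"


# Usage: an empty board followed by a black stone on the centre point.
empty = ["." * BOARD_SIZE] * BOARD_SIZE
after_move = list(empty)
after_move[4] = "....B...."
state = encode_state([empty, after_move], black_to_move=False)
verdict = classify_prediction(0.2)
```

Stacking the encoded states of one game along a new leading axis gives the (n, 11, 9, 9) shape mentioned above. Positions earlier than the start of the game simply leave their history planes at zero.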
- Aya and Natsukaze's selfplay games (http://www.yss-aya.com/ayaself/ayaself.html#nats2018)
- CGOS Archives for Board Size 9x9 (http://www.yss-aya.com/cgos/9x9/archive.html)
- Mini-Go 9x9 sgf (https://console.cloud.google.com/storage/browser/minigo-pub/v3-9x9/sgf/)
- Professional + Mini-go 9x9 (https://homepages.cwi.nl/~aeb/go/games/index.html)
- Mastering the game of Go with deep neural networks and tree search
- ELF OpenGo: An Analysis and Open Reimplementation of AlphaZero
- A Simple Alpha(Go) Zero Tutorial (https://web.stanford.edu/~surag/posts/alphazero.html)
- EGF Elo Rating System (https://senseis.xmp.net/?EGFRatingSystem)
Gregory Eales – @GregoryHamE – gregory.hamilton.e@gmail.com
Distributed under the MIT license. See LICENSE for more information.