This is a bootleg version of AlphaZero written in C++17, using only the PyTorch C++ Frontend. It also includes some of the work leading up to it, such as MCTS, imitation learning and a Python implementation (which I stopped working on due to performance). I also tried a kind of bridge in which the C++ code is the main program and only calls the gym routines in Python through the Python C API, but unfortunately that turned out to be too slow. So I went for a pure C++ version and included a few environments from openai/gym rewritten in C++ (see the envs folder).
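To give an idea of what a gym environment rewritten in C++ can look like, here is a minimal sketch of the MountainCar dynamics as defined in gym's MountainCar-v0. This is not the code from the envs folder: the names `MountainCarEnv` and `StepResult` are made up for illustration, `reset` takes a fixed start position instead of sampling one from [-0.6, -0.4] as gym does, and the 200-step time limit that gym adds through a wrapper is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Constants as used by gym's MountainCar-v0.
constexpr double kMinPosition = -1.2;
constexpr double kMaxPosition = 0.6;
constexpr double kMaxSpeed = 0.07;
constexpr double kGoalPosition = 0.5;
constexpr double kForce = 0.001;
constexpr double kGravity = 0.0025;

// Hypothetical return type: observation (position, velocity), reward, done flag.
struct StepResult {
    double position;
    double velocity;
    double reward;
    bool done;
};

// Hypothetical class name; a sketch, not the implementation in envs/.
class MountainCarEnv {
public:
    // Assumption: deterministic start position (gym samples it uniformly
    // from [-0.6, -0.4]). The velocity always starts at zero.
    void reset(double start_position = -0.5) {
        position_ = start_position;
        velocity_ = 0.0;
    }

    // Actions: 0 = push left, 1 = no push, 2 = push right.
    StepResult step(int action) {
        assert(action >= 0 && action <= 2);
        velocity_ += (action - 1) * kForce - std::cos(3.0 * position_) * kGravity;
        velocity_ = std::max(-kMaxSpeed, std::min(kMaxSpeed, velocity_));
        position_ += velocity_;
        position_ = std::max(kMinPosition, std::min(kMaxPosition, position_));
        // Hitting the left wall stops the car.
        if (position_ == kMinPosition && velocity_ < 0.0) {
            velocity_ = 0.0;
        }
        // The episode ends once the car reaches the flag while moving right.
        const bool done = position_ >= kGoalPosition && velocity_ >= 0.0;
        // Every step costs -1, so shorter episodes score higher.
        return {position_, velocity_, -1.0, done};
    }

private:
    double position_ = -0.5;
    double velocity_ = 0.0;
};
```

Because everything lives in plain C++, a search like MCTS can copy an environment object and call `step` on the copy without ever crossing into Python.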
Below is the result of training 10 times on MountainCar using bootleg AlphaZero with parameter configuration 127. More current configurations can be seen here.
Quite the variance, and takes ages to learn (roughly 6 hours to be more precise). Needs more work.
These are instructions for the C++ version.
- Go to `alphazero/contrib` and build the Docker image: `sudo docker build -t grab0 -f Dockerfile .`
- Go to `alphazero/cpp_impl` and run the Docker image: `sudo docker run -v $(pwd):/app --privileged -it grab0 bash`
- In the Docker image execute `setup` to compile.
- BootlegAlphaZero can be run as `./GRAB0 <game> <parameters>`, e.g. `./GRAB0 mtcar 133`. All parameters are listed in `simulations.json`.
The following manual setup (without Docker) is for Debian Buster. It could be automated at some point.
- Go to `alphazero/contrib`.
- Run `printf "deb http://httpredir.debian.org/debian buster-backports main non-free\ndeb-src http://httpredir.debian.org/debian buster-backports main non-free" > /etc/apt/sources.list.d/backports.list`.
- Run `apt-get update --allow-releaseinfo-change && apt-get install -t buster-backports -y g++ vim gdb cmake python3-dev wget unzip git libprotobuf-dev libprotobuf17 protobuf-compiler nlohmann-json3-dev`.
- Run `pip3 install pytest numpy cython torch gym gym-minigrid git+https://github.com/instance01/gym-mini-envs.git`.
- Go to `alphazero/cpp_impl`.
- Run `cmake . && cmake --build .`.