This repository implements an Advantage Actor-Critic (A2C) agent baseline for the pysc2 environment, as described in the DeepMind StarCraft II paper. A2C is a synchronous variant of A3C, which we use to train efficiently on GPUs.
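For readers unfamiliar with A2C, the following is a minimal NumPy sketch of the usual combined loss: a policy-gradient term weighted by the advantage, a value regression term, and an entropy bonus. It is not taken from this repository; the function name `a2c_loss` and the coefficients `value_coef` and `entropy_coef` are assumptions chosen for illustration.

```python
import numpy as np


def a2c_loss(log_probs, values, returns, entropies,
             value_coef=0.5, entropy_coef=0.01):
    """Combined A2C loss for one batch of transitions (illustrative only).

    log_probs: log pi(a_t | s_t) of the actions taken, shape (N,)
    values:    critic estimates V(s_t), shape (N,)
    returns:   bootstrapped returns R_t, shape (N,)
    entropies: policy entropy at each visited state, shape (N,)
    The coefficient defaults are hypothetical, not this repository's settings.
    """
    log_probs = np.asarray(log_probs, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    returns = np.asarray(returns, dtype=np.float64)
    entropies = np.asarray(entropies, dtype=np.float64)

    # Advantage: how much better the observed return was than the critic expected.
    # In an autodiff framework this is treated as a constant in the policy term.
    advantages = returns - values

    policy_loss = -np.mean(log_probs * advantages)
    value_loss = np.mean(advantages ** 2)
    entropy_bonus = np.mean(entropies)

    return policy_loss + value_coef * value_loss - entropy_coef * entropy_bonus
```

In the synchronous setting, the batch would be gathered from all parallel environments before a single gradient step is taken, which is what makes the update GPU-friendly.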
This repository is part of a research project at the Autonomous Systems Labs, TU Darmstadt by Daniel Palenicek, Marcel Hussing, and Simon Meister.
NOTE: this is still a work in progress.
This project is licensed under the MIT License (refer to the LICENSE file for details).
- A2C agent
- FullyConv architecture
- support all spatial screen and minimap observations as well as non-spatial player observations
- support the full action space as described in the DeepMind paper (predicting all arguments independently)
- support training on all mini games
- train MoveToBeacon
- train other mini games and correct any training issues
- LSTM architecture
- Multi-GPU training
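Predicting all arguments independently means the probability of a full action factorizes into the probability of the function identifier times the probability of each argument. A minimal sketch of that factorization follows. The helper name `action_log_prob`, the dictionary layout, and the choice to sum only over the arguments used by the chosen function are assumptions of this sketch, not the repository's code.

```python
import numpy as np


def action_log_prob(fn_probs, arg_probs, fn_id, arg_ids):
    """Log-probability of a composite action under independent policy heads.

    fn_probs:  probabilities over function identifiers, shape (num_fns,)
    arg_probs: dict mapping argument name -> probabilities over its values
    fn_id:     index of the chosen function
    arg_ids:   dict mapping argument name -> chosen value; assumed to contain
               only the arguments that the chosen function actually takes
    """
    log_prob = np.log(fn_probs[fn_id])
    for name, value in arg_ids.items():
        # Independence: each head contributes its own log-probability.
        log_prob += np.log(arg_probs[name][value])
    return log_prob
```

For example, with a function probability of 0.5 and a single argument chosen with probability 0.25, the joint log-probability is log(0.125).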
Any mini game can in principle be trained with the current code, although we still have to do experiments on maps other than MoveToBeacon.
Map | mean score (ours) | mean score (DeepMind) |
---|---|---|
MoveToBeacon | 25 | 26 |
With default settings (32 environments), learning MoveToBeacon currently takes between 3K and 8K episodes in total. This varies each run depending on random initialization and action sampling.
- for fast training, a GPU is recommended
- Python 3
- pysc2 (tested with v1.2)
- TensorFlow (tested with 1.4.0)
- StarCraft II and mini games (see below or pysc2)
pip install numpy tensorflow-gpu pysc2==1.2
- Install StarCraft II. On Linux, use 3.16.1.
- Download the mini games and extract them to your `StarcraftII/Maps/` directory.
- train with `python run.py my_experiment --map MoveToBeacon`.
- run trained agents with `python run.py my_experiment --map MoveToBeacon --eval`.
You can visualize the agents with the `--vis` flag. See `run.py` for all arguments.
Summaries are written to `out/summary/<experiment_name>` and model checkpoints are written to `out/models/<experiment_name>`.
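If you want to locate these directories from a script, for example to inspect outputs after a run, a small helper along the following lines may be convenient. The function `experiment_dirs` and its `out_root` parameter are hypothetical and not part of this repository; the sketch only mirrors the directory layout described above.

```python
import os


def experiment_dirs(experiment_name, out_root="out"):
    """Return (summary_dir, model_dir) for an experiment name.

    Hypothetical helper: it only reproduces the layout
    out/summary/<experiment_name> and out/models/<experiment_name>.
    """
    summary_dir = os.path.join(out_root, "summary", experiment_name)
    model_dir = os.path.join(out_root, "models", experiment_name)
    return summary_dir, model_dir


summary_dir, model_dir = experiment_dirs("my_experiment")
print(summary_dir)
print(model_dir)
```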
The code in `rl/environment.py` is based on OpenAI baselines, with adaptations from sc2aibot.
Some of the code in `rl/agents/a2c/runner.py` is loosely based on sc2aibot.
Also see pysc2-agents for a similar repository.