@@ -23,50 +23,55 @@ Execution Modes

Environments
* OpenAI Gym
- * StarCraft 2 (alpha)
+ * StarCraft 2 (alpha; IMPALA mode does not work with SC2 yet)

We designed this library to be flexible and extensible. Plugging in novel research ideas should be doable.

## Major Dependencies
* gym
- * PyTorch 0.4.x (excluding 0.4.1 due to the [unbind bug](https://github.com/pytorch/pytorch/pull/9995))
+ * PyTorch 0.4.x (excluding 0.4.1 due to an [unbind bug](https://github.com/pytorch/pytorch/pull/9995))
* Python 3.5+

## Installation
* Follow instructions for [PyTorch](https://pytorch.org/)
* (Optional) Follow instructions for [StarCraft 2](https://github.com/Blizzard/s2client-proto#downloads)
- * More optional dependencies in requirements.txt

```
# Remove mpi, sc2, profiler if you don't plan on using these features:
pip install adept[mpi,sc2,profiler]
```

## Performance
- TODO
+ * Used to win a [Doom competition](http://vizdoom.cs.put.edu.pl/competition-cig-2018/competition-results) (Ben Bell / Marv2in)
+ * ~2500 training frames per second of single-GPU performance on a Dell XPS 15" laptop (GeForce 1050 Ti)
+ * Will post Atari/SC2 baseline scores here at some point

## Examples
If you write your own scripts, you can provide your own agents or networks, but we have some presets you can run out of the box.
- If you pip installed, these scripts are on your classpath and can be run with the commands below.
- If you cloned the repo, put a python in front of each command.
+ Logs go to `/tmp/adept_logs/` by default.
+ The log directory contains the TensorBoard event file, saved models, and other metadata.

```
# Local Mode (A2C)
# We recommend 4GB+ GPU memory, 8GB+ RAM, 4+ Cores
- local.py --env-id BeamRiderNoFrameskip-v4 --agent ActorCritic --vision-network FourConv --network-body LSTM
+ python -m adept.scripts.local --env-id BeamRiderNoFrameskip-v4

- # Towered Mode (A3C Variant)
- # We recommend 2x+ GPUs, 8GB+ GPU memory, 32GB+ RAM, 4+ Cores
- towered.py --env-id BeamRiderNoFrameskip-v4 --agent ActorCritic --vision-network FourConv --network-body LSTM
+ # Towered Mode (A3C Variant, requires mpi4py)
+ # We recommend 2+ GPUs, 8GB+ GPU memory, 32GB+ RAM, 4+ Cores
+ python -m adept.scripts.towered --env-id BeamRiderNoFrameskip-v4

# IMPALA (requires mpi4py and is resource intensive)
- # We recommend 2x+ GPUs, 8GB+ GPU memory, 32GB+ RAM, 4+ Cores
- mpirun -np 3 -H localhost:3 python -m mpi4py `which impala.py` -n 8
+ # We recommend 2+ GPUs, 8GB+ GPU memory, 32GB+ RAM, 4+ Cores
+ mpiexec -n 3 python -m adept.scripts.impala --env-id BeamRiderNoFrameskip-v4
+
+ # StarCraft 2 (IMPALA not supported yet)
+ # Warning: much more resource intensive than Atari
+ python -m adept.scripts.local --env-id CollectMineralShards

# To see a full list of options:
- local.py -h
- towered.py -h
- impala.py -h
+ python -m adept.scripts.local -h
+ python -m adept.scripts.towered -h
+ python -m adept.scripts.impala -h
```

## API Reference
@@ -77,7 +82,8 @@ Currently only ActorCritic is supported. Other agents, such as DQN or ACER may b
### Containers
Containers hold all of the application state. Each subprocess gets a container in Towered and IMPALA modes.
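As a rough mental model (the class and method names below are hypothetical, not adept's actual API), a container simply owns the pieces described in this README and drives the step/learn loop:

```
# Hypothetical sketch of what a container coordinates; adept's real
# container classes and signatures differ.
class Container:
    def __init__(self, agent, env, exp_cache, network, nb_step):
        self.agent = agent          # chooses actions and computes losses
        self.env = env              # environment (or batch of environments)
        self.exp_cache = exp_cache  # rollout or experience replay
        self.network = network      # trunk -> body -> head module
        self.nb_step = nb_step

    def run(self):
        obs = self.env.reset()
        for _ in range(self.nb_step):
            action = self.agent.act(obs)
            next_obs, reward, terminal, info = self.env.step(action)
            self.exp_cache.write(obs, action, reward, terminal)
            if self.exp_cache.is_ready():
                self.agent.learn(self.exp_cache.read())
                self.exp_cache.clear()
            obs = next_obs
```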
### Environments
- Environments work using the OpenAI Gym wrappers.
+ Environments run in subprocesses and send their observations, rewards, terminals, and infos to the host process.
+ They work pretty much the same way as OpenAI's code.
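For intuition, here is a minimal sketch of that subprocess pattern using plain `multiprocessing` and Gym directly; adept's actual workers and message format differ, and the toy env and fixed action are just placeholders:

```
# Minimal sketch of running a Gym env in a subprocess and sending
# (observation, reward, terminal, info) back to the host process.
# Illustrative only; adept's real environment workers differ.
import gym
from multiprocessing import Pipe, Process


def env_worker(conn, env_id):
    env = gym.make(env_id)
    conn.send(env.reset())
    while True:
        cmd, action = conn.recv()
        if cmd == 'step':
            obs, reward, terminal, info = env.step(action)
            if terminal:
                obs = env.reset()
            conn.send((obs, reward, terminal, info))
        elif cmd == 'close':
            env.close()
            break


if __name__ == '__main__':
    host_conn, worker_conn = Pipe()
    proc = Process(target=env_worker, args=(worker_conn, 'CartPole-v0'))
    proc.start()
    obs = host_conn.recv()
    for _ in range(10):
        host_conn.send(('step', 0))  # fixed dummy action for the sketch
        obs, reward, terminal, info = host_conn.recv()
    host_conn.send(('close', None))
    proc.join()
```

The host process can keep several such workers and batch their observations before each forward pass.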
### Experience Caches
An Experience Cache is a Rollout or Experience Replay that is written to after stepping and read before learning.
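As a toy illustration (names are made up, not adept's API), a rollout-style cache looks roughly like this; an experience replay would instead sample a random batch in `read()` and keep old entries around:

```
# Hypothetical rollout cache: written to after each environment step and
# read (then cleared) once enough steps have accumulated for a learn call.
class RolloutCache:
    def __init__(self, rollout_len):
        self.rollout_len = rollout_len
        self.steps = []

    def write(self, obs, action, reward, terminal):
        self.steps.append((obs, action, reward, terminal))

    def is_ready(self):
        return len(self.steps) >= self.rollout_len

    def read(self):
        return list(self.steps)

    def clear(self):
        self.steps = []
```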
### Modules
@@ -90,5 +96,5 @@ The Body network operates on the flattened embedding and would typically be an L
The Head depends on the Environment and Agent and is created accordingly.
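A rough PyTorch sketch of the trunk/body/head split (module names and sizes are illustrative, not adept's actual classes):

```
# Illustrative trunk -> body -> head composition; adept's real network
# modules and constructor signatures differ.
from torch import nn


class ConvTrunk(nn.Module):
    """Vision trunk: turns an image observation into a flat embedding."""
    def __init__(self, in_channels):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, 3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, 3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, 3, stride=2), nn.ReLU(),
        )

    def forward(self, obs):
        return self.features(obs).view(obs.size(0), -1)


class LSTMBody(nn.Module):
    """Body: operates on the flattened embedding and carries recurrent state."""
    def __init__(self, nb_input, nb_hidden):
        super().__init__()
        self.lstm = nn.LSTMCell(nb_input, nb_hidden)

    def forward(self, embedding, hidden):
        # hidden is an (h, c) tuple; returns the new (h, c)
        return self.lstm(embedding, hidden)


class ActorCriticHead(nn.Module):
    """Head: shaped by the agent and environment, e.g. policy logits + value."""
    def __init__(self, nb_hidden, nb_action):
        super().__init__()
        self.policy = nn.Linear(nb_hidden, nb_action)
        self.value = nn.Linear(nb_hidden, 1)

    def forward(self, h):
        return self.policy(h), self.value(h)
```

For an 84x84 input, this particular trunk yields a 512-dimensional embedding to feed the LSTM body; the head is sized by the environment's action space.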

## Acknowledgements
- We borrow pieces of OpenAI's (gym) [https://github.com/openai/gym] and (baselines) [https://github.com/openai/baselines] code.
+ We borrow pieces of OpenAI's [gym](https://github.com/openai/gym) and [baselines](https://github.com/openai/baselines) code.
We indicate where this is done.