Commit f225ead: Scripts ux (#53)
Parent: 31840fb

* scripts module, readme updates, setup updates
* tmp log default
* impala exec command readme
* minor readme tweaks
* remove requirements file
* eval bugfix
* minor bugfix
* small bugfix
* small bugfix
* mts for resume_local

14 files changed: +61 −35 lines

README.md (+23 −17)

````diff
@@ -23,50 +23,55 @@ Execution Modes
 
 Environments
 * OpenAI Gym
-* StarCraft 2 (alpha)
+* StarCraft 2 (alpha, impala mode does not work with SC2 yet)
 
 We designed this library to be flexible and extensible. Plugging in novel research ideas should be doable.
 
 ## Major Dependencies
 * gym
-* PyTorch 0.4.x (excluding 0.4.1 due to the [unbind bug](https://github.com/pytorch/pytorch/pull/9995))
+* PyTorch 0.4.x (excluding 0.4.1 due to an [unbind bug](https://github.com/pytorch/pytorch/pull/9995))
 * Python 3.5+
 
 ## Installation
 * Follow instructions for [PyTorch](https://pytorch.org/)
 * (Optional) Follow instructions for [StarCraft 2](https://github.com/Blizzard/s2client-proto#downloads)
-* More optional dependencies in requirements.txt
 
 ```
 # Remove mpi, sc2, profiler if you don't plan on using these features:
 pip install adept[mpi,sc2,profiler]
 ```
 
 ## Performance
-TODO
+* Used to win a [Doom competition](http://vizdoom.cs.put.edu.pl/competition-cig-2018/competition-results) (Ben Bell / Marv2in)
+* ~2500 training frames per second single-GPU performance on a Dell XPS 15" laptop (Geforce 1050Ti)
+* Will post Atari/SC2 baseline scores here at some point
 
 ## Examples
 If you write your own scripts, you can provide your own agents or networks, but we have some presets you can run out of the box.
-If you pip installed, these scripts are on your classpath and can be run with the commands below.
-If you cloned the repo, put a python in front of each command.
+Logs go to `/tmp/adept_logs/` by default.
+The log directory contains the tensorboard file, saved models, and other metadata.
 
 ```
 # Local Mode (A2C)
 # We recommend 4GB+ GPU memory, 8GB+ RAM, 4+ Cores
-local.py --env-id BeamRiderNoFrameskip-v4 --agent ActorCritic --vision-network FourConv --network-body LSTM
+python -m adept.scripts.local --env-id BeamRiderNoFrameskip-v4
 
-# Towered Mode (A3C Variant)
-# We recommend 2x+ GPUs, 8GB+ GPU memory, 32GB+ RAM, 4+ Cores
-towered.py --env-id BeamRiderNoFrameskip-v4 --agent ActorCritic --vision-network FourConv --network-body LSTM
+# Towered Mode (A3C Variant, requires mpi4py)
+# We recommend 2+ GPUs, 8GB+ GPU memory, 32GB+ RAM, 4+ Cores
+python -m adept.scripts.towered --env-id BeamRiderNoFrameskip-v4
 
 # IMPALA (requires mpi4py and is resource intensive)
-# We recommend 2x+ GPUs, 8GB+ GPU memory, 32GB+ RAM, 4+ Cores
-mpirun -np 3 -H localhost:3 python -m mpi4py `which impala.py` -n 8
+# We recommend 2+ GPUs, 8GB+ GPU memory, 32GB+ RAM, 4+ Cores
+mpiexec -n 3 python -m adept.scripts.impala --env-id BeamRiderNoFrameskip-v4
+
+# StarCraft 2 (IMPALA not supported yet)
+# Warning: much more resource intensive than Atari
+python -m adept.scripts.local --env-id CollectMineralShards
 
 # To see a full list of options:
-local.py -h
-towered.py -h
-impala.py -h
+python -m adept.scripts.local -h
+python -m adept.scripts.towered -h
+python -m adept.scripts.impala -h
 ```
 
 ## API Reference
@@ -77,7 +82,8 @@ Currently only ActorCritic is supported. Other agents, such as DQN or ACER may b
 ### Containers
 Containers hold all of the application state. Each subprocess gets a container in Towered and IMPALA modes.
 ### Environments
-Environments work using the OpenAI Gym wrappers.
+Environments run in subprocesses and send their observation, rewards, terminals, and infos to the host process.
+They work pretty much the same way as OpenAI's code.
 ### Experience Caches
 An Experience Cache is a Rollout or Experience Replay that is written to after stepping and read before learning.
 ### Modules
@@ -90,5 +96,5 @@ The Body network operates on the flattened embedding and would typically be an L
 The Head depends on the Environment and Agent and is created accordingly.
 
 ## Acknowledgements
-We borrow pieces of OpenAI's (gym)[https://github.com/openai/gym] and (baselines)[https://github.com/openai/baselines] code.
+We borrow pieces of OpenAI's [gym](https://github.com/openai/gym) and [baselines](https://github.com/openai/baselines) code.
 We indicate where this is done.
````

adept/containers/evaluation.py (+1 −1)

```diff
@@ -86,7 +86,7 @@ def run(self, nb_episode):
         next_obs, rewards, terminals, infos = self.environment.step(actions)
 
         self.agent.reset_internals(terminals)
-        episode_rewards = self.update_buffers(rewards, terminals, infos)
+        episode_rewards, _ = self.update_buffers(rewards, terminals, infos)
         for reward in episode_rewards:
             self._episode_count += 1
             results.append(reward)
```
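The "eval bugfix" above changes the call site to unpack a tuple: `update_buffers` evidently returns a second value alongside the finished-episode rewards, and the old code bound the whole tuple to `episode_rewards`, so the subsequent loop iterated over the tuple's elements instead of individual rewards. A minimal sketch of the failure mode, with an illustrative stand-in for `update_buffers` (names and return values are assumptions, not the actual adept API):

```python
def update_buffers(rewards, terminals, infos):
    # Hypothetical stand-in: returns rewards of episodes that just
    # finished, plus a second value the evaluator does not need.
    episode_rewards = [r for r, t in zip(rewards, terminals) if t]
    return episode_rewards, infos

rewards, terminals, infos = [1.0, 2.0], [False, True], [{}, {}]

# Old (buggy): binds the whole (rewards, extra) tuple, so iterating
# over it yields two tuple elements, not per-episode rewards.
buggy = update_buffers(rewards, terminals, infos)

# Fixed: unpack the tuple and discard the second value.
episode_rewards, _ = update_buffers(rewards, terminals, infos)
```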

adept/scripts/__init__.py (+16, new file)

```diff
@@ -0,0 +1,16 @@
+"""
+Copyright (C) 2018 Heron Systems, Inc.
+
+This program is free software: you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation, either version 3 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program. If not, see <http://www.gnu.org/licenses/>.
+"""
```
6 files renamed without changes.

scripts/resume_local.py → adept/scripts/resume_local.py (+19 −4)

```diff
@@ -38,11 +38,12 @@ def main(args):
     network_file = args.network_file
     optimizer_file = args.optimizer_file
     args_file_path = args.args_file
+    mts = args.max_train_steps
     with open(args.args_file, 'r') as args_file:
         args = dotdict(json.load(args_file))
 
     print_ascii_logo()
-    log_id = make_log_id(args.tag, args.mode_name, args.agent, args.network)
+    log_id = make_log_id(args.tag, args.mode_name, args.agent, args.vision_network + args.network_body)
     log_id_dir = os.path.join(args.log_dir, args.env_id, log_id)
 
     os.makedirs(log_id_dir)
@@ -59,7 +60,7 @@ def main(args):
 
     # construct network
     torch.manual_seed(args.seed)
-    network_head_shapes = get_head_shapes(env.action_space, env.engine, args)
+    network_head_shapes = get_head_shapes(env.action_space, env.engine, args.agent)
     network = make_network(env.observation_space, network_head_shapes, args)
     network.load_state_dict(torch.load(network_file))
 
@@ -76,9 +77,19 @@ def make_optimizer(params):
         opt.load_state_dict(torch.load(optimizer_file))
         return opt
 
-    container = Local(agent, env, device, make_optimizer, args.epoch_len, args.nb_env, logger, summary_writer, saver)
+    container = Local(
+        agent,
+        env,
+        make_optimizer,
+        args.epoch_len,
+        args.nb_env,
+        logger,
+        summary_writer,
+        args.summary_frequency,
+        saver
+    )
     try:
-        container.run(args.max_train_steps + initial_count, initial_count)
+        container.run(mts + initial_count, initial_count)
     finally:
         env.close()
 
@@ -99,6 +110,10 @@ def make_optimizer(params):
         '--optimizer-file', default=None,
         help='path to args file (.../logs/<env-id>/<log-id>/<epoch>/optimizer.pth)'
     )
+    parser.add_argument(
+        '-mts', '--max-train-steps', type=int, default=10e6, metavar='MTS',
+        help='number of steps to train for (default: 10e6)'
+    )
     args = parser.parse_args()
     args.mode_name = 'Local'
     main(args)
```
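One subtlety of the `--max-train-steps` flag added above: argparse applies `type=int` only to strings parsed from the command line, not to the default, so with `default=10e6` the value is a float whenever the flag is omitted. A small demonstration of that behavior:

```python
import argparse

parser = argparse.ArgumentParser()
# Mirrors the flag added in the diff: int type, but a float-literal default.
parser.add_argument('-mts', '--max-train-steps', type=int, default=10e6,
                    metavar='MTS', help='number of steps to train for')

# Flag omitted: the default passes through untouched as a float.
default_val = parser.parse_args([]).max_train_steps

# Flag given: the string '500' is converted by type=int.
cli_val = parser.parse_args(['-mts', '500']).max_train_steps
```

Downstream code that needs an integer step count (e.g. for range arithmetic) has to cast the value itself, or the default could be written as `int(10e6)`.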
File renamed without changes.

adept/utils/script_helpers.py (+2 −3)

```diff
@@ -117,7 +117,6 @@ def get_head_shapes(action_space, engine, agent_name):
 
 
 def add_base_args(parser):
-    root_dir = os.path.abspath(os.pardir)
     """
     Common Arguments
     """
@@ -146,8 +145,8 @@ def add_base_args(parser):
         help='environment to train on (default: PongNoFrameskip-v4)'
     )
     parser.add_argument(
-        '--log-dir', default=os.path.join(root_dir, 'logs/'),
-        help='folder to save logs. (default: adept/logs)'
+        '--log-dir', default='/tmp/adept_logs/',
+        help='folder to save logs. (default: /tmp/adept_logs)'
     )
     parser.add_argument(
         '-mts', '--max-train-steps', type=int, default=10e6, metavar='MTS',
```

requirements.txt

Whitespace-only changes.

setup.py (−10)

```diff
@@ -13,16 +13,6 @@
     license='GNU',
     python_requires='>=3.5.0',
     packages=find_packages(),
-    scripts=[
-        'scripts/benchmark_atari.py',
-        'scripts/evaluation.py',
-        'scripts/impala.py',
-        'scripts/local.py',
-        'scripts/render.py',
-        'scripts/replay_gen.py',
-        'scripts/resume_local.py',
-        'scripts/towered.py'
-    ],
     install_requires=[
         'numpy>=1.14',
         'gym[atari]>=0.10',
```
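Dropping the `scripts=` list means the tools are no longer copied onto the PATH; they are launched as modules with `python -m adept.scripts.<name>` instead, which is why the README commands changed in this commit. If PATH commands were ever wanted back, the usual setuptools route is `console_scripts` entry points rather than raw scripts. A hypothetical fragment, not part of this commit, assuming each script module exposes a zero-argument `main()` (the actual modules are not verified here):

```python
# Hypothetical setup.py fragment for illustration only.
from setuptools import setup, find_packages

setup(
    name='adept',
    packages=find_packages(),
    entry_points={
        'console_scripts': [
            # Each entry installs a command that imports the module
            # and calls the named function.
            'adept-local=adept.scripts.local:main',
            'adept-towered=adept.scripts.towered:main',
        ]
    },
)
```

Entry points survive package renames like the `scripts/` → `adept/scripts/` move in this commit, since they reference import paths rather than file locations.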
