adept is a library designed to accelerate reinforcement learning research by providing:
- baseline reinforcement learning models and algorithms for PyTorch
- multi-GPU compute options
- access to various environments
- built-in TensorBoard logging, model saving, reloading, evaluation, and rendering
- abstractions for building custom networks, agents, execution modes, and experiments
- proven hyperparameter defaults
This code is alpha; expect rough edges.
Agents / Networks
- Actor Critic with Generalized Advantage Estimation
- Stateful networks (i.e., LSTMs)
- Batch norm
Execution Modes
- Local (Single-GPU, A2C)
- Towered (Multi-GPU, A3C-variant)
- IMPALA (faster Multi-GPU, Importance Weighted Actor-Learner Architecture)
Environments
- OpenAI Gym
- StarCraft 2 (alpha; IMPALA mode does not work with SC2 yet)
We designed this library to be flexible and extensible; plugging in novel research ideas should be straightforward.
- gym
- PyTorch 1.x
- Python 3.5+
- We develop and test with CUDA 10, PyTorch 1.0, and Python 3.6
From Docker:
From source:
- Follow instructions for PyTorch
- (Optional) Follow instructions for StarCraft 2
git clone https://github.com/heronsystems/adeptRL
cd adeptRL
# Remove mpi, sc2, profiler if you don't plan on using these features:
pip install .[mpi,sc2,profiler]
- Atari/SC2 scores pending
- ~3,000 TFPS / ~12,000 FPS (Local Mode / 64 environments / GeForce 2080 Ti / Ryzen 2700X 8-core)
- Used to win a Doom competition (Ben Bell / Marv2in)
If you write your own scripts, you can provide your own agents or networks, but we have some presets you can run out of the box. Logs go to `/tmp/adept_logs/` by default. The log directory contains the TensorBoard event file, saved models, and other metadata.
# Local Mode (A2C)
# We recommend 4GB+ GPU memory, 8GB+ RAM, 4+ Cores
python -m adept.app local --env BeamRiderNoFrameskip-v4
# Towered Mode (A3C Variant, requires mpi4py)
# We recommend 2+ GPUs, 8GB+ GPU memory, 32GB+ RAM, 4+ Cores
python -m adept.app towered --env BeamRiderNoFrameskip-v4
# IMPALA (requires mpi4py and is resource intensive)
# We recommend 2+ GPUs, 8GB+ GPU memory, 32GB+ RAM, 4+ Cores
python -m adept.app impala --env BeamRiderNoFrameskip-v4
# StarCraft 2 (IMPALA not supported yet)
# Warning: much more resource intensive than Atari
python -m adept.app local --env CollectMineralShards
# To see a full list of options:
python -m adept.app -h
python -m adept.app help <command>
- Training Frame - An environment frame that is trained on
- Frame - A raw environment frame, including skipped frames. A skipped frame is a frame that is not trained on; by default, Atari environments use a frame skip of 4, so only every fourth frame is trained on.
- FPS - Frames per second.
- TFPS - Training frames per second. Research papers typically report raw environment frames; our logs report training frames, since we are interested in sample efficiency.
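As a quick sanity check of the frame-skip arithmetic, the benchmark numbers above are consistent with Atari's default frame skip of 4:

```python
# TFPS is just FPS divided by the frame skip, since skipped frames are
# never trained on. These numbers match the Local Mode benchmark above.
frame_skip = 4
fps = 12000                  # raw environment frames per second
tfps = fps / frame_skip      # training frames per second
print(tfps)                  # 3000.0
```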
Containers hold all of the application state. Each subprocess gets a container in Towered and IMPALA modes.
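As a rough mental model (illustrative only; adept's real container classes differ), a container bundles everything one worker needs:

```python
from collections import namedtuple

# Hypothetical sketch of the state a container owns. In Towered and IMPALA
# modes, each subprocess would get its own bundle like this.
Container = namedtuple(
    "Container",
    ["agent", "environment", "network", "optimizer", "logger"],
)
```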
An Agent acts on and observes the environment. Currently only ActorCritic is supported; other agents, such as DQN or ACER, may be added later.
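To make the act-and-observe cycle concrete, here is a minimal, generic actor-critic interaction step in PyTorch. The function name and the `(logits, value)` network output are assumptions for illustration, not adept's Agent interface:

```python
from torch.distributions import Categorical

def act_and_observe(network, obs, env):
    """One generic actor-critic interaction step (illustrative, not adept's API).

    Assumes network(obs) returns policy logits and a state-value estimate,
    and env follows the gym step interface.
    """
    logits, value = network(obs)
    dist = Categorical(logits=logits)
    action = dist.sample()                                   # act on the environment
    next_obs, reward, done, info = env.step(action.item())   # observe the result
    return next_obs, reward, done, dist.log_prob(action), value
```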
Networks are not PyTorch modules; they need to implement our abstract NetworkModule or ModularNetwork classes. A ModularNetwork consists of source nets, a body, and heads.
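The source-net/body/heads layout can be sketched as follows. This is an illustrative PyTorch module assuming 84x84 single-modality pixel inputs, not adept's actual ModularNetwork implementation:

```python
import torch.nn as nn

class TinyModularNet(nn.Module):
    """Illustrative source-net -> body -> heads layout (not adept's real classes)."""

    def __init__(self, in_channels, num_actions):
        super().__init__()
        # Source net: encodes one observation modality (here, 84x84 pixels).
        self.source = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4),
            nn.ReLU(),
        )
        # Body: processes the merged source features; a stateful body
        # (e.g. an LSTM) would live here.
        self.body = nn.Sequential(nn.Linear(32 * 20 * 20, 256), nn.ReLU())
        # Heads: one output per task, e.g. policy logits and a value estimate.
        self.policy_head = nn.Linear(256, num_actions)
        self.value_head = nn.Linear(256, 1)

    def forward(self, obs):
        x = self.source(obs)
        x = x.view(x.size(0), -1)  # flatten the 32x20x20 feature map
        features = self.body(x)
        return self.policy_head(features), self.value_head(features)
```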
Environments run in subprocesses and send their observations, rewards, terminals, and infos to the host process. They work much the same way as OpenAI's baselines code.
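The underlying pattern is roughly the worker loop below, in the spirit of OpenAI baselines' SubprocVecEnv. It is a simplified sketch (assuming gym is installed; CartPole is used to avoid the Atari ROM dependency), not adept's exact environment code:

```python
import gym
from multiprocessing import Pipe, Process

def env_worker(conn, env_id):
    """Run one environment in a subprocess, shipping results to the host."""
    env = gym.make(env_id)
    conn.send(env.reset())
    while True:
        cmd, data = conn.recv()
        if cmd == "step":
            obs, reward, done, info = env.step(data)
            if done:
                obs = env.reset()  # auto-reset so the host always sees a valid obs
            conn.send((obs, reward, done, info))
        elif cmd == "close":
            env.close()
            conn.close()
            break

if __name__ == "__main__":
    # Host side: one pipe per environment subprocess.
    parent, child = Pipe()
    proc = Process(target=env_worker, args=(child, "CartPole-v1"))
    proc.start()
    first_obs = parent.recv()
    parent.send(("step", 0))                   # take action 0
    obs, reward, done, info = parent.recv()
    parent.send(("close", None))
    proc.join()
```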
An Experience Cache is a Rollout or Experience Replay that is written to after stepping and read before learning.
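A rollout-style cache can be pictured as the minimal sketch below; the class and method names are hypothetical, not adept's Experience Cache API:

```python
class Rollout:
    """Minimal rollout cache: written to after stepping, read before learning."""

    def __init__(self, rollout_len):
        self.rollout_len = rollout_len
        self.transitions = []

    def write(self, obs, action, reward, done, log_prob, value):
        self.transitions.append((obs, action, reward, done, log_prob, value))

    def ready(self):
        return len(self.transitions) >= self.rollout_len

    def read_and_clear(self):
        batch, self.transitions = self.transitions, []
        return batch
```

An experience replay differs mainly in that reads sample a batch of stored transitions rather than consuming the whole buffer.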
We borrow pieces of OpenAI's gym and baselines code. We indicate where this is done.