Advantage async actor-critic Algorithms (A3C) in PyTorch

@inproceedings{mnih2016asynchronous,
  title={Asynchronous methods for deep reinforcement learning},
  author={Mnih, Volodymyr and Badia, Adria Puigdomenech and Mirza, Mehdi and Graves, Alex and Lillicrap, Timothy P and Harley, Tim and Silver, David and Kavukcuoglu, Koray},
  booktitle={International Conference on Machine Learning},
  year={2016}}

This repository contains an implementation of Adavantage async Actor-Critic (A3C) in PyTorch based on the original paper by the authors and the PyTorch implementation by Ilya Kostrikov.

A3C is the state-of-art Deep Reinforcement Learning method.

Dependencies

Python 2.7
PyTorch
gym (OpenAI)
universe (OpenAI)
opencv (for env state processing)
visdom (for visualization)

Training

./train_lstm.sh

Test wigh trained weight after 169000 updates for PongDeterminisitc-v3.

./test_lstm.sh 169000

A test result video is available.

Check the loss curves of all threads in http://localhost:8097

References

Asynchronous methods for deep reinforcement learning on arXiv
Ilya Kostrikov's implementation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Advantage async actor-critic Algorithms (A3C) in PyTorch

Dependencies

Training

Test wigh trained weight after 169000 updates for PongDeterminisitc-v3.

Check the loss curves of all threads in http://localhost:8097

References

Files

README.md

Latest commit

History

README.md

File metadata and controls

Advantage async actor-critic Algorithms (A3C) in PyTorch

Dependencies

Training

Test wigh trained weight after 169000 updates for PongDeterminisitc-v3.

Check the loss curves of all threads in http://localhost:8097

References