@inproceedings{mnih2016asynchronous,
  title={Asynchronous methods for deep reinforcement learning},
  author={Mnih, Volodymyr and Badia, Adria Puigdomenech and Mirza, Mehdi and Graves, Alex and Lillicrap, Timothy P and Harley, Tim and Silver, David and Kavukcuoglu, Koray},
  booktitle={International Conference on Machine Learning},
  year={2016}}

This repository contains a PyTorch implementation of Asynchronous Advantage Actor-Critic (A3C), based on the original paper cited above and on the PyTorch implementation by Ilya Kostrikov.

A3C was a state-of-the-art deep reinforcement learning method when the paper was published in 2016.
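For readers new to the method, here is a minimal sketch of the quantities each A3C worker computes from a short rollout: discounted n-step returns, advantages, and the policy and value losses. It is plain Python for illustration only and is not taken from this repository's code. The function names, the `gamma` default, and the 0.5 value-loss factor are assumptions, and the entropy bonus used in the paper is omitted for brevity.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns R_t = r_t + gamma * R_{t+1}.

    bootstrap_value is the critic's estimate for the state after the last
    step of the rollout (use 0.0 if the episode terminated).
    """
    returns = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage A_t = R_t - V(s_t) for each step of the rollout."""
    return [ret - val for ret, val in zip(returns, values)]


def a3c_losses(log_probs, values, rewards, bootstrap_value, gamma=0.99):
    """Policy and value losses for one rollout (entropy term omitted).

    log_probs[t] is log pi(a_t | s_t) for the action actually taken.
    """
    rets = n_step_returns(rewards, bootstrap_value, gamma)
    advs = advantages(rets, values)
    # Policy gradient term: raise the log-probability of actions with A_t > 0.
    policy_loss = -sum(lp * adv for lp, adv in zip(log_probs, advs))
    # Critic regression toward the n-step return.
    value_loss = 0.5 * sum(adv * adv for adv in advs)
    return policy_loss, value_loss
```

In a real implementation these would be tensor operations, with the advantage treated as a constant in the policy term so that only the value loss trains the critic.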

Dependencies

  • Python 2.7
  • PyTorch
  • gym (OpenAI)
  • universe (OpenAI)
  • opencv (for env state processing)
  • visdom (for visualization)

Training

./train_lstm.sh

Test with the trained weights after 169000 updates on PongDeterministic-v3.

./test_lstm.sh 169000

A test result video is available.

Check the loss curves of all threads in visdom at http://localhost:8097

[Image: loss curves (loss_png)]

References