Description
The default hyperparameters of `baselines/baselines/deepq/experiments/run_atari.py`, which presumably is the script we should be using for DQN-based models, fail to gain any noticeable reward on either Breakout or Pong. I've included the steps to reproduce and the log files later in this issue. The main reason I'm filing it is that it probably makes sense for the provided scripts to ship with default hyperparameters that work. Or, alternatively, perhaps list the ones that work somewhere? Upon reading `run_atari.py`, it seems like the number of steps is a bit low and the replay buffer should be 10x larger, but I don't think that's going to fix the issue, since Pong should be able to learn quickly with this kind of setup.
I know this is probably not the top priority now but in theory this is easy to fix (just run it with the correct hyperparameters), and it would be great for users since running even 10 million steps (the default value right now) can take over 10 hours on a decent personal workstation. If you're in the process of refactoring this code, is there any chance you can take this feedback into account? Thank you!
Steps to reproduce:
- Use a machine with Ubuntu 16.04.
- I doubt this matters, but I'm also using an NVIDIA Titan X GPU with Pascal.
- Install baselines as of commit 36ee5d1
- I used a Python 3.5 virtual environment with the following packages, with TensorFlow 1.8.0.
- Enter the experiments directory: `cd baselines/baselines/deepq/experiments/`
- Finally, run `python run_atari.py` with either `PongNoFrameskip-v4` or `BreakoutNoFrameskip-v4` as the `--env` argument. I kept all other parameters at their default values, so this was prioritized dueling double DQN.
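For anyone who wants to launch both runs back to back, the last two steps could be scripted roughly as follows. This is only a sketch: `build_command` and `run_all` are hypothetical helper names, and the only things taken from the steps above are the script name, the `--env` flag, the two environment ids, and the experiments directory.

```python
import subprocess
import sys

# The two environments tried in this issue.
ENV_IDS = ["PongNoFrameskip-v4", "BreakoutNoFrameskip-v4"]


def build_command(env_id, script="run_atari.py"):
    """Return the argv list for one run; every other flag keeps its default."""
    return [sys.executable, script, "--env", env_id]


def run_all(env_ids=ENV_IDS, cwd="baselines/baselines/deepq/experiments"):
    """Run the script once per environment, one after the other.

    `cwd` assumes the repository was cloned into the current directory.
    """
    for env_id in env_ids:
        subprocess.run(build_command(env_id), cwd=cwd, check=True)
```

Calling `run_all()` would start the Pong run first and then the Breakout run; each one can take many hours, as noted above.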
By default, the `logger` in baselines will create `log.txt`, `progress.csv`, and `monitor.csv` files that contain information about training runs. Here are the Breakout and Pong log files:
Since GitHub doesn't upload csv files, here are the `monitor.csv` files for Breakout and then Pong:
https://www.dropbox.com/s/ibl8lvub2igr9kw/breakout_monitor.csv?dl=0
https://www.dropbox.com/s/yuf3din6yjb2swl/pong_monitor.csv?dl=0
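To check whether a run is learning anything, the per-episode rewards in a `monitor.csv` file can be averaged over the most recent episodes. The sketch below assumes the file starts with an optional `#`-prefixed metadata line followed by a CSV header that includes an `r` (episode reward) column; that layout is an assumption about the monitor output rather than something stated in this issue, and the function names are made up for illustration.

```python
import csv


def read_episode_rewards(lines):
    """Return the episode rewards from monitor.csv content.

    `lines` is any iterable of text lines (for example an open file).
    Assumes '#'-prefixed lines are metadata and that the CSV header
    has an 'r' column holding the episode reward.
    """
    data_lines = [ln for ln in lines if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(data_lines)
    return [float(row["r"]) for row in reader]


def trailing_mean(rewards, window=100):
    """Mean of the last `window` rewards (all of them if fewer); None if empty."""
    if not rewards:
        return None
    tail = rewards[-window:]
    return sum(tail) / len(tail)


def summarize(path, window=100):
    """Trailing mean episode reward for the monitor file at `path`."""
    with open(path, newline="") as f:
        return trailing_mean(read_episode_rewards(f), window)
```

For example, `summarize("pong_monitor.csv")` on a downloaded copy of the Pong file would give the mean reward over the last 100 episodes; a value that stays near the minimum score for the whole run would match the behavior described above.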
Finally, here are the `progress.csv` files for Breakout and then for Pong:
https://www.dropbox.com/s/79emijmnsdcjm37/breakout_progress.csv?dl=0
https://www.dropbox.com/s/b817wnlyyyriti9/pong_progress.csv?dl=0