Performance evaluation and comparison of algorithms #108

Open
muupan opened this issue Jun 10, 2017 · 6 comments

Comments

@muupan
Member

muupan commented Jun 10, 2017

It would be great to add performance evaluations and comparisons of the algorithms available in ChainerRL.

@ghost

ghost commented Jun 10, 2017

Indeed. I like TRPO with exact second derivatives/Hessian-vector products :) It has nice theoretical properties.

@muupan
Member Author

muupan commented Jun 11, 2017

I agree TRPO is great, but supporting TRPO is off-topic for this issue.

https://github.com/openai/rllab and https://github.com/openai/baselines do this kind of evaluation and comparison really well, so they are a good starting point.
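
A minimal sketch of how such a comparison might be tabulated, assuming each algorithm has a list of final evaluation scores from independent runs. The names `summarize_scores` and `rank_by_mean` are hypothetical; they are not part of ChainerRL, rllab, or baselines.

```python
from math import sqrt
from statistics import mean, stdev


def summarize_scores(scores_by_algorithm):
    """Return {name: (mean, standard_error)} for each algorithm's per-run scores.

    Hypothetical helper for illustration only; not an API of any RL library.
    """
    summary = {}
    for name, scores in scores_by_algorithm.items():
        if not scores:
            raise ValueError(f"no scores recorded for {name!r}")
        # The standard error is undefined for a single run, so report None.
        sem = stdev(scores) / sqrt(len(scores)) if len(scores) > 1 else None
        summary[name] = (mean(scores), sem)
    return summary


def rank_by_mean(summary):
    """Return algorithm names sorted from highest to lowest mean score."""
    return sorted(summary, key=lambda name: summary[name][0], reverse=True)
```

Reporting the standard error next to the mean gives a rough sense of whether a gap between two algorithms is larger than the run-to-run noise, which matters because RL results can vary a lot across random seeds.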

@ghost

ghost commented Jun 11, 2017

There are also PyTorch implementations, which I'm currently using (once Chainer has second derivatives, it will be possible to port these over):

https://github.com/mjacar/pytorch-trpo

https://github.com/ikostrikov/pytorch-trpo

ChainerRL seems very promising as an alternative to the OpenAI repos mentioned above.

@muupan
Member Author

muupan commented Jul 31, 2017

Here are DQN's scores on five Atari games: https://github.com/muupan/chainerrl/blob/benchmark-dqn/evaluations/visualize.ipynb

@muupan
Member Author

muupan commented Aug 2, 2017

Added DoubleDQN and PAL.

@muupan
Member Author

muupan commented Aug 4, 2017

Added DQN with prioritized replay.
