Refactor replay based scripts #173
This pull request is being automatically deployed with Vercel. Inspect: https://vercel.com/vwxyzjn/cleanrl/7NBoCmqbCsrTeVtqKRpZUxAGFZ8N
Here are the benchmarked results. Looks like DQN in this PR gets better performance in Atari games, slightly worse results in […]
Given these results, I recommend we merge this PR. This PR obtains overall better performance and removes an unverified code-level optimization: gradient norm clipping for DQN.
@dosssman and @yooceii, do the results from this PR make sense to you? If they do, I will make updates to the docs and ultimately remove the old experiments. I think after this we would be ready for the 1.0 release.
CC @araffin who might be interested in this :)
[Figures: benchmark results for Atari games, Classic control, and MuJoCo]
Looking good on my side.
Thanks for the great work.
Merging now.
Description
This PR closes #171, closes #172, closes #168, and closes #148.
- […] `dqn.py` and others #171
- […] `qf2` #172
- […] `episodic_length` for non-PPO scripts. #168
- […] `nn.utils.clip_grad_norm_` for DQN, DDPG, and TD3 #148
Types of changes
Checklist:
- […] `pre-commit run --all-files` passes (required).
- […] `mkdocs serve`.
If you are adding new algorithms or your change could result in performance difference, you may need to (re-)run tracked experiments. See #137 as an example PR.
- […] `--capture-video` flag toggled on (required).
- […] `mkdocs serve`.
- […] `width=500` and `height=300`).
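For readers unfamiliar with the gradient norm clipping discussed in this PR (the `nn.utils.clip_grad_norm_` item for DQN, DDPG, and TD3 in #148), here is a minimal sketch of the general technique in plain Python. It is not the CleanRL code and not PyTorch's implementation: the `clip_grad_norm` helper, the nested-list gradient layout, and the `eps` value are assumptions made for illustration only.

```python
import math


def clip_grad_norm(grads, max_norm, eps=1e-6):
    """Rescale gradients in place so their global L2 norm is at most max_norm.

    `grads` is a list of lists of floats (one inner list per parameter).
    Hypothetical helper for illustration; returns the norm before clipping.
    """
    # Global L2 norm over every gradient entry of every parameter.
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    # Shrink only when the norm exceeds max_norm; never scale up.
    scale = min(1.0, max_norm / (total_norm + eps))
    for vec in grads:
        for i in range(len(vec)):
            vec[i] *= scale
    return total_norm


grads = [[3.0, 0.0], [0.0, 4.0]]  # global L2 norm is 5.0
norm_before = clip_grad_norm(grads, max_norm=0.5)
norm_after = math.sqrt(sum(g * g for vec in grads for g in vec))

small = [[0.1]]  # already under the threshold, so it is left unchanged
clip_grad_norm(small, max_norm=1.0)
```

All gradients are scaled by the same factor, so the update direction is preserved and only its magnitude is capped. Whether that cap helps or hurts a given algorithm is an empirical question, which is why the benchmark comparison above matters.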