N-step returns #282

Open
wants to merge 1 commit into
base: master

@ghost commented Jul 29, 2020

Theory

This change replaces the Bellman operator with an N-step variant. N-step returns are widely used in policy gradient algorithms as well as in Q-learning variants, and they often lead to faster learning.

I tested it on BipedalWalker-v3 with 1, 5 and 10 steps.

To disable N-step returns, set the --n_steps parameter to 1.
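For readers who want to see the idea in isolation, here is a minimal sketch of how an N-step target can be computed. It is not the code from this PR: the function name `n_step_return` and its arguments are hypothetical, and it assumes the critic's estimate for the state reached after N steps is passed in as `bootstrap_value`.

```python
from typing import Sequence


def n_step_return(rewards: Sequence[float], bootstrap_value: float,
                  gamma: float, terminal: bool) -> float:
    """Discounted sum of up to N rewards plus a bootstrapped tail value.

    rewards:         r_t, ..., r_{t+N-1} collected along the trajectory
    bootstrap_value: value estimate for the state reached after the
                     last reward (hypothetical; supplied by the caller)
    gamma:           discount factor
    terminal:        True if the episode ended inside the window, in
                     which case nothing is bootstrapped
    """
    # Start from the tail and fold rewards in backwards:
    # target = r + gamma * target
    target = 0.0 if terminal else bootstrap_value
    for reward in reversed(rewards):
        target = reward + gamma * target
    return target
```

With a single reward the function reduces to the usual 1-step Bellman target `r + gamma * bootstrap_value`, which matches the behaviour of setting --n_steps to 1.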

Charts

chart

Math

math
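For reference, the N-step target is commonly written as below. The symbols follow the usual multi-step learning notation and are an assumption here, not necessarily the notation used in this PR: $r_{t+k}$ is the reward at step $t+k$, $\gamma$ is the discount factor, and $Q'$ is the target critic evaluated at the state and action reached after $N$ steps.

```latex
G_t^{(N)} = \sum_{k=0}^{N-1} \gamma^{k} \, r_{t+k} + \gamma^{N} \, Q'(s_{t+N}, a_{t+N})
```

For $N = 1$ this reduces to the standard 1-step target $r_t + \gamma \, Q'(s_{t+1}, a_{t+1})$.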

Links

Rainbow: Combining Improvements in Deep Reinforcement Learning (Multi-step learning)
