[rllib] Importance Sampling and KL Loss for APPO #5051
Test FAILed.
Any benchmark results?
Hi Eric, if you run pong-impala.yaml and pong-appo.yaml (similar configurations), you'll see that APPO does a lot better. There is also a new yaml file, halfcheetah-appo.yaml, which reaches about 9k reward in roughly 3 hours. This could most likely be improved once the bug behind Impala's performance is found.
Test FAILed.
I tested this patch and it worked great for me!
For some reason, I cannot run APPO in local mode (for debugging).
Could that be due to an incompatibility between the Ray and RLlib versions?
Test FAILed.
jenkins retest this please
Test FAILed.
jenkins retest this please
Test FAILed.
Can you add more documentation on why a target network is being used here?
@ericl Target network documentation pushed.
lgtm!
Test FAILed.
Test PASSed.
What do these changes do?
There are two improvements to APPO:
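As a rough illustration of the two terms named in the PR title (importance sampling and a KL loss), the sketch below shows one way a clipped, importance-weighted surrogate with a KL penalty can be written in plain Python. It is not the code from this PR: the names `categorical_kl` and `appo_style_loss`, the argument layout, and the default `clip_param` and `kl_coeff` values are all assumptions made for illustration, and it assumes discrete actions with strictly positive action probabilities.

```python
import math


def categorical_kl(p, q):
    """KL(p || q) for two categorical distributions given as probability lists.

    Assumes every q[i] is strictly positive wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def appo_style_loss(behaviour_probs, current_probs, actions, advantages,
                    clip_param=0.3, kl_coeff=0.2):
    """Hypothetical PPO-style loss with an importance ratio and a KL penalty.

    behaviour_probs: per-step action probabilities of the policy that
        collected the data (possibly stale in an asynchronous setup).
    current_probs: per-step action probabilities of the policy being trained.
    actions: index of the action taken at each step.
    advantages: advantage estimate for each step.
    """
    surrogates = []
    kls = []
    for b, c, a, adv in zip(behaviour_probs, current_probs, actions, advantages):
        # Importance sampling ratio: corrects for data sampled by a
        # different (older) policy than the one being optimised.
        ratio = c[a] / b[a]
        # Clip the ratio so that a single update cannot move too far.
        clipped = min(max(ratio, 1.0 - clip_param), 1.0 + clip_param)
        surrogates.append(min(ratio * adv, clipped * adv))
        # KL term: penalises drift of the current policy from the
        # behaviour policy.
        kls.append(categorical_kl(b, c))
    n = len(surrogates)
    # Minimise the negative surrogate plus the weighted mean KL.
    return -sum(surrogates) / n + kl_coeff * sum(kls) / n
```

When the current and behaviour policies are identical, every ratio is 1 and the KL term is 0, so the loss reduces to the negative mean advantage. That makes a convenient sanity check for a sketch like this.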
Related issue number
Linter
I've run scripts/format.sh to lint the changes in this PR.