-
Notifications
You must be signed in to change notification settings - Fork 6.8k
[rllib] A3C Refactoring #1166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[rllib] A3C Refactoring #1166
Conversation
Merged build finished. Test FAILed. |
Test FAILed. |
Merged build finished. Test PASSed. |
Test PASSed. |
Merged build finished. Test PASSed. |
Test PASSed. |
Merged build finished. Test PASSed. |
Test PASSed. |
python/ray/rllib/a3c/a3c.py
Outdated
@@ -24,76 +22,11 @@ | |||
"use_lstm": True, | |||
"model": {"grayscale": True, | |||
"zero_mean": False, | |||
"dim": 42} | |||
"dim": 42, | |||
"pytorch": True} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a top level option since it affects more than the model?
from ray.rllib.a3c.policy import Policy | ||
|
||
|
||
class TFPolicy(Policy): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually we will want to move this to the top level RLlib dir but I guess we need to do more refactoring for that.
python/ray/rllib/models/catalog.py
Outdated
@@ -21,7 +21,8 @@ | |||
"extra_frameskip", # (int) for number of frames to skip | |||
"fcnet_activation", # Nonlinearity for fully connected net (tanh, relu) | |||
"fcnet_hiddens", # Number of hidden layers for fully connected net | |||
"free_log_std" # Documented in ray.rllib.models.Model | |||
"free_log_std", # Documented in ray.rllib.models.Model | |||
"pytorch", # Pytorch images need to be channel-major |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename this option to "channel_major" instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just had some config naming comments
Merged build finished. Test PASSed. |
Test PASSed. |
get_gradients
tocompute_gradients
model_update
toapply_gradients
compute_actions
tocompute_action
, as we only compute one action at once. There is zero-indexing in the code since the TF nodes can actually take variable number of inputs and hence will return a list of items.a3c.py
@ericl @pcmoritz