Skip to content

Conversation

richardliaw
Copy link
Contributor

  • Renamed:
  • get_gradients to compute_gradients
  • model_update to apply_gradients
  • compute_actions to compute_action, as we only compute one action at once. There is zero-indexing in the code since the TF nodes can actually take variable number of inputs and hence will return a list of items.
  • moved Runner out of a3c.py
  • Moved helper functions out into separate class
  • Introduced TFPolicy in starting effort to contain TF code

@ericl @pcmoritz

@AmplabJenkins
Copy link

Merged build finished. Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/2210/
Test FAILed.

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/2211/
Test PASSed.

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/2212/
Test PASSed.

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/2213/
Test PASSed.

@@ -24,76 +22,11 @@
"use_lstm": True,
"model": {"grayscale": True,
"zero_mean": False,
"dim": 42}
"dim": 42,
"pytorch": True}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a top level option since it affects more than the model?

from ray.rllib.a3c.policy import Policy


class TFPolicy(Policy):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually we will want to move this to the top level RLlib dir but I guess we need to do more refactoring for that.

@@ -21,7 +21,8 @@
"extra_frameskip", # (int) for number of frames to skip
"fcnet_activation", # Nonlinearity for fully connected net (tanh, relu)
"fcnet_hiddens", # Number of hidden layers for fully connected net
"free_log_std" # Documented in ray.rllib.models.Model
"free_log_std", # Documented in ray.rllib.models.Model
"pytorch", # Pytorch images need to be channel-major
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename this option to "channel_major" instead?

Copy link
Contributor

@ericl ericl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just had some config naming comments

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/2218/
Test PASSed.

@richardliaw richardliaw merged commit dc66a2d into ray-project:master Oct 29, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants