[rllib] A3C Refactoring #1166

richardliaw · 2017-10-27T05:40:52Z

Renamed:
get_gradients to compute_gradients
model_update to apply_gradients
compute_actions to compute_action, since we only compute one action at a time. The code still indexes element zero of the result, because the TF nodes can take a variable number of inputs and therefore return a list of items.
Moved Runner out of a3c.py
Moved helper functions out into a separate class
Introduced TFPolicy as a first step toward containing the TF code
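
For readers skimming the thread, here is a minimal sketch of what the renamed interface might look like. It is not the code from this PR: `ToyPolicy`, `_run_batch`, the single-weight model, and the `lr` parameter are all hypothetical stand-ins, and only the three method names come from the description above. `_run_batch` imitates a TF node that takes a batch and returns a list, which is why `compute_action` indexes element zero.

```python
from abc import ABC, abstractmethod


class Policy(ABC):
    """Hypothetical base class exposing the three renamed methods."""

    @abstractmethod
    def compute_gradients(self, batch):
        """Return gradients for a batch without changing the parameters."""

    @abstractmethod
    def apply_gradients(self, grads):
        """Update the parameters using previously computed gradients."""

    @abstractmethod
    def compute_action(self, observation):
        """Return a single action for a single observation."""


class ToyPolicy(Policy):
    """Toy one-weight policy (assumption): action is 1 if weight * obs > 0."""

    def __init__(self, weight=0.0, lr=0.5):
        self.weight = weight
        self.lr = lr

    def _run_batch(self, observations):
        # Stand-in for evaluating a TF node: takes a batch, returns a list.
        return [1 if self.weight * ob > 0 else 0 for ob in observations]

    def compute_action(self, observation):
        # Wrap the observation in a batch of one, then take element zero.
        return self._run_batch([observation])[0]

    def compute_gradients(self, batch):
        # batch is a list of (observation, target) pairs; this is the
        # gradient of the mean squared error between weight * obs and target.
        total = sum(2 * (self.weight * ob - t) * ob for ob, t in batch)
        return total / len(batch)

    def apply_gradients(self, grads):
        # Plain gradient-descent step.
        self.weight -= self.lr * grads
```

Splitting `compute_gradients` from `apply_gradients` lets a worker compute gradients locally while a different process applies them, which is the usual shape of an asynchronous update loop.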

AmplabJenkins · 2017-10-27T06:00:46Z

Merged build finished. Test FAILed.

AmplabJenkins · 2017-10-27T06:00:46Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/2210/

AmplabJenkins · 2017-10-27T06:21:25Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-10-27T06:21:26Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/2211/

AmplabJenkins · 2017-10-27T06:47:29Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-10-27T06:47:29Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/2212/

AmplabJenkins · 2017-10-27T06:55:46Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-10-27T06:55:46Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/2213/

ericl · 2017-10-27T17:42:48Z

python/ray/rllib/a3c/a3c.py

@@ -24,76 +22,11 @@
    "use_lstm": True,
    "model": {"grayscale": True,
              "zero_mean": False,
-              "dim": 42}
+              "dim": 42,
+              "pytorch": True}


Should this be a top-level option, since it affects more than the model?

ericl · 2017-10-27T17:44:21Z

python/ray/rllib/a3c/tfpolicy.py

+from ray.rllib.a3c.policy import Policy
+
+
+class TFPolicy(Policy):


Eventually we will want to move this to the top-level RLlib dir, but I guess we need to do more refactoring for that.

ericl · 2017-10-27T17:44:43Z

python/ray/rllib/models/catalog.py

@@ -21,7 +21,8 @@
    "extra_frameskip",  # (int) for number of frames to skip
    "fcnet_activation",  # Nonlinearity for fully connected net (tanh, relu)
    "fcnet_hiddens",  # Number of hidden layers for fully connected net
-    "free_log_std"  # Documented in ray.rllib.models.Model
+    "free_log_std",  # Documented in ray.rllib.models.Model
+    "pytorch",  # Pytorch images need to be channel-major


Rename this option to "channel_major" instead?
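
To make the naming suggestion concrete, here is a small sketch of what a "channel-major" layout means for an image observation. The helper name `to_channel_major` is hypothetical and not part of RLlib; the 42x42 single-channel frame only mirrors the `"dim": 42` and `"grayscale": True` settings in the config above.

```python
import numpy as np


def to_channel_major(frame):
    """Reorder a (height, width, channels) image to (channels, height, width)."""
    return np.transpose(frame, (2, 0, 1))


# A grayscale 42x42 frame in the channel-last layout.
frame = np.zeros((42, 42, 1), dtype=np.float32)
channel_major_frame = to_channel_major(frame)
print(channel_major_frame.shape)
```

A name like "channel_major" describes the data layout itself, so the option would stay meaningful for any model that expects that layout, not only a PyTorch one.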

ericl

Looks good; I just had some comments on config naming.

AmplabJenkins · 2017-10-28T00:11:14Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-10-28T00:11:14Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/2218/

richardliaw added 4 commits October 26, 2017 22:31

fixing policy

9ec0ba0

Compute Action is singular, fixed weird issue with arrays

35c3d4c

remove vestige

ba60181

extraneous ipdb

2cff36a

Can Drop in Pytorch Model

afa2d3b

lint

0a75e9c

ericl reviewed Oct 27, 2017


ericl approved these changes Oct 27, 2017


richardliaw added 2 commits October 27, 2017 16:39

naming

a0b3c52

finish comments

96680d5

richardliaw merged commit dc66a2d into ray-project:master Oct 29, 2017

anyscalesam added the stability label Mar 1, 2024
