[rllib] Improve performance for small rollouts #812

robertnishihara merged 12 commits into ray-project:master from
Conversation
Merged build finished. Test PASSed.
Force-pushed 032fc5f to 71930dc.
Merge …into pg-multinode-fixes. Conflicts: python/ray/rllib/policy_gradient/agent.py
    def compute_steps(self, gamma, lam, horizon, min_steps_per_task=-1):
        """Compute multiple rollouts and concatenate the results.

        Parameters:

Review comment: This should be `Args:`.
        Parameters:
            num_steps: Lower bound on the number of states to be collected.
        Returns:

Review comment: Add a newline before `Returns:`.
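Both review comments concern Google-style docstring conventions. A hedged sketch of what the corrected docstring could look like — the class name and the parameter descriptions are illustrative guesses, not taken from the PR:

```python
class RolloutCollector:
    """Illustrative stand-in for the class under review (name assumed)."""

    def compute_steps(self, gamma, lam, horizon, min_steps_per_task=-1):
        """Compute multiple rollouts and concatenate the results.

        Args:
            gamma: Discount factor for rewards (description assumed).
            lam: GAE lambda parameter (description assumed).
            horizon: Maximum length of a single rollout (description assumed).
            min_steps_per_task: Lower bound on the number of states to be
                collected.

        Returns:
            The concatenated rollout data (description assumed).
        """
```

Note the `Args:` section heading and the blank line separating it from `Returns:`, which is what the two comments ask for.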
        "lambda": 1.0,
        # Initial coefficient for KL divergence
        "kl_coeff": 0.2,
        # Number of SGD iterations in each outer loop
        """Compute multiple rollouts and concatenate the results.

        Parameters:
            num_steps: Lower bound on the number of states to be collected.

Review comment: These documented names do not match the actual argument names.
        add_advantage_values(trajectory, gamma, lam, self.reward_filter)
        return trajectory

    def compute_steps(self, gamma, lam, horizon, min_steps_per_task=-1):

Review comment: Would it make sense to define this method in terms of compute_trajectory?
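The suggestion is to express the batch-collection method as a loop over a single-rollout helper. A minimal sketch of that structure, assuming a trajectory is a dict of arrays keyed by "rewards" — the class name and rollout_fn hook are hypothetical, only compute_trajectory/compute_steps and their signatures come from the diff:

```python
import numpy as np


class RolloutSketch:
    """Illustrative stand-in; the real class lives in rllib's policy_gradient code."""

    def __init__(self, rollout_fn):
        # rollout_fn() returns one trajectory dict; it stands in for the
        # environment/model interaction inside the real compute_trajectory.
        self.rollout_fn = rollout_fn

    def compute_trajectory(self, gamma, lam, horizon):
        # The real method would run the policy for up to `horizon` steps and
        # add advantage estimates; here we just fetch a canned rollout.
        return self.rollout_fn()

    def compute_steps(self, gamma, lam, horizon, min_steps_per_task=-1):
        """Collect rollouts until at least min_steps_per_task steps are gathered."""
        trajectories = []
        num_steps_so_far = 0
        while True:
            trajectory = self.compute_trajectory(gamma, lam, horizon)
            trajectories.append(trajectory)
            num_steps_so_far += len(trajectory["rewards"])
            # With the default of -1 this always stops after one rollout.
            if num_steps_so_far >= min_steps_per_task:
                break
        return trajectories
```

For example, with 5-step rollouts and min_steps_per_task=12, the loop would collect three trajectories (15 steps) before stopping.

```python
collector = RolloutSketch(lambda: {"rewards": np.zeros(5)})
trajs = collector.compute_steps(0.99, 1.0, 100, min_steps_per_task=12)
```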
        trajectories = []
        total_rewards = []
        traj_len_means = []
        traj_lengths = []

Review comment: Slight preference for trajectory_lengths.
        num_steps_so_far = 0
        trajectories = []
        total_rewards = []
        traj_lengths = []

Review comment: Slight preference for trajectory_lengths.
        traj_lengths.append(np.logical_not(trajectory["dones"]).sum(axis=0).mean())
        trajectory = flatten(trajectory)
        not_done = np.logical_not(trajectory["dones"])
        trajectory = {key: val[not_done]

Review comment: Add a comment saying that we're filtering out the environments that are finished.
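The filtering step the reviewer wants documented is a boolean-mask selection over the flattened batch. An isolated sketch with the comment added — the array contents and shapes are invented for illustration, only the "dones"/not_done mask pattern comes from the diff:

```python
import numpy as np

# A flattened batch gathered from several vectorized environments
# (contents and shapes assumed for illustration).
trajectory = {
    "observations": np.arange(10).reshape(5, 2),
    "dones": np.array([False, True, False, False, True]),
}

# Filter out timesteps from environments that are already finished, so the
# concatenated batch only contains valid (not-yet-done) transitions.
not_done = np.logical_not(trajectory["dones"])
trajectory = {key: val[not_done] for key, val in trajectory.items()}
```

Because the mask indexes along the first axis, the same `val[not_done]` works for every array in the dict regardless of its trailing dimensions.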
Force-pushed 1240642 to 3994256.
No description provided.