
WIP: [Impala] Scalable Distributed Deep-RL with Importance-Weighted Actor-Learner Architectures #2147

Closed
wants to merge 46 commits

Conversation

joneswong
Contributor

@joneswong joneswong commented May 28, 2018

This is a work-in-progress PR implementing IMPALA, a scalable distributed reinforcement learning algorithm.
The IMPALA optimizer has been implemented.
To do:

  • Impala agent (v-trace); see the sketch after this list
  • support for LSTM and deep residual networks
  • multi-GPU support (allreduce)

Once all of the above items are done, we must carefully test IMPALA by reproducing the experimental results reported in the original paper.
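
For reviewers unfamiliar with v-trace: it is the off-policy correction at the heart of the IMPALA agent. Below is a minimal NumPy sketch of the v-trace target computation for a single trajectory; the function name and signature are illustrative, not the eventual RLlib API.

import numpy as np

def vtrace_targets(behavior_logp, target_logp, rewards, values,
                   bootstrap_value, gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Compute v-trace value targets for one trajectory of length T.

    behavior_logp / target_logp: log mu(a_t|x_t) and log pi(a_t|x_t), shape [T]
    rewards / values: r_t and V(x_t) under the current value function, shape [T]
    bootstrap_value: V(x_T), used to bootstrap beyond the trajectory
    """
    behavior_logp = np.asarray(behavior_logp, dtype=np.float64)
    target_logp = np.asarray(target_logp, dtype=np.float64)
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)

    rhos = np.exp(target_logp - behavior_logp)    # importance ratios pi/mu
    clipped_rhos = np.minimum(rho_bar, rhos)      # rho_t in the paper
    clipped_cs = np.minimum(c_bar, rhos)          # c_t ("trace cutting")

    values_tp1 = np.append(values[1:], bootstrap_value)   # V(x_{t+1})
    deltas = clipped_rhos * (rewards + gamma * values_tp1 - values)

    # Backward recursion from the paper:
    #   v_s - V(x_s) = delta_s + gamma * c_s * (v_{s+1} - V(x_{s+1}))
    vs = np.zeros_like(values)
    acc = 0.0
    for t in reversed(range(len(values))):
        acc = deltas[t] + gamma * clipped_cs[t] * acc
        vs[t] = values[t] + acc
    return vs

The policy gradient then weights the advantage r_t + gamma * v_{t+1} - V(x_t) by the clipped rho_t, and the value loss regresses V(x_t) toward the v_s targets.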

Related issue number

#1924

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5654/

@@ -0,0 +1,3 @@
from ray.rllib.impala.impala import ImpalaAgent, DEFAULT_CONFIG
Collaborator

Let's add

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

to the top of the file
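
For context, these __future__ imports make a Python 2 module behave like Python 3 for these features. A small illustration of the division change, assuming the file runs under Python 2:

from __future__ import division, print_function

# Under Python 2 without the import, 1 / 2 is integer division and yields 0;
# with the import it yields 0.5, matching Python 3. absolute_import similarly
# stops bare imports from implicitly resolving to sibling modules in the package.
print(1 / 2)  # 0.5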

@@ -1,3 +1,4 @@
from ray.rllib.optimizers.impala_optimizer import ImpalaOptimizer
Collaborator

Looks like we forgot before, but let's also add

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

to the top of this file

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5779/

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5850/

@ericl
Contributor

ericl commented Jun 25, 2018

FYI, #2299 will add efficient batched LSTM support similar to that described in the IMPALA paper.

@robertnishihara
Collaborator

@ericl @joneswong is this PR still relevant or should it be closed?

@ericl ericl closed this Oct 27, 2018