Rainbow #374

Merged: 85 commits, Apr 26, 2019
Commits
a642136
init rainbow
seann999 Nov 3, 2018
8d9d0c8
Merge branch 'master' of https://github.com/chainer/chainerrl
seann999 Nov 3, 2018
93d87b9
init impl for all except n-step-return
seann999 Nov 3, 2018
254bce4
fix conflict
seann999 Nov 3, 2018
41fbd2a
compat w n-step
seann999 Nov 3, 2018
d026939
test fix
seann999 Nov 3, 2018
60e53c4
gpu fix
seann999 Nov 3, 2018
14d0144
gpu fix
seann999 Nov 3, 2018
8e31fec
gpu fix
seann999 Nov 3, 2018
e9661b7
gpu fix
seann999 Nov 3, 2018
5821a0c
softmax
seann999 Nov 3, 2018
3a64094
applied autopep
seann999 Nov 4, 2018
be1aff9
removed redundant example
seann999 Nov 4, 2018
e500ee4
undo breaking changes
seann999 Nov 4, 2018
9e3b909
fix flake8 errors
seann999 Nov 4, 2018
9d47391
fix docs
seann999 Nov 4, 2018
a3e6923
flake8 fix
seann999 Nov 4, 2018
e9f0b4f
flake8 fix
seann999 Nov 4, 2018
fc899ec
resolves merge conflicts
prabhatnagarajan Dec 26, 2018
c94733e
Merge commit 'fc899ec0b939c492eb606e0ccbe2ca296f65df33' into rainbow
prabhatnagarajan Mar 6, 2019
c3f5746
Merge branch 'master' into rainbow
toslunar Mar 7, 2019
e152292
Merge branch 'rainbow' of https://github.com/prabhatnagarajan/chainer…
prabhatnagarajan Mar 7, 2019
7dd69e3
adds readme for rainbow
prabhatnagarajan Mar 7, 2019
148950f
fixes minor error in readme
prabhatnagarajan Mar 7, 2019
9f0838f
keeps old train_categorical_ and moves existing file to train_rainbow
prabhatnagarajan Mar 7, 2019
1b714cf
adds rainbow to test examples, and cleans up the example more to alig…
prabhatnagarajan Mar 7, 2019
bff26e3
sets prioritized to true by default
prabhatnagarajan Mar 7, 2019
0e99996
removes some arguments
prabhatnagarajan Mar 7, 2019
de315a5
edits the epsilon
prabhatnagarajan Mar 7, 2019
2181ec9
modifies double categorical target computation
prabhatnagarajan Mar 7, 2019
ecc975e
removes unused variable
prabhatnagarajan Mar 7, 2019
49dfc32
Merge branch 'master' into rainbow
prabhatnagarajan Mar 12, 2019
9bf4ef3
Adds double categorical tests
prabhatnagarajan Mar 12, 2019
6614127
fixes a typo
prabhatnagarajan Mar 13, 2019
16cec4b
resolves merge conflicts
prabhatnagarajan Mar 14, 2019
dc58305
adds profiling
prabhatnagarajan Mar 29, 2019
ae78f1a
removes unnecessary cprof stuff
prabhatnagarajan Mar 30, 2019
cf44c9b
Merge branch 'master' into rainbow
prabhatnagarajan Apr 2, 2019
bf68efb
Merge branch 'master' into rainbowcopy
prabhatnagarajan Apr 2, 2019
518fdfb
removes some reshaping
prabhatnagarajan Apr 3, 2019
bae963f
compresses code more
prabhatnagarajan Apr 3, 2019
0157288
removes trace
prabhatnagarajan Apr 3, 2019
cfea74d
Merge branch 'rainbowcopy' into rainbow
prabhatnagarajan Apr 4, 2019
c4b3e34
cleans up comments
prabhatnagarajan Apr 4, 2019
a8386c2
refactors some categorical double DQN code
prabhatnagarajan Apr 4, 2019
340bfd1
changes episode len in eval to use args
prabhatnagarajan Apr 4, 2019
2a1fb79
adds some print statements
prabhatnagarajan Apr 4, 2019
1a1b688
uses xp instead of np
prabhatnagarajan Apr 4, 2019
d2b78d9
replaces numpy with xp in noisy_linear
prabhatnagarajan Apr 4, 2019
a487893
fixes bug
prabhatnagarajan Apr 7, 2019
c36a927
passes dtype with standard normal computation for cupy
prabhatnagarajan Apr 8, 2019
b6c29fc
imports numpy in noisy linear
prabhatnagarajan Apr 8, 2019
07ea30f
improves dueling architecture and eps call
prabhatnagarajan Apr 8, 2019
0a1f5bb
attempts cuda kernel
prabhatnagarajan Apr 8, 2019
340944e
fixes kernel
prabhatnagarajan Apr 8, 2019
01e8b72
fixes bug
prabhatnagarajan Apr 8, 2019
67d25c6
adds missing statement
prabhatnagarajan Apr 8, 2019
0c6677e
sets kernel in constructor
prabhatnagarajan Apr 8, 2019
f46761e
adds cuda import
prabhatnagarajan Apr 8, 2019
bb07184
adds mul_add and uses muladd in noisy linear
prabhatnagarajan Apr 8, 2019
6f1d63d
replaces another op to use muladd
prabhatnagarajan Apr 9, 2019
b87fbb1
uses chainer add instead of normal add
prabhatnagarajan Apr 9, 2019
8d318ee
uses split axis instead of indexing
prabhatnagarajan Apr 9, 2019
c1c5d2a
fixes incorrect shape access
prabhatnagarajan Apr 9, 2019
8b26dd7
fixes minor syntax issues
prabhatnagarajan Apr 9, 2019
ca25b4c
addresses flakes
prabhatnagarajan Apr 9, 2019
d9868f1
addresses flake issues on tests and example
prabhatnagarajan Apr 9, 2019
9774133
modifies cuda import
prabhatnagarajan Apr 9, 2019
b43632c
minor modifications
prabhatnagarajan Apr 10, 2019
ed0189a
amends prioritized categorical
prabhatnagarajan Apr 10, 2019
0e2dc70
adds basic skeleton code from test_dqn to test new categorical loss f…
prabhatnagarajan Apr 10, 2019
9ea8b91
modifies setup arrays
prabhatnagarajan Apr 10, 2019
2909d5c
Merge branch 'master' into rainbow
prabhatnagarajan Apr 10, 2019
09e5ebe
modifications to tests
prabhatnagarajan Apr 10, 2019
e567278
adds tests and cleans up categorical functions somewhat
prabhatnagarajan Apr 11, 2019
7b46ed2
addresses flakes, minor refactoring
prabhatnagarajan Apr 11, 2019
920c329
sets exploration epsilon to 0 as per the rainbow paper, applies autopep
prabhatnagarajan Apr 12, 2019
f205e9a
address comments from PR
prabhatnagarajan Apr 21, 2019
1ab816c
Update chainerrl/q_functions/dueling_dqn.py
toslunar Apr 23, 2019
1fe2c46
adds scores to Rainbow PR
prabhatnagarajan Apr 26, 2019
454a790
Merge branch 'master' into rainbow
prabhatnagarajan Apr 26, 2019
0ff03ba
Merge remote-tracking branch 'origin/rainbow' into rainbow
prabhatnagarajan Apr 26, 2019
ad2f767
adds results summary and fixes time table
prabhatnagarajan Apr 26, 2019
77acffe
changes reference to DQN to be rainbow
prabhatnagarajan Apr 26, 2019
b2d52be
doesn't export MulAdd
prabhatnagarajan Apr 26, 2019
2 changes: 1 addition & 1 deletion chainerrl/q_functions/dueling_dqn.py
@@ -103,7 +103,7 @@ def __call__(self, x):
         batch_size = x.shape[0]

         h = self.activation(self.main_stream(h))
-        h_a, h_v = F.split_axis(h, 2, axis=--1)
+        h_a, h_v = F.split_axis(h, 2, axis=-1)
         ya = F.reshape(self.a_stream(h_a),
                        (batch_size, self.n_actions, self.n_atoms))
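
The one-line change in this hunk replaces `axis=--1` with `axis=-1`. In Python, `--1` is a double unary minus and evaluates to `1`, so the old line split along axis 1 rather than the last axis. The sketch below illustrates why the two spellings probably behaved the same here, assuming `h` is a 2-D `(batch, features)` array (the hunk suggests this but does not show it). `normalize_axis` and `split_last_axis` are hypothetical pure-Python helpers written for this illustration; they are not part of Chainer or ChainerRL.

```python
def normalize_axis(axis, ndim):
    """Map a possibly negative axis index to its non-negative equivalent.

    Hypothetical helper for illustration only.
    """
    if not -ndim <= axis < ndim:
        raise ValueError("axis {} out of range for ndim {}".format(axis, ndim))
    return axis % ndim


def split_last_axis(rows, n_sections):
    """Split each row of a 2-D nested list into n_sections equal chunks.

    Hypothetical stand-in for splitting a (batch, features) array
    along its last axis.
    """
    width = len(rows[0])
    if width % n_sections != 0:
        raise ValueError("row width must be divisible by n_sections")
    step = width // n_sections
    return [
        [row[i * step:(i + 1) * step] for row in rows]
        for i in range(n_sections)
    ]


old_axis = --1  # double unary minus: evaluates to +1
new_axis = -1   # the last axis, whatever the number of dimensions

# For a 2-D array, axis 1 and axis -1 name the same axis ...
same_2d = normalize_axis(old_axis, 2) == normalize_axis(new_axis, 2)
# ... but for a 3-D array they would not.
same_3d = normalize_axis(old_axis, 3) == normalize_axis(new_axis, 3)

# Splitting a toy (batch=2, features=4) input into two halves per row.
h = [[1, 2, 3, 4], [5, 6, 7, 8]]
h_a, h_v = split_last_axis(h, 2)
```

Under that 2-D assumption the fix is cosmetic, but `axis=-1` states the intent (split the feature axis) and stays correct if the input ever gains a dimension.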
