 - [TD3](#td3)
 - [SAC](#sac)
 
+<hr>
+
 <a name='dqn'></a>
 
 ### DQN
@@ -71,8 +73,11 @@ class ReplayBuffer:
 $ python DQN/DQN_Discrete.py
 ```
 
+<hr>
+
 <a name='drqn'></a>
 
+
 ### DRQN
 
 **Paper** [Deep Recurrent Q-Learning for Partially Observable MDPs](https://arxiv.org/abs/1507.06527)<br>
@@ -85,8 +90,11 @@ $ python DQN/DQN_Discrete.py
 $ python DRQN/DRQN_Discrete.py
 ```
 
+<hr>
+
 <a name='double_dqn'></a>
 
+
 ### DoubleDQN
 
 **Paper** [Deep Reinforcement Learning with Double Q-learning](https://arxiv.org/abs/1509.06461)<br>
@@ -99,6 +107,8 @@ $ python DRQN/DRQN_Discrete.py
 $ python DoubleQN/DoubleDQN_Discrete.py
 ```
 
+<hr>
+
 <a name='dueling_dqn'></a>
 
 ### DoubleDQN
@@ -113,6 +123,8 @@ $ python DoubleQN/DoubleDQN_Discrete.py
 $ python DuelingDQN/DuelingDQN_Discrete.py
 ```
 
+<hr>
+
 <a name='a2c'></a>
 
 ### A2C
@@ -130,6 +142,8 @@ $ python A2C/A2C_Discrete.py
 $ python A2C/A2C_Continuous.py
 ```
 
+<hr>
+
 <a name='a3c'></a>
 
 ### A3C
@@ -147,6 +161,8 @@ $ python A3C/A3C_Discrete.py
 $ python A3C/A3C_Continuous.py
 ```
 
+<hr>
+
 <a name='ppo'></a>
 
 ### PPO
@@ -164,6 +180,8 @@ $ python PPO/PPO_Discrete.py
 $ python PPO/PPO_Continuous.py
 ```
 
+<hr>
+
 <a name='trpo'></a>
 
 ### TRPO
@@ -177,6 +195,8 @@ $ python PPO/PPO_Continuous.py
 # NOTE: Not yet implemented!
 ```
 
+<hr>
+
 <a name='ddpg'></a>
 
 ### DDPG
@@ -190,6 +210,8 @@ $ python PPO/PPO_Continuous.py
 # NOTE: Not yet implemented!
 ```
 
+<hr>
+
 <a name='td3'></a>
 
 ### TD3
@@ -203,6 +225,8 @@ $ python PPO/PPO_Continuous.py
 # NOTE: Not yet implemented!
 ```
 
+<hr>
+
 <a name='sac'></a>
 
 ### SAC
@@ -217,6 +241,8 @@ $ python PPO/PPO_Continuous.py
 # NOTE: Not yet implemented!
 ```
 
+<hr>
+
 ## Reference
 
 - https://github.com/carpedm20/deep-rl-tensorflow