<p align="center">
<a href="https://www.youtube.com/watch?v=pieI7rOXELI&list=PLXO45tsB95cIplu-fLMpUEEZTwrDNh6Ba" target="_blank">
- <img width="60%" src="/blob/master/RL_cover.jpg" style="max-width:100%;">
+ <img width="60%" src="https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/blob/master/RL_cover.jpg" style="max-width:100%;">
</a>
</p>

@@ -18,67 +18,67 @@ In these tutorials for reinforcement learning, it covers from the basic RL algor
# Table of Contents

* Tutorials
-     * [Simple entry example](/tree/master/contents/1_command_line_reinforcement_learning)
-     * [Q-learning](/tree/master/contents/2_Q_Learning_maze)
-     * [Sarsa](/tree/master/contents/3_Sarsa_maze)
-     * [Sarsa(lambda)](/tree/master/contents/4_Sarsa_lambda_maze)
-     * [Deep Q Network](/tree/master/contents/5_Deep_Q_Network)
-     * [Using OpenAI Gym](/tree/master/contents/6_OpenAI_gym)
-     * [Double DQN](/tree/master/contents/5.1_Double_DQN)
-     * [DQN with Prioitized Experience Replay](/tree/master/contents/5.2_Prioritized_Replay_DQN)
-     * [Dueling DQN](/tree/master/contents/5.3_Dueling_DQN)
-     * [Policy Gradients](/tree/master/contents/7_Policy_gradient_softmax)
-     * [Actor Critic](/tree/master/contents/8_Actor_Critic_Advantage)
-     * [Deep Deterministic Policy Gradient](/tree/master/contents/9_Deep_Deterministic_Policy_Gradient_DDPG)
-     * [A3C](/tree/master/contents/10_A3C)
-     * [Dyna-Q](/tree/master/contents/11_Dyna_Q)
-     * [Proximal Policy Optimization (PPO)](/tree/master/contents/12_Proximal_Policy_Optimization)
- * [Some of my experiments](/tree/master/experiments)
-     * [2D Car](/tree/master/experiments/2D_car)
-     * [Robot arm](/tree/master/experiments/Robot_arm)
-     * [BipedalWalker](/tree/master/experiments/Solve_BipedalWalker)
-     * [LunarLander](/tree/master/experiments/Solve_LunarLander)
+     * [Simple entry example](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/1_command_line_reinforcement_learning)
+     * [Q-learning](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/2_Q_Learning_maze)
+     * [Sarsa](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/3_Sarsa_maze)
+     * [Sarsa(lambda)](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/4_Sarsa_lambda_maze)
+     * [Deep Q Network](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/5_Deep_Q_Network)
+     * [Using OpenAI Gym](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/6_OpenAI_gym)
+     * [Double DQN](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/5.1_Double_DQN)
+     * [DQN with Prioitized Experience Replay](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/5.2_Prioritized_Replay_DQN)
+     * [Dueling DQN](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/5.3_Dueling_DQN)
+     * [Policy Gradients](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/7_Policy_gradient_softmax)
+     * [Actor Critic](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/8_Actor_Critic_Advantage)
+     * [Deep Deterministic Policy Gradient](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/9_Deep_Deterministic_Policy_Gradient_DDPG)
+     * [A3C](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/10_A3C)
+     * [Dyna-Q](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/11_Dyna_Q)
+     * [Proximal Policy Optimization (PPO)](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/12_Proximal_Policy_Optimization)
+ * [Some of my experiments](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/experiments)
+     * [2D Car](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/experiments/2D_car)
+     * [Robot arm](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/experiments/Robot_arm)
+     * [BipedalWalker](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/experiments/Solve_BipedalWalker)
+     * [LunarLander](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/experiments/Solve_LunarLander)

# Some RL Networks
- ### [Deep Q Network](/tree/master/contents/5_Deep_Q_Network)
+ ### [Deep Q Network](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/5_Deep_Q_Network)

- <a href="/tree/master/contents/5_Deep_Q_Network">
+ <a href="https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/5_Deep_Q_Network">
<img class="course-image" src="https://morvanzhou.github.io/static/results/reinforcement-learning/4-3-2.png">
</a>

- ### [Double DQN](/tree/master/contents/5.1_Double_DQN)
+ ### [Double DQN](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/5.1_Double_DQN)

- <a href="/tree/master/contents/5.1_Double_DQN">
+ <a href="https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/5.1_Double_DQN">
<img class="course-image" src="https://morvanzhou.github.io/static/results/reinforcement-learning/4-5-3.png">
</a>

- ### [Dueling DQN](/tree/master/contents/5.3_Dueling_DQN)
+ ### [Dueling DQN](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/5.3_Dueling_DQN)

- <a href="/tree/master/contents/5.3_Dueling_DQN">
+ <a href="https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/5.3_Dueling_DQN">
<img class="course-image" src="https://morvanzhou.github.io/static/results/reinforcement-learning/4-7-4.png">
</a>

- ### [Actor Critic](/tree/master/contents/8_Actor_Critic_Advantage)
+ ### [Actor Critic](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/8_Actor_Critic_Advantage)

- <a href="/tree/master/contents/8_Actor_Critic_Advantage">
+ <a href="https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/8_Actor_Critic_Advantage">
<img class="course-image" src="https://morvanzhou.github.io/static/results/reinforcement-learning/6-1-1.png">
</a>

- ### [Deep Deterministic Policy Gradient](/tree/master/contents/9_Deep_Deterministic_Policy_Gradient_DDPG)
+ ### [Deep Deterministic Policy Gradient](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/9_Deep_Deterministic_Policy_Gradient_DDPG)

- <a href="/tree/master/contents/9_Deep_Deterministic_Policy_Gradient_DDPG">
+ <a href="https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/9_Deep_Deterministic_Policy_Gradient_DDPG">
<img class="course-image" src="https://morvanzhou.github.io/static/results/reinforcement-learning/6-2-2.png">
</a>

- ### [A3C](/tree/master/contents/10_A3C)
+ ### [A3C](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/10_A3C)

- <a href="/tree/master/contents/10_A3C">
+ <a href="https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/10_A3C">
<img class="course-image" src="https://morvanzhou.github.io/static/results/reinforcement-learning/6-3-2.png">
</a>

- ### [Proximal Policy Optimization (PPO)](/tree/master/contents/12_Proximal_Policy_Optimization)
+ ### [Proximal Policy Optimization (PPO)](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/12_Proximal_Policy_Optimization)

- <a href="/tree/master/contents/12_Proximal_Policy_Optimization">
+ <a href="https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/tree/master/contents/12_Proximal_Policy_Optimization">
<img class="course-image" src="https://morvanzhou.github.io/static/results/reinforcement-learning/6-4-3.png">
</a>