## Experiment Directory
- All experiment source code lives in the `lib` directory and comes from [dennybritz](https://github.com/dennybritz/reinforcement-learning); this repository only interprets and summarizes it.
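+ All experiment source code lives in the `lib` directory and comes from [dennybritz](https://github.com/dennybritz/reinforcement-learning). On top of the original code, this repository adds a detailed introduction to each experiment's background and a side-by-side mapping between the code and the formulas.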
- [Gridworld](https://github.com/applenob/rl_learn/blob/master/1_gridworld.ipynb): **Dynamic Programming** for **MDPs**.
- [Blackjack](https://github.com/applenob/rl_learn/blob/master/2_blackjack.ipynb): **Model Free** **Monte Carlo** Planning and Controlling.
- - [Windy Gridworld](https://github.com/applenob/rl_learn/blob/master/3_windy_gridworld.ipynb): **Model Free** **Temporal Difference** **On-Policy Controlling**, **SARSA**.
- - [Cliff Walking](https://github.com/applenob/rl_learn/blob/master/4_cliff_walking.ipynb): **Model Free** **Temporal Difference** **Off-Policy Controlling**, **Q-learning**.
+ - [Windy Gridworld](https://github.com/applenob/rl_learn/blob/master/3_windy_gridworld.ipynb): **Model Free** **Temporal Difference** **On-Policy Controlling**: **SARSA**.
+ - [Cliff Walking](https://github.com/applenob/rl_learn/blob/master/4_cliff_walking.ipynb): **Model Free** **Temporal Difference** **Off-Policy Controlling**: **Q-learning**.
- [Mountain Car](https://github.com/applenob/rl_learn/blob/master/5_mountain_car.ipynb): **Q-Learning with Linear Function Approximation**, for cases where the Q-table is too large to handle (continuous state space).
- [Atari](https://github.com/applenob/rl_learn/blob/master/6_atari.ipynb): **Deep Q-Learning**.
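
As a hedged illustration of the on-policy versus off-policy distinction named in the Windy Gridworld and Cliff Walking entries, the sketch below contrasts the two TD targets. It is not taken from the notebooks: the function names (`sarsa_target`, `q_learning_target`, `td_update`), the toy Q-table, and the default `alpha` and `gamma` values are all assumptions chosen for illustration.

```python
def sarsa_target(Q, reward, next_state, next_action, gamma=1.0):
    """On-policy TD target: bootstrap from the action actually taken next."""
    return reward + gamma * Q[next_state][next_action]


def q_learning_target(Q, reward, next_state, gamma=1.0):
    """Off-policy TD target: bootstrap from the greedy (max-value) next action."""
    return reward + gamma * max(Q[next_state])


def td_update(Q, state, action, target, alpha=0.5):
    """Move Q[state][action] a fraction alpha toward the TD target (in place)."""
    Q[state][action] += alpha * (target - Q[state][action])
    return Q


# Hypothetical toy table: 2 states x 2 actions, values made up for illustration.
Q = [[0.0, 0.0],
     [1.0, 3.0]]

# Same transition, different targets: SARSA uses the sampled next action (0),
# Q-learning uses the best next action regardless of what is taken.
sarsa = sarsa_target(Q, reward=-1.0, next_state=1, next_action=0)
qlearn = q_learning_target(Q, reward=-1.0, next_state=1)
```

With the same transition, the two targets differ only in which next-state value they bootstrap from; everything else in the update rule is shared, which is why `td_update` takes the target as an argument.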