Commit ac5e50d

author
applenob
committed
update
1 parent e28bf7b commit ac5e50d

1 file changed: +3 -3 lines changed

README.md

Lines changed: 3 additions & 3 deletions
@@ -16,12 +16,12 @@
 
 ## Experiment Directory
 
-All experiment source code is in the `lib` directory and comes from [dennybritz](https://github.com/dennybritz/reinforcement-learning); this repository only interprets and summarizes it
+All experiment source code is in the `lib` directory and comes from [dennybritz](https://github.com/dennybritz/reinforcement-learning). On top of the original code, this repository adds a detailed introduction to each experiment's background and a side-by-side comparison of code and formulas
 
 - [Gridworld](https://github.com/applenob/rl_learn/blob/master/1_gridworld.ipynb): corresponds to **MDP**, **Dynamic Programming**
 - [Blackjack](https://github.com/applenob/rl_learn/blob/master/2_blackjack.ipynb): corresponds to **Model Free**, **Monte Carlo** Planning and Controlling
-- [Windy Gridworld](https://github.com/applenob/rl_learn/blob/master/3_windy_gridworld.ipynb): corresponds to **Model Free**, **Temporal Difference**, **On-Policy Controlling**, **SARSA**
-- [Cliff Walking](https://github.com/applenob/rl_learn/blob/master/4_cliff_walking.ipynb): corresponds to **Model Free**, **Temporal Difference**, **Off-Policy Controlling**, **Q-learning**
+- [Windy Gridworld](https://github.com/applenob/rl_learn/blob/master/3_windy_gridworld.ipynb): corresponds to **Model Free**, **Temporal Difference**, **On-Policy Controlling**, **SARSA**
+- [Cliff Walking](https://github.com/applenob/rl_learn/blob/master/4_cliff_walking.ipynb): corresponds to **Model Free**, **Temporal Difference**, **Off-Policy Controlling**, **Q-learning**
 - [Mountain Car](https://github.com/applenob/rl_learn/blob/master/5_mountain_car.ipynb): corresponds to **Q-Learning with Linear Function Approximation**, for cases where the Q table is too large to handle (continuous state space)
 - [Atari](https://github.com/applenob/rl_learn/blob/master/6_atari.ipynb): corresponds to **Deep-Q Learning**

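The two entries touched by this change contrast on-policy SARSA with off-policy Q-learning. As a reading aid, here is a minimal sketch of the two tabular update rules. It is not code from the `lib` directory or the notebooks: the function names, the `alpha` and `gamma` defaults, and the NumPy array layout `Q[state, action]` are all assumptions made for illustration, and terminal-state handling is omitted.

```python
import numpy as np


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.5, gamma=1.0):
    """On-policy TD update: bootstrap from the action actually chosen in s_next."""
    td_target = r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q


def q_learning_update(Q, s, a, r, s_next, alpha=0.5, gamma=1.0):
    """Off-policy TD update: bootstrap from the greedy (max-value) action in s_next."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q


# Hypothetical 2-state, 2-action table used only to show the difference.
Q = np.zeros((2, 2))
Q[1] = [1.0, 3.0]

# Same transition (s=0, a=0, r=-1, s_next=1); SARSA follows a_next=0,
# while Q-learning uses the best action available in s_next.
Q_sarsa = sarsa_update(Q.copy(), s=0, a=0, r=-1.0, s_next=1, a_next=0)
Q_qlearn = q_learning_update(Q.copy(), s=0, a=0, r=-1.0, s_next=1)
```

The only difference between the two functions is the bootstrap term: SARSA uses the value of the action the behaviour policy actually takes next, whereas Q-learning uses the maximum over next actions regardless of what is taken.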