Commit a70773d

author cer committed: "update"

1 parent ef04fe4 commit a70773d

File tree

1 file changed: +29 -17 lines


reinforcement_learning.ipynb

Lines changed: 29 additions & 17 deletions
@@ -22,16 +22,14 @@
 "- [7. n-step Bootstrapping](#7.-n-step-Bootstrapping)\n",
 "- [8. Planning and Learning with Tabular Methods](#8.-Planning-and-Learning-with-Tabular-Methods)\n",
 "- [9. On-policy Prediction with Approximation](#9.-On-policy-Prediction-with-Approximation)\n",
-"- [](#)\n",
-"- [](#)\n",
-"- [](#)\n",
+"- [10. On-policy Control with Approximation](#10.-On-policy-Control-with-Approximation)\n",
+"- [11. Off-policy Methods with Approximation](#11.-Off-policy-Methods-with-Approximation)\n",
+"- [12. Eligibility Traces](#12.-Eligibility-Traces)\n",
 "- [13. Policy Gradient Methods](#13.-Policy-Gradient-Methods)\n",
-"- [](#)\n",
-"- [](#)\n",
-"- [](#)\n",
-"- [](#)\n",
-"- [](#)\n",
-"\n"
+"- [14. Psychology](#14.-Psychology)\n",
+"- [15. Neuroscience](#15.-Neuroscience)\n",
+"- [16. Applications and Case Studies](#16.-Applications-and-Case-Studies)\n",
+"- [17. Frontiers](#17.-Frontiers)\n"
 ]
 },
 {
@@ -370,7 +368,7 @@
 "- Because existing estimates are used directly to update the estimate, this method is called **bootstrapping**.\n",
 "- ![](https://github.com/applenob/rl_learn/raw/master/res/td0_est.png)\n",
 "- **TD error**: $\\delta_t = R_{t+1}+\\gamma V(S_{t+1})-V(S_t)$\n",
-"- ![](https://github.com/applenob/rl_learn/raw/master/res/td0.png)\n",
+"- ![](https://github.com/applenob/rl_learn/raw/master/res/td_0.png)\n",
 "\n",
 "### Sarsa\n",
 "- An on-policy TD control method.\n",
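The TD(0) update and TD error in the context lines of this hunk can be sketched in a few lines of Python (a minimal illustration only; the function, state names, and step-size values are hypothetical and not part of the notebook):

```python
# Sketch of the tabular TD(0) update: delta_t = R_{t+1} + gamma*V(S_{t+1}) - V(S_t),
# then V(S_t) is moved toward the TD target by a step of size alpha.
def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One TD(0) step: bootstrap from the current estimate V[s_next]."""
    delta = r + gamma * V[s_next] - V[s]  # TD error
    V[s] += alpha * delta                 # update the estimate for s
    return delta

V = {"A": 0.0, "B": 0.0}              # value table for two toy states
delta = td0_update(V, "A", 1.0, "B")  # observe reward 1.0 on the A -> B step
print(delta, V["A"])                  # 1.0 0.1
```

With all values initialized to zero, the TD error equals the observed reward, so `V["A"]` moves by `alpha * delta = 0.1`.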
@@ -420,17 +418,23 @@
 {
 "cell_type": "markdown",
 "metadata": {},
-"source": []
+"source": [
+"## 10. On-policy Control with Approximation"
+]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
-"source": []
+"source": [
+"## 11. Off-policy Methods with Approximation"
+]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
-"source": []
+"source": [
+"## 12. Eligibility Traces"
+]
 },
 {
 "cell_type": "markdown",
@@ -442,22 +446,30 @@
 {
 "cell_type": "markdown",
 "metadata": {},
-"source": []
+"source": [
+"## 14. Psychology"
+]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
-"source": []
+"source": [
+"## 15. Neuroscience"
+]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
-"source": []
+"source": [
+"## 16. Applications and Case Studies"
+]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
-"source": []
+"source": [
+"## 17. Frontiers"
+]
 },
 {
 "cell_type": "code",
