
Commit 0fe650c

Clarify algorithm discussion
1 parent 34aa13d commit 0fe650c

1 file changed: +2 -2 lines changed

report/report.tex

Lines changed: 2 additions & 2 deletions
@@ -83,13 +83,13 @@
 
 \subsubsection{Function Approximation}
 
-Though it may initially seem like the microcontroller could support a reasonable number of features, inspection reveals this is not the case. Consider the episodic semi-gradient one-step Sarsa algorithm.
+The microcontroller cannot support as many features as one might hope. Consider the episodic semi-gradient one-step Sarsa algorithm.
 
 \begin{equation}\label{eqn:update}
 \bm{\theta}_{t+1} = \bm{\theta}_t + \alpha \Big[R_{t+1} + \gamma \hat{q}(S_{t+1}, A_{t+1}, \bm{\theta}_t) - \hat{q}(S_t, A_t, \bm{\theta}_t)\Big]\nabla\hat{q}(S_t, A_t, \bm{\theta}_t)\tag{1}
 \end{equation}
 
-As shown in the implementation, it is possible to implement the update using only $n$ additional space, where $n$ is the number of weights. While this implementation is not difficult, it is easy to do incorrectly. For instance, if the action selection step is not placed before the memory allocation, the implementation will actually consume $2n$ stack memory; maximizing the value function over possible next states requires an additional $n$ stack space.
+As shown in the implementation, it is possible to implement the update using only $n$ additional space, where $n$ is the number of weights, but it is easy to do incorrectly. For instance, if the action selection step is not placed before the memory allocation, the implementation will actually consume $2n$ stack memory; maximizing the value function over possible next states requires an additional $n$ stack space.
 
 \begin{algorithm}
 \caption{Memory-conscious Episodic Semi-gradient One-step Sarsa}
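
For concreteness, below is a minimal C sketch of the kind of single-buffer update the changed paragraph describes, assuming linear function approximation. The names N_WEIGHTS, N_ACTIONS, features(), q_hat(), and sarsa_step() are illustrative assumptions, not taken from the report's implementation, and greedy selection stands in for whatever policy the report actually uses (e.g., epsilon-greedy). The sketch shows the ordering point from the diff: the next action is chosen before the scratch buffer is needed as the gradient, so only one n-sized buffer is ever live.

#include <stddef.h>

#define N_WEIGHTS 32   /* n: number of weights (illustrative value) */
#define N_ACTIONS 4    /* small discrete action set (assumption)    */

static float theta[N_WEIGHTS];        /* weight vector, kept off the stack */

/* Hypothetical feature extractor: fills x with the feature vector x(s, a). */
extern void features(int s, int a, float x[N_WEIGHTS]);

/* Linear estimate q_hat(s, a, theta) = theta . x(s, a).  The caller supplies
 * the scratch buffer, so this function adds no n-sized stack of its own.   */
static float q_hat(int s, int a, float x[N_WEIGHTS])
{
    features(s, a, x);
    float q = 0.0f;
    for (size_t i = 0; i < N_WEIGHTS; ++i)
        q += theta[i] * x[i];
    return q;
}

/* One semi-gradient one-step Sarsa update using a single n-sized scratch
 * buffer.  The next action is selected first, reusing x for each candidate;
 * only afterwards is x overwritten with the gradient for (S_t, A_t).       */
int sarsa_step(int s, int a, float r, int s_next, int done,
               float alpha, float gamma)
{
    float x[N_WEIGHTS];               /* the only n floats of extra space */

    /* Action selection before the buffer is needed for the update.       */
    int   a_next = 0;
    float q_next = 0.0f;              /* q_hat(S_{t+1}, A_{t+1})          */
    if (!done) {
        q_next = q_hat(s_next, 0, x);
        for (int b = 1; b < N_ACTIONS; ++b) {
            float q = q_hat(s_next, b, x);
            if (q > q_next) { q_next = q; a_next = b; }
        }
    }

    /* For a linear approximator, x(S_t, A_t) is also the gradient of q_hat
     * with respect to theta, so the same buffer now serves the update.    */
    float q_sa  = q_hat(s, a, x);
    float delta = r + (done ? 0.0f : gamma * q_next) - q_sa;
    for (size_t i = 0; i < N_WEIGHTS; ++i)
        theta[i] += alpha * delta * x[i];

    return a_next;                    /* A_{t+1}; unused on terminal steps */
}

Under a linear approximator the gradient of q_hat with respect to theta at (S_t, A_t) is just the feature vector x(S_t, A_t), which is why the same buffer can serve both action selection and the weight update.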
