Commit e2a2bb4

Merge pull request #204 from QuantEcon/ddp-fix

DiscreteDP: further fix in docstring

2 parents 43c3b5d + 675759c
File tree

1 file changed: +9 -9 lines changed
quantecon/markov/ddp.py

Lines changed: 9 additions & 9 deletions
@@ -23,16 +23,16 @@
 \Delta(S)`, where :math:`q(s'|s, a)` is the probability that the state
 in the next period is :math:`s'` when the current state is :math:`s`
 and the action chosen is :math:`a`; and
-* discount factor :math:`\beta \in [0, 1)`.
+* discount factor :math:`0 \leq \beta < 1`.

 For a policy function :math:`\sigma`, let :math:`r_{\sigma}` and
 :math:`Q_{\sigma}` be the reward vector and the transition probability
 matrix for :math:`\sigma`, which are defined by :math:`r_{\sigma}(s) =
 r(s, \sigma(s))` and :math:`Q_{\sigma}(s, s') = q(s'|s, \sigma(s))`,
 respectively. The policy value function :math:`v_{\sigma}` for
-:math`\sigma` is defined by
+:math:`\sigma` is defined by

-..math::
+.. math::

     v_{\sigma}(s) = \sum_{t=0}^{\infty}
         \beta^t (Q_{\sigma}^t r_{\sigma})(s)
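As a side note on the formula above: since :math:`Q_{\sigma}` is a stochastic matrix and :math:`0 \leq \beta < 1`, the series sums to :math:`(I - \beta Q_{\sigma})^{-1} r_{\sigma}`, so the policy value function can be computed by solving a linear system. The snippet below is a minimal sketch of that calculation with made-up r_sigma and Q_sigma; it is not code from ddp.py.

# Minimal sketch (illustrative data, not from ddp.py): evaluate the policy
# value function by solving (I - beta * Q_sigma) v = r_sigma.
import numpy as np

beta = 0.95                               # discount factor, 0 <= beta < 1
r_sigma = np.array([1.0, 0.5])            # reward vector under policy sigma
Q_sigma = np.array([[0.9, 0.1],           # transition matrix under sigma
                    [0.2, 0.8]])

v_sigma = np.linalg.solve(np.eye(2) - beta * Q_sigma, r_sigma)
print(v_sigma)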
@@ -45,23 +45,23 @@

 The *Bellman equation* is written as

-..math::
+.. math::

     v(s) = \max_{a \in A(s)} r(s, a)
            + \beta \sum_{s' \in S} q(s'|s, a) v(s') \quad (s \in S).

 The *Bellman operator* :math:`T` is defined by the right hand side of
 the Bellman equation:

-..math::
+.. math::

     (T v)(s) = \max_{a \in A(s)} r(s, a)
                + \beta \sum_{s' \in S} q(s'|s, a) v(s') \quad (s \in S).

 For a policy function :math:`\sigma`, the operator :math:`T_{\sigma}` is
 defined by

-..math::
+.. math::

     (T_{\sigma} v)(s) = r(s, \sigma(s))
                         + \beta \sum_{s' \in S} q(s'|s, \sigma(s)) v(s')
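A minimal sketch of the Bellman operator defined above, written for the product formulation in which R has shape (n, m) and Q has shape (n, m, n). The arrays are illustrative and independent of ddp.py.

# Minimal sketch of the Bellman operator T for the product formulation:
# R has shape (n, m), Q has shape (n, m, n), v has shape (n,).
import numpy as np

def bellman_operator(v, R, Q, beta):
    # (T v)(s) = max_a [ R[s, a] + beta * sum_{s'} Q[s, a, s'] * v[s'] ]
    return (R + beta * (Q @ v)).max(axis=1)

R = np.array([[5.0, 10.0],
              [-1.0, -np.inf]])           # -inf marks an infeasible action
Q = np.array([[[0.5, 0.5], [0.0, 1.0]],
              [[0.0, 1.0], [0.5, 0.5]]])
v = np.zeros(2)
print(bellman_operator(v, R, Q, 0.95))    # one application of T to v = 0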
@@ -117,7 +117,7 @@


 class DiscreteDP(object):
-    """
+    r"""
     Class for dealing with a discrete dynamic program.

     There are two ways to represent the data for instantiating a
@@ -165,7 +165,7 @@ class DiscreteDP(object):
         Transition probability array.

     beta : scalar(float)
-        Discount factor. Must be in [0, 1).
+        Discount factor. Must be 0 <= beta < 1.

     s_indices : array_like(int, ndim=1), optional(default=None)
         Array containing the indices of the states.
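For reference, a small usage sketch of the class whose docstring this commit edits, assuming the quantecon package is installed; R, Q, and beta are illustrative values, not data from this file.

# Usage sketch (assuming the quantecon package; illustrative data):
# product formulation with R of shape (n, m) and Q of shape (n, m, n).
import numpy as np
from quantecon.markov import DiscreteDP

R = np.array([[5.0, 10.0],
              [-1.0, -np.inf]])             # -inf: second action infeasible in state 1
Q = np.array([[[0.5, 0.5], [0.0, 1.0]],
              [[0.0, 1.0], [0.5, 0.5]]])
beta = 0.95                                 # must satisfy 0 <= beta < 1

ddp = DiscreteDP(R, Q, beta)
res = ddp.solve(method='policy_iteration')  # or 'value_iteration'
print(res.sigma)                            # optimal policy
print(res.v)                                # optimal value function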
@@ -297,7 +297,7 @@ def __init__(self, R, Q, beta, s_indices=None, a_indices=None):
             raise ValueError('R must be 1- or 2-dimensional')

         msg_dimension = 'dimensions of R and Q must be either 1 and 2, ' \
-                        'of 2 and 3'
+                        'or 2 and 3'
         msg_shape = 'shapes of R and Q must be either (n, m) and (n, m, n), ' \
                     'or (L,) and (L, n)'
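The shapes named in msg_shape correspond to the two ways of passing the data: (n, m) and (n, m, n) for the full product formulation, or (L,) and (L, n) for the state-action pairs formulation with s_indices and a_indices. A hedged sketch of the latter, again with illustrative data and assuming the quantecon package, follows.

# Sketch of the state-action pairs formulation referenced by msg_shape
# (assuming the quantecon package; illustrative data): R has shape (L,),
# Q has shape (L, n), and s_indices/a_indices label the L feasible pairs.
import numpy as np
from quantecon.markov import DiscreteDP

s_indices = [0, 0, 1]                      # states of the feasible pairs
a_indices = [0, 1, 0]                      # actions of the feasible pairs
R = np.array([5.0, 10.0, -1.0])            # shape (L,) with L = 3
Q = np.array([[0.5, 0.5],
              [0.0, 1.0],
              [0.0, 1.0]])                 # shape (L, n) with n = 2
beta = 0.95

ddp = DiscreteDP(R, Q, beta, s_indices, a_indices)
res = ddp.solve(method='policy_iteration')
print(res.sigma, res.v)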
