lectures/mccall_model.md
```{code-cell} python3
import matplotlib.pyplot as plt
import numpy as np
import jax
import jax.numpy as jnp
from typing import NamedTuple
import quantecon as qe
from quantecon.distributions import BetaBinomial
```
The worker faces a trade-off:

* Waiting too long for a good offer is costly, since the future is discounted.
* Accepting too early is costly, since better offers might arrive in the future.

To decide optimally in the face of this trade-off, we use [dynamic programming](https://dp.quantecon.org/).

Dynamic programming can be thought of as a two-step procedure that
To this end, let $v^*(w)$ be the total lifetime *value* accruing to an
unemployed worker who enters the current period unemployed when the wage is
$w \in \mathbb{W}$.

(In particular, the agent has wage offer $w$ in hand and can accept or reject it.)

More precisely, $v^*(w)$ denotes the value of the objective function
{eq}`obj_model` when an agent in this situation makes *optimal* decisions now
for every possible $w$ in $\mathbb{W}$.

This is a version of the **Bellman equation**, which is
ubiquitous in economic dynamics and other fields involving planning over time.

The intuition behind it is as follows:
$$
\frac{w}{1 - \beta} = w + \beta w + \beta^2 w + \cdots
$$

* the second term inside the max operation is the continuation value, which is
  the lifetime payoff from rejecting the current offer and then behaving
  optimally in all subsequent periods

If we optimize and pick the best of these two options, we obtain maximal
lifetime value from today, given current offer $w$.

But this is precisely $v^*(w)$, which is the left-hand side of {eq}`odu_pv`.
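The geometric-series identity behind the accept value $w/(1-\beta)$ is easy to check numerically. A quick sketch (the wage, discount factor, and truncation length below are arbitrary illustrative choices):

```python
# Verify w / (1 - β) = w + β·w + β²·w + ... via a truncated sum.
β, w = 0.9, 2.0
partial_sum = sum(β**t * w for t in range(1_000))
print(partial_sum, w / (1 - β))  # the two values agree up to truncation error
```

Since $\beta^{1000}$ is vanishingly small, the truncated sum matches the closed form to machine precision.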
All we have to do is select the maximal choice on the right-hand side of {eq}`odu_pv`.

The optimal action is best thought of as a **policy**, which is, in general, a map from
states to actions.

Given any $w$, we can read off the corresponding best choice (accept or
reject) by picking the max on the right-hand side of {eq}`odu_pv`.

Thus, we have a map from $\mathbb W$ to $\{0, 1\}$, with 1 meaning accept and 0 meaning reject.
where

```{math}
:label: reswage

\bar w := (1 - \beta) \left\{ c + \beta \sum_{w'} v^*(w') q (w') \right\}
```

Here $\bar w$ (called the **reservation wage**) is a constant depending on
$\beta, c$ and the wage distribution.

The agent should accept if and only if the current wage offer exceeds the reservation wage.
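To make {eq}`reswage` concrete, here is a minimal sketch of the formula as code. It assumes the wage probabilities and a previously computed value-function vector are available as NumPy arrays; the function name and argument names are illustrative, not part of the lecture's code:

```python
import numpy as np

def reservation_wage(c, β, v_star, q):
    # \bar w = (1 - β) { c + β Σ_{w'} v*(w') q(w') }
    return (1 - β) * (c + β * np.sum(v_star * q))
```

The accept/reject policy is then simply the indicator `w >= reservation_wage(c, β, v_star, q)`.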
In view of {eq}`reswage`, we can compute this reservation wage if we can compute the value function.

## Computing the Optimal Policy: Take 1

To put the above ideas into action, we need to compute the value function at each $w \in \mathbb W$.

To simplify notation, let's set
$$
v^*(i) := v^*(w_i)
$$

The value function is then represented by the vector $v^* = (v^*(i))_{i=1}^n$.

In view of {eq}`odu_pv`, this vector satisfies the nonlinear system of equations
The theory below elaborates on this point.

What's the mathematics behind these ideas?

First, one defines a mapping $T$ from $\mathbb R^n$ to itself via

```{math}
:label: odu_pv3

(Tv)(i)
= \max \left\{
        \frac{w_i}{1 - \beta}, \,
        c + \beta \sum_{1 \leq j \leq n} v(j) q(j)
    \right\}
\quad \text{for } i = 1, \ldots, n
```
(A new vector $Tv$ is obtained from given vector $v$ by evaluating
the r.h.s. at each $i$.)

The element $v_k$ in the sequence $\{v_k\}$ of successive approximations corresponds to $T^k v$.

* This is $T$ applied $k$ times, starting at the initial guess $v$

One can show that the conditions of the [Banach fixed point theorem](https://en.wikipedia.org/wiki/Banach_fixed-point_theorem) are
satisfied by $T$ on $\mathbb R^n$.
One implication is that $T$ has a unique fixed point in $\mathbb R^n$.

* That is, a unique vector $\bar v$ such that $T \bar v = \bar v$.

Moreover, it's immediate from the definition of $T$ that this fixed point is $v^*$.

A second implication of the Banach contraction mapping theorem is that
$\{ T^k v \}$ converges to the fixed point $v^*$ regardless of $v$.
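The successive-approximation scheme $v_{k+1} = T v_k$ can be sketched in a few lines. The following is a plain-NumPy illustration, not the lecture's JAX implementation; the names `T` and `compute_fixed_point` and the tolerance settings are our own choices:

```python
import numpy as np

def T(v, c, β, w, q):
    # (Tv)(i) = max{ w_i / (1 - β),  c + β Σ_j v(j) q(j) }
    return np.maximum(w / (1 - β), c + β * np.sum(v * q))

def compute_fixed_point(c, β, w, q, tol=1e-8, max_iter=10_000):
    v = w / (1 - β)              # any initial guess converges
    for _ in range(max_iter):
        v_new = T(v, c, β, w, q)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v
```

Because $T$ is a contraction of modulus $\beta$, the loop terminates and the returned vector is an approximate fixed point, i.e. an approximation of $v^*$.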
We will use [JAX](https://python-programming.quantecon.org/jax_intro.html) to write our code.
We'll use `NamedTuple` for our model class to maintain immutability, which works well with JAX's functional programming paradigm.

Here's a class that stores the model parameters with default values.

```{code-cell} python3
class McCallModel(NamedTuple):
    c: float = 25                 # unemployment compensation
    β: float = 0.99               # discount factor
    w: jnp.ndarray = w_default    # array of wage values, w[i] = wage at state i
    q: jnp.ndarray = q_default    # array of probabilities
```

Here is a function that computes the
value in the maximum bracket on the right hand side of the Bellman equation {eq}`odu_pv2p`.