
Commit e75d48a ("misc")
1 parent: d9cae86

File tree: 1 file changed, +32 -31 lines


lectures/mccall_model.md

Lines changed: 32 additions & 31 deletions
@@ -64,7 +64,6 @@ import matplotlib.pyplot as plt
 import numpy as np
 import jax
 import jax.numpy as jnp
-import jax.random as jr
 from typing import NamedTuple
 import quantecon as qe
 from quantecon.distributions import BetaBinomial
@@ -116,7 +115,7 @@ The worker faces a trade-off:
 * Waiting too long for a good offer is costly, since the future is discounted.
 * Accepting too early is costly, since better offers might arrive in the future.
 
-To decide optimally in the face of this trade-off, we use dynamic programming.
+To decide optimally in the face of this trade-off, we use [dynamic programming](https://dp.quantecon.org/).
 
 Dynamic programming can be thought of as a two-step procedure that
 
@@ -139,7 +138,7 @@ To this end, let $v^*(w)$ be the total lifetime *value* accruing to an
 unemployed worker who enters the current period unemployed when the wage is
 $w \in \mathbb{W}$.
 
-In particular, the agent has wage offer $w$ in hand.
+(In particular, the agent has wage offer $w$ in hand and can accept or reject it.)
 
 More precisely, $v^*(w)$ denotes the value of the objective function
 {eq}`obj_model` when an agent in this situation makes *optimal* decisions now
@@ -167,7 +166,7 @@ v^*(w)
 
 for every possible $w$ in $\mathbb{W}$.
 
-This important equation is a version of the **Bellman equation**, which is
+This is a version of the **Bellman equation**, which is
 ubiquitous in economic dynamics and other fields involving planning over time.
 
 The intuition behind it is as follows:
@@ -178,9 +177,12 @@ $$
 \frac{w}{1 - \beta} = w + \beta w + \beta^2 w + \cdots
 $$
 
-* the second term inside the max operation is the **continuation value**, which is the lifetime payoff from rejecting the current offer and then behaving optimally in all subsequent periods
+* the second term inside the max operation is the continuation value, which is
+  the lifetime payoff from rejecting the current offer and then behaving
+  optimally in all subsequent periods
 
-If we optimize and pick the best of these two options, we obtain maximal lifetime value from today, given current offer $w$.
+If we optimize and pick the best of these two options, we obtain maximal
+lifetime value from today, given current offer $w$.
 
 But this is precisely $v^*(w)$, which is the left-hand side of {eq}`odu_pv`.
 
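To make the two terms inside the max concrete, here is a minimal numerical sketch (not part of this commit; the three-point wage grid, probabilities, and the guess `v` are invented for illustration):

```python
import jax.numpy as jnp

β, c = 0.99, 25.0
w_grid = jnp.array([30.0, 40.0, 50.0])  # hypothetical wage offers
q = jnp.array([0.3, 0.4, 0.3])          # offer probabilities (sum to 1)
v = w_grid / (1 - β)                    # a crude guess for v*, not the true value

w = w_grid[1]                           # offer currently in hand
stopping_value = w / (1 - β)            # accept: w + βw + β²w + ...
continuation_value = c + β * v @ q      # reject now, behave optimally later
value = jnp.maximum(stopping_value, continuation_value)
```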
@@ -197,7 +199,7 @@ All we have to do is select the maximal choice on the right-hand side of {eq}`od
 The optimal action is best thought of as a **policy**, which is, in general, a map from
 states to actions.
 
-Given *any* $w$, we can read off the corresponding best choice (accept or
+Given any $w$, we can read off the corresponding best choice (accept or
 reject) by picking the max on the right-hand side of {eq}`odu_pv`.
 
 Thus, we have a map from $\mathbb W$ to $\{0, 1\}$, with 1 meaning accept and 0 meaning reject.
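Reusing the illustrative grid and guess `v` from the sketch above, this policy is one elementwise comparison (a sketch, not the commit's code):

```python
stopping_values = w_grid / (1 - β)  # value of accepting at each grid point
h = c + β * v @ q                   # continuation value, same for every offer
policy = jnp.where(stopping_values >= h, 1, 0)  # 1 = accept, 0 = reject
```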
@@ -228,7 +230,7 @@ where
 \bar w := (1 - \beta) \left\{ c + \beta \sum_{w'} v^*(w') q (w') \right\}
 ```
 
-Here $\bar w$ (called the *reservation wage*) is a constant depending on
+Here $\bar w$ (called the **reservation wage**) is a constant depending on
 $\beta, c$ and the wage distribution.
 
 The agent should accept if and only if the current wage offer exceeds the reservation wage.
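Given any approximation of $v^*$, the formula {eq}`reswage` is a one-liner; continuing the illustrative sketch from above:

```python
w_bar = (1 - β) * (c + β * v @ q)  # reservation wage
accept = w_grid >= w_bar           # accept exactly when the offer reaches w̄
```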
@@ -238,8 +240,7 @@ In view of {eq}`reswage`, we can compute this reservation wage if we can compute
 
 ## Computing the Optimal Policy: Take 1
 
-To put the above ideas into action, we need to compute the value function at
-each possible state $w \in \mathbb W$.
+To put the above ideas into action, we need to compute the value function at each $w \in \mathbb W$.
 
 To simplify notation, let's set
 
@@ -249,8 +250,7 @@ $$
 v^*(i) := v^*(w_i)
 $$
 
-The value function is then represented by the vector
-$v^* = (v^*(i))_{i=1}^n$.
+The value function is then represented by the vector $v^* = (v^*(i))_{i=1}^n$.
 
 In view of {eq}`odu_pv`, this vector satisfies the nonlinear system of equations
 
@@ -302,8 +302,7 @@ The theory below elaborates on this point.
 
 What's the mathematics behind these ideas?
 
-First, one defines a mapping $T$ from $\mathbb R^n$ to
-itself via
+First, one defines a mapping $T$ from $\mathbb R^n$ to itself via
 
 ```{math}
 :label: odu_pv3
@@ -320,11 +319,9 @@ itself via
 (A new vector $Tv$ is obtained from given vector $v$ by evaluating
 the r.h.s. at each $i$.)
 
-The element $v_k$ in the sequence $\{v_k\}$ of successive
-approximations corresponds to $T^k v$.
+The element $v_k$ in the sequence $\{v_k\}$ of successive approximations corresponds to $T^k v$.
 
-* This is $T$ applied $k$ times, starting at the initial guess
-  $v$
+* This is $T$ applied $k$ times, starting at the initial guess $v$
 
 One can show that the conditions of the [Banach fixed point theorem](https://en.wikipedia.org/wiki/Banach_fixed-point_theorem) are
 satisfied by $T$ on $\mathbb R^n$.
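Successive approximation is short to sketch; `T`, `tol`, and `max_iter` below are illustrative names working on the toy grid introduced earlier, not the commit's implementation:

```python
def T(v):
    """Bellman operator on the illustrative grid (w_grid, q, c, β)."""
    return jnp.maximum(w_grid / (1 - β), c + β * v @ q)

v_k = jnp.zeros_like(w_grid)  # any initial guess
tol, max_iter = 1e-6, 10_000
for _ in range(max_iter):
    v_next = T(v_k)
    if jnp.max(jnp.abs(v_next - v_k)) < tol:
        break
    v_k = v_next              # v_k corresponds to T^k v
```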
@@ -333,12 +330,11 @@ One implication is that $T$ has a unique fixed point in $\mathbb R^n$.
 
 * That is, a unique vector $\bar v$ such that $T \bar v = \bar v$.
 
-Moreover, it's immediate from the definition of $T$ that this fixed
-point is $v^*$.
+Moreover, it's immediate from the definition of $T$ that this fixed point is $v^*$.
 
 A second implication of the Banach contraction mapping theorem is that
-$\{ T^k v \}$ converges to the fixed point $v^*$ regardless of
-$v$.
+$\{ T^k v \}$ converges to the fixed point $v^*$ regardless of $v$.
+
 
 ### Implementation
 
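The "regardless of $v$" claim is easy to check numerically with the sketch operator `T` above: two very different starting vectors end up at (numerically) the same fixed point.

```python
v_a = jnp.zeros_like(w_grid)        # start at zero
v_b = 1e4 * jnp.ones_like(w_grid)   # start far away
for _ in range(5_000):
    v_a, v_b = T(v_a), T(v_b)
print(jnp.max(jnp.abs(v_a - v_b)))  # ≈ 0: both iterate sequences approach v*
```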
@@ -368,19 +364,24 @@ ax.set_ylabel('probabilities')
 plt.show()
 ```
 
-We are going to use JAX to accelerate our code.
+We will use [JAX](https://python-programming.quantecon.org/jax_intro.html) to write our code.
 
-* We'll use NamedTuple for our model class to maintain immutability, which works well with JAX's functional programming paradigm.
+We'll use `NamedTuple` for our model class to maintain immutability, which works well with JAX's functional programming paradigm.
 
-Here's a class that stores the model parameters with default values, and a separate function that computes the values of state-action pairs (i.e., the value in the maximum bracket on the right hand side of the Bellman equation {eq}`odu_pv2p`).
+Here's a class that stores the model parameters with default values.
 
 ```{code-cell} python3
 class McCallModel(NamedTuple):
     c: float = 25                 # unemployment compensation
     β: float = 0.99               # discount factor
     w: jnp.ndarray = w_default    # array of wage values, w[i] = wage at state i
     q: jnp.ndarray = q_default    # array of probabilities
+```
 
+Here is a function that computes the
+value in the maximum bracket on the right hand side of the Bellman equation {eq}`odu_pv2p`.
+
+```{code-cell} python3
 @jax.jit
 def state_action_values(model, i, v):
     """
@@ -658,8 +659,8 @@ cdf = jnp.cumsum(q_default)
 def compute_stopping_time(w_bar, key):
     def body_fun(state):
         t, key, done = state
-        key, subkey = jr.split(key)
-        u = jr.uniform(subkey)
+        key, subkey = jax.random.split(key)
+        u = jax.random.uniform(subkey)
         w = w_default[jnp.searchsorted(cdf, u)]
         done = w >= w_bar
         t = jnp.where(done, t, t + 1)
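The draw inside `body_fun` is inverse-transform sampling: a uniform variate is pushed through the cumulative distribution via `searchsorted`. A standalone sketch of just that step, assuming the lecture's `w_default`, `q_default`, and `cdf = jnp.cumsum(q_default)`:

```python
key = jax.random.PRNGKey(0)
u = jax.random.uniform(key)   # u ~ Uniform(0, 1)
i = jnp.searchsorted(cdf, u)  # first index with cdf[i] >= u
w = w_default[i]              # a draw from the distribution q_default
```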
@@ -675,8 +676,8 @@ def compute_stopping_time(w_bar, key):
 
 @jax.jit
 def compute_mean_stopping_time(w_bar, num_reps=100000, seed=1234):
-    key = jr.PRNGKey(seed)
-    keys = jr.split(key, num_reps)
+    key = jax.random.PRNGKey(seed)
+    keys = jax.random.split(key, num_reps)
     obs = jax.vmap(compute_stopping_time, in_axes=(None, 0))(w_bar, keys)
     return jnp.mean(obs)
 
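Here `in_axes=(None, 0)` tells `vmap` to broadcast the scalar `w_bar` while mapping over the batch of keys, so all `num_reps` simulations run vectorized. A hypothetical call (the reservation wage value is made up):

```python
mean_t = compute_mean_stopping_time(35.0)  # mean periods until an offer is accepted
```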
@@ -776,8 +777,8 @@ class McCallModelContinuous(NamedTuple):
     w_draws: jnp.ndarray    # draws of wages for Monte Carlo
 
 def create_mccall_continuous(c=25, β=0.99, σ=0.5, μ=2.5, mc_size=1000, seed=1234):
-    key = jr.PRNGKey(seed)
-    s = jr.normal(key, (mc_size,))
+    key = jax.random.PRNGKey(seed)
+    s = jax.random.normal(key, (mc_size,))
     w_draws = jnp.exp(μ + σ * s)
     return McCallModelContinuous(c=c, β=β, σ=σ, μ=μ, w_draws=w_draws)
 
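Since `s` is standard normal, `jnp.exp(μ + σ * s)` yields lognormal wage draws with parameters (μ, σ). A quick sanity check on the Monte Carlo draws (hypothetical usage):

```python
mcc = create_mccall_continuous(μ=2.5, σ=0.5)
print(mcc.w_draws.mean())  # ≈ exp(μ + σ²/2) ≈ 13.8 for these parameters
```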