# Probability and random events {#probability-and-random-events}
***Probability*** is a method of mathematically modeling a random process
so that we can understand it and/or make predictions about its future
results. Probability is an essential tool for casinos, as well as for
banks, insurance companies, and any other businesses that manage risks.
::: {.goals data-latex=""}
**Chapter goals**
In this chapter we will learn how to:
- Model random events using the tools of probability
- Calculate and interpret marginal, joint, and conditional probabilities
- Interpret and use the assumptions of independence and equal outcome probability
:::
This chapter uses mathematical notation and terminology that you
have seen before but may need to review. If you have difficulty with
the math, please refer to the sections on [Sets](#sets) and [Functions](#functions) in the [Math Review](#math-review) appendix.
::: example
**Example application: Roulette**
We will develop ideas by considering the casino game of **Roulette**.
The picture below shows what a roulette wheel looks like.
![roulette wheel image](bin/roulette.png)
Source: <a href="https://www.vecteezy.com/free-vector/roulette">Roulette Vectors by Vecteezy</a>
Here are the rules:
- It features:
  - a ball
  - a spinning wheel with numbered, colored slots
  - a table on which to place bets
- The slots are numbered from 0 to 36:
  - Slot number 0 is green
  - 18 slots are red
  - 18 slots are black
- The picture above depicts an American roulette table, which has an
  additional green slot labeled "00". I will assume we have a European
  roulette table, which does not include the "00" slot.
- Players can place various bets on the table including:
- Red (ball lands in a red slot) pays \$1 per \$1 bet
- Black (ball lands in a black slot) pays \$1 per \$1 bet
- A straight bet on any specific number (ball lands on that number)
pays \$35 per \$1 bet
Like other casino games, a roulette game is an example of a random process.
Something will happen, it matters (to the players and the casino) what
will happen, but we don't know in advance what will happen.
:::
## Outcomes and events {#outcomes-and-events}
To build a probabilistic model of a random process, we start by defining
the ***outcome*** we are interested in. An outcome can be a simple yes/no
result, it can be a number, or it can be a much more complex object. The
outcome should be a complete description of the random process, in the
sense that everything we are interested in can be defined in terms of
the outcome.
::: example
**Outcomes in roulette**
The outcome of a single game of roulette can be defined as the number of
the slot in which the ball lands. Call that number $b$.
:::
The set of all possible outcomes is called the ***sample space***.
::: example
**The sample space in roulette**
The sample space for a game of roulette can be defined as
the set of all numbers the ball can land on:
$$\Omega = \{0,1,2,\ldots,36\}$$
This sample space has $|\Omega| = 37$ elements.
:::
Next, we define a set of ***events*** that we are interested in.
We can think of an event as either:
- A statement that is either true or false OR
- A subset of the sample space
These two concepts are equivalent, though the subset concept
makes the math clearer.
::: example
**Events in roulette**
These roulette events are well-defined for our sample space:
- Ball lands on 14:
$$b \in \{14\}$$
- Ball lands on red:
\begin{align}
b \in Red &= \left\{\begin{aligned}
& 1,3,5,7,9,12,14,16,18, \\
& 19,21,23,25,27,30,32,34,36 \\
\end{aligned}\right\}
\end{align}
- Ball lands on black:
\begin{align}
b \in Black &= \left\{\begin{aligned}
& 2,4,6,8,10,11,13,15,17, \\
& 20,22,24,26,28,29,31,33,35 \\
\end{aligned}\right\}
\end{align}
- Ball lands on one of the first 12 numbers:
$$b \in First12 = \{1,2,3,4,5,6,7,8,9,10,11,12\}$$
We could define many more events, depending on what bets
we are interested in.
:::
An event that contains only one outcome is called an
***elementary event***.
Since events are sets, we can use the terminology and mathematical
tools for sets.
::: example
**Relationships among events**
In our roulette example:
- Two events are *identical* $(A = B)$ if they contain exactly the same outcomes:
- The event "ball lands on 14" and "a bet on 14 wins"
are identical since $\{14\} = \{14\}$.
- Intuitively, identical means they are just two different ways of describing
the same event.
- An event *implies* another event $(A \subset B)$ if all of its outcomes are also in
the implied event
- The event "ball lands on 14" implies the event "ball lands on red"
since $\{14\} \subset Red$.
- When an event happens, any event it implies also happens.
- Two events are *disjoint* $(A \cap B = \emptyset)$
  if they share no outcomes:
- The events "ball lands on red" and "ball lands on black" are disjoint
since $Red \cap Black = \emptyset$.
- If two events are disjoint, they cannot both happen.
- But they can both fail to happen. For example, if the ball lands in the
green zero slot ($b = 0$), neither red nor black wins.
- Any two elementary events are either identical or disjoint:
- The events "ball lands on 14" and "ball lands on 25" are disjoint
since $\{14\} \cap \{25\} = \emptyset$.
:::
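Because events are sets, these relationships can be checked mechanically. The snippet below (a Python sketch, purely illustrative) encodes the roulette events as sets and verifies each claim from the example:

```python
# Roulette events from the example, represented as Python sets.
red = {1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36}
black = {2, 4, 6, 8, 10, 11, 13, 15, 17, 20, 22, 24, 26, 28, 29, 31, 33, 35}
lands_on_14 = {14}

# Identical: "ball lands on 14" and "a bet on 14 wins" are the same set.
assert lands_on_14 == {14}

# Implies: every outcome in {14} is also in Red, so {14} is a subset of Red.
assert lands_on_14 < red

# Disjoint: Red and Black share no outcomes.
assert red & black == set()

# But both can fail to happen: the green slot 0 is in neither event.
assert 0 not in red and 0 not in black

print("all relationships verified")
```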
## Probabilities {#probabilities}
Our final step is to define a ***probability distribution***
for this random process, which is a function that assigns
a number to each possible event. The number is called
the event's ***probability***.
Probabilities are always between zero and one (inclusive):
- If an event has probability zero, it definitely *will
not* happen
- If an event has probability strictly between zero and
one, it *might* happen.
- If an event has probability one, it definitely *will*
happen.
### The axioms of probability {#the-axioms-of-probability}
All valid probability distributions must obey the following three conditions, which are sometimes called the ***axioms*** of probability.
1. Probabilities are never negative:
$$\Pr(A) \geq 0$$
2. One of the outcomes will definitely happen:
$$\Pr(\Omega) = 1$$
3. For any two *disjoint* events $A$ and $B$, the probability that
   $A$ or $B$ happens is the sum of their individual probabilities:
$$\Pr(A \cup B) = \Pr(A) + \Pr(B)$$
Probability distributions have many other properties, but they
can all be derived from these three axioms.
::: example
**Outcome probabilities for a fair roulette game**
Let's assume that the roulette wheel is "fair" in the sense that each
outcome has the same probability. I should emphasize that this doesn't
have to be the case; it's just an assumption. But it's a reasonable one
here because casinos are required by law to run fair roulette wheels
and would face heavy penalties if they ran unfair ones. Later on, we
will use statistics to check whether a roulette wheel is fair.
Call that probability $p$:
$$p = \Pr(b = 0) = \Pr(b = 1) = \cdots = \Pr(b = 36)$$
To find the value of $p$ we use the rules of probability.
By rule #2 of probability, one of the outcomes will happen:
$$\Pr(\Omega) = 1$$
Since the different outcomes are disjoint, rule #3 implies that:
$$\underbrace{\Pr(\Omega)}_{1} = \underbrace{\Pr(\{0\})}_{p} + \underbrace{\Pr(\{1\})}_{p} + \cdots + \underbrace{\Pr(\{36\})}_{p}$$
Summarizing this equation:
$$1 = 37p$$
Solving for $p$ we get:
$$p = 1/37 \approx 0.027$$
That is, each of the 37 elementary events has a probability of $1/37$.
:::
Since this is an introductory course, our sample space will usually
contain a finite number of outcomes, as in our roulette example. In
that case, probability calculations are pretty simple:
- Find the probability of each elementary event.
- To find the probability of a specific event, just add up the probabilities
of its elementary events.
::: example
**Event probabilities for a fair roulette game**
In the roulette example, the probability of any event $A$ is just the number
of outcomes in $A$ multiplied by the probability of each outcome, $1/37$:
$$\Pr(A) = |A| \times 1/37$$
The notation $|A|$ just means the size of (number of elements in) the set $A$.
For example, letting $Even$ denote the event that the ball lands on an
even number other than zero:
$$\Pr(b=25) = |\{25\}| \times 1/37 = 1/37 \approx 0.027$$
$$\Pr(Red) = |Red| \times 1/37 = 18/37 \approx 0.486$$
$$\Pr(Even) = |Even| \times 1/37 = 18/37 \approx 0.486$$
$$\Pr(First12) = |First12| \times 1/37 = 12/37 \approx 0.324$$
:::
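For a finite sample space with equally likely outcomes, this counting recipe is easy to automate. Here is a minimal Python sketch (the helper name `prob` is ours, not standard):

```python
from fractions import Fraction

# European roulette: 37 equally likely slots, numbered 0 to 36.
omega = set(range(37))
red = {1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36}
first12 = set(range(1, 13))

def prob(event):
    """Pr(A) = |A| / |Omega| under equal outcome probabilities."""
    return Fraction(len(event), len(omega))

print(prob({25}))      # 1/37
print(prob(red))       # 18/37
print(prob(first12))   # 12/37
print(prob(omega))     # 1, as required by axiom 2
```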
However, not all sample spaces contain a finite number of outcomes. For
example, suppose we are interested in using probability to model the
unemployment rate, or a person's income. Those are real numbers, and can
take on any of an infinite number of values. This adds a few complications,
and is the reason that the probability axioms refer to events (sets of
outcomes) and not individual outcomes.
::: {.fyi data-latex=""}
**What do probabilities really mean?**
What does it really mean to say that the probability of the ball landing in
a red slot is about 0.486? That's actually a tough question. There
are two standard interpretations for probabilities:
- *Frequentist or classical interpretation*: we are thinking of
the random process as something that could be repeated many times,
and the probability of an event is the approximate fraction of times
that the event will occur. That is, if you go to a casino and
bet 1000 times on Red, you will win about 486 times.
- *Bayesian or subjectivist interpretation*: the random process
is a one-time occurrence, but we have limited information about
it and the probability of an event represents the strength of
our belief that the event will happen.
The frequentist interpretation of probability is well-suited
for simple repeated settings like casino games or car insurance,
while the Bayesian interpretation makes more sense for things
like predicting election results.
:::
### Additional rules for probabilities {#some-rules-for-probabilities}
Let $A$ and $B$ be two events. Then our three axioms
of probability imply several additional rules:
- Probabilities cannot be higher than one.
$$\Pr(A) \leq 1$$
- Probabilities of identical events are identical:
$$A = B \implies \Pr(A) = \Pr(B)$$
- An event's probability is no larger than that of any event it implies:
$$A \subset B \implies \Pr(A) \leq \Pr(B)$$
- The probability of an event *not* happening is:
$$\Pr(A^C) = 1 - \Pr(A)$$
- The probability of nothing happening is:
$$\Pr(\emptyset) = 0$$
- The probability of either $A$ or $B$ happening is:
\begin{align}
\Pr(A \cup B) &= \Pr(A) + \Pr(B) - \Pr(A \cap B) \\
&\leq \Pr(A) +\Pr(B)
\end{align}
These results are not hard to prove, but I will not go through the proofs.
However, I will use these results so you should be familiar with them.
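Each of these derived rules can be confirmed numerically in the roulette model. A short Python check (illustrative; `prob` is our own helper, and $Even$ is the event that the ball lands on a nonzero even number):

```python
from fractions import Fraction

# Fair European roulette: 37 equally likely outcomes.
omega = set(range(37))
red = {1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36}
even = {n for n in omega if n != 0 and n % 2 == 0}   # the Even bet excludes 0

def prob(event):
    return Fraction(len(event), len(omega))

# Complement rule: Pr(not A) = 1 - Pr(A).
assert prob(omega - red) == 1 - prob(red)

# Implied events: {14} implies Red, so its probability is no larger.
assert prob({14}) <= prob(red)

# Union rule: Pr(A or B) = Pr(A) + Pr(B) - Pr(A and B).
assert prob(red | even) == prob(red) + prob(even) - prob(red & even)

print(prob(red | even))   # 28/37
```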
## Joint and conditional probabilities {#related-events}
We are often interested in more than one event, and want to talk about how
they are related. For example:
- In some casino games like poker or blackjack, players take an additional
action after partial information about the outcome is revealed.
- Politicians often use polls to predict the winner of an election.
- Finance people often want to model multiple market scenarios and
forecast a company's earnings under each of them.
- Economists often have data on current economic conditions and want
to predict future economic conditions.
This section will develop some tools for dealing with the relationship
between different random events.
### Joint probabilities {#joint-probabilities}
The ***joint probability*** of two events $A$ *and* $B$ is the probability that
they *both* happen:
$$\Pr(A \cap B)$$
Remember that the intersection ($\cap$) of $A$ and $B$ is the set of all
outcomes that are in both $A$ and $B$.
::: example
**Joint probabilities for roulette bets**
Consider two events for a game of roulette:
\begin{align}
Red &= \left\{\begin{aligned}
& 1,3,5,7,9,12,14,16,18, \\
& 19,21,23,25,27,30,32,34,36 \\
\end{aligned}\right\} \\
Even &= \left\{\begin{aligned}
& 2,4,6,8,10,12,14,16,18, \\
& 20,22,24,26,28,30,32,34,36 \\
\end{aligned}\right\}
\end{align}
Suppose you are interested in the probability that the ball lands on a number that is both
red and even. This event is just the intersection of $Red$ and $Even$
so this joint probability is:
\begin{align}
\Pr(Red \cap Even) &= \Pr(\{12,14,16,18,30,32,34,36\}) \\
&= 8/37 \\
&\approx 0.216
\end{align}
:::
Joint probabilities are just probabilities, so they obey all of the axioms
and rules of probability described in Section \@ref(probabilities).
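In code, a joint probability is simply the probability of a set intersection. A minimal Python sketch of the $Red \cap Even$ calculation above (illustrative; `prob` is our own helper):

```python
from fractions import Fraction

omega = set(range(37))
red = {1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36}
even = {n for n in omega if n != 0 and n % 2 == 0}

def prob(event):
    return Fraction(len(event), len(omega))

# Pr(Red and Even): count the outcomes in the intersection.
red_and_even = red & even
print(sorted(red_and_even))   # [12, 14, 16, 18, 30, 32, 34, 36]
print(prob(red_and_even))     # 8/37
```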
### Conditional probabilities {#conditional-probabilities}
The ***conditional probability*** of an event $A$ ***given*** another
event $B$ is defined as:
$$\Pr(A|B) = \frac{\Pr(A \cap B)}{\Pr(B)}$$
The conditional probability answers the question: if we already know that
$B$ is true, what are the chances that $A$ is true?
Conditional probabilities are very important when playing poker. At the beginning
of the game, every player has equal chance of having a winning hand.
But that is no longer true after you see your cards - having "good" cards increases
your chance of winning, and having "bad" cards decreases that chance. In other
words, your bet should be based on $\Pr(win|cards)$ rather than $\Pr(win)$. Good
poker players have detailed knowledge of these conditional probabilities.
::: example
**Conditional probabilities in roulette**
In our roulette example:
$$\Pr(Red|Even) = \frac{\Pr(Red \cap Even)}{\Pr(Even)} = \frac{8/37}{18/37} \approx 0.444$$
$$\Pr(b = 14|Even) = \frac{\Pr(\{14\} \cap Even)}{\Pr(Even)} = \frac{1/37}{18/37} \approx 0.056$$
:::
Like joint probabilities, conditional probabilities are just probabilities,
so they obey all of the axioms and rules of probability described in
Section \@ref(probabilities).
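The defining formula translates directly into code. Here is an illustrative Python sketch (the helper names `prob` and `cond_prob` are ours) reproducing the two conditional probabilities from the example:

```python
from fractions import Fraction

omega = set(range(37))
red = {1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36}
even = {n for n in omega if n != 0 and n % 2 == 0}

def prob(event):
    return Fraction(len(event), len(omega))

def cond_prob(a, b):
    """Pr(A|B) = Pr(A and B) / Pr(B); only defined when Pr(B) > 0."""
    return prob(a & b) / prob(b)

print(cond_prob(red, even))    # 4/9  (= 8/18, about 0.444)
print(cond_prob({14}, even))   # 1/18 (about 0.056)
```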
### Independent events {#independent-events}
One common "trick" in modeling joint and conditional probabilities is to assume
that certain events are unrelated to each other. This can simplify the math
significantly.
We say that two events $A$ and $B$ are ***independent*** if their joint
probability is just the two individual probabilities multiplied together:
$$\Pr(A \cap B) = \Pr(A)\Pr(B)$$
We usually express independence with the notation $A \bot B$.
The definition of independence is not very intuitive, but we can clarify
it by doing a little math. Consider two independent events $A$
and $B$ that have nonzero[^301] probability. Then by the definition of
independence:
$$\Pr(A|B) = \frac{\Pr(A \cap B)}{\Pr(B)} = \frac{\Pr(A)\Pr(B)}{\Pr(B)} = \Pr(A)$$
By the same reasoning:
$$\Pr(B|A) = \Pr(B)$$
In other words, knowing that one of these events is true
tells you nothing useful about whether the other event is true.
[^301]: You may wonder: if it makes more sense to describe independence
in terms of conditional probabilities, why do we define it in
terms of joint probabilities? The key is the requirement that
the events have nonzero probability. When $B$ has zero
probability the conditional probability $\Pr(A|B)$ is not
well defined since its denominator is zero.
When would it be reasonable to assume events are independent?
The typical scenario is one in which there is simply no
physical or logical relationship between them, usually because
they are separated in time or space.
::: example
**Independence across roulette games**
We have already shown that events related to a *single* roulette game
are not necessarily independent. But the outcomes/events of
*two different* roulette games can be reasonably assumed to be independent
of one another.
Suppose that I bring \$100 to a casino this afternoon for a few games
of roulette. I bet all of my money on Red for the first game.
- If I lose, I am broke and stop playing.
- If I win, I keep all of my money (both my initial bet and my winnings)
on Red for the next spin.
- I keep playing until I run out of money.
After 3 games:
- If Red wins all 3 games, I have $w = \$800$.
- Otherwise, I have nothing ($w = \$0$).
What is the probability of each of these events? Since we can
assume that each game's outcome is independent, this is an easy
problem:
\begin{align}
\Pr(w = \$800) &= \Pr(Red_1 \cap Red_2 \cap Red_3) \\
&= \Pr(Red_1) \times \Pr(Red_2) \times \Pr(Red_3) (\#eq:indep) \\
&= (18/37) \times (18/37) \times (18/37) \approx 0.115 \\
\Pr(w = \$0) &= 1 - \Pr(w = \$800) \\
&\approx 0.885 \\
\end{align}
So we have an 11.5\% chance of winning big, and an 88.5\% chance
of going broke.
Very important: equation \@ref(eq:indep) only follows from the previous
equation because we have assumed the events $Red_1$, $Red_2$, and $Red_3$
are independent.
:::
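The product calculation in this example is easy to reproduce. A short Python sketch, which relies entirely on the assumed independence of the three spins:

```python
from fractions import Fraction

p_red = Fraction(18, 37)   # Pr(Red) on a single fair European spin

# Under independence, Pr(Red1 and Red2 and Red3) is the product of the three
# marginal probabilities. Without independence this step would be invalid.
p_win_all = p_red ** 3
p_broke = 1 - p_win_all

print(float(p_win_all))   # ≈ 0.115
print(float(p_broke))     # ≈ 0.885
```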
When is it *not* reasonable to assume that events are independent? In
almost any other case. Remember that events are defined in terms of
the same underlying outcome, so they are typically related unless
you have some very specific reason to assume otherwise.
::: example
**Independence within a roulette game?**
Consider the roulette events "Red wins" and "Even wins". We earlier
showed that the unconditional probability that Red wins is:
$$\Pr(Red) = 18/37 \approx 0.486$$
The conditional probability that Red wins given that Even wins is:
$$\Pr(Red|Even) = 8/18 \approx 0.444$$
Since $0.444 \neq 0.486$, these two events are *not* independent.
:::
A common mistake by students who are new to probability and
statistics is to take results that *only* apply under independence
and use them when there is no reason to believe that independence
holds. Don't make this mistake: independence is an assumption,
and one that can easily be incorrect.
### Law of total probability
In addition to the results we have already discussed, there are two
important results using conditional probabilities:
The first is the ***law of total probability*** which
is a rule for determining unconditional probabilities
from conditional probabilities:
$$\Pr(A) = \Pr(A|B)\Pr(B) + \Pr(A|B^c)\Pr(B^c)$$
The law of total probability allows us to create a set of scenarios,
calculate probabilities under each scenario, and then add them up. It
is useful when we are modeling random outcomes that occur in multiple
stages, for example a poker game or an energy company making a series of
investments to develop an oil field.
::: example
**The law of total probability in poker**
Suppose you are playing [Texas hold'em poker](https://en.wikipedia.org/wiki/Texas_hold_'em)
with a few friends, and the hand has one card left to deal (the "river"). If
the last card has a heart on it (25\% probability) you will have a flush and
win the hand with a probability you estimate to be 90\%. If not, you will
win with a probability you estimate to be 10\%. What are your overall chances
of winning?
The answer can be calculated using the law of total probability:
\begin{align}
\Pr(Win) &= \Pr(\textrm{Win}|\textrm{Hearts})\Pr(\textrm{Hearts})
+ \Pr(\textrm{Win}|\textrm{not Hearts})\Pr(\textrm{not Hearts}) \\
&= 0.9 \times 0.25 + 0.1 \times 0.75 \\
&= 0.3
\end{align}
So you have a 30\% chance of winning.
:::
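The same calculation as a tiny Python function (the name `total_prob` is ours, purely for illustration):

```python
def total_prob(p_a_given_b, p_b, p_a_given_not_b):
    """Law of total probability: Pr(A) = Pr(A|B)Pr(B) + Pr(A|not B)Pr(not B)."""
    return p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

# Poker example: 25% chance of a heart; win 90% with it, 10% without it.
p_win = total_prob(0.9, 0.25, 0.1)
print(p_win)   # ≈ 0.3
```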
### Bayes' law
The second is ***Bayes' law***, which is a rule for determining
conditional probabilities:
$$\Pr(A|B) = \frac{\Pr(B|A)\Pr(A)}{\Pr(B)}$$
Bayes' law is particularly useful in evaluating evidence, because
it allows us to restate one conditional probability in terms of
another.
Both the law of total probability and Bayes' law follow from the definition of
conditional probabilities. They are easy to prove, but I won't prove them
here. Instead, I will use an example to show how they can be useful.
::: example
**False positives in medical testing**
When someone is tested for a disease, the test comes back either
"positive" (the person has the disease) or "negative" (the person does
not have the disease). However, no test is perfect. Sometimes people
who do not have the disease test positive ("false positives") and
sometimes people who do have the disease test negative
("false negatives").
Let the event $T$ mean a particular patient tests positive
for a disease, and let the event $D$ mean that this patient
actually has the disease.
The *sensitivity* of the test is an infected patient's probability
of testing positive:
$$\Pr(T|D) = p$$
the *specificity* of the test is a healthy patient's probability
of testing negative:
$$\Pr(T^c|D^c) = q$$
and the *prevalence* of the infection is the probability that
a given patient has the disease:
$$\Pr(D) = d$$
Suppose that a patient has tested positive. What is the
probability that he has the disease, i.e., what is the value of
$\Pr(D|T)$?
This is a classic probability question, as it makes use
of Bayes' law and the law of total probability, and it has
obvious practical usage.
Since we want a conditional probability, we start by stating Bayes' law:
$$\Pr(D|T) = \frac{\Pr(T|D)\Pr(D)}{\Pr(T)}$$
Bayes' law will allow us to calculate $\Pr(D|T)$ if we can find
the components of the right side of this equation.
We already know that $\Pr(T|D)=p$ and $\Pr(D)=d$, so all we need
is to find $\Pr(T)$.
Since $\Pr(T)$ is an unconditional probability, we can use
the law of total probability:
$$\Pr(T) = \underbrace{\Pr(T|D)}_{p}
\underbrace{\Pr(D)}_{d}
+ \underbrace{\Pr(T|D^c)}_{1-q}
\underbrace{\Pr(D^c)}_{1-d}$$
Plugging these results into our formula we get:
$$\Pr(D|T) = \frac{pd}{pd + (1-q)(1-d)}$$
which is the result we need.
Now, let's try out some numbers. Suppose that false positives are rare
($q = 0.99$), and false negatives never happen ($p=1$).
- Suppose the disease itself is fairly common ($d = 0.10$). Then:
$$\Pr(D|T) = \frac{1 \times 0.1}{1 \times 0.1 + (1-0.99) \times (1-0.1)} \approx 0.917$$
- Suppose the disease itself is quite rare ($d = 0.001$). Then
$$\Pr(D|T) = \frac{1 \times 0.001}{1 \times 0.001 + (1-0.99) \times (1-0.001)} \approx 0.091$$
In other words, the exact same test has a very different meaning depending
on the prevalence in the population: when the disease is common a positive
test means a 91.7% chance of having the disease, and when the disease
is rare a positive test result means a 9.1% chance of having the disease.
This general issue (even a small false positive rate can have a big impact
when prevalence is low) appeared repeatedly in March and April of 2020. Several
studies by well-known researchers[^302] dramatically overestimated the early
prevalence of the COVID-19 virus and thus dramatically underestimated its
fatality rate. These studies were regularly cited as support by those who
wanted to substantially relax public health restrictions in April 2020,
and had substantial real world consequences.
:::
[^302]: If you are interested in learning more about this, an
[article in Science](https://www.sciencemag.org/news/2020/04/antibody-surveys-suggesting-vast-undercount-coronavirus-infections-may-be-unreliable)
provides an overview of the controversy, and a
[blog post by statistician Andrew Gelman](https://statmodeling.stat.columbia.edu/2020/04/19/fatal-flaws-in-stanford-study-of-coronavirus-prevalence/)
provides a thorough discussion
of the statistical issues.
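The medical-testing calculation above can be packaged as a small function. Here is an illustrative Python sketch (the function and argument names are ours):

```python
def prob_disease_given_positive(sensitivity, specificity, prevalence):
    """Pr(D|T) from Bayes' law, with Pr(T) from the law of total probability.

    sensitivity = Pr(T|D) = p, specificity = Pr(not T|not D) = q,
    prevalence = Pr(D) = d.
    """
    p_positive = (sensitivity * prevalence
                  + (1 - specificity) * (1 - prevalence))
    return sensitivity * prevalence / p_positive

# The same test (p = 1, q = 0.99) at two very different prevalences:
print(round(prob_disease_given_positive(1.0, 0.99, 0.10), 3))   # ≈ 0.917
print(round(prob_disease_given_positive(1.0, 0.99, 0.001), 3))  # ≈ 0.091
```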
## Chapter review {-#review-probability}
In this chapter we have learned the basic terminology and concepts of probability. You
may have seen a number of these terms and ideas in high school, but we are approaching
them at a higher level. Be sure to review these terms and concepts in detail, and
do the practice problems to test your knowledge.
Our next step is to take our general framework of outcomes and events, and apply
them to [random variables](#random-variables) - outcomes that are specifically
numerical.
## Practice problems {-#problems-probability}
Answers can be found in the [appendix](#answers-probability).
Most of these practice problems will be based on the casino game of *craps*. Craps
is played with a pair of 6-sided dice.
Players take turns rolling the dice, and the player currently rolling the dice
is called the "shooter". There are various bets - pass, don't
pass, come, don't come, field, place, buy - that can be placed on the
results of multiple rolls of the dice. These bets and their probability
calculations can be quite complex, so we will focus on "single roll" bets.
- A bet on "Snake Eyes" wins if the total showing on the dice is 2.
- A bet on "Yo" wins if the total showing on the dice is 11.
- A bet on "Boxcars" wins if the total showing on the dice is 12.
- A bet on "Field" wins if the total showing on the dice is 2, 3, 4, 9, 10,
11, or 12.
For this example, assume that
- One die is red and the other is white.
- Both dice are fair; that is, each side has equal probability.
- The dice are independent of one another.
An outcome for a single roll of the dice is a pair of numbers $(r,w)$
where $r$ is the amount showing on the red die, and $w$ is the amount
showing on the white die. For example, the outcome $(2,4)$ means that the
red die is showing 2 and the white die is showing 4.
**SKILL #1: Define outcomes and sample space for a simple example**
1. Let $\Omega$ be the sample space for the outcome of a single roll in craps.
a. Define $\Omega$ by enumeration.
b. Find the cardinality of $\Omega$.
2. Using enumeration, define the following events:
a. Yo wins
b. Snake eyes wins
c. Boxcars wins
d. Field wins
**SKILL #2: Use set theory to work with events**
3. Which of the following statements are true?
a. The events "Yo wins" and "Boxcars wins" are identical.
b. The events "Yo wins" and $(r,w) = (5,6)$ are identical.
c. The events "Boxcars wins" and $(r,w) = (6,6)$ are identical.
4. Which of the following statements are true?
a. The events "Yo wins" and "Boxcars wins" are disjoint.
b. The events "Yo wins" and "Field wins" are disjoint.
c. The events "Yo wins" and "Boxcars loses" are disjoint.
d. The events "Yo wins" and "Field loses" are disjoint.
5. Which of the following statements are true?
a. The event "Yo wins" implies the event "Boxcars wins".
b. The event "Yo wins" implies the event "Boxcars loses".
c. The event "Yo wins" implies the event "Field wins".
d. The event "Yo wins" implies the event "Field loses".
6. Which of the following are elementary events?
a. Yo wins.
b. Yo loses.
c. Boxcars wins.
d. Boxcars loses.
e. Field wins.
f. Field loses.
**SKILL #3: Calculate event probabilities from elementary event probabilities**
7. Calculate each of the following elementary event probabilities:
a. $(r,w) = (1,1)$
b. $(r,w) = (3,4)$
c. $(r,w) = (6,6)$
8. Find the probability of each of the following events:
a. A bet on Yo wins.
b. A bet on Snake eyes wins.
c. A bet on Boxcars wins.
d. A bet on Field wins.
**SKILL #4: Calculate joint and conditional probabilities**
9. Calculate each of the following joint probabilities:
a. $\Pr(\textrm{Yo wins} \cap \textrm{Boxcars wins})$
b. $\Pr(\textrm{Yo wins} \cap \textrm{Field wins})$
c. $\Pr(\textrm{Yo wins} \cap \textrm{Boxcars loses})$
10. Calculate each of the following conditional probabilities:
a. $\Pr(\textrm{Yo wins} | \textrm{Boxcars wins})$
b. $\Pr(\textrm{Yo wins} | \textrm{Field wins})$
c. $\Pr(\textrm{Yo wins} | \textrm{Boxcars loses})$
d. $\Pr(\textrm{Field wins} | \textrm{Yo wins})$
e. $\Pr(\textrm{Boxcars wins} | \textrm{Yo wins})$
11. Which of the following pairs of events are independent?
a. Yo wins and Boxcars wins.
b. Yo wins and Field wins.
c. Yo wins and Yo wins.
d. $r = 3$ and $r = 5$.
e. $r = 3$ and $w =5$.
**SKILL #5: Apply the axioms of probability**
12. Let $A$ be an event. Which of the following statements are true?
a. $\Pr(A) \geq 0$.
b. $\Pr(A) > 0$.
c. $\Pr(A) \leq 1$.
d. $\Pr(A) < 1$.
e. $\Pr(A^c) \geq 0$.
f. $\Pr(A^c) > 0$.
g. $\Pr(A^c) \leq 1$.
h. $\Pr(A^c) < 1$.
i. $\Pr(A^c) = 1 - \Pr(A)$.
13. Let $A$ and $B$ be two events. Which of the following statements
are true?
a. $\Pr(A \cup B) = \Pr(A) + \Pr(B)$.
b. $\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$.
c. $\Pr(A \cup B) \leq \Pr(A) + \Pr(B)$.
d. $\Pr(A \cap B) = \Pr(A)\Pr(B)$.
14. Let $A$ and $B$ be two disjoint events. Which of the following
statements are true?
a. $\Pr(A \cap B) = 0$.
b. $\Pr(A \cap B) = \Pr(A) + \Pr(B)$.
c. $\Pr(A \cup B) = 0$.
d. $\Pr(A \cup B) = \Pr(A) + \Pr(B)$.
e. $\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$.
f. $\Pr(A \cup B) \leq \Pr(A) + \Pr(B)$.
g. $\Pr(A \cap B) = \Pr(A)\Pr(B)$.
h. $\Pr(A | B) = 0$
15. Let $A$ and $B$ be two events such that $A \subset B$. Which of
the following statements are true?
a. $\Pr(A) \leq \Pr(B)$
b. $\Pr(A \cap B) = \Pr(A)$
c. $\Pr(A | B) = 1$
16. Let $A$ and $B$ be two independent events. Which of the following
statements are true?
a. $\Pr(A \cap B) = 0$.
b. $\Pr(A \cap B) = \Pr(A)\Pr(B)$.
c. $\Pr(A|B) = \Pr(A)$.