Commit 532e7f6

author
Wu Jianxiao
committed
Fix typos in chpt 09 10 and 13
1 parent a0de4ff commit 532e7f6

6 files changed (+11, -7 lines)

notebooks/09-StatisticalPower.ipynb

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@
 "source": [
 "## Power analysis\n",
 "\n",
-"We can compute a power analysis using functions from the `statsmodels.stats.power` package. Let's focus on the power for an independent samples t-test in order to determine a difference in the mean between two groups. Let's say that we think than an effect size of Cohen's d=0.5 is realistic for the study in question (based on previous research) and would be of scientific interest. We wish to have 80% power to find the effect if it exists. We can compute the sample size needed for adequate power using the `TTestIndPower()` function:"
+"We can compute a power analysis using functions from the `statsmodels.stats.power` package. Let's focus on the power for an independent samples t-test in order to determine a difference in the mean between two groups. Let's say that we think that an effect size of Cohen's d=0.5 is realistic for the study in question (based on previous research) and would be of scientific interest. We wish to have 80% power to find the effect if it exists. We can compute the sample size needed for adequate power using the `TTestIndPower()` function:"
 ]
 },
 {

notebooks/09-StatisticalPower.py

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@
 # %% [markdown]
 # ## Power analysis
 #
-# We can compute a power analysis using functions from the `statsmodels.stats.power` package. Let's focus on the power for an independent samples t-test in order to determine a difference in the mean between two groups. Let's say that we think than an effect size of Cohen's d=0.5 is realistic for the study in question (based on previous research) and would be of scientific interest. We wish to have 80% power to find the effect if it exists. We can compute the sample size needed for adequate power using the `TTestIndPower()` function:
+# We can compute a power analysis using functions from the `statsmodels.stats.power` package. Let's focus on the power for an independent samples t-test in order to determine a difference in the mean between two groups. Let's say that we think that an effect size of Cohen's d=0.5 is realistic for the study in question (based on previous research) and would be of scientific interest. We wish to have 80% power to find the effect if it exists. We can compute the sample size needed for adequate power using the `TTestIndPower()` function:
 
 # %%
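
The code cell that follows this paragraph is not part of the diff. As a minimal sketch of the calculation the paragraph describes, the snippet below solves for the per-group sample size with `TTestIndPower.solve_power`. The effect size and power come from the text; the two-sided alpha of 0.05, the equal group sizes, and the variable names are assumptions made for illustration and may differ from the notebook.

```python
from statsmodels.stats.power import TTestIndPower

# Values stated in the notebook text
effect_size = 0.5    # Cohen's d
target_power = 0.8   # 80% power

# Assumption: conventional two-sided alpha, not stated in the diff
alpha = 0.05

power_analysis = TTestIndPower()

# nobs1 is left unset, so solve_power solves for the size of the first group
n_per_group = float(
    power_analysis.solve_power(
        effect_size=effect_size,
        alpha=alpha,
        power=target_power,
        ratio=1.0,                # assumption: equal group sizes
        alternative="two-sided",
    )
)

print(f"Required sample size per group: {n_per_group:.1f}")
```

Since a study cannot enroll a fraction of a participant, the result would normally be rounded up to the next whole number per group.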

notebooks/10-BayesianStatistics.ipynb

Lines changed: 3 additions & 1 deletion
@@ -9,7 +9,7 @@
 "\n",
 "## Applying Bayes' theorem: A simple example\n",
 "TBD: MOVE TO MULTIPLE TESTING EXAMPLE SO WE CAN USE BINOMIAL LIKELIHOOD\n",
-"A person has a cough and flu-like symptoms, and gets a PCR test for COVID-19, which comes back positive. What is the likelihood that they actually have COVID-19, as opposed a regular cold or flu? We can use Bayes' theorem to compute this. Let's say that the local rate of symptomatic individuals who actually are infected with COVID-19 is 7.4% (as [reported](https://twitter.com/Bob_Wachter/status/1281792549309386752/photo/1) on July 10, 2020 for San Francisco); thus, our prior probability that someone with symptoms actually has COVID-19 is .074. The RT-PCR test used to identify COVID-19 RNA is highly specific (that is, it very rarely reports the presence of the virus when it is not present); for our example, we will say that the specificity is 99%. Its sensitivity is not known, but probably is no higher than 90%. \n",
+"A person has a cough and flu-like symptoms, and gets a PCR test for COVID-19, which comes back positive. What is the likelihood that they actually have COVID-19, as opposed to a regular cold or flu? We can use Bayes' theorem to compute this. Let's say that the local rate of symptomatic individuals who actually are infected with COVID-19 is 7.4% (as [reported](https://twitter.com/Bob_Wachter/status/1281792549309386752/photo/1) on July 10, 2020 for San Francisco); thus, our prior probability that someone with symptoms actually has COVID-19 is .074. The RT-PCR test used to identify COVID-19 RNA is highly specific (that is, it very rarely reports the presence of the virus when it is not present); for our example, we will say that the specificity is 99%. Its sensitivity is not known, but probably is no higher than 90%. \n",
 "First let's look at the probability of disease given a single positive test."
 ]
 },
@@ -29,6 +29,8 @@
 "marginal_likelihood = sensitivity * prior + (1 - specificity) * (1 - prior)\n",
 "posterior = (likelihood * prior) / marginal_likelihood\n",
 "posterior\n",
+"\n",
+"\n",
 "\n"
 ]
 },

notebooks/10-BayesianStatistics.py

Lines changed: 3 additions & 1 deletion
@@ -19,7 +19,7 @@
 #
 # ## Applying Bayes' theorem: A simple example
 # TBD: MOVE TO MULTIPLE TESTING EXAMPLE SO WE CAN USE BINOMIAL LIKELIHOOD
-# A person has a cough and flu-like symptoms, and gets a PCR test for COVID-19, which comes back positive. What is the likelihood that they actually have COVID-19, as opposed a regular cold or flu? We can use Bayes' theorem to compute this. Let's say that the local rate of symptomatic individuals who actually are infected with COVID-19 is 7.4% (as [reported](https://twitter.com/Bob_Wachter/status/1281792549309386752/photo/1) on July 10, 2020 for San Francisco); thus, our prior probability that someone with symptoms actually has COVID-19 is .074. The RT-PCR test used to identify COVID-19 RNA is highly specific (that is, it very rarely reports the presence of the virus when it is not present); for our example, we will say that the specificity is 99%. Its sensitivity is not known, but probably is no higher than 90%.
+# A person has a cough and flu-like symptoms, and gets a PCR test for COVID-19, which comes back positive. What is the likelihood that they actually have COVID-19, as opposed to a regular cold or flu? We can use Bayes' theorem to compute this. Let's say that the local rate of symptomatic individuals who actually are infected with COVID-19 is 7.4% (as [reported](https://twitter.com/Bob_Wachter/status/1281792549309386752/photo/1) on July 10, 2020 for San Francisco); thus, our prior probability that someone with symptoms actually has COVID-19 is .074. The RT-PCR test used to identify COVID-19 RNA is highly specific (that is, it very rarely reports the presence of the virus when it is not present); for our example, we will say that the specificity is 99%. Its sensitivity is not known, but probably is no higher than 90%.
 # First let's look at the probability of disease given a single positive test.
 
 # %%
@@ -36,6 +36,8 @@
 
 
 
+
+
 # %% [markdown]
 # The high specificity of the test, along with the relatively high base rate of the disease, means that most people who test positive actually have the disease.
 # Now let's plot the posterior as a function of the prior. Let's first create a function to compute the posterior, and then apply this with a range of values for the prior.

notebooks/13-GeneralLinearModel.ipynb

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# The General Linear Model in R\n",
+"# The General Linear Model\n",
 "In this chapter we will explore how to fit general linear models in Python. We will focus on the tools provided by the `statsmodels` package."
 ]
 },

notebooks/13-GeneralLinearModel.py

Lines changed: 2 additions & 2 deletions
@@ -14,7 +14,7 @@
 # ---
 
 # %% [markdown]
-# # The General Linear Model in R
+# # The General Linear Model
 # In this chapter we will explore how to fit general linear models in Python. We will focus on the tools provided by the `statsmodels` package.
 
 # %%
@@ -95,7 +95,7 @@ def generate_linear_data(slope, intercept,
 import seaborn as sns
 import scipy.stats
 
-scipy.stats.probplot(ols_result.resid, plot=sns.mpl.pyplot)
+_ = scipy.stats.probplot(ols_result.resid, plot=sns.mpl.pyplot)
 
 # %% [markdown]
 # This looks pretty good, in the sense that the residual data points fall very close to the unit line. This is not surprising, since we generated the data with normally distributed noise. We should also plot the predicted (or *fitted*) values against the residuals, to make sure that the model does not work systematically better for some predicted values versus others.
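
The change to `_ = scipy.stats.probplot(...)` is presumably there because `probplot` returns a nested tuple, which a notebook would otherwise echo beneath the plot. The sketch below shows what that return value contains. It uses simulated normal residuals as a stand-in for `ols_result.resid`, which is not available outside the notebook; the seed, sample size, and variable names are assumptions for illustration.

```python
import numpy as np
import scipy.stats

# Stand-in for ols_result.resid: normally distributed noise (assumed values)
rng = np.random.default_rng(0)
residuals = rng.normal(loc=0.0, scale=1.0, size=200)

# Without plot=..., probplot only computes the Q-Q data and a least-squares fit.
# It returns ((theoretical quantiles, ordered values), (slope, intercept, r)).
(theoretical_q, ordered_resid), (slope, intercept, r) = scipy.stats.probplot(
    residuals, dist="norm"
)

print(f"Q-Q fit: slope={slope:.2f}, intercept={intercept:.2f}, r={r:.3f}")
```

Assigning the result to `_` in the notebook keeps the figure while discarding these arrays. A correlation `r` close to 1 corresponds to the points falling near the reference line.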
