This repository is archived as DATA 88S is no longer going to be taught at UC Berkeley. Materials are retained exclusively for reference and may not meet accessibility standards.
This textbook is built with Jupyter Book. It was originally built by Francie McQuarrie; Shahzar later reconfigured it for the updated version of Jupyter Book.
Only three files/directories need to be edited.
- `_config.yml`: Configuration information about the textbook. Modify this file for things like:
  - changing the logo or favicon;
  - adding or removing launch buttons;
  - changing information about the book.
- `_toc.yml`: Table of contents for the textbook. Modify this file for things like:
  - section and chapter numbering and order;
  - adding or removing sections or chapters.
- `content/`: Content of the textbook. All the notebooks with section and chapter content go here. Modify these files to actually change the content of the sections.
This section details how to maintain the textbook.
Follow these steps the first time you set up a computer to modify and maintain the textbook.
- Create a local copy of this repo by running `git clone https://github.com/stat88/textbook.git` from the command line in whichever folder you want to contain the textbook.
- Next, install all the required packages. Either `pip install -r requirements.txt` or `conda install --file requirements.txt` should work. If you have a Windows device, it's preferable to run this in an Anaconda Prompt terminal. This should install the two packages `jupyter-book` and `ghp-import`, which are used for building and deploying the textbook, respectively, along with other typical packages (e.g. `numpy`, `scipy`, `matplotlib`) used by the `content/` notebooks.
I ended up with an unusual configuration: a Python virtual environment `venv`, created with `python3 -m venv venv`, with the requirements from `requirements.txt` installed via `pip install -r requirements.txt` inside the activated environment. Even so, any bare `jupyter-book` command resolves to a separate Jupyter Book v2 installation, which takes precedence over the v1 in the virtual environment (v1 is required for this textbook):
(venv) (base) silascs@M29LX9JT4N textbook % which jupyter-book
/Users/silascs/.nvm/versions/node/v20.19.2/bin/jupyter-book
(venv) (base) silascs@M29LX9JT4N textbook % ./venv/bin/jupyter-book --version
Jupyter Book : 1.0.4.post1
External ToC : 1.0.1
MyST-Parser : 3.0.1
MyST-NB : 1.3.0
Sphinx Book Theme : 1.1.4
Jupyter-Cache : 1.0.1
NbClient : 0.10.4
Thus, I need to use ./venv/bin/jupyter-book in place of jupyter-book for all commands.
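The workaround above can be sketched as a small helper that prefers the virtual environment's copy of `jupyter-book` over whatever is first on `PATH`. This is only an illustration: the function name `resolve_jupyter_book` is hypothetical, and it assumes a POSIX-style venv layout (`venv/bin/`).

```python
import shutil
from pathlib import Path
from typing import Optional


def resolve_jupyter_book(venv_dir: str = "venv") -> Optional[str]:
    """Return the jupyter-book executable to use, preferring the venv copy."""
    # Assumption: POSIX-style venv layout, i.e. <venv_dir>/bin/jupyter-book.
    candidate = Path(venv_dir) / "bin" / "jupyter-book"
    if candidate.is_file():
        return str(candidate)
    # Fall back to the first match on PATH, which may be the wrong major version.
    return shutil.which("jupyter-book")
```

Calling `resolve_jupyter_book()` from inside `textbook/` should return `venv/bin/jupyter-book` when that file exists, and otherwise whatever `which jupyter-book` would report (or `None` if nothing is installed).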
These steps detail the process you should go through every time you update the textbook.
- Pull: `cd` into `textbook/`, your local copy of the textbook repo, and run `git pull origin master` to collect any updates which may have been pushed to the remote copy by other collaborators.
- Update: Make any changes you wish to make. This should (likely) only consist of changes to `_config.yml`, `_toc.yml`, and the files in `content/`.
  - If you added new sections or chapters, update `_toc.yml` as well to reflect your changes.
- Build: `cd` into the directory above `textbook/` (i.e. `cd ..`) and run `jupyter-book build textbook`.
- Check: Open the file `textbook/_build/html/index.html` in your browser to view what the textbook will look like with any changes you've made. Make sure nothing is broken and the changes are as you want them.
  - See the Troubleshooting section for any issues you may be having.
  - Take a look at the Issues for problematic parts of the textbook.
- Deploy: `cd` back into `textbook/` (`cd textbook/`) and run `ghp-import -n -p -f _build/html` (the `-n` flag is important, since it adds a `.nojekyll` file which allows GitHub to build the website correctly). This will push the `_build/html` folder to the `gh-pages` branch of the textbook repository, which is configured by GitHub Pages to hold the files for the textbook website. To edit these configurations, from the repository page, go to Settings > Pages.
- Push: Stage any changes you made (e.g. using `git add [file]`, `git add -u`, or `git add .`), commit your changes with `git commit -m "[description]"` (please include a useful description of any changes you made), and push to the `master` branch of the remote repository with `git push origin master`.
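As a rough sketch, the automatable parts of this workflow (pull, build, deploy) can be written down as a list of commands paired with the directory each one should run in. The helper names `update_steps` and `run_steps` are hypothetical; the commands themselves are the ones listed above. The manual steps (editing, checking the HTML in a browser, committing) are intentionally left out.

```python
import subprocess
from pathlib import Path


def update_steps(jupyter_book="jupyter-book"):
    """Return (working directory, command) pairs for pull, build, and deploy.

    Directories are relative to the folder that contains textbook/.
    Pass an absolute path for jupyter_book if the one on PATH is the wrong
    version, since each command runs in a different working directory.
    """
    return [
        ("textbook", ["git", "pull", "origin", "master"]),
        (".", [jupyter_book, "build", "textbook"]),
        ("textbook", ["ghp-import", "-n", "-p", "-f", "_build/html"]),
    ]


def run_steps(parent_dir, steps):
    """Run each command in order, stopping at the first failure."""
    for cwd, cmd in steps:
        subprocess.run(cmd, cwd=Path(parent_dir) / cwd, check=True)
```

In practice it is probably safer to run the steps one at a time, for example `run_steps("..", update_steps()[1:2])` for the build alone, so that the built HTML can be checked before anything is deployed.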
The Jupyter Book website has lots of information about Jupyter Book. Some useful pages are:
- Anatomy of a Jupyter Book
- Table of Contents
- Configuration Reference
- References and Cross-References
- Building
- Deploying
If changes you've made aren't showing up in the HTML after building, deleting `_build` and then building again sometimes helps. Jupyter Book usually rebuilds only the HTML of notebooks that it thinks have changed, so some edits can go unnoticed. Deleting the entire folder forces a build from scratch, which prevents any old files or code from sticking around. However, please see the third item in Issues and make sure you manually add the appropriate files to `_build/html/_images` when you do this.
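A minimal sketch of the "delete `_build`" step, using only the standard library. The function name `clean_build` is hypothetical; it removes the folder and nothing else, so the rebuild and the manual image copy described in Issues still have to be done afterwards.

```python
import shutil
from pathlib import Path


def clean_build(book_dir):
    """Delete <book_dir>/_build so the next build starts from scratch.

    Returns True if a folder was removed, False if there was nothing to delete.
    """
    build_dir = Path(book_dir) / "_build"
    if build_dir.is_dir():
        shutil.rmtree(build_dir)
        return True
    return False
```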
Links to the internet should be done as they are usually done in Markdown. However, to cross-link to other pages of the textbook, there is an internal linking system that should be used instead (since it is robust to file structure changes in /textbook). This system is described here.
For example, Section 12.4 Exercise 3 contains a link to Section 12.2.
- The flag `(ch12.2)=` was added before the primary header of the notebook.
(ch12.2)=
## The Distribution of the Estimated Slope ##
- The link to Section 12.2 was changed to `(ch12.2)`.
**3.**
Refer to the regression of active pulse rate on resting pulse rate in [Section 12.2](ch12.2). Here are the estimated values again, along with some additional data.
Ideally, every section and subsection should have a flag before the header or subheader. As of July 16, 2022, the only sections/subsections with flags are ones that are linked to by other sections:
- Section 6.2 (linked to by Section 7.1.3),
- Section 5.4 (linked to by Section 7.4 Exercise 4),
- Section 5.4.7 (linked to by Section 11.2),
- Section 12.2 (linked to by Section 12.4 Exercise 3).

Someone should go through the textbook at some point and add flags to all section and subsection headers to make cross-linking easier.
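To help with that audit, the following sketch scans notebooks for flag lines of the form `(name)=` in Markdown cells. It reads the `.ipynb` files as plain JSON, and the function names `find_labels` and `unlabeled_notebooks` are hypothetical. It only reports which notebooks define no flag at all; it does not check that every subsection header has one.

```python
import json
import re
from pathlib import Path

# A flag is a line of the form "(name)=" placed before a header.
LABEL_RE = re.compile(r"^\(([^()\s]+)\)=\s*$")


def find_labels(notebook_path):
    """Return the flags defined in the Markdown cells of one notebook."""
    nb = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    labels = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # Notebook JSON stores cell source as a string or a list of strings.
        text = source if isinstance(source, str) else "".join(source)
        for line in text.splitlines():
            match = LABEL_RE.match(line.strip())
            if match:
                labels.append(match.group(1))
    return labels


def unlabeled_notebooks(content_dir):
    """List notebooks under content_dir that define no flag at all."""
    return sorted(
        str(p) for p in Path(content_dir).rglob("*.ipynb") if not find_labels(p)
    )
```

For the Section 12.2 notebook described above, `find_labels` would be expected to return a list containing `ch12.2`.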
For ease of tracking, the sections that load in a dataset from data are enumerated:
- Section 9.4 loads in `baby.csv`;
- Section 12.2 loads in `pulse.csv`;
- Section 12.3 loads in `hodgkins.csv`;
- Section 12.4 loads in `pulse.csv`.
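If this list ever needs to be regenerated, one possible approach is to search the code cells of each notebook for `.csv` file names. The sketch below is an assumption about how that could be done, not how the list above was produced; `csv_references` is a hypothetical name, and a plain text search like this can miss datasets whose paths are built up in code.

```python
import json
import re
from pathlib import Path

# Matches path-like tokens ending in .csv, e.g. "../data/pulse.csv".
CSV_RE = re.compile(r"[\w./-]+\.csv")


def csv_references(notebook_path):
    """Return the sorted .csv file names mentioned in a notebook's code cells."""
    nb = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    found = set()
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        text = source if isinstance(source, str) else "".join(source)
        # Keep only the file name, dropping any directory prefix.
        found.update(Path(match).name for match in CSV_RE.findall(text))
    return sorted(found)
```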
The following is a list of somewhat specific cases of weird behavior throughout the textbook.
- The subheaders Arranging in a Line and Choosing Subsets in `Chapter_03/00_Random_Counts.ipynb` are done using HTML (`<h3> ... </h3>`) instead of Markdown (`### ... ###`), since using Markdown makes Jupyter Book label them as Section 13.1 and 13.2 in the textbook (which results in the actual Section 13.1 being displayed as Section 13.3 in the sidebar).
- The subheader IID Trials in `Chapter_04/00_Infinitely_Many_Values.ipynb` is done using HTML for the same reason as the subheaders in the bullet point above.
- Section 11.1's visual of four different bias/variance cases is done in HTML instead of Markdown so that the images can be displayed in a table format. This is why the `html_image` option is enabled under `myst_enable_extensions` in `_config.yml`. More importantly, when building the textbook, the images `content/images/bias_lvar.png`, `content/images/lbias_lvar.png`, `content/images/ubias_hvar.png`, and `content/images/ubias_lvar.png` are not automatically moved to the `_build/html/_images` folder. If those images are deleted from that folder and the textbook is rebuilt, they must be copied manually from `content/images/` to `_build/html/_images`.
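The manual copy described in the last bullet could be scripted roughly as follows. The four file names are the ones listed above; the function name `copy_bias_variance_images` is hypothetical, and the sketch assumes it is given the path to the `textbook/` folder after a build.

```python
import shutil
from pathlib import Path

# The four Section 11.1 images that the build does not copy automatically.
BIAS_VARIANCE_IMAGES = [
    "bias_lvar.png",
    "lbias_lvar.png",
    "ubias_hvar.png",
    "ubias_lvar.png",
]


def copy_bias_variance_images(book_dir):
    """Copy the Section 11.1 images from content/images/ to _build/html/_images."""
    src = Path(book_dir) / "content" / "images"
    dst = Path(book_dir) / "_build" / "html" / "_images"
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in BIAS_VARIANCE_IMAGES:
        # Raises FileNotFoundError if a source image is missing.
        shutil.copy2(src / name, dst / name)
        copied.append(name)
    return copied
```

Running `copy_bias_variance_images("textbook")` from the directory above `textbook/`, after building and before deploying, should leave the four images in place.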