|
| 1 | +--- |
| 2 | +format: |
| 3 | + html: |
| 4 | + toc: false |
| 5 | +--- |
| 6 | + |
| 7 | +## Welcome |
| 8 | + |
| 9 | +These are the materials for workshops on [tidymodels](https://www.tidymodels.org/). This workshop provides an introduction to machine learning with R using the tidymodels framework, a collection of packages for modeling and machine learning using [tidyverse](https://www.tidyverse.org/) principles. We will build, evaluate, compare, and tune predictive models. Along the way, we'll learn about key concepts in machine learning including overfitting, resampling, and feature engineering. Learners will gain knowledge about good predictive modeling practices, as well as hands-on experience using tidymodels packages like parsnip, rsample, recipes, yardstick, tune, and workflows. |
| 10 | + |
| 11 | +## Is this workshop for me? <img src="slides/images/parsnip-flagger.jpg" align="right" height="150"/> |
| 12 | + |
| 13 | +This course assumes intermediate R knowledge. This workshop is for you if: |
| 14 | + |
| 15 | +- You can use the magrittr pipe `%>%` and/or native pipe `|>` |
| 16 | +- You are familiar with functions from dplyr, tidyr, and ggplot2 |
| 17 | +- You can read data into R, transform and reshape data, and make a wide variety of graphs |
| 18 | + |
| 19 | +We expect participants to have some exposure to basic statistical concepts, but NOT intermediate or expert familiarity with modeling or machine learning. |
| 20 | + |
| 21 | +## Preparation |
| 22 | + |
| 23 | +Please join the workshop with a computer that has the following installed (all available for free): |
| 24 | + |
| 25 | +- A recent version of R, available at <https://cran.r-project.org/> |
| 26 | +- A recent version of RStudio Desktop (RStudio Desktop Open Source License, at least v2022.02), available at <https://www.rstudio.com/download> |
| 27 | +- The following R packages, which you can install from the R console: |
| 28 | + |
| 29 | +```{r} |
| 30 | +#| eval: false |
| 31 | +#| echo: true |
| 32 | + |
| 33 | +# First, install the pak package: |
| 34 | + |
| 35 | +install.packages("pak") |
| 36 | + |
| 37 | +# Then the packages for both days |
| 38 | +pkgs <- |
| 39 | + c("bonsai", "doParallel", "finetune", "lightgbm", "lme4", "plumber", |
| 40 | + "probably", "ranger", "rpart", "rpart.plot", "stacks", "textrecipes", |
| 41 | + "tidymodels", "tidymodels/modeldatatoo", "vetiver") |
| 42 | +pak::pak(pkgs) |
| 43 | +``` |
| 44 | + |
| 45 | +## Slides |
| 46 | + |
| 47 | +These slides are designed for use with live teaching and are published for workshop participants' convenience. They are not meant as standalone learning materials. For that, we recommend [tidymodels.org](https://www.tidymodels.org/start/) and [*Tidy Modeling with R*](https://www.tmwr.org/). |
| 48 | + |
| 49 | +### Introduction to tidymodels |
| 50 | + |
| 51 | +- 01: [Introduction](slides/01-introduction.html){target="_blank"} |
| 52 | +- 02: [Your data budget](slides/02-data-budget.html){target="_blank"} |
| 53 | +- 03: [What makes a model?](slides/03-what-makes-a-model.html){target="_blank"} |
| 54 | +- 04: [Evaluating models](slides/04-evaluating-models.html){target="_blank"} |
| 55 | + |
| 56 | +### Advanced tidymodels |
| 57 | + |
| 58 | +- 01: [Feature engineering using recipes](slides/advanced-01-feature-engineering.html){target="_blank"} |
| 59 | +- 02: [Tuning hyperparameters (grid search)](slides/advanced-02-tuning-hyperparameters.html){target="_blank"} |
| 60 | +- 03: [Grid search via racing](slides/advanced-03-racing.html){target="_blank"} |
| 61 | +- 04: [Iterative search](slides/advanced-04-iterative.html){target="_blank"} |
| 62 | + |
| 63 | +### Extra content (time permitting) |
| 64 | + |
| 65 | +- [Transit Case Study (includes stacking)](slides/extras-transit-case-study.html){target="_blank"} |
| 66 | +- [Effect encodings](slides/extras-effect-encodings.html){target="_blank"} |
| 67 | + |
| 68 | + |
| 69 | +There's also a page for [slide annotations](slides/annotations.html){target="_blank"}; these are extra notes for selected slides. |
| 70 | + |
| 71 | +## Code |
| 72 | + |
| 73 | +Quarto files (version `r system("quarto --version", intern = TRUE)`) for working along [are available on GitHub](https://github.com/tidymodels/workshops/tree/main/classwork). (Don't worry if you haven't used Quarto before; it will feel familiar to R Markdown users.) |
| 74 | + |
| 75 | +## Past workshops |
| 76 | + |
| 77 | +- [July 2022](archive/2022-07-RStudio-conf/index.html) at [rstudio::conf()](https://posit.co/blog/talks-and-workshops-from-rstudio-conf-2022/) |
| 78 | +- [August 2022](archive/2022-08-Reykjavik-City/) in Reykjavik |
| 79 | +- [July 2023](archive/2023-07-nyr/) at the New York R Conference |
| 80 | + |
| 81 | +## Acknowledgments {.appendix} |
| 82 | + |
| 83 | +This website, including the slides, is made with [Quarto](https://quarto.org/). Please [submit an issue](https://github.com/tidymodels/workshops/issues) on the GitHub repo for this workshop if you find something that could be fixed or improved. |
| 84 | + |
| 85 | +## Reuse and licensing {.appendix} |
| 86 | + |
| 87 | +Unless otherwise noted (i.e. not an original creation and reused from another source), these educational materials are licensed under Creative Commons Attribution-ShareAlike [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/). |
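
As a follow-up to the installation step in the Preparation section, participants may want to confirm that everything installed before the workshop starts. The snippet below is a minimal sketch and not part of the workshop materials. It assumes the package list from the install chunk, and that the GitHub source `tidymodels/modeldatatoo` installs under the package name `modeldatatoo`. The names `installed` and `missing_pkgs` are illustrative.

```r
# Package names copied from the install step. "modeldatatoo" is assumed to be
# the installed name of the GitHub source "tidymodels/modeldatatoo".
pkgs <- c("bonsai", "doParallel", "finetune", "lightgbm", "lme4", "plumber",
          "probably", "ranger", "rpart", "rpart.plot", "stacks", "textrecipes",
          "tidymodels", "modeldatatoo", "vetiver")

# requireNamespace() returns TRUE when a package can be loaded,
# without attaching it to the search path.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report any packages that still need to be installed.
missing_pkgs <- names(installed)[!installed]
if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All workshop packages are installed.")
}
```

If anything is reported as missing, re-running the `pak::pak()` call from the Preparation section for just those packages is one reasonable next step.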