Commit 8a5b8b3

improve docs and errors re: model formulas (#1015)
1 parent 975a703 commit 8a5b8b3

7 files changed: +216 -5 lines changed

DESCRIPTION

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 Package: parsnip
 Title: A Common API to Modeling and Analysis Functions
-Version: 1.1.1.9000
+Version: 1.1.1.9001
 Authors@R: c(
     person("Max", "Kuhn", , "max@posit.co", role = c("aut", "cre")),
     person("Davis", "Vaughan", , "davis@posit.co", role = "aut"),

NEWS.md

Lines changed: 3 additions & 1 deletion
@@ -1,8 +1,10 @@
 # parsnip (development version)
 
+* Improved errors and documentation related to special terms in formulas. See `?model_formula` to learn more. (#770, #1014)
+
 * Improved errors in cases where the outcome column is mis-specified. (#1003)
 
-* Documentation fixed for `mlp(engine = "brulee")`: the default values for `learn_rate` and `epochs` were swapped (#1018).
+* Fixed documentation for `mlp(engine = "brulee")`: the default values for `learn_rate` and `epochs` were swapped (#1018).
 
 # parsnip 1.1.1

R/gen_additive_mod.R

Lines changed: 17 additions & 1 deletion
@@ -92,5 +92,21 @@ translate.gen_additive_mod <- function(x, engine = x$engine, ...) {
 #' @export
 #' @keywords internal
 fit_xy.gen_additive_mod <- function(object, ...) {
-  rlang::abort("`fit()` must be used with GAM models (due to its use of formulas).")
+  trace <- rlang::trace_back()
+
+  if ("workflows" %in% trace$namespace) {
+    cli::cli_abort(
+      c("!" = "When working with generalized additive models, please supply the
+               model specification to {.fun workflows::add_model} along with a \\
+               {.arg formula} argument.",
+        "i" = "See {.help parsnip::model_formula} to learn more."),
+      call = NULL
+    )
+  }
+
+  cli::cli_abort(c(
+    "!" = "Please use {.fun fit} rather than {.fun fit_xy} to train \\
+           generalized additive models.",
+    "i" = "See {.help model_formula} to learn more."
+  ))
 }
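
As a rough illustration of the final branch above, the following sketch calls `fit_xy()` directly on a GAM specification. It assumes an installed parsnip that includes this commit (1.1.1.9001 or later); the predictor columns and the `gam_spec` and `err` names are chosen only for illustration.

```r
library(parsnip)

# A GAM specification using the "mgcv" engine.
gam_spec <-
  gen_additive_mod() %>%
  set_mode("regression") %>%
  set_engine("mgcv")

# fit_xy() takes no formula, so there is nowhere to place s() terms and the
# method is expected to abort. tryCatch() captures the condition so that the
# message can be inspected instead of halting the script.
err <- tryCatch(
  fit_xy(gam_spec, x = mtcars[, c("wt", "disp")], y = mtcars$mpg),
  error = function(cnd) cnd
)

cat(conditionMessage(err), "\n")
```

Capturing the condition with `tryCatch()` is just a convenience for inspecting the message; in interactive use the call would simply fail with the error.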

R/model_formula.R

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
+#' Formulas with special terms in tidymodels
+#'
+#' @description
+#'
+#' In R, formulas provide a compact, symbolic notation to specify model terms.
+#' Many modeling functions in R make use of ["specials"][stats::terms.formula],
+#' or nonstandard notations used in formulas. Specials are defined and handled as
+#' a special case by a given modeling package. For example, the mgcv package,
+#' which provides support for
+#' [generalized additive models][parsnip::gen_additive_mod] in R, defines a
+#' function `s()` to be in-lined into formulas. It can be used like so:
+#'
+#' ``` r
+#' mgcv::gam(mpg ~ wt + s(disp, k = 5), data = mtcars)
+#' ```
+#'
+#' In this example, the `s()` special defines a smoothing term that the mgcv
+#' package knows to look for when preprocessing model input.
+#'
+#' The parsnip package can handle most specials without issue. The analogous
+#' code for specifying this generalized additive model
+#' [with the parsnip "mgcv" engine][parsnip::details_gen_additive_mod_mgcv]
+#' looks like:
+#'
+#' ``` r
+#' gen_additive_mod() %>%
+#'   set_mode("regression") %>%
+#'   set_engine("mgcv") %>%
+#'   fit(mpg ~ wt + s(disp, k = 5), data = mtcars)
+#' ```
+#'
+#' However, parsnip is often used in conjunction with the greater tidymodels
+#' package ecosystem, which defines its own pre-processing infrastructure and
+#' functionality via packages like hardhat and recipes. The specials defined
+#' in many modeling packages introduce conflicts with that infrastructure.
+#'
+#' To support specials while also maintaining consistent syntax elsewhere in
+#' the ecosystem, **tidymodels delineates between two types of formulas:
+#' preprocessing formulas and model formulas**. Preprocessing formulas specify
+#' the input variables, while model formulas determine the model structure.
+#'
+#' @section Example:
+#'
+#' To create the preprocessing formula from the model formula, just remove
+#' the specials, retaining references to input variables themselves. For example:
+#'
+#' ```
+#' model_formula <- mpg ~ wt + s(disp, k = 5)
+#' preproc_formula <- mpg ~ wt + disp
+#' ```
+#'
+#' \itemize{
+#' \item **With parsnip,** use the model formula:
+#'
+#' ``` r
+#' model_spec <-
+#'   gen_additive_mod() %>%
+#'   set_mode("regression") %>%
+#'   set_engine("mgcv")
+#'
+#' model_spec %>%
+#'   fit(model_formula, data = mtcars)
+#' ```
+#'
+#' \item **With recipes**, use the preprocessing formula only:
+#'
+#' ``` r
+#' library(recipes)
+#'
+#' recipe(preproc_formula, mtcars)
+#' ```
+#'
+#' The recipes package supplies a large variety of preprocessing techniques
+#' that may replace the need for specials altogether, in some cases.
+#'
+#' \item **With workflows,** use the preprocessing formula everywhere, but
+#' pass the model formula to the `formula` argument in `add_model()`:
+#'
+#' ``` r
+#' library(workflows)
+#'
+#' wflow <-
+#'   workflow() %>%
+#'   add_formula(preproc_formula) %>%
+#'   add_model(model_spec, formula = model_formula)
+#'
+#' fit(wflow, data = mtcars)
+#' ```
+#'
+#' The workflow will then pass the model formula to parsnip, using the
+#' preprocessor formula elsewhere. We would still use the preprocessing
+#' formula if we had added a recipe preprocessor using `add_recipe()`
+#' instead of a formula via `add_formula()`.
+#'
+#' }
+#'
+#' @name model_formula
+NULL
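
The "remove the specials, keep the variables" rule described in the Example section can be sketched in base R. The helper below, `strip_specials()`, is hypothetical and not part of parsnip or this commit. It assumes a single, untransformed outcome and that every name appearing inside a special is an input column (so `s(disp, k = n_knots)` would wrongly pick up `n_knots`).

```r
# Hypothetical helper (not provided by parsnip): derive a preprocessing
# formula from a model formula by keeping only the variable names.
strip_specials <- function(model_formula) {
  # Left-hand side: the outcome variable.
  outcome <- all.vars(model_formula[[2]])
  stopifnot(length(outcome) == 1)

  # Right-hand side: all.vars() looks inside calls such as s(disp, k = 5)
  # and returns only the symbols used as values ("wt", "disp").
  predictors <- all.vars(model_formula[[3]])

  # Rebuild a plain additive formula from the variable names.
  stats::reformulate(predictors, response = outcome)
}

model_formula <- mpg ~ wt + s(disp, k = 5)
preproc_formula <- strip_specials(model_formula)
preproc_formula
```

For anything beyond simple cases (transformed outcomes, `.` on the right-hand side, interactions), writing the preprocessing formula by hand, as the documentation does, is likely the safer choice.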

_pkgdown.yml

Lines changed: 1 addition & 0 deletions
@@ -78,6 +78,7 @@ reference:
       - control_parsnip
       - glance.model_fit
       - model_fit
+      - model_formula
       - model_spec
       - multi_predict
       - parsnip_addin

man/model_formula.Rd

Lines changed: 94 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

tests/testthat/test_gen_additive_model.R

Lines changed: 2 additions & 2 deletions
@@ -26,7 +26,7 @@ test_that('regression', {
       y = mtcars$mpg,
       control = ctrl
     ),
-    regexp = "must be used with GAM models"
+    regexp = "to train generalized additive"
   )
   mgcv_mod <- mgcv::gam(mpg ~ s(disp) + wt + gear, data = mtcars, select = TRUE)
   expect_equal(coef(mgcv_mod), coef(extract_fit_engine(f_res)))
@@ -70,7 +70,7 @@ test_that('classification', {
       y = two_class_dat$Class,
       control = ctrl
     ),
-    regexp = "must be used with GAM models"
+    regexp = "to train generalized additive"
   )
   mgcv_mod <-
     mgcv::gam(Class ~ s(A, k = 10) + B,
