Skip to content

improve prompting about missing implementations #793

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Sep 6, 2022
Merged

Conversation

simonpcouch
Copy link
Contributor

@simonpcouch simonpcouch commented Aug 17, 2022

Follows up on #732 and #778. The goal here is to tighten the logic for prompts about missing implementations. This PR eliminates false positives and makes sure we always suggest "registered" extension packages when they could be relevant.

To do so, we defer model specification checking for type / engine / mode combinations until we have some sort of clue that the user might be "done" specifying the model (If only we could <-.model_spec😭). This is at:

  • print
  • fit

...and functions that use either downstream. Some examples of how these prompts used to be raised:

library(parsnip)

# errors with a default engine, though doesn't call out baguette
bag_tree()
#> parsnip could not locate an implementation for `bag_tree` model specifications
#> using the `rpart` engine.
#> Bagged Decision Tree Model Specification (unknown)
#> 
#> Main Arguments:
#>   cost_complexity = 0
#>   min_n = 2
#> 
#> Computational engine: rpart

# bad: still includes default engine in error, bc triggered in bag_tree(),
# and still does not point to baguette
bag_tree() %>%
  set_engine("C5.0")
#> parsnip could not locate an implementation for `bag_tree` model specifications
#> using the `rpart` engine.
#> Bagged Decision Tree Model Specification (unknown)
#> 
#> Main Arguments:
#>   cost_complexity = 0
#>   min_n = 2
#> 
#> Computational engine: C5.0

# this should call out "censored regression" specifically, but does
# not because the error is triggered in bag_tree().
# also does not point to censored
bag_tree() %>%
  set_engine("rpart") %>%
  set_mode("censored regression")
#> parsnip could not locate an implementation for `bag_tree` model specifications
#> using the `rpart` engine.
#> Bagged Decision Tree Model Specification (censored regression)
#> 
#> Main Arguments:
#>   cost_complexity = 0
#>   min_n = 2
#> 
#> Computational engine: rpart

# this is well-defined:
library(censored)
#> Loading required package: survival

bag_tree() %>% 
  set_mode("censored regression") %>% 
  set_engine("rpart")
#> Bagged Decision Tree Model Specification (censored regression)
#> 
#> Main Arguments:
#>   cost_complexity = 0
#>   min_n = 2
#> 
#> Computational engine: rpart

# this errors differently when just censored is loaded
# ought to note that an implementation does live in baguette
bag_tree() %>%
  set_engine("C5.0")
#> Error in `check_spec_mode_engine_val()`:
#> ! Engine 'C5.0' is not supported for `bag_tree()`. See `show_engines('bag_tree')`.

# once baguette is loaded, this is well-defined:
library(baguette)

bag_tree() %>%
  set_engine("C5.0")
#> Bagged Decision Tree Model Specification (unknown)
#> 
#> Main Arguments:
#>   cost_complexity = 0
#>   min_n = 2
#> 
#> Computational engine: C5.0

Created on 2022-08-17 by the reprex package (v2.0.1)

Same code with the dev version:

library(parsnip)

bag_tree()
#> ! parsnip could not locate an implementation for `bag_tree` model
#>   specifications.
#> ℹ The parsnip extension packages censored and baguette implement support for
#>   this specification.
#> ℹ Please install (if needed) and load to continue.
#> Bagged Decision Tree Model Specification (unknown)
#> 
#> Main Arguments:
#>   cost_complexity = 0
#>   min_n = 2
#> 
#> Computational engine: rpart

# now includes correct engine in error
bag_tree() %>%
  set_engine("C5.0")
#> ! parsnip could not locate an implementation for `bag_tree` model
#>   specifications using the `C5.0` engine.
#> ℹ The parsnip extension package baguette implements support for this
#>   specification.
#> ℹ Please install (if needed) and load to continue.
#> Bagged Decision Tree Model Specification (unknown)
#> 
#> Main Arguments:
#>   cost_complexity = 0
#>   min_n = 2
#> 
#> Computational engine: C5.0

# calls out "censored regression" specifically
bag_tree() %>%
  set_engine("rpart") %>%
  set_mode("censored regression")
#> ! parsnip could not locate an implementation for `bag_tree` censored regression
#>   model specifications using the `rpart` engine.
#> ℹ The parsnip extension package censored implements support for this
#>   specification.
#> ℹ Please install (if needed) and load to continue.
#> Bagged Decision Tree Model Specification (censored regression)
#> 
#> Main Arguments:
#>   cost_complexity = 0
#>   min_n = 2
#> 
#> Computational engine: rpart

# this is well-defined:
library(censored)
#> Loading required package: survival

bag_tree() %>% 
  set_mode("censored regression") %>% 
  set_engine("rpart")
#> Bagged Decision Tree Model Specification (censored regression)
#> 
#> Main Arguments:
#>   cost_complexity = 0
#>   min_n = 2
#> 
#> Computational engine: rpart

# this now prompts informatively, as before censored was loaded
bag_tree() %>%
  set_engine("C5.0")
#> ! parsnip could not locate an implementation for `bag_tree` model
#>   specifications using the `C5.0` engine.
#> ℹ The parsnip extension package baguette implements support for this
#>   specification.
#> ℹ Please install (if needed) and load to continue.
#> Bagged Decision Tree Model Specification (unknown)
#> 
#> Main Arguments:
#>   cost_complexity = 0
#>   min_n = 2
#> 
#> Computational engine: C5.0

library(baguette)

# prompt goes away once baguette is loaded
bag_tree() %>%
  set_engine("C5.0")
#> Bagged Decision Tree Model Specification (unknown)
#> 
#> Main Arguments:
#>   cost_complexity = 0
#>   min_n = 2
#> 
#> Computational engine: C5.0

Created on 2022-08-30 by the reprex package (v2.0.1)

This also catches bugs related to how loading one unrelated extension would affect the prompts related to another. PR to extratests to make sure we test this here! There are also additional tests with decision trees there.

On fit(), this same message is elevated to an error.

Open to ideas if we have any other cues that a user may be "done" putting together a model spec—adding to a workflow is another instance, though we'll get "automatic" workflow support otherwise in that it wraps parsnip's print and fit methods. Same with parameter extraction.

…pec`

also, deprecates `check_missing_spec` argument
this code used to miss the edge case where a non-relevant extension package was loaded. e.g.:

```
bag_tree() %>%
  set_engine("C5.0")
#> parsnip could not locate an implementation for `bag_tree` model specifications
#> using the `C5.0` engine.
#> Bagged Decision Tree Model Specification (unknown)
#>
#> Main Arguments:
#>   cost_complexity = 0
#>   min_n = 2
#>
#> Computational engine: C5.0

library(censored)

bag_tree() %>%
  set_engine("C5.0")
#> Error in `check_spec_mode_engine_val()`:
#> ! Engine 'C5.0' is not supported for `bag_tree()`. See `show_engines('bag_tree')`.
```
also:

* fixes duplicated packages in prompt
* appends a newline regardless of whether a specific extension is recommended
* allows passing additional arguments to `prompt`
mostly undoing changes made for 732.
still need to test interactions with extensions--this will have to live in extratests
simonpcouch added a commit to tidymodels/extratests that referenced this pull request Aug 17, 2022
@simonpcouch simonpcouch marked this pull request as draft August 18, 2022 18:56
@simonpcouch
Copy link
Contributor Author

Note: this community post points out additional locations where we might want to drop these checks, none of which live in parsnip—workflows::add_model and workflows::extract_parameter_set_dials.workflow. May be worth exporting helpers.

@simonpcouch simonpcouch marked this pull request as ready for review August 30, 2022 15:57
Copy link
Member

@topepo topepo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll run this with TMwR and our workshop notes to test more.

Can you run revdeps?

@@ -37,8 +37,10 @@ bag_mars <-
args = args,
eng_args = NULL,
mode = mode,
user_specified_mode = !missing(mode),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes will have to propagate to the external extension packages too (e.g. model time)?

We'll also need to update the docs on tidymodels.org too (I believe).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

engine_filter_condition and mode_filter_condition (to be renamed re: comment), where these slots are processed, anticipate model objects from external extension packages not having this slot—in those cases, we bypass the additional prompts so that those external extensions fall back to the same errors as before. Those extensions can add the slot and get the prompting introduced in this PR "for free," since these prompts look to a full join of the model environment and model_info_table.

Good thought on the tidymodels.org docs. Will follow-up with a PR there.

@simonpcouch
Copy link
Contributor Author

Thanks for the review! Running revdeps and addressing comments ATM.

@simonpcouch
Copy link
Contributor Author

No new problems in revdeps, though I found a backward-compatibility issue that wasn't caught.

* transition `model_info_table` read to a helper
* `implementation_exists_somewhere` -> `spec_is_possible`
* comment on `*_filter_condition` helpers
* minimal testing for old/external objects
* comment on checking strategy
* name arguments to `prompt_missing_implementation`
* mark parsnip with `.pkg` cli tag
also, renames `has_loaded_implementation` -> `spec_is_loaded`
@simonpcouch
Copy link
Contributor Author

The most recent commit exports and documents the relationship between the different checking functions more thoroughly:

The helpers spec_is_possible(), spec_is_loaded(), and
prompt_missing_implementation() provide tooling for checking
model specifications. In addition to the cls, engine, and mode
arguments, the functions take arguments user_specified_engine and
user_specified_mode, denoting whether the user themselves has
specified the engine or mode, respectively.

spec_is_possible() checks against the union of

  • the current parsnip model environment and
  • the model_info_table of "pre-registered" model specifications

to determine whether a model is well-specified. See
parsnip:::read_model_info_table() for this table.

spec_is_loaded() checks only against the current parsnip model environment.

spec_is_possible() is executed automatically on new_model_spec(),
set_mode(), and set_engine(), and spec_is_loaded() is executed
automatically in print.model_spec(), among other places. spec_is_possible()
should be used when a model specification is still "in progress" of being
specified, while spec_is_loaded should only be called when parsnip or an
extension receives some indication that the user is "done" specifying a model
specification: at print, fit, addition to a workflow, or extract_*(), for
example.

When spec_is_loaded() is FALSE, the prompt_missing_implementation()
helper will construct an informative message to prompt users to load or
install needed packages. It's prompt argument refers to the prompting
function to use, usually cli::cli_inform or cli::cli_abort, and the
ellipses are passed to that function.

@simonpcouch simonpcouch requested a review from topepo August 31, 2022 18:15
@topepo topepo merged commit c54f07b into main Sep 6, 2022
@topepo topepo deleted the extension-prompt branch September 6, 2022 21:24
@github-actions
Copy link

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Sep 21, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants