Skip to content

prompt on unavailable extension packages #732

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
May 31, 2022
Merged

prompt on unavailable extension packages #732

merged 22 commits into from
May 31, 2022

Conversation

simonpcouch
Copy link
Contributor

@simonpcouch simonpcouch commented May 22, 2022

Closes #731.

I've had a few moments of confusion as I've familiarized with parsnip where I unknowingly specify a model specification/engine/mode combination that requires an extension package, and only come across failures later on in a modeling pipeline. This seems to be the source of confusion in #544 as well.

This PR refines print methods to nudge users to load an extension package if it's needed.

# warns informatively when it ought to:
library(parsnip)

C5_rules()
#> parsnip could not locate an implementation for `C5_rules` classification model specifications using the `C5.0` engine.
#> ℹ The parsnip extension package rules implements support for this specification.
#> ℹ Please install (if needed) and load to continue.
#> 
#> C5.0 Model Specification (classification)
#> 
#> Computational engine: C5.0

C5_rules(engine = "C5.0") %>% translate()
#> parsnip could not locate an implementation for `C5_rules` classification model specifications using the `C5.0` engine.
#> ℹ The parsnip extension package rules implements support for this specification.
#> ℹ Please install (if needed) and load to continue.
#> 
#> C5.0 Model Specification (classification)
#> 
#> Computational engine: C5.0

# doesn't prompt and prints out the call as before if the implementing extension is loaded:
library(rules)

C5_rules()
#> C5.0 Model Specification (classification)
#> 
#> Computational engine: C5.0

C5_rules(engine = "C5.0") %>% translate()
#> C5.0 Model Specification (classification)
#> 
#> Computational engine: C5.0 
#> 
#> Model fit template:
#> rules::c5_fit(x = missing_arg(), y = missing_arg(), weights = missing_arg())

# otherwise behaves the same as before:
linear_reg() %>% translate()
#> Linear Regression Model Specification (regression)
#> 
#> Computational engine: lm 
#> 
#> Model fit template:
#> stats::lm(formula = missing_arg(), data = missing_arg(), weights = missing_arg())

linear_reg()
#> Linear Regression Model Specification (regression)
#> 
#> Computational engine: lm

Created on 2022-05-27 by the reprex package (v2.0.1)

[EDIT: updated reprex and explanation to reflect changes re: Julia's edits]

@juliasilge
Copy link
Member

I think this would make the most sense in an rlang::inform() vs. printing (and we have a lot of tools for handling the rlang messages).

@simonpcouch
Copy link
Contributor Author

Ah, thank you @juliasilge! I agree that rlang::inform likely makes more sense.

Just pushed those changes—they're a bit more invasive now but do result in a better interface.

still need to use suggested packages conditionally
@simonpcouch
Copy link
Contributor Author

Wrapping up for the night, though this isn't quite there.🌝🌚

@simonpcouch simonpcouch marked this pull request as draft May 22, 2022 23:09
drops the dedicated test, as it's failure/success seems to depend on whether rules is currently loaded
fixes:
* checking for unstated dependencies in examples WARNING
* checking Rd line widths NOTE
@simonpcouch simonpcouch marked this pull request as ready for review May 23, 2022 20:39
@simonpcouch simonpcouch requested a review from juliasilge May 23, 2022 20:40
Copy link
Contributor Author

@simonpcouch simonpcouch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick self-review to give some context for changes. :)

Copy link
Member

@juliasilge juliasilge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be a huge help to give more useful info in this situation! 🙌

I've got a bit of specific feedback and then some questions about the approach.

R/misc.R Outdated
Comment on lines 94 to 99
avail <-
get_from_env(spec_) %>%
dplyr::filter(mode %in% mode_, !!eng_cond)
pars <-
utils::read.delim(system.file("models.tsv", package = "parsnip")) %>%
dplyr::filter(model == spec_, !!eng_cond, mode %in% mode_, is.na(pkg))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an expensive check to do every time new_model_spec() is run (which is run a lot, I think)? The env is just there, but what about reading the file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly more expensive than our other checks. But—

  • It’s only run on initiation of new model types (i.e. never at set_engine, update, etc)
  • It’s possibly negligible compared to fit times:
library(parsnip)

# running the check
system.time(mod <- C5_rules())
#> parsnip could not locate an implementation for `C5_rules` classification model specifications using the `C5.0` engine.
#> ℹ The parsnip extension package rules implements support for this specification.
#> ℹ Please install (if needed) and load to continue.
#>    user  system elapsed 
#>   0.038   0.001   0.039

# doesn't run the check
system.time(mod <- mod %>% set_engine("C5.0"))
#>    user  system elapsed 
#>   0.008   0.000   0.009

# model fit to a lil fella
library(rules)

mtcars$cyl <- as.factor(mtcars$cyl)

system.time(mod %>% fit(cyl ~ ., mtcars))
#>    user  system elapsed 
#>   0.445   0.020   0.467

Created on 2022-05-25 by the reprex package (v2.0.1)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that it could be common to read this file multiple times seems like something to check and get other feedback on. We could read this file one time at .onLoad or something? Let's have one more person look at this part of the PR and see what they think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a good observation. After 9918043, parsnip now assigns the models.tsv system file as model_info_table internally so we can skip that load step on checks:

library(parsnip)
  
# now just assigns models.tsv to internal object:
head(parsnip:::model_info_table)
#>      model                mode engine      pkg
#> 1 bag_mars      classification  earth baguette
#> 2 bag_mars          regression  earth baguette
#> 3 bag_tree censored regression  rpart censored
#> 4 bag_tree      classification   C5.0 baguette
#> 5 bag_tree      classification  rpart baguette
#> 6 bag_tree          regression  rpart baguette

# running the check. used to be:
#>    user  system elapsed 
#>   0.038   0.001   0.039
system.time(mod <- C5_rules())
#> parsnip could not locate an implementation for `C5_rules` classification model specifications using the `C5.0` engine.
#> ℹ The parsnip extension package rules implements support for this specification.
#> ℹ Please install (if needed) and load to continue.
#>    user  system elapsed 
#>   0.033   0.001   0.033


# doesn't run the check. used to be:
#>    user  system elapsed 
#>   0.008   0.000   0.009
system.time(mod <- mod %>% set_engine("C5.0"))
#>    user  system elapsed 
#>   0.007   0.000   0.007


# model fit to a lil fella, for comparison
library(rules)

mtcars$cyl <- as.factor(mtcars$cyl)

system.time(mod %>% fit(cyl ~ ., mtcars))
#>    user  system elapsed 
#>   0.440   0.025   0.464

Created on 2022-05-27 by the reprex package (v2.0.1)

I opted not to ditch that models.tsv file entirely as it seems our machinery to keep it updated with the current docs system works well.

delineates the message from the model printer a bit better:

```
> C5_rules()
parsnip could not locate an implementation for `C5_rules` classification model specifications using the `C5.0` engine.
ℹ The parsnip extension package rules implements support for this specification.
ℹ Please install (if needed) and load to continue.

C5.0 Model Specification (classification)

Computational engine: C5.0
```
@simonpcouch simonpcouch requested a review from juliasilge May 25, 2022 15:38
R/misc.R Outdated
Comment on lines 94 to 99
avail <-
get_from_env(spec_) %>%
dplyr::filter(mode %in% mode_, !!eng_cond)
pars <-
utils::read.delim(system.file("models.tsv", package = "parsnip")) %>%
dplyr::filter(model == spec_, !!eng_cond, mode %in% mode_, is.na(pkg))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that it could be common to read this file multiple times seems like something to check and get other feedback on. We could read this file one time at .onLoad or something? Let's have one more person look at this part of the PR and see what they think.

@simonpcouch simonpcouch requested a review from topepo May 27, 2022 16:22
@simonpcouch
Copy link
Contributor Author

This should be ready for your review, @topepo!

One unresolved discussion that we'd especially appreciate your eye on: #732 (comment)

@simonpcouch
Copy link
Contributor Author

Note from meeting: thumbs up from Max to merge!

@simonpcouch simonpcouch merged commit 6820609 into main May 31, 2022
@simonpcouch simonpcouch deleted the translate-print branch May 31, 2022 16:21
@github-actions
Copy link

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Jun 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

translate() print method fails uninformatively when extension isn't loaded
3 participants