Description
We need a way to toggle whether sparse features should be produced or not. The general idea is fairly simple and could be done like so:
flowchart TD
A[Does model support sparse input?]
AA[don't make sparse data]
B[is there high sparsity in the data?]
BA[make sparse data]
BB[don't make sparse data]
A -->|Yes| B
A -->|No| AA
B -->|Yes| BA
B -->|No| BB
The hard part is finding a good default threshold for "is there high sparsity in the data?". We will investigate that in tidymodels/planning#34 and tidymodels/planning#33.
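The decision in the flowchart can be sketched as a single function. This is only an illustration in Python, not tidymodels code: the name `should_make_sparse`, its arguments, and the `threshold=0.5` default are all hypothetical, and picking a real default is exactly what the planning issues are meant to settle.

```python
def should_make_sparse(model_supports_sparse: bool,
                       sparsity: float,
                       threshold: float = 0.5) -> bool:
    """Return True when sparse data should be produced.

    `sparsity` is the estimated fraction of zero entries (0.0 to 1.0).
    `threshold` is a placeholder value, not a recommended default.
    """
    # First branch of the flowchart: a model without sparse support
    # never gets sparse data, no matter how sparse the data is.
    if not model_supports_sparse:
        return False
    # Second branch: produce sparse data only when sparsity is high.
    return sparsity >= threshold
```

For example, `should_make_sparse(True, 0.9)` takes the "make sparse data" path, while `should_make_sparse(False, 0.9)` does not.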
We will likewise get helpers from {recipes} in tidymodels/recipes#1397 to aid in this decision, since determining the sparsity of a recipe should live in the recipes package. The sparsity calculator itself will be located in {sparsevctrs}: r-lib/sparsevctrs#82.
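To make "sparsity" concrete, here is a minimal sketch that treats it as the fraction of zero entries across a set of numeric columns. That definition is an assumption, and `estimate_sparsity` is a hypothetical name, not the {sparsevctrs} or {recipes} API; the real calculator is being worked out in the issues linked above.

```python
from typing import Sequence


def estimate_sparsity(columns: Sequence[Sequence[float]]) -> float:
    """Fraction of entries equal to zero across all columns.

    Assumes sparsity means "share of zeros"; returns 0.0 for empty input.
    """
    total = sum(len(col) for col in columns)
    if total == 0:
        # No cells at all: report no sparsity rather than dividing by zero.
        return 0.0
    zeros = sum(1 for col in columns for value in col if value == 0)
    return zeros / total
```

With two columns `[0, 0, 1]` and `[0, 2, 0]`, four of the six entries are zero, so the estimate is 4/6.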
Another wrinkle is that this determination will be preprocessor dependent. Non-recipes preprocessors are easier to handle.