Default factor levels when coercing from logical #185

Closed
@alejandroschuler

Hi folks, I'm not sure if forcats is the right place to raise this issue, so apologies in advance if this should go elsewhere.

I find it curious that factor(c(TRUE, TRUE)) produces a factor with a single level: TRUE. If we think of factors as categories whose levels are all known a priori, wouldn't it make sense for logicals converted to factors to preserve that a priori knowledge of the two levels TRUE and FALSE (in a consistent order), even when one of the two does not appear in the data? It's a trivial fix using levels=c("TRUE", "FALSE"), but it seems so philosophically obvious to me that I think it warrants being the default behavior. Perhaps something along the lines of:

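# Generic: dispatch on the class of x
fct <- function(x, ...) UseMethod("fct")
fct.default <- function(x, ...) base::factor(x, ...)
# Logical input: always keep both levels, in a fixed order
fct.logical <- function(x, ...) base::factor(x, levels = c("TRUE", "FALSE"))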
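
To make the behavior concrete, here is a minimal sketch in base R only (nothing here is forcats API). It shows that the levels are inferred from the values present unless they are given explicitly:

```r
# Base behavior: levels are inferred from the values present in x
x <- c(TRUE, TRUE)
inferred <- factor(x)
levels(inferred)   # only "TRUE"

# Explicit levels keep both categories, in a fixed order,
# even though FALSE never occurs in x
fixed <- factor(x, levels = c("TRUE", "FALSE"))
levels(fixed)      # "TRUE" then "FALSE"

# The unobserved level is still counted (with a count of zero)
counts <- table(fixed)
```

With the explicit levels, downstream code that tabulates or compares levels sees both categories regardless of which values happen to occur.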

This is sometimes a minor frustration for me when using tidymodels packages (and caret before that), which require a factor outcome for classification problems. Every now and again I end up in a corner case where I get errors like:

 Error: In metric: `yardstick::sens`
`truth` and `estimate` levels must be equivalent.
`truth`: TRUE, FALSE
`estimate`: TRUE
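
A sketch of how a mismatch like this can arise, using base R only. The vectors `truth_lgl` and `estimate_lgl` are hypothetical, and I am assuming the predictions happen to be TRUE for every row:

```r
# Hypothetical data: every prediction is TRUE
truth_lgl    <- c(TRUE, FALSE, TRUE)
estimate_lgl <- c(TRUE, TRUE, TRUE)

# truth gets explicit levels; estimate has its levels inferred from the data
truth    <- factor(truth_lgl, levels = c("TRUE", "FALSE"))
estimate <- factor(estimate_lgl)

# The level sets differ, which is the condition the error complains about
same_before <- identical(levels(truth), levels(estimate))

# Giving estimate the same explicit levels removes the mismatch
estimate_fixed <- factor(estimate_lgl, levels = c("TRUE", "FALSE"))
same_after <- identical(levels(truth), levels(estimate_fixed))
```

Here `same_before` is FALSE and `same_after` is TRUE, so the workaround is easy, but it has to be remembered at every conversion.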

Labels: feature (a feature request or enhancement), wip (work in progress)