Description
Hi folks, I'm not sure if `forcats` is the right place to raise this issue, so apologies in advance if this should go elsewhere.
I find it curious that `factor(c(TRUE, TRUE))` produces a factor with a single level: `TRUE`. If we think of factors as categories whose levels are all known a priori, wouldn't it make sense for logicals converted to factors to preserve that a priori knowledge of the two levels `TRUE` and `FALSE` (in a consistent order), regardless of whether both levels actually appear in the data? It's a trivial fix using `levels = c("TRUE", "FALSE")`, but it seems so philosophically obvious to me that this should always happen that I think it warrants being the default behavior. Perhaps something along the lines of:
fct.default = base::factor
# Always declare both levels for logical input; pass extra arguments through
fct.logical = function(x, ...) base::factor(x, levels = c("TRUE", "FALSE"), ...)
fct = function(x, ...) UseMethod("fct")
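
To make the difference concrete, here is a minimal sketch comparing base `factor()` with the dispatch proposed above. The definitions are repeated so the snippet runs on its own; `fct` is only the name suggested in this issue, and the pass-through of `...` in the logical method is an assumption about how it would be written.

```r
# Proposed generic, repeated here so the snippet is self-contained.
# `fct` is the illustrative name from this issue.
fct <- function(x, ...) UseMethod("fct")
fct.default <- function(x, ...) base::factor(x, ...)
fct.logical <- function(x, ...) base::factor(x, levels = c("TRUE", "FALSE"), ...)

x <- c(TRUE, TRUE)

# Base behavior: only the observed value becomes a level.
base_levels <- levels(factor(x))   # "TRUE"

# Proposed behavior: both levels are kept, in a fixed order,
# even though FALSE never occurs in the data.
fct_levels <- levels(fct(x))       # "TRUE" "FALSE"

# Non-logical input falls through to the default method unchanged.
chr_levels <- levels(fct(c("a", "b")))   # "a" "b"

print(base_levels)
print(fct_levels)
print(chr_levels)
```

With both levels declared, `table(fct(x))` also reports a zero count for `FALSE` instead of omitting it.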
This is sometimes a minor frustration for me when using `tidymodels` packages (and `caret` before that), which require a factor outcome for classification problems. Every now and again I end up in a corner case where I get errors like:
Error: In metric: `yardstick::sens`
`truth` and `estimate` levels must be equivalent.
`truth`: TRUE, FALSE
`estimate`: TRUE
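
The mismatch behind an error like this can be reproduced and worked around in base R alone. The sketch below uses made-up `truth` and `estimate` vectors and does not call yardstick itself; it assumes, based on the message above, that the check only requires the two factors to share the same levels.

```r
# Hypothetical data: every prediction happens to be TRUE.
truth    <- factor(c(TRUE, FALSE, TRUE), levels = c("TRUE", "FALSE"))
estimate <- factor(c(TRUE, TRUE, TRUE))   # levels: "TRUE" only

# The level sets differ, which is the condition the error reports.
before <- identical(levels(truth), levels(estimate))   # FALSE

# Workaround: rebuild `estimate` using the levels of `truth`.
estimate <- factor(estimate, levels = levels(truth))
after <- identical(levels(truth), levels(estimate))    # TRUE

print(before)
print(after)
print(table(truth, estimate))
```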