
fct_reorder() yields unexpected result in presence of missing values #315

Closed
@DanChaltiel


Hi,

When using fct_reorder() in the presence of missing values, you often do not get the result you expect.

For instance, in the following code, the "blue" level gets an NA summary and is therefore moved to the last position in the result. In larger datasets, where most levels contain at least one missing value, nearly every summary is NA and fct_reorder() frustratingly does almost nothing.

library(tidyverse)
df = tribble(
  ~color,     ~a,
  "purple",      1,
  "purple",      2,
  "blue",     3, # replaced with NA further down
  "blue",     4,
  "green",    5,
  "green",    6
)
df$color = factor(df$color)
df$color %>% levels
#> [1] "blue"   "green"  "purple"
fct_reorder(df$color, df$a) %>% levels
#> [1] "purple" "blue"   "green"

df$a[3] = NA # introduce a missing value in the "blue" level
fct_reorder(df$color, df$a) %>% levels
#> [1] "purple" "green"  "blue"
fct_reorder(df$color, df$a, na.rm=TRUE) %>% levels
#> [1] "purple" "blue"   "green"

Created on 2022-08-10 by the reprex package (v2.0.1)

This is easy to run into because the default summary function, median(), has na.rm=FALSE by default, so a single missing value makes the whole level's summary NA. Other common summary functions like min() and max() have the same problem.
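
To make the mechanism concrete, here is a minimal base R sketch (no forcats involved). It assumes that fct_reorder() orders the levels with something like order() on the per-level summaries, which is my reading of the behaviour above rather than something checked against the forcats source.

```r
# Same data as the reprex above, with the NA already in place.
color <- factor(c("purple", "purple", "blue", "blue", "green", "green"))
a <- c(1, 2, NA, 4, 5, 6)

# Per-level medians with the default na.rm = FALSE: "blue" becomes NA.
med <- tapply(a, color, median)
is.na(med[["blue"]])
#> [1] TRUE

# order() puts NA last by default (na.last = TRUE), so "blue" ends up last.
lv_default <- levels(color)[order(med)]
lv_default
#> [1] "purple" "green"  "blue"

# With na.rm = TRUE the median of "blue" is 4 and the level is ranked normally.
med_narm <- tapply(a, color, median, na.rm = TRUE)
lv_narm <- levels(color)[order(med_narm)]
lv_narm
#> [1] "purple" "blue"   "green"
```

Both orderings match the fct_reorder() output in the reprex.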

There is a mention of this in the documentation of the ... argument ("Other arguments passed on to .fun. A common argument is na.rm = TRUE."), but I don't think this is explicit enough.

Could there be some kind of warning to suggest we add na.rm=TRUE? For instance if(any(is.na(summary))) warn("missing").
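
A rough sketch of what such a warning could look like. reorder_warn_na() is a hypothetical base R stand-in that I wrote for illustration; it is not part of forcats and is not how fct_reorder() is implemented.

```r
# Hypothetical helper: reorder levels by a per-level summary and warn
# when any level's summary is missing.
reorder_warn_na <- function(f, x, fun = median, ...) {
  stopifnot(is.factor(f), length(f) == length(x))
  # One summary value per level, in level order.
  summary <- vapply(split(x, f), fun, numeric(1), ...)
  if (anyNA(summary)) {
    warning(
      "Summary is NA for level(s): ",
      paste(names(summary)[is.na(summary)], collapse = ", "),
      ". Consider passing na.rm = TRUE.",
      call. = FALSE
    )
  }
  # order() places NA last, which mirrors the behaviour reported above.
  factor(f, levels = levels(f)[order(summary)])
}

color <- factor(c("purple", "purple", "blue", "blue", "green", "green"))
a <- c(1, 2, NA, 4, 5, 6)

levels(reorder_warn_na(color, a)) # warns about "blue"
#> [1] "purple" "green"  "blue"
levels(reorder_warn_na(color, a, na.rm = TRUE)) # no warning
#> [1] "purple" "blue"   "green"
```

The ordering itself is unchanged; the warning only makes the NA summary visible and names the affected levels.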

Otherwise, maybe this should be mentioned near the top of the description, for instance something like "Any missing value returned by the summary function for a level will cause this level to be sent to the end." (ok that's not well written but you get the point)

You might even require the user to explicitly opt in to na.rm=FALSE, and by default inject na.rm=TRUE into the summary function if na.rm is in formals(.fun). This is a bit invasive, I'll give you that, but I cannot see any real use case where na.rm=FALSE would be wanted.
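
A minimal sketch of the opt-out idea, again as a hypothetical base R stand-in (fct_reorder_narm() and its na.rm argument are made up for illustration, not forcats API). One wrinkle: formals() returns NULL for primitives such as min() and max(), so the sketch inspects formals(args(fun)) instead.

```r
# Hypothetical wrapper: inject na.rm into the summary function whenever
# that function declares an na.rm argument. Defaults to na.rm = TRUE.
fct_reorder_narm <- function(f, x, fun = median, ..., na.rm = TRUE) {
  stopifnot(is.factor(f), length(f) == length(x))
  # args() also works for primitives like min(), where formals() is NULL.
  accepts_na_rm <- "na.rm" %in% names(formals(args(fun)))
  extra <- list(...)
  if (accepts_na_rm) extra$na.rm <- na.rm
  # One summary value per level, in level order.
  summary <- vapply(
    split(x, f),
    function(v) do.call(fun, c(list(v), extra)),
    numeric(1)
  )
  factor(f, levels = levels(f)[order(summary)])
}

color <- factor(c("purple", "purple", "blue", "blue", "green", "green"))
a <- c(1, 2, NA, 4, 5, 6)

levels(fct_reorder_narm(color, a)) # na.rm = TRUE injected
#> [1] "purple" "blue"   "green"
levels(fct_reorder_narm(color, a, na.rm = FALSE)) # explicit opt-out
#> [1] "purple" "green"  "blue"
levels(fct_reorder_narm(color, a, fun = min)) # primitive summary function
#> [1] "purple" "blue"   "green"
```

Note that a check on the function's own arguments misses generics like mean(), whose signature is function(x, ...): na.rm only appears in the method, so nothing would be injected there.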


Labels: feature (a feature request or enhancement)
