Closed
Description
When the variable for x is character, NAs are not treated as a break on the x axis. But when the variable is a factor, NAs are included.
library(ggplot2)
set.seed(1)
d <- c("A", "B", NA)
x <- sample(d, size = 100, replace = TRUE, prob = c(0.7, 0.2, 0.1))
y <- rnorm(100)
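# stringsAsFactors = TRUE keeps x a factor on R >= 4.0 too (it was the default before 4.0)
df <- data.frame(x, y, stringsAsFactors = TRUE)
# x is a factor: NA shows up as a break on the x axis
ggplot(df) + geom_boxplot(aes(x, y), na.rm = TRUE)
# x is character: NA is not a break
ggplot(df) + geom_boxplot(aes(as.character(x), y), na.rm = TRUE)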
I feel this behaviour is inconsistent. NAs should be included only when NA is intentionally included in the levels of the factor variable.
levels(df$x)
#> [1] "A" "B"
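For reference, a minimal sketch of what I mean by intentionally including NA in the levels, using only base R. The vector and the names f_default, f_addna and f_exclude are made up for illustration.

```r
# Illustrative data, not the data from the report above
x <- c("A", "B", NA, "A")

# Default: NA is a missing value, not a level
f_default <- factor(x)
levels(f_default)
#> [1] "A" "B"

# Opt in: addNA() appends NA as an explicit level
f_addna <- addNA(f_default)
levels(f_addna)
#> [1] "A" "B" NA

# Same result when building the factor directly
f_exclude <- factor(x, exclude = NULL)
levels(f_exclude)
#> [1] "A" "B" NA
```

With either opt-in form, showing NA on the axis would be a deliberate choice by the user rather than something the scale decides.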
I guess this is simply because scales:::clevels() treats them differently. But I'm wondering if this is by design... Why do we have to force NAs to be included, when we can include NA in the factor levels ourselves?
set.seed(1)
d <- c("A", "B", NA)
x <- sample(d, size = 100, replace = TRUE, prob = c(0.7, 0.2, 0.1))
scales:::clevels(x, drop = FALSE)
#> [1] "A" "B"
scales:::clevels(as.factor(x), drop = FALSE)
#> [1] "A" "B" NA
Is it possible to exclude NA from a factor x axis by default? (But I'm afraid this suggestion comes too late, and many people may already rely on this behaviour.)
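In the meantime, here is a sketch of two workarounds that seem to give the axis I expect. It assumes a ggplot2 version whose discrete scales accept the na.translate argument. The names df_complete, p1 and p2 are only for illustration, and I have not checked this against every release.

```r
library(ggplot2)

set.seed(1)
x <- sample(c("A", "B", NA), size = 100, replace = TRUE, prob = c(0.7, 0.2, 0.1))
df <- data.frame(x = factor(x), y = rnorm(100))

# Option 1: drop the rows with a missing x before plotting
df_complete <- df[!is.na(df$x), ]
p1 <- ggplot(df_complete) + geom_boxplot(aes(x, y))

# Option 2: keep the data as-is and ask the scale not to show NA
# (assumes na.translate is available in the installed ggplot2)
p2 <- ggplot(df) +
  geom_boxplot(aes(x, y)) +
  scale_x_discrete(na.translate = FALSE)
```

Option 1 changes the data, while option 2 only changes the scale, so option 2 is closer to the default I am asking for.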