
NAs should be excluded from the level of discrete X-axis by default #1584

Closed
@yutannihilation


When the variable mapped to x is a character vector, NA is not treated as a break on the x-axis. But when the variable is a factor, NA is included as a break.

library(ggplot2)

set.seed(1)
d <- c("A", "B", NA)
x <- sample(d, size = 100, replace = TRUE, prob = c(0.7, 0.2, 0.1))
y <- rnorm(100)
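# Make x a factor explicitly; since R 4.0.0 data.frame() no longer converts strings by default.
df <- data.frame(x, y, stringsAsFactors = TRUE)

# x is a factor: NA appears as a break on the x-axis
ggplot(df) + geom_boxplot(aes(x, y), na.rm = TRUE)

# x is converted to character: NA does not appear on the x-axis
ggplot(df) + geom_boxplot(aes(as.character(x), y), na.rm = TRUE)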

[Images: na_included (factor x, NA shown as a break) and na_not_included (character x, no NA break)]

I feel this behaviour is inconsistent. NA should appear on the axis only when it has been intentionally included in the levels of the factor, which is not the case here:

levels(df$x)
#> [1] "A" "B"
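For comparison, here is a minimal base R sketch of what "intentionally included" could look like. The vector and the variable names (`f_default`, `f_with_na`, `f_exclude_null`) are made up for illustration; `addNA()` and the `exclude` argument of `factor()` are standard base R.

```r
x <- c("A", "B", NA, "A")

# factor() leaves NA out of the levels by default (exclude = NA).
f_default <- factor(x)
levels(f_default)

# NA becomes a level only when it is requested explicitly,
# either with addNA() ...
f_with_na <- addNA(f_default)
levels(f_with_na)

# ... or with exclude = NULL when the factor is built.
f_exclude_null <- factor(x, exclude = NULL)
levels(f_exclude_null)
```

Under the proposal in this issue, only factors like the last two would be expected to get an NA break on the axis.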

I guess this is simply because scales:::clevels() treats character vectors and factors differently. But I'm wondering whether this is by design. Why force NA to be included for factors, when we can add NA to the factor levels ourselves if we want it?

set.seed(1)
d <- c("A", "B", NA)
x <- sample(d, size = 100, replace = TRUE, prob = c(0.7, 0.2, 0.1))

scales:::clevels(x, drop = FALSE)
#> [1] "A" "B"

scales:::clevels(as.factor(x), drop = FALSE)
#> [1] "A" "B" NA
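One way such a difference could arise is sketched below. This is not the scales source: `sketch_clevels()` is a hypothetical helper written only to reproduce the two outputs above, on the assumption that character input goes through `sort(unique(x))` (where `sort()` drops NA by default) while factor input has NA appended whenever the data contain one.

```r
# Hypothetical helper, NOT the actual scales implementation.
sketch_clevels <- function(x) {
  if (is.factor(x)) {
    values <- levels(x)
    # Assumed behaviour: append NA when any observation is missing.
    if (any(is.na(x))) values <- c(values, NA)
    values
  } else {
    # sort() uses na.last = NA by default, so NA is silently dropped.
    sort(unique(x))
  }
}

x <- c("A", "B", NA, "A")
sketch_clevels(x)
sketch_clevels(as.factor(x))
```

If something like this is what happens, the inconsistency comes from the factor branch adding NA rather than from the character branch removing it.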

Is it possible to exclude NA from a factor x-axis by default? (I'm afraid this suggestion may come too late, though, since many people may already rely on the current behaviour.)
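In the meantime, a possible workaround is to drop the rows with a missing x before plotting. The sketch below uses base R only and rebuilds the example data with x as a factor; `df_complete` is a name introduced here for illustration.

```r
set.seed(1)
x <- sample(c("A", "B", NA), size = 100, replace = TRUE, prob = c(0.7, 0.2, 0.1))
df <- data.frame(x = factor(x), y = rnorm(100))

# Keep only the rows where x is observed.
df_complete <- df[!is.na(df$x), ]

# No missing values remain, and the levels are unchanged.
anyNA(df_complete$x)
levels(df_complete$x)
```

Passing `df_complete` instead of `df` to the same `ggplot()` call should then leave only A and B on the axis. Later ggplot2 releases also accept `scale_x_discrete(na.translate = FALSE)`, if your installed version has that argument.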

Labels: bug (an unexpected problem or unintended behavior)
