Description
Brief description of the problem
If I use geom_histogram() and set x-axis limits wider than the range of the data, using either xlim() or the limits argument of scale_x_continuous(), I see this warning message:
Removed 2 rows containing missing values (`geom_bar()`).
but in fact nothing has been removed.
True for me on:
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
and
R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS
and
R version 4.2.1 (2022-06-23 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
All three are running ggplot2 version 3.4.0.
Reprex follows.
library(tidyverse)
set.seed(12345)
tibble(x = rnorm(5000) / 10) -> tmpTib
tmpTib %>%
  summarise(min = min(x),
            max = max(x),
            nNA = sum(is.na(x)))
# # A tibble: 1 × 3
#      min   max   nNA
#    <dbl> <dbl> <int>
# 1 -0.388 0.333     0
### so no missing values and range well inside [-1, 1]
ggplot(data = tmpTib,
       aes(x = x)) +
  geom_histogram()
### plots all 5000 points
ggplot(data = tmpTib,
       aes(x = x)) +
  geom_histogram() +
  xlim(-1, 1)
### reports:
# Warning message:
# Removed 2 rows containing missing values (`geom_bar()`).
### same happens using scale_x_continuous(limits = c(-1, 1)):
ggplot(data = tmpTib,
       aes(x = x)) +
  geom_histogram() +
  scale_x_continuous(limits = c(-1, 1))
# Warning message:
# Removed 2 rows containing missing values (`geom_bar()`).
tmpTib %>%
  filter(row_number() < 6) -> tmpTibSmall
tmpTibSmall
# # A tibble: 5 × 1
#         x
#     <dbl>
# 1  0.0586
# 2  0.0709
# 3 -0.0109
# 4 -0.0453
# 5  0.0606
### using small dataset shows that there is actually no removal of data
ggplot(data = tmpTibSmall,
       aes(x = x)) +
  geom_histogram() +
  scale_x_continuous(limits = c(-.07, .085))
ggplot(data = tmpTibSmall,
       aes(x = x)) +
  geom_histogram() +
  xlim(-.07, .085)
sessionInfo()
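One way to check where the two rows might come from is sketched below, on the assumption that the warning refers to histogram bins rather than to observations. It uses ggplot2's `layer_data()` to pull out the bins computed for the layer. The object names (`df`, `p_lim`, `p_zoom` and so on) are only illustrative, and `bins = 30` is set explicitly just to match the default and silence the bin-width message. If the assumption holds, the rows with a missing `xmin` or `xmax` should be empty bins straddling the limits, and the counts should still add up to 5000. The second plot uses `coord_cartesian()` for comparison, since it zooms without applying scale limits to the data.

```r
library(ggplot2)

set.seed(12345)
df <- data.frame(x = rnorm(5000) / 10)

# Histogram with scale limits wider than the data (the case from the reprex)
p_lim <- ggplot(df, aes(x = x)) +
  geom_histogram(bins = 30) +
  xlim(-1, 1)

# layer_data() returns the per-bin data frame computed for the first layer
bins_lim <- layer_data(p_lim)

# Bins with a missing edge: candidates for the "removed rows" in the warning
edge_bins <- bins_lim[is.na(bins_lim$xmin) | is.na(bins_lim$xmax),
                      c("count", "x", "xmin", "xmax")]
print(edge_bins)

# Total number of observations counted across all bins
print(sum(bins_lim$count))

# Same histogram, but zoomed with coord_cartesian() instead of scale limits
p_zoom <- ggplot(df, aes(x = x)) +
  geom_histogram(bins = 30) +
  coord_cartesian(xlim = c(-1, 1))

bins_zoom <- layer_data(p_zoom)

# No scale limits are set here, so no bin edges should be censored to NA
print(anyNA(bins_zoom$xmin) || anyNA(bins_zoom$xmax))
print(sum(bins_zoom$count))
```

If `edge_bins` turns out to hold two rows with `count` equal to 0, that would suggest the warning is about empty edge bins being dropped, not about data points. Note that the two plots are binned over different ranges (the limits versus the data range), so the bars themselves will not look the same.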
I hope I'm not being stupid!