Allow stat_bin() to compute over single-unique-value data #3047

Merged
merged 6 commits into from
Jan 26, 2019

@yutannihilation yutannihilation commented Dec 27, 2018

Fixes #3043.

stat_bin() uses the width of the range of x to calculate the binwidth, so it fails when x contains only one unique value, because the range has zero width. Similarly to resolution(), we need to treat zero-width ranges specially.

resolution <- function(x, zero = TRUE) {
  if (is.integer(x) || zero_range(range(x, na.rm = TRUE)))
    return(1)

As suggested in #3047 (comment), the width used for a 0-width range should be 0.1, for consistency with the width of the expansion that expand_default() gives for a 0-width range.
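To make the failure mode and the special case concrete, here is a minimal base-R sketch. It is not the code merged in this pull request: `sketch_bin_width()` and its `zero_width` argument are hypothetical names, the `span / (bins - 1)` formula is an assumption about how a width is derived from `bins`, and the simple tolerance check is only a stand-in for `zero_range()`.

```r
# Hypothetical helper: derive a bin width from the data range and a bin count.
sketch_bin_width <- function(x, bins = 30, zero_width = 0.1) {
  stopifnot(bins >= 2)  # this sketch only covers two or more bins

  x_range <- range(x, na.rm = TRUE)
  span <- diff(x_range)

  # Simplified stand-in for zero_range(): treat a numerically tiny span as zero.
  if (span < 1000 * .Machine$double.eps * max(abs(x_range), 1)) {
    # Without this branch the width would be 0, which is not a valid binwidth.
    return(zero_width)
  }

  # Assumed formula for the ordinary, non-degenerate case.
  span / (bins - 1)
}

sketch_bin_width(rep(1, 100))  # single unique value: falls back to 0.1
sketch_bin_width(c(0, 29))     # ordinary data: 29 / (30 - 1) = 1
```

The fallback value is a parameter here only so the choice of 0.1 is visible; the thread settles on a fixed 0.1.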

library(ggplot2)

d <- data.frame(x = rep(1, 100))
ggplot(d, aes(x = x)) +
  geom_histogram()
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Created on 2019-01-03 by the reprex package (v0.2.1)

@clauswilke
Member

The default range for scales without data range is only 0.1 in the latest version of ggplot2. Maybe use the same value for default bin width?

library(ggplot2)

ggplot(data.frame(), aes(x = 1, y = 0)) + geom_point()

Created on 2018-12-30 by the reprex package (v0.2.1)

@yutannihilation
Member Author

yutannihilation commented Dec 31, 2018

Thanks, I didn't notice that.

I don't understand the details yet, but that 0.1 seems to come from this expansion, which is under the control of Coord, not Scale. So using 0.1 as the binwidth is not enough to plot a single stat_bin()ed point in the same x range as stat_identity(), because the expansion will be added on top of the 0.1. Let me think a bit more about what the right behaviour is...

ggplot2/R/coord-.r

Lines 153 to 155 in e9d4e5d

expand_default <- function(scale, discrete = c(0, 0.6, 0, 0.6), continuous = c(0.05, 0, 0.05, 0)) {
  scale$expand %|W|% if (scale$is_discrete()) discrete else continuous
}
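As a rough illustration of why the expansion ends up added on top of the bin width, the following base-R sketch applies a multiplicative expansion of 0.05 per side (the continuous default in the excerpt above) to a single bin of width 0.1 centred on x = 1. `sketch_expand()` is a hypothetical helper, and treating the expansion as `mul * width` on each side, with no zero-range handling, is an assumption rather than a copy of ggplot2's own logic.

```r
# Hypothetical helper: widen a range by a fraction `mul` of its width on each side.
sketch_expand <- function(limits, mul = 0.05) {
  width <- diff(limits)
  c(limits[1] - mul * width, limits[2] + mul * width)
}

bin_limits <- c(1 - 0.05, 1 + 0.05)  # one bin of width 0.1 centred on x = 1
expanded <- sketch_expand(bin_limits)

diff(bin_limits)  # width of the bin itself, about 0.1
diff(expanded)    # width after expansion, about 0.11
```

Under these assumptions the binned point sits in a slightly wider x range than a bare 0.1, which is the extra expansion discussed in this thread.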

library(ggplot2)

ggplot(data.frame(), aes(x = 1, y = 0)) +
  geom_point() +
  coord_cartesian(expand = FALSE)

Created on 2018-12-31 by the reprex package (v0.2.1)

Here is the result with width <- 0.1 and the comparison with stat_identity():

library(ggplot2)
library(patchwork)

p1 <- ggplot(data.frame(), aes(x = 1, y = 2)) +
  geom_point() +
  ggtitle("stat_identity()")

d <- data.frame(x = c(1, 1))
p2 <- ggplot(d, aes(x = x, y = stat(count))) +
  geom_point(stat = "bin") +
  ggtitle("stat_bin()")


p1 / p2
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Created on 2018-12-31 by the reprex package (v0.2.1)

@clauswilke
Member

I realize that the expansion will be added once more for stat_bin() relative to stat_identity(). I think that's fine.

@yutannihilation
Member Author

Now I'm convinced that the binwidth for 0-width range should be the same value as the width of the expansion for 0-width range, 0.1. I added the commits and updated the description.

@yutannihilation
Member Author

@clauswilke Could you review this again?

@yutannihilation yutannihilation merged commit 256b26a into tidyverse:master Jan 26, 2019
@yutannihilation yutannihilation deleted the fix-issue3043 branch January 26, 2019 01:21
Successfully merging this pull request may close these issues.

Error : Computation failed in stat_bin(): binwidth must be positive