-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Weighted density #5254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Weighted density #5254
Conversation
I'm fine with this, but looping in @clauswilke as he seemed to have more opinions on design in the issue thread |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have two relatively minor comments (see code annotations) and one bigger one. The bigger one is: Have you checked how the weights interact with the density reflection when there are bounds imposed? Is there anything that needs to be done or are things behaving as expected. I'm not sure I remember what the density reflection does, so it may well be that everything is correct as is, just wanted to be certain.
if (is.null(w)) { | ||
w <- rep(1 / nx, nx) | ||
} else { | ||
w <- w / sum(w) | ||
w_sum <- sum(w) | ||
w <- w / w_sum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function fit_data_to_bounds()
checks for division by zero but here you're not doing that. I think in weird corner cases with zero weights this could give you different results depending on whether a point is inside the bounds or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been trying to come up with examples that should be valid, but aren't, but I'm not having an easy time of it. If you censor a non-zero weight and are left with all zero weights, the weighted density should reflect that and show a flat line. Currently, it does this, but also passes on a warning from stats::density()
:
devtools::load_all("~/packages/ggplot2/")
#> ℹ Loading ggplot2
df <- data.frame(x = 1:3)
ggplot(df, aes(x)) +
geom_density(aes(weight = c(1, 0, 0), y = after_stat(wdensity)),
bounds = c(1.1, Inf))
#> Warning: Some data points are outside of `bounds`. Removing them.
#> Warning in density.default(x, weights = w, bw = bw, adjust = adjust, kernel =
#> kernel, : sum(weights) != 1 -- will not get true density
Created on 2023-04-11 with reprex v2.0.2
I'm wondering if there is something to be improved here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. As far as I'm concerned you can handle it however you see fit, I just wanted to call out the inconsistency.
Boundary inflection should work as intended. In the inflected layer (solid line), the two peaks should have equal height because the inflected left bit has half the weight that gets doubled through inflection. This works similar to regular devtools::load_all("~/packages/ggplot2/")
#> ℹ Loading ggplot2
df <- data.frame(x = c(1, 5, 1, 5),
group = c("A", "A", "B", "B"),
weight = c(1, 2, 2, 4))
ggplot(df, aes(x, colour = group)) +
geom_density(aes(weight = weight, y = after_stat(wdensity)),
bounds = c(1, Inf)) +
geom_density(aes(weight = weight, y = after_stat(wdensity)),
linetype = "dotted") +
xlim(-2, 8) Created on 2023-04-11 with reprex v2.0.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Merge branch 'main' into density_wcount # Conflicts: # R/stat-density.R # tests/testthat/test-stat-density.R
This PR aims to fix #4176.
It adds the
wdensity
computed variable, which is the density times the sum of weights.If weights are absent, this reduces to the same as the
count
computed variable.An example:
Created on 2023-03-30 with reprex v2.0.2