`geom_density()` for bounded data

Currently `geom_density()` uses `density()` directly to compute the kernel density estimate. By its nature, this estimate has an "extending property": with the default "gaussian" kernel, the output density can imply that values outside the range of the input data are possible. This is realistic in many practical setups, but there are cases where it is not true and the data is bounded (for example, when only positive values are possible). It could be a good idea to support kernel density estimation on this type of bounded data with a new `bounds` argument to `geom_density()`.
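To make the "extending property" concrete, here is a small sketch using only base R's `density()` on uniform data. The variable names are illustrative, and the exact numbers depend on the seed and the default bandwidth.

```r
set.seed(101)
x <- runif(100)              # data that can only lie in [0, 1]

d <- stats::density(x)       # default gaussian kernel, default bandwidth

range(x)                     # observed data range, inside [0, 1]
range(d$x)                   # evaluation grid extends past the data (cut = 3 bandwidths)

# Rough share of estimated probability mass that falls outside [0, 1],
# approximated with a Riemann sum over the equally spaced grid
step <- d$x[2] - d$x[1]
outside <- d$x < 0 | d$x > 1
mass_outside <- sum(d$y[outside]) * step
mass_outside
```

A positive `mass_outside` means the estimate assigns probability to values that cannot occur for this data.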

In my opinion, one of the easiest methods to implement, understand, and teach is the "reflection" method. One possible description is on [this page](https://ned.ipac.caltech.edu/level5/March02/Silverman/Silver2_10.html). Basically, boundary correction is done by first computing the "standard" kernel density estimate and then "reflecting" the tails that fall outside the desired interval back inside it. Densities inside and outside the desired interval are added together in a "symmetric fashion": `d(x) = d_f(x) + d_f(l - (x-l)) + d_f(r + (r-x))` for `x` inside the interval, where `d_f` is the density estimated from the input, `d` is the output density, and `l` and `r` are the left and right edges of the desired interval.
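The formula above can be sketched in base R as follows. This is only an illustration of the reflection idea, not the code behind the proposed `bounds` argument: `reflect_density()` is a hypothetical helper, and it assumes that linear interpolation of the `density()` output (via `approxfun()`) is an acceptable stand-in for `d_f`.

```r
# Hypothetical helper (not part of ggplot2): reflection-corrected density
reflect_density <- function(x, bounds = c(-Inf, Inf), n = 512, ...) {
  l <- bounds[1]
  r <- bounds[2]

  # "Standard" unbounded estimate d_f, wrapped as a function that is
  # zero outside the grid on which density() evaluated it
  dens <- stats::density(x, n = n, ...)
  d_f <- stats::approxfun(dens$x, dens$y, yleft = 0, yright = 0)

  # Evaluate the corrected density only inside the desired interval;
  # an infinite edge falls back to the edge of the density() grid
  lo <- if (is.finite(l)) l else min(dens$x)
  hi <- if (is.finite(r)) r else max(dens$x)
  grid <- seq(lo, hi, length.out = n)

  # d(x) = d_f(x) + d_f(l - (x - l)) + d_f(r + (r - x))
  y <- d_f(grid)
  if (is.finite(l)) y <- y + d_f(l - (grid - l))
  if (is.finite(r)) y <- y + d_f(r + (r - grid))

  data.frame(x = grid, y = y)
}

set.seed(101)
est <- reflect_density(runif(100), bounds = c(0, 1))

# Trapezoid-rule area under the corrected curve; should be close to 1
area <- sum(diff(est$x) * (head(est$y, -1) + tail(est$y, -1)) / 2)
area
```

Because the reflected tails are added back rather than discarded, the corrected estimate keeps (approximately) unit area without any renormalization.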

I made some quick and dirty changes to ggplot2 for demonstration. `stat_density()` gets a `bounds` argument with a default value of `c(-Inf, Inf)`. Here are some examples of the proposed functionality:

```r
library(ggplot2)
library(tibble)

set.seed(101)

ggplot(tibble(x = runif(100)), aes(x)) +
  geom_density() +
  geom_density(bounds = c(0, 1), color = "blue") +
  stat_function(data = tibble(x = c(0, 1)), fun = dunif, color = "red")
```

![](https://i.imgur.com/aYGG5Fs.png)

```r
ggplot(tibble(x = rexp(100)), aes(x)) +
  geom_density() +
  geom_density(bounds = c(0, Inf), color = "blue") +
  stat_function(data = tibble(x = c(0, 5)), fun = dexp, color = "red")
```

![](https://i.imgur.com/OyzYbJl.png)
