stat_density_2d error messages uninformative #6374

Closed
@oracle5th

Description

I found that an error thrown by stat_density_2d is not very informative. Behind the scenes it computes an illegal bandwidth for my data, which causes an internal error that the messages do not explain. Specifying the bandwidth explicitly fixes the problem. However, I would expect stat_density_2d either to handle these edge cases or to point out that the default bandwidth computed from the data is illegal and that manual input is required.

The following example works fine:

library(ggplot2)
df <- data.frame(x=sample(0:10, 100, replace=TRUE), y=sample(0:10, 100, replace=TRUE))
ggplot(df) + stat_density_2d(geom='density_2d', mapping=aes(x,y))

but the next one will throw an error:

df <- data.frame(x=sample(0:10, 100, replace=TRUE), y=c(rep(5, 80), sample(0:10, 20, replace=TRUE)))
ggplot(df) + stat_density_2d(geom='density_2d', mapping=aes(x,y))

Error in stat_density_2d():
! Problem while computing stat.
ℹ Error occurred in the 1st layer.
Caused by error in seq_len():
! argument must be coercible to non-negative integer

The error message is quite confusing. By digging into the warnings, I found the root cause of the problem:

1: Computation failed in stat_density2d()
Caused by error in MASS::kde2d():
! bandwidths must be strictly positive

In stat_density_2d, if h is not given, it is computed automatically before kde2d is called:

if (is.null(h)) {
  h <- c(MASS::bandwidth.nrd(data$x), MASS::bandwidth.nrd(data$y))
  h <- h * adjust
}

# calculate density
dens <- MASS::kde2d(
  data$x, data$y, h = h, n = n,
  lims = c(scales$x$dimension(), scales$y$dimension())
)

and bandwidth.nrd uses the following formula by default:

function(x)
{
    r <- quantile(x, c(0.25, 0.75))
    h <- (r[2] - r[1])/1.34
    4 * 1.06 * min(sqrt(var(x)), h) * length(x)^(-1/5)
}
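
To make the failure concrete, here is a minimal sketch that evaluates the rule quoted above on a deterministic stand-in for the failing y column. The names `nrd`, `quartiles`, and `h_y` are local and hypothetical, and the 20 non-constant values are fixed rather than sampled so the result does not depend on a random seed.

```r
# Local copy of the default rule quoted above, under a hypothetical name
nrd <- function(x) {
  r <- quantile(x, c(0.25, 0.75))
  h <- (r[2] - r[1]) / 1.34
  4 * 1.06 * min(sqrt(var(x)), h) * length(x)^(-1/5)
}

# Stand-in for the failing column: 80 fives plus 20 spread-out values
y <- c(rep(5, 80), 0:10, 0:8)

quartiles <- quantile(y, c(0.25, 0.75))  # both quartiles fall on 5
h_y <- nrd(y)                            # the interquartile range is 0, so min() picks 0

print(quartiles)
print(h_y)
```

The standard deviation of y is positive here, but `min()` still selects the zero interquartile term, so the computed bandwidth is exactly 0.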

So if the first and third quartiles of either data$x or data$y coincide, which always happens when more than 75% of the values are identical, the default bandwidth silently becomes 0, and kde2d immediately rejects it as illegal.
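
One possible guard, sketched under assumptions: `safe_bandwidth` is a hypothetical helper, not part of ggplot2 or MASS. It falls back to the standard-deviation term when the interquartile range is zero and raises a descriptive error when the data are constant. The fallback is an illustrative choice, not necessarily how ggplot2 should or does handle this case.

```r
# Hypothetical helper (not part of ggplot2 or MASS): same rule as the formula
# above, but never returns a zero bandwidth silently.
safe_bandwidth <- function(x, name = "x") {
  n <- length(x)
  s <- sqrt(var(x))
  iqr_h <- unname(diff(quantile(x, c(0.25, 0.75)))) / 1.34
  if (!is.finite(s) || s <= 0) {
    # Constant (or single-value) data: no sensible default exists
    stop("Cannot compute a default bandwidth for `", name,
         "`: all values are identical; supply `h` manually.", call. = FALSE)
  }
  # Assumption: ignore the interquartile term when it is zero
  spread <- if (iqr_h > 0) min(s, iqr_h) else s
  4 * 1.06 * spread * n^(-1/5)
}

# Deterministic stand-ins for the columns in the failing example
x <- rep(0:10, length.out = 100)
y <- c(rep(5, 80), 0:10, 0:8)

h <- c(safe_bandwidth(x, "x"), safe_bandwidth(y, "y"))
print(h)
```

The resulting vector can be passed as `h = h` to stat_density_2d, which is the explicit-bandwidth workaround mentioned at the top of this issue.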
