Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ref_interval option to mcmc_rank_* functions. #248

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

hhau
Copy link
Contributor

@hhau hhau commented Nov 6, 2020

Not sure if this is something that you've thought about before but decided against, but I quite often want to add an uncertainty interval to the mcmc_rank_* functions (not everyone knows how to interpret them yet). Because these are binned rank histograms, we can use the argument employed in the Simulated Bayesian Calibration paper to get appropriate uncertainty intervals. Said intervals are based on the expected distribution of the rank histobins (which follow a binomial distribution).

mcmc_rank_* functions default to not drawing this interval, but when it is requested (using ref_interval = TRUE), the defaults are a 95% uncertainty interval drawn with alpha = 0.2.

Here is the default behaviour:

library(bayesplot)
#> This is bayesplot version 1.7.2.9000
#> - Online documentation and vignettes at mc-stan.org/bayesplot
#> - bayesplot theme set to bayesplot::theme_default()
#>    * Does _not_ affect other ggplot2 plots
#>    * See ?bayesplot_theme_set for details on theme setting

mcmc_rank_hist(
 example_mcmc_draws(),
 ref_interval = TRUE
)

And here is a slightly more complicated example, with some very bad draws added:

n_chain <- 4
n_sample_per_chain <- 5000
n_parameter <- 2
n_total <- n_chain * n_sample_per_chain * n_parameter
n_bins <- 20

ex_samples <- array(
  data = c(rnorm(n_total - 200), rep(1, 200)),
  dim = c(n_sample_per_chain, n_chain, n_parameter),
  dimnames = list(
    Sample = 1 : n_sample_per_chain,
    Chain = 1 : n_chain,
    Parameter = c("theta", "phi")
  )
)

mcmc_rank_overlay(
  ex_samples,
  ref_interval = TRUE
)

Created on 2020-11-06 by the reprex package (v0.3.0)

@hhau
Copy link
Contributor Author

hhau commented Nov 6, 2020

Additionally, I added some tests for the mcmc_rank_* functions, and added the requirement for there to be more than 1 chain (rank histograms make no sense when there is only 1 chain, they are uniform by construction).

@jgabry
Copy link
Member

jgabry commented Nov 10, 2020

@hhau Thanks for this! I like the idea of adding the uncertainty intervals. @avehtari What do you think?

@avehtari
Copy link
Contributor

I deleted my previous comment that I wrote as I was not thinking straight. As the mcmc_rank histograms are based on multiple samples that are dependent via computing ranks for the combined values, the confidence intervals for the bins are not binomially distributed. I'll get back to you soon with the correct intervals, and suggestion to add also the reference line for exact uniformity.

@jgabry
Copy link
Member

jgabry commented Nov 11, 2020

Ok thanks Aki!

@hhau
Copy link
Contributor Author

hhau commented Jan 12, 2021

I recently watched https://youtu.be/HKPm6txxxQM?t=1889, and it made me reconsider this.

  1. The distribution of the ranks when the samples contain within-chain correlation seems tricky to derive, but if we ignore the said correlation then they do have a binomial distribution.
  2. The 'right thing' to do is to look at the ECDF difference between the uniform CDF and the ECDF of the ranks, because uncertainty intervals are more readily available for the ECDF and/or uniform CDF.
  3. The way the ranks deviate from uniformity gives us more information than whether the deviation exceeds some threshold, so the usefulness of such an interval is a bit questionable.

Probably best to close this then? Up to y'all.

@avehtari
Copy link
Contributor

Let's keep this open for a moment as there might be a solution for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants