Luigi Acerbi edited this page Oct 6, 2020 · 29 revisions

IBS: Frequently Asked Questions

This FAQ is curated by Luigi Acerbi and is continually being expanded.

For an IBS tutorial and example, see ibs_example.m (in MATLAB); other languages to be added.

If you have questions not covered here, please feel free to ask me at luigi.acerbi@helsinki.fi (putting 'IBS' in the subject of the email).

Acknowledgments: Most of the questions currently answered here originated in a live Q&A session with the Ma lab. Thanks to Hsin-Hung Li for taking notes.

General

  • Is it okay to stop the IBS algorithm for one trial after a fixed number of samples (e.g., 20)?

    No. Capping the number of samples per trial essentially turns IBS back into a fixed-sampling method, with all the associated problems discussed in the paper. A more principled approach is to put an early-stopping threshold on the log-likelihood, as described in the paper.
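
    As a rough illustration, the sketch below implements the per-trial IBS estimator (sample until the first match, at draw K, and return minus the sum of 1/k for k = 1..K-1) together with a threshold on the running log-likelihood. The names `ibs_trial`, `ibs_loglik`, `simulate`, and `lower_bound` are made up for this example and are not the toolbox's API, and the stopping rule is only a simplified stand-in for the one described in the paper.

```python
import math
import random


def ibs_trial(simulate, observed, rng, budget=float("-inf")):
    """IBS estimate of log p(observed) for one trial.

    Draws from the (hypothetical) simulator until a draw matches the observed
    response. With K draws in total, the estimate is -sum_{k=1}^{K-1} 1/k.
    Sampling is abandoned only if the estimate falls below `budget`.
    """
    log_lik = 0.0
    k = 1
    while simulate(rng) != observed:
        log_lik -= 1.0 / k  # each miss adds one term of the harmonic sum
        k += 1
        if log_lik < budget:
            return None  # signal that the log-likelihood threshold was crossed
    return log_lik


def ibs_loglik(simulate, responses, rng, lower_bound=float("-inf")):
    """Sum per-trial IBS estimates; stop early if the total drops below lower_bound."""
    total = 0.0
    for observed in responses:
        est = ibs_trial(simulate, observed, rng, budget=lower_bound - total)
        if est is None:
            return lower_bound  # report the bound itself (assumed convention)
        total += est
    return total


def coin(rng):
    """Toy model: responds True with probability 0.5."""
    return rng.random() < 0.5


rng = random.Random(0)
estimates = [ibs_trial(coin, True, rng) for _ in range(20000)]
mean_estimate = sum(estimates) / len(estimates)  # should be close to math.log(0.5)
```

    Note that the stopping criterion is expressed in terms of the log-likelihood itself, not as a fixed number of samples per trial.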

  • Is it important to provide the standard deviation of the IBS estimator to the optimization/inference algorithm for every parameter combination evaluated?

    It depends:

    • If you are optimizing the target log-likelihood (e.g., for maximum-likelihood or maximum-a-posteriori estimation), then it might help but it is not necessary, because the IBS estimator variance is, somewhat surprisingly, nearly constant across the parameter space. Moreover, the BADS optimizer (which we recommend using in combination with IBS; see also below) does not currently support user-provided, input-dependent noise, so in that case it is not even an option.
    • If you are performing Bayesian inference, for example using the VBMC toolbox, then it is necessary to provide the standard deviation of the IBS estimate to the algorithm. Bayesian inference is very sensitive to noisy estimates of the log-likelihood (or log-posterior), so it is crucial to provide the inference algorithm with all available information about the magnitude of observation noise.
  • For computational reasons, we often cannot afford to evaluate the log-likelihood of every parameter combination with high precision while optimizing the parameters. However, once we have found the (supposedly) best parameter combination, we could increase the precision (e.g., the number of IBS repeats). Is this advisable?

    Yes, absolutely. It should be considered standard practice, regardless of IBS. Whenever optimizing a noisy target function, after obtaining a candidate solution from an optimization method, one should evaluate the target function at the solution with higher precision.
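
    A minimal sketch of this re-evaluation step, assuming a hypothetical noisy objective that returns one independent estimate per call. The function `evaluate_precisely`, the toy objective, and the parameter values are made up for illustration.

```python
import math
import random
import statistics


def evaluate_precisely(noisy_loglik, params, n_evals=100):
    """Average repeated independent evaluations at a fixed parameter vector.

    Returns the mean and its standard error (sample std / sqrt(n_evals)).
    """
    values = [noisy_loglik(params) for _ in range(n_evals)]
    mean = statistics.fmean(values)
    std_err = statistics.stdev(values) / math.sqrt(n_evals)
    return mean, std_err


# Toy stand-in for a noisy log-likelihood: true value -50 plus unit Gaussian noise.
rng = random.Random(0)


def toy_loglik(params):
    return -50.0 + rng.gauss(0.0, 1.0)


mean, std_err = evaluate_precisely(toy_loglik, params=[0.3, 1.2], n_evals=400)
```

    Reporting the standard error alongside the mean makes it easier to judge whether two candidate solutions actually differ.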

  • In an ideal world, would you let the number of IBS repeats depend on how close the optimization algorithm thinks it is to the maximum — i.e. some form of adaptive precision?

    Yes, this is a good idea and a topic of ongoing research.

  • I am trying to get more precise results. As I increase the number of IBS repeats, the standard deviation of the estimated log-likelihood goes down slowly, but the computational time increases linearly. Is this trend normal?

    Yes, this is normal. The number of repeats is literally the number of times the IBS estimator is run, so the computational time has to be linear in the number of repeats. On the other hand, the repeats are the independent log-likelihood estimates you are averaging over, and the standard error of the mean decreases only with the inverse square root of the number of independent estimates (in this case, the number of repeats). For example, halving the standard deviation requires four times as many repeats.
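
    The square-root scaling can be checked on a toy problem. The sketch below assumes a single trial whose model matches the observed response with probability `p_match` (an illustrative setup, not the toolbox's code) and compares the spread of the averaged estimate for 4 and 16 repeats.

```python
import random
import statistics


def ibs_trial(p_match, rng):
    """One IBS estimate of log(p_match) for a toy trial whose model matches
    the observed response with probability p_match."""
    log_lik, k = 0.0, 1
    while rng.random() >= p_match:  # miss: keep sampling
        log_lik -= 1.0 / k
        k += 1
    return log_lik


def std_of_average(n_repeats, p_match=0.5, n_runs=2000, seed=0):
    """Empirical std of the average of n_repeats independent IBS estimates."""
    rng = random.Random(seed)
    averages = [
        statistics.fmean(ibs_trial(p_match, rng) for _ in range(n_repeats))
        for _ in range(n_runs)
    ]
    return statistics.stdev(averages)


std_4 = std_of_average(4, seed=0)
std_16 = std_of_average(16, seed=1)
ratio = std_4 / std_16  # expected to be near sqrt(16 / 4) = 2
```

    Going from 4 to 16 repeats costs four times as many estimator runs but only about halves the standard deviation.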

  • Any guideline on how to balance computation time and precision of the IBS estimates (i.e., number of repeats)?

    The algorithm you are using (e.g., BADS or VBMC) will often have some recommendation for how much noise in the log-likelihood it can handle, usually of order ~1. If you cannot decrease the log-likelihood observation noise to be ~1 or less, try to be as precise as possible within the available computational resources.
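
    One way to turn this into a number, assuming you have measured the standard deviation of the log-likelihood estimate with a single repeat in a pilot run at some representative parameter vector, is to use the inverse-square-root scaling of independent repeats. The function name and the example values below are hypothetical.

```python
import math


def repeats_needed(std_one_repeat, target_std=1.0):
    """Smallest number of repeats R such that std_one_repeat / sqrt(R) <= target_std."""
    return max(1, math.ceil((std_one_repeat / target_std) ** 2))


n_repeats = repeats_needed(std_one_repeat=3.0, target_std=1.0)  # (3 / 1)**2 = 9 repeats
```

    If the resulting number of repeats is unaffordable, the advice above applies: use as many repeats as the computational budget allows.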

  • What are the assumptions that go into the trial-dependent repeats technique?

    The main assumption for it to work well is roughly that the trial likelihoods are correlated across (reasonable) regions of parameter space, such that you can compute the resource allocation for a given "representative" set of parameters, and that allocation of resources remains beneficial across iterations of the optimization or inference algorithm.
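
    To make the idea of a resource allocation concrete, here is a generic sketch (not necessarily the exact scheme used in the paper or the toolbox). It assumes that, at the representative parameters, you have a rough per-trial variance of a single IBS estimate and a rough per-trial cost (expected number of model samples per estimate). The function name and all numbers are illustrative.

```python
import math


def allocate_repeats(variances, costs, budget):
    """Split a sampling budget across trials to reduce the variance of the summed estimate.

    Minimizing sum_i variances[i] / R_i subject to sum_i costs[i] * R_i = budget
    gives R_i proportional to sqrt(variances[i] / costs[i]) (Lagrange multipliers).
    Results are rounded, with at least one repeat per trial, so the budget is
    only met approximately.
    """
    weights = [math.sqrt(v / c) for v, c in zip(variances, costs)]
    scale = budget / sum(c * w for c, w in zip(costs, weights))
    return [max(1, round(scale * w)) for w in weights]


# Hypothetical per-trial values computed once at a "representative" parameter vector.
variances = [0.1, 0.6, 1.5]  # variance of a single IBS estimate on each trial
costs = [1.1, 2.0, 10.0]     # expected number of model samples per estimate
repeats = allocate_repeats(variances, costs, budget=100)
```

    The allocation is computed once and then reused, which is why it only helps if the per-trial likelihoods do not change too much across the parameters visited later.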

  • In which cases are trial-dependent repeats preferable to fixed repeats (e.g., 20 repeats for every trial)?

    In theory, whenever the above assumption holds, which seems to be often the case in practice. However, more empirical studies are needed.

  • I am interested in using IBS for problems with continuous responses. Should I try and implement ABC-IBS or is discretization enough? And how do I set the number of bins in the discretization (or, equivalently, the epsilon radius for ABC-IBS)?

    While ABC-IBS as briefly described in the paper is a slightly better approach statistically, for most problems it would not make a big difference if one simply discretizes the response space. Using ABC-IBS (or discretizing the space) is roughly equivalent to adding localized uniform noise to the response of the model being fit, with radius equal to half the bin size (or equal to epsilon). So, as a rule of thumb, one wants this added noise to be (much) less than the magnitude of the noise present in the data.
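
    A small sketch of the discretization approach and of the rule of thumb, under stated assumptions: responses are binned on a regular grid, a simulated response counts as a match if it falls in the same bin as the observed one, and the added noise is summarized by the standard deviation of a uniform distribution over one bin (bin width divided by the square root of 12). The function names, the example values, and the 0.25 safety factor are arbitrary choices for illustration.

```python
import math


def to_bin(response, bin_width):
    """Index of the bin containing a continuous response."""
    return math.floor(response / bin_width)


def is_match(simulated, observed, bin_width):
    """Treat a simulated response as a 'hit' if it lands in the observed response's bin."""
    return to_bin(simulated, bin_width) == to_bin(observed, bin_width)


def discretization_noise_std(bin_width):
    """Std of uniform noise spread over one bin: bin_width / sqrt(12)."""
    return bin_width / math.sqrt(12.0)


data_noise_std = 0.5  # assumed response noise in the data (illustrative value)
bin_width = 0.2       # candidate bin size
added_std = discretization_noise_std(bin_width)
small_enough = added_std < 0.25 * data_noise_std  # the 0.25 factor is an arbitrary choice
```

    For ABC-IBS, the match test would instead check whether the simulated response lies within epsilon of the observed one. Keep in mind that smaller bins (or a smaller epsilon) make matches rarer, so each IBS estimate needs more samples.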

IBS+BADS questions

BADS is a robust optimizer that works well with stochastic target functions, and in particular with the noisy estimates produced by IBS. A general FAQ for BADS can be found at this link. The following questions tackle issues that are common when combining IBS and BADS.

  • Is there a way to tune BADS to optimize towards a higher precision result or to have it optimize for longer?

    There are some input arguments in BADS one can modify to make it search for longer. If you want BADS to search for longer at each iteration, you can modify two key options in the OPTIONS struct that you pass to the algorithm:

    • Set OPTIONS.CompletePoll = true (default is false). This will force BADS to finish the "poll" step (more info in the paper) instead of skipping it when it thinks that it is not worth continuing. This is a public option of BADS.
    • Change OPTIONS.SearchNtry. Be careful: this is a "secret" option of BADS, and I do not recommend changing it unless you have to. The default value is max(D,floor(3+D/2)), where D is the dimensionality of the target function. This quantity represents the minimum number of attempted searches (via local Bayesian optimization) per iteration. You can try increasing it to force BADS to search for longer in each iteration.