Luigi Acerbi edited this page Oct 17, 2020 · 29 revisions

IBS: Frequently Asked Questions

This FAQ is curated by Luigi Acerbi and is continually being expanded.

For an IBS tutorial and example, see ibs_example.m (in MATLAB); other languages to be added.

If you have questions not covered here, please feel free to ask me at luigi.acerbi@helsinki.fi (putting 'IBS' in the subject of the email).

Acknowledgments: Most of the questions currently answered here originated in a live Q&A session with the Ma lab; thanks to Hsin-Hung Li for taking notes.

General

  • Is it okay to stop the IBS algorithm for one trial after a fixed number of samples (e.g., 20)?

    No. Stopping after a fixed number of samples would essentially turn IBS back into a fixed-sampling method, with all the associated problems discussed in the paper. A more principled approach is to put an early-stopping threshold on the log-likelihood, as described in the paper.
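
    To see why a fixed cap matters, here is a minimal toy sketch (not the toolbox code). It assumes a single trial whose simulated response matches the data with a known probability p, and compares the standard per-trial IBS estimate, -(1 + 1/2 + ... + 1/(K-1)) for a first hit on sample K, with a variant that stops at 20 samples. The variable names and the convention used for the truncated estimate are illustrative assumptions.

```matlab
% Toy comparison: per-trial IBS estimate vs. a variant truncated at a fixed
% number of samples. The hit probability p is known here only so that the
% estimates can be checked against log(p); all names are illustrative.
rng(1);                        % fix the random seed for reproducibility
p     = 0.05;                  % probability that a simulated response matches the observed one
nRuns = 1e4;                   % number of independent estimates to average
maxK  = 20;                    % fixed sample cap used by the truncated variant

ibsEst   = zeros(nRuns, 1);
truncEst = zeros(nRuns, 1);
for i = 1:nRuns
    K = 1;                     % number of samples drawn, including the first hit
    while rand() >= p          % keep simulating until a sample matches the data
        K = K + 1;
    end
    ibsEst(i)   = -sum(1 ./ (1:K-1));               % IBS: -(1 + 1/2 + ... + 1/(K-1))
    truncEst(i) = -sum(1 ./ (1:min(K, maxK)-1));    % same sum, but sampling stops at maxK (assumed convention)
end

fprintf('true log-likelihood:     %.3f\n', log(p));
fprintf('mean IBS estimate:       %.3f\n', mean(ibsEst));
fprintf('mean truncated estimate: %.3f\n', mean(truncEst));
```

    In this toy setting the untruncated average should land close to log(p), while the truncated average is systematically less negative, which is the kind of bias that fixed sampling reintroduces.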

  • Is it important to provide the standard deviation of the IBS estimator to the optimization/inference algorithm for every parameter combination evaluated?

    It depends:

    • If you are optimizing the target log-likelihood (e.g., for maximum-likelihood or maximum-a-posteriori estimation), then it might help but it is not necessary, because the variance of the IBS estimator is, somewhat surprisingly, nearly constant across the parameter space. Moreover, the BADS optimizer (which we recommend using in combination with IBS; see also below) does not currently support user-provided, input-dependent noise, so in that case it is not even an option.
    • If you are performing Bayesian inference, for example using the VBMC toolbox, then it is necessary to provide the standard deviation of the IBS estimate to the algorithm. Bayesian inference is very sensitive to noisy estimates of the log-likelihood (or log-posterior), so it is crucial to provide the inference algorithm with all available information about the magnitude of observation noise.
  • I am interested in using IBS for problems with continuous responses. Should I try and implement ABC-IBS or is discretization enough? And how do I set the number of bins in the discretization (or, equivalently, the epsilon radius for ABC-IBS)?

    While ABC-IBS as briefly described in the paper is a slightly better approach statistically, for most problems it would not make a big difference if one simply discretizes the response space. Using ABC-IBS (or discretizing the space) is roughly equivalent to adding localized uniform noise to the response of the model being fit, with radius equal to half the bin size (or equal to epsilon). So, as a rule of thumb, one wants this added noise to be (much) less than the magnitude of the noise present in the data.
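
    The rule of thumb can be checked with a few lines of arithmetic. The sketch below assumes a made-up response-noise standard deviation and bin width, bins the responses with a simple floor, and compares the standard deviation of uniform noise over one bin (bin width divided by sqrt(12)) with the response noise. All values and variable names are illustrative.

```matlab
% Toy check of how much noise a given bin width adds, relative to response noise.
sigmaResp = 2.0;                         % assumed SD of the response noise in the data (toy value)
binWidth  = 0.5;                         % candidate bin width for the discretization
resp      = [-3.2; 0.1; 0.4; 1.7; 5.9];  % continuous responses (made-up numbers)

respBin = floor(resp / binWidth);        % integer bin index for each response
% The same binning would be applied to the simulated responses in the simulator.

% Uniform noise over one bin (half-width binWidth/2) has SD binWidth/sqrt(12).
sdAdded = binWidth / sqrt(12);
fprintf('added noise SD = %.3f, response noise SD = %.3f (ratio %.2f)\n', ...
    sdAdded, sigmaResp, sdAdded / sigmaResp);
```

    If the ratio is not much smaller than 1, a narrower bin (or a smaller epsilon) is worth considering, at the price of more samples per trial.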

  • I have several questions about using BADS to optimize the log-likelihood, can you help?

    BADS is a robust optimizer that works well with stochastic target functions, and in particular with the noisy estimates produced by IBS. Answers to many questions about using BADS can be found in the BADS general FAQ. In particular, you might want to start with the section of the FAQ dedicated to noisy objective functions (but do not stop there: all sections of the FAQ are relevant).

  • What if I want to use IBS to perform Bayesian posterior or model inference?

    If you are interested in Bayesian inference, i.e. in recovering the Bayesian posterior over model parameters or computing the marginal likelihood, we recommend using Variational Bayesian Monte Carlo (VBMC), a toolbox for approximate Bayesian inference that supports potentially noisy estimates of the log-likelihood, such as those produced by IBS. In a large empirical benchmark, VBMC has been shown to work very well in combination with IBS (see paper).

Precision and IBS repeats

IBS affords a simple way to change the precision of its estimates: using multiple "repeats", each amounting to an independent run of the estimator. In this section, we answer questions about the precision of the IBS estimate and how it relates to the number of repeats.

  • For computational reasons, we often cannot afford to evaluate the log-likelihood of every parameter combination with high precision while optimizing the parameters. However, once we have found the (supposedly) best parameter combination, we could increase the precision (e.g., the number of IBS repeats). Is this advisable?

    Yes, absolutely. It should be considered standard practice, regardless of IBS. Whenever optimizing a noisy target function, after obtaining a candidate solution from an optimization method, one should evaluate the target function at the solution with higher precision.

  • In an ideal world, would you let the number of IBS repeats depend on how close the optimization algorithm thinks it is to the maximum — i.e. some form of adaptive precision?

    Yes, this is a good idea and topic of ongoing research.

  • I am trying to get more precise results. As I increase the number of IBS repeats, the standard deviation of the estimated log-likelihood goes down slowly, but the computational time increases linearly. Is this trend normal?

    Well, think about it. The number of repeats is literally the number of times the IBS estimator is run, so the computational time has to be linear in the number of repeats. On the other hand, increasing the number of repeats increases the number of independent log-likelihood estimates you are averaging over. As is well known, the standard error of the mean decreases only as one over the square root of the number of independent estimates (in this case, the number of repeats), so halving the standard deviation requires four times as many repeats.
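
    This scaling can be seen in a small simulation. The sketch below is a toy example, not toolbox code: it assumes a single trial with a known hit probability p, draws the number of samples until the first hit directly from its geometric distribution (in real use you would run your simulator instead), and averages R single-repeat IBS estimates. All names are illustrative.

```matlab
% Toy demonstration: SD of the averaged IBS estimate vs. number of repeats R.
rng(1);                       % fix the random seed for reproducibility
p     = 0.2;                  % toy per-trial hit probability
nRuns = 2000;                 % independent replications used to measure the SD
Rlist = [1 4 16 64];          % numbers of repeats to compare
sdEst = zeros(size(Rlist));

for j = 1:numel(Rlist)
    R = Rlist(j);
    % K = number of samples until the first hit (geometric, drawn by inversion).
    K = floor(log(rand(nRuns, R)) ./ log(1 - p)) + 1;
    H = [0, cumsum(1 ./ (1:max(K(:))))];      % H(k) = 1 + 1/2 + ... + 1/(k-1), with H(1) = 0
    est = mean(reshape(-H(K), nRuns, R), 2);  % average of R single-repeat IBS estimates
    sdEst(j) = std(est);
    fprintf('R = %2d: SD = %.3f, SD*sqrt(R) = %.3f, cost = %d estimator runs\n', ...
        R, sdEst(j), sdEst(j) * sqrt(R), R);
end
```

    The product SD*sqrt(R) should stay roughly constant across rows, while the cost grows in proportion to R.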

  • Any guideline on how to balance computation time and precision of the IBS estimates (i.e., number of repeats)?

    The algorithm you are using (e.g., BADS or VBMC) will often have some recommendation for how much noise in the log-likelihood it can handle, usually of order ~1. If you cannot decrease the log-likelihood observation noise to be ~1 or less, try to be as precise as possible within the available computational resources.
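
    As a rough planning aid, the square-root scaling of the noise with the number of repeats gives a back-of-the-envelope estimate of how many repeats a target noise level requires. The numbers below are made up; in practice the single-repeat standard deviation would come from a pilot evaluation at a representative parameter vector.

```matlab
% Back-of-the-envelope number of repeats for a target noise level (toy values).
sdSingle = 3.5;   % SD of the log-likelihood estimate with 1 repeat (e.g., from a pilot run)
sdTarget = 1.0;   % noise level the optimization/inference algorithm is assumed to tolerate
nRepeats = ceil((sdSingle / sdTarget)^2);   % SD scales as 1/sqrt(nRepeats)
fprintf('repeats needed: %d (expected SD %.2f)\n', nRepeats, sdSingle / sqrt(nRepeats));
```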

  • What are the assumptions that go into the trial-dependent repeats technique?

    The main assumption is, roughly, that the trial likelihoods are correlated across (reasonable) regions of parameter space. If so, you can compute the resource allocation for a given "representative" set of parameters, and that allocation will remain beneficial across iterations of the optimization or inference algorithm.

  • In which cases are trial-dependent repeats preferable to fixed repeats (e.g., 20 repeats for every trial)?

    In theory, whenever the above assumption holds, which often seems to be the case in practice. However, more empirical studies are needed.

MATLAB implementation: ibslike

  • The responses of my model depend both on the current trial and on responses or stimuli from previous trials. How can I tell this to ibslike?

    First, note that the inputs to ibslike are ibslike(fun,params,respMat,designMat,options,varargin), where varargin denotes additional arguments that are passed to fun, the function handle to your simulator model. If your simulator depends on data that go beyond the current stimulus and current response in a trial, you should:

    • leave designMat empty — this will tell ibslike to call fun with a list of trial indices;
    • pass the full matrix of stimuli and responses, and any further data, as additional arguments (as varargin);
    • write your fun simulator to take as input a parameter vector PARAMS, an array of trial numbers T, and any other input argument represented by varargin;
    • for each trial indexed by trial number T, compute the response using the appropriate information contained in varargin (e.g., the full matrix of stimuli and responses).

    For more information on variable-length input argument lists (varargin) in MATLAB, see the official documentation here.
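
    The steps above can be sketched as follows. This is a minimal toy example with made-up data and a made-up logistic choice model in which the response depends on the current stimulus and on the observed response in the previous trial. The names simfun, stim, and prevResp are illustrative; passing [] for options is an assumption (check the ibslike help), and the ibslike call only runs if the toolbox is on your path.

```matlab
% Toy history-dependent simulator called with trial indices instead of a design matrix.
rng(1);
nTrials  = 200;
stim     = randn(nTrials, 1);                 % made-up stimuli
resp     = double(rand(nTrials, 1) < 0.5);    % made-up binary responses (0 or 1)
prevResp = [0; resp(1:end-1)];                % observed response on the previous trial

% Simulator: takes PARAMS, trial indices T, then the extra data passed via varargin.
% Toy model: logistic choice driven by the current stimulus and the previous response.
simfun = @(params, T, stim, prevResp) double( ...
    rand(numel(T), 1) < 1 ./ (1 + exp(-(params(1) * stim(T(:)) + params(2) * prevResp(T(:))))));

params  = [1.5, 0.5];
simResp = simfun(params, (1:nTrials)', stim, prevResp);   % direct call: one response per trial

% With the IBS toolbox on the path: designMat is left empty and the extra data
% follow the options argument (using [] for options is an assumption here).
if exist('ibslike', 'file')
    est = ibslike(simfun, params, resp, [], [], stim, prevResp);
end
```

    Because the simulator indexes the full data arrays with T, it returns the right responses whichever subset of trials it is asked to simulate.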

  • My model uses data structures which are not easily converted to numerical arrays. Can I still use ibslike?

    In principle, yes.

    • The responses of your model have to be expressed as numerical arrays, but given that ibslike works only with discrete responses, it should always be possible to map the responses of your model to a finite set of numbers.
    • Any other data used to compute such responses need not be a numerical array. If you need your simulator fun to accept inputs which are not numerical arrays, you should use the varargin input argument, as explained in the question above.
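
    As an illustration of both points, the sketch below maps made-up categorical responses to integer codes and hands a struct to a toy simulator as an extra argument. The field name, categories, and model are hypothetical and only show the pattern.

```matlab
% Map categorical responses to integer codes (made-up labels).
categories = {'left', 'right', 'none'};                    % finite set of possible responses
respLabels = {'left'; 'right'; 'right'; 'none'; 'left'};   % observed responses
[~, respCodes] = ismember(respLabels, categories);         % 1 = left, 2 = right, 3 = none

% Non-numeric data (here a struct with a cell array) passed as an extra argument.
data.condition = {'easy'; 'hard'; 'hard'; 'easy'; 'hard'};

% Toy simulator: returns code 1 ('left') with probability params(1) on easy
% trials and params(2) on hard trials, and code 2 ('right') otherwise.
simfun = @(params, T, data) 1 + double(rand(numel(T), 1) > ...
    params(1) + (params(2) - params(1)) * strcmp(data.condition(T(:)), 'hard'));

simCodes = simfun([0.8, 0.5], (1:5)', data);   % one integer code per requested trial
```

    Here respCodes would play the role of respMat, and data would be passed to ibslike after the options argument so that it reaches the simulator through varargin.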