swfcalib is designed to automate the calibration of complex multi-parameter, multi-output models run on Slurm-equipped HPC systems.
swfcalib is not the simplest calibration system to set up. It was designed to solve a specific set of problems, listed below. If you already have a system that works well, you should probably not consider swfcalib.
However, if you have the following issues, swfcalib may be for you:
- your model has many parameters to calibrate and produces many outputs
- you have an idea of how some parameters influence only some outputs (see example below)
- your outputs are very noisy
- you cannot or don’t want to have a Slurm job that runs continuously for the whole duration of the calibration process (a few days to several weeks)
By using slurmworkflow, swfcalib can implement loop-like behavior in Slurm without the need for a pilot job staying alive for the whole duration of the calibration process (sometimes several weeks).
swfcalib was created to calibrate epidemic models like this one. These models have around 20 free parameters to calibrate and 20 outcomes to be matched to observed targets. However, knowledge of the model allows us to define many conditional one-to-one relationships between parameters and outcomes. swfcalib can use this knowledge to split the calibration into multiple waves of simpler parallel calibration jobs.
The calibration process follows a proposal-validation loop: the model is run with a set of parameters, and new proposals are made according to the results. This goes on until the model is fully calibrated.
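The proposal-validation loop described above can be illustrated with a minimal base R sketch. It is not how swfcalib is implemented: the toy model `run_model`, the target, and the range-narrowing rule are all assumptions made for illustration, and everything runs in a single R session instead of being spread over Slurm jobs.

```r
# Toy deterministic model (hypothetical): one parameter, one outcome.
run_model <- function(rate) 1 - exp(-2 * rate)

target       <- 0.5        # observed value the outcome should match
search_range <- c(0, 2)    # initial search range for the parameter
tol          <- 0.001      # accepted absolute error on the outcome

for (iteration in 1:50) {
  # Proposal step: try several parameter values at once.
  proposals <- seq(search_range[1], search_range[2], length.out = 11)
  # Run the model once per proposal.
  outcomes <- vapply(proposals, run_model, numeric(1))
  # Validation step: stop when the best proposal is close enough.
  errors <- abs(outcomes - target)
  best <- which.min(errors)
  if (errors[best] < tol) break
  # Otherwise narrow the range around the best proposal and loop again.
  half_width <- diff(search_range) / 4
  search_range <- c(
    max(0, proposals[best] - half_width),
    proposals[best] + half_width
  )
}

calibrated_rate <- proposals[best]
```

With a real model, each pass through the loop would be a batch of HPC runs rather than a call to `vapply`.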
Terminology:
- model: a function taking a proposal and returning some outcomes.
- proposal: a set of parameter values to be passed to the model.
- outcomes: the output of a model run for a given proposal.
- job: a set of parameters to be calibrated by using a subset of the outcomes
- wave: a set of independent jobs that can be calibrated using the same run of the model.
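To make the terminology concrete, here is a small sketch that maps each term to a plain R object. The names (`toy_model`, `trans_rate`, `treat_rate`, `prevalence`, `treated`) and the list layout used for jobs and waves are hypothetical and chosen only for illustration. They are not the structures swfcalib expects.

```r
# Model: a function taking a proposal and returning some outcomes.
# (Hypothetical toy model, not part of swfcalib.)
toy_model <- function(proposal) {
  data.frame(
    prevalence = 0.5 * proposal$trans_rate,
    treated    = 0.8 * proposal$treat_rate
  )
}

# Proposal: a set of parameter values to pass to the model.
proposal <- data.frame(trans_rate = 0.4, treat_rate = 0.5)

# Outcomes: the output of one model run for that proposal.
outcomes <- toy_model(proposal)

# Job: parameters to calibrate using a subset of the outcomes.
job_trans <- list(params = "trans_rate", outcomes = "prevalence")
job_treat <- list(params = "treat_rate", outcomes = "treated")

# Wave: independent jobs that can be calibrated from the same model runs.
wave_1 <- list(job_trans, job_treat)
```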
The specificity of swfcalib lies in its ability to try many proposals at once on an HPC, and to set up this proposal-validation loop without a long-running orchestrating job (pilot job).
The calibration is split into waves. Each wave can contain multiple jobs, each focusing on a set of parameters and related outcomes. This permits the parallel calibration of multiple independent parameters. At each iteration, the model is run once per proposal, and each job gathers the outcomes it needs to make its next proposal. Once all jobs in a wave are done, i.e. the parameters they govern are calibrated, the system moves to the next wave. This allows the sequential calibration of parameters when strong assumptions about their independence can be made.
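The way several jobs in a wave share the same model runs can be sketched as follows. The `results` table, the job list layout, and the tolerance-based stopping rule are assumptions made for this example and do not reflect swfcalib's internal data structures.

```r
# Hypothetical results of one iteration: one row per proposal, holding
# both the proposed parameter values and the outcomes of the model run.
results <- data.frame(
  trans_rate = c(0.2, 0.4, 0.6),
  treat_rate = c(0.3, 0.5, 0.7),
  prevalence = c(0.10, 0.21, 0.33),
  treated    = c(0.25, 0.41, 0.58)
)

# Two independent jobs forming one wave (illustrative structure only).
wave <- list(
  list(params = "trans_rate", outcomes = "prevalence", target = 0.20, tol = 0.02),
  list(params = "treat_rate", outcomes = "treated",    target = 0.40, tol = 0.02)
)

# Each job reads only the columns it needs from the shared results.
job_status <- lapply(wave, function(job) {
  errors <- abs(results[[job$outcomes]] - job$target)
  best <- which.min(errors)
  list(
    param      = job$params,
    best_value = results[[job$params]][best],
    done       = errors[best] <= job$tol
  )
})

# The wave is complete only when every job is done.
wave_done <- all(vapply(job_status, function(s) s$done, logical(1)))
```

Once `wave_done` is `TRUE`, the parameters handled by this wave would be fixed and the next wave would start from them.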
This design was crafted for very noisy models, where many replications are needed and where most parameters are independent or conditionally independent.
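The need for many replications can be illustrated with a short sketch. The noisy model, the noise level, and the number of replications are all hypothetical. The point is only that each proposal is run many times and judged on a summary of its replicates, not on a single run.

```r
set.seed(1)

# Hypothetical noisy model: the outcome fluctuates around 0.5 * rate.
noisy_model <- function(rate) 0.5 * rate + rnorm(1, sd = 0.1)

proposals <- c(0.2, 0.4, 0.6)
n_rep <- 200

# Run every proposal n_rep times and keep each replicate.
runs <- expand.grid(rep = seq_len(n_rep), rate = proposals)
runs$outcome <- vapply(runs$rate, noisy_model, numeric(1))

# Summarise the replicates: mean outcome and its standard error per proposal.
mean_outcome <- tapply(runs$outcome, runs$rate, mean)
se_outcome <- tapply(
  runs$outcome, runs$rate,
  function(x) sd(x) / sqrt(length(x))
)
```

In this sketch the replicates run sequentially. On an HPC they are the kind of work that can be spread over many parallel runs.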
swfcalib makes no assumptions about how a set of parameters should be changed or how the quality of fit is assessed. It is up to the user to provide the mechanism to:
- produce the next proposals to be tested
- assess the quality of the fit
swfcalib provides some pre-built functions for this. See the getting started vignette.
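As an illustration of the two user-supplied mechanisms, the sketch below builds a proposer and a fit checker as plain R closures. The function names, arguments, and return values are hypothetical and are not the signatures swfcalib expects. The getting started vignette describes the actual interface and the pre-built functions.

```r
# Hypothetical proposer factory: returns a function that draws new
# proposals uniformly within a range centred on the current best value.
make_range_proposer <- function(n, width) {
  function(best_value) {
    data.frame(
      rate = runif(n, best_value - width / 2, best_value + width / 2)
    )
  }
}

# Hypothetical fit checker factory: returns a function that reports
# whether the mean outcome is within `tol` of the target.
make_fit_checker <- function(target, tol) {
  function(outcomes) abs(mean(outcomes) - target) <= tol
}

propose <- make_range_proposer(n = 10, width = 0.2)
is_calibrated <- make_fit_checker(target = 0.5, tol = 0.01)

set.seed(42)
next_proposals <- propose(best_value = 0.4)       # 10 values in [0.3, 0.5]
fit_ok <- is_calibrated(c(0.49, 0.50, 0.52))      # mean is about 0.503
```

Keeping the two mechanisms as separate functions means either one can be swapped without touching the other, for example replacing the uniform draw with a smarter search.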
You can install the development version of swfcalib like so:
remotes::install_github("EpiModel/swfcalib")
library(swfcalib)