Local Random Streams
The `local_random_streams` option implements distinct random number generator streams for individual entities. This can help maintain run coherence for models which simulate multiple entities together, but it requires additional memory and a unique entity key.
Related topics:
- Model Code
- Microdata Output: Activating and using microdata output

Topic contents:
- Background and overview
- Syntax and use: How to activate and use
- Illustrative example: Decoherence reduction in a time-based model
Model code can draw random values from selected statistical distributions using built-in random number generator (RNG) functions, for example:
```cpp
double x = RandUniform(1);
double y = RandNormal(2);
double z = RandPoisson(3);
```
These functions return pseudo-random streams of numbers. The streams appear random but are actually produced by a deterministic algorithm which generates a fixed sequence of values. That algorithm knows which value to return next by maintaining an internal state which changes from one function call to the next.
The sequence of numbers returned depends on the `SimulationSeed` for the run, on the run member (aka sub, replicate), and, for case-based models, on the case.
The small integer argument to an RNG function specifies a distinct underlying random stream which produces values independent of those produced by other random streams. This avoids spurious interactions among unrelated random processes in the model. For example, values returned by calling `RandUniform(4)` in a `Fertility` module will not affect values returned by calling `RandUniform(6)` in a `Migration` module.
Independent random streams can reduce statistical noise in the difference between two model runs, reducing the run size needed to obtain reliable results for run differences. They also make microdata comparisons of two runs correspond better with model logic. For example, if there is no logical dependence between `Fertility` and `Migration` in the model, changing a `Fertility` parameter should not, logically, affect `Migration`.
Had the same random stream, e.g. stream 4 in `RandUniform(4)`, been used in both `Fertility` and `Migration`, a call to `RandUniform(4)` in `Fertility` would affect the value returned by a subsequent call to `RandUniform(4)` in `Migration`. That would produce a spurious (but statistically neutral) interaction between `Fertility` and `Migration`. It can be avoided by using a different random stream in `Migration`, e.g. by calling `RandUniform(6)` to specify stream 6 instead of stream 4.
Spurious correlation of random streams can be avoided by specifying a distinct random stream in each call to an RNG function throughout model code.

However, a model which simulates multiple instances of an entity together, e.g. multiple `Person` entities, can still have spurious interactions of random streams among those entities. For example, a call to `RandUniform(4)` in `Fertility` in `Person` A will affect the result of a subsequent call to `RandUniform(4)` in `Fertility` in `Person` B, because the same random stream 4 is used in both. In a time-based model with many entities, a spurious interaction could extend from one entity to the entire population.

Such spurious interactions do not affect the statistical validity of aggregate model results, but they can add statistical noise to run comparisons and produce differences at the microdata level which are not explained by model logic.
This issue can be resolved by maintaining independent local random streams in each entity, rather than global random streams shared among entities. With local random streams, for example, a call to `RandUniform(4)` in `Person` A uses a different random stream from a call to `RandUniform(4)` in `Person` B.
Local random streams require additional memory in each entity to maintain the state of the pseudo-random number generator for each stream. This additional memory can be significant for time-based models with many entities and many random streams.
Local random streams must also be initialized distinctly in each entity so that different entities produce different random streams. That requirement is met by providing a unique key for each entity. The entity key is used to initialize the local random streams independently in each entity before it enters the simulation. The entity key needs to be stable from one run to another so that local random streams are the same for the same entity in two different runs. The code to specify the unique entity key is, in general, model dependent.
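For illustration, here is a minimal sketch of a model-supplied definition of `get_entity_key()` for the `Host` entity. It is not taken from a distributed model: the `uint64_t` return type and the attribute `my_key` are assumptions, standing in for whatever unique, run-stable value the model can provide.

```cpp
// Sketch of a model-supplied entity key (assumptions noted above).
// 'my_key' is a hypothetical attribute assigned a unique, run-stable value
// before the Host enters the simulation.
uint64_t Host::get_entity_key()
{
    // Must be unique among Host entities and identical for the same Host
    // in different runs.
    return my_key;
}
```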
Given these trade-offs, local random streams are not implemented by default in OpenM++.
Sections:
Use the option `local_random_streams` to implement local random streams for all instances of a given entity. For example,

```cpp
options local_random_streams = Host;
```

implements local random streams for all instances of the `Host` entity. Use multiple `options` statements to implement local random streams for more than one kind of entity.
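For example, a model with two kinds of entities, `Host` and `Person` (the second entity name is purely illustrative here), would use one `options` statement per entity:

```cpp
options local_random_streams = Host;
options local_random_streams = Person;
```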
During model build, a log message like

```
Entity 'Host' has 11 local random streams, of which 1 are Normal
```

is issued for each entity with local random streams. The message indicates how many of the local streams use a Normal distribution, because those use additional memory in the entity to maintain state.
[back to syntax and use]
[back to topic contents]
under construction
[back to syntax and use]
[back to topic contents]
This section applies only to calls to RNG functions in entity context. See section RNG use before simulation for information about using RNG functions outside of entity context before the simulation starts.
If an entity uses local random streams, they must be initialized before use. If they are not initialized, a fatal error like the following will be raised:

```
Simulation error: RandUniform called with uninitialized local random streams.
```
By default, initialization is performed automatically when an entity enters the simulation, after starting values of all attributes have been assigned.
However, if model code in entity scope calls an RNG function before the entity enters the simulation, the local random streams of the entity will not yet have been initialized, causing the run-time error.
For example, the following code

```cpp
void Host::Start()
{
    // Initialize all attributes (OpenM++).
    initialize_attributes();

    z_score = RandNormal(11);

    // Have the entity enter the simulation (OpenM++).
    enter_simulation();
}
```
assigns a random value to the `Host` attribute `z_score` before the `Host` enters the simulation. If `Host` uses local random streams, a run-time error will occur.
However, model code can explicitly initialize local random streams before the entity enters the simulation by calling the provided function `initialize_local_random_streams()`. For example, the following modified code
```cpp
void Host::Start()
{
    // Initialize all attributes (OpenM++).
    initialize_attributes();

    // Need to explicitly initialize local random streams here
    // because they are used before the entity enters the simulation.
    initialize_local_random_streams();

    z_score = RandNormal(11);

    // Have the entity enter the simulation (OpenM++).
    enter_simulation();
}
```
initializes local random streams for the `Host` before the call to `RandNormal`, avoiding the error.
`initialize_local_random_streams()` uses `get_entity_key()` internally. If the definition of `get_entity_key()` is supplied in model code, it is essential that all attributes used in `get_entity_key()` be assigned before calling `initialize_local_random_streams()`.

`initialize_local_random_streams()` can be called (with no effect) even if the entity does not use local random streams, so there is no need to remove the call from model code if local random streams are not used in the entity.
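Putting these pieces together, here is a minimal sketch of a `Start()` function for a model whose `get_entity_key()` depends on a hypothetical attribute `my_key`. The `host_index` argument is also an assumption, standing in for however the model supplies a unique, run-stable value:

```cpp
void Host::Start(int host_index)
{
    // Initialize all attributes (OpenM++).
    initialize_attributes();

    // Assign the attribute used by get_entity_key() first, because
    // initialize_local_random_streams() calls get_entity_key() internally.
    my_key = host_index;

    // Explicit initialization is needed because an RNG function is called
    // below, before the entity enters the simulation.
    initialize_local_random_streams();

    z_score = RandNormal(11);

    // Have the entity enter the simulation (OpenM++).
    enter_simulation();
}
```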
Modgen specific: Modgen does not recognize the function `initialize_local_random_streams`, which causes an error when building the Modgen version of an x-compatible model. For x-compatible models, use code like:

```cpp
#if !defined(MODGEN)
    initialize_local_random_streams();
#endif
```
[back to syntax and use]
[back to topic contents]
under construction
Normal behaviour of random streams in `PreSimulation`, and in `Simulation` (to create a starting population, for example, as in IDMM).
[back to syntax and use]
[back to topic contents]
under construction
Random streams behave normally in other entities, i.e. those not named in a `local_random_streams` option. For entities named in `local_random_streams`, the streams used in the entity are maintained at the entity level. Streams are seeded using the value returned by `get_entity_key()`, combined with the run member (aka sub, replicate) and either the run seed for a time-based model or the case seed for a case-based model.
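The exact seeding algorithm is internal to OpenM++, but conceptually the inputs named above are combined into a distinct seed for each entity and stream. The following is purely illustrative and is not the actual OpenM++ implementation:

```cpp
// Illustrative only: one way the inputs named above could be mixed into a
// per-entity, per-stream seed. The real OpenM++ algorithm may differ.
uint64_t illustrative_stream_seed(uint64_t entity_key, int member,
                                  uint64_t run_or_case_seed, int stream)
{
    uint64_t h = run_or_case_seed;
    h = h * 0x9E3779B97F4A7C15ull + entity_key;
    h = h * 0x9E3779B97F4A7C15ull + (uint64_t)member;
    h = h * 0x9E3779B97F4A7C15ull + (uint64_t)stream;
    return h;
}
```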
[back to syntax and use]
[back to topic contents]
This example illustrates how local random streams can improve run coherence in a time-based model.
Sections:
This example illustrates the effect of local random streams vs. global random streams on run coherence. It uses the time-based IDMM model with minor modifications. Microdata is output at 100 time points during the simulation, and later merged and compared between `Base` and `Variant` runs to measure how run coherence evolves during the simulation.
Four runs are used in this example:
1. `Base` run with shared random streams
2. `Variant` run with shared random streams
3. `Base` run with local random streams
4. `Variant` run with local random streams
The same version of IDMM was used for runs 1 and 2. A different version of IDMM was used for runs 3 and 4, the only difference being activation of local random streams. All 4 runs have the same number of Hosts and an identical contact network. The `Base` runs 1 and 3 have identical inputs. The `Variant` runs 2 and 4 have identical inputs. A single parameter differs slightly between the `Variant` runs and the `Base` runs. That change causes only 2 entities to differ at the start of the `Variant` runs compared to the `Base` runs.

Aggregate outputs over time for the 4 runs are so similar that they are visually indistinguishable (see below). However, the degree of coherence at the microdata level between runs 3 and 4 (with local random streams) is noticeably different from that between runs 1 and 2 (with shared random streams).
[back to illustrative example]
[back to topic contents]
IDMM simulates an interacting dynamic contact network of `Host` entities, together with a disease which can be transmitted over that contact network. The contact network is initialized randomly at the start of the simulation. During the simulation, each `Host` interacts with other Hosts in contact events. Each `Host` can change its connected Hosts in contact change events. Optionally, a `Host` can change a connected `Host` in a contact event if that contact is infectious. During a contact event, the disease can propagate between the two Hosts, depending on the disease status of each `Host`.

An infected `Host` progresses through 4 fixed-length disease phases: susceptible, latent, infectious, and immune. On infection, a `Host` enters the latent phase, during which it is both asymptomatic and non-infectious. After the latent phase, the `Host` enters an infectious phase, during which it can infect another `Host` during a contact event. After the infectious phase, the `Host` enters an immune phase. After the immune phase, the `Host` returns to the susceptible state.

Before the simulation starts, all `Host` entities are in the susceptible state. After a `Host` enters the simulation but before the first simulated event, it is randomly infected with a given probability.
For this example, some mechanical changes were made to the version of IDMM in the OpenM++ distribution.
[back to illustrative example]
[back to topic contents]
The `Base` run consists of 5,000 Hosts simulated for 100 time units, with the initial probability of infection set to 0.1000. All other parameters are left at `Default` values, which are as follows:
```cpp
parameters {
    int     NumberOfHosts = 5000;              //EN Number of hosts
    int     ContactsOutPerHost = 4;            //EN Number of outgoing contacts per host
    double  MeanContactInterval = 1.00;        //EN Mean time between social interactions
    double  MeanChangeContactsInterval = 5.00; //EN Mean time between changing social contact network
    double  DumpSickContactProbability = 0.0;  //EN Probability of dumping a sick contact
    double  InitialDiseasePrevalence = 0.1000; //EN Disease prevalence at start of simulation
    double  TransmissionEfficiency = 0.10;     //EN Probability of transmitting disease during a social contact
    double  LatentPhaseDuration = 5.0;         //EN Duration of latent phase
    double  ContagiousPhaseDuration = 10.0;    //EN Duration of contagious phase
    double  ImmunePhaseDuration = 20.0;        //EN Duration of immune phase
    logical EnableChangeContactsEvent = TRUE;  //EN Enable the change contacts event
};
```
The `ini` file for the `Base` run looks like this:
```ini
[OpenM]
RunName = Base

[Parameter]
InitialDiseasePrevalence = 0.1000

[Microdata]
ToDb = yes
Host = report_time, disease_phase, age_infected
```
In the `Base` run, 501 Hosts are infected at the beginning of the simulation. The time evolution of the susceptible population in Run 1 (`Base` with shared random streams) looks like this:
Before the simulation starts, all 5,000 Hosts are in the susceptible state. Next, 501 Hosts are randomly infected and change disease status from susceptible to latent. This is the cause of the immediate drop in the count of susceptible Hosts from 5,000 to 4,499 in the chart. The following plateau is caused by the length 5.0 latent period of the 501 infected hosts, during which no forward infections occur.

At the end of the plateau, the 501 Hosts leave the latent phase and enter the infectious phase. At that point, the infection spreads through the social network, progressively reducing the susceptible population. At around time 30.0, almost all Hosts are infected and very few susceptible Hosts remain. At time 35.0 the 501 initially infected Hosts lose protective immunity and become susceptible once again. At time 40.0 and thereafter, progressively more Hosts lose protective immunity. The disease then propagates among the new and growing population of susceptible Hosts. The number of susceptible Hosts peaks at around time 57, and decreases subsequently as the disease infects more and more susceptibles. At around time 84 almost no susceptible hosts remain, as in the first wave at time 30. A new epidemic wave, with initial conditions smoothed out, commences towards the end of the simulation.
The same chart for Run 3 (`Base` with local random streams) looks like this:

This `Base` run with local random streams looks very similar to the `Base` run with shared random streams immediately above, despite all RNG draws differing during the simulation phase of the two runs. This is because the size of the population and the number of RNG draws dominate the effects of individual idiosyncratic RNG draws in the two runs. Put another way, the two versions of IDMM are statistically equivalent.
Exactly the same 501 Hosts are infected in Run 1 and Run 3, despite Run 3 using the version of IDMM with local random streams. That's because the initial infections are implemented during the creation of the starting population, before the simulation begins. Calls to RNG functions before the simulation starts use the usual shared random streams, not local random streams, as explained in RNG use before simulation.
[back to illustrative example]
[back to topic contents]
The `Variant` runs are the same as the `Base` runs, except for a very slightly higher probability of initial infection: 0.1001 versus 0.1000 in the `Base` runs. The `ini` file for the `Variant` runs looks like this:
```ini
[OpenM]
RunName = Variant

[Parameter]
InitialDiseasePrevalence = 0.1001

[Microdata]
ToDb = yes
Host = report_time, disease_phase, age_infected
```
In the `Variant` run, 503 Hosts are infected at the beginning of the simulation, 2 more than in the `Base` run. In the `Variant` runs, 501 of the 503 infected Hosts are identical to those infected at the beginning of the `Base` runs. The same 2 additional Hosts are infected in `Variant` runs 2 and 4.
The time evolution of the susceptible population in Run 2 (`Variant` with shared random streams) looks like this:

As might be expected, the aggregate effect of 503 versus 501 initially infected Hosts is not visually perceptible.
The time evolution of the susceptible population in Run 4 (`Variant` with local random streams) looks like this:

Run 3 and Run 4 are visually indistinguishable, despite having entirely different RNG draws in the simulation phase. They are, by construction, statistically equivalent.
[back to illustrative example]
[back to topic contents]
Base-Variant coherence is measured by counting the number of Hosts having an identical time of most recent infection (attribute `age_infected`) in the `Base` and `Variant` runs. `age_infected` is initialized to -1.0 to mean 'no previous infection'. When a `Host` is infected, `age_infected` is set to the exact (fractional) current `age`.
If `age_infected` differs, the infection history of the `Host` is clearly different in `Base` and `Variant`. If `age_infected` is identical, the `Host` almost certainly experienced the same infection history in `Base` and `Variant`.
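In model code this measurement attribute is simple to maintain. The following sketch is illustrative only (the event name is an assumption, not actual IDMM code); it shows `age_infected` being set from the built-in `age` attribute when an infection occurs:

```cpp
// Illustrative sketch; the event name is an assumption.
void Host::InfectionEvent()
{
    // Record the exact (fractional) current age at this, the most recent, infection.
    // age_infected keeps its initial value of -1.0 if the Host is never infected.
    age_infected = age;

    // ... the transition to the latent disease phase would occur here ...
}
```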
Base-Variant coherence is evaluated at each of 101 time points in the runs by merging the `Base` and `Variant` microdata population snapshots which are output at each time point.
The time evolution of Base-Variant coherence with shared random streams (runs 1 and 2) looks like this:
The plateau in coherence at the beginning of the chart looks to be 5,000 but is actually 4,998. As described above, exactly 2 Hosts differ between `Base` and `Variant` at the beginning of the runs. Base-Variant coherence is lost precipitously when the infectious phase begins, due to shared random streams among all Hosts in the population.
The time evolution of Base-Variant coherence with local random streams (runs 3 and 4) looks like this:
As in the previous chart, Base-Variant coherence starts at 4,998, with 2 Hosts differing at the beginning of the simulation. Unlike the previous chart, Base-Variant coherence remains very high until around time 50. After that time, coherence is gradually lost as the effects of the tiny difference in initial conditions propagate causally through the contact network and affect progressively more of the population. The eventual loss of coherence in IDMM, even with local random streams, is an intrinsic property of this highly interacting and connected population: a 'butterfly effect'.
[back to illustrative example]
[back to topic contents]
under construction
Refer to the coherence example using IDMM in Microdata Output.
`age_infected` is the age of the `Host` at the most recent infection, and is initialized to -1. `age_infected` was added to IDMM to measure decoherence between runs.
It turned out that `disease_phase` did not work well to measure decoherence between runs. A custom version of `get_microdata_key()` was added to produce a unique microdata key. A microdata record is output at each time unit to measure the evolution of coherence between two runs during the simulation.
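A minimal sketch of such a custom key, assuming it combines the built-in `entity_id` with the `report_time` attribute so that each `Host` gets a distinct key at each output time (the actual formula used in the modified IDMM may differ):

```cpp
// Illustrative sketch of a custom microdata key for Host.
// One microdata record is output per Host per time unit, so combine the
// built-in entity_id with report_time to make the key unique.
uint64_t Host::get_microdata_key()
{
    return 10000 * (uint64_t)entity_id + (uint64_t)report_time;
}
```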
- Similar example here, but using two runs Run1 and Run2 of IDMM, with `InitialDiseasePrevalence` very slightly higher to generate at least one (but very few) additional infected Hosts at the beginning of the simulation.
- Use Microdata Output to show decoherence in disease phase at end of runs.
- Perhaps, measure coherence between two runs over time, by outputting microdata at time steps.
- Turn on local RNG for `Host` entities, repeat Run1 and Run2.
- Use Microdata Output to show coherence in disease phase at end of runs (or evolution over time).