Local Random Streams
The local_random_streams option implements distinct random number generator streams for individual entities. This can help maintain run coherence for models which simulate multiple entities together, but requires additional memory and a unique entity key.
Related topics:
- Model Code
- Microdata Output: Activating and using microdata output

Topic contents:
- Background
- Outline
- Syntax and Use: How to activate and use local random streams
- Illustrative Example: Illustrative example using IDMM
Model code can draw random values from selected statistical distributions using built-in random number generator (RNG) functions, for example:
double x = RandUniform(1);
double y = RandNormal(2);
double z = RandPoisson(3);
These functions return pseudo-random streams of numbers. The streams appear random but are actually produced by a deterministic algorithm which generates a fixed sequence of values. That algorithm knows which value to return next by maintaining an internal state which changes from one function call to the next.
The sequence of numbers returned depends on the SimulationSeed for the run, on the run member (aka sub, replicate), and on the case for case-based models. The small integer argument to an RNG function specifies a distinct underlying random number stream which produces values independent of those produced by other random number streams. This avoids spurious interactions among unrelated random processes in the model. For example, values returned by calling RandUniform(4) in a Fertility module will not affect values returned by calling RandUniform(6) in a Migration module.
Independent random number streams can reduce statistical noise in the difference of two model runs, reducing the run size needed to obtain reliable results for run differences.
They also make microdata comparisons of two runs correspond better with model logic.
For example, if there is no logical dependence between Fertility and Migration in the model, changing a Fertility parameter should not, logically, affect Migration. Had the same random stream, e.g. RandUniform(4), been used in both Fertility and Migration, a call to RandUniform(4) in Fertility would affect the value returned in a subsequent call to RandUniform(4) in Migration. That would produce a spurious (but statistically neutral) interaction between Fertility and Migration. That's avoided by using a different random stream in Migration, e.g. by calling RandUniform(6) to specify stream 6 rather than stream 4.
Spurious correlation of random number streams can be avoided by using a distinct random number stream in each call to an RNG function throughout model code.
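As a minimal sketch of this convention (the variable names are illustrative; the stream numbers are those used in the example above):

// In the Fertility module: stream 4 is reserved for this call site.
double fertility_draw = RandUniform(4);

// In the Migration module: stream 6 is reserved for this call site,
// so draws in Fertility cannot perturb draws in Migration.
double migration_draw = RandUniform(6);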
However, a model which simulates multiple instances of an entity kind together, e.g. multiple Person entities, could have spurious interactions of random streams among those entities. For example, a call to RandUniform(4) in Fertility in Person A will affect the result of a subsequent call to RandUniform(4) in Fertility in Person B, because the same random stream 4 is used in both.
In a time-based model with many entities, a spurious interaction could extend from one entity to the entire population.
Such spurious interactions do not affect the statistical validity of aggregate model results, but they can create additional statistical noise in run comparisons, and produce differences at the microdata level which are not explained by model logic.
This issue can be resolved by maintaining independent local random streams in each entity, rather than using global random streams shared among the entities which are simulated together.
For example, using local random streams, a call to RandUniform(4) in Person A uses a different random stream from a call to RandUniform(4) in Person B.
Local random streams require additional memory in each entity to maintain the state of the pseudo-random number generator for each stream.
This additional memory can be significant for time-based models with many entities and many random streams.
Local random streams also require distinct initialization in each entity, so that different entities produce different random streams.
That requirement is met by providing a function get_entity_key() which returns a unique key for each entity. The entity key is used to initialize local random streams independently in each entity before it enters the simulation. The entity key needs to be stable from one run to another so that the local random streams are the same for the same entity in two different runs. The implementation of get_entity_key is, in general, model dependent.
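For illustration, a minimal sketch of a model-specific get_entity_key() follows, assuming it is supplied as an entity member function returning a 64-bit key; the entity kind Person and the attribute person_identifier are illustrative, not from an actual model:

uint64_t Person::get_entity_key()
{
    // The key must be unique to the entity and stable across runs,
    // e.g. an identifier assigned deterministically when the entity
    // is created, or read from input microdata.
    return person_identifier;
}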
Given these trade-offs, local random streams are not implemented by default in OpenM++. Instead, a statement like

options local_random_streams = Person;

causes OpenM++ to implement local random streams for the specified entity.
under construction
options local_random_streams = Host;
options local_random_streams = Ticker;

Multiple such statements are allowed, one for each entity for which local random streams are desired.
During model build, a message like
Entity 'Host' has 11 local random streams, of which 1 are Normal
will be issued for each entity with local random streams.
If an entity with local RNG streams calls RandUniform, RandNormal, or RandLogistic to initialize attributes before it enters the simulation, e.g. in a Start function, the built-in function initialize_local_random_streams() must be called first. The function initialize_local_random_streams() calls get_entity_key(), so be sure that any attributes used by get_entity_key() have been assigned before the call. Otherwise, a run-time error like

Simulation error: RandUniform called with uninitialized local random streams.

will occur. If there are no RNG calls before the entity enters the simulation, it is not necessary to call initialize_local_random_streams() when initializing the entity. Model code can call initialize_local_random_streams even if the entity has no local RNG streams; in that case the call has no effect.
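A minimal sketch of such a Start function follows, assuming the usual openM++ pattern with initialize_attributes() and enter_simulation(); the argument, the attribute names, and the stream numbers are illustrative:

void Person::Start(int unique_key)
{
    // Initialize all attributes to default values (standard openM++ call).
    initialize_attributes();

    // Assign the attribute(s) used by get_entity_key() first,
    // so that the entity key (and hence the stream seeding) is well-defined.
    person_identifier = unique_key;

    // Seed this entity's local random streams; required before any RNG call
    // made prior to entering the simulation.
    initialize_local_random_streams();

    // RNG calls are now safe during the remaining initialization.
    initially_infected = (RandUniform(11) < 0.1);

    // Have the entity enter the simulation (standard openM++ call).
    enter_simulation();
}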
- Random streams behave normally in PreSimulation, and in Simulation (to create a starting population, for example, as in IDMM).
- Random streams behave normally in entities which were not named in a local_random_streams option.
- For entities named in local_random_streams, the streams used in the entity are maintained at the entity level. Streams are seeded using the value returned by get_entity_key(), combined with the run member (aka sub, replicate) and either the run seed for a time-based model or the case seed for a case-based model.
under construction
This example is divided into the following sections:
- Summary
- IDMM overview
- Base run
- Variant run
- Base-Variant coherence
- IDMM differences in this example
This example illustrates the effect of local random streams vs. global random streams on simulation decoherence.
It uses the time-based IDMM model with minor modifications. Microdata is output at 100 time points during the simulation, and later merged and compared between Base and Variant runs to measure how decoherence evolves as the simulation progresses.
Four runs are used in this example:
- Base run with global random streams
- Variant run with global random streams
- Base run with local random streams
- Variant run with local random streams

The 4 runs are very similar. All 4 runs have the same number of hosts and an identical contact network. A single parameter differs between Variant and Base. The change in that parameter causes two entities to differ at the start of the Variant simulation.
[back to illustrative example]
[back to topic contents]
IDMM simulates an interacting dynamic contact network of Host entities, together with a disease which can be transmitted over that contact network. The contact network is initialized randomly at the start of the simulation. During the simulation, each Host interacts with other Hosts during a contact event. Each Host can change its connected Hosts in a contact change event. Optionally, a Host can change a connected Host in a contact event, if that host is infectious. During a contact event, the disease can propagate between the two Hosts, depending on the disease status of each.

An infected Host progresses through 4 disease phases of fixed duration: susceptible, latent, infectious, immune. On infection, the Host enters the latent phase, during which it is both asymptomatic and non-infectious. After the latent phase, the Host enters an infectious phase during which it can infect another Host during a contact event. After the infectious phase, the Host enters an immune phase. After the immune phase, the Host returns to the susceptible state.

Before the simulation starts, all Host entities are in the susceptible state. At the beginning of the simulation, a portion of the Host population is randomly infected.

For this example, some mechanical changes were made to the version of IDMM in the OpenM++ distribution.
[back to illustrative example]
[back to topic contents]
The Base run consists of 5,000 Hosts simulated for 100 time units, with the initial probability of infection set to 0.1000. The ini file for the Base run looks like this:
[OpenM]
SubValues = 1
Threads = 1
RunName = Base
[Parameter]
NumberOfHosts = 5000
SimulationEnd = 100
InitialDiseasePrevalence = 0.1000
[Microdata]
ToDb = yes
Host = report_time, disease_phase, age_infected
501 Hosts are infected at the beginning of the simulation in the Base run.

The time evolution of the susceptible population in Run 1 (Base with global random streams) looks like this:

The same chart for Run 3 (Base with local random streams) looks like this:
The Variant run is the same as the Base run, except for a very slightly higher initial probability of infection: 0.1001 compared to 0.1000 in Base. The ini file for the Variant run looks like this:
[OpenM]
SubValues = 1
Threads = 1
RunName = Variant
[Parameter]
NumberOfHosts = 5000
SimulationEnd = 100
InitialDiseasePrevalence = 0.1001
[Microdata]
ToDb = yes
Host = report_time, disease_phase, age_infected
503 Hosts are infected at the beginning of the simulation in the Variant run. That's 2 more than in the Base run.

The time evolution of the susceptible population in Run 2 (Variant with global random streams) looks like this:

The time evolution of the susceptible population in Run 4 (Variant with local random streams) looks like this:
[back to illustrative example]
[back to topic contents]
The time evolution of coherence between Base and Variant runs with global random streams (runs 1 and 2) looks like this:

The plateau in the coherence count at the beginning of the chart is actually 4998, which is too close to 5000 to see in the chart. As described above, only 2 Hosts differ between Base and Variant at the beginning of the runs.
The time evolution of coherence between Base and Variant runs with local random streams (runs 3 and 4) looks like this:
[back to illustrative example]
[back to topic contents]
under construction
Refer to the coherence example using IDMM in Microdata Output.

age_infected is the age of the Host at its most recent infection, and is initialized to -1. age_infected was added to IDMM to measure decoherence between runs. It turned out that disease_phase did not work well to measure decoherence between runs, because

A custom version of get_microdata_key() was added to produce a unique microdata key. A microdata record is output at each time unit to measure the evolution of coherence between two runs during the simulation.
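A minimal sketch of what such a custom key might look like follows, assuming get_microdata_key() is supplied as an entity member function returning a 64-bit key and that report_time is an integer-valued attribute; the actual formula used in the modified IDMM may differ:

uint64_t Host::get_microdata_key()
{
    // Combine the built-in entity_id with the reporting time so that each
    // Host produces a distinct microdata key at each output time point.
    return 1000 * (uint64_t) entity_id + (uint64_t) report_time;
}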
- Similar example here, but using two runs Run1 and Run2 of IDMM, with InitialDiseasePrevalence very slightly higher to generate at least one (but very few) additional infected Hosts at the beginning of the simulation.
- Use Microdata Output to show decoherence in disease phase at the end of the runs.
- Perhaps measure coherence between the two runs over time, by outputting microdata at time steps.
- Turn on local RNG for Host entities, repeat Run1 and Run2.
- Use Microdata Output to show coherence in disease phase at the end of the runs (or its evolution over time).