Futures: Parallel random number generation (RNG) #60

Closed
@HenrikBengtsson

To prevent statistically unsound random numbers from being produced when running in parallel, futureverse asks the developer to declare when their code needs the RNG. If that is not declared, it still checks whether the RNG was used (i.e. whether .Random.seed was updated). If it was, a warning is produced.
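To illustrate the kind of check described above, here is a minimal base R sketch that compares .Random.seed before and after evaluating an expression. The helper name rng_was_used() is hypothetical, and this is only a simplified illustration of the idea, not how the future package implements its check.

```r
# Hypothetical helper (not part of any package): report whether evaluating
# `expr` changed the RNG state stored in .Random.seed.
rng_was_used <- function(expr) {
  # .Random.seed lives in the global environment and may not exist yet,
  # in which case get0() returns NULL
  before <- get0(".Random.seed", envir = globalenv(), inherits = FALSE)
  force(expr)  # evaluate the lazily passed expression
  after <- get0(".Random.seed", envir = globalenv(), inherits = FALSE)
  !identical(before, after)
}

rng_was_used(rnorm(1))  # TRUE: drawing a random number updates .Random.seed
rng_was_used(1 + 1)     # FALSE: no random numbers were drawn
```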

Here is an example:

> library(pbapply)
> future::plan("multisession")
> y <- pblapply(1:2, FUN = rnorm, cl = "future")
  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s  
Warning messages:
1: UNRELIABLE VALUE: One of the 'future.apply' iterations ('future_lapply-1') unexpectedly generated random numbers without declaring so. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify 'future.seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use 'future.seed = NULL', or set option 'future.rng.onMisuse' to "ignore". 
2: UNRELIABLE VALUE: One of the 'future.apply' iterations ('future_lapply-2') unexpectedly generated random numbers without declaring so. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify 'future.seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use 'future.seed = NULL', or set option 'future.rng.onMisuse' to "ignore". 

To avoid this, a quick fix is to always pass future.seed = TRUE. That sets up a parallel RNG regardless of whether random numbers are generated or not. The downside is that doing so can be computationally expensive. To give the developer control, you'd have to introduce a new argument allowing them to control the future.seed argument to future_lapply() and the like. One way to do that without adding a new argument could be via attributes, e.g.

y <- pblapply(1:2, FUN = rnorm, cl = structure("future", future.seed = TRUE))
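As a rough sketch of how the receiving side could read such an attribute and forward it, assuming a hypothetical helper future_seed_from_cl() and a default of FALSE when the attribute is absent (neither is existing pbapply code):

```r
# Hypothetical helper: pull a 'future.seed' attribute off `cl`, falling back
# to `default` when the attribute is not set.
future_seed_from_cl <- function(cl, default = FALSE) {
  seed <- attr(cl, "future.seed", exact = TRUE)
  if (is.null(seed)) default else seed
}

cl <- structure("future", future.seed = TRUE)

# Attributes make a strict comparison fail, so a check for the "future"
# backend would need to ignore them.
identical(cl, "future")             # FALSE
identical(as.vector(cl), "future")  # TRUE

# Forward the setting to future_lapply() when future.apply is installed.
if (requireNamespace("future.apply", quietly = TRUE)) {
  y <- future.apply::future_lapply(
    1:2, FUN = rnorm,
    future.seed = future_seed_from_cl(cl)
  )
}
```

With a plain cl = "future", the helper returns the default, so existing calls would behave as they do today.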
