add `Random.jump(rng)` API #58353

rfourquet · 2025-05-08T10:14:14Z

We have long had methods for RNG "jumps ahead", i.e. advancing the state by a given number of "steps", but no good API for that.

The only public API is Future.randjump(r::MersenneTwister, steps::Integer), and there are also functions for Xoshiro which are not public (Random.jump_128 and friends).

The following generic API is implemented here:

* `Random.jump(rng)` to jump by a reasonable default number of steps
* `Random.jump(rng; by::Real)` to jump by `by` steps
* `Random.jump!(rng; [by])` to equivalently jump in-place
* `Random.jump(rng, dims...; [by])` to create an array of jumped RNGs

In old Julia versions, there was also a method of randjump returning an array, but the first element of this array was the passed argument itself; the version here does not alias its argument.
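As a rough illustration of the array form without aliasing, here is a minimal sketch built only on the existing public `Future.randjump`. The helper name `jumped_rngs` and its `steps` keyword are hypothetical and are not the implementation in this PR; the default of `big(10)^20` is chosen because it is the value used for `MersenneTwister` elsewhere in this discussion.

```julia
using Future, Random

# Hypothetical helper (not this PR's code): return `n` generators, each one
# jumped `steps` further than the previous one. The input `rng` is neither
# mutated nor stored in the result, so there is no aliasing.
function jumped_rngs(rng::MersenneTwister, n::Integer; steps::Integer=big(10)^20)
    out = Vector{MersenneTwister}(undef, n)
    prev = rng
    for i in 1:n
        # Jump from the previous result, so earlier jumps are reused.
        prev = Future.randjump(prev, steps)
        out[i] = prev
    end
    return out
end

base = MersenneTwister(42)
streams = jumped_rngs(base, 3)
```

Each element is derived from the previous one, so building `n` streams costs `n` jumps in total.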

There are two kinds of integers one would wish to pass: dimensions for the array version, and the number of steps.
Using jumps is relatively "niche", but needing to fiddle with the number of steps is even more niche. It's expected that in the vast majority of cases, a good default is enough.

Some APIs in other languages have jump (e.g. 2^128 steps) and long_jump (e.g. 2^192 steps), or leap in Java, for more complicated cases; for example each process gets a jumped RNG via long_jump, and within each process, each thread gets a jumped RNG via jump. But this does not scale well if more kinds of jumps are needed: should huge_jump be introduced? For these rare cases where the default number of steps is not sufficient, it seems better to let the programmer explicitly specify the number of steps via an integer.
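The two-level scheme can be pictured with plain step arithmetic. This is a sketch under the step sizes quoted above (2^192 per process, 2^128 per thread); `stream_offset` is a hypothetical name used only for illustration and touches no RNG.

```julia
# Hypothetical helper: offset (in steps) of the stream for thread `t` of
# process `p`, assuming a long jump of 2^192 steps per process and a jump
# of 2^128 steps per thread.
stream_offset(p::Integer, t::Integer) = p * big(2)^192 + t * big(2)^128

# All (process, thread) pairs below land on distinct offsets.
offsets = [stream_offset(p, t) for p in 0:3 for t in 0:7]
```

With these sizes, each process has room for 2^64 thread streams before it reaches the start of the next process's range.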

There is even a third kind of integer one might want to pass: in Random.jump_128(x::Xoshiro, i::Integer), i represents the number of times a jump of size 2^128 is applied, because Xoshiro doesn't support an arbitrary number of steps. This is not supported in the proposed API, because 1) it's trivial for the user to implement herself, and 2) in probably most use cases, using the array version will be a valid alternative, and more efficient because previous computations are not wasted (like in [Random.jump_128(x, i) for i=1:num_tasks] vs Random.jump(x, num_tasks)).

Another argument in favor of this API is that it mirrors the proposed Random.fork(rng, dims...) function from #58193.


Seelengrab · 2025-05-08T14:34:10Z

stdlib/Random/src/MersenneTwister.jl

-    j = _randjump(r, Random.DSFMT.calc_jump(steps >> 1))
-    j.adv_jump += steps
-    j
+function jump(rng::MersenneTwister; by::Real=NaN)


Why make this Real, and not default to e.g. -1? Does jumping half of a step make sense?

For convenience; I typed big(2)^128 or big(10)^20 (for MersenneTwister) too often. Jump would often be a power of 2, so accepting 2.0^128 is nice.

I see - so what actually happens when e.g. by=1.5 is passed in?

An error is thrown; in this very method for example, the error happens when trying to convert by to BigInt. I put this in this docstring:

`by` should be an integer, but can be expressed via non-`Integer` types for convenience, e.g. `by = 2.0^128`.

Ah, I missed that part of the docstring - I think it would be good to explicitly mention the error case when the conversion fails.
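For reference, the conversion behaviour discussed in this thread can be reproduced with `BigInt` alone. This is a small sketch; `steps_as_integer` is a hypothetical name, not a function from the PR.

```julia
# Hypothetical helper: normalise a `by` argument to an exact integer step count.
# `BigInt(x)` throws an InexactError when `x` is not integer-valued.
steps_as_integer(by::Real) = BigInt(by)

ok = steps_as_integer(2.0^128) == big(2)^128   # powers of two are exact in Float64

# A fractional step count is rejected by the conversion itself.
err = try
    steps_as_integer(1.5)
    nothing
catch e
    e
end
```

Here `err` ends up holding the `InexactError` raised by the conversion, which is the error case the docstring could mention.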

Seelengrab · 2025-05-08T14:36:22Z

stdlib/Random/src/Xoshiro.jl

+    elseif by == 2.0^192
+        jump_192!(rng)
+    else
+        throw(ArgumentError("$(typeof(rng)) RNGs can be jumped only by 2^128 or 2^192 steps"))


Instead of throwing an error, would it make sense to use the step argument as "how many multiples of the stepsize (2^128) should be jumped"?

I'd rather not. I think there needs to be a way to specify the number of steps (different RNGs will have different numbers of steps available), and to keep the API as simple as possible, I prefer not having another integer specify the multiple. If needed, we could eventually have Random.jump(xoshiro, by=3*big(2)^128) automatically detect that it amounts to 3 jumps of 2^128 steps. But maybe I misunderstood your suggestion?

I was only referring to the special case of Xoshiro, not the general case. I agree that limiting the general case doesn't really make sense. I also meant interpreting the existing by/step argument from this PR as that "multiple of 2^128", not adding another argument.

This would prevent using the 2^192 jump, but also it complicates the API:

by is usually the number of steps, except when specified otherwise, where it's interpreted as being a multiple of a pre-defined number of steps

(This formulation above is not clear enough, but that was to give the idea...)

Hmm, reading that, I can see that this is confusing. Is it possible to make the jump for Xoshiro arbitrary instead? It just feels a bit weird to have this specific limitation.

Yes, it would be possible, but I don't currently plan to implement that myself. I agree it feels somewhat weird, but it's not really a problem in practice. Typically libraries only provide jump and long_jump for two different numbers of steps, and that's plenty for the vast majority of use cases. We could easily add a few more specific ones though (2^64, etc.)
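To make the "detect a multiple" idea from this thread concrete, here is a sketch of how a step count could be classified as a repetition of one of the two supported jump sizes. `jump_plan` and both constants are hypothetical; this is not code from the PR and it does not call any Xoshiro internals.

```julia
# Hypothetical constants: the two jump sizes named in the error message above.
const JUMP_128 = big(2)^128
const JUMP_192 = big(2)^192

# Hypothetical helper: map a requested step count to (jump kind, repetitions),
# preferring the larger jump when both divide the request.
function jump_plan(by::Real)
    n = BigInt(by)  # throws InexactError for non-integer values
    n > 0 || throw(ArgumentError("number of steps must be positive"))
    for (name, size) in ((:jump_192, JUMP_192), (:jump_128, JUMP_128))
        q, r = divrem(n, size)
        r == 0 && return (name, q)
    end
    throw(ArgumentError("steps must be a multiple of 2^128 or 2^192"))
end

plan = jump_plan(3 * big(2)^128)   # three repetitions of the 2^128 jump
```

Under this scheme `by` keeps its single meaning as a number of steps, and the 2^192 jump remains reachable.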

vtjnash · 2025-05-08T16:26:25Z

Do we need this now that we have the PR for fork? I think we should consider moving this jump code into a package JumpRandom or AdvancedRandom, and keep this stdlib "lighter".

rfourquet · 2025-05-08T20:03:59Z

That's an interesting idea, and indeed I believe fork should be preferred in most cases.
MersenneTwister doesn't have a native method for fork, but I had some ideas on how to have a generic fork eventually.

One thing is that Future.randjump already offers the jump functionality, so in theory we can't drop it, and adding the API here isn't much more code. Still, we could just keep that code, remove Random.jump_128/192(::Xoshiro) which is internal, and move it to a package, together with this PR.

However, as the code is already here, and jumping ahead is relatively standard, keeping it in Random has some appeal.

ararslan · 2025-05-08T20:34:00Z

stdlib/Random/src/MersenneTwister.jl

-    j.adv_jump += steps
-    j
+function jump(rng::MersenneTwister; by::Real=NaN)
+    isnan(by) && (by = 2.0^128)


Why not set the default value for by to be 2.0^128 rather than NaN if we're already changing the value to be that when the default is provided?

Good point. The weak reason is that the array version takes the same default NaN and passes it on unchanged to the non-array version (this method above), which then needs to handle NaN anyway. The alternative would be, in the array version, to call either jump(rng) or jump(rng; by) depending on whether or not by was passed explicitly.
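The forwarding pattern described here can be sketched independently of any RNG. All names below (`DEFAULT_STEPS`, `resolve_steps`, `resolve_steps_array`) are hypothetical and only model how a `NaN` sentinel travels from the array method to the scalar one.

```julia
# Hypothetical default, matching the 2.0^128 used in the diff above.
const DEFAULT_STEPS = big(2)^128

# Scalar version: the only place where the NaN sentinel is resolved.
resolve_steps(; by::Real=NaN) = isnan(by) ? DEFAULT_STEPS : BigInt(by)

# Array version: forwards `by` untouched, so it needs no branch on whether
# the caller passed `by` explicitly.
resolve_steps_array(n::Integer; by::Real=NaN) = [k * resolve_steps(; by) for k in 1:n]

defaults = resolve_steps_array(3)
```

The trade-off is that the default is visible only in the method body rather than in the signature, which is the point raised in the question above.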

rfourquet added the randomness (Random number generation and the Random stdlib) and feature (Indicates new feature / enhancement requests) labels May 8, 2025

rfourquet changed the title ~~add Random.jump(rng) API~~ add Random.jump(rng) API May 8, 2025

rfourquet force-pushed the rf/jump branch from 06464a4 to 8262550 Compare May 8, 2025 14:04

Seelengrab reviewed May 8, 2025

ararslan reviewed May 8, 2025
