-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
Description
In #45222, the default sort algorithm was changed to an algorithm which uses randomness (at least, when Random
is loaded), and therefore can modify the global RNG state.
This can lead to surprising bugs in cases where sort!
is used with the expectation that it will not modify the global RNG state. For example: FluxML/Zygote.jl#1351, where Zygote's metaprogramming (IRTools
) calls sort
when running the program for the first time. This means that a primal program gives a different output when a function is differentiated for the first time, as compared to subsequent runs, even when the global RNG is initialized to the same thing in both cases. This is incorrect behaviour because Zygote should not affect the primal's random draws.
Arguably, this is a bug in IRTools
: IRTools
should either switch to a sort!
algorithm that has a semantic guarantee of determinism, or somehow perform the default sort
but with a different RNG that doesn't affect the global one (note that the latter is difficult to do because Julia does not have native splittable RNGs, and extra global RNGs hidden in packages sounds like a bad idea, because a computation's result would ideally be deterministic given a) the global RNG's initial state, and b) any user-provided RNGs).
Also, to clarify, for a sorting algorithm that needs randomness to make calls to rand
certainly seems like the right thing to do, so the discussion in #45222 (comment) makes sense to me. (Although, perhaps there should be a facility for user-provided RNGs.).
However, in practice, the question is whether breaking the implicit assumption that the default sort
is deterministic could lead to bugs. For example, the REPL
calls sort!
(logging mine, this is on 1.9.0-beta2
):
If sort!
does not have a semantic guarantee of not affecting the global RNG, would it be correct to say that Random.seed!(123); rand();
could have different outputs in the REPL v.s. a script?
This does not actually happen in the above example,, because the polyalg uses InsertionSort
for small inputs, which does not make any calls to rand
. (This is why the Zygote
bug linked above was a bit difficult to discover; presumably the AST had to be big enough). However, relying on this fact seems dangerous: the question is really about getting the semantic guarantees right, and whether modifying any existing (implicit) guarantees on a function such as sort!
that is used in internals could have unforeseen effects.
Please let me know if I'm missing anything big in the story here, it's very possible:) But since this change is slated for Julia 1.9, I thought it would be good to raise this concern asap in case there's something to it.
Edit: see discussion in #45222 (comment) which is highly relevant