-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Reproducible jitter #1996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reproducible jitter #1996
Conversation
@krlmlr - Great enhancement. As soon as this gets accepted, we'll plan to use it. |
R/position-jitter.r
Outdated
trans_x <- if (params$width > 0) function(x) jitter(x, amount = params$width) | ||
trans_y <- if (params$height > 0) function(x) jitter(x, amount = params$height) | ||
trans_x <- if (params$width > 0) function(x) with_seed(params$seed, jitter(x, amount = params$width)) | ||
trans_y <- if (params$height > 0) function(x) with_seed(params$seed, jitter(x, amount = params$height)) | ||
|
||
transform_position(data, trans_x, trans_y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the with_seed()
should probably go here, otherwise the x and y jitter will use the same random values
R/utilities.r
Outdated
@@ -299,6 +299,16 @@ find_args <- function(...) { | |||
# global data | |||
dummy_data <- function() data.frame(x = NA) | |||
|
|||
with_seed <- function(seed, code) { | |||
if (!is.null(seed)) { | |||
old_seed <- get0(".Random.seed", globalenv()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this definitely ok to do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yihui: knitr seems to cache/restore .Random.seed, too. The caching in knitr is explicit, but it looks like the restore happens implicitly. Could you please comment if my way of restoring the random seed is safe?
with_seed <- function(seed, code) {
if (!is.null(seed)) {
old_seed <- get0(".Random.seed", globalenv())
if (!is.null(old_seed)) {
on.exit(assign(".Random.seed", old_seed, globalenv()), add = TRUE)
}
}
code
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is safe (I'd add mode = 'integer'
to get0()
, though), but it seems that you forgot to set.seed(seed)
before code
is evaluated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. The code worked by accident because resetting the random seed after applying the jitter also made it reproducible.
Could you please also add that example to the examples in the doc? I also wonder if |
👍 to this PR, also the correct value is 42. |
@jimhester: Some think it's 4. |
I still think that the seed should be random each time you call geom_jitter - you should have to explicitly opt in to reusing the seed by constructing your own position_jitter (or setting the seed). Otherwise you'll get exactly the same offsets every single time. That's not terrible but it feels like a different geom to jitter (and you could just use an existing precomputed pseudo random sequence) |
in this mode, the random seed is not changed, but reset after processing
It's difficult to achieve both: consistent jitter by default, and jitter dependent on random seed. I have added a mode (now default) that reuses the current random seed and resets it afterwards. |
I think my original proposal to just pick a random seed would be simpler. |
@hadley: Do you mean position_jitter <- function(..., seed = sample.int(2147483647L, 1L)) ? I suspect this isn't going to work as expected, two consecutive |
Right, that's what I mean. If you want to reuse the same jitter, you'll need to share a |
Fixed and clarified example, the default now picks a random seed and advances the RNG. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please make the one change and merge/rebase and then we'll get it in.
NEWS
Outdated
@@ -1,3 +1,6 @@ | |||
* `position_jitter()` gains a `seed` argument that allows specifying a random |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, bullet should go in NEWS.md
Bump |
In pull tidyverse#1996 position_jitter got a new argument called seed. position_jitterdodge might also benefit from having a seed.
This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/ |
by adding a
seed
argument toposition_jitter()
. Useful to apply the same jitter to both points and labels.We should probably move
with_seed()
towithr
before merging.Reprex: