Skip to content

Conversation

@coolreader18
Copy link
Collaborator

@coolreader18 coolreader18 commented Apr 15, 2025

Description of Changes

Fixes for timestamps, so that a reducer that happens after another is observed to have a later timestamp. Specifically:

  • when running a scheduled reducer, clamp it so that ctx.timestamp will be at least the timestamp the reducer was scheduled for.
  • when scheduling a reducer for now + delay, clamp now so it is at least the timestamp of the start of the current reducer run.

The first commit is pulled out from the second, to make it clearer what the actual logic change is.

Expected complexity level and risk

2 - timing stuff is intricate and this could mess with assumptions people have made about the way it works, but I'm pretty confident this is strictly a strong guarantee.

Testing

  • Manually ran a repeating reducer "loop" which asserted monotonicity of its ctx.timestamp and set back the system clock during the run.
    • Triggered the assertion prior to this PR.
    • Did not trigger the assertion with this PR.

@coolreader18 coolreader18 requested a review from gefjon April 15, 2025 21:57
Copy link
Contributor

@gefjon gefjon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes look good, and I like your approach. Thanks for splitting into separate commits; that made review super easy!

I would like to see a test, though it may be manual or a smoketest at your discretion. I would write a module with a reducer that re-schedules itself on a very small delay (a zero delay, perhaps?) while decrementing a counter so that it runs a large number of times, and at each run asserts that its new timestamp is >= the previous one, panicking and breaking the loop if not. You should make sure to observe this failing at least once on master, and then run a loop, say, twice that length on this PR branch and see it succeed.

@coolreader18 coolreader18 force-pushed the noa/monotonic-timestamps branch from 023205a to 5096079 Compare April 16, 2025 20:42
@coolreader18
Copy link
Collaborator Author

Alright, added a test. I couldn't get the issue to reproduce naturally, but calling timedatectl set-time <the_past> in the middle of the test run caused it to fail on master, but still succeed on this branch.

@gefjon
Copy link
Contributor

gefjon commented Apr 17, 2025

I also can't get the issue to repro naturally on master. I tried pretty hard, too: I changed the test to run 100000 iterations, set the delay to 0, and still didn't see an out-of-order run. Makes me wonder what was going on with the reporter's setup, that they apparently saw multiple anomalies of several seconds.

I'm not sure there's much value in having the test checked in, in that case; I think a test with a 100% rate of false negatives is either useless or worse than useless. The manual test where you set the system clock is enough to convince me to merge this. Remove the new smoketest, then I'll approve. Thanks!

@coolreader18 coolreader18 force-pushed the noa/monotonic-timestamps branch from 5096079 to 8d685af Compare April 17, 2025 16:21
@coolreader18 coolreader18 enabled auto-merge April 17, 2025 16:22
@coolreader18
Copy link
Collaborator Author

Yeah, I'm wondering if it's a windows-specific issue or something like that. At the very least it's a their-machine-specific issue.

@gefjon
Copy link
Contributor

gefjon commented Apr 17, 2025

Yeah, I'm wondering if it's a windows-specific issue or something like that. At the very least it's a their-machine-specific issue.

They used the Linux names for CLOCK_MONOTONIC and CLOCK_REALTIME, so I assumed they were on Linux, but who knows?

@coolreader18 coolreader18 added this pull request to the merge queue Apr 17, 2025
Merged via the queue into master with commit 1963cd8 Apr 17, 2025
14 of 15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants