Core ROSS RNG streams; Event Time Signature Paradigm; Deterministic Tiebreaker #180
This commit adds a separate array of RNG streams on each LP that are not to be used by developed models. These separate RNG streams let the ROSS engine itself take advantage of the deterministic, reversible RNG behavior that ROSS already manages. Notable example use: deterministic tiebreaking.
Deterministic tiebreaking can be implemented by generating a random value when an event is created. This value is encoded into the ROSS event struct and is used to break any event ties (same destination LP at the same time). Because this separate RNG is only accessed by ROSS, it can be rolled back if the event is RC'd or cancelled. Because of this determinism, any ordering that results from the tiebreaker will be consistent across simulation runs regardless of event delivery order or stragglers. If a regular model-accessed LP RNG were used for this purpose, the tiebreaking sequence would be subject to interference.
This commit adds the deterministic tiebreaker mentioned in an earlier commit, which added the RNGs exclusive to the core ROSS engine.
The deterministic tiebreaker itself is rather simple. When an event is created, a random value is generated from a ROSS core RNG stream. When that event is RC'd or cancelled, that random value is also reversed. Because this stream is only used by the tiebreaking mechanism, the sequence of tiebreaking values it produces is deterministic across simulations. When two events received by an LP carry the same timestamp, the tiebreaker value decides, deterministically, which is processed first.
While the concept itself is simple, and adding the tiebreaker to the event struct is similarly simple, getting the tiebreaker to work with GVT and rollbacks is not.
A better way to think about this tiebreaking mechanism is that it makes sure there is actually no such thing as an event tie. This paradigm shift means that "when" GVT happens is no longer a single TW_STIME value. There is now an event signature struct which contains a timestamp and a tiebreaker value. This signature is all that is necessary to determine the ordering of two events in the simulation. Thus the time of the last GVT is no longer just the one-dimensional virtual timestamp: it also includes a tiebreaker, which deterministically separates events at the same primary timestamp as GVT into "before GVT" and "after GVT" according to their own tiebreaker values.
Rollbacks likewise no longer go back to a single timestamp value, but to a two-dimensional timestamp consisting of the primary timestamp and an event tiebreaker.
As complex as this system is, it does have its benefits:
1) Comparing events in the Splay and AVL trees to determine ordering requires fewer comparisons and thus less compute time.
2) If primary timestamp ties are numerous in a model, rolling back one event at a given timestamp no longer requires rolling back all events with the same timestamp, only those whose tiebreaker values determine that they happen "after" the event that prompted the rollback.
3) Event ties are statistically impossible to force. Because the tiebreaker value is generated using its own independent RNG stream with an extremely long period, two events at the same primary timestamp that ALSO generate an identical random value are nearly impossible. This also means that model developers no longer have to add their own small noise to event timestamps to prevent event ties. That significantly reduces administrative code complexity and the likelihood that a developer will forget to roll back an RNG used for noise and plunge their entire model into non-determinism.
This feature has been walled off behind a CMake define variable: USE_RAND_TIEBREAKER. Set this value to ON during CMake configuration of ROSS and all code enabling the tiebreaking value generation and the timestamp-to-time-signature paradigm shift will be switched on by preprocessor #ifdefs.
Codecov Report
@@ Coverage Diff @@
## develop #180 +/- ##
===========================================
- Coverage 58.17% 57.88% -0.29%
===========================================
Files 33 33
Lines 3565 3588 +23
===========================================
+ Hits 2074 2077 +3
- Misses 1491 1511 +20
Converted to draft PR: there are some technical limitations of this new paradigm that I want to fully understand and work around if possible. Notably, zero-offset event timestamps will break this system. I understand why this happens, and I suspect there may not be a workaround, but I want to make sure this doesn't get merged until it's fully explored.
This has been thoroughly run through the wringer for my last two papers this year. In the ROSS-only paper, there were no issues. In the CODES paper, there were a couple of determinism issues, but I'm 100% certain that these are CODES issues.
This is actually way out of date. Creating a new pull request with the better branch.
This is a big PR. Sorry, there's a lot to unpack here.
Starting with the easiest. This PR adds a new array of RNGs to ROSS LPs. These RNG streams are "Core RNGs" and access and usage of them should be reserved exclusively for the ROSS engine itself. Models should not touch these RNG streams at all. Doing so will break any determinism expected by ROSS patterns that utilize them.
Until now, nothing in ROSS would use such a stream. This PR therefore also includes an implementation of an event tie-breaking mechanism, which lets ROSS handle event ties (events that occur on the same LP at the same virtual time) in a deterministic way that is consistent between simulations, regardless of event delivery order.
Deterministic tie-breaking can be implemented by generating a random value when an event is created. This value is encoded into the ROSS event struct and is used to break any event ties (same destination LP at the same time). Because this separate RNG is only accessed by ROSS, it can be rolled back if the event is RC'd or cancelled. Because of this determinism, any ordering that results from the tiebreaker will be consistent across simulation runs regardless of event delivery order or stragglers. If a regular model-accessed LP RNG were used for this purpose, the tiebreaking sequence would be subject to interference.
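To make the interference argument concrete, here is a minimal sketch in C. It is not ROSS code: `rng_stream`, `lp_state`, and both `tiebreaker_*` functions are hypothetical names, and the generator is a plain 64-bit LCG standing in for whatever reversible generator the engine actually uses. It assumes only that the number of model draws preceding an event's creation can differ from one run to the next.

```c
#include <stdint.h>

/* Hypothetical RNG stream: a 64-bit LCG, used here only for illustration. */
typedef struct {
    uint64_t state;
} rng_stream;

/* Advance the stream and return the raw state (kept simple on purpose). */
static uint64_t rng_next(rng_stream *g)
{
    g->state = g->state * 6364136223846793005ULL + 1442695040888963407ULL;
    return g->state;
}

/* Hypothetical LP with one model-facing stream and one engine-only stream. */
typedef struct {
    rng_stream model_rng; /* drawn from by model code */
    rng_stream core_rng;  /* reserved for the engine */
} lp_state;

/* The model makes some draws, then the engine draws the tiebreaker
   from its own stream. Model activity cannot affect the result. */
static uint64_t tiebreaker_separate(lp_state *lp, int model_draws)
{
    for (int i = 0; i < model_draws; i++)
        (void)rng_next(&lp->model_rng);
    return rng_next(&lp->core_rng);
}

/* Same scenario, but the tiebreaker comes from the model's stream,
   so its value depends on how many draws the model made first. */
static uint64_t tiebreaker_shared(lp_state *lp, int model_draws)
{
    for (int i = 0; i < model_draws; i++)
        (void)rng_next(&lp->model_rng);
    return rng_next(&lp->model_rng);
}
```

With two identically seeded LPs, `tiebreaker_separate` returns the same value whether the model drew zero or three numbers beforehand, while `tiebreaker_shared` does not.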
The deterministic tiebreaker itself is rather simple. When an event is created, a random value is generated from a ROSS core RNG stream. When that event is RC'd or cancelled, that random value is also reversed. Because this stream is only used by the tiebreaking mechanism, the sequence of tiebreaking values it produces is deterministic across simulations. When two events received by an LP carry the same timestamp, the tiebreaker value decides, deterministically, which is processed first.
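The draw-then-reverse behavior can be sketched as follows. This is an illustration under stated assumptions, not the generator ROSS ships: `core_rng`, `core_rng_next`, and `core_rng_reverse` are hypothetical names, and the stream is a 64-bit LCG chosen because its step is easy to invert (the multiplier is odd, so it has a multiplicative inverse modulo 2^64).

```c
#include <stdint.h>

/* Hypothetical engine-only stream; not ROSS's actual generator. */
typedef struct {
    uint64_t state;
} core_rng;

/* LCG constants (the well-known MMIX pair). The multiplier is odd. */
#define LCG_MUL 6364136223846793005ULL
#define LCG_INC 1442695040888963407ULL

/* Multiplicative inverse of an odd value modulo 2^64 (Newton iteration). */
static uint64_t inverse_mod_2_64(uint64_t a)
{
    uint64_t inv = a;            /* correct to 3 bits for any odd a */
    for (int i = 0; i < 5; i++)  /* each step doubles the correct bits */
        inv *= 2 - a * inv;
    return inv;
}

/* Forward step: advance the stream and return a tiebreaker in [0, 1). */
static double core_rng_next(core_rng *g)
{
    g->state = g->state * LCG_MUL + LCG_INC;
    return (double)(g->state >> 11) / 9007199254740992.0; /* divide by 2^53 */
}

/* Reverse step: undo exactly one call to core_rng_next, as would be
   needed when the event that consumed the value is rolled back. */
static void core_rng_reverse(core_rng *g)
{
    g->state = (g->state - LCG_INC) * inverse_mod_2_64(LCG_MUL);
}
```

After a forward step followed by a reverse step, the stream is back in its prior state, so re-creating the event yields the same tiebreaker again.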
While the concept itself is simple, and adding the tiebreaker to the event struct is similarly simple, getting this to work with GVT and rollbacks is not.
A better way to think about this tiebreaking mechanism is that it makes sure there is actually no such thing as an event tie. This paradigm shift means that "when" GVT happens is no longer a single TW_STIME value. There is now an event signature struct which contains a timestamp and a tiebreaker value. This signature is all that is necessary to determine the ordering of two events in the simulation. Thus, the time of the last GVT is no longer just the one-dimensional virtual timestamp: it also includes a tiebreaker, which deterministically separates events at the same primary timestamp as GVT into "before GVT" and "after GVT" according to their own tiebreaker values.
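A minimal sketch of such a signature and its ordering, in C. The struct layout and the names `event_sig`, `sig_compare`, and `is_before_gvt` are assumptions for illustration; the actual struct in the PR may differ. The ordering is lexicographic: primary timestamp first, tiebreaker second.

```c
/* Hypothetical event signature: primary timestamp plus tiebreaker. */
typedef struct {
    double recv_ts;    /* primary virtual timestamp */
    double tiebreaker; /* random value drawn when the event was created */
} event_sig;

/* Returns -1 if a orders before b, 1 if after, 0 if the signatures match. */
static int sig_compare(event_sig a, event_sig b)
{
    if (a.recv_ts != b.recv_ts)
        return a.recv_ts < b.recv_ts ? -1 : 1;
    if (a.tiebreaker != b.tiebreaker)
        return a.tiebreaker < b.tiebreaker ? -1 : 1;
    return 0;
}

/* With GVT expressed as a signature, events that share GVT's primary
   timestamp still fall cleanly on one side or the other. */
static int is_before_gvt(event_sig ev, event_sig gvt)
{
    return sig_compare(ev, gvt) < 0;
}
```

For example, with GVT at signature (10.0, 0.5), an event at (10.0, 0.25) is "before GVT" and an event at (10.0, 0.75) is "after GVT", even though all three share the primary timestamp 10.0.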
Rollbacks likewise no longer go back to a single timestamp value, but to a two-dimensional timestamp consisting of the primary timestamp and an event tiebreaker.
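As complex as this system is, it does have its benefits:
1) Comparing events in the Splay and AVL trees to determine ordering requires fewer comparisons and thus less compute time.
2) If primary timestamp ties are numerous in a model, rolling back one event at a given timestamp no longer requires rolling back all events with the same timestamp, only those whose tiebreaker values determine that they happen "after" the event that prompted the rollback.
3) Event ties are statistically impossible to force. Because the tiebreaker value is generated using its own independent RNG stream with an extremely long period, two events at the same primary timestamp that ALSO generate an identical random value are nearly impossible. Model developers therefore no longer have to add their own small noise to event timestamps to prevent ties, which reduces the likelihood that a developer will forget to roll back an RNG used for noise and plunge their entire model into non-determinism.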
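The rollback benefit can be illustrated with a small C sketch that counts how many already-processed events a straggler would force back under each scheme. The names (`event_sig`, `rollback_count_ts_only`, `rollback_count_signature`) are hypothetical and the logic is a simplification of a real rollback, which walks the LP's processed-event list rather than an array.

```c
#include <stddef.h>

/* Hypothetical event signature: primary timestamp plus tiebreaker. */
typedef struct {
    double recv_ts;
    double tiebreaker;
} event_sig;

/* Timestamp-only ordering: every processed event at or after the
   straggler's timestamp has to be rolled back, ties included. */
static size_t rollback_count_ts_only(const event_sig *processed, size_t n,
                                     event_sig straggler)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (processed[i].recv_ts >= straggler.recv_ts)
            count++;
    return count;
}

/* Signature ordering: only events that order strictly after the
   straggler's (timestamp, tiebreaker) pair have to be rolled back. */
static size_t rollback_count_signature(const event_sig *processed, size_t n,
                                       event_sig straggler)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        const event_sig e = processed[i];
        if (e.recv_ts > straggler.recv_ts ||
            (e.recv_ts == straggler.recv_ts &&
             e.tiebreaker > straggler.tiebreaker))
            count++;
    }
    return count;
}
```

Given processed events (5, 0.1), (5, 0.4), (5, 0.9), (6, 0.2) and a straggler at (5, 0.5), the timestamp-only count is 4 while the signature count is 2: the two tied events with smaller tiebreakers stay processed.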
This feature has been walled off behind a CMake define variable: USE_RAND_TIEBREAKER. Set this value to ON during CMake configuration of ROSS and all code enabling the tiebreaking value generation and the timestamp-to-time-signature paradigm shift will be switched on by preprocessor #ifdefs.
Ultimately, it may be beneficial to make this event time signature the actual primary mechanism by which the ROSS engine operates but I didn't want to make said major change in this PR. There is probably a cleaner way to implement it as well (possibly using the TW_STIME API?)
If this merge represents a feature addition to ROSS, the following items must be completed before the branch will be merged:
Include a link to your blog post in the Pull Request.