Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add test to check timeline ordering semantics #699

Merged
merged 4 commits into from
Jan 9, 2024

Conversation

kegsay
Copy link
Member

@kegsay kegsay commented Jan 5, 2024

Based on random conversations, we want to make sure that clients cannot accidentally miss being told about events if an earlier event (earlier in the DAG) arrives interleaved with newer events.

Based on random conversations, we want to make sure that clients
cannot accidentally miss being told about events if an earlier
event (earlier in the DAG) arrives interleaved with newer events.
@kegsay kegsay requested review from a team as code owners January 5, 2024 15:07
// - a local user on HS1 sends 2,3,4
// - network partition is fixed. HS2 sends 1' to HS1.
// - a local user on HS1 sends 5,6,7
// - At this point, from HS1's pov, the streaming ordering is [1,2,3,4,1',5,6,7] and the topological order is [1 | 1', 2,3,4,5,6,7]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// - At this point, from HS1's pov, the streaming ordering is [1,2,3,4,1',5,6,7] and the topological order is [1 | 1', 2,3,4,5,6,7]
// - At this point, from HS1's pov, the streaming ordering is [1,2,3,4,1',5,6,7] and the topological order is [1 | 1', 2,3,4,5,6,7]

That is Synapse's choice of topological ordering. Valid ones are basically [(1, 2, 3, 4) | 1', 5, 6, 7] (i.e. 1' could appear anywhere before/after/between 1, 2, 3, 4).

// - a local user on HS1 sends 5,6,7
// - At this point, from HS1's pov, the streaming ordering is [1,2,3,4,1',5,6,7] and the topological order is [1 | 1', 2,3,4,5,6,7]
// - client requests timeline limit = 4 => [1',5,6,7] is returned because /sync is stream ordering. Everything is fine so far.
// - using the prev_batch token here in /messages SHOULD return 4,3,2,1, to ensure clients cannot missing 4,3,2.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And potentially 1', depending, e.g. if you do /messages?limit=1 repeatedly 1' would turn up again currently.

@erikjohnston
Copy link
Member

As requested, I'm going to quickly lay out how Synapse implements this currently.

There are two orderings Synapse uses, called "stream ordering" and "topological ordering" (which is a specific topological ordering). Once an event is persisted these orderings do not change.

There are also two types of tokens: stream tokens and topological tokens. These correspond with segmenting events based on the stream ordering and the topological ordering respectively.

The two requests that happen are:

  1. /sync which returns events in stream ordering and includes stream tokens for the start and end of the included set of events. The events returned are the latest N events based on stream ordering in the room.
  2. /messages which returns events in topological ordering and includes topological tokens for the start and end of the included set of events. You can specify either stream or topological tokens for the upper and lower bounds of requested events. The events returned are the latest N events based on topological ordering in the room, filtered by the tokens specified.

The question is: if we use a stream token from the start of the timeline in /sync to paginate via /messages, do we see all events?

Let's say we do a /sync and get a prev_batch stream token of s. When we do /messages this will return the set of events ordered by topological ordering less than s, greatest first. i.e. there is a topological token t (corresponding to the first event /messages returns) for which any event before t will be returned by /messages (except maybe if the event is after s).

Given an arbitrary event E in the room, what cases are there and will it turn up in subsequent /sync or /messages?

  1. If E is after s then it will come down /sync, and we're done.
  2. If E is before t then it will come down /messages and we're done.
  3. By construction of t there cannot be an event that is after t but before s, as t is defined to be a token that returns all events before s. i.e., E must be in one of the two cases above.

This is true for the case where we receive E and then do the the pagination. What about the case where we have already started paginating from a previous /sync request? For non-backfilled events then E must by definition of stream ordering happen after s, and so come down /sync if there are no gaps. For the gappy sync case the client will have to paginate, but then the same rationale as above applies.

@kegsay kegsay merged commit 6b9940b into main Jan 9, 2024
4 checks passed
@kegsay kegsay deleted the kegan/nw-partition-ordering branch January 9, 2024 14:58
@kegsay kegsay restored the kegan/nw-partition-ordering branch January 9, 2024 15:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants