This issue has been migrated from #14158.
Based on my initial work investigating sync cache races in matrix-org/synapse#14154:
Sync makes use of the event stream cache to determine whether a room has changed between the since and current tokens. This is then used to limit the set of rooms that `get_room_events_stream_for_rooms` queries events for. After discovering the cache invalidation races above I added a quick log line for the case where the cache has not caught up to the sync's current token: beeper/synapse@62497db (after beeper/synapse@5297155).

And it logs! There has only been a small handful of occurrences over the last ~5 days, and the position difference has been 1 every time so far. I suspect this may also occur against other stream caches, but I have not confirmed it.
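For illustration, here is a minimal, self-contained sketch of the filtering pattern described above and where the gap can bite. The class and function names (`RoomChangeCache`, `rooms_to_query`, etc.) are simplified stand-ins, not Synapse's actual implementation:

```python
# Illustrative sketch only (simplified stand-in for the real stream change cache):
# how a per-room "last changed at position" cache narrows the rooms an
# incremental sync queries, and how a lagging cache position can hide an event.

class RoomChangeCache:
    def __init__(self) -> None:
        self._room_to_pos: dict[str, int] = {}  # room_id -> last change position seen
        self.max_pos = 0                         # highest stream position the cache knows about

    def on_room_changed(self, room_id: str, pos: int) -> None:
        # Fed from the replication stream; this is what can lag behind the
        # stream position a sync request is working against.
        self._room_to_pos[room_id] = max(self._room_to_pos.get(room_id, 0), pos)
        self.max_pos = max(self.max_pos, pos)

    def rooms_changed_since(self, room_ids: list[str], since_pos: int) -> list[str]:
        return [r for r in room_ids if self._room_to_pos.get(r, 0) > since_pos]


def rooms_to_query(cache: RoomChangeCache, joined_rooms: list[str],
                   since_pos: int, current_pos: int) -> list[str]:
    # Only the returned rooms would be passed to something like
    # get_room_events_stream_for_rooms. If cache.max_pos < current_pos, an
    # event sent in the gap (cache.max_pos, current_pos] is invisible to the
    # cache, so its room is wrongly filtered out of the incremental sync.
    return cache.rooms_changed_since(joined_rooms, since_pos)
```

The real cache is more careful about positions it has no data for; the sketch just shows the shape of the race: the room filter is only trustworthy up to the cache's own position, not up to the sync's current token.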
The worry here is that if an event was sent within the gap, it may be missed from an incremental sync. That is especially bad because the user will never see or know about the event unless they re-init sync (or the client backfills it?).
One solution to this is to implement a waiting mechanism on `StreamCache` so that a worker can wait for the cache to catch up with the current token for a given sync before fetching data. Because this is super rare, and the position difference is tiny even when it does happen, this would probably have a negligible impact on sync performance while providing a shield against cache invalidation races over replication.
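To make the idea concrete, here is a rough asyncio-based sketch of such a waiting mechanism. Synapse itself is Twisted-based and every name below is hypothetical; treat it as the shape of the approach, not a patch:

```python
import asyncio
from typing import List, Tuple


class WaitableStreamCache:
    """Hypothetical sketch: a stream cache that callers can wait on until it
    has caught up to a given stream position."""

    def __init__(self) -> None:
        self._current_pos = 0
        self._waiters: List[Tuple[int, asyncio.Future]] = []

    def advance_to(self, pos: int) -> None:
        """Called whenever the cache is updated, e.g. from the replication stream."""
        if pos <= self._current_pos:
            return
        self._current_pos = pos
        still_waiting = []
        for wanted_pos, fut in self._waiters:
            if fut.done():
                continue  # already woken or timed out
            if wanted_pos <= pos:
                fut.set_result(None)
            else:
                still_waiting.append((wanted_pos, fut))
        self._waiters = still_waiting

    async def wait_for_position(self, pos: int, timeout: float = 0.5) -> None:
        """Block briefly until the cache has reached `pos`.

        The races observed above are rare and off by a position of 1, so in
        practice this returns immediately almost always and resolves quickly
        when it does have to wait.
        """
        if self._current_pos >= pos:
            return
        fut: asyncio.Future = asyncio.get_running_loop().create_future()
        self._waiters.append((pos, fut))
        try:
            await asyncio.wait_for(fut, timeout)
        except asyncio.TimeoutError:
            # Don't block sync forever if replication is stuck; fall back to
            # the current (possibly slightly stale) cache contents.
            pass
```

A sync worker would then do something like `await cache.wait_for_position(current_token)` before consulting the cache to build its room list, with a short timeout so a stalled replication stream cannot stall sync itself.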