Race condition with replication means /messages
backfill lacks read-after-write consistency between workers #14211
Description
We have have a problem where because events are persisted in a queue in a client_reader
worker, there is no guarantee that they are available to read on other workers. So when we fire off a backfill request from /messages
, those backfilled messages aren't necessarily available to paginate with after the backfill completes (even on the worker that put them in the persister queue).
CI failure: https://github.com/matrix-org/synapse/actions/runs/3182998161/jobs/5189731097#step:6:15343 (from discussion). This specific CI flake was addressed in matrix-org/complement#492
WORKERS=1 POSTGRES=1 COMPLEMENT_ALWAYS_PRINT_SERVER_LOGS=1 COMPLEMENT_DIR=../complement ./scripts-dev/complement.sh -run TestJumpToDateEndpoint/parallel/federation/can_paginate_after_getting_remote_event_from_timestamp_to_event_endpoint
Here is what happens:
serverB
hasevent1
stored as anoutlier
from previous requests (specifically from MSC3030 jump to date pulling in a missingprev_event
after backfilling)- Client on
serverB
calls/messages?dir=b
serverB:client_reader1
accepts the request and drives thingsserverB:client_reader1
has some backward extremities in range and requests/backfill
fromserverA
serverB:client_reader1
processes the events from backfill includingevent1
and puts them in the_event_persist_queue
serverB:master
picks up the events from the_event_persist_queue
and persists them to the database, de-outliersevent1
and invalidates its own cache and sends them over replicationserverB:client_reader1
starts assembling the/messages
response and getsevent1
out of the stale cache still as anoutlier
serverB:client_reader1
responds to the/messages
request withoutevent1
becauseoutliers
are filtered outserverB:client_reader1
finally gets the replication data and invalidates its own cache forevent1
(too late, we already got the events from the stale cache and responded)
In a nutshell, we've written the test expecting "read-after-write consistency" but we don't have that.
It's exactly this but it really sucks that calling /messages
doesn't include events we just backfilled for that request. This is a general problem with Synapse though, see issues labeled with
Z-Read-After-Write
/messages
request so it's a little more insidious.
Having this be possible makes it even more of a reason that we should indicate gaps in /messages
, MSC3871