/messages pagination fails with AssertionError: pulled event unexpectedly flagged as outlier if an event has a broken signature
#12584
This issue has been migrated from #12584.
Sentry link (internal access only): https://sentry.matrix.org/sentry/synapse-matrixorg/issues/248052
We are repeatedly returning 403s to GET /_matrix/client/r0/rooms/{roomId}/messages requests. Serving them requires backfilling old events in the room, and the backfill chokes on an event with an invalid signature, which we already have in our database (as an outlier).

What happens:
1. A client hit GET /_matrix/client/r0/rooms/{roomId}/messages for a room ID beginning with !lPCpzTqvU... (room version 6).
2. matrix.org tried to gather events to service the request, and in doing so needed to /backfill from another homeserver in the room.
3. The /backfill response included event ID $EysLi3HhoGiuQ142A6VoD8Y7F8qv2okN9sW74lIXV3M, which was created by someone on matrix.cpn.so. It is an m.room.member event for a (non-3pid) invite from one user on matrix.cpn.so to another.
4. This event is signed with matrix.cpn.so's key ed25519:a_sWZq. We have a copy of this key in our database, but it has a ts_valid_until_ms of 500000 (Thu Jan 1 01:08:20 1970), so we attempt to fetch a fresher copy of the key.
5. matrix.org attempts to reach out to matrix.cpn.so to download its server keys, but fails with: ServerKeyFetcher-2245 - Error looking up keys ['ed25519:a_sWZq'] from matrix.cpn.so: Expected a response for server 'matrix.cpn.so' not 'espr.moe'. matrix.cpn.so's server well-known points to espr.moe:443, but querying https://espr.moe/_matrix/key/v2/server/ed25519:a_sWZq returns "server_name":"espr.moe" (!).
6. Since we can't find a copy of the signing key that was valid at the time the event was created, validation fails.
7. We then attempt to pull an event with the same event ID from the database(!): https://github.com/matrix-org/synapse/blob/95a038c1069de6c0507eb2c2d9a783c5033a70ec/synapse/federation/federation_client.py#L557-L567 (We're hitting the except SynapseError: pass bit.)
8. Now, this event is marked as an outlier in matrix.org's database. Thus, the event gets the outlier=True bit of internal metadata.
9. The outlier event is passed to FederationEventHandler._process_pulled_events and FederationEventHandler._process_pulled_event. We then run into the assertion 💥

So the problem in short:
- matrix.cpn.so's federation is broken (it points to a homeserver with different keys), so attempting to validate the event while backfilling it fails.
- /messages call fails.

Full stacktrace and logs
(Note that the Host not in room. error in the logs is due to matrix-org/synapse#3736, and is unrelated to this issue.)

This one outlier event is causing many of these stacktraces to pop up every few seconds.
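
To make the sequence described in this report easier to follow, here is a minimal toy model of it in Python. It is not Synapse's code. Every name in it (Event, StoredKey, SignatureError, check_key_response, check_signature, pull_event, process_pulled_event) is hypothetical, and actual signature verification is left out on purpose. It only illustrates how a failed key lookup, followed by a fallback to the locally stored copy, can hand an outlier to a step that asserts it never receives one.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Event:
    event_id: str
    origin_server_ts: int  # ms since epoch, as claimed by the origin server
    outlier: bool = False


@dataclass
class StoredKey:
    key_id: str
    ts_valid_until_ms: int


class SignatureError(Exception):
    """Hypothetical error: no usable signing key could be found."""


def check_key_response(expected_server: str, response: dict) -> None:
    # The report's failure: the delegated host answers with its own
    # server_name instead of the name we asked about.
    actual = response.get("server_name")
    if actual != expected_server:
        raise SignatureError(
            f"Expected a response for server {expected_server!r} not {actual!r}"
        )


def check_signature(
    event: Event, key: StoredKey, expected_server: str, key_response: dict
) -> None:
    # A cached key is only usable if it was still valid when the event
    # was created.
    if key.ts_valid_until_ms >= event.origin_server_ts:
        return
    # Otherwise try to fetch a fresher copy. Simplification: a response
    # with the right server_name is treated as a successful verification.
    check_key_response(expected_server, key_response)


def pull_event(
    event: Event,
    key: StoredKey,
    expected_server: str,
    key_response: dict,
    database: Dict[str, Event],
) -> Optional[Event]:
    try:
        check_signature(event, key, expected_server, key_response)
        return event
    except SignatureError:
        # Fallback: reuse whatever copy of the event we already hold,
        # including one that is flagged as an outlier.
        return database.get(event.event_id)


def process_pulled_event(event: Event) -> None:
    assert not event.outlier, "pulled event unexpectedly flagged as outlier"


if __name__ == "__main__":
    incoming = Event("$example", origin_server_ts=1_650_000_000_000)
    db = {"$example": Event("$example", 1_650_000_000_000, outlier=True)}
    stale_key = StoredKey("ed25519:a_sWZq", ts_valid_until_ms=500000)
    pulled = pull_event(
        incoming, stale_key, "matrix.cpn.so", {"server_name": "espr.moe"}, db
    )
    try:
        process_pulled_event(pulled)
    except AssertionError as exc:
        print("AssertionError:", exc)
```

In this model the assertion only trips when the fallback copy is an outlier. A key response carrying the expected server_name returns the freshly pulled event instead.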