
Make the backfill linearizer lock smarter to not block as often #13619


This issue has been migrated from #13619.


Mentioned in an internal doc about speeding up /messages. Also see "1. Backfill linearizer lock takes forever" in matrix-org/synapse#13356.


The linearizer lock on backfill is only used to de-duplicate concurrent work; it has nothing to do with data integrity (it was introduced in matrix-org/synapse#10116). But the linearizer is very simplistic, and we can make it smarter so it doesn't block as often.
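For context, the current lock is a plain Synapse `Linearizer` keyed only by room ID, roughly as in the sketch below (names simplified, not the exact code in the handler):

```python
from synapse.util.async_helpers import Linearizer

# Rough sketch of the current behaviour: one lock per room, so any two
# backfill attempts for the same room are fully serialized, even if they
# target completely different parts of the room.
room_backfill_linearizer = Linearizer(name="room_backfill")

async def maybe_backfill(room_id: str, current_depth: int):
    async with room_backfill_linearizer.queue(room_id):
        ...  # the actual backfill work
```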

Improvements

Block by ranges of depth per-room

Currently, the linearizer locks per-room, so you can't backfill at two separate locations in a room at the same time.

We could instead lock on ranges of depth within the room.
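One possible shape for this is keying the lock on (room ID, depth bucket) rather than just the room ID. This is a minimal standalone sketch, not Synapse code, and the bucket size is an arbitrary assumption:

```python
import asyncio
from collections import defaultdict

# Assumed bucket size for grouping nearby depths under one lock.
DEPTH_BUCKET_SIZE = 100

class DepthRangeLinearizer:
    """Hypothetical per-(room, depth range) lock: backfills at distant
    depths in the same room can proceed concurrently."""

    def __init__(self) -> None:
        self._locks: dict = defaultdict(asyncio.Lock)

    def lock(self, room_id: str, depth: int) -> asyncio.Lock:
        bucket = depth // DEPTH_BUCKET_SIZE
        return self._locks[(room_id, bucket)]

async def backfill(linearizer: DepthRangeLinearizer, room_id: str, depth: int):
    async with linearizer.lock(room_id, depth):
        ...  # backfill work for this part of the room
```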

De-duplicate the same request/work

Currently, the linearizer does no de-duplication, so if you send the same request 10 times in a row, the requests all queue up and repeat the same work in sequence.

We could instead share the result of the first request with all of the requests waiting on the lock.
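A minimal sketch of that sharing, assuming requests can be keyed (e.g. by room ID and backfill point); the names here are made up for illustration:

```python
import asyncio
from typing import Awaitable, Callable, Dict, Hashable

class DeduplicatingRunner:
    """Hypothetical helper: if an identical backfill request is already in
    flight, await its result instead of queuing duplicate work."""

    def __init__(self) -> None:
        self._in_flight: Dict[Hashable, asyncio.Future] = {}

    async def run(self, key: Hashable, func: Callable[[], Awaitable]):
        existing = self._in_flight.get(key)
        if existing is not None:
            # Share the first request's result with every waiter; shield so
            # a cancelled waiter doesn't cancel the shared work.
            return await asyncio.shield(existing)

        task = asyncio.ensure_future(func())
        self._in_flight[key] = task
        try:
            return await task
        finally:
            del self._in_flight[key]
```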

The queue can build up above our timeout threshold

If an item has been sitting in the queue for more than 180s, we should just cancel it, because the underlying request to Synapse has already timed out. There's no point doing work for a request that is no longer running.
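As a sketch, if each waiter records when it was queued, the check can happen right after the lock is acquired. The 180s threshold matches the request timeout mentioned above; the helper name is made up:

```python
import asyncio
import time

# Assumed to match the client request timeout.
QUEUE_TIMEOUT_SECONDS = 180

async def run_when_lock_acquired(lock: asyncio.Lock, func):
    queued_at = time.monotonic()
    async with lock:
        waited = time.monotonic() - queued_at
        if waited > QUEUE_TIMEOUT_SECONDS:
            # The request that queued this work has already timed out
            # upstream; skip the work rather than doing it for nobody.
            return None
        return await func()
```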

Linearizer shared across all worker processes

Currently, the linearizer is per-process, so if there are multiple client reader worker processes, we are potentially duplicating work across them. We do try to route traffic for a given room to the same worker, but there is no guarantee of that.

This is probably one of the optimizations to prioritize last.
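If this ever becomes worth doing, one generic way to get a cross-process lock is a database-level lock, e.g. a Postgres advisory lock keyed on the room ID. This is only an illustrative sketch using asyncpg, not how Synapse's worker locking actually works:

```python
import hashlib

import asyncpg  # illustrative choice of Postgres driver

async def with_cross_worker_backfill_lock(pool: asyncpg.Pool, room_id: str, func):
    """Serialize backfill for a room across worker processes by holding a
    Postgres advisory lock derived from the room ID (hypothetical sketch)."""
    # pg_advisory_lock takes a signed 64-bit key, so hash the room ID down.
    key = int.from_bytes(
        hashlib.sha256(room_id.encode()).digest()[:8], "big", signed=True
    )
    async with pool.acquire() as conn:
        # Blocks until no other worker holds the lock for this room.
        await conn.fetchval("SELECT pg_advisory_lock($1)", key)
        try:
            return await func()
        finally:
            await conn.fetchval("SELECT pg_advisory_unlock($1)", key)
```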
