This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Event creation rate limiting happens very late in a worker environment #16481

Open
richvdh opened this issue Oct 12, 2023 · 4 comments
Labels
A-Workers: Problems related to running Synapse in Worker Mode (or replication)
O-Occasional: Affects or can be seen by some users regularly or most users rarely
S-Minor: Blocks non-critical functionality, workarounds exist.
T-Defect: Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@richvdh
Member

richvdh commented Oct 12, 2023

When event persistence happens in a separate worker process from event creation, we don't end up applying the ratelimit until the event reaches the event persister.

This means we end up doing lots of work (including calculating the push actions and inserting into event_push_actions_staging) for an event that gets thrown away.

There is code to attempt to apply the ratelimiter upfront (https://github.com/matrix-org/synapse/blob/v1.93.0/synapse/handlers/message.py#L1001-L1002), but it doesn't work in a worker environment because the count is never updated.
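The failure mode can be sketched with a toy limiter whose check/update semantics loosely mirror the `update=True`/`update=False` split discussed here (class and method names are illustrative, not Synapse's actual `Ratelimiter`): a check made with `update=False` never records the action, so if the updating call only ever runs on a different worker, the local count stays at zero and the upfront check never fails.

```python
import time


class Ratelimiter:
    """Toy fixed-window ratelimiter. Illustrative only; not Synapse's
    actual implementation."""

    def __init__(self, burst_count: int, window_secs: float) -> None:
        self.burst_count = burst_count
        self.window_secs = window_secs
        # requester -> (window start time, actions recorded in window)
        self._actions: dict[str, tuple[float, int]] = {}

    def can_do_action(self, requester: str, update: bool = True) -> bool:
        now = time.monotonic()
        start, count = self._actions.get(requester, (now, 0))
        if now - start > self.window_secs:
            start, count = now, 0
        allowed = count < self.burst_count
        if update and allowed:
            # The action is only recorded on the update path.
            self._actions[requester] = (start, count + 1)
        return allowed


# On the event-creation worker, the upfront check runs with
# update=False, and the update=True call only ever happens on the
# event persister (a different process with its own counts). The
# creation worker's count therefore never moves, and the early check
# passes every request, no matter how many events are sent:
limiter = Ratelimiter(burst_count=5, window_secs=1.0)
results = [
    limiter.can_do_action("@user:example.org", update=False) for _ in range(100)
]
assert all(results)  # the limit never trips without an update
```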

@erikjohnston
Member

One option, I think, is to add the "full" ratelimit check (with update=True) just before we make the HTTP replication call to the event persister. Cf. below for the call that happens on the persisters:

await self.request_ratelimiter.ratelimit(
requester, is_admin_redaction=is_admin_redaction
)

@richvdh
Member Author

richvdh commented Oct 13, 2023

One option I think is to add the "full" ratelimit check (with update=True) just before we do the HTTP replication call to the event persister. c.f. below for the call that happens on the persisters

Yeah, that would work I guess.

Can we just use a separate ratelimiter for the two checks rather than trying to be "clever" ?

(Ideally, ratelimit accounting would be shared between workers (via Redis?), which would solve this problem, as well as the fact that, for requests that get round-robinned between workers, you need to tune your ratelimits to the number of workers. But that's a bunch of work.)
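The shared-accounting idea could be sketched as a limiter over a pluggable counter store (all names here are hypothetical, not Synapse code): a real deployment would back the store with something like Redis `INCR` plus `EXPIRE` so every worker increments the same counter, whereas this sketch uses an in-memory stand-in for illustration.

```python
import time
from typing import Protocol


class CounterStore(Protocol):
    """Abstract counter interface. A Redis-backed implementation would
    do INCR on the key and set an EXPIRE for the window; here we use an
    in-memory stand-in so the sketch is self-contained."""

    def incr(self, key: str, window_secs: float) -> int: ...


class InMemoryStore:
    """Per-process stand-in for a shared store (illustration only)."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[float, int]] = {}

    def incr(self, key: str, window_secs: float) -> int:
        now = time.monotonic()
        start, count = self._data.get(key, (now, 0))
        if now - start > window_secs:
            start, count = now, 0
        count += 1
        self._data[key] = (start, count)
        return count


class SharedRatelimiter:
    def __init__(self, store: CounterStore, limit: int, window_secs: float) -> None:
        self.store = store
        self.limit = limit
        self.window_secs = window_secs

    def ratelimit(self, requester: str) -> bool:
        # With a genuinely shared store, every worker bumps the same
        # counter, so the limit holds across the whole pool instead of
        # per process (no more multiplying limits by worker count).
        return self.store.incr(requester, self.window_secs) <= self.limit


shared = SharedRatelimiter(InMemoryStore(), limit=5, window_secs=60.0)
assert all(shared.ratelimit("@user:example.org") for _ in range(5))
assert not shared.ratelimit("@user:example.org")  # sixth request rejected
```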

@erikjohnston
Member

Can we just use a separate ratelimiter for the two checks rather than trying to be "clever" ?

I don't follow this? Do you mean not having the update=False thing? Though TBH I can't remember the context of why we only want to actually ratelimit quite late in the day?

@richvdh
Member Author

richvdh commented Oct 13, 2023

I don't follow this? Do you mean not having the update=False thing?

Yes, in short. We have two instances of the request ratelimiter, one of which we check (and update) when the request first arrives; the other we check and update later.

Though TBH I can't remember the context of why we only want to actually ratelimit quite late in the day?

No, nor me. We should probably find out before changing things here.
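The "separate ratelimiter for each check" idea above can be sketched with a minimal fixed-window limiter (hypothetical names, not Synapse code): each stage owns its own limiter and always updates it on check, so no stage needs a non-updating peek at someone else's counts.

```python
import time


class SimpleLimiter:
    """Minimal fixed-window limiter; illustrative only."""

    def __init__(self, limit: int, window_secs: float) -> None:
        self.limit = limit
        self.window_secs = window_secs
        self._windows: dict[str, tuple[float, int]] = {}

    def check_and_update(self, key: str) -> bool:
        now = time.monotonic()
        start, count = self._windows.get(key, (now, 0))
        if now - start > self.window_secs:
            start, count = now, 0
        if count >= self.limit:
            return False
        self._windows[key] = (start, count + 1)
        return True


# One limiter per stage: the creation worker checks (and updates) its
# own limiter when the request arrives; the persister keeps a separate
# one for its own check. Since the early check always updates, it
# actually rejects excess traffic instead of silently passing it on.
creation_limiter = SimpleLimiter(limit=5, window_secs=1.0)
persister_limiter = SimpleLimiter(limit=5, window_secs=1.0)

accepted = [creation_limiter.check_and_update("@user:example.org") for _ in range(10)]
assert accepted.count(True) == 5  # the early check now rejects the excess
```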

@clokep clokep added A-Workers Problems related to running Synapse in Worker Mode (or replication) S-Minor Blocks non-critical functionality, workarounds exist. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. O-Occasional Affects or can be seen by some users regularly or most users rarely labels Oct 25, 2023
erikjohnston added a commit that referenced this issue Oct 26, 2023
erikjohnston added a commit that referenced this issue Oct 27, 2023

3 participants