Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Remove the need for worker_name to simplify scaling #8084

Open
ulope opened this issue Aug 13, 2020 · 6 comments
Open

Remove the need for worker_name to simplify scaling #8084

ulope opened this issue Aug 13, 2020 · 6 comments
Labels
A-Workers Problems related to running Synapse in Worker Mode (or replication) T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. z-p3 (Deprecated Label)

Comments

@ulope
Copy link

ulope commented Aug 13, 2020

Description:

The recent-ish changes to workers now recommend to set a unique worker_name per worker process. This makes scaling the number of workers quite a bit more complex than previously since now every instance needs a tailor made config file.

This is especially annoying when using things like docker-compose, swarm or kubernetes where spinning up multiple (identical) instances of a service is a built-in feature.

I have no concrete proposal on how to solve the problem(s) the worker name solves though. (AFAICS they're only really necessary for reverse mapping for federation senders and stream writers?)

@clokep clokep added enhancement z-p3 (Deprecated Label) A-Workers Problems related to running Synapse in Worker Mode (or replication) labels Aug 13, 2020
@ananace
Copy link

ananace commented Aug 14, 2020

Coming from the Kubernetes angle, there's a common method that's used in there to solve this exact issue, where the internal cluster DNS will generate multi-value records that let you look up all the active pods on a certain service through A and PTR lookups.

Though considering Redis can be used for replication now, perhaps that could be a better solution - albeit one that will make Redis a much harder dependency for Synapse with workers.

@ghost
Copy link

ghost commented Mar 23, 2021

A possible solution for Kubernetes would be to use StatefulSets + Environment Variables

Then we need to change this line in the source code
https://github.com/matrix-org/synapse/blob/develop/synapse/config/workers.py#L124

self.worker_name = config.get("worker_name", self.worker_app)

self.worker_name = environ.get("HOSTNAME", "i_have_no_env")

P.S. As a temporary solution in the current situation

@ananace
Copy link

ananace commented Mar 23, 2021

The issue with using a StatefulSet to get a stable name is that they're designed for running applications where the runtime state is extremely important - things like databases.
Using them for stateless applications like the workers would mean that the cluster can't schedule them as easily, and will in fact make them much more fragile as the cluster will refuse to evict or replace them when necessary - as that'd cause them to lose their "state".

I already see far too many stateless things that start up as STS simply to get stable names.

@ulope
Copy link
Author

ulope commented Mar 23, 2021

This also will not help with other solutions like compose, swarm etc.

@ghost
Copy link

ghost commented Mar 23, 2021

FAICS they're only really necessary for reverse mapping for federation senders and stream writers?

Could someone make the documentation more detail to clarify this point?

@ananace
Copy link

ananace commented Mar 24, 2021

I still feel - even more strongly now - that using Redis for this is the right way to go, that offers a good channel for querying all running workers.

@DMRobertson DMRobertson added T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. and removed z-enhancement labels Aug 25, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Workers Problems related to running Synapse in Worker Mode (or replication) T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. z-p3 (Deprecated Label)
Projects
None yet
Development

No branches or pull requests

4 participants