Separate datastore for devices? #12906
Comments
Related to #11491
AIUI we'd lose the ability to have foreign keys between the device tables and the rest of the database, which is unfortunate (not that we seem to have any). Do you know if the database IO for those tables is mostly reads or writes? If they're mostly reads I'd be in favour of adding support for read replicas.
It’s a pretty even mix of reads and writes. Read replicas would certainly be very helpful (across all of synapse), but I suspect they would require significant changes, both in code and operationally, to be supported. The reason I suggested splitting out devices is that it seems like a logically separate group of tables (like state groups) that needn’t have any in-db joins, even in the future. Full context: as part of this same investigation I was considering how the synapse events/rooms tables could eventually be sharded by room id, which would in theory provide near infinite scale. Separating non-room-related tables is kind of a first step towards that.
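To make the room-id sharding idea concrete, here is a minimal sketch of picking a shard from a room ID with a stable hash. The function name `shard_for_room` and the fixed shard count are assumptions for illustration only; nothing like this exists in synapse today, and a real scheme would also need a plan for resharding.

```python
import hashlib


def shard_for_room(room_id: str, num_shards: int) -> int:
    """Map a room ID to a shard index in [0, num_shards).

    Hypothetical helper: uses SHA-256 so the mapping is stable across
    processes and restarts (unlike Python's built-in hash()).
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(room_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


# Every query for a given room would be routed to the same shard.
example_shard = shard_for_room("!abc123:example.org", 4)
```

Because all rows for one room land on one shard, joins within a room stay local; anything keyed by user or device would need to live elsewhere, which is why splitting out the non-room tables comes first.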
Hi @Fizzadar, this is something we discussed a bit as a team. We think it's totally feasible; we just have a couple of reservations:
So, to move forward here, could you share more about what you're seeing? Ideally we'd like to know exactly which queries are using the IO, but we're not sure how granular your data is.
Thanks for looking into this @erikjohnston! I pulled some DB stats on the highest read tables, combining both table + index blocks together to get the following rates (prom query here also):
This aligns with other charts indicating that the
In regards to the general issue here though - our aim is ultimately to shard the
So this morning I added a new index on
Description:
We are currently experimenting with different ways to scale out the synapse database, in particular where it would be possible to divide tables among separate database instances, much like the state tables/datastore class.
Based on my analysis it should be possible to extract the following device/e2e related stores into a separate datastore instance:
- `Device*Store` (`stores/main/devices.py`)
- `DeviceInbox*Store` (`stores/main/deviceinbox.py`)
- `EndToEndKey*Store` (`stores/main/end_to_end_keys.py`)
- `ClientIp*Store` (`stores/main/client_ips.py`)

I picked these because they're fairly small overall, have low inter-dependency, and represent a high percentage of database IO on our instance (currently a single database holding all tables).
Note: the one interdependency this misses is the `populate_monthly_active_users` call in `client_ips.py`, which could become `self.hs.get_datastores().device.populate_monthly_active_users(user_id)`.
Is there any appetite for this? We can commit engineering time to implement this if so. Also keen to discuss any other groups of stores that may be suitable candidates.
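As a rough illustration of the call shape in the note above, here is a self-contained sketch of a separate `device` datastore reached through `get_datastores()`. Every class here (`DeviceDataStore`, `MainDataStore`, `Datastores`, `HomeServer`, `ClientIpWriter`) is a simplified, hypothetical stand-in written for this example, not synapse's actual implementation, and the in-memory set stands in for a real database transaction.

```python
class DeviceDataStore:
    """Hypothetical stand-in for the device/e2e stores on their own database."""

    def __init__(self, database_name):
        self.database_name = database_name
        self.monthly_active_users = set()

    def populate_monthly_active_users(self, user_id):
        # A real store would run a database transaction here.
        self.monthly_active_users.add(user_id)


class MainDataStore:
    """Hypothetical stand-in for everything left on the main database."""

    def __init__(self, database_name):
        self.database_name = database_name


class Datastores:
    """Groups the per-database stores, one attribute per datastore."""

    def __init__(self, main, device):
        self.main = main
        self.device = device


class HomeServer:
    def __init__(self, datastores):
        self._datastores = datastores

    def get_datastores(self):
        return self._datastores


class ClientIpWriter:
    """Caller that no longer assumes the method lives on its own store."""

    def __init__(self, hs):
        self.hs = hs
        self.seen = []

    def record_client_ip(self, user_id, ip):
        self.seen.append((user_id, ip))
        # The cross-store call goes through the homeserver, not through `self`.
        self.hs.get_datastores().device.populate_monthly_active_users(user_id)


hs = HomeServer(Datastores(MainDataStore("main_db"), DeviceDataStore("device_db")))
writer = ClientIpWriter(hs)
writer.record_client_ip("@alice:example.org", "203.0.113.7")
```

The point of the indirection is that a caller never needs to know which database a store lives on, so a group of stores can be moved to another instance without touching call sites beyond the lookup.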