-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Batch cache invalidation over replication #4671
Conversation
Currently whenever the current state changes in a room invalidate a lot of caches, which cause *a lot* of traffic over replication. Instead, lets batch up all those invalidations and send a single poke down the replication streams. Hopefully this will reduce load on the master process by substantially reducing traffic.
This passes the sytests worker config locally (and breaks when I break the cache invalidation) |
Codecov Report
@@ Coverage Diff @@
## develop #4671 +/- ##
===========================================
+ Coverage 75.17% 75.18% +<.01%
===========================================
Files 340 340
Lines 34712 34736 +24
Branches 5677 5680 +3
===========================================
+ Hits 26096 26116 +20
- Misses 7013 7015 +2
- Partials 1603 1605 +2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm apart from some quibbling
# We probably haven't pulled in the cache in this worker, | ||
# which is fine. | ||
pass | ||
if row.cache_func == _CURRENT_STATE_CACHE_NAME: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see some words about this (and in general what the 'caches' stream means) documented in docs/tcp_replication.tcp
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some words. I haven't specified exactly what _CURRENT_STATE_CACHE_NAME
means, as I feel like that's just going to quickly get out of date with the code and is probably unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm otherwise
|
||
However, there are times when a number of caches need to be invalidated at the | ||
same time with the same key. To reduce traffic we batch those invalidations into | ||
a single poke by defining a special cache name that workers understand to mean |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind not listing the exact caches, but please can you define the special cache name?
Currently whenever the current state changes in a room invalidate a lot
of caches, which cause a lot of traffic over replication. Instead,
lets batch up all those invalidations and send a single poke down
the replication streams.
Hopefully this will reduce load on the master process by substantially
reducing traffic.