Description
This issue has been migrated from #11165.
Everywhere in Synapse, HomeServer.get_datastore()
is annotated as the DataStore
god class, which is incorrect for workers and allowed #11163 to slip into matrix.org, causing an outage. It would be a good idea for data store consumers (servlets, handlers, the module API, etc) to declare which aspects of the data store they need, and have CI check that we don't pass them a data store that's missing the required interfaces.
There are three data store classes to consider:
DataStore
, which is the data store class used on the main processGenericWorkerSlavedStore
, which is the data store class used on worker processesAdminCmdSlavedStore
, which is the data store class used when running admin commands(?)
DataStore
and GenericWorkerSlavedStore
overlap but aren't subtypes of each other. AdminCmdSlavedStore
is a subset of GenericWorkerSlavedStore
functionality-wise, but not through inheritance.
It's been suggested by @reivilibre that we define two or three types for use in type hints:
WorkerStore
MainStore
(a subtype ofWorkerStore
?)EitherStore = Union[WorkerStore, MainStore]
, if it turns out thatWorkerStore
contains functionality not inMainStore
These don't have to be concrete classes and could be Protocol
s if needed.
We could have more granular store requirements, but it wouldn't catch any extra bugs.
The code is currently structured like this in a lot of places:
class GroupsLocalWorkerHandler:
def __init__(self, hs: "HomeServer"):
self.store = hs.get_datastore()
Proposal 1: HomeServer[WorkerStore]
We could add a covariant type parameter to HomeServer
to have data store consumers declare which type of data store they need:
class GroupsLocalWorkerHandler:
def __init__(self, hs: "HomeServer[WorkerStore]"):
self.store = hs.get_datastore()
HomeServer[WorkerStore]
would mean "a homeserver with a data store that supports at least the capabilities of a WorkerStore
".
We'd have to do this refactor across the entire codebase at once, otherwise the type hints for data stores would be implicitly degraded to Any
everywhere.
Proposal 2: get_worker_datastore()
@clokep suggests that we add new get_*_datastore()
methods to HomeServer
that raise errors when called on the wrong process:
class GroupsLocalWorkerHandler:
def __init__(self, hs: "HomeServer"):
self.store = hs.get_worker_datastore()
mypy would not flag up issues, but any CI stage which spins up workers would fail.