You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
babolivier opened this issue
Oct 18, 2021
· 3 comments
Labels
A-TestingIssues related to testing in complement, synapse, etcA-WorkersProblems related to running Synapse in Worker Mode (or replication)T-TaskRefactoring, removal, replacement, enabling or disabling functionality, other engineering tasks.
The reason for the 20min outage on matrix.org on Friday is because a module we use started using the new module API method introduced in #10833. This method calls to the get_user_ip_and_agents store method, which doesn't exist on a worker store, thus modules can't call it if they run on a worker (and matrix.org uses workers).
This PR included a test to make sure the new method works well. AFAICT, this is supposed to be the right thing to do when introducing a new Synapse-specific feature (i.e. a feature that's not part of the Matrix spec, which should be tested in SyTest/Complement as well).
However, this test suite isn't run in a worker configuration. This means our CI didn't detect the bug despite having a test for it - and it also means that possibly quite a few other features that are Synapse-specific are buggy on workers.
So I think we have an issue with our test suite not being tested against workers (apart for a few tests that are specifically written for workers, to test replication etc). I'm not sure what the first step in fixing this should look like; maybe one could have a look at how difficult it would be to run them on a mock worker rather than a mock homeserver.
The text was updated successfully, but these errors were encountered:
DMRobertson
added
the
T-Task
Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks.
label
Oct 18, 2021
the thing about this is that the trial tests are unit tests - in general, they don't run against a "real" synapse at all (monolith or otherwise).
I don't disagree that we need to find a way to make sure that the module API works against worker-mode Synapses; but I don't think it's meaningful to simply say "let's run the synapse test suite against workers".
the thing about this is that the trial tests are unit tests
A lot of them are a weird flavour of integration tests and a few others are somewhere in the middle, so I don't think that's correct.
in general, they don't run against a "real" synapse at all (monolith or otherwise).
They run against a representation of one, and my interrogation was around how close to a worker (vs a monolith) can we make this representation (e.g. swapping out the store for a worker store, etc). Though I accept I might be looking at it from the wrong angle.
I don't disagree that we need to find a way to make sure that the module API works against worker-mode Synapses; but I don't think it's meaningful to simply say "let's run the synapse test suite against workers".
In this case, it means we need that we need a new tests suite that runs on workers for Synapse-specific feature (so module API, admin API, message retention policies, etc), and migrate the existing integration tests we have for those in Synapse's test suite over there.
Sign up for freeto subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Labels
A-TestingIssues related to testing in complement, synapse, etcA-WorkersProblems related to running Synapse in Worker Mode (or replication)T-TaskRefactoring, removal, replacement, enabling or disabling functionality, other engineering tasks.
The reason for the 20min outage on matrix.org on Friday is because a module we use started using the new module API method introduced in #10833. This method calls to the
get_user_ip_and_agents
store method, which doesn't exist on a worker store, thus modules can't call it if they run on a worker (and matrix.org uses workers).This PR included a test to make sure the new method works well. AFAICT, this is supposed to be the right thing to do when introducing a new Synapse-specific feature (i.e. a feature that's not part of the Matrix spec, which should be tested in SyTest/Complement as well).
However, this test suite isn't run in a worker configuration. This means our CI didn't detect the bug despite having a test for it - and it also means that possibly quite a few other features that are Synapse-specific are buggy on workers.
So I think we have an issue with our test suite not being tested against workers (apart for a few tests that are specifically written for workers, to test replication etc). I'm not sure what the first step in fixing this should look like; maybe one could have a look at how difficult it would be to run them on a mock worker rather than a mock homeserver.
The text was updated successfully, but these errors were encountered: