MSC3902: Faster remote room joins over federation (overview) #3902

Draft: 3 commits to be merged into base branch `main`.

@richvdh (Member) commented Oct 3, 2022

This proposal gives an overview of the changes required to support faster remote joins.

It supersedes MSC2775.

Rendered

@richvdh richvdh marked this pull request as draft October 3, 2022 17:50
@turt2live added the labels `proposal`, `s2s`, `kind:core` and `needs-implementation` on Oct 3, 2022.

TBD

## Alternatives

Alternatives: #3883, or "interactive peeking"

Less complicated state res => easier to verify correctness of spec and implementation

Comment on lines +80 to +81
(Note that we can reliably answer requests that require knowledge only of
the membership state for local users.)

@DMRobertson (Contributor) commented Dec 8, 2022

Is this correct? Suppose @alice:alice.com partially joins a room via bob.com. Before the resync completes, she is banned from that room by @chris:chris.com. Call the ban event B.

Is there some way that B can be an invalid event whilst appearing valid enough to alice.com? If so, that means we can't reliably answer requests about local users' membership(?): during resync we'd think Alice was banned, but post-resync we'd realise she wasn't.

Reply from the PR author (Member):

Yes, that can certainly happen. "Reliably" is doing some heavy lifting here: we don't actually know any of the state for certain during the resync. Events may be accepted that should not have been, and they may also be rejected when they should not have been (I seem to remember writing Complement tests for both cases). We may therefore tell clients about state that turns out to be wrong. In some ways this is an extension of matrix-org/matrix-spec#1209.

But still, membership events for local users are in the same bracket as, for example, m.room.topic events: we assume that our view of the state is good enough, and return the events to clients without waiting for the resync to complete.

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Feb 6, 2023
Synapse 1.76.0 (2023-01-31)
===========================

The 1.76 release is the first to enable faster joins ([MSC3706](matrix-org/matrix-spec-proposals#3706) and [MSC3902](matrix-org/matrix-spec-proposals#3902)) by default. Admins can opt-out: see [the upgrade notes](https://github.com/matrix-org/synapse/blob/release-v1.76/docs/upgrade.md#faster-joins-are-enabled-by-default) for more details.

The upgrade from 1.75 to 1.76 changes the account data replication streams in a backwards-incompatible manner. Server operators running a multi-worker deployment should consult [the upgrade notes](https://github.com/matrix-org/synapse/blob/release-v1.76/docs/upgrade.md#changes-to-the-account-data-replication-streams).

Those who are `poetry install`ing from source using our lockfile should ensure their poetry version is 1.3.2 or higher; [see upgrade notes](https://github.com/matrix-org/synapse/blob/release-v1.76/docs/upgrade.md#minimum-version-of-poetry-is-now-132).


Notes on faster joins
---------------------

The faster joins project sees the most benefit when joining a room with a large number of members (joined or historical). We expect it to be particularly useful for joining large public rooms like the [Matrix HQ](https://matrix.to/#/#matrix:matrix.org) or [Synapse Admins](https://matrix.to/#/#synapse:matrix.org) rooms.

After a faster join, Synapse considers that room "partially joined". In this state, you should be able to

- read incoming messages;
- see incoming state changes, e.g. room topic changes; and
- send messages, if the room is unencrypted.

Synapse has to spend more effort to complete the join in the background. Once this finishes, you will be able to

- send messages, if the room is encrypted;
- retrieve room history from before your join, if permitted by the room settings; and
- access the full list of room members.
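
The two lists above can be read as a small capability check. The following is only an illustrative sketch: `RoomJoinStatus` and both helper functions are hypothetical names invented for this example and are not part of Synapse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoomJoinStatus:
    """Hypothetical summary of a room we have joined; not a Synapse class."""

    partial_state: bool  # True while the background resync is still running
    encrypted: bool


def can_send_messages(room: RoomJoinStatus) -> bool:
    # Sending into an encrypted room has to wait for the join to complete.
    return not (room.partial_state and room.encrypted)


def can_access_full_member_list(room: RoomJoinStatus) -> bool:
    # The full member list is only available once the join has completed.
    return not room.partial_state
```

Reading incoming messages and state changes is possible in both states, so it needs no check in this sketch.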


Improved Documentation
----------------------

- Describe the ideas and the internal machinery behind faster joins. ([\#14677](matrix-org/synapse#14677))


Synapse 1.76.0rc2 (2023-01-27)
==============================

Bugfixes
--------

- Faster joins: Fix a bug introduced in Synapse 1.69 where device list EDUs could fail to be handled after a restart when a faster join sync is in progress. ([\#14914](matrix-org/synapse#14914))


Internal Changes
----------------

- Faster joins: Improve performance of looking up partial-state status of rooms. ([\#14917](matrix-org/synapse#14917))


Synapse 1.76.0rc1 (2023-01-25)
==============================

Features
--------

- Update the default room version to [v10](https://spec.matrix.org/v1.5/rooms/v10/) ([MSC 3904](matrix-org/matrix-spec-proposals#3904)). Contributed by @FSG-Cat. ([\#14111](matrix-org/synapse#14111))
- Add a `set_displayname()` method to the module API for setting a user's display name. ([\#14629](matrix-org/synapse#14629))
- Add a dedicated listener configuration for `health` endpoint. ([\#14747](matrix-org/synapse#14747))
- Implement support for [MSC3890](matrix-org/matrix-spec-proposals#3890): Remotely silence local notifications. ([\#14775](matrix-org/synapse#14775))
- Implement experimental support for [MSC3930](matrix-org/matrix-spec-proposals#3930): Push rules for ([MSC3381](matrix-org/matrix-spec-proposals#3381)) Polls. ([\#14787](matrix-org/synapse#14787))
- Per [MSC3925](matrix-org/matrix-spec-proposals#3925), bundle the whole of the replacement with any edited events, and optionally inhibit server-side replacement. ([\#14811](matrix-org/synapse#14811))
- Faster joins: always serve a partial join response to servers that request it with the stable query param. ([\#14839](matrix-org/synapse#14839))
- Faster joins: allow non-lazy-loading ("eager") syncs to complete after a partial join by omitting partial state rooms until they become fully stated. ([\#14870](matrix-org/synapse#14870))
- Faster joins: request partial joins by default. Admins can opt-out of this for the time being---see the upgrade notes. ([\#14905](matrix-org/synapse#14905))


Bugfixes
--------

- Add index to improve performance of the `/timestamp_to_event` endpoint used for jumping to a specific date in the timeline of a room. ([\#14799](matrix-org/synapse#14799))
- Fix a long-standing bug where Synapse would exhaust the stack when processing many federation requests where the remote homeserver has disconnected early. ([\#14812](matrix-org/synapse#14812), [\#14842](matrix-org/synapse#14842))
- Fix rare races when using workers. ([\#14820](matrix-org/synapse#14820))
- Fix a bug introduced in Synapse 1.64.0 when using room version 10 with frozen events enabled. ([\#14864](matrix-org/synapse#14864))
- Fix a long-standing bug where the `populate_room_stats` background job could fail on broken rooms. ([\#14873](matrix-org/synapse#14873))
- Faster joins: Fix a bug in worker deployments where the room stats and user directory would not get updated when finishing a fast join until another event is sent or received. ([\#14874](matrix-org/synapse#14874))
- Faster joins: Fix incompatibility with joins into restricted rooms where no local users have the ability to invite. ([\#14882](matrix-org/synapse#14882))
- Fix a regression introduced in Synapse 1.69.0 which can result in database corruption when database migrations are interrupted on sqlite. ([\#14910](matrix-org/synapse#14910))


Updates to the Docker image
---------------------------

- Bump default Python version in the Dockerfile from 3.9 to 3.11. ([\#14875](matrix-org/synapse#14875))


Improved Documentation
----------------------

- Include `x_forwarded` entry in the HTTP listener example configs and remove the remaining `worker_main_http_uri` entries. ([\#14667](matrix-org/synapse#14667))
- Remove duplicate commands from the Code Style documentation page; point to the Contributing Guide instead. ([\#14773](matrix-org/synapse#14773))
- Add missing documentation for `tag` to `listeners` section. ([\#14803](matrix-org/synapse#14803))
- Updated documentation in configuration manual for `user_directory.search_all_users`. ([\#14818](matrix-org/synapse#14818))
- Add `worker_manhole` to configuration manual. ([\#14824](matrix-org/synapse#14824))
- Fix the example config missing the `id` field in [application service documentation](https://matrix-org.github.io/synapse/latest/application_services.html). ([\#14845](matrix-org/synapse#14845))
- Minor corrections to the logging configuration documentation. ([\#14868](matrix-org/synapse#14868))
- Document the export user data command. Contributed by @thezaidbintariq. ([\#14883](matrix-org/synapse#14883))


Deprecations and Removals
-------------------------

- Poetry 1.3.2 or higher is now required when `poetry install`ing from source. ([\#14860](matrix-org/synapse#14860))


Internal Changes
----------------

- Faster remote room joins (worker mode): do not populate external hosts-in-room cache when sending events as this requires blocking for full state. ([\#14749](matrix-org/synapse#14749))
- Enable Complement tests for Faster Remote Room Joins against worker-mode Synapse. ([\#14752](matrix-org/synapse#14752))
- Add some clarifying comments and refactor a portion of the `Keyring` class for readability. ([\#14804](matrix-org/synapse#14804))
- Add local poetry config files (`poetry.toml`) to `.gitignore`. ([\#14807](matrix-org/synapse#14807))
- Add missing type hints. ([\#14816](matrix-org/synapse#14816), [\#14885](matrix-org/synapse#14885), [\#14889](matrix-org/synapse#14889))
- Refactor push tests. ([\#14819](matrix-org/synapse#14819))
- Re-enable some linting that was disabled when we switched to ruff. ([\#14821](matrix-org/synapse#14821))
- Add `cargo fmt` and `cargo clippy` to the lint script. ([\#14822](matrix-org/synapse#14822))
- Drop unused table `presence`. ([\#14825](matrix-org/synapse#14825))
- Merge the two account data and the two device list replication streams. ([\#14826](matrix-org/synapse#14826), [\#14833](matrix-org/synapse#14833))
- Faster joins: use stable identifiers from [MSC3706](matrix-org/matrix-spec-proposals#3706). ([\#14832](matrix-org/synapse#14832), [\#14841](matrix-org/synapse#14841))
- Add a parameter to control whether the federation client performs a partial state join. ([\#14843](matrix-org/synapse#14843))
- Add check to avoid starting duplicate partial state syncs. ([\#14844](matrix-org/synapse#14844))
- Add an early return when handling no-op presence updates. ([\#14855](matrix-org/synapse#14855))
- Fix `wait_for_stream_position` to correctly wait for the right instance to advance its token. ([\#14856](matrix-org/synapse#14856), [\#14872](matrix-org/synapse#14872))
- Always notify replication when a stream advances automatically. ([\#14877](matrix-org/synapse#14877))
- Reduce max time we wait for stream positions. ([\#14881](matrix-org/synapse#14881))
- Faster joins: allow the resync process more time to fetch `/state` ids. ([\#14912](matrix-org/synapse#14912))
- Bump regex from 1.7.0 to 1.7.1. ([\#14848](matrix-org/synapse#14848))
- Bump peaceiris/actions-gh-pages from 3.9.1 to 3.9.2. ([\#14861](matrix-org/synapse#14861))
- Bump ruff from 0.0.215 to 0.0.224. ([\#14862](matrix-org/synapse#14862))
- Bump types-pillow from 9.4.0.0 to 9.4.0.3. ([\#14863](matrix-org/synapse#14863))
- Bump types-opentracing from 2.4.10 to 2.4.10.1. ([\#14896](matrix-org/synapse#14896))
- Bump ruff from 0.0.224 to 0.0.230. ([\#14897](matrix-org/synapse#14897))
- Bump types-requests from 2.28.11.7 to 2.28.11.8. ([\#14899](matrix-org/synapse#14899))
- Bump types-psycopg2 from 2.9.21.2 to 2.9.21.4. ([\#14900](matrix-org/synapse#14900))
- Bump types-commonmark from 0.9.2 to 0.9.2.1. ([\#14901](matrix-org/synapse#14901))


Synapse 1.75.0 (2023-01-17)
===========================

No significant changes since 1.75.0rc2.


Synapse 1.75.0rc2 (2023-01-12)
==============================

Bugfixes
--------

- Fix a bug introduced in Synapse 1.75.0rc1 where device lists could be miscalculated with some sync filters. ([\#14810](matrix-org/synapse#14810))
- Fix race where calling `/members` or `/state` with an `at` parameter could fail for newly created rooms, when using multiple workers. ([\#14817](matrix-org/synapse#14817))


Synapse 1.75.0rc1 (2023-01-10)
==============================

Features
--------

- Add a `cached` function to `synapse.module_api` that returns a decorator to cache return values of functions. ([\#14663](matrix-org/synapse#14663))
- Add experimental support for [MSC3391](matrix-org/matrix-spec-proposals#3391) (removing account data). ([\#14714](matrix-org/synapse#14714))
- Support [RFC7636](https://datatracker.ietf.org/doc/html/rfc7636) Proof Key for Code Exchange for OAuth single sign-on. ([\#14750](matrix-org/synapse#14750))
- Support non-OpenID compliant userinfo claims for subject and picture. ([\#14753](matrix-org/synapse#14753))
- Improve performance of `/sync` when filtering all rooms, message types, or senders. ([\#14786](matrix-org/synapse#14786))
- Improve performance of the `/hierarchy` endpoint. ([\#14263](matrix-org/synapse#14263))


Bugfixes
--------

- Fix the *MAU Limits* section of the Grafana dashboard relying on a specific `job` name for the workers of a Synapse deployment. ([\#14644](matrix-org/synapse#14644))
- Fix a bug introduced in Synapse 1.70.0 which could cause spurious `UNIQUE constraint failed` errors in the `rotate_notifs` background job. ([\#14669](matrix-org/synapse#14669))
- Ensure stream IDs are always updated after caches get invalidated with workers. Contributed by Nick @ Beeper (@Fizzadar). ([\#14723](matrix-org/synapse#14723))
- Remove the unspecced `device` field from `/pushrules` responses. ([\#14727](matrix-org/synapse#14727))
- Fix a bug introduced in Synapse 1.73.0 where the `picture_claim` configured under `oidc_providers` was unused (the default value of `"picture"` was used instead). ([\#14751](matrix-org/synapse#14751))
- Unescape HTML entities in URL preview titles making use of oEmbed responses. ([\#14781](matrix-org/synapse#14781))
- Disable sending confirmation email when 3pid is disabled. ([\#14725](matrix-org/synapse#14725))


Improved Documentation
----------------------

- Declare support for Python 3.11. ([\#14673](matrix-org/synapse#14673))
- Fix `target_memory_usage` being used in the description for the actual `cache_autotune` sub-option `target_cache_memory_usage`. ([\#14674](matrix-org/synapse#14674))
- Move `email` to Server section in config file documentation. ([\#14730](matrix-org/synapse#14730))
- Fix broken links in the Synapse documentation. ([\#14744](matrix-org/synapse#14744))
- Add missing worker settings to shared configuration documentation. ([\#14748](matrix-org/synapse#14748))
- Document using Twitter as a OAuth 2.0 authentication provider. ([\#14778](matrix-org/synapse#14778))
- Fix Synapse 1.74 upgrade notes to correctly explain how to install pyICU when installing Synapse from PyPI. ([\#14797](matrix-org/synapse#14797))
- Update link to towncrier in contribution guide. ([\#14801](matrix-org/synapse#14801))
- Use `htmltest` to check links in the Synapse documentation. ([\#14743](matrix-org/synapse#14743))


Internal Changes
----------------

- Faster remote room joins: stream the un-partial-stating of events over replication. ([\#14545](matrix-org/synapse#14545), [\#14546](matrix-org/synapse#14546))
- Use [ruff](https://github.com/charliermarsh/ruff/) instead of flake8. ([\#14633](matrix-org/synapse#14633), [\#14741](matrix-org/synapse#14741))
- Change `handle_new_client_event` signature so that a 429 does not reach clients on `PartialStateConflictError`, and internally retry when needed instead. ([\#14665](matrix-org/synapse#14665))
- Remove dependency on jQuery on reCAPTCHA page. ([\#14672](matrix-org/synapse#14672))
- Faster joins: make `compute_state_after_events` consistent with other state-fetching functions that take a `StateFilter`. ([\#14676](matrix-org/synapse#14676))
- Add missing type hints. ([\#14680](matrix-org/synapse#14680), [\#14681](matrix-org/synapse#14681), [\#14687](matrix-org/synapse#14687))
- Improve type annotations for the helper methods on a `CachedFunction`. ([\#14685](matrix-org/synapse#14685))
- Check that the SQLite database file exists before porting to PostgreSQL. ([\#14692](matrix-org/synapse#14692))
- Add `.direnv/` directory to .gitignore to prevent local state generated by the [direnv](https://direnv.net/) development tool from being committed. ([\#14707](matrix-org/synapse#14707))
- Batch up replication requests to request the resyncing of remote users' devices. ([\#14716](matrix-org/synapse#14716))
- If debug logging is enabled, log the `msgid`s of any to-device messages that are returned over `/sync`. ([\#14724](matrix-org/synapse#14724))
- Change GHA CI job to follow best practices. ([\#14772](matrix-org/synapse#14772))
- Switch to our fork of `dh-virtualenv` to work around an upstream Python 3.11 incompatibility. ([\#14774](matrix-org/synapse#14774))
- Skip testing built wheels for PyPy 3.7 on Linux x86_64 as we lack new required dependencies in the build environment. ([\#14802](matrix-org/synapse#14802))

### Dependabot updates

<details>

- Bump JasonEtco/create-an-issue from 2.8.1 to 2.8.2. ([\#14693](matrix-org/synapse#14693))
- Bump anyhow from 1.0.66 to 1.0.68. ([\#14694](matrix-org/synapse#14694))
- Bump blake2 from 0.10.5 to 0.10.6. ([\#14695](matrix-org/synapse#14695))
- Bump serde_json from 1.0.89 to 1.0.91. ([\#14696](matrix-org/synapse#14696))
- Bump serde from 1.0.150 to 1.0.151. ([\#14697](matrix-org/synapse#14697))
- Bump lxml from 4.9.1 to 4.9.2. ([\#14698](matrix-org/synapse#14698))
- Bump types-jsonschema from 4.17.0.1 to 4.17.0.2. ([\#14700](matrix-org/synapse#14700))
- Bump sentry-sdk from 1.11.1 to 1.12.0. ([\#14701](matrix-org/synapse#14701))
- Bump types-setuptools from 65.6.0.1 to 65.6.0.2. ([\#14702](matrix-org/synapse#14702))
- Bump minimum PyYAML to 3.13. ([\#14720](matrix-org/synapse#14720))
- Bump JasonEtco/create-an-issue from 2.8.2 to 2.9.1. ([\#14731](matrix-org/synapse#14731))
- Bump towncrier from 22.8.0 to 22.12.0. ([\#14732](matrix-org/synapse#14732))
- Bump isort from 5.10.1 to 5.11.4. ([\#14733](matrix-org/synapse#14733))
- Bump attrs from 22.1.0 to 22.2.0. ([\#14734](matrix-org/synapse#14734))
- Bump black from 22.10.0 to 22.12.0. ([\#14735](matrix-org/synapse#14735))
- Bump sentry-sdk from 1.12.0 to 1.12.1. ([\#14736](matrix-org/synapse#14736))
- Bump setuptools from 65.3.0 to 65.5.1. ([\#14738](matrix-org/synapse#14738))
- Bump serde from 1.0.151 to 1.0.152. ([\#14758](matrix-org/synapse#14758))
- Bump ruff from 0.0.189 to 0.0.206. ([\#14759](matrix-org/synapse#14759))
- Bump pydantic from 1.10.2 to 1.10.4. ([\#14760](matrix-org/synapse#14760))
- Bump gitpython from 3.1.29 to 3.1.30. ([\#14761](matrix-org/synapse#14761))
- Bump pillow from 9.3.0 to 9.4.0. ([\#14762](matrix-org/synapse#14762))
- Bump types-requests from 2.28.11.5 to 2.28.11.7. ([\#14763](matrix-org/synapse#14763))
- Bump dawidd6/action-download-artifact from 2.24.2 to 2.24.3. ([\#14779](matrix-org/synapse#14779))
- Bump peaceiris/actions-gh-pages from 3.9.0 to 3.9.1. ([\#14791](matrix-org/synapse#14791))
- Bump types-pillow from 9.3.0.4 to 9.4.0.0. ([\#14792](matrix-org/synapse#14792))
- Bump pyopenssl from 22.1.0 to 23.0.0. ([\#14793](matrix-org/synapse#14793))
- Bump types-setuptools from 65.6.0.2 to 65.6.0.3. ([\#14794](matrix-org/synapse#14794))
- Bump importlib-metadata from 4.2.0 to 6.0.0. ([\#14795](matrix-org/synapse#14795))
- Bump ruff from 0.0.206 to 0.0.215. ([\#14796](matrix-org/synapse#14796))
</details>

Synapse 1.74.0 (2022-12-20)
===========================

Improved Documentation
----------------------

- Add release note and update documentation regarding optional ICU support in user search. ([\#14712](matrix-org/synapse#14712))


Synapse 1.74.0rc1 (2022-12-13)
==============================

Features
--------

- Improve user search for international display names. ([\#14464](matrix-org/synapse#14464))
- Stop using deprecated `keyIds` parameter when calling `/_matrix/key/v2/server`. ([\#14490](matrix-org/synapse#14490), [\#14525](matrix-org/synapse#14525))
- Add new `push.enabled` config option to allow opting out of push notification calculation. ([\#14551](matrix-org/synapse#14551), [\#14619](matrix-org/synapse#14619))
- Advertise support for Matrix 1.5 on `/_matrix/client/versions`. ([\#14576](matrix-org/synapse#14576))
- Improve opentracing and logging for to-device message handling. ([\#14598](matrix-org/synapse#14598))
- Allow selecting "prejoin" events by state keys in addition to event types. ([\#14642](matrix-org/synapse#14642))


Bugfixes
--------

- Fix a long-standing bug where a device list update might not be sent to clients in certain circumstances. ([\#14435](matrix-org/synapse#14435), [\#14592](matrix-org/synapse#14592), [\#14604](matrix-org/synapse#14604))
- Suppress a spurious warning when `POST /rooms/<room_id>/<membership>/`, `POST /join/<room_id_or_alias>`, or the unspecced `PUT /join/<room_id_or_alias>/<txn_id>` receive an empty HTTP request body. ([\#14600](matrix-org/synapse#14600))
- Return spec-compliant JSON errors when unknown endpoints are requested. ([\#14620](matrix-org/synapse#14620), [\#14621](matrix-org/synapse#14621))
- Update html templates to load images over HTTPS. Contributed by @ashfame. ([\#14625](matrix-org/synapse#14625))
- Fix a long-standing bug where the user directory would return 1 more row than requested. ([\#14631](matrix-org/synapse#14631))
- Reject invalid read receipt requests with empty room or event IDs. Contributed by Nick @ Beeper (@Fizzadar). ([\#14632](matrix-org/synapse#14632))
- Fix a bug introduced in Synapse 1.67.0 where not specifying a config file or a server URL would lead to the `register_new_matrix_user` script failing. ([\#14637](matrix-org/synapse#14637))
- Fix a long-standing bug where the user directory and room/user stats might be out of sync. ([\#14639](matrix-org/synapse#14639), [\#14643](matrix-org/synapse#14643))
- Fix a bug introduced in Synapse 1.72.0 where the background updates to add non-thread unique indexes on receipts would fail if they were previously interrupted. ([\#14650](matrix-org/synapse#14650))
- Improve validation of field size limits in events. ([\#14664](matrix-org/synapse#14664))
- Fix bugs introduced in Synapse 1.55.0 and 1.69.0 where application services would not be notified of events in the correct rooms, due to stale caches. ([\#14670](matrix-org/synapse#14670))


Improved Documentation
----------------------

- Update worker settings for `pusher` and `federation_sender` functionality. ([\#14493](matrix-org/synapse#14493))
- Add links to third party package repositories, and point to the bug which highlights Ubuntu's out-of-date packages. ([\#14517](matrix-org/synapse#14517))
- Remove old, incorrect minimum postgres version note and replace with a link to the [Dependency Deprecation Policy](https://matrix-org.github.io/synapse/v1.73/deprecation_policy.html). ([\#14590](matrix-org/synapse#14590))
- Add Single-Sign On setup instructions for Mastodon-based instances. ([\#14594](matrix-org/synapse#14594))
- Change `turn_allow_guests` example value to lowercase `true`. ([\#14634](matrix-org/synapse#14634))


Internal Changes
----------------

- Optimise push badge count calculations. Contributed by Nick @ Beeper (@Fizzadar). ([\#14255](matrix-org/synapse#14255))
- Faster remote room joins: stream the un-partial-stating of rooms over replication. ([\#14473](matrix-org/synapse#14473), [\#14474](matrix-org/synapse#14474))
- Share the `ClientRestResource` for both workers and the main process. ([\#14528](matrix-org/synapse#14528))
- Add `--editable` flag to `complement.sh` which uses an editable install of Synapse for faster turn-around times whilst developing iteratively. ([\#14548](matrix-org/synapse#14548))
- Faster joins: use servers list approximation to send read receipts when in partial state instead of waiting for the full state of the room. ([\#14549](matrix-org/synapse#14549))
- Modernize unit tests configuration related to workers. ([\#14568](matrix-org/synapse#14568))
- Bump jsonschema from 4.17.0 to 4.17.3. ([\#14591](matrix-org/synapse#14591))
- Fix Rust lint CI. ([\#14602](matrix-org/synapse#14602))
- Bump JasonEtco/create-an-issue from 2.5.0 to 2.8.1. ([\#14607](matrix-org/synapse#14607))
- Alter some unit test environment parameters to decrease time spent running tests. ([\#14610](matrix-org/synapse#14610))
- Switch to Go recommended installation method for `gotestfmt` template in CI. ([\#14611](matrix-org/synapse#14611))
- Bump phonenumbers from 8.13.0 to 8.13.1. ([\#14612](matrix-org/synapse#14612))
- Bump types-setuptools from 65.5.0.3 to 65.6.0.1. ([\#14613](matrix-org/synapse#14613))
- Bump twine from 4.0.1 to 4.0.2. ([\#14614](matrix-org/synapse#14614))
- Bump types-requests from 2.28.11.2 to 2.28.11.5. ([\#14615](matrix-org/synapse#14615))
- Bump cryptography from 38.0.3 to 38.0.4. ([\#14616](matrix-org/synapse#14616))
- Remove useless cargo install with apt from Dockerfile. ([\#14636](matrix-org/synapse#14636))
- Bump certifi from 2021.10.8 to 2022.12.7. ([\#14645](matrix-org/synapse#14645))
- Bump flake8-bugbear from 22.10.27 to 22.12.6. ([\#14656](matrix-org/synapse#14656))
- Bump packaging from 21.3 to 22.0. ([\#14657](matrix-org/synapse#14657))
- Bump types-pillow from 9.3.0.1 to 9.3.0.4. ([\#14658](matrix-org/synapse#14658))
- Bump serde from 1.0.148 to 1.0.150. ([\#14659](matrix-org/synapse#14659))
- Bump phonenumbers from 8.13.1 to 8.13.2. ([\#14660](matrix-org/synapse#14660))
- Bump authlib from 1.1.0 to 1.2.0. ([\#14661](matrix-org/synapse#14661))
- Move `StateFilter` to `synapse.types`. ([\#14668](matrix-org/synapse#14668))
- Improve type hints. ([\#14597](matrix-org/synapse#14597), [\#14646](matrix-org/synapse#14646), [\#14671](matrix-org/synapse#14671))

In the current implementation, requests that require knowledge of
`m.room.member` events for remote users will *block* until the
resynchronisation completes.

An alternative would be to send the currently known members and send updates to this list over /sync's state. Then it also makes sense to include partial rooms in non-lazyloaded syncs. What do you think about that? I think that's better than blocking, especially when we don't know when or if the room ever fully loads.

Reply:

In the current implementation in Synapse, non-lazy_load_members syncs don't block but instead pretend the room has not been joined until the homeserver has the full state of the room available. Other endpoints, like /members, still block.

I'm not sure whether it's spec-compliant to initially send the known members over /sync (we can propose a change to the spec if it isn't) or how clients would interpret it. It might be that all clients are okay with it. We just haven't tested it.
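
As a rough sketch of the behaviour described in this reply (not Synapse's actual code), the room selection for `/sync` could look like the following. The function name and its parameters are hypothetical, and the assumption that lazy-loading syncs keep partial-state rooms visible is inferred from the discussion rather than stated by it.

```python
from typing import Iterable, List, Set


def rooms_to_include_in_sync(
    joined_rooms: Iterable[str],
    partial_state_rooms: Set[str],
    lazy_load_members: bool,
) -> List[str]:
    """Hypothetical helper: pick the joined rooms a /sync response should list."""
    if lazy_load_members:
        # Assumption: lazy-loading clients do not need the full member list
        # up front, so partial-state rooms can be shown immediately.
        return list(joined_rooms)
    # Eager clients expect full state, so pretend partial-state rooms are not
    # joined until the resync has finished.
    return [room for room in joined_rooms if room not in partial_state_rooms]
```

For example, with joined rooms `["!a:example.org", "!b:example.org"]` and `!b:example.org` still in partial state, an eager sync would list only `!a:example.org`.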

This process means that we are largely trusting remote servers not to send
invalid events (hence the need for a revalidation during the
resynchronisation process); however it does mean that if we have a ban for
a particular user, then their events will be rejected.

> if we have a ban for a particular user, then their events will be rejected.

This seems to be the goal of soft failing, not of the "state at event" check. Before faster remote joins, these banned users' events would still be accepted as outliers. To get the same behavior, we could remove the "state at event" auth check (unless there's a good reason to keep it?).

Reply:

The case we had in mind was where the "true" state of the room after event P has a ban for user @bad:some-homeserver.com and @bad:some-homeserver.com sends an event E, with {P} as its prev_events and citing its membership prior to the ban as an auth event. The faster joining homeserver may or may not have the ban for that user in its idea of the state after P. If it does, we want event E to be correctly rejected rather than trusting its auth events. To reject E, we have to run the "Passes authorization rules based on the (partial) state before the event" check. But the check would also reject most normal messages because their sender's memberships won't be in the partial state, so we combine the partial state with the auth events using state resolution.

In the soft fail case, where @bad:some-homeserver.com is banned in the current state of the room, but picks very old prev_events to send events with, the accept/reject behavior is unchanged.

Reply (Contributor):

Could we "solve" this by adding a new federation API to get the latest state by type and state key (`_matrix/federation/v1/state/<type>/<state_key>` or similar)?

If we assume that we trust the server we joined the room via, we can use that same server to fetch any current `m.room.member` state as needed when events are received. Wouldn't this allow a server to auth events completely without being fully synchronised?

state before the event, otherwise it is rejected". Since we do not know
the (full) state before the event, we can no longer apply this
check. Instead, we perform a state-resolution between the limited state
that we do have, and the event's auth events; we then check that the

This state resolution will be extremely expensive, because nearly all events are conflicting.

Reply:

I'm not very familiar with state resolution, but if we take the expected case where the auth events match those in the partial state, we'd be calculating state_res( partial_state={ create, power levels, name, topic, ... }, auth_events={ create, power levels, sender membership }).

You're right that the conflicted state set would include most state events in the room: {name, topic, ..., sender membership}, but wouldn't this set be small, since most memberships aren't in the room's partial state? I'd also expect the auth difference to be small because both sets of state contain the same create and power levels events in the common case.
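
To make the example concrete, here is a minimal sketch of how the two state sets split into unconflicted and conflicted parts, following the state resolution v2 definition (a key is unconflicted only if it is present with the same event in every state set). The event IDs, the user ID and the `split_state` helper are made up for illustration.

```python
from typing import Dict, List, Set, Tuple

StateKey = Tuple[str, str]  # (event type, state key)
StateMap = Dict[StateKey, str]  # maps to an event ID


def split_state(
    state_sets: List[StateMap],
) -> Tuple[StateMap, Dict[StateKey, Set[str]]]:
    """Split state sets into the unconflicted map and the conflicted set."""
    unconflicted: StateMap = {}
    conflicted: Dict[StateKey, Set[str]] = {}
    for key in set().union(*state_sets):
        values = {state.get(key) for state in state_sets}
        if len(values) == 1 and None not in values:
            # Present with the same event ID in every state set.
            unconflicted[key] = values.pop()
        else:
            # Missing from some set, or different event IDs: conflicted.
            conflicted[key] = {v for v in values if v is not None}
    return unconflicted, conflicted


partial_state = {
    ("m.room.create", ""): "$create",
    ("m.room.power_levels", ""): "$power_levels",
    ("m.room.name", ""): "$name",
    ("m.room.topic", ""): "$topic",
}
auth_events = {
    ("m.room.create", ""): "$create",
    ("m.room.power_levels", ""): "$power_levels",
    ("m.room.member", "@alice:example.org"): "$alice_join",
}
unconflicted, conflicted = split_state([partial_state, auth_events])
```

Here the create and power levels events are unconflicted, while the name, topic and sender membership end up in the conflicted set, each with a single candidate event.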

Reply:

I can't remember the rationale for doing state res here. I think a way to combine the partial state with the auth events was needed and state resolution was the usual way we combine state. Something simpler, like filling in the gaps in partial state with the auth events, might be appropriate too, but I may be overlooking some considerations.
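
A minimal sketch of the simpler "fill in the gaps" idea, under heavy assumptions: events are reduced to their content, the only auth rule checked is that the sender is joined, and `fill_gaps` and `sender_is_joined` are hypothetical helpers. This is not what the MSC or Synapse does, which is to use state resolution.

```python
from typing import Dict, Tuple

StateKey = Tuple[str, str]  # (event type, state key)
StateMap = Dict[StateKey, dict]  # maps to simplified event content


def fill_gaps(partial_state: StateMap, auth_events: StateMap) -> StateMap:
    # Entries we already know win; auth events only fill in missing keys.
    return {**auth_events, **partial_state}


def sender_is_joined(state: StateMap, sender: str) -> bool:
    # Drastically simplified stand-in for the real authorization rules.
    member = state.get(("m.room.member", sender))
    return member is not None and member.get("membership") == "join"


# We know about the ban, so the stale join cited as an auth event is ignored.
partial_state = {("m.room.member", "@bad:some-homeserver.com"): {"membership": "ban"}}
bad_auth = {("m.room.member", "@bad:some-homeserver.com"): {"membership": "join"}}
# A normal sender's membership is missing from the partial state, so the
# cited auth event fills the gap.
ok_auth = {("m.room.member", "@alice:example.org"): {"membership": "join"}}
```

With these inputs, the banned user's event would fail the check while the normal sender's event would pass, which is the ban-enforcing behaviour the thread is after.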

Reply from the PR author (Member):

> I can't remember the rationale for doing state res here. I think a way to combine the partial state with the auth events was needed and state resolution was the usual way we combine state.

Yes, basically this. It was an attempt to enforce bans, as far as possible.

> since most memberships aren't in the room's partial state

Wouldn't this fill up with time and when the room is almost fully synced include a lot of events?

Comment on lines +97 to +98

(This is [pending implementation](https://github.com/matrix-org/synapse/issues/12989) in Synapse.)

Comment (Contributor):

I think this line is redundant, now that the Synapse issue is closed (by this PR).

Suggested change
(This is [pending implementation](https://github.com/matrix-org/synapse/issues/12989) in Synapse.)

Labels: `hacktoberfest-accepted`, `kind:core`, `needs-implementation`, `proposal`, `s2s`

7 participants