Start application service stream token tracking from 1 #12193
Factored out into #12193.
Well this is fun, thanks for the explanation! The fix certainly makes sense. Though I wonder if we actually want to send all the current presence updates to the application service on first start? Otherwise it won't ever see people that are e.g. always online. It's worth noting that the code has specifically chosen to send all presence updates to the streamer if it falls too far behind. If that is much of a problem then we should also fix that.
@erikjohnston Thanks for taking a look! By "all known presence states" above, I meant that the AS would be receiving presence updates for every user, not just users the AS is interested in receiving the presence of. Looking through again though, I think I misread this bit of code as including every user, which it isn't. synapse/synapse/handlers/presence.py Lines 1657 to 1660 in e24ff8e
For clarity, we're now going through the truthy body of this conditional, whereas before synapse/synapse/handlers/presence.py Lines 1630 to 1660 in e24ff8e
So actually, this change is removing presence updates from the initial burst for users that have never set a presence state. Instead of receiving an "offline" state for them, they'll just be omitted. This is a change in behaviour, but probably isn't a problem as the AS would just assume those users would be offline anyways? ...I'm going to change that to
Co-authored-by: Erik Johnston <erik@matrix.org>
…_min_type_stream_id_for_as
Synapse 1.57.0 (2022-04-19) =========================== This version includes a [change](matrix-org/synapse#12209) to the way transaction IDs are managed for application services. If your deployment uses a dedicated worker for application service traffic, **it must be stopped** when the database is upgraded (which normally happens when the main process is upgraded), to ensure the change is made safely without any risk of reusing transaction IDs. See the [upgrade notes](https://github.com/matrix-org/synapse/blob/v1.57.0rc1/docs/upgrade.md#upgrading-to-v1570) for more details. No significant changes since 1.57.0rc1. Synapse 1.57.0rc1 (2022-04-12) ============================== Features -------- - Send device list changes to application services as specified by [MSC3202](matrix-org/matrix-spec-proposals#3202), using unstable prefixes. The `msc3202_transaction_extensions` experimental homeserver config option must be enabled and `org.matrix.msc3202: true` must be present in the application service registration file for device list changes to be sent. The "left" field is currently always empty. ([\#11881](matrix-org/synapse#11881)) - Optimise fetching large quantities of missing room state over federation. ([\#12040](matrix-org/synapse#12040)) - Offload the `update_client_ip` background job from the main process to the background worker, when using Redis-based replication. ([\#12251](matrix-org/synapse#12251)) - Move `update_client_ip` background job from the main process to the background worker. ([\#12252](matrix-org/synapse#12252)) - Add a module callback to react to new 3PID (email address, phone number) associations. ([\#12302](matrix-org/synapse#12302)) - Add a configuration option to remove a specific set of rooms from sync responses. ([\#12310](matrix-org/synapse#12310)) - Add a module callback to react to account data changes. ([\#12327](matrix-org/synapse#12327)) - Allow setting user admin status using the module API. Contributed by Famedly. 
([\#12341](matrix-org/synapse#12341)) - Reduce overhead of restarting synchrotrons. ([\#12367](matrix-org/synapse#12367), [\#12372](matrix-org/synapse#12372)) - Update `/messages` to use historic pagination tokens if no `from` query parameter is given. ([\#12370](matrix-org/synapse#12370)) - Add a module API for reading and writing global account data. ([\#12391](matrix-org/synapse#12391)) - Support the stable `v1` endpoint for `/relations`, per [MSC2675](matrix-org/matrix-spec-proposals#2675). ([\#12403](matrix-org/synapse#12403)) - Include bundled aggregations in search results ([MSC3666](matrix-org/matrix-spec-proposals#3666)). ([\#12436](matrix-org/synapse#12436)) Bugfixes -------- - Fix a long-standing bug where updates to the server notices user profile (display name/avatar URL) in the configuration would not be applied to pre-existing rooms. Contributed by Jorge Florian. ([\#12115](matrix-org/synapse#12115)) - Fix a long-standing bug where events from ignored users were still considered for bundled aggregations. ([\#12235](matrix-org/synapse#12235), [\#12338](matrix-org/synapse#12338)) - Fix non-member state events not resolving for historical events when used in [MSC2716](matrix-org/matrix-spec-proposals#2716) `/batch_send` `state_events_at_start`. ([\#12329](matrix-org/synapse#12329)) - Fix a long-standing bug affecting URL previews that would generate a 500 response instead of a 403 if the previewed URL includes a port that isn't allowed by the relevant blacklist. ([\#12333](matrix-org/synapse#12333)) - Default to `private` room visibility rather than `public` when a client does not specify one, according to spec. ([\#12350](matrix-org/synapse#12350)) - Fix a spec compliance issue where requests to the `/publicRooms` federation API would specify `limit` as a string. 
([\#12364](matrix-org/synapse#12364), [\#12410](matrix-org/synapse#12410)) - Fix a bug introduced in Synapse 1.49.0 which caused the `synapse_event_persisted_position` metric to have invalid values. ([\#12390](matrix-org/synapse#12390)) Updates to the Docker image --------------------------- - Bundle locked versions of dependencies into the Docker image. ([\#12385](matrix-org/synapse#12385), [\#12439](matrix-org/synapse#12439)) - Fix up healthcheck generation for workers docker image. ([\#12405](matrix-org/synapse#12405)) Improved Documentation ---------------------- - Clarify documentation for running SyTest against Synapse, including use of Postgres and worker mode. ([\#12271](matrix-org/synapse#12271)) - Document the behaviour of `LoggingTransaction.call_after` and `LoggingTransaction.call_on_exception` methods when transactions are retried. ([\#12315](matrix-org/synapse#12315)) - Update dead links in `check-newsfragment.sh` to point to the correct documentation URL. ([\#12331](matrix-org/synapse#12331)) - Upgrade the version of `mdbook` in CI to 0.4.17. ([\#12339](matrix-org/synapse#12339)) - Updates to the Room DAG concepts development document to clarify that we mark events as outliers because we don't have any state for them. ([\#12345](matrix-org/synapse#12345)) - Update the link to Redis pub/sub documentation in the workers documentation. ([\#12369](matrix-org/synapse#12369)) - Remove documentation for converting a legacy structured logging configuration to the new format. ([\#12392](matrix-org/synapse#12392)) Deprecations and Removals ------------------------- - Remove the unused and unstable `/aggregations` endpoint which was removed from [MSC2675](matrix-org/matrix-spec-proposals#2675). ([\#12293](matrix-org/synapse#12293)) Internal Changes ---------------- - Remove lingering unstable references to MSC2403 (knocking). ([\#12165](matrix-org/synapse#12165)) - Avoid trying to calculate the state at outlier events. 
([\#12191](matrix-org/synapse#12191), [\#12316](matrix-org/synapse#12316), [\#12330](matrix-org/synapse#12330), [\#12332](matrix-org/synapse#12332), [\#12409](matrix-org/synapse#12409)) - Omit sending "offline" presence updates to application services after they are initially configured. ([\#12193](matrix-org/synapse#12193)) - Switch to using a sequence to generate AS transaction IDs. Contributed by Nick @ Beeper. If running synapse with a dedicated appservice worker, this MUST be stopped before upgrading the main process and database. ([\#12209](matrix-org/synapse#12209)) - Add missing type hints for storage. ([\#12267](matrix-org/synapse#12267)) - Add missing type definitions for scripts in docker folder. Contributed by Jorge Florian. ([\#12280](matrix-org/synapse#12280)) - Move [MSC2654](matrix-org/matrix-spec-proposals#2654) support behind an experimental configuration flag. ([\#12295](matrix-org/synapse#12295)) - Update docstrings to explain how to decipher live and historic pagination tokens. ([\#12317](matrix-org/synapse#12317)) - Add ground work for speeding up device list updates for users in large numbers of rooms. ([\#12321](matrix-org/synapse#12321)) - Fix typechecker problems exposed by signedjson 1.1.2. ([\#12326](matrix-org/synapse#12326)) - Remove the `tox` packaging job: it will be redundant once #11537 lands. ([\#12334](matrix-org/synapse#12334)) - Ignore `.envrc` for `direnv` users. ([\#12335](matrix-org/synapse#12335)) - Remove the (broadly unused, dev-only) dockerfile for pg tests. ([\#12336](matrix-org/synapse#12336)) - Remove redundant `get_success` calls in test code. ([\#12346](matrix-org/synapse#12346)) - Add type annotations for `tests/unittest.py`. ([\#12347](matrix-org/synapse#12347)) - Move single-use methods out of `TestCase`. ([\#12348](matrix-org/synapse#12348)) - Remove broken and unused development scripts. 
([\#12349](matrix-org/synapse#12349), [\#12351](matrix-org/synapse#12351), [\#12355](matrix-org/synapse#12355)) - Convert `Linearizer` tests from `inlineCallbacks` to async. ([\#12353](matrix-org/synapse#12353)) - Update docstrings for `ReadWriteLock` tests. ([\#12354](matrix-org/synapse#12354)) - Refactor `Linearizer`, convert methods to async and use an async context manager. ([\#12357](matrix-org/synapse#12357)) - Fix a long-standing bug where `Linearizer`s could get stuck if a cancellation were to happen at the wrong time. ([\#12358](matrix-org/synapse#12358)) - Make `StreamToken.from_string` and `RoomStreamToken.parse` propagate cancellations instead of replacing them with `SynapseError`s. ([\#12366](matrix-org/synapse#12366)) - Add type hints to tests files. ([\#12371](matrix-org/synapse#12371)) - Allow specifying the Postgres database's port when running unit tests with Postgres. ([\#12376](matrix-org/synapse#12376)) - Remove temporary pin of signedjson<=1.1.1 that was added in Synapse 1.56.0. ([\#12379](matrix-org/synapse#12379)) - Add opentracing spans to calls to external cache. ([\#12380](matrix-org/synapse#12380)) - Lay groundwork for using `poetry` to manage Synapse's dependencies. ([\#12381](matrix-org/synapse#12381), [\#12407](matrix-org/synapse#12407), [\#12412](matrix-org/synapse#12412), [\#12418](matrix-org/synapse#12418)) - Make missing `importlib_metadata` dependency explicit. ([\#12384](matrix-org/synapse#12384), [\#12400](matrix-org/synapse#12400)) - Update type annotations for compatiblity with prometheus_client 0.14. ([\#12389](matrix-org/synapse#12389)) - Remove support for the unstable identifiers specified in [MSC3288](matrix-org/matrix-spec-proposals#3288). ([\#12398](matrix-org/synapse#12398)) - Add missing type hints to configuration classes. ([\#12402](matrix-org/synapse#12402)) - Add files used to build the Docker image used for complement testing into the Synapse repository. 
([\#12404](matrix-org/synapse#12404)) - Do not include groups in the sync response when disabled. ([\#12408](matrix-org/synapse#12408)) - Improve type hints related to HTTP query parameters. ([\#12415](matrix-org/synapse#12415)) - Stop maintaining a list of lint targets. ([\#12420](matrix-org/synapse#12420)) - Make `synapse._scripts` pass type checks. ([\#12421](matrix-org/synapse#12421), [\#12422](matrix-org/synapse#12422)) - Add some type hints to datastore. ([\#12423](matrix-org/synapse#12423)) - Enable certificate checking during complement tests. ([\#12435](matrix-org/synapse#12435)) - Explicitly specify the `tls` extra for Twisted dependency. ([\#12444](matrix-org/synapse#12444))
Warning: this explanation is a bit complex. Let me know if, after reading to the end of it, you think it'd be easier to explain over voice.
An off-by-one error. Pulled out of #11881, though I felt it deserved a bit more attention than being buried in there.
The `application_services_state` table is used, among other things, to track the last stream token (5, 6, 7...) of a given stream type (presence, typing, device lists, etc.) that a given application service has previously been informed of. The `ApplicationServiceHandler` then uses this to determine the set (from the stored stream token up to the latest known stream token) of new presence updates, typing notifications, etc. to send to the application service.

For example, say this is the stream of presence updates that have occurred:

the `application_services_state` helps track which events have already been sent out to a given appservice:

with this, the `ApplicationServiceHandler` knows that it needs to send out the presence updates with stream tokens 4-8 the next time.

I noticed an edge case during #11881, in that the default value for a stream token across Synapse is 1:
`StreamIdGenerator` sequences start at `step` (defined as 1 for everything except backfill):
synapse/synapse/storage/util/id_generators.py
Lines 86 to 89 in 0326888

`MultiWriterIdGenerator` sequences also start at 1 (though the first entity will actually start at 2):
synapse/synapse/storage/util/id_generators.py
Lines 344 to 349 in 10a88ba

As an aside, when we create an event, we usually ask for the next available ID, which is initially 2. So the minimum stream token for any entity is 2.

This is inconsistent with `get_type_stream_id_for_appservice`, which assumes stream tokens start from 0 by default:
synapse/synapse/storage/databases/main/appservice.py
Lines 448 to 449 in 2cc5ea9
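To make the bookkeeping described above concrete, here is a minimal sketch of it. This is not Synapse's code: the dict standing in for the `application_services_state` table, the helper names `get_last_sent_token` and `tokens_to_send`, the appservice ID `my_as`, and the stored value of 3 are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the application_services_state table:
# (appservice ID, stream type) -> last stream token sent to that appservice.
LastSent = Dict[Tuple[str, str], int]


def get_last_sent_token(
    state: LastSent, as_id: str, stream_type: str, default: int
) -> int:
    """Return the stored token, or `default` if nothing was ever sent."""
    return state.get((as_id, stream_type), default)


def tokens_to_send(last_sent: int, latest: int) -> List[int]:
    """Stream tokens strictly after `last_sent`, up to and including `latest`."""
    return list(range(last_sent + 1, latest + 1))


# Assume the appservice was last informed of presence token 3 and the
# latest known presence token is 8.
state: LastSent = {("my_as", "presence"): 3}
last_sent = get_last_sent_token(state, "my_as", "presence", default=1)
pending = tokens_to_send(last_sent, latest=8)  # tokens 4 through 8

# A brand-new appservice has no row, so the fallback value decides where
# it starts. This PR is about that fallback being 0 rather than 1.
fresh_start = get_last_sent_token({}, "new_as", "presence", default=1)
```

The only point of the sketch is that the fallback for a missing row is a real input to the rest of the system, so it needs to agree with where stream tokens actually start.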
Why would that be a problem? Well, it turns out some bits of the codebase assume stream token positions start at 1, and checking against 0 can cause bugs. I ran into one here:
synapse/synapse/storage/databases/main/devices.py
Lines 702 to 708 in 53f9d01
where we call `get_all_entities_changed` with a minimum stream token to figure out which users have had their device lists changed. This method has a conditional here:
synapse/synapse/util/caches/stream_change_cache.py
Lines 147 to 156 in f8d0f72

that we kept unintentionally hitting, as `self._earliest_known_stream_pos` is 1 by default (it is derived from the current stream token when the homeserver starts). Thus this function was returning `None`, even though there was a device list change (at stream token 2) that had occurred, and that our appservice should know about.

The impact of this is that the appservice would never receive word of that first device list change, as `from_key` would then be incremented.

For any positive impacts on existing EDU -> AS code, I can confirm that presence's logic has a similar bug: `from_key` would be `0` here:
synapse/synapse/handlers/presence.py
Lines 1630 to 1660 in e24ff8e

which looks like it would lead to sending all known presence states to the AS (#10836?).
Typing didn't have this bug as we don't record stream tokens for that, and receipts is not affected either as it just applies the range to the database without a stream cache potentially bailing out in the middle.
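The device list failure described earlier can be reproduced with a toy model of the stream change cache. This is a simplified sketch, not the real `StreamChangeCache`: the class `ToyStreamChangeCache`, its `entity_has_changed` signature, and the user ID are assumptions. Only the "return `None` when asked about a position earlier than the earliest known one" behaviour is taken from the description above.

```python
from typing import Dict, List, Optional


class ToyStreamChangeCache:
    """Toy model: remembers which entity changed at which stream position."""

    def __init__(self, current_stream_pos: int) -> None:
        # Derived from the current stream token at startup (1 on a fresh server).
        self._earliest_known_stream_pos = current_stream_pos
        self._changes: Dict[int, str] = {}  # stream position -> entity

    def entity_has_changed(self, entity: str, stream_pos: int) -> None:
        self._changes[stream_pos] = entity

    def get_all_entities_changed(self, stream_pos: int) -> Optional[List[str]]:
        # The cache cannot answer for positions before its earliest known one.
        if stream_pos < self._earliest_known_stream_pos:
            return None
        return [
            entity
            for pos, entity in sorted(self._changes.items())
            if pos > stream_pos
        ]


cache = ToyStreamChangeCache(current_stream_pos=1)
cache.entity_has_changed("@alice:example.org", 2)  # first device list change

from_zero = cache.get_all_entities_changed(0)  # old default: cache bails out
from_one = cache.get_all_entities_changed(1)   # new default: change is found
```

With a starting token of 0 the toy cache gives no answer, so the change at token 2 is never reported. Starting from 1 returns it.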