-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Instrument _check_sigs_and_hash_and_fetch
to trace time spent in child concurrent calls
#13588
Instrument _check_sigs_and_hash_and_fetch
to trace time spent in child concurrent calls
#13588
Conversation
_check_sigs_and_hash_and_fetch
_check_sigs_and_hash_and_fetch
to trace time spent in child concurrent calls
logger.warning("Event contains spam, soft-failing %s", pdu.event_id) | ||
log_kv( | ||
{ | ||
"message": "Event contains spam, redacting (to save disk space) " | ||
"as well as soft-failing (to stop using the event in prev_events)", | ||
"event_id": pdu.event_id, | ||
} | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is very interesting. I wonder if we're seeing some of the backfilled messages as spam, soft-failing them, which causes the Unable to get missing prev_event $luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M
error when trying to figure out _get_state_ids_after_missing_prev_event
.
We should definitely skip trying to backfill events over and over when we see them as spam.
But also there is legit events that reference the spam as prev_events
and we will fail to backfill them at the moment 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look like reasonable debug points to me
Thanks for the review @reivilibre 🐭 |
Synapse 1.67.0 (2022-09-13) =========================== This release removes using the deprecated direct TCP replication configuration for workers. Server admins should use Redis instead. See the [upgrade notes](https://matrix-org.github.io/synapse/v1.67/upgrade.html#upgrading-to-v1670). The minimum version of `poetry` supported for managing source checkouts is now 1.2.0. **Notice:** from the next major release (1.68.0) installing Synapse from a source checkout will require a recent Rust compiler. Those using packages or `pip install matrix-synapse` will not be affected. See the [upgrade notes](https://matrix-org.github.io/synapse/v1.67/upgrade.html#upgrading-to-v1670). **Notice:** from the next major release (1.68.0), running Synapse with a SQLite database will require SQLite version 3.27.0 or higher. (The [current minimum version is SQLite 3.22.0](https://github.com/matrix-org/synapse/blob/release-v1.67/synapse/storage/engines/sqlite.py#L69-L78).) See [matrix-org#12983](matrix-org#12983) and the [upgrade notes](https://matrix-org.github.io/synapse/v1.67/upgrade.html#upgrading-to-v1670) for more details. No significant changes since 1.67.0rc1. Synapse 1.67.0rc1 (2022-09-06) ============================== Features -------- - Support setting the registration shared secret in a file, via a new `registration_shared_secret_path` configuration option. ([\matrix-org#13614](matrix-org#13614)) - Change the default startup behaviour so that any missing "additional" configuration files (signing key, etc) are generated automatically. ([\matrix-org#13615](matrix-org#13615)) - Improve performance of sending messages in rooms with thousands of local users. ([\matrix-org#13634](matrix-org#13634)) Bugfixes -------- - Fix a bug introduced in Synapse 1.13 where the [List Rooms admin API](https://matrix-org.github.io/synapse/develop/admin_api/rooms.html#list-room-api) would return integers instead of booleans for the `federatable` and `public` fields when using a Sqlite database. ([\matrix-org#13509](matrix-org#13509)) - Fix bug that user cannot `/forget` rooms after the last member has left the room. ([\matrix-org#13546](matrix-org#13546)) - Faster Room Joins: fix `/make_knock` blocking indefinitely when the room in question is a partial-stated room. ([\matrix-org#13583](matrix-org#13583)) - Fix loading the current stream position behind the actual position. ([\matrix-org#13585](matrix-org#13585)) - Fix a longstanding bug in `register_new_matrix_user` which meant it was always necessary to explicitly give a server URL. ([\matrix-org#13616](matrix-org#13616)) - Fix the running of [MSC1763](matrix-org/matrix-spec-proposals#1763) retention purge_jobs in deployments with background jobs running on a worker by forcing them back onto the main worker. Contributed by Brad @ Beeper. ([\matrix-org#13632](matrix-org#13632)) - Fix a long-standing bug that downloaded media for URL previews was not deleted while database background updates were running. ([\matrix-org#13657](matrix-org#13657)) - Fix [MSC3030](matrix-org/matrix-spec-proposals#3030) `/timestamp_to_event` endpoint to return the correct next event when the events have the same timestamp. ([\matrix-org#13658](matrix-org#13658)) - Fix bug where we wedge media plugins if clients disconnect early. Introduced in v1.22.0. ([\matrix-org#13660](matrix-org#13660)) - Fix a long-standing bug which meant that keys for unwhitelisted servers were not returned by `/_matrix/key/v2/query`. ([\matrix-org#13683](matrix-org#13683)) - Fix a bug introduced in Synapse v1.20.0 that would cause the unstable unread counts from [MSC2654](matrix-org/matrix-spec-proposals#2654) to be calculated even if the feature is disabled. ([\matrix-org#13694](matrix-org#13694)) Updates to the Docker image --------------------------- - Update docker image to use a stable version of poetry. ([\matrix-org#13688](matrix-org#13688)) Improved Documentation ---------------------- - Improve the description of the ["chain cover index"](https://matrix-org.github.io/synapse/latest/auth_chain_difference_algorithm.html) used internally by Synapse. ([\matrix-org#13602](matrix-org#13602)) - Document how ["monthly active users"](https://matrix-org.github.io/synapse/latest/usage/administration/monthly_active_users.html) is calculated and used. ([\matrix-org#13617](matrix-org#13617)) - Improve documentation around user registration. ([\matrix-org#13640](matrix-org#13640)) - Remove documentation of legacy `frontend_proxy` worker app. ([\matrix-org#13645](matrix-org#13645)) - Clarify documentation that HTTP replication traffic can be protected with a shared secret. ([\matrix-org#13656](matrix-org#13656)) - Remove unintentional colons from [config manual](https://matrix-org.github.io/synapse/latest/usage/configuration/config_documentation.html) headers. ([\matrix-org#13665](matrix-org#13665)) - Update docs to make enabling metrics more clear. ([\matrix-org#13678](matrix-org#13678)) - Clarify `(room_id, event_id)` global uniqueness and how we should scope our database schemas. ([\matrix-org#13701](matrix-org#13701)) Deprecations and Removals ------------------------- - Drop support for calling `/_matrix/client/v3/rooms/{roomId}/invite` without an `id_access_token`, which was not permitted by the spec. Contributed by @Vetchu. ([\matrix-org#13241](matrix-org#13241)) - Remove redundant `_get_joined_users_from_context` cache. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13569](matrix-org#13569)) - Remove the ability to use direct TCP replication with workers. Direct TCP replication was deprecated in Synapse v1.18.0. Workers now require using Redis. ([\matrix-org#13647](matrix-org#13647)) - Remove support for unstable [private read receipts](matrix-org/matrix-spec-proposals#2285). ([\matrix-org#13653](matrix-org#13653), [\matrix-org#13692](matrix-org#13692)) Internal Changes ---------------- - Extend the release script to wait for GitHub Actions to finish and to be usable as a guide for the whole process. ([\matrix-org#13483](matrix-org#13483)) - Add experimental configuration option to allow disabling legacy Prometheus metric names. ([\matrix-org#13540](matrix-org#13540)) - Cache user IDs instead of profiles to reduce cache memory usage. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13573](matrix-org#13573), [\matrix-org#13600](matrix-org#13600)) - Optimize how Synapse calculates domains to fetch from during backfill. ([\matrix-org#13575](matrix-org#13575)) - Comment about a better future where we can get the state diff between two events. ([\matrix-org#13586](matrix-org#13586)) - Instrument `_check_sigs_and_hash_and_fetch` to trace time spent in child concurrent calls for understandable traces in Jaeger. ([\matrix-org#13588](matrix-org#13588)) - Improve performance of `@cachedList`. ([\matrix-org#13591](matrix-org#13591)) - Minor speed up of fetching large numbers of push rules. ([\matrix-org#13592](matrix-org#13592)) - Optimise push action fetching queries. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13597](matrix-org#13597)) - Rename `event_map` to `unpersisted_events` when computing the auth differences. ([\matrix-org#13603](matrix-org#13603)) - Refactor `get_users_in_room(room_id)` mis-use with dedicated `get_current_hosts_in_room(room_id)` function. ([\matrix-org#13605](matrix-org#13605)) - Use dedicated `get_local_users_in_room(room_id)` function to find local users when calculating `join_authorised_via_users_server` of a `/make_join` request. ([\matrix-org#13606](matrix-org#13606)) - Refactor `get_users_in_room(room_id)` mis-use to lookup single local user with dedicated `check_local_user_in_room(...)` function. ([\matrix-org#13608](matrix-org#13608)) - Drop unused column `application_services_state.last_txn`. ([\matrix-org#13627](matrix-org#13627)) - Improve readability of Complement CI logs by printing failure results last. ([\matrix-org#13639](matrix-org#13639)) - Generalise the `@cancellable` annotation so it can be used on functions other than just servlet methods. ([\matrix-org#13662](matrix-org#13662)) - Introduce a `CommonUsageMetrics` class to share some usage metrics between the Prometheus exporter and the phone home stats. ([\matrix-org#13671](matrix-org#13671)) - Add some logging to help track down matrix-org#13444. ([\matrix-org#13679](matrix-org#13679)) - Update poetry lock file for v1.2.0. ([\matrix-org#13689](matrix-org#13689)) - Add cache to `is_partial_state_room`. ([\matrix-org#13693](matrix-org#13693)) - Update the Grafana dashboard that is included with Synapse in the `contrib` directory. ([\matrix-org#13697](matrix-org#13697)) - Only run trial CI on all python versions on non-PRs. ([\matrix-org#13698](matrix-org#13698)) - Fix typechecking with latest types-jsonschema. ([\matrix-org#13712](matrix-org#13712)) - Reduce number of CI checks we run for PRs. ([\matrix-org#13713](matrix-org#13713)) # -----BEGIN PGP SIGNATURE----- # # iQFEBAABCgAuFiEEBTGR3/RnAzBGUif3pULk7RsPrAkFAmMgR2QQHGVyaWtAbWF0 # cml4Lm9yZwAKCRClQuTtGw+sCfG7B/94PwW1ChsaI8hkz/3e+93PEl/mNJ6YFaEB # 5pP4Dh/0dipP/iKbpgNuj5xz/JFnIi8D49A8sKNnku3jk0/8AZHgqDiBgOkrN76z # Y3awo5Q9ag4xww/105V3bhdnX1NrX8Avf6F2jchDv6/9q8wQHGBPg6DMgfZ/m/BL # SB4dypbbNpgLykuwtWxx6YMUYH+trsXJOn/MoAqld3QcZsqkDR25wXCt9+Dr+6AT # dPd/czi8kV8ruU59tf2K5HB7XKzBW9S3Qb3dJJmGOTTJ7ccUkN/XuTwqnII950Mo # bSlMXjY2hqk8rKUNhGZpi9bqUkwNhMgOkZl9A0Y1XtsXx6yjy0T/ # =zSGi # -----END PGP SIGNATURE----- # gpg: Signature made Tue Sep 13 10:03:32 2022 BST # gpg: using RSA key 053191DFF4670330465227F7A542E4ED1B0FAC09 # gpg: issuer "erik@matrix.org" # gpg: Can't check signature: No public key # Conflicts: # synapse/config/experimental.py # synapse/push/bulk_push_rule_evaluator.py # synapse/storage/databases/main/event_push_actions.py # synapse/util/caches/descriptors.py
Instrument
_check_sigs_and_hash_and_fetch
to trace time spent in child concurrent calls because I've see_check_sigs_and_hash_and_fetch
take 10.41s to process 100 events.Fix #13587
Part of #13356
Interestingly, each
pdu
takes incrementally longer as you go,Pull Request Checklist
EventStore
toEventWorkerStore
.".code blocks
.Pull request includes a sign off(run the linters)