This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Complement TestPartialStateJoin/Can_change_display_name_during_partial_state_join is flakey #15086

Open
DMRobertson opened this issue Feb 16, 2023 · 5 comments
Labels
A-Federated-Join joins over federation generally suck A-Profiles Displaynames, avatars A-Testing Issues related to testing in complement, synapse, etc T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks. Z-Dev-Wishlist Makes developers' lives better, but doesn't have direct user impact Z-Flake Tests that give intermittent failures

Comments

@DMRobertson
Contributor

Test added recently in matrix-org/complement#567

Failed with workers: https://github.com/matrix-org/synapse/actions/runs/4196627901/jobs/7277955660

   federation_room_join_partial_state_test.go:3407: Display name change event not received after one second

Maybe bump the timeout? 🤷

@DMRobertson DMRobertson added T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks. Z-Flake Tests that give intermittent failures A-Testing Issues related to testing in complement, synapse, etc A-Profiles Displaynames, avatars labels Feb 16, 2023
@squahtx
Contributor

squahtx commented Feb 17, 2023

It looks like the federation sender for the Complement hostname:port in question was stuck in a retry backoff left over from an earlier test, so raising the timeout may not be sufficient to deflake this test.

2023-02-16T18:15:17.1268502Z federation_sender1 | 2023-02-16 17:58:24,592 - synapse.federation.sender.transaction_manager - 121 - INFO - federation_transaction_transmission_loop-69 - TX [host.docker.internal:33229] {1676570133846} Sending transaction [1676570133846], (PDUs: 0, EDUs: 1)

2023-02-16T18:15:17.1317260Z federation_sender1 | 2023-02-16 17:58:24,896 - synapse.http.federation.matrix_federation_agent - 363 - INFO - federation_transaction_transmission_loop-69 - Failed to connect to host.docker.internal:33229: Connection was refused by other side: 111: Connection refused.
2023-02-16T18:15:17.1318122Z federation_sender1 | 2023-02-16 17:58:24,896 - synapse.http.matrixfederationclient - 672 - INFO - federation_transaction_transmission_loop-69 - {PUT-O-73} [host.docker.internal:33229] Request failed: PUT matrix://host.docker.internal:33229/_matrix/federation/v1/send/1676570133846: ConnectionRefusedError('Connection refused')

2023-02-16T18:15:17.1506818Z federation_sender1 | 2023-02-16 17:58:29,579 - synapse.http.federation.matrix_federation_agent - 363 - INFO - federation_transaction_transmission_loop-69 - Failed to connect to host.docker.internal:33229: Connection was refused by other side: 111: Connection refused.
2023-02-16T18:15:17.1507631Z federation_sender1 | 2023-02-16 17:58:29,579 - synapse.http.matrixfederationclient - 672 - INFO - federation_transaction_transmission_loop-69 - {PUT-O-73} [host.docker.internal:33229] Request failed: PUT matrix://host.docker.internal:33229/_matrix/federation/v1/send/1676570133846: ConnectionRefusedError('Connection refused')

2023-02-16T18:15:17.2006041Z master             | 2023-02-16 17:58:45,894 - synapse.replication.http.membership - 89 - INFO - POST-525 - remote_join: @t45alice:hs1 into room: !0-XCj272ZvqAN6H5W06J:host.docker.internal:33229

2023-02-16T18:15:17.2036181Z event_creator1     | 2023-02-16 17:58:46,435 - synapse.access.http.18013 - 460 - INFO - PUT-155 - ::ffff:127.0.0.1 - 18013 - {@t45alice:hs1} Processed request: 0.210sec/0.002sec (0.008sec, 0.021sec) (0.011sec/0.067sec/22) 2B 200 "PUT /_matrix/client/v3/profile/@t45alice:hs1/displayname HTTP/1.0" "Go-http-client/1.1" [4 dbevts]

2023-02-16T18:15:17.2086530Z federation_sender1 | 2023-02-16 17:58:49,380 - synapse.http.federation.matrix_federation_agent - 363 - INFO - federation_transaction_transmission_loop-69 - Failed to connect to host.docker.internal:33229: Connection was refused by other side: 111: Connection refused.
2023-02-16T18:15:17.2087337Z federation_sender1 | 2023-02-16 17:58:49,380 - synapse.http.matrixfederationclient - 672 - INFO - federation_transaction_transmission_loop-69 - {PUT-O-73} [host.docker.internal:33229] Request failed: PUT matrix://host.docker.internal:33229/_matrix/federation/v1/send/1676570133846: ConnectionRefusedError('Connection refused')

Note that all three "Failed to connect" log lines are for the same PUT-O-73 transaction from before the failing test case.
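As a quick way to check that observation, here is a minimal sketch that groups the "Request failed" lines by outbound request ID, destination and transaction ID. The log lines are the ones quoted above with the CI timestamp and worker-name prefix stripped; the regex and the `failed_requests` helper are my own illustration, not part of Synapse or Complement.

```python
import re
from collections import Counter

# Common tail of the three "Request failed" lines quoted above.
SUFFIX = (
    " - synapse.http.matrixfederationclient - 672 - INFO - "
    "federation_transaction_transmission_loop-69 - {PUT-O-73} "
    "[host.docker.internal:33229] Request failed: PUT "
    "matrix://host.docker.internal:33229/_matrix/federation/v1/send/1676570133846: "
    "ConnectionRefusedError('Connection refused')"
)

# Synapse-side timestamps of the three failures.
LOG_LINES = [
    "2023-02-16 17:58:24,896" + SUFFIX,
    "2023-02-16 17:58:29,579" + SUFFIX,
    "2023-02-16 17:58:49,380" + SUFFIX,
]

# Request ID in braces, destination in brackets, transaction ID from the /send/ path.
PATTERN = re.compile(
    r"\{(?P<request>[A-Z]+-O-\d+)\} \[(?P<dest>[^\]]+)\] "
    r"Request failed: PUT \S+/send/(?P<txn>\d+):"
)


def failed_requests(lines):
    """Count failed outbound requests per (request ID, destination, transaction ID)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match["request"], match["dest"], match["txn"])] += 1
    return counts


counts = failed_requests(LOG_LINES)
for key, n in counts.items():
    print(key, n)
```

A single key with a count of three means every failure is a retry of the same request, which is what you'd expect if the sender is still working through a transaction queued before the failing test started.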

@squahtx squahtx added A-Federated-Join joins over federation generally suck Z-Dev-Wishlist Makes developers' lives better, but doesn't have direct user impact labels Feb 17, 2023
@DMRobertson
Contributor Author

I wonder if we can get Synapse to clear out its backoff data when a new deployment is made?
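To make the idea concrete, here is a toy model of per-destination backoff state with a reset hook. This is only a sketch of the concept: the `BackoffTable` class, its method names and the intervals are all hypothetical and are not how Synapse actually stores or exposes its retry timings.

```python
class BackoffTable:
    """Toy per-destination retry backoff (hypothetical, not Synapse's implementation)."""

    def __init__(self, min_interval=1.0, multiplier=2.0):
        self.min_interval = min_interval
        self.multiplier = multiplier
        # destination -> (earliest time we may retry, current backoff interval)
        self._state = {}

    def can_send(self, destination, now):
        """Return True if the destination is not currently backed off."""
        retry_after, _ = self._state.get(destination, (0.0, 0.0))
        return now >= retry_after

    def record_failure(self, destination, now):
        """Start backing off, or lengthen the existing backoff interval."""
        _, interval = self._state.get(destination, (0.0, 0.0))
        interval = self.min_interval if interval == 0 else interval * self.multiplier
        self._state[destination] = (now + interval, interval)

    def record_success(self, destination):
        """Forget the backoff state for one destination."""
        self._state.pop(destination, None)

    def reset(self):
        """Forget all backoff state, e.g. when a new deployment is made."""
        self._state.clear()


table = BackoffTable(min_interval=10.0)
dest = "host.docker.internal:33229"

# An earlier test's server goes away: two failed sends, 10 s then 20 s backoff.
table.record_failure(dest, now=0.0)
table.record_failure(dest, now=10.0)

# A later test reuses the same hostname:port while the backoff is still active.
blocked = not table.can_send(dest, now=11.0)

# Clearing the table on a new deployment lets the next send go out straight away.
table.reset()
unblocked = table.can_send(dest, now=11.0)
```

In this model the later test only sees traffic once the old interval expires or the state is cleared, which is why a longer test timeout on its own would only help if it outlasted the remaining backoff.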

@DMRobertson
Contributor Author

https://github.com/matrix-org/synapse/actions/runs/4215073441/jobs/7315978888

time="2023-02-19T07:29:44Z" level=error msg="Failed to fetch key for server" context=missing error="Post "https://127.0.0.1:33010/_matrix/key/v2/query": context canceled" fetcher=DirectKeyFetcher
2023/02/19 07:29:44 complement: Transaction '1676791589957': HTTP Code 401. Invalid http request: {Invalid request signature}

Might be the same problem?

@squahtx squahtx changed the title Complement Can_change_display_name_during_partial_state_join is flakey Complement TestPartialStateJoin/Can_change_display_name_during_partial_state_join is flakey Feb 21, 2023
@squahtx
Contributor

squahtx commented Mar 29, 2023

Presumed to be fixed by matrix-org/complement#626.

@squahtx squahtx closed this as completed Mar 29, 2023
@anoadragon453
Member

Unfortunately, I just experienced a flake on this test in https://github.com/matrix-org/synapse/actions/runs/4893678003/jobs/8737133564?pr=15544#step:7:14018 (the PR branch is only 2 commits behind current develop).
