Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

When a remote user leaves the last room shared with the local homeserver, their cached device list is not invalidated #13651

Open
squahtx opened this issue Aug 26, 2022 · 2 comments · Fixed by #13749
Assignees
Labels
A-Device-List-Tracking Telling clients about other devices. Often related to E2EE. O-Occasional Affects or can be seen by some users regularly or most users rarely S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@squahtx
Copy link
Contributor

squahtx commented Aug 26, 2022

and /keys/query continues to return stale information, even after the remote user rejoins the room.

  1. Have a room with one local user, @alice:local and one remote user, @bob:remote.
  2. Have @alice:local do a /keys/query request to cache @bob:remote's device list.
  3. Have @bob:remote leave the room.
  4. @alice:local observes @bob:remote in the device_lists.left section of /sync.
  5. (bad) If @bob:remote then updates their device list and @alice:local does a /keys/query request, a stale response is returned.
  6. @bob:remote rejoins the room.
  7. @alice:local observes @bob:remote in the device_lists.changed section of /sync.
  8. (very bad) If @alice:local does a /keys/query request, a stale response is still returned.
@squahtx squahtx added A-Device-List-Tracking Telling clients about other devices. Often related to E2EE. S-Major Major functionality / product severely impaired, no satisfactory workaround. O-Occasional Affects or can be seen by some users regularly or most users rarely T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. labels Aug 26, 2022
@squahtx squahtx self-assigned this Aug 26, 2022
@squahtx squahtx changed the title When a remote user leaves the last room shared with a local user, their cached device list is not invalidated When a remote user leaves the last room shared with the local homeserver, their cached device list is not invalidated Aug 30, 2022
@squahtx
Copy link
Contributor Author

squahtx commented Sep 9, 2022

There is a complement test covering this bug in: matrix-org/complement#459

squahtx added a commit that referenced this issue Sep 14, 2022
When a remote user leaves the last room shared with the homeserver, we
have to mark their device list as unsubscribed, otherwise we would hold
on to a stale device list in our cache. Crucially, the device list would
remain cached even after the remote user rejoined the room, which could
lead to E2EE failures until the next change to the remote user's device
list.

Fixes #13651.

Signed-off-by: Sean Quah <seanq@matrix.org>
@squahtx
Copy link
Contributor Author

squahtx commented Sep 16, 2022

This is fixed for future leaves of remote users. However, existing homeservers still retain possibly-stale device lists in their cache. We will need a background job to mark them all as unsubscribed (like mark_remote_user_device_list_as_unsubscribed does).

#13749 added logging for when a local user queries the devices of a remote user we no longer share any rooms with and we find that we have a device list cached. That logging is triggering for ~2% of /keys/query requests.

Returning stale data for remote users which have left a room is ok, since clients presumably want to decrypt historical messages rather than new ones. Things only go wrong once the remote user rejoins a room, because we will continue serving up stale data.

As a result of #13749, we have inadvertently created a new failure mode where historical messages are undecryptable when the departed remote user's homeserver is unavailable. We possibly need to be a bit more clever and serve up the cached data when:

  • we fail to retrieve it from the remote user's homeserver and
  • the remote user no longer shares any rooms with us and
  • we believe the cached data was up to date at the time the remote user stopped sharing any rooms with us (no entry in the device_lists_remote_resync table)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Device-List-Tracking Telling clients about other devices. Often related to E2EE. O-Occasional Affects or can be seen by some users regularly or most users rarely S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant