-
Notifications
You must be signed in to change notification settings - Fork 411
MSC3391: API to delete account data #3391
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
150b07b
faacd53
1864715
9c1ac3f
861933a
cd78ac6
60b1bd3
88f7f7c
da1328a
333264f
ae1785a
75b1ade
edc0d54
7516cf6
e2fce0a
0653f34
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,101 @@ | ||
# MSC3391: Deleting account data | ||
ShadowJonathan marked this conversation as resolved.
Show resolved
Hide resolved
anoadragon453 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
Currently, in Matrix, there is no semantic way to "remove" or "delete" account data. There is only a | ||
method to create and update it, them being effectively the same operation, but no method to "remove" | ||
it. | ||
|
||
This violates the CRUD principle, and disallows "cleaning up" account data, or de-reference certain | ||
keys used for certain actions, as then they will stay around forever. | ||
|
||
## Proposal | ||
|
||
This proposal aims to: | ||
- Define "`{}`" account data content as effectively "deleted", alongside out-of-band deletion semantics. | ||
- Allow `DELETE` HTTP methods on account data endpoints to perform the deletion. | ||
Comment on lines
+13
to
+14
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's not entirely clear to me that we should have both of these mechanisms. My experience is that, where the protocol allows two ways of doing a thing, they will end up being subtly different somehow and it leads to compatibility problems (eg, one server implementation will forget to implement DELETE because the client they test with only uses PUT). Given that we don't have any special server-side semantics for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In this MSC, The primary benefit of using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As an alternative, should we define |
||
|
||
The first, defining `{}` as "deleted", is to codify this defacto "nullification" into the spec as | ||
expected behavior for deletion. This paves the way for better client interpretation of that data, | ||
and would allow backwards compatibility when clients do `PUT` `{}` as account data. (More on this in | ||
the dedicated section) | ||
|
||
### Endpoints | ||
ShadowJonathan marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
Account data and room data currently have the following endpoints: | ||
|
||
Global Account Data, | ||
[`GET`](https://spec.matrix.org/v1.4/client-server-api/#get_matrixclientv3useruseridaccount_datatype) | ||
and | ||
[`PUT`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3useruseridaccount_datatype): | ||
``` | ||
GET /_matrix/client/v3/user/{userId}/account_data/{type} | ||
|
||
PUT /_matrix/client/v3/user/{userId}/account_data/{type} | ||
``` | ||
|
||
And Room-specific Account Data, | ||
[`GET`](https://spec.matrix.org/v1.4/client-server-api/#get_matrixclientv3useruseridroomsroomidaccount_datatype) | ||
and | ||
[`PUT`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3useruseridroomsroomidaccount_datatype): | ||
``` | ||
GET /_matrix/client/v3/user/{userId}/rooms/{roomId}/account_data/{type} | ||
|
||
PUT /_matrix/client/v3/user/{userId}/rooms/{roomId}/account_data/{type} | ||
``` | ||
|
||
This proposal adds the following two endpoints (with no request body): | ||
``` | ||
DELETE /_matrix/client/v3/user/{userId}/account_data/{type} | ||
|
||
DELETE /_matrix/client/v3/user/{userId}/rooms/{roomId}/account_data/{type} | ||
``` | ||
ShadowJonathan marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
These, respectively, remove account-wide account data, and room-scoped account data. | ||
ShadowJonathan marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
For idempotency reasons, these endpoints always return `200 OK`, with an empty JSON body `{}`. | ||
|
||
For any account data key that cannot be set (like `m.fully_read`), the delete API will have a likewise error response. | ||
|
||
These endpoints are authenticated, and can be rate-limited. | ||
|
||
#### Deleted account data responses | ||
|
||
Furthermore, when a client deletes account data, it must expect the `GET` methods above to return a 404 on | ||
the next request. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd like to document our particularly complex portion of the work when implementing this MSC in Synapse. Not a blocker or an issue with the feature (edit: and the problem is not unique to this feature either). After an account data entry has been deleted, that change must be broadcast to all devices. Once that's been done, it would be ideal if we could remove the record of that account data type ever being set in the homeserver (for reduction of user data and (slight) storage benefits). This gets a little complicated when you try to quantify at what point all devices have a user have seen the delete. One way to do this (what is currently planned for Synapse) is to track the devices that haven't yet seen the delete, and then tick them off the list once they have. But what does it mean for a device to have seen an account data change? If they have And what if a device requests All of this falls squarely into the area of implementation - but it may be useful to other homeserver implementations. None of this is an issue if the implementation chooses to keep "deleted" account data entries in their database forever. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Presumably servers can implement the same logic as for when they need to figure out whether a device has seen a to-device message (so that the server can delete it)? Especially as to-device messages have a better case for robustness than account data. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good point - it is indeed very similar. The only difference is that account data has this extra way of being accessed (via There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I'm not convinced it does, entirely. It seems important to me that we specify when exactly a client can expect the 'deleted' account data to be served. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I might be missing something obvious but how much is this problem different from any other change in account data? What if it's a good ol' PUT and some other device crashes at processing the /sync batch that has this change? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @ShadowJonathan this thread appears to be unresolved, and I think it's blocking FCP from starting. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Can we even tie access tokens to device IDs like this, or do we assume this per access token? That brings up a whole can of worms for me too, and so i wish to assume that deletion of account data be a best-effort case only. I think that enumerating and deleting account data only if all access tokens have seen it is not ideal from a performance perspective, since stale access tokens can "block" deletion of data like this, and it'd require a whole tracking system for what-access-token-has-seen-what in implementations, which i'm not aware exists right now. Ultimately i'd wish to lean on @anoadragon453 for opinion on this; I think that However, what about clients that are offline? Do we expect implementations to internally store tombstones to their stream tokens, to notify remaining access tokens of such a delete? Won't that go against the storage savings, or will it increase performance as it only queries a "small" table every time such a sync happens, to search for such tombstones? What if a client re-uses an old sync token, after it has restarted sync, or used an additional sync? This behaviour is undefined, and with the case of deleting data, even the reference of itself that it has been deleted, it might become problematic. And i concur with @richvdh, this is not implementation-specific, as then different implementations will have different ways of specifying how and when data is deleted. Imo, it should be tied to |
||
|
||
#### Backwards Compatibility | ||
|
||
For backwards compatibility reasons, if a client `PUT`s `{}` for account data, that must be seen | ||
equivalent as `DELETE`-ing that account data. | ||
|
||
### Sync | ||
|
||
Account Data changes are announced through sync; this proposal also aims to change this response slightly after account data deletion. | ||
|
||
On incremental syncs (sync with `since`), in paths `account_data.events` and `rooms.join.{room_id}.account_data.events`, | ||
a `{}` for event content must be interpreted as a deletion by the client. | ||
|
||
These only occur in incremental syncs. An initial sync (without `sync`) must never contain keys with content `{}`, | ||
even if the delete has just occurred. | ||
|
||
## Alternatives | ||
|
||
Instead of sending down deletions through `.events` as `{}`, we could use a new `.deleted_events` | ||
to send down the keys of deleted events. | ||
|
||
anoadragon453 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
## Security considerations | ||
|
||
No considerable security problems, other than the fact that a user can potentially delete important data. | ||
|
||
## Unstable prefix | ||
|
||
When implementing this proposal, servers should use the `org.matrix.msc3391` unstable prefix: | ||
|
||
``` | ||
DELETE /_matrix/client/unstable/org.matrix.msc3391/user/{userId}/account_data/{type} | ||
|
||
DELETE /_matrix/client/unstable/org.matrix.msc3391/user/{userId}/rooms/{roomId}/account_data/{type} | ||
``` | ||
|
||
The `unstable_features` key for this MSC is `org.matrix.msc3391`, it has to be set to `true`. | ||
|
||
Any other value or absence of this key will signal the absence of support for this feature. |
Uh oh!
There was an error while loading. Please reload this page.