MSC3983: Sending One-Time Key (OTK) claims to appservices #3983
@@ -1,2 +1,6 @@
[default]
check-filename = true

[default.extend-identifiers]
OTK = "OTK"
OTKs = "OTKs"

@@ -0,0 +1,180 @@
# MSC3983: Sending One-Time Key (OTK) claims to appservices

Presently in Matrix, the public portions of OTKs are [uploaded](https://spec.matrix.org/v1.6/client-server-api/#uploading-keys)
to the homeserver to ensure other devices can encrypt new messages without requiring the device to
be online and responsive. This works for devices operating exclusively over the Client-Server API.
However, [appservices](https://spec.matrix.org/v1.6/application-service-api/) looking to support
encryption (through [MSC3202](https://github.com/matrix-org/matrix-spec-proposals/pull/3202) or
similar) could have millions or billions of users on them, which can easily translate to quite a
few public keys needing to be uploaded to the homeserver.

Given that an appservice *generally* has an uptime equivalent to that of the homeserver itself, and will
have already stored the public portion of its OTKs somewhere, we can save a bit of duplication by
having the homeserver delegate [`/keys/claim`](https://spec.matrix.org/v1.6/client-server-api/#post_matrixclientv3keysclaim)
requests to the appservice.

In numbers, a conservative estimate for an interoperable messaging bridge (appservice) would be
500 million users. Each user generates between 50 and 100 OTKs, so we'll pick the low end at 50.
That's 25 **billion** public keys. Currently in Matrix, that means the appservice stores 25 billion
keys and the homeserver stores a copy of those 25 billion keys.

This proposal introduces a mechanism for saving the homeserver from duplicating 25 billion keys.

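The estimate above is plain multiplication; the sketch below only reproduces that arithmetic for both ends of the quoted 50 to 100 range. The function name is hypothetical and no per-key storage size is assumed, since the proposal does not give one.

```python
USERS = 500_000_000  # conservative user count quoted in the text


def total_public_keys(users: int, otks_per_user: int) -> int:
    """Number of OTK public keys that one party has to store."""
    return users * otks_per_user


low = total_public_keys(USERS, 50)    # low end used in the text
high = total_public_keys(USERS, 100)  # high end of the quoted range

print(f"{low:,} to {high:,} keys per stored copy")
print(f"{2 * low:,} keys overall when appservice and homeserver both keep the low-end set")
```
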
## Background

Appservices can register a [namespace](https://spec.matrix.org/v1.6/application-service-api/#registration)
of users either exclusively (no one else can register users matching the regex) or implicitly, i.e.
non-exclusively (the appservice receives events about those users, but can't prevent registration).
Implicit namespaces can be shared across multiple appservices.

## Proposal

For users under an appservice's exclusive namespace, if that user has no unused OTKs (excluding fallback
keys) on the homeserver, the homeserver proxies the following APIs to the appservice using the new
API described below:
* [`/_matrix/client/v3/keys/claim`](https://spec.matrix.org/v1.6/client-server-api/#post_matrixclientv3keysclaim)
* [`/_matrix/federation/v1/user/keys/claim`](https://spec.matrix.org/v1.6/server-server-api/#post_matrixfederationv1userkeysclaim)

**`POST /_matrix/app/v1/keys/claim`**
```jsonc
// Request
{
  "@alice:example.org": {
    "DEVICEID": ["signed_curve25519", "signed_curve25519"] // device ID to algorithm names
  },
  // ...
}
```
```jsonc
// Response
{
  "@alice:example.org": {
    "DEVICEID": {
      "signed_curve25519:AAAAHg": {
        "key": "...",
        "signatures": {
          "@alice:example.org": {
            "ed25519:DEVICEID": "..."
          }
        }
      },
      "signed_curve25519:BBBBHg": {
        "key": "...",
        "signatures": {
          "@alice:example.org": {
            "ed25519:DEVICEID": "..."
          }
        }
      }
    }
  },
  // ...
}
```

*Note*: Like other appservice endpoints, this endpoint should *not* be ratelimited and *does* require
normal [authentication](https://spec.matrix.org/v1.6/application-service-api/#authorization).

Multiple users, devices, and keys for those devices can be claimed in a single request. This is to
allow homeservers to batch multiple client/federation requests into a single request on the appservice,
if desirable. This is an optional optimization for homeserver implementations. In the example above, 2
keys are claimed for one device.

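As a rough illustration of the optional batching described above, the following sketch merges several pending claim bodies into one request body of the shape shown in the example. The function name and type alias are hypothetical; splitting the appservice's response back out to the original callers is not shown.

```python
from collections import defaultdict

# user ID -> device ID -> list of algorithm names (one entry per key wanted)
ClaimRequest = dict[str, dict[str, list[str]]]


def merge_claims(requests: list[ClaimRequest]) -> ClaimRequest:
    """Combine several claim bodies into a single body for the appservice."""
    merged = defaultdict(lambda: defaultdict(list))
    for request in requests:
        for user_id, devices in request.items():
            for device_id, algorithms in devices.items():
                # Repeating an algorithm name asks for one more key of that type.
                merged[user_id][device_id].extend(algorithms)
    # Convert back to plain dicts so the result serialises cleanly to JSON.
    return {user_id: dict(devices) for user_id, devices in merged.items()}


from_client = {"@alice:example.org": {"DEVICEID": ["signed_curve25519"]}}
from_federation = {
    "@alice:example.org": {"DEVICEID": ["signed_curve25519"]},
    "@bob:example.org": {"PHONE": ["signed_curve25519"]},
}
batched = merge_claims([from_client, from_federation])
```
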
If the appservice responds with an error of any kind (including timeout), the homeserver uses the
fallback key, if known. The homeserver additionally uses the fallback key (if known) to fill in
missing keys from the appservice. For example, if the homeserver requested 2 keys for Alice but
the appservice only provided 1, the homeserver would use the fallback key to fulfill the second.

Comment on lines +83 to +86:
* I'm left wondering why the appservice would bother uploading fallback keys when (to my reading) it could return them directly as part of the response. I suppose uploading them would still be useful for error conditions, however.

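A minimal sketch of the fill-in rule for a single device, assuming the homeserver keeps at most one fallback key per algorithm. The function name and the shape of the `fallback` argument are hypothetical. Because the fallback key has a single key ID, it can cover at most one missing key per algorithm in the returned map.

```python
from collections import Counter


def fill_with_fallback(requested, returned, fallback):
    """
    requested: algorithm names asked for, e.g. ["signed_curve25519", "signed_curve25519"]
    returned:  key ID -> key object from the appservice ({} on error or timeout)
    fallback:  algorithm -> (key ID, key object) already known to the homeserver
    """
    result = dict(returned)
    # Key IDs look like "signed_curve25519:AAAAHg"; the prefix is the algorithm.
    have = Counter(key_id.split(":", 1)[0] for key_id in returned)
    for algorithm, wanted in Counter(requested).items():
        if have[algorithm] < wanted and algorithm in fallback:
            key_id, key = fallback[algorithm]
            result.setdefault(key_id, key)
    return result


one_key = {"signed_curve25519:AAAAHg": {"key": "..."}}
known_fallback = {"signed_curve25519": ("signed_curve25519:FALLBK", {"key": "..."})}
filled = fill_with_fallback(
    ["signed_curve25519", "signed_curve25519"], one_key, known_fallback
)
```
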
When the appservice serves claims itself, it is responsible for ensuring it doesn't use a key twice.
Because no OTKs are uploaded, the `device_one_time_keys_count` field for the appservice (over MSC3202,
for example) would be zero. Many implementations upload more keys when this field falls below a
threshold: appservices intending to use the new API should not perform those uploads, because once
unused OTKs exist on the homeserver the claim is no longer proxied, which means, quite simply, not
using the new API.

Comment on lines +88 to +92:
* I'm having a hard time understanding what this paragraph is attempting to say.
* Is it simply trying to say that implementations implementing MSC3983 should ignore
* It's more of a warning to existing implementations of crypto: nearly all of them do a
* Hmm sure, but what are implementations supposed to do instead? That's the part I'm unclear about from the above.
* "not that" :p Normally implementation details like this wouldn't be called out, however given the chance for everyone to have the exact same bug (intending to use new API, uploads keys by accident, new API isn't used), this is called out here.

Normally the homeserver [ensures](https://spec.matrix.org/v1.6/client-server-api/#one-time-and-fallback-keys)
that OTKs are only used once. With the appservice serving the endpoint, it becomes the responsibility
of the appservice to perform this check.

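One way an appservice might enforce single use is to remove each key from its store at the moment it is handed out. The in-memory class below is a hypothetical sketch of that idea; a real appservice would presumably persist the removal atomically and guard against concurrent claims, neither of which is shown.

```python
class OneTimeKeyPool:
    """Hypothetical in-memory store of unused one-time public keys."""

    def __init__(self):
        # (user ID, device ID, algorithm) -> list of (key ID, key object)
        self._unused = {}

    def add(self, user_id, device_id, key_id, key):
        algorithm = key_id.split(":", 1)[0]
        slot = self._unused.setdefault((user_id, device_id, algorithm), [])
        slot.append((key_id, key))

    def claim(self, request):
        """Build a claim response, consuming every key that is returned."""
        response = {}
        for user_id, devices in request.items():
            for device_id, algorithms in devices.items():
                for algorithm in algorithms:
                    available = self._unused.get((user_id, device_id, algorithm), [])
                    if not available:
                        continue  # leave the gap; the homeserver may use a fallback key
                    key_id, key = available.pop()  # popped, so never served again
                    response.setdefault(user_id, {}).setdefault(device_id, {})[key_id] = key
        return response


pool = OneTimeKeyPool()
pool.add("@alice:example.org", "DEVICEID", "signed_curve25519:AAAAHg", {"key": "..."})
pool.add("@alice:example.org", "DEVICEID", "signed_curve25519:BBBBHg", {"key": "..."})
request = {"@alice:example.org": {"DEVICEID": ["signed_curve25519", "signed_curve25519"]}}
first = pool.claim(request)
second = pool.claim(request)  # the pool is now empty for this device
```
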
If the homeserver uses the fallback key, that will be communicated in the traditional ways to the
appservice (namely through `device_unused_fallback_key_types` in the case of MSC3202).

We don't apply this API to implicit (non-exclusive) users as it's possible for multiple appservices
to have a namespace covering the user: instead of guessing or going around to each, we require the
user to be in an exclusive namespace. This guarantees that there's only one appservice responsible
for the user.

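Putting the two conditions together (exclusive namespace, no unused OTKs on the homeserver), the proxying decision could be sketched as follows. The `Appservice` dataclass, its field names, and the use of `re.fullmatch` for namespace matching are assumptions made for illustration, not part of the proposal.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Appservice:
    id: str
    # Regexes from the registration's user namespaces marked as exclusive.
    exclusive_user_patterns: list[str] = field(default_factory=list)


def responsible_appservice(user_id, appservices):
    """Return the appservice whose exclusive namespace covers the user, if any."""
    for appservice in appservices:
        if any(re.fullmatch(p, user_id) for p in appservice.exclusive_user_patterns):
            return appservice
    return None


def appservice_to_proxy_to(user_id, unused_otk_count, appservices):
    """Return the appservice to send the claim to, or None to serve it locally."""
    if unused_otk_count > 0:
        return None  # uploaded OTKs exist, so the homeserver serves them itself
    return responsible_appservice(user_id, appservices)


bridge = Appservice("bridge", [r"@_bridge_.*:example\.org"])
target = appservice_to_proxy_to("@_bridge_alice:example.org", 0, [bridge])
```
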
## Returning extra keys

**TODO**: This is probably best as its own MSC.

Independent of the appservice having `/keys/claim` proxied to it, it may be desirable for both the
fallback and one-time key to be returned. Servers should *always* include the fallback key alongside

Comment on lines +110 to +111:
* In addition to both fallback and OTKs, @dkasak suggested that it could be helpful to request multiple keys at once over the C-S API. In particular having 0 or 1 keys of multiple algorithms could be useful. I implemented this using the same unstable endpoint by accepting a map of
* The Synapse PR was updated to use the same form sent to appservices (i.e. a list of algorithms).
* We'll probably want to add something saying that clients may need to deal with getting back fewer than the requested number of keys and that servers should protect against users requesting too many keys at once to avoid exhaustion.

the requested OTKs. When using this proposal's new endpoint, the server should use the fallback key
from the appservice's response rather than a previously stored fallback key, if present (if the
appservice doesn't respond with a fallback key then the server uses the stored fallback key instead,
if known).

Comment on lines +112 to +115:
* This doesn't define if the appservice should be queried for fallback keys if an OTK is in the database.
* In lieu of MSC text: the appservice is supposed to be queried, similar to MSC3984 where the server uses what it knows if the appservice didn't provide it.
* Actually I suppose the text in the MSC is enough as is:
  This would not match what you just wrote though and would mean it is not possible for an appservice to only provide fallback keys. I'm not sure if that's a good thing or a bad thing though. 🤷 If we really do want that then I think we want to use different endpoints or something. Otherwise the appservice wouldn't know if we truly want to query for OTKs or only for fallback keys (it could end up returning OTKs which go unused?)

The server SHOULD NOT replace any uploaded fallback keys with ones returned by the appservice via
this proposal. The appservice MUST re-upload the fallback key if it wants to replace it, as it would
do upon first (known) use.

Clients can determine which of the returned keys is the fallback key by the presence of
`fallback: true` on that key.

Servers MUST NOT mark the fallback key as "used" unless no other OTKs are returned.

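The two rules above can be sketched for a single device's returned keys as follows. Both function names are hypothetical, and since this section is flagged as probably belonging in its own MSC, the sketch should be read as illustrative only.

```python
def split_claimed_keys(device_keys):
    """Separate one-time keys from fallback keys using the `fallback: true` marker."""
    one_time, fallback = {}, {}
    for key_id, key in device_keys.items():
        if isinstance(key, dict) and key.get("fallback") is True:
            fallback[key_id] = key
        else:
            one_time[key_id] = key
    return one_time, fallback


def fallback_keys_to_mark_used(device_keys):
    """Server side: only mark the fallback key used when no OTK accompanies it."""
    one_time, fallback = split_claimed_keys(device_keys)
    return [] if one_time else list(fallback)


claimed = {
    "signed_curve25519:AAAAHg": {"key": "..."},
    "signed_curve25519:FALLBK": {"key": "...", "fallback": True},
}
one_time, fallback = split_claimed_keys(claimed)
```
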
## Potential issues

As described, the appservice could be offline or in fact experience a worse uptime than the homeserver.
This new API is optional for appservices: an appservice which doesn't want to use it (because it knows
its uptime will be bad) can simply upload keys in advance, just like before this proposal. Similarly, if
the appservice is relying on the API (by not uploading OTKs) but is offline when a claim arrives, it
*should* have uploaded a fallback key for the homeserver to continue using as, well, a fallback.

Comment on lines +130 to +132:
* Who is offline in this sentence? And which API are we talking about? Also, who is they?
* This whole MSC is in context of an appservice.
* I gathered as much 😄 But I'm still having trouble parsing that sentence in exact terms. e.g. If I interpret "but is offline" as "but the appservice is offline", how is the appservice able to do anything given it's offline? I'd suggest rewriting the sentence with less implicit words / pronouns.
* The appservice is opting into using the API by not uploading keys, effectively. This is it "using" the API, which might be the confusing part?

For appservices which never intend to upload keys there is a bit of a wasted lookup to see if there are
any keys for the user(s). This could be mitigated with an implementation-specific flag to skip the lookup
and just do proxying, though for the general case in this MSC the fallback key consideration is kept for
reliability concerns.

Similarly, if an appservice doesn't intend on uploading keys (because it doesn't support encryption) and
indicates the route is [unknown](https://spec.matrix.org/v1.6/application-service-api/#unknown-routes),
the homeserver could apply a backoff before calling the appservice again, to prevent excessive calls.

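The proposal leaves the backoff policy to implementations. One possible shape is exponential backoff per appservice, sketched below; the class name, the default delays, and the doubling policy are all assumptions rather than anything the MSC specifies.

```python
import time


class UnknownRouteBackoff:
    """Hypothetical per-appservice exponential backoff for unknown-route responses."""

    def __init__(self, base_seconds=60.0, max_seconds=3600.0, clock=time.monotonic):
        self._base = base_seconds
        self._max = max_seconds
        self._clock = clock
        self._failures = {}  # appservice ID -> consecutive unknown-route responses
        self._retry_at = {}  # appservice ID -> earliest time to call again

    def should_call(self, appservice_id):
        return self._clock() >= self._retry_at.get(appservice_id, float("-inf"))

    def record_unknown_route(self, appservice_id):
        failures = self._failures.get(appservice_id, 0) + 1
        self._failures[appservice_id] = failures
        # Double the wait after each consecutive failure, up to the cap.
        delay = min(self._base * 2 ** (failures - 1), self._max)
        self._retry_at[appservice_id] = self._clock() + delay

    def record_success(self, appservice_id):
        self._failures.pop(appservice_id, None)
        self._retry_at.pop(appservice_id, None)
```

While `should_call` returns false, the homeserver would skip the appservice and go straight to the fallback key, if known.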
## Alternatives

Many encryption-capable bridges today can avoid uploading OTKs (and sometimes even device keys) because
they have a bot user in the room. The bot user uploads its keys, but the remaining bridge users do not.
This works if the bridge users don't need to be involved in rooms without the bot user present, though
being able to (securely) DM bridge users is a valuable consideration for this MSC. In future, scalable
encryption for appservices might take the shape of an appservice-wide device of some sort.

It could be argued that supporting a fallback key for appservices is too much considering their uptime,
however in practice appservices are not quite able to achieve 100% uptime. This proposal doesn't propose
proxying device/signing key queries to the appservice, for the same reliability concerns, though appservices
which wish to do so anyway could use [MSC3984](https://github.com/matrix-org/matrix-spec-proposals/pull/3984).

## Additional uses

An appservice aiming to bridge two different encryption systems might use this endpoint to save on data,
though currently the encryption used on both sides of the bridge would need to be compatible (i.e. signatures
from device IDs and user IDs need to exist). In future, other MSCs might make encryption bridges easier to

Review comment (suggested change):
* Device IDs and user IDs can't produce signatures, so the proposed change would make it a bit clearer to me.

build.

## Security considerations

No major considerations.

## Unstable prefix

While this MSC is not considered stable, implementations should use
`/_matrix/app/unstable/org.matrix.msc3983/keys/claim` as the endpoint instead. There is no version
compatibility check: homeservers implementing this functionality would receive an error from appservices
which don't support the endpoint and thus engage in the error behaviour described by the MSC (using the
fallback key, if known).

## Dependencies

This MSC has no direct dependencies, but is of little use without being partnered with something
like [MSC3202](https://github.com/matrix-org/matrix-spec-proposals/pull/3202).

This MSC is additionally useful when paired with [MSC3984](https://github.com/matrix-org/matrix-spec-proposals/pull/3984),
though it has no direct dependency on it.