In this case, we were just attempting to claim keys for devices for which we had previously failed to get one. (Due to matrix-org/matrix-rust-sdk#281, we do this a bit too often.) Anyway, the point is that pretty much all of the devices in this request have run out of OTKs, but I think it is still instructive.
What I see is:
- 321 calls to `db.claim_e2e_one_time_keys`. This is presumably one for each device for `matrix.org` users. These take us to about 1.8 seconds.
- 321 calls to `db._get_fallback_key`. Again one for each `matrix.org` device. Another 2.1 seconds, bringing us to 4.0 seconds.
- 21 calls to `claim_client_keys`. One per federated destination. These all happen in parallel, so the critical path is the slowest homeserver to respond. The pathological case here is servers that respond within the timeout (so don't get backed off from) but slowly, and then the device doesn't have any keys so we have to do it again. In this case the slowest server took 2.1 seconds.
What I see here are some easy performance improvements. In particular:
- Doing remote and local claims in parallel would roughly halve the time. (Up for grabs!)
- Doing more than one local device per db request would mean much less DB scheduling overhead.
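
As a rough illustration of the two suggestions above, here is a minimal `asyncio` sketch. It is not Synapse's code: `claim_local_keys_batched`, `claim_remote_keys`, `claim_all`, the batch size of 100 and the sleep-based delays are all hypothetical stand-ins, chosen only to show local and federated claims running concurrently and local devices being handled several per simulated DB round trip.

```python
import asyncio

# Hypothetical stand-ins: none of these are Synapse's real functions or signatures.

async def claim_local_keys_batched(devices, batch_size=100):
    """Claim keys for local devices, one simulated DB round trip per batch."""
    results = {}
    for start in range(0, len(devices), batch_size):
        batch = devices[start:start + batch_size]
        await asyncio.sleep(0.01)  # simulated cost of a single DB transaction
        for device in batch:
            results[device] = f"key-for-{device}"
    return results

async def claim_remote_keys(destination, delay):
    """Simulate one federated claim against a single destination."""
    await asyncio.sleep(delay)  # simulated network latency
    return {destination: "remote-keys"}

async def claim_all(local_devices, remote_destinations):
    """Run the local claim and every federated claim concurrently."""
    remote_calls = [
        claim_remote_keys(dest, delay)
        for dest, delay in remote_destinations.items()
    ]
    # gather() schedules everything at once, so the total wait is roughly the
    # slowest single branch rather than the sum of local and remote time.
    local_result, *remote_results = await asyncio.gather(
        claim_local_keys_batched(local_devices), *remote_calls
    )
    merged_remote = {}
    for result in remote_results:
        merged_remote.update(result)
    return local_result, merged_remote

# Same shape as the trace: 321 local devices, 21 federated destinations.
local_devices = [f"DEV{i}" for i in range(321)]
remote_destinations = {f"hs{i}.example": 0.02 for i in range(21)}
local_keys, remote_keys = asyncio.run(claim_all(local_devices, remote_destinations))
print(len(local_keys), len(remote_keys))
```

With the assumed batch size of 100, the 321 local devices need 4 simulated round trips instead of 321. A real change would also have to deal with the fallback-key lookup and with retrying devices that return no keys, which this sketch ignores.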
This issue has been migrated from #16554.
`/keys/claim` requests often take multiple seconds when requesting keys for hundreds of devices.
Out of interest I looked at the anatomy of a slow `/keys/claim` request (https://jaeger.proxy.matrix.org/trace/62603ae20c639720). The request took 6.2 seconds altogether.