/keys/claim is surprisingly slow #16554

matrixbot opened this issue Dec 21, 2023 · 0 comments

matrixbot commented Dec 21, 2023

This issue has been migrated from #16554.


/keys/claim requests often take multiple seconds when requesting keys for hundreds of devices.

Out of interest I looked at the anatomy of a slow /keys/claim request (https://jaeger.proxy.matrix.org/trace/62603ae20c639720). The request took 6.2 seconds altogether.

In this case, we were just attempting to claim keys for devices for which we had previously failed to get one. (Due to matrix-org/matrix-rust-sdk#281, we do this a bit too often.) Anyway, the point is that pretty much all of the devices in this request have run out of OTKs, but I think it is still instructive.

What I see is:

  • 321 calls to db.claim_e2e_one_time_keys. This is presumably one for each device for matrix.org users. These take us to about 1.8 seconds.
  • 321 calls to db._get_fallback_key. Again one for each matrix.org device. Another 2.1 seconds, bringing us to 4.0 seconds.
  • 21 calls to claim_client_keys. One per federated destination. These all happen in parallel, so the critical path is the slowest homeserver to respond. The pathological case here is a server that responds within the timeout (so we don't back off from it) but slowly, and whose devices then turn out to have no keys, so we have to do it again. In this case the slowest server took 2.1 seconds.
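As a rough check on the breakdown above, the three phases can be added up and compared with the 6.2 second total. This is only a back-of-the-envelope sketch: the figures are the rounded ones quoted in this issue, and the variable names are made up for illustration.

```python
# Durations in seconds, as quoted (and rounded) in the trace breakdown above.
otk_claims = 1.8        # 321 calls to db.claim_e2e_one_time_keys
fallback_lookups = 2.1  # 321 calls to db._get_fallback_key
slowest_remote = 2.1    # slowest of the 21 parallel claim_client_keys calls
request_total = 6.2     # whole /keys/claim request

# The phases currently run one after another, so their durations add up.
accounted_for = round(otk_claims + fallback_lookups + slowest_remote, 1)

# Whatever is left over is time not attributed to any of the three phases.
unaccounted = round(request_total - accounted_for, 1)

print(f"accounted for: {accounted_for}s, unaccounted: {unaccounted}s")
```

On these numbers the three phases cover nearly all of the request, with the two local database phases together making up the largest share.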

I see some easy performance improvements here. In particular:

  • Doing remote and local claims in parallel would roughly halve the time.
    • up for grabs!
  • Doing more than one local device per db request would mean much less DB scheduling overhead.
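The two suggestions above could look roughly like the following. This is a minimal sketch using Python's asyncio purely for illustration; the real server code may use a different async framework, and `claim_local_batch`, `claim_remote` and `claim_keys` are hypothetical stand-ins rather than existing functions. The sleeps simulate a database round trip and a federation request.

```python
import asyncio


async def claim_local_batch(devices):
    """Hypothetical: claim keys for many local devices in one DB round trip."""
    await asyncio.sleep(0.01)  # one simulated query instead of one per device
    # None stands for "no one-time key left", as for most devices in the trace.
    return {device: None for device in devices}


async def claim_remote(destination, devices):
    """Hypothetical: claim keys for devices on one federated destination."""
    await asyncio.sleep(0.02)  # simulated federation request
    return {device: None for device in devices}


async def claim_keys(local_devices, remote_devices_by_server):
    """Run the batched local claim and all remote claims concurrently."""
    remote_claims = [
        claim_remote(server, devices)
        for server, devices in remote_devices_by_server.items()
    ]
    # gather() starts everything at once, so the total time is bounded by the
    # slowest single claim rather than by the sum of local and remote time.
    local_result, *remote_results = await asyncio.gather(
        claim_local_batch(local_devices), *remote_claims
    )
    merged = dict(local_result)
    for result in remote_results:
        merged.update(result)
    return merged


if __name__ == "__main__":
    keys = asyncio.run(
        claim_keys(
            ["@a:matrix.org/DEV1", "@b:matrix.org/DEV2"],
            {"example.org": ["@c:example.org/DEV3"]},
        )
    )
    print(sorted(keys))
```

With this shape the local work no longer sits on the critical path ahead of the federation requests, and the per-device scheduling overhead is paid once per batch rather than once per device.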

matrixbot changed the title from "Dummy issue" to "/keys/claim is surprisingly slow" on Dec 22, 2023
matrixbot reopened this on Dec 22, 2023