Epic: compute performance observability (as a storage client) #8926

Open

Description

Motivation

Currently, it's hard to quickly attribute performance issues to a particular part of our I/O path (compute->safekeeper->pageserver).

We have a lot of metrics in the safekeeper and pageserver, but relatively few in the compute. The compute is closest to the user, so it can give us a clearer picture of the performance the user is experiencing, and it lets us measure end-to-end performance, including network latency to the compute.

DoD

  • When we encounter a performance limit on the write or read path, we are able to say with confidence whether the bottleneck is on the compute or storage side
  • When we see apparently slow getpage requests, we can distinguish slowness inside the server from slowness on the end-to-end path, including network latency, by comparing server and client latencies
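
The client/server comparison in the second item could look roughly like the sketch below. It is illustrative only: `GetPageSample`, `attribute_latency`, and the 50% threshold are hypothetical names and values, not existing compute or pageserver metrics, and it assumes both latencies can be recorded for the same request.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class GetPageSample:
    """One getpage request measured from both ends (hypothetical shape)."""
    client_latency_ms: float  # wall time seen by the compute for the whole request
    server_latency_ms: float  # time the server reports spending on the same request


def attribute_latency(samples, outside_share_threshold=0.5):
    """Estimate how much of the client-observed latency lies outside the server.

    The threshold is an arbitrary assumption for illustration, not a tuned value.
    """
    if not samples:
        raise ValueError("need at least one sample")

    # Per-request time not accounted for by the server (network, queueing, client side).
    # Clamp at zero in case of clock or measurement skew.
    overheads = [max(s.client_latency_ms - s.server_latency_ms, 0.0) for s in samples]

    client_p50 = median([s.client_latency_ms for s in samples])
    overhead_p50 = median(overheads)
    outside_share = overhead_p50 / client_p50 if client_p50 > 0 else 0.0

    where = "outside server" if outside_share >= outside_share_threshold else "inside server"
    return {
        "client_p50_ms": client_p50,
        "overhead_p50_ms": overhead_p50,
        "outside_share": outside_share,
        "dominant": where,
    }


if __name__ == "__main__":
    slow_network = [GetPageSample(10.0, 2.0), GetPageSample(12.0, 3.0), GetPageSample(11.0, 2.5)]
    print(attribute_latency(slow_network)["dominant"])
```

Working on per-request differences rather than on two independent histograms keeps the attribution valid when the request mix changes, but it requires a way to correlate client and server measurements for the same request.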

Implementation ideas

Tasks

Other related tasks and Epics

Labels

  • a/observability (Area: related to observability)
  • c/compute (Component: compute, excluding postgres itself)
  • t/Epic (Issue type: Epic)
