Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DB metrics: add query-level metrics #723

Open
dnaik-sfx opened this issue Feb 8, 2024 · 12 comments
Open

DB metrics: add query-level metrics #723

dnaik-sfx opened this issue Feb 8, 2024 · 12 comments
Assignees

Comments

@dnaik-sfx
Copy link

Currently, there are some conventions for connection-level metrics defined on the Database Metrics page. Proposing that query-level metrics are added (in a different section on the same page) to define conventions for the stats and info that can be captured for individual queries.

Some examples could include:

db.query.count (number of times a query was executed)
db.query.plan (execution plan for a query)
db.query.duration (amount of time taken by the query)
db.query.rows_returned (number of rows returned by a query)

@trask
Copy link
Member

trask commented Feb 9, 2024

related to #512

@dnaik-sfx
Copy link
Author

@trask Any objections to me putting up a PR for this, or is this being actively discussed by the DB Client working group? I did see there was some discussion around common DB attributes, which I imagine might be relevant for these metrics as well (i.e, do we create common query metrics conventions like the examples I gave, or database-specific query metrics).

@trask
Copy link
Member

trask commented Feb 16, 2024

@dnaik-sfx does #735 address your needs?

@dnaik-sfx
Copy link
Author

We'd like to be able to collect database metrics like duration, count, rows returned, etc... at the query/statement level. That PR does add a new metric on duration, but it seems that db.statement is being left out of the initial set of attributes; to get the duration information for a given query, that attribute would be needed, but the metric itself would likely suit our needs if that attribute was added. We would also want to add the other metrics mentioned above and some others though (these would also need db.statement or a similar attribute).

I realize there are concerns about the high cardinality when adding db.statement as an attribute on metrics, but I was thinking that the metrics themselves would either be optional (if there is no value in collecting them without tying them to a query/statement), or the db.statement attribute itself would be optional.

@dnaik-sfx
Copy link
Author

@trask Sorry to bother you again, but is this (or related topics) being, or planned to be, discussed by the working group? And if so, is there somewhere I can go to follow the discussion (if not here)? Also happy to put up a PR for this myself and help drive it if this would be beyond the scope of what the working group was intended to accomplish.

@trask
Copy link
Member

trask commented Apr 5, 2024

hey @dnaik-sfx, sorry for the slow response, this issue is on the working group's project board, we are planning to discuss it prior to stabilization, but we may push it post stabilization if we think it can be added after stabilization without introducing any breaking changes

a couple thoughts/questions in the meantime...

would you still want these two:

  • db.query.count
  • db.query.duration

if db.statement is an optional attribute on db.client.operation.duration? (since it's a histogram, you can get counts)

I could see db.query.rows_returned being added, e.g. db.client.operation.rows_returned, it's similar to http.client.response.body.size

for db.query.plan, where would you get the query plan? is this something you're thinking to collect in client instrumentation or some other way?

@seonsfx
Copy link

seonsfx commented Apr 16, 2024

Hi @trask , thanks for responding. Since we just learned there's an upcoming change to add db.client.operation. metrics, the more important question would be "Are db.query. and db.client.operation. the same thing?" or more precisely, "Is db.query a subset of db.client.operation.?".

If the answer is yes, we could be adding more metrics in the db.client.operation. namespace instead of creating another. If not, replacing db.query.duration with db.client.operation.duration wouldn't make sense and make the metric set incomplete.

I feel pretty confident that there's no reason why we cannot go with db.client.operation. metrics but let us go through the PR and the proposed metric. Also please feel free to let us know what you think.

@trask
Copy link
Member

trask commented Apr 25, 2024

@seonsfx db.client.operation.duration includes all database operations, including queries

@trask
Copy link
Member

trask commented Apr 26, 2024

Moving this to post-stability since we can add new metrics later, but we can continue discussing and clarifying the issue in the meantime.

@dnaik-sfx
Copy link
Author

@trask thanks! What does the current timeline look like for stability? Wondering what the timeframe might look like for these query-level metric conventions to go in.

Given the db.client.operation convention, I think the following would be the full list of metric conventions we would want to add (and eventually collect):

  • db.client.operation.rows_returned
  • db.client.operation.rows_written
  • db.client.operation.rows_scanned
  • db.client.operation.read_requests (like physical read requests in Oracle)
  • db.client.operation.write_requests
  • db.client.operation.read_bytes
  • db.client.operation.write_bytes
  • db.client.operation.shared_read_blocks (aka the MySQL concept of shared vs local read blocks)
  • db.client.operation.shared_blocks_written
  • db.client.operation.local_read_blocks
  • db.client.operation.local_blocks_written
  • db.client.operation.pages_scanned
  • db.client.operation.pages_written
  • db.client.operation.granted_memory
  • db.client.operation.used_memory
  • db.client.operation.runtime_memory
  • db.client.operation.sharable_memory
  • db.client.operation.wait_time
  • db.client.operation.cpu_time
  • db.client.operation.hit_percent (like a cache hit percent)
  • db.client.operation.plan

Some of these are kind of specific to a given type of database, but it could still be worth capturing them as a shared convention in case smaller or future technologies share the underlying concepts. Open to feedback about where these conventions should be added though. As far as collection goes, we envision typically collecting them from system tables depending on the database implementation. For example, MySQL has the Performance Schema with tables providing statistics about queries that ran in the past.

I'm happy to put up a PR for these metrics, as well as the attributes that would be desired on them, to discuss specifics and in further depth.

@dnaik-sfx
Copy link
Author

@trask I know work towards stability is still ongoing, but any thoughts about the metrics and collection method I listed above and any possible conflicts with patterns/metrics being added now?

@dnaik-sfx
Copy link
Author

Put up a PR for the sake of discussion (with the understanding that work towards stability is still underway and these additional metrics would come after)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Post Stability
4 participants