Add "baseline" metrics to all built in operators #866

@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I would like to be able to get an overall understanding of where time is being spent during query execution via EXPLAIN ANALYZE (see #858), so that I know where to focus additional performance optimization work.

Additionally, I would like to be able to graph a stacked flamechart such as the following (see more details at https://github.com/influxdata/influxdb_iox/issues/2273), which shows when the different operators ran in relation to each other.

[Image: Screen Shot 2021-08-12 at 11 14 33 AM]

Describe the solution you'd like
I would like to instrument all operators (impl ExecutionPlan) included in DataFusion so that they produce at least the following metrics:

  1. output_rows: total rows produced at the output of the operator
  2. cpu_nanos: the total time spent (not including any time spent in the input stream or waiting to be scheduled)
  3. start_time: the wall clock time at which execute was run
  4. stop_time: the wall clock time at which the last output record batch was produced

I plan to use the SQLMetric infrastructure for doing so, probably after #679
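To make the four metrics above concrete, here is a minimal sketch of how an operator might record them. It does not use the SQLMetric infrastructure or any DataFusion API: `BaselineMetrics`, `start`, `time`, `record_output`, and `demo` are hypothetical names invented for illustration, and elapsed wall time around the operator's own work stands in for CPU time.

```rust
use std::time::{Instant, SystemTime};

/// Hypothetical per-operator metrics (illustration only, not DataFusion's API).
#[derive(Debug, Default)]
pub struct BaselineMetrics {
    /// Total rows produced at the output of the operator.
    pub output_rows: usize,
    /// Time spent in the operator's own work, in nanoseconds.
    /// Assumption: elapsed wall time around the work approximates CPU time.
    pub cpu_nanos: u128,
    /// Wall clock time at which `execute` was run.
    pub start_time: Option<SystemTime>,
    /// Wall clock time at which the last output batch was produced.
    pub stop_time: Option<SystemTime>,
}

impl BaselineMetrics {
    /// Call once when the operator's `execute` is invoked.
    pub fn start(&mut self) {
        self.start_time = Some(SystemTime::now());
    }

    /// Run `work` and add its elapsed time to `cpu_nanos`. Only the operator's
    /// own computation should be wrapped, not polling of the input stream.
    pub fn time<T>(&mut self, work: impl FnOnce() -> T) -> T {
        let timer = Instant::now();
        let out = work();
        self.cpu_nanos += timer.elapsed().as_nanos();
        out
    }

    /// Call each time an output batch of `rows` rows is produced.
    pub fn record_output(&mut self, rows: usize) {
        self.output_rows += rows;
        self.stop_time = Some(SystemTime::now());
    }
}

/// Toy "operator" that doubles each value in two input batches.
pub fn demo() -> BaselineMetrics {
    let mut metrics = BaselineMetrics::default();
    metrics.start();
    for batch in [vec![1, 2, 3], vec![4, 5]] {
        let doubled: Vec<i32> = metrics.time(|| batch.iter().map(|v| v * 2).collect());
        metrics.record_output(doubled.len());
    }
    metrics
}
```

Updating `stop_time` on every output batch means that, once the stream is exhausted, it holds the time of the last batch without the operator needing to know in advance which batch is final.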

Describe alternatives you've considered
Open questions:

  1. How to handle the output of different partitions (each operator can produce multiple output partitions / streams, and it is not yet clear to me whether recording stats at a per-partition level is important)
  2. How to handle operators that don't provide metrics, such as (potentially) user-defined ones (probably fill them in from their parents)
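One possible answer to the first question is to record metrics per partition and roll them up per operator only when reporting. The sketch below assumes that approach; `PartitionMetrics` and `aggregate` are hypothetical names, not part of DataFusion. Counts and times are summed, while the operator's span runs from the earliest partition start to the latest partition stop.

```rust
use std::time::SystemTime;

/// Hypothetical metrics for one output partition (illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct PartitionMetrics {
    pub output_rows: usize,
    pub cpu_nanos: u64,
    pub start_time: SystemTime,
    pub stop_time: SystemTime,
}

/// Roll per-partition metrics up into a single per-operator value.
/// Returns `None` if the operator has no partitions.
pub fn aggregate(parts: &[PartitionMetrics]) -> Option<PartitionMetrics> {
    let mut total = parts.first()?.clone();
    for p in &parts[1..] {
        // Rows and compute time add up across partitions.
        total.output_rows += p.output_rows;
        total.cpu_nanos += p.cpu_nanos;
        // The operator was active from the earliest start to the latest stop.
        total.start_time = total.start_time.min(p.start_time);
        total.stop_time = total.stop_time.max(p.stop_time);
    }
    Some(total)
}
```

Keeping the per-partition values around and aggregating late would leave both views available, at the cost of storing one set of metrics per partition.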

Additional context

Related work:

Metadata

Labels

enhancement: New feature or request
