Skip to content

Epic: Decides default metrics to show in EXPLAIN ANALYZE across operators #18116

@2010YOUY01

Description

@2010YOUY01

Is your feature request related to a problem or challenge?

Previously, by default, EXPLAIN ANALYZE will show all available metrics inside an operator. It can get quite verbose for some operator.

In datafusion-cli:

> CREATE EXTERNAL TABLE IF NOT EXISTS lineitem
STORED AS parquet
LOCATION '/Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem';
0 row(s) fetched.
Elapsed 0.000 seconds.

> explain analyze select *
from lineitem
where l_orderkey = 3000000;

The parquet reader includes a large number of low-level details:

metrics=[output_rows=19813, elapsed_compute=14ns, batches_split=0, bytes_scanned=2147308, file_open_errors=0, file_scan_errors=0, files_ranges_pruned_statistics=18, num_predicate_creation_errors=0, page_index_rows_matched=19813, page_index_rows_pruned=729088, predicate_cache_inner_records=0, predicate_cache_records=0, predicate_evaluation_errors=0, pushdown_rows_matched=0, pushdown_rows_pruned=0, row_groups_matched_bloom_filter=0, row_groups_matched_statistics=1, row_groups_pruned_bloom_filter=0, row_groups_pruned_statistics=0, bloom_filter_eval_time=21.997µs, metadata_load_time=273.83µs, page_index_eval_time=29.915µs, row_pushdown_eval_time=42ns, statistics_eval_time=76.248µs, time_elapsed_opening=4.02146ms, time_elapsed_processing=24.787461ms, time_elapsed_scanning_total=24.17671ms, time_elapsed_scanning_until_data=23.103665ms]

I believe only a subset of it is commonly used, for example output_rows, metadata_load_time, and how many file/row-group/pages are pruned, and it would better to only display the most common ones by default.

After #18098, the EXPLAIN ANALYZE detail level can be controlled through an option

> set datafusion.explain.analyze_level = summary;
0 row(s) fetched.
Elapsed 0.000 seconds.

> explain analyze select * from generate_series(10000) as t1(v1) order by v1 desc;
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                         |
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | SortExec: expr=[v1@0 DESC], preserve_partitioning=[false], metrics=[output_rows=10001, elapsed_compute=100µs]                                                |
|                   |   ProjectionExec: expr=[value@0 as v1], metrics=[output_rows=10001, elapsed_compute=1.166µs]                                                                 |
|                   |     LazyMemoryExec: partitions=1, batch_generators=[generate_series: start=0, end=10000, batch_size=8192], metrics=[output_rows=10001, elapsed_compute=43µs] |
|                   |                                                                                                                                                              |
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.001 seconds.

> set datafusion.explain.analyze_level = dev;
0 row(s) fetched.
Elapsed 0.000 seconds.

> explain analyze select * from generate_series(10000) as t1(v1) order by v1 desc;
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                   |
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | SortExec: expr=[v1@0 DESC], preserve_partitioning=[false], metrics=[output_rows=10001, elapsed_compute=222.043µs, spill_count=0, spilled_bytes=0.0 B, spilled_rows=0, batches_split=2] |
|                   |   ProjectionExec: expr=[value@0 as v1], metrics=[output_rows=10001, elapsed_compute=2.584µs]                                                                                           |
|                   |     LazyMemoryExec: partitions=1, batch_generators=[generate_series: start=0, end=10000, batch_size=8192], metrics=[output_rows=10001, elapsed_compute=162.625µs]                      |
|                   |                                                                                                                                                                                        |
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.003 seconds.

Now only output_rows and elapsed_compute are included in the summary analyze level, and the dev level is the default.

The goal of this issue is, for each operator:

  • Investigate what's the most commonly used metrics for high-level insights, postgres
    and DuckDB's result can be used as reference
  • Set those common metrics to summary type

And finally set summary analyze level to default.

Describe the solution you'd like

Identify which metrics should be included in the summary level for all operators.

  • DataSourceExec
  • DataSinkExec
  • AnalyzeExec
  • AsyncFuncExec
  • AggregateExec
  • CoalesceBatchesExec
  • CoalescePartitionsExec
  • CooperativeExec
  • CrossJoinExec
  • EmptyExec
  • ExplainExec
  • FilterExec
  • GlobalLimitExec
  • LocalLimitExec
  • HashJoinExec
  • NestedLoopJoinExec
  • SortMergeJoinExec
  • SymmetricHashJoinExec
  • PlaceholderRowExec
  • ProjectionExec
  • RecursiveQueryExec
  • RepartitionExec
  • SortExec
  • PartialSortExec
  • SortPreservingMergeExec
  • StreamingTableExec
  • UnionExec
  • InterleaveExec
  • UnnestExec
  • WindowAggExec
  • BoundedWindowAggExec
  • WorkTableExec
  • LazyMemoryExec

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    PROPOSAL EPICA proposal being discussed that is not yet fully underwayenhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions