@2010YOUY01 2010YOUY01 commented Nov 12, 2025

Which issue does this PR close?

  • Closes #.

Rationale for this change

Background for dynamic filter: https://datafusion.apache.org/blog/2025/09/10/dynamic-filters/

The following queries can be used for quick global insights:

-- Q1
select min(l_shipdate) from lineitem;
-- Q2
select min(l_shipdate) from lineitem where l_returnflag = 'R';

Q1 can already be executed very efficiently by checking the file metadata directly, when possible:

> explain select min(l_shipdate) from lineitem;
+---------------+-------------------------------+
| plan_type     | plan                          |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
|               | │       ProjectionExec      │ |
|               | │    --------------------   │ |
|               | │ min(lineitem.l_shipdate): │ |
|               | │         1992-01-02        │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │     PlaceholderRowExec    │ |
|               | └───────────────────────────┘ |
|               |                               |
+---------------+-------------------------------+
1 row(s) fetched.
Elapsed 0.007 seconds.

However, Q2 still performs a full scan. Dynamic filters can speed up queries like this.

Benchmarking Q2

Setup

  1. Generate the TPC-H SF100 Parquet file with tpchgen-cli -s 100 --format=parquet (https://github.com/clflushopt/tpchgen-rs/tree/main/tpchgen-cli)
  2. In datafusion-cli, run:
CREATE EXTERNAL TABLE lineitem
STORED AS PARQUET
LOCATION '/Users/yongting/data/tpch_sf100/lineitem.parquet';

select min(l_shipdate) from lineitem where l_returnflag = 'R';

Result
Main: 0.55s
PR: 0.09s

Aggregate Dynamic Filter Pushdown Overview

For queries like

  -- `example_table(type TEXT, val INT)`
  SELECT min(val)
  FROM example_table
  WHERE type='A';

where example_table's physical representation is a set of partitioned Parquet files with
column statistics:

  • part-0.parquet: val {min=0, max=100}
  • part-1.parquet: val {min=100, max=200}
  • ...
  • part-100.parquet: val {min=10000, max=10100}

After scanning the first file, we know that any other file needs to be read only if
its minimum value in the val column is less than 0, the minimum val found in the first file.

We can skip scanning the remaining files by using a dynamic filter. The intuition
is that AggregateExec and DataSourceExec share a data structure holding the current
min, which is updated during execution, so the scanner can tell at runtime whether
certain files can be skipped. See the physical optimizer rule FilterPushdown for
details.
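
The intuition above can be sketched in a few lines of standalone Rust. This is only an illustration under simplifying assumptions, not the code in this PR: `FileStats`, `SharedMin`, `can_skip`, `observe`, and `scan` are hypothetical names, files are scanned sequentially, and the minimum found among the matching rows of a file is assumed to equal the file's `min` statistic.

```rust
use std::sync::{Arc, Mutex};

/// Per-file statistics for the `val` column (hypothetical type).
struct FileStats {
    name: &'static str,
    min: i64,
}

/// Hypothetical shared state: the smallest `val` seen so far, or `None`
/// before any row has been aggregated.
type SharedMin = Arc<Mutex<Option<i64>>>;

/// Scanner side: a file can be skipped when none of its rows can satisfy
/// `val < current_min`, i.e. when the file's minimum is not below the bound.
fn can_skip(stats: &FileStats, shared: &SharedMin) -> bool {
    match *shared.lock().unwrap() {
        Some(current) => stats.min >= current,
        None => false, // no bound yet, so nothing can be pruned
    }
}

/// Aggregate side: fold a newly observed value into the shared bound.
fn observe(shared: &SharedMin, value: i64) {
    let mut guard = shared.lock().unwrap();
    let previous = *guard;
    *guard = Some(previous.map_or(value, |current| current.min(value)));
}

/// Simulates a sequential scan and returns the files actually read together
/// with the final minimum.
fn scan(files: &[FileStats]) -> (Vec<&'static str>, Option<i64>) {
    let shared: SharedMin = Arc::new(Mutex::new(None));
    let mut read = Vec::new();
    for file in files {
        if can_skip(file, &shared) {
            continue;
        }
        read.push(file.name);
        // Assumption: aggregating the file yields exactly its `min` statistic.
        observe(&shared, file.min);
    }
    let result = *shared.lock().unwrap();
    (read, result)
}
```

With the statistics listed above, only part-0.parquet would be read in this sketch, because every later file has a minimum that is not below the bound established by the first file.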

Implementation

Enable Condition

  • No grouping (no GROUP BY clause in the sql, only a single global group to aggregate)
  • The aggregate expressions must be min/max and must be evaluated directly on columns.
    Note that multiple aggregate expressions satisfying this requirement are allowed,
    and a dynamic filter will be constructed by combining the states of all applicable
    expressions. See the example below with a dynamic filter on multiple columns.

Filter Construction

The filter is kept in the DataSourceExec and is updated during execution.
The reader interprets it as "the upstream only needs rows for which this filter
predicate evaluates to true", and certain scanner implementations, such as Parquet,
can evaluate the dynamic filter against column statistics to decide whether they can
prune a whole range.

Examples

  • Expr: min(a), Dynamic Filter: a < a_cur_min
  • Expr: min(a), max(a), min(b), Dynamic Filter: (a < a_cur_min) OR (a > a_cur_max) OR (b < b_cur_min)
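
As a rough illustration of how such a disjunction could be built and checked against column statistics, here is a standalone Rust sketch. It does not use DataFusion's expression types; `Bound`, `ColumnStats`, `render`, and `can_prune` are hypothetical names, and values are simplified to `i64`.

```rust
use std::collections::HashMap;

/// One tracked aggregate and its current runtime bound (hypothetical type).
enum Bound {
    Min { column: &'static str, current: i64 },
    Max { column: &'static str, current: i64 },
}

/// Min/max statistics for one column of a file or row group.
struct ColumnStats {
    min: i64,
    max: i64,
}

/// Renders the dynamic filter as text: one disjunct per tracked aggregate.
fn render(bounds: &[Bound]) -> String {
    bounds
        .iter()
        .map(|bound| match bound {
            Bound::Min { column, current } => format!("({column} < {current})"),
            Bound::Max { column, current } => format!("({column} > {current})"),
        })
        .collect::<Vec<_>>()
        .join(" OR ")
}

/// A range can be pruned only if no disjunct can be true for any of its rows.
/// Missing statistics are treated conservatively as "may match".
fn can_prune(bounds: &[Bound], stats: &HashMap<&'static str, ColumnStats>) -> bool {
    if bounds.is_empty() {
        return false; // no filter yet, so everything must be read
    }
    !bounds.iter().any(|bound| match bound {
        Bound::Min { column, current } => {
            stats.get(*column).map_or(true, |s| s.min < *current)
        }
        Bound::Max { column, current } => {
            stats.get(*column).map_or(true, |s| s.max > *current)
        }
    })
}
```

The disjuncts are combined with OR because a row is still useful if it can improve any one of the tracked aggregates; a range is prunable only when it cannot improve any of them.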

What changes are included in this PR?

The goal is to let MIN/MAX aggregate expressions whose only argument is a column reference (e.g. min(col1)) support dynamic filters; the implementation rationale above explains this further.

The implementation includes:

  1. Added the AggrDynFilter struct, which is shared across partition streams and stores the current bounds used to update the dynamic filter.
  2. init_dynamic_filter is responsible for checking whether the dynamic filter can be enabled in the current aggregate execution plan and, if so, for building the AggrDynFilter inside the operator.
  3. During aggregate execution, after each batch is evaluated, the current bound is refreshed in the dynamic filter, enabling the scanner to skip prunable units using the latest runtime bounds. (Currently the bound is updated on every batch; perhaps it could be updated every k batches to reduce overhead?)
  4. Updated the gather_filters_for_pushdown and handle_child_pushdown_result APIs in AggregateExec so that it can generate and push down its own dynamic filter.
  5. Added a configuration option to turn the feature on/off.

TODO(in this PR)

  • Add tests for grouping set
  • Only update bounds if they're tightened, to reduce lock contention
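
One possible shape for the "only update when tightened" idea is sketched below in standalone Rust. It is an assumption about how this could look, not the PR's AggrDynFilter: `SharedMinBound` and `PartitionState` are hypothetical names, values are simplified to `Option<i64>` with `None` standing in for a NULL bound, and only the min case is shown.

```rust
use std::sync::Mutex;

/// Hypothetical shared bound for `min(col)`; `None` means "no bound yet".
struct SharedMinBound {
    current: Mutex<Option<i64>>,
}

impl SharedMinBound {
    fn new() -> Self {
        Self { current: Mutex::new(None) }
    }

    /// Takes the lock and stores `local` only if it is strictly smaller
    /// than the shared bound. Returns true if the shared bound changed.
    fn update(&self, local: i64) -> bool {
        let mut guard = self.current.lock().unwrap();
        match *guard {
            Some(current) if current <= local => false,
            _ => {
                *guard = Some(local);
                true
            }
        }
    }

    fn get(&self) -> Option<i64> {
        *self.current.lock().unwrap()
    }
}

/// Per-partition state: remembers the last value this partition published,
/// so the shared lock is taken only when the local bound has tightened.
struct PartitionState {
    last_published: Option<i64>,
}

impl PartitionState {
    fn publish(&mut self, shared: &SharedMinBound, local: Option<i64>) -> bool {
        // A NULL local bound (no qualifying rows yet) carries no information.
        let Some(local) = local else {
            return false;
        };
        if let Some(previous) = self.last_published {
            if local >= previous {
                return false; // not tighter: skip the lock entirely
            }
        }
        self.last_published = Some(local);
        shared.update(local)
    }
}
```

In this sketch a partition whose local minimum has not improved since its last publish never touches the mutex, which is where the reduction in contention would come from.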

Are these changes tested?

Yes, with optimizer unit tests and end-to-end tests.

Are there any user-facing changes?

No

@github-actions github-actions bot added documentation Improvements or additions to documentation core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) common Related to common crate physical-plan Changes to the physical-plan crate labels Nov 12, 2025
})?;
// First get current partition's bound, then update the shared bound among
// all partitions.
let current_bound = acc.evaluate()?;
Member

Suggested change
- let current_bound = acc.evaluate()?;
+ let current_bound = acc.evaluate()?;
+ if current_bound.is_null() {
+     continue;
+ }

?!
because it will affect the scalar_min() below

Contributor Author

Good point! Additionally, maybe we should only update the shared bound if the local bound has tightened, to reduce lock contention.

@alamb alamb requested a review from adriangb November 12, 2025 14:36
@alamb alamb added the performance Make DataFusion faster label Nov 12, 2025
Comment on lines +510 to +512
/// During filter pushdown optimization, if a child node can accept this filter,
/// it remains `Some(..)` to enable dynamic filtering during aggregate execution;
/// otherwise, it is cleared to `None`.
Contributor

Note that currently "does the child node accept the filter" is a bit murky: even if it says No, it can still retain a reference, e.g. for statistics pruning.

It seems to me we may need to expand the pushdown response from Yes/No to Exact/Inexact/Unsupported.

Or maybe we should check the Arc reference counts 😛? If no one else has a reference... no point in updating?

Contributor Author

I agree. Also, what are the current semantics? Does Yes map to either Exact or Inexact?

Contributor

Precisely: Yes can mean Exact or Inexact but doesn’t differentiate between them
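
To make the two ideas in this thread concrete, here is a small standalone Rust sketch. `PushdownSupport` and its methods are hypothetical and are not the existing pushdown API; the only real API used is `std::sync::Arc::strong_count`, shown as one way the reference-count check could look.

```rust
use std::sync::Arc;

/// Hypothetical three-way pushdown response, replacing a plain Yes/No.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PushdownSupport {
    /// The child applies the filter to every row.
    Exact,
    /// The child may use the filter (e.g. for statistics pruning) but can
    /// still return rows that do not match it.
    Inexact,
    /// The child ignores the filter entirely.
    Unsupported,
}

impl PushdownSupport {
    /// The parent can drop its own copy of the filter only when the child
    /// guarantees exact evaluation.
    fn parent_must_reapply(self) -> bool {
        !matches!(self, PushdownSupport::Exact)
    }

    /// Keeping a dynamic filter up to date is only useful if some child
    /// actually looks at it.
    fn worth_updating(self) -> bool {
        !matches!(self, PushdownSupport::Unsupported)
    }
}

/// The reference-count idea: if nobody else holds the shared filter,
/// updating it has no observable effect.
fn has_other_holders<T>(filter: &Arc<T>) -> bool {
    Arc::strong_count(filter) > 1
}
```

Note that `Arc::strong_count` is only a snapshot, so in concurrent code it would be a heuristic rather than a guarantee.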

Contributor

@adriangb adriangb left a comment

This is super cool @2010YOUY01 ! I hadn't even thought of this use case. Truly amazing.

I left a small comment for now. Overall the change looks good but requires more in depth review. I'll try over the next couple days but am on vacation so it may take a week 🙏🏻

@2010YOUY01
Contributor Author

Thanks! Enjoy your vacation. 😄
