Skip to content

Fuse grouped aggregate and filter operators for improved performance #5944

Closed
@andygrove

Description

@andygrove

Is your feature request related to a problem or challenge?

When we perform a grouped aggregate on a filtered input (such as with TPC-H q1), the filter operator performs two main tasks:

  • Evaluate the filter predicate (usually very fast)
  • Create new batches and copy over the filtered data (very slow if the filter is not very selective, as in q1)

I wonder if we would see a significant performance improvement if we could avoid creating the filtered batches in this case.

One idea would be to create the filtered batches by copying the arrays and mutating the validity bitmap to hide the rows that are filtered out. This would potentially change the semantics in some cases though so we can probably only do this under certain conditions.

Another idea is to update the aggregate logic to perform the predicate evaluation and then use the resulting bitmap to determine which rows to accumulate.

Describe the solution you'd like

I am working on a small prototype of this, outside of DataFusion, that I will share once the code is less embarrassing.

Describe alternatives you've considered

It would be worth seeing how other engines handle this.

Additional context

No response

Metadata

Metadata

Assignees

Labels

enhancementNew feature or requestperformanceMake DataFusion faster

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions