Fusing partial aggregation with repartition #12596

Open
@Rachelint

Is your feature request related to a problem or challenge?

I implemented a proof of concept in #12526 and found that this idea can actually improve performance.

However, for the reasons stated in #11680 (comment), I don't think this improvement is suitable to push forward at the moment.

I am filing this issue to track it.

Describe the solution you'd like

  • Introduce a partitioned hash table in partial aggregation, and partition the data before inserting it into the hash table.
  • Then push the results into the final aggregation partition by partition, rather than splitting them again in repartition and merging them again in coalesce.
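
The two steps above can be sketched roughly as follows. This is only a minimal illustration of the idea using the Rust standard library, not the code from the PoC and not DataFusion's API: `PartitionedPartialSum`, `final_merge`, and `fused_aggregate` are hypothetical names, the aggregate is assumed to be a simple `SUM` over string keys, and rows are plain tuples instead of Arrow batches.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical partial-aggregation state: one hash table per output partition.
pub struct PartitionedPartialSum {
    tables: Vec<HashMap<String, i64>>,
}

impl PartitionedPartialSum {
    pub fn new(num_partitions: usize) -> Self {
        assert!(num_partitions > 0, "need at least one partition");
        Self {
            tables: (0..num_partitions).map(|_| HashMap::new()).collect(),
        }
    }

    /// Pick the output partition from the group key's hash.
    /// `DefaultHasher::new()` uses fixed keys, so every partial aggregator
    /// routes the same group key to the same partition index.
    fn partition_of(&self, key: &str) -> usize {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() % self.tables.len() as u64) as usize
    }

    /// Partition the row first, then aggregate it into that partition's table.
    pub fn insert(&mut self, key: &str, value: i64) {
        let p = self.partition_of(key);
        *self.tables[p].entry(key.to_string()).or_insert(0) += value;
    }

    /// Hand out the per-partition tables; no further splitting is needed.
    pub fn into_partitions(self) -> Vec<HashMap<String, i64>> {
        self.tables
    }
}

/// Final aggregation for one output partition: merge the partial tables
/// that every partial aggregator produced for that partition index.
pub fn final_merge(partials: Vec<HashMap<String, i64>>) -> HashMap<String, i64> {
    let mut out = HashMap::new();
    for table in partials {
        for (key, sum) in table {
            *out.entry(key).or_insert(0) += sum;
        }
    }
    out
}

/// Run the partial phase once per input stream, then the final phase once
/// per output partition, passing the tables along partition by partition.
pub fn fused_aggregate(
    inputs: &[Vec<(&str, i64)>],
    num_partitions: usize,
) -> Vec<HashMap<String, i64>> {
    let mut per_input: Vec<Vec<HashMap<String, i64>>> = inputs
        .iter()
        .map(|rows| {
            let mut state = PartitionedPartialSum::new(num_partitions);
            for (key, value) in rows {
                state.insert(key, *value);
            }
            state.into_partitions()
        })
        .collect();

    (0..num_partitions)
        .map(|p| {
            let partials = per_input
                .iter_mut()
                .map(|tables| std::mem::take(&mut tables[p]))
                .collect();
            final_merge(partials)
        })
        .collect()
}
```

In this sketch each group key ends up in exactly one output partition, so the final aggregation for a partition only has to merge the tables with the same index. That is the step that would otherwise be done by splitting batches in repartition and merging them again in coalesce.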

Describe alternatives you've considered

No response

Additional context

No response

Labels: enhancement (New feature or request)