Description
This is a collection of items to improve external (spilling) aggregation.
Background
Abstract: Analytical database systems offer high-performance in-memory aggregation. If there are many unique groups, temporary query intermediates may not fit RAM, requiring the use of external storage. However, switching from an in-memory to an external algorithm can degrade performance sharply.
DataFusion has supported memory-limited (spilling) hash aggregation since @kazuyukitanimura added it last year in #7400.
We can likely improve this feature, and @2010YOUY01 is considering working on it.