Is your feature request related to a problem or challenge?
When SortExec is eliminated via sort pushdown (#21182, statistics-based file reordering), SortPreservingMergeExec becomes the only merge operator, reading directly from I/O-bound DataSourceExec partitions instead of from SortExec's in-memory buffer.
Currently we insert a BufferExec to compensate, but with many partitions the single-threaded K-way merge in SPM can still become a bottleneck — it alternates between partitions, and each partition switch may stall on I/O even with buffering.
Describe the solution you'd like
When SPM reads from I/O-bound sources (no buffering SortExec), split the N input streams into groups and merge each group in parallel, then merge the intermediate results. This creates a tree of merges:
Level 1 (parallel): merge(s1,s2), merge(s3,s4), merge(s5,s6), merge(s7,s8)
Level 2 (parallel): merge(m1,m2), merge(m3,m4)
Level 3: merge(m5,m6) → final output
This would be especially beneficial when:
- Sort elimination removes the buffering
SortExec, making SPM I/O-bound
- Many partitions with I/O-bound sources
- Large datasets where single-threaded merge becomes the bottleneck
Additional context
Suggested by @Dandandan in #21182 (comment)
Related: DuckDB's parallel k-way merge
Parent issue: #17348
Is your feature request related to a problem or challenge?
When
SortExecis eliminated via sort pushdown (#21182, statistics-based file reordering),SortPreservingMergeExecbecomes the only merge operator, reading directly from I/O-boundDataSourceExecpartitions instead of fromSortExec's in-memory buffer.Currently we insert a
BufferExecto compensate, but with many partitions the single-threaded K-way merge in SPM can still become a bottleneck — it alternates between partitions, and each partition switch may stall on I/O even with buffering.Describe the solution you'd like
When SPM reads from I/O-bound sources (no buffering SortExec), split the N input streams into groups and merge each group in parallel, then merge the intermediate results. This creates a tree of merges:
This would be especially beneficial when:
SortExec, making SPM I/O-boundAdditional context
Suggested by @Dandandan in #21182 (comment)
Related: DuckDB's parallel k-way merge
Parent issue: #17348