Concatenate inside hash repartition #16223

Dandandan · 2025-06-01T11:36:48Z

Which issue does this PR close?

--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     main ┃ concat_in_repartition ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  63.26ms │               63.39ms │     no change │
│ QQuery 2     │  13.65ms │               12.75ms │ +1.07x faster │
│ QQuery 3     │  22.04ms │               22.57ms │     no change │
│ QQuery 4     │  13.17ms │               11.66ms │ +1.13x faster │
│ QQuery 5     │  34.82ms │               35.57ms │     no change │
│ QQuery 6     │  10.93ms │               10.39ms │     no change │
│ QQuery 7     │  70.36ms │               69.08ms │     no change │
│ QQuery 8     │  17.39ms │               17.54ms │     no change │
│ QQuery 9     │  39.17ms │               37.51ms │     no change │
│ QQuery 10    │  37.11ms │               36.13ms │     no change │
│ QQuery 11    │   5.79ms │                5.80ms │     no change │
│ QQuery 12    │  32.15ms │               32.14ms │     no change │
│ QQuery 13    │  19.50ms │               18.44ms │ +1.06x faster │
│ QQuery 14    │   5.36ms │                5.54ms │     no change │
│ QQuery 15    │  12.10ms │               12.09ms │     no change │
│ QQuery 16    │  14.19ms │               14.93ms │  1.05x slower │
│ QQuery 17    │  59.43ms │               55.86ms │ +1.06x faster │
│ QQuery 18    │ 136.18ms │              128.38ms │ +1.06x faster │
│ QQuery 19    │  21.94ms │               19.10ms │ +1.15x faster │
│ QQuery 20    │  21.38ms │               20.65ms │     no change │
│ QQuery 21    │  92.81ms │               93.23ms │     no change │
│ QQuery 22    │  12.41ms │               12.78ms │     no change │
└──────────────┴──────────┴───────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark Summary                    ┃          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Total Time (main)                    │ 755.16ms │
│ Total Time (concat_in_repartition)   │ 735.53ms │
│ Average Time (main)                  │  34.33ms │
│ Average Time (concat_in_repartition) │  33.43ms │
│ Queries Faster                       │        6 │
│ Queries Slower                       │        1 │
│ Queries with No Change               │       15 │
└──────────────────────────────────────┴──────────┘
--------------------
Benchmark tpch_mem_sf10.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      main ┃ concat_in_repartition ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  580.08ms │              577.68ms │     no change │
│ QQuery 2     │  112.00ms │              111.95ms │     no change │
│ QQuery 3     │  233.29ms │              239.08ms │     no change │
│ QQuery 4     │  123.81ms │              113.98ms │ +1.09x faster │
│ QQuery 5     │  462.88ms │              481.75ms │     no change │
│ QQuery 6     │   86.76ms │               91.73ms │  1.06x slower │
│ QQuery 7     │  976.26ms │              991.59ms │     no change │
│ QQuery 8     │  318.50ms │              349.00ms │  1.10x slower │
│ QQuery 9     │  775.75ms │              814.27ms │     no change │
│ QQuery 10    │  401.89ms │              395.42ms │     no change │
│ QQuery 11    │   81.35ms │               77.24ms │ +1.05x faster │
│ QQuery 12    │  315.57ms │              307.22ms │     no change │
│ QQuery 13    │  280.05ms │              246.05ms │ +1.14x faster │
│ QQuery 14    │   45.80ms │               48.39ms │  1.06x slower │
│ QQuery 15    │  114.20ms │              115.17ms │     no change │
│ QQuery 16    │   87.50ms │               86.10ms │     no change │
│ QQuery 17    │  856.14ms │              894.89ms │     no change │
│ QQuery 18    │ 2645.17ms │             2316.60ms │ +1.14x faster │
│ QQuery 19    │  161.27ms │              164.10ms │     no change │
│ QQuery 20    │  214.24ms │              218.33ms │     no change │
│ QQuery 21    │ 1473.00ms │             1386.08ms │ +1.06x faster │
│ QQuery 22    │   98.78ms │               94.21ms │     no change │
└──────────────┴───────────┴───────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (main)                    │ 10444.29ms │
│ Total Time (concat_in_repartition)   │ 10120.81ms │
│ Average Time (main)                  │   474.74ms │
│ Average Time (concat_in_repartition) │   460.04ms │
│ Queries Faster                       │          5 │
│ Queries Slower                       │          3 │
│ Queries with No Change               │         14 │
└──────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     main ┃ concat_in_repartition ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 102.62ms │              104.12ms │     no change │
│ QQuery 2     │  47.89ms │               48.16ms │     no change │
│ QQuery 3     │  55.33ms │               52.82ms │     no change │
│ QQuery 4     │  43.64ms │               40.18ms │ +1.09x faster │
│ QQuery 5     │  78.05ms │               74.44ms │     no change │
│ QQuery 6     │  26.82ms │               26.47ms │     no change │
│ QQuery 7     │  88.89ms │               88.91ms │     no change │
│ QQuery 8     │  70.95ms │               72.93ms │     no change │
│ QQuery 9     │  99.19ms │               97.91ms │     no change │
│ QQuery 10    │ 100.09ms │              102.30ms │     no change │
│ QQuery 11    │  37.46ms │               39.16ms │     no change │
│ QQuery 12    │  58.27ms │               57.89ms │     no change │
│ QQuery 13    │ 131.15ms │              128.73ms │     no change │
│ QQuery 14    │  36.52ms │               37.55ms │     no change │
│ QQuery 15    │  44.09ms │               44.30ms │     no change │
│ QQuery 16    │  29.26ms │               28.05ms │     no change │
│ QQuery 17    │ 112.79ms │              112.40ms │     no change │
│ QQuery 18    │ 152.36ms │              146.60ms │     no change │
│ QQuery 19    │  61.73ms │               62.32ms │     no change │
│ QQuery 20    │  56.72ms │               55.37ms │     no change │
│ QQuery 21    │ 117.40ms │              114.14ms │     no change │
│ QQuery 22    │  31.38ms │               29.80ms │ +1.05x faster │
└──────────────┴──────────┴───────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (main)                    │ 1582.60ms │
│ Total Time (concat_in_repartition)   │ 1564.58ms │
│ Average Time (main)                  │   71.94ms │
│ Average Time (concat_in_repartition) │   71.12ms │
│ Queries Faster                       │         2 │
│ Queries Slower                       │         0 │
│ Queries with No Change               │        20 │
└──────────────────────────────────────┴───────────┘
--------------------
Benchmark tpch_sf10.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      main ┃ concat_in_repartition ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  809.46ms │              806.25ms │     no change │
│ QQuery 2     │  165.80ms │              159.27ms │     no change │
│ QQuery 3     │  432.56ms │              403.77ms │ +1.07x faster │
│ QQuery 4     │  459.88ms │              442.66ms │     no change │
│ QQuery 5     │  671.13ms │              619.00ms │ +1.08x faster │
│ QQuery 6     │  181.36ms │              189.14ms │     no change │
│ QQuery 7     │  959.96ms │              894.05ms │ +1.07x faster │
│ QQuery 8     │  672.39ms │              658.85ms │     no change │
│ QQuery 9     │ 1101.98ms │             1138.28ms │     no change │
│ QQuery 10    │  638.41ms │              650.97ms │     no change │
│ QQuery 11    │  126.22ms │              118.40ms │ +1.07x faster │
│ QQuery 12    │  358.25ms │              354.98ms │     no change │
│ QQuery 13    │  720.99ms │              722.20ms │     no change │
│ QQuery 14    │  247.23ms │              244.06ms │     no change │
│ QQuery 15    │  395.58ms │              383.78ms │     no change │
│ QQuery 16    │  110.37ms │              101.86ms │ +1.08x faster │
│ QQuery 17    │ 1193.78ms │             1189.51ms │     no change │
│ QQuery 18    │ 1846.58ms │             1572.72ms │ +1.17x faster │
│ QQuery 19    │  412.76ms │              404.37ms │     no change │
│ QQuery 20    │  421.08ms │              398.91ms │ +1.06x faster │
│ QQuery 21    │ 1363.39ms │             1247.48ms │ +1.09x faster │
│ QQuery 22    │  149.67ms │              141.16ms │ +1.06x faster │
└──────────────┴───────────┴───────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (main)                    │ 13438.84ms │
│ Total Time (concat_in_repartition)   │ 12841.67ms │
│ Average Time (main)                  │   610.86ms │
│ Average Time (concat_in_repartition) │   583.71ms │
│ Queries Faster                       │          9 │
│ Queries Slower                       │          0 │
│ Queries with No Change               │         13 │
└──────────────────────────────────────┴────────────┘

Rationale for this change

Recently, I found an approach based on interleave_batches to be faster than the existing code.
That has nothing to do with interleave itself being faster (in fact, it is slower); it comes from the fact that we then don't send num_partition batches per input batch to the output channels.
The existing code takes each individual batch and sends its pieces to the output channels, which can directly block progress, as the batches have been sent upstream and the channels may all quickly be "non-empty".

We can fix this by internally concatenating the input arrays inside RepartitionExec.
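
To make the difference concrete, here is a toy sketch of one possible reading of the idea. It is not the code in this PR and does not use the Arrow or DataFusion APIs: a "batch" is modelled as a plain `Vec<u64>` of keys, `key % num_partitions` stands in for the real hash, and the function names `partition_batch`, `send_per_input_batch` and `send_concatenated` are made up for illustration.

```rust
// Toy model (assumption): a "batch" is a Vec<u64> of key values standing in
// for an Arrow RecordBatch, and `key % num_partitions` replaces a real hash.
// `num_partitions` must be non-zero.

/// Split one input batch into `num_partitions` smaller pieces.
fn partition_batch(batch: &[u64], num_partitions: usize) -> Vec<Vec<u64>> {
    let mut parts = vec![Vec::new(); num_partitions];
    for &key in batch {
        parts[(key % num_partitions as u64) as usize].push(key);
    }
    parts
}

/// Baseline: every input batch yields up to `num_partitions` small batches,
/// each sent on its own. Returns the list of batches sent to each partition.
fn send_per_input_batch(inputs: &[Vec<u64>], num_partitions: usize) -> Vec<Vec<Vec<u64>>> {
    let mut sent = vec![Vec::new(); num_partitions];
    for batch in inputs {
        for (p, part) in partition_batch(batch, num_partitions).into_iter().enumerate() {
            if !part.is_empty() {
                sent[p].push(part);
            }
        }
    }
    sent
}

/// Variant: partition several input batches, concatenate the pieces that
/// belong to the same partition, and send one larger batch per partition.
fn send_concatenated(inputs: &[Vec<u64>], num_partitions: usize) -> Vec<Vec<u64>> {
    let mut combined = vec![Vec::new(); num_partitions];
    for batch in inputs {
        for (p, part) in partition_batch(batch, num_partitions).into_iter().enumerate() {
            combined[p].extend(part);
        }
    }
    combined
}
```

With three input batches of four rows each and four partitions, the baseline sends twelve one-row batches, while the concatenating variant sends four three-row batches. The price is an extra copy of the data, which may matter for some workloads.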

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Dandandan · 2025-06-01T16:29:37Z

FYI @alamb this relates to your quest to remove CoalesceBatches (this doesn't yet remove concat but it shows the potential for optimization).

alamb · 2025-06-01T16:51:19Z

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing concat_in_repartition (dc7df1a) to 6844e56 diff
Benchmarks: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

alamb · 2025-06-01T17:31:45Z

🤖: Benchmark completed

Details

Comparing HEAD and concat_in_repartition
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ concat_in_repartition ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0     │  1913.26ms │             1919.02ms │    no change │
│ QQuery 1     │   722.29ms │              700.68ms │    no change │
│ QQuery 2     │  1500.38ms │             1467.99ms │    no change │
│ QQuery 3     │   696.92ms │              696.76ms │    no change │
│ QQuery 4     │  1495.37ms │             1489.28ms │    no change │
│ QQuery 5     │ 15779.55ms │            16668.91ms │ 1.06x slower │
│ QQuery 6     │  2116.95ms │             2044.90ms │    no change │
│ QQuery 7     │  2125.37ms │             2089.67ms │    no change │
│ QQuery 8     │   863.22ms │              867.06ms │    no change │
└──────────────┴────────────┴───────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                    │ 27213.31ms │
│ Total Time (concat_in_repartition)   │ 27944.28ms │
│ Average Time (HEAD)                  │  3023.70ms │
│ Average Time (concat_in_repartition) │  3104.92ms │
│ Queries Faster                       │          0 │
│ Queries Slower                       │          1 │
│ Queries with No Change               │          8 │
└──────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ concat_in_repartition ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │    15.86ms │               14.85ms │ +1.07x faster │
│ QQuery 1     │    33.79ms │               34.30ms │     no change │
│ QQuery 2     │    81.29ms │               82.32ms │     no change │
│ QQuery 3     │    97.67ms │               97.49ms │     no change │
│ QQuery 4     │   593.72ms │              576.81ms │     no change │
│ QQuery 5     │   872.11ms │              831.23ms │     no change │
│ QQuery 6     │    23.55ms │               21.66ms │ +1.09x faster │
│ QQuery 7     │    39.27ms │               36.84ms │ +1.07x faster │
│ QQuery 8     │   917.74ms │              907.24ms │     no change │
│ QQuery 9     │  1201.79ms │             1202.36ms │     no change │
│ QQuery 10    │   270.42ms │              262.42ms │     no change │
│ QQuery 11    │   303.32ms │              292.44ms │     no change │
│ QQuery 12    │   927.99ms │              923.22ms │     no change │
│ QQuery 13    │  1297.08ms │             1246.41ms │     no change │
│ QQuery 14    │   857.34ms │              870.62ms │     no change │
│ QQuery 15    │   833.84ms │              814.85ms │     no change │
│ QQuery 16    │  1782.59ms │             1747.95ms │     no change │
│ QQuery 17    │  1653.21ms │             1631.09ms │     no change │
│ QQuery 18    │  3134.19ms │             3105.45ms │     no change │
│ QQuery 19    │    84.39ms │               88.66ms │  1.05x slower │
│ QQuery 20    │  1190.98ms │             1141.01ms │     no change │
│ QQuery 21    │  1395.64ms │             1329.52ms │     no change │
│ QQuery 22    │  2330.04ms │             2243.38ms │     no change │
│ QQuery 23    │  8406.43ms │             8188.72ms │     no change │
│ QQuery 24    │   489.13ms │              479.48ms │     no change │
│ QQuery 25    │   429.30ms │              395.13ms │ +1.09x faster │
│ QQuery 26    │   559.01ms │              538.71ms │     no change │
│ QQuery 27    │  1690.54ms │             1647.99ms │     no change │
│ QQuery 28    │ 12583.01ms │            13495.15ms │  1.07x slower │
│ QQuery 29    │   516.65ms │              543.13ms │  1.05x slower │
│ QQuery 30    │   816.74ms │              816.88ms │     no change │
│ QQuery 31    │   860.30ms │              844.47ms │     no change │
│ QQuery 32    │  2704.29ms │             2735.08ms │     no change │
│ QQuery 33    │  3406.90ms │             3388.81ms │     no change │
│ QQuery 34    │  3420.93ms │             3478.97ms │     no change │
│ QQuery 35    │  1292.91ms │             1342.04ms │     no change │
│ QQuery 36    │   124.76ms │              124.59ms │     no change │
│ QQuery 37    │    57.87ms │               54.63ms │ +1.06x faster │
│ QQuery 38    │   123.75ms │              124.85ms │     no change │
│ QQuery 39    │   198.89ms │              202.61ms │     no change │
│ QQuery 40    │    48.95ms │               47.96ms │     no change │
│ QQuery 41    │    45.21ms │               46.70ms │     no change │
│ QQuery 42    │    37.69ms │               38.21ms │     no change │
└──────────────┴────────────┴───────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                    │ 57751.10ms │
│ Total Time (concat_in_repartition)   │ 58036.22ms │
│ Average Time (HEAD)                  │  1343.05ms │
│ Average Time (concat_in_repartition) │  1349.68ms │
│ Queries Faster                       │          5 │
│ Queries Slower                       │          3 │
│ Queries with No Change               │         35 │
└──────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃     HEAD ┃ concat_in_repartition ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 1     │ 115.45ms │              121.28ms │ 1.05x slower │
│ QQuery 2     │  22.05ms │               23.17ms │ 1.05x slower │
│ QQuery 3     │  34.20ms │               41.50ms │ 1.21x slower │
│ QQuery 4     │  19.74ms │               21.28ms │ 1.08x slower │
│ QQuery 5     │  53.14ms │               60.16ms │ 1.13x slower │
│ QQuery 6     │  12.19ms │               12.16ms │    no change │
│ QQuery 7     │  95.32ms │              112.42ms │ 1.18x slower │
│ QQuery 8     │  25.53ms │               28.15ms │ 1.10x slower │
│ QQuery 9     │  60.59ms │               66.57ms │ 1.10x slower │
│ QQuery 10    │  58.20ms │               62.97ms │ 1.08x slower │
│ QQuery 11    │  11.57ms │               11.84ms │    no change │
│ QQuery 12    │  41.92ms │               44.55ms │ 1.06x slower │
│ QQuery 13    │  28.11ms │               29.19ms │    no change │
│ QQuery 14    │   9.74ms │               10.44ms │ 1.07x slower │
│ QQuery 15    │  22.92ms │               23.21ms │    no change │
│ QQuery 16    │  22.36ms │               21.96ms │    no change │
│ QQuery 17    │  95.73ms │               96.48ms │    no change │
│ QQuery 18    │ 207.56ms │              216.17ms │    no change │
│ QQuery 19    │  26.01ms │               25.88ms │    no change │
│ QQuery 20    │  33.88ms │               36.51ms │ 1.08x slower │
│ QQuery 21    │ 159.01ms │              168.68ms │ 1.06x slower │
│ QQuery 22    │  16.62ms │               16.87ms │    no change │
└──────────────┴──────────┴───────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                    │ 1171.81ms │
│ Total Time (concat_in_repartition)   │ 1251.45ms │
│ Average Time (HEAD)                  │   53.26ms │
│ Average Time (concat_in_repartition) │   56.88ms │
│ Queries Faster                       │         0 │
│ Queries Slower                       │        13 │
│ Queries with No Change               │         9 │
└──────────────────────────────────────┴───────────┘

Dandandan · 2025-06-01T17:38:56Z

Hmm, interesting. This shows something different.

Dandandan · 2025-06-01T17:48:50Z

One commit was missing, but I'm not sure that explains the difference between my results and these.

Dandandan · 2025-06-01T19:20:14Z

Let me try another approach later: buffering inputs for each output partition until the buffer reaches the target batch size (just like CoalesceBatches). Perhaps the extra copy for smaller batches, or the increased batch size, is hurting in some cases.
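
A minimal sketch of that buffering idea, under the same toy assumptions as before: rows are plain `u64` values rather than Arrow arrays, and `PartitionBuffers`, `push` and `finish` are hypothetical names, not DataFusion APIs. Each output partition accumulates rows and only emits a batch once it holds at least the target number of rows.

```rust
/// Toy per-partition buffer (assumption: rows are plain u64 values standing
/// in for the rows of an Arrow RecordBatch).
struct PartitionBuffers {
    target_batch_size: usize,
    buffers: Vec<Vec<u64>>,
}

impl PartitionBuffers {
    fn new(num_partitions: usize, target_batch_size: usize) -> Self {
        Self {
            target_batch_size,
            buffers: vec![Vec::new(); num_partitions],
        }
    }

    /// Append rows destined for `partition`. Returns a batch to send once
    /// the buffer holds at least `target_batch_size` rows, otherwise `None`.
    fn push(&mut self, partition: usize, rows: &[u64]) -> Option<Vec<u64>> {
        let buf = &mut self.buffers[partition];
        buf.extend_from_slice(rows);
        if buf.len() >= self.target_batch_size {
            // Hand out the buffered rows and leave an empty buffer behind.
            Some(std::mem::take(buf))
        } else {
            None
        }
    }

    /// At end of input, flush whatever is still buffered, partition by partition.
    fn finish(&mut self) -> Vec<(usize, Vec<u64>)> {
        self.buffers
            .iter_mut()
            .enumerate()
            .filter(|(_, b)| !b.is_empty())
            .map(|(p, b)| (p, std::mem::take(b)))
            .collect()
    }
}
```

In this sketch an emitted batch can exceed the target size, because the whole buffer is handed out as soon as the threshold is crossed; a real implementation might split it instead.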

Dandandan · 2025-06-02T08:17:22Z

I got some amazing results yesterday (5-20% improvement in total average time on the benchmarks) with the latter approach (buffering inside repartition). I will clean it up later this week (currently ill).

alamb · 2025-06-02T12:26:10Z

I hope you feel better!

Concatenate inside hash repartition

f731021

github-actions bot added the physical-plan Changes to the physical-plan crate label Jun 1, 2025

Dandandan closed this Jun 1, 2025

Concatenate inside hash repartition

6cad3f7

Dandandan reopened this Jun 1, 2025

Dandandan added 4 commits June 1, 2025 14:06

Concatenate inside hash repartition

73eb2f1

clippy

1a96d2d

fmt

90d8f09

Concatenate inside hash repartition

dc7df1a

Dandandan closed this Jun 1, 2025

Dandandan reopened this Jun 1, 2025

opt

d4ac615

Dandandan marked this pull request as ready for review June 1, 2025 16:26

Dandandan closed this Jun 1, 2025
