bug: TPCH 18 query hangs

### Describe the bug

On my machine:
```
OS: Fedora Linux 43 (KDE Plasma Desktop Edition) x86_64
Kernel: Linux 6.19.11-200.fc43.x86_64
CPU: AMD Ryzen AI 9 HX 370 (24) @ 5.16 GHz
GPU: AMD Radeon 890M Graphics [Integrated]
Memory: 11.96 GiB / 86.02 GiB (14%)
Swap: 0 B / 8.00 GiB (0%)
```
with an up-to-date Rust toolchain:
```
$ rustup show
Default host: x86_64-unknown-linux-gnu
rustup home:  /home/bruce/.rustup

installed toolchains
--------------------
stable-x86_64-unknown-linux-gnu (default)
nightly-x86_64-unknown-linux-gnu
1.91.0-x86_64-unknown-linux-gnu
1.92.0-x86_64-unknown-linux-gnu
1.93.0-x86_64-unknown-linux-gnu
1.94.0-x86_64-unknown-linux-gnu (active)

active toolchain
----------------
name: 1.94.0-x86_64-unknown-linux-gnu
active because: overridden by '/home/bruce/dev/datafusion2/rust-toolchain.toml'
installed targets:
  x86_64-unknown-linux-gnu
```

Running against `main`, `cd benchmarks; ./bench.sh data tpch; ./bench.sh run tpch 18` hangs after printing the run options, and no iteration timings are ever reported:
```
$ ./bench.sh run tpch 18
***************************
DataFusion Benchmark Script
COMMAND: run
BENCHMARK: tpch
QUERY: 18
DATAFUSION_DIR: /home/bruce/dev/datafusion2/benchmarks/..
BRANCH_NAME: HEAD
DATA_DIR: /home/bruce/dev/datafusion2/benchmarks/data
RESULTS_DIR: /home/bruce/dev/datafusion2/benchmarks/results/HEAD
CARGO_COMMAND: cargo run --release
PREFER_HASH_JOIN: true
SIMULATE_LATENCY: false
***************************
RESULTS_FILE: /home/bruce/dev/datafusion2/benchmarks/results/HEAD/tpch_sf1.json
Running tpch benchmark...
+ cargo run --release --bin dfbench -- tpch --iterations 5 --path /home/bruce/dev/datafusion2/benchmarks/data/tpch_sf1 --prefer_hash_join true --format parquet -o /home/bruce/dev/datafusion2/benchmarks/results/HEAD/tpch_sf1.json --query 18
    Finished `release` profile [optimized] target(s) in 0.11s
     Running `/home/bruce/dev/datafusion2/target/release/dfbench tpch --iterations 5 --path /home/bruce/dev/datafusion2/benchmarks/data/tpch_sf1 --prefer_hash_join true --format parquet -o /home/bruce/dev/datafusion2/benchmarks/results/HEAD/tpch_sf1.json --query 18`
Running benchmarks with the following options: RunOpt { query: Some(18), common: CommonOpt { iterations: 5, partitions: None, batch_size: None, mem_pool_type: "fair", memory_limit: None, sort_spill_reservation_bytes: None, debug: false, simulate_latency: false }, path: "/home/bruce/dev/datafusion2/benchmarks/data/tpch_sf1", file_format: "parquet", mem_table: false, output_path: Some("/home/bruce/dev/datafusion2/benchmarks/results/HEAD/tpch_sf1.json"), disable_statistics: false, prefer_hash_join: true, enable_piecewise_merge_join: false, sorted: false, hash_join_buffering_capacity: 0 }
```

`git bisect` points to [this commit](https://github.com/apache/datafusion/commit/6c5e241e6298e70077259b3a12840c3adab3c810) as the cause. The query succeeds at the commit just before it and hangs at that commit.
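
A hang never returns an exit status, so automating a bisect like the one above needs a time limit. The following is a minimal sketch, not the procedure used for this report: `check_hang` is a hypothetical helper name, and it assumes GNU coreutils `timeout`, which exits with status 124 when the limit expires. The return codes follow the `git bisect run` convention (0 = good, 1 = bad, 125 = skip this commit).

```shell
# Hypothetical helper (not part of DataFusion): run a command under a time
# limit and map the outcome to the exit codes `git bisect run` understands.
# Usage: check_hang <seconds> <command> [args...]
check_hang() {
    limit="$1"
    shift
    # GNU `timeout` exits with 124 if the command was still running
    # when the limit expired.
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "hang"   # timed out: mark the commit as bad
        return 1
    elif [ "$status" -ne 0 ]; then
        echo "skip"   # failed for another reason (e.g. build error)
        return 125
    fi
    echo "ok"         # finished within the limit: mark the commit as good
    return 0
}
```

With the function saved in a file (say `check_hang.sh`, also a hypothetical name), a bisect step could be driven by something like `git bisect run sh -c '. ./check_hang.sh; cd benchmarks && check_hang 1800 ./bench.sh run tpch 18'`. The limit has to cover the release rebuild that `bench.sh` triggers at each commit, so 1800 seconds is only a guess and should be adjusted to the machine.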

If `prefer_hash_join` is disabled, the query runs as expected:
```
PREFER_HASH_JOIN=false ./bench.sh run tpch 18
***************************
DataFusion Benchmark Script
COMMAND: run
BENCHMARK: tpch
QUERY: 18
DATAFUSION_DIR: /home/bruce/dev/datafusion2/benchmarks/..
BRANCH_NAME: HEAD
DATA_DIR: /home/bruce/dev/datafusion2/benchmarks/data
RESULTS_DIR: /home/bruce/dev/datafusion2/benchmarks/results/HEAD
CARGO_COMMAND: cargo run --release
PREFER_HASH_JOIN: false
SIMULATE_LATENCY: false
***************************
RESULTS_FILE: /home/bruce/dev/datafusion2/benchmarks/results/HEAD/tpch_sf1.json
Running tpch benchmark...
+ cargo run --release --bin dfbench -- tpch --iterations 5 --path /home/bruce/dev/datafusion2/benchmarks/data/tpch_sf1 --prefer_hash_join false --format parquet -o /home/bruce/dev/datafusion2/benchmarks/results/HEAD/tpch_sf1.json --query 18
    Finished `release` profile [optimized] target(s) in 0.15s
     Running `/home/bruce/dev/datafusion2/target/release/dfbench tpch --iterations 5 --path /home/bruce/dev/datafusion2/benchmarks/data/tpch_sf1 --prefer_hash_join false --format parquet -o /home/bruce/dev/datafusion2/benchmarks/results/HEAD/tpch_sf1.json --query 18`
Running benchmarks with the following options: RunOpt { query: Some(18), common: CommonOpt { iterations: 5, partitions: None, batch_size: None, mem_pool_type: "fair", memory_limit: None, sort_spill_reservation_bytes: None, debug: false, simulate_latency: false }, path: "/home/bruce/dev/datafusion2/benchmarks/data/tpch_sf1", file_format: "parquet", mem_table: false, output_path: Some("/home/bruce/dev/datafusion2/benchmarks/results/HEAD/tpch_sf1.json"), disable_statistics: false, prefer_hash_join: false, enable_piecewise_merge_join: false, sorted: false, hash_join_buffering_capacity: 0 }
Query 18 iteration 0 took 206.5 ms and returned 57 rows
Query 18 iteration 1 took 190.0 ms and returned 57 rows
Query 18 iteration 2 took 188.5 ms and returned 57 rows
Query 18 iteration 3 took 185.0 ms and returned 57 rows
Query 18 iteration 4 took 192.1 ms and returned 57 rows
Query 18 avg time: 192.42 ms
+ set +x
Done
```

### To Reproduce

This seems to be machine- or OS-specific: I have been unable to reproduce it on other machines.

### Expected behavior

_No response_

### Additional context

_No response_
