Description
Is your feature request related to a problem or challenge?
DataFusion joins are generally performant, but they can start erroring out when memory becomes limited.
Here are the h2o queries run on my local machine (Macbook M3 with 16 GB of RAM):
DataFusion performs really well except for query 5, which joins two 100 million row tables. DataFusion errors out for query 5 on my machine.
DataFusion is the fastest option when joining a 100 million row table with a 100 row or 100,000 row table.
These same queries are more performant in the official benchmarks which are run on a really powerful machine:

The official benchmarks show an error for DataFusion on the 1 billion row table:

So, I am not sure about the underlying issue, but seems like there are problems when memory becomes limited.
Describe the solution you'd like
Hopefully DataFusion can perform similar to other engines for large table to large table joins.
Describe alternatives you've considered
No response
Additional context
No response