More TPC-C tests, fix the slow HASH_JOIN #2130

max-hoffman · 2023-11-06T21:36:25Z

The randIO parameter for LOOKUP_JOIN costing was perhaps too strict, since that cost is already stacked on top of the sequential cost. This isn't a replacement for better costing, but it boosts TPC-C perf a bit and is no less correct than the previous version.

This was the motivating query, which was executed as a HASH_JOIN before and now plans as a LOOKUP_JOIN:

sbt> explain SELECT COUNT(DISTINCT (s_i_id)) FROM order_line3, stock3 WHERE ol_w_id = 1 AND ol_d_id = 5 AND ol_o_id < 3003 AND ol_o_id >= 2983 AND s_w_id= 1 AND s_i_id=ol_i_id AND s_quantity < 18;
+------------------------------------------------------------------------------------------------------------+
| plan                                                                                                       |
+------------------------------------------------------------------------------------------------------------+
| Project                                                                                                    |
|  ├─ columns: [countdistinct([stock3.s_i_id])]                                                              |
|  └─ GroupBy                                                                                                |
|      ├─ SelectedExprs(COUNTDISTINCT([stock3.s_i_id]))                                                      |
|      ├─ Grouping()                                                                                         |
|      └─ LookupJoin                                                                                         |
|          ├─ IndexedTableAccess(order_line3)                                                                |
|          │   ├─ index: [order_line3.ol_w_id,order_line3.ol_d_id,order_line3.ol_o_id,order_line3.ol_number] |
|          │   ├─ filters: [{[1, 1], [5, 5], [2983, 3003), [NULL, ∞)}]                                       |
|          │   └─ columns: [ol_o_id ol_d_id ol_w_id ol_i_id]                                                 |
|          └─ Filter                                                                                         |
|              ├─ ((stock3.s_w_id = 1) AND (stock3.s_quantity < 18))                                         |
|              └─ IndexedTableAccess(stock3)                                                                 |
|                  ├─ index: [stock3.s_w_id,stock3.s_i_id]                                                   |
|                  ├─ columns: [s_i_id s_w_id s_quantity]                                                    |
|                  └─ keys: 1, order_line3.ol_i_id                                                           |
+------------------------------------------------------------------------------------------------------------+

max-hoffman · 2023-11-06T23:32:45Z

So this breaks one of our index join tests that's faster with MERGE_JOIN. Should I close this for now, or change the parameter to a value such that both queries have the desired plan?

max-hoffman · 2023-11-06T23:33:54Z

The better fix will come when I get to phase 2 of costing and can get an accurate estimate for the LHS and RHS of joins. A small LHS means LOOKUP_JOIN is better, and a big LHS means MERGE_JOIN or HASH_JOIN is better, but we can't tell the difference right now.
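To make the tradeoff concrete, here is a rough sketch of how a per-lookup random IO weight stacked on top of the sequential cost can flip the choice between LOOKUP_JOIN and HASH_JOIN. This is not the actual go-mysql-server cost model: the function names, the formulas, and every constant below are assumptions made up for illustration.

```go
package main

import "fmt"

// Hypothetical cost weights, chosen only for illustration.
const (
	seqIOCost = 1.0  // cost to read one row sequentially
	cpuCost   = 0.01 // cost to hash or probe one row in memory
)

// lookupJoinCost scans the LHS sequentially, then pays one random
// index lookup per LHS row. The randIO weight is stacked on top of
// the sequential cost for the same row.
func lookupJoinCost(lhsRows, randIO float64) float64 {
	return lhsRows*seqIOCost + lhsRows*randIO
}

// hashJoinCost scans both sides sequentially and pays a small CPU
// cost per row to build and probe the hash table.
func hashJoinCost(lhsRows, rhsRows float64) float64 {
	return (lhsRows + rhsRows) * (seqIOCost + cpuCost)
}

// chooseJoin returns the cheaper operator under this toy model.
func chooseJoin(lhsRows, rhsRows, randIO float64) string {
	if lookupJoinCost(lhsRows, randIO) <= hashJoinCost(lhsRows, rhsRows) {
		return "LOOKUP_JOIN"
	}
	return "HASH_JOIN"
}

func main() {
	// Small LHS: lookups win regardless of the weight.
	fmt.Println(chooseJoin(200, 100000, 10.0))
	// Mid-sized LHS: a strict weight pushes the plan to HASH_JOIN,
	// a looser one keeps LOOKUP_JOIN.
	fmt.Println(chooseJoin(20000, 100000, 10.0))
	fmt.Println(chooseJoin(20000, 100000, 2.0))
	// LHS as large as the RHS: HASH_JOIN wins even with the looser weight.
	fmt.Println(chooseJoin(100000, 100000, 2.0))
}
```

In this toy model the only input that separates the two plans is the LHS row count, which is why a fixed randIO value can only approximate the right answer until real cardinality estimates are available.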

max-hoffman · 2023-11-07T18:01:53Z

I added sysbench plan tests so that we don't accidentally break benchmarks and to compensate for the randIO magic number.

max-hoffman added 4 commits November 6, 2023 13:33

Add more TPCC plans, fix the slow one

0f5b738

Merge branch 'main' into max/more-tpcc

489d148

fixup plans

aeb5a21

fix build

0acf418

max-hoffman requested a review from zachmu November 6, 2023 22:44

max-hoffman mentioned this pull request Nov 6, 2023

[no-release-notes] GMS bump for TPCC plan change dolthub/dolt#6954

Closed

zachmu approved these changes Nov 6, 2023

max-hoffman added 3 commits November 7, 2023 08:56

tweak sysbench plan

502a6f4

Merge branch 'main' into max/more-tpcc

8b36951

update plans

e30eb7e

max-hoffman merged commit 1513b8c into main Nov 7, 2023

max-hoffman deleted the max/more-tpcc branch November 7, 2023 18:09

BrewTestBot mentioned this pull request Nov 7, 2023

dolt 1.24.3 Homebrew/homebrew-core#153596

Merged
