Specialize join matching when values in map are unique#15690

Closed
Dandandan wants to merge 1 commit into apache:main from Dandandan:join_uniqueness

Dandandan (Contributor) commented Apr 11, 2025

Which issue does this PR close?

  • Closes #.

Rationale for this change

Performance improvement for join matching in the case where the values in the join map are unique.
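
The description does not spell out the mechanism, so the following is only a minimal sketch of one plausible reading of the title, not the code in this PR (which targets DataFusion's Rust hash join). It assumes a build side stored as a map from key to build-row indices; all function names (`build`, `probe_general`, `probe_unique`, `hash_join`) are hypothetical. When every key maps to exactly one build row, the probe loop can skip the inner walk over duplicate matches.

```python
from collections import defaultdict


def build(build_keys):
    """Map each key to its build-side row indices and report whether all keys are unique."""
    table = defaultdict(list)
    for row, key in enumerate(build_keys):
        table[key].append(row)
    all_unique = all(len(rows) == 1 for rows in table.values())
    return dict(table), all_unique


def probe_general(table, probe_keys):
    """General path: emit one pair per build row stored under the probe key."""
    out = []
    for probe_row, key in enumerate(probe_keys):
        for build_row in table.get(key, ()):
            out.append((build_row, probe_row))
    return out


def probe_unique(table, probe_keys):
    """Specialized path (assumed): at most one match per probe row, so no inner loop."""
    out = []
    for probe_row, key in enumerate(probe_keys):
        rows = table.get(key)
        if rows is not None:
            out.append((rows[0], probe_row))
    return out


def hash_join(build_keys, probe_keys):
    """Pick the specialized probe only when the build side has no duplicate keys."""
    table, all_unique = build(build_keys)
    probe = probe_unique if all_unique else probe_general
    return probe(table, probe_keys)
```

Both paths return the same (build row, probe row) pairs whenever the build keys are unique; the specialized one simply does less work per probe row.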

Details
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     main ┃ join_uniqueness ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  76.36ms │         77.54ms │     no change │
│ QQuery 2     │  13.22ms │         12.55ms │ +1.05x faster │
│ QQuery 3     │  25.00ms │         21.52ms │ +1.16x faster │
│ QQuery 4     │  14.27ms │         12.20ms │ +1.17x faster │
│ QQuery 5     │  37.87ms │         36.25ms │     no change │
│ QQuery 6     │   5.80ms │          5.20ms │ +1.12x faster │
│ QQuery 7     │  74.47ms │         72.09ms │     no change │
│ QQuery 8     │  17.11ms │         17.79ms │     no change │
│ QQuery 9     │  42.82ms │         40.55ms │ +1.06x faster │
│ QQuery 10    │  36.96ms │         38.60ms │     no change │
│ QQuery 11    │   6.74ms │          6.23ms │ +1.08x faster │
│ QQuery 12    │  27.48ms │         27.19ms │     no change │
│ QQuery 13    │  20.02ms │         18.65ms │ +1.07x faster │
│ QQuery 14    │   6.19ms │          5.52ms │ +1.12x faster │
│ QQuery 15    │  13.80ms │         13.39ms │     no change │
│ QQuery 16    │  14.27ms │         13.48ms │ +1.06x faster │
│ QQuery 17    │  57.61ms │         55.87ms │     no change │
│ QQuery 18    │ 144.33ms │        126.34ms │ +1.14x faster │
│ QQuery 19    │  23.26ms │         24.65ms │  1.06x slower │
│ QQuery 20    │  24.67ms │         21.51ms │ +1.15x faster │
│ QQuery 21    │ 101.15ms │         90.91ms │ +1.11x faster │
│ QQuery 22    │  12.86ms │         13.18ms │     no change │
└──────────────┴──────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark Summary              ┃          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Total Time (main)              │ 796.25ms │
│ Total Time (join_uniqueness)   │ 751.22ms │
│ Average Time (main)            │  36.19ms │
│ Average Time (join_uniqueness) │  34.15ms │
│ Queries Faster                 │       12 │
│ Queries Slower                 │        1 │
│ Queries with No Change         │        9 │
└────────────────────────────────┴──────────┘
--------------------
Benchmark tpch_mem_sf10.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      main ┃ join_uniqueness ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  713.30ms │        742.33ms │     no change │
│ QQuery 2     │  117.97ms │        103.63ms │ +1.14x faster │
│ QQuery 3     │  225.08ms │        233.71ms │     no change │
│ QQuery 4     │  126.43ms │        126.15ms │     no change │
│ QQuery 5     │  490.15ms │        413.30ms │ +1.19x faster │
│ QQuery 6     │   45.84ms │         45.32ms │     no change │
│ QQuery 7     │ 1053.36ms │        969.33ms │ +1.09x faster │
│ QQuery 8     │  325.97ms │        321.67ms │     no change │
│ QQuery 9     │  808.04ms │        788.69ms │     no change │
│ QQuery 10    │  402.56ms │        405.22ms │     no change │
│ QQuery 11    │   92.35ms │         77.74ms │ +1.19x faster │
│ QQuery 12    │  276.90ms │        278.14ms │     no change │
│ QQuery 13    │  307.09ms │        236.15ms │ +1.30x faster │
│ QQuery 14    │   41.72ms │         45.89ms │  1.10x slower │
│ QQuery 15    │  123.96ms │        130.16ms │  1.05x slower │
│ QQuery 16    │   91.25ms │         91.01ms │     no change │
│ QQuery 17    │  880.09ms │        867.14ms │     no change │
│ QQuery 18    │ 3028.76ms │       2450.51ms │ +1.24x faster │
│ QQuery 19    │  173.80ms │        172.74ms │     no change │
│ QQuery 20    │  231.96ms │        207.22ms │ +1.12x faster │
│ QQuery 21    │ 1379.85ms │       1438.61ms │     no change │
│ QQuery 22    │  104.42ms │         95.54ms │ +1.09x faster │
└──────────────┴───────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary              ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (main)              │ 11040.86ms │
│ Total Time (join_uniqueness)   │ 10240.19ms │
│ Average Time (main)            │   501.86ms │
│ Average Time (join_uniqueness) │   465.46ms │
│ Queries Faster                 │          8 │
│ Queries Slower                 │          2 │
│ Queries with No Change         │         12 │
└────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     main ┃ join_uniqueness ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 119.77ms │        121.02ms │     no change │
│ QQuery 2     │  47.11ms │         48.15ms │     no change │
│ QQuery 3     │  53.57ms │         55.08ms │     no change │
│ QQuery 4     │  45.38ms │         44.26ms │     no change │
│ QQuery 5     │  78.05ms │         73.26ms │ +1.07x faster │
│ QQuery 6     │  20.90ms │         20.56ms │     no change │
│ QQuery 7     │  98.65ms │         94.39ms │     no change │
│ QQuery 8     │  74.31ms │         73.84ms │     no change │
│ QQuery 9     │ 115.48ms │        110.37ms │     no change │
│ QQuery 10    │ 100.76ms │        102.86ms │     no change │
│ QQuery 11    │  38.01ms │         37.21ms │     no change │
│ QQuery 12    │  53.72ms │         51.41ms │     no change │
│ QQuery 13    │ 130.48ms │        132.69ms │     no change │
│ QQuery 14    │  36.53ms │         36.61ms │     no change │
│ QQuery 15    │  45.10ms │         44.14ms │     no change │
│ QQuery 16    │  29.42ms │         29.11ms │     no change │
│ QQuery 17    │ 102.52ms │        105.23ms │     no change │
│ QQuery 18    │ 156.62ms │        145.41ms │ +1.08x faster │
│ QQuery 19    │  65.29ms │         64.02ms │     no change │
│ QQuery 20    │  61.27ms │         60.25ms │     no change │
│ QQuery 21    │ 118.45ms │        109.26ms │ +1.08x faster │
│ QQuery 22    │  30.35ms │         30.54ms │     no change │
└──────────────┴──────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary              ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (main)              │ 1621.76ms │
│ Total Time (join_uniqueness)   │ 1589.68ms │
│ Average Time (main)            │   73.72ms │
│ Average Time (join_uniqueness) │   72.26ms │
│ Queries Faster                 │         3 │
│ Queries Slower                 │         0 │
│ Queries with No Change         │        19 │
└────────────────────────────────┴───────────┘
--------------------
Benchmark tpch_sf10.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      main ┃ join_uniqueness ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  917.62ms │        922.40ms │     no change │
│ QQuery 2     │  164.94ms │        162.92ms │     no change │
│ QQuery 3     │  444.79ms │        434.07ms │     no change │
│ QQuery 4     │  481.42ms │        472.70ms │     no change │
│ QQuery 5     │  719.77ms │        667.56ms │ +1.08x faster │
│ QQuery 6     │  145.90ms │        143.08ms │     no change │
│ QQuery 7     │  997.14ms │        963.82ms │     no change │
│ QQuery 8     │  696.29ms │        688.56ms │     no change │
│ QQuery 9     │ 1111.77ms │       1127.86ms │     no change │
│ QQuery 10    │  617.77ms │        649.63ms │  1.05x slower │
│ QQuery 11    │  136.72ms │        134.25ms │     no change │
│ QQuery 12    │  318.16ms │        322.52ms │     no change │
│ QQuery 13    │  703.35ms │        711.53ms │     no change │
│ QQuery 14    │  248.90ms │        248.35ms │     no change │
│ QQuery 15    │  389.32ms │        387.55ms │     no change │
│ QQuery 16    │  111.42ms │        107.75ms │     no change │
│ QQuery 17    │ 1170.66ms │       1183.55ms │     no change │
│ QQuery 18    │ 2003.83ms │       1893.24ms │ +1.06x faster │
│ QQuery 19    │  433.70ms │        422.29ms │     no change │
│ QQuery 20    │  403.94ms │        420.45ms │     no change │
│ QQuery 21    │ 1410.99ms │       1374.66ms │     no change │
│ QQuery 22    │  154.81ms │        151.55ms │     no change │
└──────────────┴───────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary              ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (main)              │ 13783.20ms │
│ Total Time (join_uniqueness)   │ 13590.27ms │
│ Average Time (main)            │   626.51ms │
│ Average Time (join_uniqueness) │   617.74ms │
│ Queries Faster                 │          2 │
│ Queries Slower                 │          1 │
│ Queries with No Change         │         19 │
└────────────────────────────────┴────────────┘
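
To put the four summaries side by side, the overall speedup can be computed from the "Total Time" rows above. This is a small illustrative script, not part of the benchmark tooling; the helper names `speedup` and `summarize` are made up for this example, and the numbers are copied from the tables.

```python
# Total times in milliseconds, copied from the summary tables: (main, join_uniqueness).
totals_ms = {
    "tpch_mem_sf1": (796.25, 751.22),
    "tpch_mem_sf10": (11040.86, 10240.19),
    "tpch_sf1": (1621.76, 1589.68),
    "tpch_sf10": (13783.20, 13590.27),
}


def speedup(main_ms, branch_ms):
    """Ratio of main's total time to the branch's total time (>1 means the branch is faster)."""
    return main_ms / branch_ms


def summarize(totals):
    """Return the speedup per benchmark, rounded to three decimals."""
    return {name: round(speedup(m, b), 3) for name, (m, b) in totals.items()}


for name, ratio in summarize(totals_ms).items():
    print(f"{name}: {ratio:.3f}x")
```

By these totals the branch is roughly 6% and 8% faster on the two in-memory runs and roughly 1% to 2% faster on the non-`mem` runs, which matches the per-query tables showing fewer "faster" rows for `tpch_sf1` and `tpch_sf10`.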

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Dandandan marked this pull request as draft on April 11, 2025 at 21:07

Dandandan (Contributor, Author) commented:

This is promising. I still need to fix the test and make sure the limit is respected.

Dandandan closed this on Apr 13, 2025

ctsk (Contributor) commented May 23, 2025

Mind if I pick this up? Never mind, I just saw your new PR!

Dandandan (Contributor, Author) commented:

Yeah, feel free to review my new PR #16153!
