Skip to content

Optimize MPP Broadcast Join #7084

Closed
Closed
@solotzg

Description

MPP Broadcast Join Enhancement

Implementation

ref pingcap/tidb#40494

TiDB

  • support data compression in Broadcast / Passthrough exchange operator
  • add new session var: tidb_prefer_broadcast_join_by_exchange_data_size (release-7.1):
    • OFF: same behavior like <= v7.0
    • ON: compare data exchange size of join and choose the smallest one
      • tidb_broadcast_join_threshold_size and tidb_broadcast_join_threshold_count will be ignored if planner success to determine whether need broadcast.
  • refine the process about checking broadcast join
    • Broadcast exchange size:
      • Build: (mppStoreCnt - 1) * sizeof(BuildTable)
      • Probe: 0
      • exchange size: Build * (mppStoreCnt^?)
      • choose the child plan with the maximum approximate value as Probe
    • Shuffle exchange size:
      • Build: sizeof(BuildTable) * (mppStoreCnt - 1) / mppStoreCnt
      • Probe: sizeof(ProbeTable) * (mppStoreCnt - 1) / mppStoreCnt
      • exchange size: (Build + Probe) * (mppStoreCnt^?)
    • The size of Build hash table when using shuffle join is about sizeof(broadcast join hash table) / (mppStoreCnt). It will cost more time to construct Build hash table and search Probe while using broadcast join. Set a scale factor (mppStoreCnt^?) when estimating broadcast join in isJoinFitMPPBCJ and isJoinChildFitMPPBCJ (based on TPCH benchmark).
    • If the approximate exchange size of broadcast is smaller than hash, then we need to choose broadcast way.

TiFlash

Support data compression in Broadcast / Passthrough exchange operator

  • support data compression in broadcast/passthrough exchange
  • update parts of utils modules
  • For StorageDisaggregated, use latest mpp version for task meta.
    • TODO: enable data compression if necessary

image

Progress

Benchmark

ENV

Original Threshold x 10: only update threshold about broadcast join

  • set tidb_broadcast_join_threshold_count=102400
  • set tidb_broadcast_join_threshold_size=1048576000
  • set tidb_prefer_broadcast_join_by_exchange_data_size=off

Optimize BCJ

  • set tidb_prefer_broadcast_join_by_exchange_data_size=on

Time(s) Original Optimize BCJ Original Threshold x 10 Performance(QPS) Improvement: (Original) / (Optimize BCJ) - 1.0 Performance(QPS) Improvement: (Original) / ( Original Threshold x 10 ) - 1.0
Q1 2.79 2.79 2.92 0.00% -4.45%
Q2 1.11 0.97 1.11 14.43% 0.00%
Q3 3.05 2.52 3.05 21.03% 0.00%
Q4 2.38 2.45 2.38 -2.86% 0.00%
Q5 6.48 4.33 6.14 49.65% 5.54%
Q6 0.7 0.64 0.7 9.38% 0.00%
Q7 2.99 2.72 2.32 9.93% 28.88%
Q8 4.87 2.18 5.07 123.39% -3.94%
Q9 17.01 13.79 12.72 23.35% 33.73%
Q10 3.39 3.59 3.39 -5.57% 0.00%
Q11 0.84 0.5 0.44 68.00% 90.91%
Q12 1.51 1.31 1.31 15.27% 15.27%
Q13 3.12 3.19 3.19 -2.19% -2.19%
Q14 0.70 0.77 0.84 -9.09% -16.67%
Q15 1.51 1.58 1.58 -4.43% -4.43%
Q16 0.84 0.84 0.84 0.00% 0.00%
Q17 6.41 6.48 6.48 -1.08% -1.08%
Q18 6.95 5.87 6.74 18.40% 3.12%
Q19 1.71 1.85 1.78 -7.57% -3.93%
Q20 1.38 1.44 1.51 -4.17% -8.61%
Q21 8.36 7.21 7.08 15.95% 18.08%
Q22 0.7 0.64 0.64 9.38% 9.38%
SUM 78.8 67.66 72.23 16.46% 9.10%
  Original Optimize BCJ Original Threshold x 10 NET/IO Reduction:(Original-(Optimize BCJ))/Original NET/IO Reduction:(Original-( Original Threshold x 10))/Original
Total Exchange Size By NET (GB) 75.894 22.823 52.376 69.93% 30.99%

After pingcap/tidb#42915

Time Cost(s) Original Optimize BCJ Performance(QPS) Improvement
Q1 2.79 2.85 -2.11%
Q2 1.11 0.97 14.43%
Q3 3.05 3.05 0.00%
Q4 2.38 2.38 0.00%
Q5 6.48 4.66 39.06%
Q6 0.7 0.7 0.00%
Q7 2.99 2.18 37.16%
Q8 4.87 2.11 130.81%
Q9 17.01 10.57 60.93%
Q10 3.39 3.32 2.11%
Q11 0.84 0.44 90.91%
Q12 1.51 1.58 -4.43%
Q13 3.12 3.19 -2.19%
Q14 0.77 0.77 0.00%
Q15 1.51 1.58 -4.43%
Q16 0.84 0.84 0.00%
Q17 6.41 6.27 2.23%
Q18 6.95 6.74 3.12%
Q19 1.78 1.71 4.09%
Q20 1.38 1.38 0.00%
Q21 8.36 7.28 14.84%
Q22 0.7 0.64 9.38%
SUM 78.94 65.21 21.06%
  Original Optimize BCJ NET/IO Reduction:(Original-(Optimize BCJ))/Original
Total Exchange Size By NET (GB) 75.894 30.216 60.19%

Benchmark: One MPP Store

ENV

Time(s) Original Optimize BCJ Performance(QPS) Improvement: (Original) / (Optimize BCJ) - 1.0
Q1 8.15 8.15 0.00%
Q2 2.52 2.25 12.00%
Q3 5.94 4.4 35.00%
Q4 4.53 4.6 -1.52%
Q5 14.13 10.7 32.06%
Q6 1.98 1.91 3.66%
Q7 6.61 5.34 23.78%
Q8 9.56 5.47 74.77%
Q9 38.82 29.83 30.14%
Q10 8.36 7.15 16.92%
Q11 1.71 1.04 64.42%
Q12 4.19 3.46 21.10%
Q13 8.62 8.42 2.38%
Q14 2.05 2.11 -2.84%
Q15 4.13 4.13 0.00%
Q16 2.18 2.05 6.34%
Q17 13.72 13.79 -0.51%
Q18 17.28 15.07 14.66%
Q19 5.00 5.00 0.00%
Q20 3.05 3.05 0.00%
Q21 16.81 13.79 21.90%
Q22 1.31 1.24 5.65%
SUM 180.65 152.95 18.11%

Activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions