Commit a1d55d7
[SPARK-49386][CORE][SQL] Add memory based thresholds for shuffle spill
Original author: amuraru
### What changes were proposed in this pull request?
This PR adds memory-based thresholds for shuffle spill.
It introduces the following configurations:
- spark.shuffle.spill.maxRecordsSizeForSpillThreshold
- spark.sql.windowExec.buffer.spill.size.threshold
- spark.sql.sessionWindow.buffer.spill.size.threshold
- spark.sql.sortMergeJoinExec.buffer.spill.size.threshold
- spark.sql.cartesianProductExec.buffer.spill.size.threshold
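As a rough illustration of how the new keys might be supplied, the sketch below collects them in a plain map and renders `--conf key=value` arguments for `spark-submit`. The class `SpillThresholdConf`, the helpers `exampleSettings` and `toSubmitArgs`, and the values `64m` and `32m` are hypothetical. The sketch also assumes the keys accept byte-size strings; check the config definitions in this commit for the actual types and defaults.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SpillThresholdConf {

    /** Example values only (assumption): byte-size strings such as "64m". */
    public static Map<String, String> exampleSettings() {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("spark.shuffle.spill.maxRecordsSizeForSpillThreshold", "64m");
        settings.put("spark.sql.windowExec.buffer.spill.size.threshold", "32m");
        settings.put("spark.sql.sessionWindow.buffer.spill.size.threshold", "32m");
        settings.put("spark.sql.sortMergeJoinExec.buffer.spill.size.threshold", "32m");
        settings.put("spark.sql.cartesianProductExec.buffer.spill.size.threshold", "32m");
        return settings;
    }

    /** Renders each setting as the pair "--conf", "key=value". */
    public static List<String> toSubmitArgs(Map<String, String> settings) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : settings.entrySet()) {
            args.add("--conf");
            args.add(e.getKey() + "=" + e.getValue());
        }
        return args;
    }

    public static void main(String[] args) {
        // Prints the arguments on one line, ready to append to a spark-submit command.
        System.out.println(String.join(" ", toSubmitArgs(exampleSettings())));
    }
}
```

Keeping the keys in one place like this makes it easier to tune the shuffle and SQL buffer thresholds together, but the right values depend on the workload.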
### Why are the changes needed?
#24618
Currently, the only way to force a spill is by record count, via `spark.shuffle.spill.numElementsForceSpillThreshold`. In some scenarios a single row can be very large in memory, so a count-based threshold alone does not limit how much data is buffered before spilling.
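The motivation can be sketched as follows: a buffer that only counts records may hold a few very large rows without ever reaching the count threshold, while an additional size threshold forces a spill once the buffered bytes grow too large. The class `SpillPolicy` and its fields are hypothetical and are not the code changed in this commit; they only illustrate combining a record-count threshold with a size threshold.

```java
public class SpillPolicy {
    private final long maxRecords;   // count-based threshold
    private final long maxBytes;     // size-based threshold (the new idea)
    private long bufferedRecords = 0;
    private long bufferedBytes = 0;
    private int spillCount = 0;

    public SpillPolicy(long maxRecords, long maxBytes) {
        this.maxRecords = maxRecords;
        this.maxBytes = maxBytes;
    }

    /** Buffers one row; returns true if this insert triggered a spill. */
    public boolean insert(long rowSizeBytes) {
        bufferedRecords++;
        bufferedBytes += rowSizeBytes;
        // Spill when either threshold is reached, then reset the buffer counters.
        if (bufferedRecords >= maxRecords || bufferedBytes >= maxBytes) {
            spillCount++;
            bufferedRecords = 0;
            bufferedBytes = 0;
            return true;
        }
        return false;
    }

    public int spillCount() {
        return spillCount;
    }

    public static void main(String[] args) {
        long eightMiB = 8L * 1024 * 1024;
        // Count threshold only: the size threshold is effectively disabled.
        SpillPolicy countOnly = new SpillPolicy(1000, Long.MAX_VALUE);
        // Same count threshold plus a 64 MiB size threshold.
        SpillPolicy withSize = new SpillPolicy(1000, 64L * 1024 * 1024);
        for (int i = 0; i < 100; i++) {
            countOnly.insert(eightMiB);
            withSize.insert(eightMiB);
        }
        System.out.println("count only: " + countOnly.spillCount() + " spills");
        System.out.println("count + size: " + withSize.spillCount() + " spills");
    }
}
```

With 100 rows of 8 MiB each, the count-only policy never spills and keeps roughly 800 MiB buffered, whereas the policy with a size threshold spills every eight rows.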
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
GA
Verified in a production environment: task time is shorter, the number of spills to disk is lower, the shuffle data has a better chance of being compressed, and the amount of data spilled to disk is significantly smaller.
**Current**
<img width="1281" alt="image" src="https://github.com/user-attachments/assets/b6e172b8-0da8-4b60-b456-024880d0987e">
```
24/08/19 07:02:54,947 [Executor task launch worker for task 0.0 in stage 53.0 (TID 1393)] INFO ShuffleExternalSorter: Thread 126 spilling sort data of 62.0 MiB to disk (11490 times so far)
24/08/19 07:02:55,029 [Executor task launch worker for task 0.0 in stage 53.0 (TID 1393)] INFO ShuffleExternalSorter: Thread 126 spilling sort data of 62.0 MiB to disk (11491 times so far)
24/08/19 07:02:55,093 [Executor task launch worker for task 0.0 in stage 53.0 (TID 1393)] INFO ShuffleExternalSorter: Thread 126 spilling sort data of 62.0 MiB to disk (11492 times so far)
24/08/19 07:08:59,894 [Executor task launch worker for task 0.0 in stage 53.0 (TID 1393)] INFO Executor: Finished task 0.0 in stage 53.0 (TID 1393). 7409 bytes result sent to driver
```
**PR**
<img width="1294" alt="image" src="https://github.com/user-attachments/assets/aedb83a4-c8a1-4ac9-a805-55ba44ebfc9e">
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #47856 from cxzl25/SPARK-27734.
Lead-authored-by: sychen <sychen@ctrip.com>
Co-authored-by: Adi Muraru <amuraru@adobe.com>
Signed-off-by: attilapiros <piros.attila.zsolt@gmail.com>
1 parent eb5af45 commit a1d55d7
File tree
27 files changed, +238 -46 lines changed
- common/utils/src/main/scala/org/apache/spark/internal
- core/src
  - main
    - java/org/apache/spark
      - shuffle/sort
      - util/collection/unsafe/sort
    - scala/org/apache/spark
      - internal/config
      - util/collection
  - test/java/org/apache/spark/util/collection/unsafe/sort
- sql
  - catalyst/src/main/scala/org/apache/spark/sql/internal
  - core/src
    - main
      - java/org/apache/spark/sql/execution
      - scala/org/apache/spark/sql/execution
        - aggregate
        - joins
        - python
        - window
    - test/scala/org/apache/spark/sql
      - execution
      - streaming
Lines changed: 3 additions & 0 deletions
Lines changed: 17 additions & 2 deletions
Lines changed: 22 additions & 3 deletions
Lines changed: 12 additions & 0 deletions
Lines changed: 17 additions & 7 deletions
Lines changed: 11 additions & 4 deletions
Lines changed: 41 additions & 0 deletions