Commit 8e7d5ba
SPARK-2792. Fix reading too much or too little data from each stream in ExternalMap / Sorter
All these changes are from mridulm's work in apache#1609, but extracted here to fix this specific issue and make it easier to merge into 1.1. This particular set of changes makes sure that we read exactly the right range of bytes from each spill file in ExternalAppendOnlyMap (EAOM): some serializers can write bytes after the last object (e.g. the TC_RESET flag in Java serialization), and the previous code would misread those bytes as part of the next batch. There are also improvements to cleanup to make sure files are closed.
In addition to bringing in the changes to ExternalAppendOnlyMap, I also copied them to the corresponding code in ExternalSorter and updated its test suite to test for the same issues.
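The batch-boundary problem described above can be illustrated with a minimal, self-contained Java sketch. It is not the Spark code: the class `BatchedSpillDemo` and its methods `writeBatches` and `readBatches` are hypothetical names, and an in-memory byte array stands in for a spill file. The writer records how many bytes each batch occupies, including the TC_RESET marker that `ObjectOutputStream.reset()` emits after the last object, and the reader hands each deserialization stream only that byte range.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Spark.
final class BatchedSpillDemo {

    // Appends each batch to `file` through a fresh ObjectOutputStream and
    // returns the number of bytes every batch occupies, trailing bytes included.
    static List<Integer> writeBatches(ByteArrayOutputStream file, List<List<String>> batches) {
        List<Integer> sizes = new ArrayList<>();
        try {
            for (List<String> batch : batches) {
                int start = file.size();
                ObjectOutputStream out = new ObjectOutputStream(file);
                for (String item : batch) {
                    out.writeObject(item);
                }
                out.reset();  // writes a TC_RESET marker after the last object
                out.flush();
                sizes.add(file.size() - start);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sizes;
    }

    // Reads the batches back, giving each ObjectInputStream only the byte
    // range recorded for its batch so it cannot run into the next one.
    static List<List<String>> readBatches(byte[] file, List<Integer> sizes) {
        List<List<String>> batches = new ArrayList<>();
        int offset = 0;
        try {
            for (int size : sizes) {
                ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(file, offset, size));
                List<String> batch = new ArrayList<>();
                try {
                    while (true) {
                        batch.add((String) in.readObject());
                    }
                } catch (EOFException endOfBatch) {
                    // The bounded slice is exhausted: this batch is complete.
                }
                batches.add(batch);
                offset += size;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<List<String>> input =
            List.of(List.of("a", "b"), List.of("c"), List.of("d", "e", "f"));
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        List<Integer> sizes = writeBatches(file, input);
        // Prints whether the round trip preserved the batch contents and order.
        System.out.println(readBatches(file.toByteArray(), sizes).equals(input));
    }
}
```

Because the recorded sizes include whatever the serializer wrote after the last object, the offset of the next batch stays correct without the reader needing to know about serializer-specific trailing bytes.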
Author: Matei Zaharia <matei@databricks.com>
Closes apache#1722 from mateiz/spark-2792 and squashes the following commits:
5d4bfb5 [Matei Zaharia] Make objectStreamReset counter count the last object written too
18fe865 [Matei Zaharia] Update docs on objectStreamReset
576ee83 [Matei Zaharia] Allow objectStreamReset to be 0
0374217 [Matei Zaharia] Remove super paranoid code to close file handles
bda37bb [Matei Zaharia] Implement Mridul's ExternalAppendOnlyMap fixes in ExternalSorter too
0d6dad7 [Matei Zaharia] Added Mridul's test changes for ExternalAppendOnlyMap
9a78e4b [Matei Zaharia] Add @mridulm's fixes to ExternalAppendOnlyMap for batch sizes
1 parent 59f84a9 commit 8e7d5ba
File tree
6 files changed, +194 -83 lines changed
- core/src
  - main/scala/org/apache/spark
    - serializer
    - util/collection
  - test/scala/org/apache/spark/util/collection
- docs
Lines changed: 2 additions & 3 deletions
Lines changed: 65 additions & 21 deletions
Lines changed: 75 additions & 29 deletions