-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-19306][Core] Fix inconsistent state in DiskBlockObject when expection occurred #16657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Change-Id: I837b5135dd67034d74a9832133dc29800c88f089
CC @mridulm please help to review, thanks a lot. |
Test build #71712 has finished for PR 16657 at commit
|
Jenkins, retest this please. |
} catch { | ||
case e: Exception => | ||
logError("Uncaught exception while reverting partial writes to file " + file, e) | ||
file | ||
} finally { | ||
truncateStream.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option is to use Utils.tryWithSafeFinally
and do the if block with closeResources in the block with truncate in the finally block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You actually can't call close()
here without a null check.
I'd also move the file
return out of the try/catch block altogether.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about it. I will fix it.
Test build #71716 has finished for PR 16657 at commit
|
Change-Id: I6f88a75cf6467508c77cbfc28c25e4fc7c172373
Test build #71800 has started for PR 16657 at commit |
LGTM, will wait for @vanzin's comments too. |
Jenkins, retest this please. |
Test build #71801 has finished for PR 16657 at commit
|
LGTM. Merging to master / 2.1. |
…pection occurred ## What changes were proposed in this pull request? In `DiskBlockObjectWriter`, when some errors happened during writing, it will call `revertPartialWritesAndClose`, if this method again failed due to some issues like out of disk, it will throw exception without resetting the state of this writer, also skipping the revert. So here propose to fix this issue to offer user a chance to recover from such issue. ## How was this patch tested? Existing test. Author: jerryshao <sshao@hortonworks.com> Closes #16657 from jerryshao/SPARK-19306. (cherry picked from commit e497472) Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
…pection occurred ## What changes were proposed in this pull request? In `DiskBlockObjectWriter`, when some errors happened during writing, it will call `revertPartialWritesAndClose`, if this method again failed due to some issues like out of disk, it will throw exception without resetting the state of this writer, also skipping the revert. So here propose to fix this issue to offer user a chance to recover from such issue. ## How was this patch tested? Existing test. Author: jerryshao <sshao@hortonworks.com> Closes apache#16657 from jerryshao/SPARK-19306.
…pection occurred ## What changes were proposed in this pull request? In `DiskBlockObjectWriter`, when some errors happened during writing, it will call `revertPartialWritesAndClose`, if this method again failed due to some issues like out of disk, it will throw exception without resetting the state of this writer, also skipping the revert. So here propose to fix this issue to offer user a chance to recover from such issue. ## How was this patch tested? Existing test. Author: jerryshao <sshao@hortonworks.com> Closes apache#16657 from jerryshao/SPARK-19306.
What changes were proposed in this pull request?
In
DiskBlockObjectWriter
, when some errors happened during writing, it will callrevertPartialWritesAndClose
, if this method again failed due to some issues like out of disk, it will throw exception without resetting the state of this writer, also skipping the revert. So here propose to fix this issue to offer user a chance to recover from such issue.How was this patch tested?
Existing test.