[STREAMING][FLAKY-TEST] Catch execution context race condition in FileBasedWriteAheadLog.close() #9953

Closed
brkyvz wants to merge 1 commit from brkyvz/flaky-ss

Conversation

brkyvz
Contributor

@brkyvz brkyvz commented Nov 25, 2015

There is a race condition in FileBasedWriteAheadLog.close(): if deletes of old log files are still in progress when the write ahead log closes, the result may be a RejectedExecutionException. This is okay and should be handled gracefully.

Example test failures:
https://amplab.cs.berkeley.edu/jenkins/job/Spark-1.6-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop1.0,label=spark-test/95/testReport/junit/org.apache.spark.streaming.util/BatchedWriteAheadLogWithCloseFileAfterWriteSuite/BatchedWriteAheadLog___clean_old_logs/

The test fails because afterEach calls writeAheadLog.close while async deletes may still be in flight.
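
For illustration only, here is a minimal sketch of the pattern in plain Java. It is not the code from this patch: the class name `WalCloseRaceSketch` and the methods `cleanAsync` and `close` are hypothetical, and a single-thread `ExecutorService` stands in for the log's execution context. It only shows how a delete submitted after shutdown is rejected, and how catching the exception turns that into a no-op.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical stand-in for a write ahead log that deletes old files asynchronously.
class WalCloseRaceSketch {
    // Assumption: one background thread runs the async deletes.
    private final ExecutorService deletePool = Executors.newSingleThreadExecutor();

    /** Schedules an async delete; returns false if the log was already closed. */
    boolean cleanAsync(Runnable deleteTask) {
        try {
            deletePool.execute(deleteTask);
            return true;
        } catch (RejectedExecutionException e) {
            // close() already shut the pool down; skipping this delete is harmless.
            return false;
        }
    }

    /** Stops accepting new deletes; already-submitted ones still run. */
    void close() {
        deletePool.shutdown();
    }

    public static void main(String[] args) {
        WalCloseRaceSketch log = new WalCloseRaceSketch();
        System.out.println(log.cleanAsync(() -> { })); // submitted before close
        log.close();
        System.out.println(log.cleanAsync(() -> { })); // submitted after close
    }
}
```

Without the `catch`, the second call would propagate the `RejectedExecutionException` to the caller, which is the kind of failure the flaky test ran into.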

@tdas @zsxwing

@tdas
Contributor

tdas commented Nov 25, 2015

LGTM. @zsxwing what do you think?

@SparkQA

SparkQA commented Nov 25, 2015

Test build #46654 has finished for PR 9953 at commit 890e6dc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Member

zsxwing commented Nov 25, 2015

LGTM

@zsxwing
Member

zsxwing commented Nov 25, 2015

Thanks! Merging to master and 1.6

asfgit pushed a commit that referenced this pull request Nov 25, 2015
…leBasedWriteAheadLog.close()`

There is a race condition in `FileBasedWriteAheadLog.close()`: if deletes of old log files are still in progress when the write ahead log closes, the result may be a `RejectedExecutionException`. This is okay and should be handled gracefully.

Example test failures:
https://amplab.cs.berkeley.edu/jenkins/job/Spark-1.6-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop1.0,label=spark-test/95/testReport/junit/org.apache.spark.streaming.util/BatchedWriteAheadLogWithCloseFileAfterWriteSuite/BatchedWriteAheadLog___clean_old_logs/

The test fails because `afterEach` calls `writeAheadLog.close` while async deletes may still be in flight.

tdas zsxwing

Author: Burak Yavuz <brkyvz@gmail.com>

Closes #9953 from brkyvz/flaky-ss.

(cherry picked from commit a5d9887)
Signed-off-by: Shixiong Zhu <shixiong@databricks.com>
@asfgit asfgit closed this in a5d9887 Nov 25, 2015
@zsxwing
Member

zsxwing commented Nov 25, 2015

@brkyvz no JIRA for this one?

@brkyvz
Contributor Author

brkyvz commented Nov 25, 2015

@zsxwing No. We could create one though if we like

@brkyvz brkyvz deleted the flaky-ss branch February 3, 2019 20:59