Skip to content

Conversation

@zsxwing
Copy link
Member

@zsxwing zsxwing commented May 5, 2015

Test failure: https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=centos/2240/testReport/junit/org.apache.spark.scheduler/DAGSchedulerSuite/run_shuffle_with_map_stage_failure/

This is because many tests share the same JobListener. Because after each test, scheduler isn't stopped. So actually it's still running. When running the test run shuffle with map stage failure, some previous test may trigger ResubmitFailedStages logic, and report jobFailed and override the global failure variable.

This PR uses after to call scheduler.stop() for each test.

…lerSuite

Test failure: https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=centos/2240/testReport/junit/org.apache.spark.scheduler/DAGSchedulerSuite/run_shuffle_with_map_stage_failure/

This is because all tests share the same `JobListener`. Because after each test, `scheduler` isn't stopped. So actually it's still running. When running the test `run shuffle with map stage failure`, some previous test may trigger `ResubmitFailedStages` logic and override the global `failure` variable.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to cancel it, or scheduler.stop() will trigger jobFailed and make this test fail.

@SparkQA
Copy link

SparkQA commented May 5, 2015

Test build #31832 has finished for PR 5903 at commit 1e6f13e.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Copy link
Member Author

zsxwing commented May 5, 2015

retest this please.

@SparkQA
Copy link

SparkQA commented May 5, 2015

Test build #31833 has finished for PR 5903 at commit 1e6f13e.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Copy link
Member Author

zsxwing commented May 5, 2015

retest this please.

@SparkQA
Copy link

SparkQA commented May 5, 2015

Test build #31839 has finished for PR 5903 at commit 1e6f13e.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented May 5, 2015

Jenkins retest this please.

@SparkQA
Copy link

SparkQA commented May 5, 2015

Test build #31873 has finished for PR 5903 at commit 1e6f13e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented May 5, 2015

LGTM since the test starts a scheduler in before so should stop it in after. And yes cancelling the outstanding job seems OK. Tests pass.

@asfgit asfgit closed this in 5ffc73e May 5, 2015
asfgit pushed a commit that referenced this pull request May 5, 2015
… stage failure' in DAGSchedulerSuite

Test failure: https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=centos/2240/testReport/junit/org.apache.spark.scheduler/DAGSchedulerSuite/run_shuffle_with_map_stage_failure/

This is because many tests share the same `JobListener`. Because after each test, `scheduler` isn't stopped. So actually it's still running. When running the test `run shuffle with map stage failure`, some previous test may trigger [ResubmitFailedStages](https://github.com/apache/spark/blob/ebc25a4ddfe07a67668217cec59893bc3b8cf730/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1120) logic, and report `jobFailed` and override the global `failure` variable.

This PR uses `after` to call `scheduler.stop()` for each test.

Author: zsxwing <zsxwing@gmail.com>

Closes #5903 from zsxwing/SPARK-5074 and squashes the following commits:

1e6f13e [zsxwing] Fix the flakey test 'run shuffle with map stage failure' in DAGSchedulerSuite

(cherry picked from commit 5ffc73e)
Signed-off-by: Sean Owen <sowen@cloudera.com>
asfgit pushed a commit that referenced this pull request May 5, 2015
… stage failure' in DAGSchedulerSuite

Test failure: https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=centos/2240/testReport/junit/org.apache.spark.scheduler/DAGSchedulerSuite/run_shuffle_with_map_stage_failure/

This is because many tests share the same `JobListener`. Because after each test, `scheduler` isn't stopped. So actually it's still running. When running the test `run shuffle with map stage failure`, some previous test may trigger [ResubmitFailedStages](https://github.com/apache/spark/blob/ebc25a4ddfe07a67668217cec59893bc3b8cf730/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1120) logic, and report `jobFailed` and override the global `failure` variable.

This PR uses `after` to call `scheduler.stop()` for each test.

Author: zsxwing <zsxwing@gmail.com>

Closes #5903 from zsxwing/SPARK-5074 and squashes the following commits:

1e6f13e [zsxwing] Fix the flakey test 'run shuffle with map stage failure' in DAGSchedulerSuite

(cherry picked from commit 5ffc73e)
Signed-off-by: Sean Owen <sowen@cloudera.com>
@zsxwing zsxwing deleted the SPARK-5074 branch May 5, 2015 14:06
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
… stage failure' in DAGSchedulerSuite

Test failure: https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=centos/2240/testReport/junit/org.apache.spark.scheduler/DAGSchedulerSuite/run_shuffle_with_map_stage_failure/

This is because many tests share the same `JobListener`. Because after each test, `scheduler` isn't stopped. So actually it's still running. When running the test `run shuffle with map stage failure`, some previous test may trigger [ResubmitFailedStages](https://github.com/apache/spark/blob/ebc25a4ddfe07a67668217cec59893bc3b8cf730/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1120) logic, and report `jobFailed` and override the global `failure` variable.

This PR uses `after` to call `scheduler.stop()` for each test.

Author: zsxwing <zsxwing@gmail.com>

Closes apache#5903 from zsxwing/SPARK-5074 and squashes the following commits:

1e6f13e [zsxwing] Fix the flakey test 'run shuffle with map stage failure' in DAGSchedulerSuite
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
… stage failure' in DAGSchedulerSuite

Test failure: https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=centos/2240/testReport/junit/org.apache.spark.scheduler/DAGSchedulerSuite/run_shuffle_with_map_stage_failure/

This is because many tests share the same `JobListener`. Because after each test, `scheduler` isn't stopped. So actually it's still running. When running the test `run shuffle with map stage failure`, some previous test may trigger [ResubmitFailedStages](https://github.com/apache/spark/blob/ebc25a4ddfe07a67668217cec59893bc3b8cf730/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1120) logic, and report `jobFailed` and override the global `failure` variable.

This PR uses `after` to call `scheduler.stop()` for each test.

Author: zsxwing <zsxwing@gmail.com>

Closes apache#5903 from zsxwing/SPARK-5074 and squashes the following commits:

1e6f13e [zsxwing] Fix the flakey test 'run shuffle with map stage failure' in DAGSchedulerSuite
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
… stage failure' in DAGSchedulerSuite

Test failure: https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=centos/2240/testReport/junit/org.apache.spark.scheduler/DAGSchedulerSuite/run_shuffle_with_map_stage_failure/

This is because many tests share the same `JobListener`. Because after each test, `scheduler` isn't stopped. So actually it's still running. When running the test `run shuffle with map stage failure`, some previous test may trigger [ResubmitFailedStages](https://github.com/apache/spark/blob/ebc25a4ddfe07a67668217cec59893bc3b8cf730/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1120) logic, and report `jobFailed` and override the global `failure` variable.

This PR uses `after` to call `scheduler.stop()` for each test.

Author: zsxwing <zsxwing@gmail.com>

Closes apache#5903 from zsxwing/SPARK-5074 and squashes the following commits:

1e6f13e [zsxwing] Fix the flakey test 'run shuffle with map stage failure' in DAGSchedulerSuite
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants