[SPARK-13655] [STREAMING] [TESTS] Fix WithAggregationKinesisBackedBlockRDDSuite #11558

Closed
wants to merge 2 commits into from

srowen
Member

@srowen srowen commented Mar 7, 2016

What changes were proposed in this pull request?

Disable the Kinesis tests that consistently time out the build. Anything that triggers the Kinesis tests, such as #11481, currently hangs until the build times out.

This does not resolve the problem, just disables the test.

CC @brkyvz @tdas who may know more, having mostly authored / changed this code.

How was this patch tested?

Jenkins tests. As this only removes a test, it shouldn't be able to break anything. We'll see if this was all that's needed.

@SparkQA

SparkQA commented Mar 7, 2016

Test build #52551 has finished for PR 11558 at commit d1261fa.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Great! Thank you for trying to resolve this!

@SparkQA

SparkQA commented Mar 7, 2016

Test build #52581 has finished for PR 11558 at commit c3ce229.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor

@tdas @brkyvz, don't these tests run against a real Kinesis or something like that? Maybe the credentials are misconfigured in Jenkins? If you remember how the credentials were originally set up, please forward those details to me and I can try to port that setup onto the new build / Jenkins scripts in case that's the issue.

@zsxwing
Member

zsxwing commented Mar 7, 2016

@JoshRosen This one was not tested with your block manager changes :(.

I can reproduce the timeout failure locally. Here is the stack trace:

"pool-1-thread-1-ScalaTest-running-WithAggregationKinesisBackedBlockRDDSuite" #11 prio=5 os_prio=31 tid=0x00007fe519242800 nid=0x5703 in Object.wait() [0x0000000130000000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at org.apache.spark.storage.BlockInfoManager.lockForWriting(BlockInfoManager.scala:236)
    - locked <0x0000000700f35cd8> (a org.apache.spark.storage.BlockInfoManager)
    at org.apache.spark.storage.BlockManager.removeBlock(BlockManager.scala:1126)
    at org.apache.spark.streaming.kinesis.KinesisBackedBlockRDDTests$$anonfun$org$apache$spark$streaming$kinesis$KinesisBackedBlockRDDTests$$testRDD$7.apply(KinesisBackedBlockRDDSuite.scala:172)
    at org.apache.spark.streaming.kinesis.KinesisBackedBlockRDDTests$$anonfun$org$apache$spark$streaming$kinesis$KinesisBackedBlockRDDTests$$testRDD$7.apply(KinesisBackedBlockRDDSuite.scala:172)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.spark.streaming.kinesis.KinesisBackedBlockRDDTests.org$apache$spark$streaming$kinesis$KinesisBackedBlockRDDTests$$testRDD(KinesisBackedBlockRDDSuite.scala:172)
    at org.apache.spark.streaming.kinesis.KinesisBackedBlockRDDTests$$anonfun$3.apply$mcV$sp(KinesisBackedBlockRDDSuite.scala:105)
    at org.apache.spark.streaming.kinesis.KinesisBackedBlockRDDTests$$anonfun$3.apply(KinesisBackedBlockRDDSuite.scala:105)
    at org.apache.spark.streaming.kinesis.KinesisBackedBlockRDDTests$$anonfun$3.apply(KinesisBackedBlockRDDSuite.scala:105)

@zsxwing
Member

zsxwing commented Mar 7, 2016

@JoshRosen you can test it with your credentials using `ENABLE_KINESIS_TESTS=1 sbt/sbt -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Pkinesis-asl "project streaming-kinesis-asl" "test-only *WithAggregationKinesisBackedBlockRDDSuite"`.

@JoshRosen
Contributor

@zsxwing, thanks for the pointer. I think the problem here is that this suite shares a BlockManager and block IDs across test cases, and some of the test cases call BlockManager methods directly, outside of a task, to get blocks and then never release the locks they acquired.
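
To illustrate the failure mode described above, here is a minimal sketch of how a read lock that is never released can leave a later removal waiting, much like the `lockForWriting` frame in the stack trace. It assumes a toy model only: `ToyBlockStore`, `LeakedLockDemo`, and the block ID `"block-0"` are hypothetical names, the locking is built on `java.util.concurrent.locks.ReentrantReadWriteLock`, and none of this is Spark's actual `BlockInfoManager` logic. The removal uses a timeout so the demo returns instead of hanging.

```scala
import java.util.concurrent.TimeUnit
import java.util.concurrent.locks.ReentrantReadWriteLock
import scala.collection.mutable

// Hypothetical stand-in for per-block read/write locking; NOT Spark's BlockInfoManager.
// Single-threaded demo, so the map itself is not synchronized.
final class ToyBlockStore {
  private val locks = mutable.Map.empty[String, ReentrantReadWriteLock]

  private def lockFor(id: String): ReentrantReadWriteLock =
    locks.getOrElseUpdate(id, new ReentrantReadWriteLock())

  // Reading a block takes a read lock that the caller is expected to release.
  def get(id: String): Unit = lockFor(id).readLock().lock()

  def release(id: String): Unit = lockFor(id).readLock().unlock()

  // Removing a block needs the write lock. Give up after timeoutMs rather than
  // waiting forever, which is what an untimed wait would do here.
  def remove(id: String, timeoutMs: Long): Boolean = {
    val writeLock = lockFor(id).writeLock()
    if (writeLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
      try { locks.remove(id); true } finally writeLock.unlock()
    } else false
  }
}

object LeakedLockDemo {
  // Returns (removal result while the read lock is leaked, result after release).
  def run(): (Boolean, Boolean) = {
    val store = new ToyBlockStore
    store.get("block-0")                                   // read lock taken, not released
    val removedWhileLeaked = store.remove("block-0", 100)  // cannot get the write lock
    store.release("block-0")                               // the missing cleanup step
    val removedAfterRelease = store.remove("block-0", 100)
    (removedWhileLeaked, removedAfterRelease)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

In this toy model the first removal fails and the second succeeds once the read lock has been released. With an untimed wait in place of `tryLock`, the first call would simply never return.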

What do you think about refactoring this suite so that the state is completely reset between tests?
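
One way to picture resetting state between tests is the loan pattern, where each test body receives a freshly built fixture that is discarded afterwards. The sketch below is only an illustration of that idea in plain Scala. `Fixture`, `heldLocks`, and `withFreshFixture` are hypothetical names, and this is not necessarily how the suite or #11564 does it.

```scala
import scala.collection.mutable

// Hypothetical per-test state; stands in for cached blocks and any locks a test holds.
final class Fixture {
  val heldLocks: mutable.Set[String] = mutable.Set.empty
}

object IsolatedTests {
  // Loan pattern: build a fresh fixture for each test body and clean it up afterwards,
  // so nothing a test leaves behind is visible to the next one.
  def withFreshFixture[A](body: Fixture => A): A = {
    val fixture = new Fixture
    try body(fixture)
    finally fixture.heldLocks.clear()
  }

  // Two "tests": the first leaks a lock, the second still starts from a clean state.
  def run(): Seq[Int] = {
    val first = withFreshFixture { f => f.heldLocks += "block-0"; f.heldLocks.size }
    val second = withFreshFixture { f => f.heldLocks.size }
    Seq(first, second)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

The trade-off is extra setup cost per test in exchange for tests that cannot interfere with each other through leftover state.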

@JoshRosen
Contributor

Here's an alternate fix: #11564

asfgit pushed a commit that referenced this pull request Mar 7, 2016
…DSuite

This patch modifies `KinesisBackedBlockRDDTests` to increase the isolation between tests in order to fix a bug which causes the tests to hang.

See #11558 for more details.

/cc zsxwing srowen

Author: Josh Rosen <joshrosen@databricks.com>

Closes #11564 from JoshRosen/SPARK-13655.

@JoshRosen
Contributor

@srowen, we merged my fix in #11564, so I think you can close this now.

@srowen
Member Author

srowen commented Mar 7, 2016

Nice! An actual fix.

@srowen srowen closed this Mar 7, 2016
@srowen srowen deleted the SPARK-13655 branch March 7, 2016 22:01