Skip to content

[SPARK-3054][STREAMING] Add unit tests for Spark Sink. #1958

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10 commits into from

Conversation

harishreedharan
Copy link
Contributor

This patch adds unit tests for Spark Sink.

It also removes the private[flume] for Spark Sink,
since the sink is instantiated from Flume configuration (looks like this is ignored by reflection which is used by
Flume, but we should still remove it anyway).

This patch adds unit tests for Spark Sink.

It also removes the private[flume] for Spark Sink,
since the sink is instantiated from Flume configuration (looks like this is ignored by reflection which is used by
Flume, but we should still remove it anyway).
@SparkQA
Copy link

SparkQA commented Aug 15, 2014

QA tests have started for PR 1958. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18583/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 15, 2014

QA results for PR 1958:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18583/consoleFull



/*
* Licensed to the Apache Software Foundation (ASF) under one or more
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Hari, ASF header should be at the top of file :).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Done.

@SparkQA
Copy link

SparkQA commented Aug 15, 2014

QA tests have started for PR 1958 at commit f2c56c9.

  • This patch merges cleanly.

@harishreedharan
Copy link
Contributor Author

Jenkins, test this please

@SparkQA
Copy link

SparkQA commented Aug 15, 2014

QA tests have started for PR 1958 at commit f2c56c9.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Aug 15, 2014

QA tests have finished for PR 1958 at commit f2c56c9.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@harishreedharan
Copy link
Contributor Author

Failure is unrelated:
[info] SparkSinkSuite:
[info] - Success test
[info] - Nack
[info] - Timeout
[info] - Multiple consumers
[info] - Multiple consumers With Some Failures
[info] ScalaTest
[info] Run completed in 10 minutes, 40 seconds.
[info] Total number of tests run: 5
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 5, failed 0, canceled 0, ignored 0, pending 0

Failures are in the SparkSubmitSuite:
[info] 2 TESTS FAILED
[error] Failed: Total 829, Failed 2, Errors 0, Passed 827, Ignored 6
[error] Failed tests:
[error] org.apache.spark.deploy.SparkSubmitSuite
error sbt.TestsFailedException: Tests unsuccessful

@SparkQA
Copy link

SparkQA commented Aug 15, 2014

QA tests have started for PR 1958 at commit 7b9b649.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Aug 15, 2014

QA tests have finished for PR 1958 at commit 7b9b649.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@harishreedharan
Copy link
Contributor Author

Same failures as before - not caused by this patch

@harishreedharan
Copy link
Contributor Author

Jenkins, test this please

@SparkQA
Copy link

SparkQA commented Aug 18, 2014

QA tests have started for PR 1958 at commit 7fedc5a.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Aug 18, 2014

QA tests have finished for PR 1958 at commit 7fedc5a.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -53,7 +53,6 @@ import org.apache.flume.sink.AbstractSink
*
*/

private[flume]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this removed? We dont want to expose this as a public class as this class will then appear in the Scala docs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this class would be called from Flume. Flume will create an instance of this class to run the sink - so theoretically it should not be private to this package.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, can you add a line right at the top that this class is not intended to be used inside Spark application. Just in case it appears in the scala docs / java docs. I will try to see how to eliminate this module from appearing in the docs.

@SparkQA
Copy link

SparkQA commented Aug 18, 2014

QA tests have started for PR 1958 at commit c9190d1.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Aug 18, 2014

QA tests have finished for PR 1958 at commit c9190d1.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tdas
Copy link
Contributor

tdas commented Aug 19, 2014

unit test is failing.

transAndClient.foreach(x => x._1.close())
}

def initializeChannelAndSink(overrides: Option[Map[String, String]]): (MemoryChannel,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This is a little weird. Option is not necessary (add None and Some unnecessarily increases verbosity), as it can simply be def initializeChannelAndSink(overrides: Map[String, String] = Map.empty)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, add private to keep it consistent to other methods.

@harishreedharan
Copy link
Contributor Author

Looks like the tests have passed.

@tdas
Copy link
Contributor

tdas commented Aug 19, 2014

Right, good. But I want to run it another couple of times to test its flakiness.

@SparkQA
Copy link

SparkQA commented Aug 19, 2014

QA tests have finished for PR 1958 at commit e3110b9.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tdas
Copy link
Contributor

tdas commented Aug 19, 2014

Jenkins, test this.

@harishreedharan
Copy link
Contributor Author

Jenkins, test this please

1 similar comment
@harishreedharan
Copy link
Contributor Author

Jenkins, test this please

@SparkQA
Copy link

SparkQA commented Aug 20, 2014

QA tests have started for PR 1958 at commit e3110b9.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Aug 20, 2014

QA tests have finished for PR 1958 at commit e3110b9.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@harishreedharan
Copy link
Contributor Author

Could not find Apache license headers in the following files:
!????? /home/jenkins/workspace/SparkPullRequestBuilder/mllib/checkpoint/.temp.crc
!????? /home/jenkins/workspace/SparkPullRequestBuilder/mllib/checkpoint/temp

No idea what is happening here.

@tdas
Copy link
Contributor

tdas commented Aug 20, 2014

Jenkins, test this please.

@SparkQA
Copy link

SparkQA commented Aug 20, 2014

QA tests have started for PR 1958 at commit e3110b9.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Aug 20, 2014

QA tests have finished for PR 1958 at commit e3110b9.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tdas
Copy link
Contributor

tdas commented Aug 20, 2014

Jenkins, test this please.

@SparkQA
Copy link

SparkQA commented Aug 20, 2014

QA tests have started for PR 1958 at commit e3110b9.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Aug 20, 2014

QA tests have finished for PR 1958 at commit e3110b9.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tdas
Copy link
Contributor

tdas commented Aug 20, 2014

Alright, I am merging this.

asfgit pushed a commit that referenced this pull request Aug 20, 2014
This patch adds unit tests for Spark Sink.

It also removes the private[flume] for Spark Sink,
since the sink is instantiated from Flume configuration (looks like this is ignored by reflection which is used by
Flume, but we should still remove it anyway).

Author: Hari Shreedharan <hshreedharan@apache.org>
Author: Hari Shreedharan <hshreedharan@cloudera.com>

Closes #1958 from harishreedharan/spark-sink-test and squashes the following commits:

e3110b9 [Hari Shreedharan] Add a sleep to allow sink to commit the transactions
120b81e [Hari Shreedharan] Fix complexity in threading model in test
4df5be6 [Hari Shreedharan] Merge remote-tracking branch 'asf/master' into spark-sink-test
c9190d1 [Hari Shreedharan] Indentation and spaces changes
7fedc5a [Hari Shreedharan] Merge remote-tracking branch 'asf/master' into spark-sink-test
abc20cb [Hari Shreedharan] Minor test changes
7b9b649 [Hari Shreedharan] Merge branch 'master' into spark-sink-test
f2c56c9 [Hari Shreedharan] Update SparkSinkSuite.scala
a24aac8 [Hari Shreedharan] Remove unused var
c86d615 [Hari Shreedharan] [SPARK-3054][STREAMING] Add unit tests for Spark Sink.

(cherry picked from commit 8c5a222)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
@asfgit asfgit closed this in 8c5a222 Aug 20, 2014
@tgravescs
Copy link
Contributor

@tdas @harishreedharan This is causing yarn builds to fail on branch-1.1.

I think you are missing commit: 95470a0#diff-74323d4d6986ea21770bf3f49c091e5b

/external/flume-sink/src/test/scala/org/apache/spark/streaming/flume/sink/SparkSinkSuite.scala:66: value getPort is not a member of org.apache.spark.streaming.flume.sink.SparkSink
[ERROR] val port = sink.getPort

@harishreedharan
Copy link
Contributor Author

Correct. This patch uses a method which was introduced in that one.

@tgravescs
Copy link
Contributor

I see @pwendell temporarily removed the test: 1d5e84a#diff-d41d8cd98f00b204e9800998ecf8427e

@tdas @harishreedharan can the other commit go in or can you please file jira to followup.

@harishreedharan
Copy link
Contributor Author

I believe the commit which added the getPort() method had merge conflicts in the 1.1 branch, which is why @tdas did not backport it. We can probably backport all commits that went into the flume-sink module to take care of this.

@tdas
Copy link
Contributor

tdas commented Aug 22, 2014

I have backported this to branch-1.1. However @pwendell still found some
other mysterious issues with SparkSinkSuite that broke the maven build, so
he deleted sparkSinkSuite temporarily to create another snapshot release of
1.1. I am investigating this right now.

TD

On Thu, Aug 21, 2014 at 10:37 AM, Hari Shreedharan <notifications@github.com

wrote:

I believe the commit which added the getPort() method had merge conflicts
in the 1.1 branch, which is why @tdas https://github.com/tdas did not
backport it. We can probably backport all commits that went into the
flume-sink module to take care of this.


Reply to this email directly or view it on GitHub
#1958 (comment).

xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
This patch adds unit tests for Spark Sink.

It also removes the private[flume] for Spark Sink,
since the sink is instantiated from Flume configuration (looks like this is ignored by reflection which is used by
Flume, but we should still remove it anyway).

Author: Hari Shreedharan <hshreedharan@apache.org>
Author: Hari Shreedharan <hshreedharan@cloudera.com>

Closes apache#1958 from harishreedharan/spark-sink-test and squashes the following commits:

e3110b9 [Hari Shreedharan] Add a sleep to allow sink to commit the transactions
120b81e [Hari Shreedharan] Fix complexity in threading model in test
4df5be6 [Hari Shreedharan] Merge remote-tracking branch 'asf/master' into spark-sink-test
c9190d1 [Hari Shreedharan] Indentation and spaces changes
7fedc5a [Hari Shreedharan] Merge remote-tracking branch 'asf/master' into spark-sink-test
abc20cb [Hari Shreedharan] Minor test changes
7b9b649 [Hari Shreedharan] Merge branch 'master' into spark-sink-test
f2c56c9 [Hari Shreedharan] Update SparkSinkSuite.scala
a24aac8 [Hari Shreedharan] Remove unused var
c86d615 [Hari Shreedharan] [SPARK-3054][STREAMING] Add unit tests for Spark Sink.
@harishreedharan harishreedharan deleted the spark-sink-test branch September 25, 2014 03:48
szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Aug 7, 2024
* rdar://127304026 Update UC-Spark-Authz Plugin to 0.2.1

* rdar://127304026 Update UC-Spark-Authz Plugin to 0.2.2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants