Hadoop agnostic builds #838
Conversation
… => SparkHadoopMapReduceUtil
Thank you for submitting this pull request. All automated tests for this request have passed. Refer to this link for build results: http://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/609/
Thank you for submitting this pull request. Unfortunately, the automated tests for this request have failed. Refer to this link for build results: http://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/613/
Jenkins, retest this please.
Thank you for submitting this pull request. All automated tests for this request have passed. Refer to this link for build results: http://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/623/
Thank you for submitting this pull request. All automated tests for this request have passed. Refer to this link for build results: http://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/627/
(I meant: the Maven build is having problems with 0.23.x)
Thank you for submitting this pull request. All automated tests for this request have passed. Refer to this link for build results: http://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/643/
Thank you for submitting this pull request. Unfortunately, the automated tests for this request have failed. Refer to this link for build results: http://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/645/
Jenkins, retest this please.
Thank you for submitting this pull request. All automated tests for this request have passed. Refer to this link for build results: http://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/653/
Seq(
  "org.apache.hadoop" % "hadoop-yarn-api" % hadoopVersion excludeAll(excludeJackson, excludeNetty, excludeAsm),
  "org.apache.hadoop" % "hadoop-yarn-common" % hadoopVersion excludeAll(excludeJackson, excludeNetty, excludeAsm),
  "org.apache.hadoop" % "hadoop-yarn-client" % hadoopVersion excludeAll(excludeJackson, excludeNetty, excludeAsm)
Hey Jey, just to understand, this means that users who link to us when running on hadoop 0.23.x have to also add these to their project in addition to hadoop-client version 0.23.x?
Nah, because the issue of explicitly linking against the hadoop libs only applies to non-YARN builds. That does bring up another issue though: right now the spark-core artifact will by default be built with a dependency on hadoop >= 1.2.1. I'll look into figuring out how to specify a more accurate set of constraints to the POM dependency mechanism.
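As a sketch of what that looks like on the user side (purely illustrative; the spark-core coordinates and version strings below are assumptions, not something this PR publishes), a downstream sbt project could exclude the default Hadoop client and pin its own:

// Hypothetical downstream build.sbt: swap the hadoop-client that spark-core
// pulls in by default for a different version. Group IDs and version
// numbers here are illustrative assumptions only.
libraryDependencies ++= Seq(
  ("org.spark-project" %% "spark-core" % "0.8.0-SNAPSHOT")
    .exclude("org.apache.hadoop", "hadoop-client"),
  "org.apache.hadoop" % "hadoop-client" % "0.23.9"
)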
That dependency is fine for spark-core. The main thing is to document what else users should add to use a newer Hadoop. (E.g. They'd add a newer hadoop-client, but they may also have to add this yarn stuff).
Matei
On Aug 19, 2013, at 12:12 PM, Jey Kottalam notifications@github.com wrote:
In project/SparkBuild.scala:
"org.apache.hadoop" % "hadoop-core" % HADOOP_VERSION excludeAll(excludeJackson, excludeNetty, excludeAsm),
"org.apache.hadoop" % "hadoop-client" % HADOOP_VERSION excludeAll(excludeJackson, excludeNetty, excludeAsm)
)
}
} else {
Seq("org.apache.hadoop" % "hadoop-core" % HADOOP_VERSION excludeAll(excludeJackson, excludeNetty) )
}),
- unmanagedSourceDirectories in Compile <+= baseDirectory{ _ /
( if (HADOOP_YARN && HADOOP_MAJOR_VERSION == "2") {
"src/hadoop2-yarn/scala"
if (isYarnMode) {
// This kludge is needed for 0.23.x
Seq(
"org.apache.hadoop" % "hadoop-yarn-api" % hadoopVersion excludeAll(excludeJackson, excludeNetty, excludeAsm),
"org.apache.hadoop" % "hadoop-yarn-common" % hadoopVersion excludeAll(excludeJackson, excludeNetty, excludeAsm),
  "org.apache.hadoop" % "hadoop-yarn-client" % hadoopVersion excludeAll(excludeJackson, excludeNetty, excludeAsm)
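To make Matei's point concrete, a downstream project targeting Hadoop 0.23.x under YARN would add roughly the following on top of its Spark dependency (a sketch only; the 0.23.9 version string is an assumption):

// Hypothetical additions to a user's build.sbt for Hadoop 0.23.x on YARN:
// a matching hadoop-client plus the YARN artifacts discussed above.
val userHadoopVersion = "0.23.9"  // illustrative version, not prescribed by this PR

libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-client"      % userHadoopVersion,
  "org.apache.hadoop" % "hadoop-yarn-api"    % userHadoopVersion,
  "org.apache.hadoop" % "hadoop-yarn-common" % userHadoopVersion,
  "org.apache.hadoop" % "hadoop-yarn-client" % userHadoopVersion
)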
Hey Jey, I tested this and it looks good, though I had that question above.
Thank you for submitting this pull request. All automated tests for this request have passed. Refer to this link for build results: http://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/658/
Thanks for putting this together, Jey. I've merged it manually due to a small conflict.
@mateiz @jey - It's very unfortunate that this got merged without any documentation or notification to developers. This will affect many downstream things (tests, anyone running off of master, or building things on top of master, the ec2 scripts, etc). Also, some of the existing documentation, such as
Have you tried running mvn package? I am getting the following error: *** RUN ABORTED ***
Did you do mvn clean and sbt clean? Sounds like an old build issue.
But yes I agree with Patrick on the docs -- I shouldn't have merged this without looking at that and trying Shark as well, so we know what will break there. Sorry about that.
I tried running mvn dependency:tree after I removed .m2 and .ivy2 and sbt clean and mvn clean. Got the following error: [INFO] Reactor Summary:
@rxin, it's my understanding that this is "normal" for running
@rxin: actually, apparently Maven in its infinite wisdom requires
alright thanks @jey. that worked (although a little bit convoluted...)
… have been merged into Spark master. See mesos/spark#838
Fix build problems due to mesos/spark#838
`lateral_view_outer` query sometimes returns a different set of 10 rows. Author: Tathagata Das <tathagata.das1565@gmail.com> Closes mesos#838 from tdas/hive-test-fix2 and squashes the following commits: 9128a0d [Tathagata Das] Blacklisted flaky HiveCompatibility test.
This PR allows one Spark binary to target multiple Hadoop versions. It also moves YARN support into a separate artifact. This is the follow-up to PR #803.
CC: @mateiz, @mridulm, @tgravescs
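For readers skimming the diff fragments above, the overall pattern can be sketched roughly as follows (simplified; the environment-variable names and the 1.2.1 default are assumptions for illustration, not the exact SparkBuild.scala contents):

// Simplified sketch of Hadoop-agnostic dependency selection inside an sbt
// build definition. Environment-variable names and the default version are
// illustrative assumptions.
val hadoopVersion = sys.env.getOrElse("SPARK_HADOOP_VERSION", "1.2.1")
val isYarnMode    = sys.env.get("SPARK_YARN").exists(_ == "true")

val hadoopDependencies =
  if (isYarnMode) {
    // YARN builds pull in the hadoop-yarn-* artifacts alongside hadoop-client.
    Seq(
      "org.apache.hadoop" % "hadoop-client"      % hadoopVersion,
      "org.apache.hadoop" % "hadoop-yarn-api"    % hadoopVersion,
      "org.apache.hadoop" % "hadoop-yarn-common" % hadoopVersion,
      "org.apache.hadoop" % "hadoop-yarn-client" % hadoopVersion
    )
  } else {
    // Non-YARN builds only need hadoop-client for the chosen version.
    Seq("org.apache.hadoop" % "hadoop-client" % hadoopVersion)
  }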