[WIP][SPARK-29250][BUILD][test-hadoop3.2][test-maven] Upgrade to Hadoop 3.2.1 #25932
@@ -22,6 +22,8 @@ automaton-1.11-8.jar
avro-1.8.2.jar
avro-ipc-1.8.2.jar
avro-mapred-1.8.2-hadoop2.jar
bcpkix-jdk15on-1.60.jar
bcprov-jdk15on-1.60.jar
FYI, we already have the following in NOTICE-binary. We may need to remove the word "optionally", or to exclude these two like before.
This product optionally depends on 'Bouncy Castle Crypto APIs' to generate
a temporary self-signed X.509 certificate when the JVM does not provide the
equivalent functionality. It can be obtained at:
* LICENSE:
* license/LICENSE.bouncycastle.txt (MIT License)
* HOMEPAGE:
* http://www.bouncycastle.org/
@srowen, could you give me some advice?
There are a few parts here. First, that NOTICE statement is from Hadoop's NOTICE, so I'd copy whatever it says now, to update.
Second, if it's a first-class dependency now, it needs to have a line in LICENSE-binary and a copy of the license in licenses-binary/. It's MIT-licensed so should be OK.
Finally, BC is a special case because it's subject to crypto export laws. We will have to update http://www.apache.org/licenses/exports/ to say that it's again a dependency in 3.0. I can go figure that out again as and when this is merged.
Oops. That sounds like too much for me. Please help me after I merge this. 😄
For the record, the process for the last part is https://www.apache.org/dev/crypto.html#sources
I can do it afterwards; it's not hard.
However, hm, I wonder if Hadoop needs a similar disclosure at http://www.apache.org/licenses/exports/ ? It's possible that somehow it isn't distributed directly by Hadoop, but I'd be surprised, given that it's a first-class dep and Hadoop makes binary releases.
Maybe, eh, CC @steveloughran in case he knows anything about this angle.
ooh, it is in the binary isn't it
hadoop-3.2.1 find . -print | grep bcp
./share/hadoop/yarn/lib/bcprov-jdk15on-1.60.jar
./share/hadoop/yarn/lib/bcpkix-jdk15on-1.60.jar
let me chase this up
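For anyone who wants to repeat the check above without a shell, here is a minimal Python sketch that walks an unpacked distribution and lists Bouncy Castle jars. The function name `find_bouncy_castle_jars` and the regular expression are assumptions made for illustration; only the `bcprov`/`bcpkix` prefixes come from the listing above.

```python
import os
import re

# Matches jar names such as bcprov-jdk15on-1.60.jar or bcpkix-jdk15on-1.60.jar.
# The prefixes come from the listing above; the rest of the pattern is an assumption.
BC_JAR = re.compile(r"^bc(prov|pkix)-.*\.jar$")


def find_bouncy_castle_jars(root):
    """Return sorted paths (relative to root) of Bouncy Castle jars under root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if BC_JAR.match(name):
                full_path = os.path.join(dirpath, name)
                hits.append(os.path.relpath(full_path, root))
    return sorted(hits)
```

Calling `find_bouncy_castle_jars(".")` from the root of an unpacked distribution should report roughly what `find . -print | grep bcp` reports, although the shell command also matches any other path containing `bcp`.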
Thank you, @srowen and @steveloughran.
I will use this chance to learn this legal process. :)
Test build #111375 has finished for PR 25932 at commit
Retest this please.
Test build #111390 has finished for PR 25932 at commit
Retest this please.
Retest this please.
Test build #111404 has finished for PR 25932 at commit
Test build #111403 has finished for PR 25932 at commit
Retest this please.
Retest this please.
Test build #111436 has finished for PR 25932 at commit
Test build #111439 has finished for PR 25932 at commit
Hi, all. I've been trying to fix the failures caused by Hadoop's Guava dependency update, but have had no luck so far. I'll close this one for now.
@dongjoon-hyun Hi, is there a JIRA to track this issue about Hadoop's Guava dependency update?
What do you mean? We are tracking it here, SPARK-29250.
@ouyangxiaochen @dongjoon-hyun I ran into this today, and this is the issue: Hadoop 3.2.1 updates from Guava 11 to Guava 27. I think we may need to match this, not least because Guava is such a problem dependency that updates probably have to happen in a major release. (Kind of surprised to see that in a maintenance release of Hadoop.) Previously we'd been reluctant to vary from Hadoop, but for the Hadoop 3 profiles, it seems like we need to try? Do you want to try that in this PR as part of moving it along?
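A version jump like this one can be spotted by comparing two dependency manifests of the kind shown in the diff at the top of this PR. The following is a minimal Python sketch, not part of Spark's build tooling: the names `parse_manifest` and `changed_versions` are hypothetical, and the name/version split is a heuristic that assumes a version begins at the first hyphen followed by a digit.

```python
import re

# Heuristic split of "name-version.jar": the version is assumed to start at
# the first hyphen that is followed by a digit.
JAR_NAME = re.compile(r"^(?P<name>.+?)-(?P<version>\d.*)\.jar$")


def parse_manifest(lines):
    """Map artifact name to version for each jar file name in lines."""
    artifacts = {}
    for line in lines:
        match = JAR_NAME.match(line.strip())
        if match:
            artifacts[match.group("name")] = match.group("version")
    return artifacts


def changed_versions(old_lines, new_lines):
    """Return {name: (old_version, new_version)} for artifacts whose version differs."""
    old = parse_manifest(old_lines)
    new = parse_manifest(new_lines)
    return {
        name: (old[name], new[name])
        for name in old.keys() & new.keys()
        if old[name] != new[name]
    }
```

For example, with illustrative file names (not taken from the actual manifests), `changed_versions(["guava-11.0.2.jar"], ["guava-27.0-jre.jar"])` reports the `guava` entry as changed. Artifacts present in only one manifest, such as newly added jars, are deliberately ignored by this sketch.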
Yes, I know, @srowen. It would be great, but I'm not sure we can escape it.
cc @gatorsmile, @wangyum
Heh, yeah, I started working on a "Guava 27" branch and the changes are non-trivial. I think it will require us to simply avoid a lot of Guava usage with various workarounds. Well, I may take this on in the short term, as I think we have to wait on further Scala 2.13 updates, and I think the Scala 2.13 update, via Kafka 2.4, might force this anyway. I'll work on it since, if there is a change here, we'd best do it for Spark 3.0.
Thank you for working on this, @srowen!
See #26911 for at least directly removing some exposure to Guava.
@dongjoon-hyun #26911 is merged, so you can try making a
I have some spare time. Let me try in #27009.
I think we may be able to upgrade to Hadoop 3.2.1 by switching to the shaded Hadoop client jars. I've created a PR for this: #29843
What changes were proposed in this pull request?
This PR aims to upgrade the Hadoop version from 3.2.0 to 3.2.1 in the hadoop-3.2 profile.
Why are the changes needed?
Hadoop 3.2.1 has 493 patches including client bug fixes and improvements.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Passes Jenkins with the existing tests.
For the dependency manifest, this PR was tested on both JDK8 and JDK11. There is no difference based on JDK version.