Skip to content

[SPARK-33212][BUILD] Move to shaded clients for Hadoop 3.x profile #29843

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 20 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion common/network-yarn/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,13 @@
<!-- Provided dependencies -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<artifactId>${hadoop-client-api.artifact}</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>${hadoop-client-runtime.artifact}</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
Expand Down
16 changes: 15 additions & 1 deletion core/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,13 @@
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<artifactId>${hadoop-client-api.artifact}</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>${hadoop-client-runtime.artifact}</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
Expand Down Expand Up @@ -177,6 +183,14 @@
<groupId>org.apache.commons</groupId>
<artifactId>commons-text</artifactId>
</dependency>
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
</dependency>
<dependency>
<groupId>commons-collections</groupId>
<artifactId>commons-collections</artifactId>
</dependency>
<dependency>
<groupId>com.google.code.findbugs</groupId>
<artifactId>jsr305</artifactId>
Expand Down
8 changes: 5 additions & 3 deletions core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
Original file line number Diff line number Diff line change
Expand Up @@ -1182,10 +1182,12 @@ private[spark] object SparkSubmitUtils {
def resolveDependencyPaths(
artifacts: Array[AnyRef],
cacheDirectory: File): String = {
artifacts.map { artifactInfo =>
val artifact = artifactInfo.asInstanceOf[Artifact].getModuleRevisionId
artifacts.map { ai =>
val artifactInfo = ai.asInstanceOf[Artifact]
val artifact = artifactInfo.getModuleRevisionId
val testSuffix = if (artifactInfo.getType == "test-jar") "-tests" else ""
cacheDirectory.getAbsolutePath + File.separator +
s"${artifact.getOrganisation}_${artifact.getName}-${artifact.getRevision}.jar"
s"${artifact.getOrganisation}_${artifact.getName}-${artifact.getRevision}${testSuffix}.jar"
}.mkString(",")
}

Expand Down
3 changes: 1 addition & 2 deletions dev/deps/spark-deps-hadoop-2.7-hive-2.3
Original file line number Diff line number Diff line change
Expand Up @@ -126,7 +126,7 @@ javax.inject/1//javax.inject-1.jar
javax.jdo/3.2.0-m3//javax.jdo-3.2.0-m3.jar
javax.servlet-api/3.1.0//javax.servlet-api-3.1.0.jar
javolution/5.5.1//javolution-5.5.1.jar
jaxb-api/2.2.2//jaxb-api-2.2.2.jar
jaxb-api/2.2.11//jaxb-api-2.2.11.jar
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we don't touch hadoop 2.7 deps, why do we need to change this file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a side-effect from the default Hadoop 3.x profile: since the shaded client jars exclude 3rd party dependencies include this, we'd have to explicitly import it even for 2.x profile. With some Maven trick maybe we can avoid the change but I'm not sure if this is a big deal.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a concern, and I just want to check why this is changed.

jaxb-runtime/2.3.2//jaxb-runtime-2.3.2.jar
jcl-over-slf4j/1.7.30//jcl-over-slf4j-1.7.30.jar
jdo-api/3.0.1//jdo-api-3.0.1.jar
Expand Down Expand Up @@ -226,7 +226,6 @@ spire-macros_2.12/0.17.0-M1//spire-macros_2.12-0.17.0-M1.jar
spire-platform_2.12/0.17.0-M1//spire-platform_2.12-0.17.0-M1.jar
spire-util_2.12/0.17.0-M1//spire-util_2.12-0.17.0-M1.jar
spire_2.12/0.17.0-M1//spire_2.12-0.17.0-M1.jar
stax-api/1.0-2//stax-api-1.0-2.jar
stax-api/1.0.1//stax-api-1.0.1.jar
stream/2.9.6//stream-2.9.6.jar
super-csv/2.2.0//super-csv-2.2.0.jar
Expand Down
52 changes: 2 additions & 50 deletions dev/deps/spark-deps-hadoop-3.2-hive-2.3
Original file line number Diff line number Diff line change
Expand Up @@ -3,14 +3,12 @@ JLargeArrays/1.5//JLargeArrays-1.5.jar
JTransforms/3.1//JTransforms-3.1.jar
RoaringBitmap/0.9.0//RoaringBitmap-0.9.0.jar
ST4/4.0.4//ST4-4.0.4.jar
accessors-smart/1.2//accessors-smart-1.2.jar
activation/1.1.1//activation-1.1.1.jar
aircompressor/0.10//aircompressor-0.10.jar
algebra_2.12/2.0.0-M2//algebra_2.12-2.0.0-M2.jar
antlr-runtime/3.5.2//antlr-runtime-3.5.2.jar
antlr4-runtime/4.7.1//antlr4-runtime-4.7.1.jar
aopalliance-repackaged/2.6.1//aopalliance-repackaged-2.6.1.jar
aopalliance/1.0//aopalliance-1.0.jar
arpack_combined_all/0.1//arpack_combined_all-0.1.jar
arrow-format/1.0.1//arrow-format-1.0.1.jar
arrow-memory-core/1.0.1//arrow-memory-core-1.0.1.jar
Expand All @@ -27,15 +25,12 @@ breeze_2.12/1.0//breeze_2.12-1.0.jar
cats-kernel_2.12/2.0.0-M4//cats-kernel_2.12-2.0.0-M4.jar
chill-java/0.9.5//chill-java-0.9.5.jar
chill_2.12/0.9.5//chill_2.12-0.9.5.jar
commons-beanutils/1.9.4//commons-beanutils-1.9.4.jar
commons-cli/1.2//commons-cli-1.2.jar
commons-codec/1.10//commons-codec-1.10.jar
commons-collections/3.2.2//commons-collections-3.2.2.jar
commons-compiler/3.0.16//commons-compiler-3.0.16.jar
commons-compress/1.8.1//commons-compress-1.8.1.jar
commons-configuration2/2.1.1//commons-configuration2-2.1.1.jar
commons-crypto/1.0.0//commons-crypto-1.0.0.jar
commons-daemon/1.0.13//commons-daemon-1.0.13.jar
commons-dbcp/1.4//commons-dbcp-1.4.jar
commons-httpclient/3.1//commons-httpclient-3.1.jar
commons-io/2.5//commons-io-2.5.jar
Expand All @@ -55,30 +50,13 @@ datanucleus-api-jdo/4.2.4//datanucleus-api-jdo-4.2.4.jar
datanucleus-core/4.1.17//datanucleus-core-4.1.17.jar
datanucleus-rdbms/4.1.19//datanucleus-rdbms-4.1.19.jar
derby/10.12.1.1//derby-10.12.1.1.jar
dnsjava/2.1.7//dnsjava-2.1.7.jar
dropwizard-metrics-hadoop-metrics2-reporter/0.1.2//dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar
ehcache/3.3.1//ehcache-3.3.1.jar
flatbuffers-java/1.9.0//flatbuffers-java-1.9.0.jar
generex/1.0.2//generex-1.0.2.jar
geronimo-jcache_1.0_spec/1.0-alpha-1//geronimo-jcache_1.0_spec-1.0-alpha-1.jar
gson/2.2.4//gson-2.2.4.jar
guava/14.0.1//guava-14.0.1.jar
guice-servlet/4.0//guice-servlet-4.0.jar
guice/4.0//guice-4.0.jar
hadoop-annotations/3.2.0//hadoop-annotations-3.2.0.jar
hadoop-auth/3.2.0//hadoop-auth-3.2.0.jar
hadoop-client/3.2.0//hadoop-client-3.2.0.jar
hadoop-common/3.2.0//hadoop-common-3.2.0.jar
hadoop-hdfs-client/3.2.0//hadoop-hdfs-client-3.2.0.jar
hadoop-mapreduce-client-common/3.2.0//hadoop-mapreduce-client-common-3.2.0.jar
hadoop-mapreduce-client-core/3.2.0//hadoop-mapreduce-client-core-3.2.0.jar
hadoop-mapreduce-client-jobclient/3.2.0//hadoop-mapreduce-client-jobclient-3.2.0.jar
hadoop-yarn-api/3.2.0//hadoop-yarn-api-3.2.0.jar
hadoop-yarn-client/3.2.0//hadoop-yarn-client-3.2.0.jar
hadoop-yarn-common/3.2.0//hadoop-yarn-common-3.2.0.jar
hadoop-yarn-registry/3.2.0//hadoop-yarn-registry-3.2.0.jar
hadoop-yarn-server-common/3.2.0//hadoop-yarn-server-common-3.2.0.jar
hadoop-yarn-server-web-proxy/3.2.0//hadoop-yarn-server-web-proxy-3.2.0.jar
hadoop-client-api/3.2.0//hadoop-client-api-3.2.0.jar
hadoop-client-runtime/3.2.0//hadoop-client-runtime-3.2.0.jar
hive-beeline/2.3.7//hive-beeline-2.3.7.jar
hive-cli/2.3.7//hive-cli-2.3.7.jar
hive-common/2.3.7//hive-common-2.3.7.jar
Expand Down Expand Up @@ -107,8 +85,6 @@ jackson-core/2.10.0//jackson-core-2.10.0.jar
jackson-databind/2.10.0//jackson-databind-2.10.0.jar
jackson-dataformat-yaml/2.10.0//jackson-dataformat-yaml-2.10.0.jar
jackson-datatype-jsr310/2.10.3//jackson-datatype-jsr310-2.10.3.jar
jackson-jaxrs-base/2.9.5//jackson-jaxrs-base-2.9.5.jar
jackson-jaxrs-json-provider/2.9.5//jackson-jaxrs-json-provider-2.9.5.jar
jackson-mapper-asl/1.9.13//jackson-mapper-asl-1.9.13.jar
jackson-module-jaxb-annotations/2.10.0//jackson-module-jaxb-annotations-2.10.0.jar
jackson-module-paranamer/2.10.0//jackson-module-paranamer-2.10.0.jar
Expand All @@ -121,13 +97,11 @@ jakarta.ws.rs-api/2.1.6//jakarta.ws.rs-api-2.1.6.jar
jakarta.xml.bind-api/2.3.2//jakarta.xml.bind-api-2.3.2.jar
janino/3.0.16//janino-3.0.16.jar
javassist/3.25.0-GA//javassist-3.25.0-GA.jar
javax.inject/1//javax.inject-1.jar
javax.jdo/3.2.0-m3//javax.jdo-3.2.0-m3.jar
javax.servlet-api/3.1.0//javax.servlet-api-3.1.0.jar
javolution/5.5.1//javolution-5.5.1.jar
jaxb-api/2.2.11//jaxb-api-2.2.11.jar
jaxb-runtime/2.3.2//jaxb-runtime-2.3.2.jar
jcip-annotations/1.0-1//jcip-annotations-1.0-1.jar
jcl-over-slf4j/1.7.30//jcl-over-slf4j-1.7.30.jar
jdo-api/3.0.1//jdo-api-3.0.1.jar
jersey-client/2.30//jersey-client-2.30.jar
Expand All @@ -141,30 +115,14 @@ jline/2.14.6//jline-2.14.6.jar
joda-time/2.10.5//joda-time-2.10.5.jar
jodd-core/3.5.2//jodd-core-3.5.2.jar
jpam/1.1//jpam-1.1.jar
json-smart/2.3//json-smart-2.3.jar
json/1.8//json-1.8.jar
json4s-ast_2.12/3.7.0-M5//json4s-ast_2.12-3.7.0-M5.jar
json4s-core_2.12/3.7.0-M5//json4s-core_2.12-3.7.0-M5.jar
json4s-jackson_2.12/3.7.0-M5//json4s-jackson_2.12-3.7.0-M5.jar
json4s-scalap_2.12/3.7.0-M5//json4s-scalap_2.12-3.7.0-M5.jar
jsp-api/2.1//jsp-api-2.1.jar
jsr305/3.0.0//jsr305-3.0.0.jar
jta/1.1//jta-1.1.jar
jul-to-slf4j/1.7.30//jul-to-slf4j-1.7.30.jar
kerb-admin/1.0.1//kerb-admin-1.0.1.jar
kerb-client/1.0.1//kerb-client-1.0.1.jar
kerb-common/1.0.1//kerb-common-1.0.1.jar
kerb-core/1.0.1//kerb-core-1.0.1.jar
kerb-crypto/1.0.1//kerb-crypto-1.0.1.jar
kerb-identity/1.0.1//kerb-identity-1.0.1.jar
kerb-server/1.0.1//kerb-server-1.0.1.jar
kerb-simplekdc/1.0.1//kerb-simplekdc-1.0.1.jar
kerb-util/1.0.1//kerb-util-1.0.1.jar
kerby-asn1/1.0.1//kerby-asn1-1.0.1.jar
kerby-config/1.0.1//kerby-config-1.0.1.jar
kerby-pkix/1.0.1//kerby-pkix-1.0.1.jar
kerby-util/1.0.1//kerby-util-1.0.1.jar
kerby-xdr/1.0.1//kerby-xdr-1.0.1.jar
kryo-shaded/4.0.2//kryo-shaded-4.0.2.jar
kubernetes-client/4.10.3//kubernetes-client-4.10.3.jar
kubernetes-model-admissionregistration/4.10.3//kubernetes-model-admissionregistration-4.10.3.jar
Expand Down Expand Up @@ -202,9 +160,7 @@ metrics-json/4.1.1//metrics-json-4.1.1.jar
metrics-jvm/4.1.1//metrics-jvm-4.1.1.jar
minlog/1.3.0//minlog-1.3.0.jar
netty-all/4.1.51.Final//netty-all-4.1.51.Final.jar
nimbus-jose-jwt/4.41.1//nimbus-jose-jwt-4.41.1.jar
objenesis/2.6//objenesis-2.6.jar
okhttp/2.7.5//okhttp-2.7.5.jar
okhttp/3.12.12//okhttp-3.12.12.jar
okio/1.14.0//okio-1.14.0.jar
opencsv/2.3//opencsv-2.3.jar
Expand All @@ -224,7 +180,6 @@ parquet-jackson/1.10.1//parquet-jackson-1.10.1.jar
protobuf-java/2.5.0//protobuf-java-2.5.0.jar
py4j/0.10.9//py4j-0.10.9.jar
pyrolite/4.30//pyrolite-4.30.jar
re2j/1.1//re2j-1.1.jar
scala-collection-compat_2.12/2.1.1//scala-collection-compat_2.12-2.1.1.jar
scala-compiler/2.12.10//scala-compiler-2.12.10.jar
scala-library/2.12.10//scala-library-2.12.10.jar
Expand All @@ -242,15 +197,12 @@ spire-platform_2.12/0.17.0-M1//spire-platform_2.12-0.17.0-M1.jar
spire-util_2.12/0.17.0-M1//spire-util_2.12-0.17.0-M1.jar
spire_2.12/0.17.0-M1//spire_2.12-0.17.0-M1.jar
stax-api/1.0.1//stax-api-1.0.1.jar
stax2-api/3.1.4//stax2-api-3.1.4.jar
stream/2.9.6//stream-2.9.6.jar
super-csv/2.2.0//super-csv-2.2.0.jar
threeten-extra/1.5.0//threeten-extra-1.5.0.jar
token-provider/1.0.1//token-provider-1.0.1.jar
transaction-api/1.1//transaction-api-1.1.jar
univocity-parsers/2.9.0//univocity-parsers-2.9.0.jar
velocity/1.5//velocity-1.5.jar
woodstox-core/5.0.3//woodstox-core-5.0.3.jar
xbean-asm7-shaded/4.15//xbean-asm7-shaded-4.15.jar
xz/1.5//xz-1.5.jar
zjsonpatch/0.3.0//zjsonpatch-0.3.0.jar
Expand Down
8 changes: 7 additions & 1 deletion external/kafka-0-10-assembly/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -71,9 +71,15 @@
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<artifactId>${hadoop-client-api.artifact}</artifactId>
<version>${hadoop.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>${hadoop-client-runtime.artifact}</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-mapred</artifactId>
Expand Down
4 changes: 4 additions & 0 deletions external/kafka-0-10-sql/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,10 @@
<artifactId>kafka-clients</artifactId>
<version>${kafka.version}</version>
</dependency>
<dependency>
<groupId>com.google.code.findbugs</groupId>
<artifactId>jsr305</artifactId>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-pool2</artifactId>
Expand Down
5 changes: 5 additions & 0 deletions external/kafka-0-10-token-provider/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,11 @@
<artifactId>mockito-core</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>${hadoop-client-runtime.artifact}</artifactId>
<scope>${hadoop.deps.scope}</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-tags_${scala.binary.version}</artifactId>
Expand Down
8 changes: 7 additions & 1 deletion external/kinesis-asl-assembly/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -91,9 +91,15 @@
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<artifactId>${hadoop-client-api.artifact}</artifactId>
<version>${hadoop.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>${hadoop-client-runtime.artifact}</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-ipc</artifactId>
Expand Down
7 changes: 6 additions & 1 deletion hadoop-cloud/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -58,10 +58,15 @@
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<artifactId>${hadoop-client-api.artifact}</artifactId>
<version>${hadoop.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>${hadoop-client-runtime.artifact}</artifactId>
<version>${hadoop.version}</version>
</dependency>
<!--
the AWS module pulls in jackson; its transitive dependencies can create
intra-jackson-module version problems.
Expand Down
9 changes: 8 additions & 1 deletion launcher/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -81,7 +81,14 @@
<!-- Not needed by the test code, but referenced by SparkSubmit which is used by the tests. -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<artifactId>${hadoop-client-api.artifact}</artifactId>
<version>${hadoop.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>${hadoop-client-runtime.artifact}</artifactId>
<version>${hadoop.version}</version>
<scope>test</scope>
</dependency>
</dependencies>
Expand Down
Loading