Commit 44d2c86

yikf authored and LuciferYang committed
[SPARK-45593][BUILD] Building a runnable distribution from master code running spark-sql raise error
### What changes were proposed in this pull request?

Fix a build issue: when a runnable distribution is built from master code, running spark-sql raises the following error:

```
Caused by: java.lang.ClassNotFoundException: org.sparkproject.guava.util.concurrent.internal.InternalFutureFailureAccess
	at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
	at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
	... 58 more
```

The problem is caused by a Guava dependency in the spark-connect-common POM that **conflicts** with the shade plugin configuration of the parent POM.

- spark-connect-common contains Guava at version `connect.guava.version`, and it is relocated to `${spark.shade.packageName}.guava` rather than `${spark.shade.packageName}.connect.guava`;
- spark-network-common also contains Guava classes, which are likewise relocated to `${spark.shade.packageName}.guava`, but at version `${guava.version}`;
- As a result, different versions of the `org.sparkproject.guava.xx` classes end up on the classpath.

In addition, after investigation, it seems that the spark-connect-common module is not related to Guava, so we can remove the Guava dependency from spark-connect-common.

### Why are the changes needed?

A distribution built from master code is not runnable.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

I manually built a runnable distribution package and tested it.

Build command:

```
./dev/make-distribution.sh --name ui --pip --tgz -Phive -Phive-thriftserver -Pyarn -Pconnect
```

Test result:

<img width="1276" alt="image" src="https://github.com/apache/spark/assets/51110188/aefbc433-ea5c-4287-8ebd-367806043ac8">

I also checked which jars in the jars dir contain `org.sparkproject.guava.cache.LocalCache`.

Before:

```
➜ jars grep -lr 'org.sparkproject.guava.cache.LocalCache' ./
.//spark-connect_2.13-4.0.0-SNAPSHOT.jar
.//spark-network-common_2.13-4.0.0-SNAPSHOT.jar
.//spark-connect-common_2.13-4.0.0-SNAPSHOT.jar
```

Now:

```
➜ jars grep -lr 'org.sparkproject.guava.cache.LocalCache' ./
.//spark-network-common_2.13-4.0.0-SNAPSHOT.jar
```

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #43436 from Yikf/SPARK-45593.

Authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: yangjie01 <yangjie01@baidu.com>
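
The jar check in the testing notes uses `grep` on the raw archive bytes. As a rough alternative, the minimal Python sketch below reads each jar's entry list instead. The helper name `jars_containing_class` is hypothetical and not part of this patch or of Spark; the sketch only assumes that a class `a.b.C` is stored in a jar as the entry `a/b/C.class`.

```python
import zipfile
from pathlib import Path


def jars_containing_class(jars_dir, class_name):
    """Return the sorted file names of jars in jars_dir that contain class_name.

    Hypothetical helper for illustration; class_name is a fully qualified
    name such as 'org.sparkproject.guava.cache.LocalCache'.
    """
    # A class a.b.C is stored in a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jars_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid zip archives.
            continue
    return hits
```

Calling `jars_containing_class("jars", "org.sparkproject.guava.cache.LocalCache")` from the distribution directory should list the same jars as the `grep` commands above, provided the class is not nested inside an inner archive.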
1 parent 89727bf commit 44d2c86

4 files changed: +41 -32 lines changed

assembly/pom.xml

Lines changed: 6 additions & 0 deletions
@@ -149,6 +149,12 @@
       <groupId>org.apache.spark</groupId>
       <artifactId>spark-connect_${scala.binary.version}</artifactId>
       <version>${project.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.apache.spark</groupId>
+          <artifactId>spark-connect-common_${scala.binary.version}</artifactId>
+        </exclusion>
+      </exclusions>
     </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>

connector/connect/client/jvm/pom.xml

Lines changed: 1 addition & 7 deletions
@@ -51,15 +51,9 @@
       <version>${project.version}</version>
     </dependency>
     <!--
-      We need to define guava and protobuf here because we need to change the scope of both from
+      We need to define protobuf here because we need to change the scope of both from
       provided to compile. If we don't do this we can't shade these libraries.
     -->
-    <dependency>
-      <groupId>com.google.guava</groupId>
-      <artifactId>guava</artifactId>
-      <version>${connect.guava.version}</version>
-      <scope>compile</scope>
-    </dependency>
     <dependency>
       <groupId>com.google.protobuf</groupId>
       <artifactId>protobuf-java</artifactId>

connector/connect/common/pom.xml

Lines changed: 34 additions & 0 deletions
@@ -47,6 +47,11 @@
       <groupId>com.google.protobuf</groupId>
       <artifactId>protobuf-java</artifactId>
     </dependency>
+    <!--
+      SPARK-45593: spark connect relies on a specific version of Guava, We perform shading
+      of the Guava library within the connect-common module to ensure both connect-server and
+      connect-client modules maintain consistent and accurate Guava dependencies.
+    -->
     <dependency>
       <groupId>com.google.guava</groupId>
       <artifactId>guava</artifactId>
@@ -145,6 +150,35 @@
           </execution>
         </executions>
       </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-shade-plugin</artifactId>
+        <configuration>
+          <shadedArtifactAttached>false</shadedArtifactAttached>
+          <artifactSet>
+            <includes>
+              <include>org.spark-project.spark:unused</include>
+              <include>com.google.guava:guava</include>
+              <include>com.google.guava:failureaccess</include>
+              <include>org.apache.tomcat:annotations-api</include>
+            </includes>
+          </artifactSet>
+          <relocations>
+            <relocation>
+              <pattern>com.google.common</pattern>
+              <shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
+            </relocation>
+          </relocations>
+        </configuration>
+        <executions>
+          <execution>
+            <phase>package</phase>
+            <goals>
+              <goal>shade</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
     </plugins>
   </build>
   <profiles>
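
To illustrate why the relocation target matters, the Python sketch below models a shade relocation as a plain package-prefix rewrite. It is a simplified model, not the maven-shade-plugin's actual matching logic, and the function name `relocate` is hypothetical. It also assumes that `${spark.shade.packageName}` resolves to `org.sparkproject`, which is what the class names in the commit message suggest.

```python
def relocate(class_name, pattern, shaded_pattern):
    """Rewrite class_name if it lives under the package `pattern` (simplified model)."""
    # Only rewrite names equal to the pattern or inside the pattern's package.
    if class_name == pattern or class_name.startswith(pattern + "."):
        return shaded_pattern + class_name[len(pattern):]
    return class_name


# Assumption: ${spark.shade.packageName} resolves to "org.sparkproject".
SHADE_PKG = "org.sparkproject"

# Generic relocation target, shared with the Guava classes in spark-network-common.
shared = relocate("com.google.common.cache.LocalCache",
                  "com.google.common", SHADE_PKG + ".guava")

# Connect-specific relocation target configured in the hunk above.
isolated = relocate("com.google.common.cache.LocalCache",
                    "com.google.common", SHADE_PKG + ".connect.guava")

print(shared)    # org.sparkproject.guava.cache.LocalCache
print(isolated)  # org.sparkproject.connect.guava.cache.LocalCache
```

With the connect-specific target, the two Guava versions occupy different package names, so both can sit on the same classpath without one copy shadowing the other.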

connector/connect/server/pom.xml

Lines changed: 0 additions & 25 deletions
@@ -51,12 +51,6 @@
       <groupId>org.apache.spark</groupId>
       <artifactId>spark-connect-common_${scala.binary.version}</artifactId>
       <version>${project.version}</version>
-      <exclusions>
-        <exclusion>
-          <groupId>com.google.guava</groupId>
-          <artifactId>guava</artifactId>
-        </exclusion>
-      </exclusions>
     </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>
@@ -156,17 +150,6 @@
       <groupId>org.scala-lang.modules</groupId>
       <artifactId>scala-parallel-collections_${scala.binary.version}</artifactId>
     </dependency>
-    <dependency>
-      <groupId>com.google.guava</groupId>
-      <artifactId>guava</artifactId>
-      <version>${connect.guava.version}</version>
-      <scope>compile</scope>
-    </dependency>
-    <dependency>
-      <groupId>com.google.guava</groupId>
-      <artifactId>failureaccess</artifactId>
-      <version>${guava.failureaccess.version}</version>
-    </dependency>
     <dependency>
       <groupId>com.google.protobuf</groupId>
       <artifactId>protobuf-java</artifactId>
@@ -287,7 +270,6 @@
           <shadedArtifactAttached>false</shadedArtifactAttached>
           <artifactSet>
             <includes>
-              <include>com.google.guava:*</include>
               <include>io.grpc:*:</include>
               <include>com.google.protobuf:*</include>
 
@@ -307,13 +289,6 @@
             </includes>
           </artifactSet>
           <relocations>
-            <relocation>
-              <pattern>com.google.common</pattern>
-              <shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
-              <includes>
-                <include>com.google.common.**</include>
-              </includes>
-            </relocation>
             <relocation>
               <pattern>com.google.thirdparty</pattern>
               <shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
