
[SPARK-45593][BUILD] Building a runnable distribution from master code and running spark-sql raises an error #43436

Closed · wants to merge 2 commits
6 changes: 6 additions & 0 deletions assembly/pom.xml
@@ -149,6 +149,12 @@
<groupId>org.apache.spark</groupId>
<artifactId>spark-connect_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<exclusions>
<exclusion>
<groupId>org.apache.spark</groupId>
<artifactId>spark-connect-common_${scala.binary.version}</artifactId>
Contributor:
Why this change?

Contributor Author:
The runnable distribution tar already contains connect-server, and connect-server shades connect-common, so the runnable distribution tar no longer needs to include connect-common.
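
For context, the exclusion above is safe only if the server's shaded jar actually carries connect-common. A minimal sketch of what that looks like, assuming connect-server lists connect-common in its maven-shade-plugin artifactSet (the relevant include is not visible in the hunks on this page):

<!-- Hypothetical sketch for connector/connect/server/pom.xml, inferred from the
     comment above and not shown in this diff: folding connect-common into the
     shaded spark-connect jar is what lets the assembly drop the separate
     connect-common artifact from the runnable distribution. -->
<artifactSet>
  <includes>
    <include>org.apache.spark:spark-connect-common_${scala.binary.version}</include>
  </includes>
</artifactSet>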


</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
8 changes: 1 addition & 7 deletions connector/connect/client/jvm/pom.xml
@@ -51,15 +51,9 @@
<version>${project.version}</version>
</dependency>
<!--
We need to define guava and protobuf here because we need to change the scope of both from
We need to define protobuf here because we need to change its scope from
provided to compile. If we don't do this we can't shade these libraries.
-->
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>${connect.guava.version}</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
34 changes: 34 additions & 0 deletions connector/connect/common/pom.xml
@@ -47,6 +47,11 @@
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
</dependency>
<!--
SPARK-45593: Spark Connect relies on a specific version of Guava. We shade the Guava
library within the connect-common module to ensure that both the connect-server and
connect-client modules maintain consistent and accurate Guava dependencies.
-->
<dependency>
Contributor:
Spark Connect requires a higher version of Guava, so we can't remove this dependency. We should reconfigure the maven-shade-plugin in connect-common so that it does not package this Guava.
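
A minimal sketch of the alternative suggested here, assuming connect-common keeps the Guava dependency but leaves it out of its shaded artifact (hypothetical configuration, not what this PR does):

<!-- Hypothetical: keep Guava in connect-common for compilation, but exclude it
     from the shaded jar so downstream modules bring their own Guava. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <artifactSet>
      <excludes>
        <exclude>com.google.guava:guava</exclude>
        <exclude>com.google.guava:failureaccess</exclude>
      </excludes>
    </artifactSet>
  </configuration>
</plugin>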

Contributor Author:
Both connect-server and connect-client have their own independent guava dependencies : )

I think we should have connect-common shade Guava, and have connect-client and connect-server shade connect-common, so that connect-server and connect-client use the same Guava dependency.

In addition, because connect-common is included in the connect-server shade, the runnable distribution can also exclude connect-common.

Do you like this current commit?
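
To make the intent concrete: connect-common becomes the single place where Guava is relocated, and both the client and the server pick up the already-relocated classes by shading connect-common. A sketch of the client-side include this implies, assuming connect-client lists connect-common in its own shade-plugin artifactSet (that section is not among the hunks shown here):

<!-- Hypothetical sketch for connector/connect/client/jvm/pom.xml: by shading
     connect-common, the client inherits the Guava classes already relocated to
     ${spark.shade.packageName}.connect.guava, so client and server resolve the
     same shaded Guava instead of each relocating its own copy. -->
<artifactSet>
  <includes>
    <include>org.apache.spark:spark-connect-common_${scala.binary.version}</include>
  </includes>
</artifactSet>

Relocating once in connect-common keeps the shaded package name identical on both sides, which is how the two modules end up on the same Guava.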

Contributor Author (@yikf), Oct 19, 2023:
I tested it as described in #43195 (comment) and it worked as expected.
[screenshot]

<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
@@ -145,6 +150,35 @@
</execution>
</executions>
</plugin>
<plugin>
Contributor:
Why do we need to shade common? For the off-chance someone takes a direct dependency on it?

<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<configuration>
<shadedArtifactAttached>false</shadedArtifactAttached>
<artifactSet>
<includes>
<include>org.spark-project.spark:unused</include>
<include>com.google.guava:guava</include>
<include>com.google.guava:failureaccess</include>
<include>org.apache.tomcat:annotations-api</include>
</includes>
</artifactSet>
<relocations>
<relocation>
<pattern>com.google.common</pattern>
<shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
</relocation>
</relocations>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
<profiles>
25 changes: 0 additions & 25 deletions connector/connect/server/pom.xml
@@ -51,12 +51,6 @@
<groupId>org.apache.spark</groupId>
<artifactId>spark-connect-common_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<exclusions>
<exclusion>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
@@ -156,17 +150,6 @@
<groupId>org.scala-lang.modules</groupId>
<artifactId>scala-parallel-collections_${scala.binary.version}</artifactId>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>${connect.guava.version}</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>failureaccess</artifactId>
<version>${guava.failureaccess.version}</version>
</dependency>
<dependency>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
@@ -287,7 +270,6 @@
<shadedArtifactAttached>false</shadedArtifactAttached>
<artifactSet>
<includes>
<include>com.google.guava:*</include>
<include>io.grpc:*:</include>
<include>com.google.protobuf:*</include>

@@ -307,13 +289,6 @@
</includes>
</artifactSet>
<relocations>
<relocation>
<pattern>com.google.common</pattern>
<shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
<includes>
<include>com.google.common.**</include>
</includes>
</relocation>
<relocation>
<pattern>com.google.thirdparty</pattern>
<shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>