
[SPARK-49569][BUILD][FOLLOWUP] Adds scala-library maven dependency to the spark-connect-shims module to fix Maven build errors #48399

Closed

Conversation

LuciferYang
Contributor

@LuciferYang LuciferYang commented Oct 9, 2024

What changes were proposed in this pull request?

This PR adds the `scala-library` Maven dependency to the `spark-connect-shims` module to fix Maven build errors.
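
For context, the change amounts to declaring the Scala standard library explicitly in the module's pom so that `scala-maven-plugin`'s scaladoc goal can resolve `object scala`. This is a minimal sketch, not the exact diff; the property name `${scala.version}` and the placement inside the shims pom are assumptions based on common Spark pom conventions:

```xml
<!-- Sketch: explicit scala-library dependency for the spark-connect-shims module.
     Without it, scaladoc fails with "object scala in compiler mirror not found". -->
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>${scala.version}</version>
</dependency>
```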

Why are the changes needed?

Maven daily test pipeline build failed:

scaladoc error: fatal error: object scala in compiler mirror not found.
Error:  Failed to execute goal net.alchim31.maven:scala-maven-plugin:4.9.1:doc-jar (attach-scaladocs) on project spark-connect-shims_2.13: MavenReportException: Error while creating archive: wrap: Process exited with an error: 1 (Exit value: 1) -> [Help 1]
Error:  
Error:  To see the full stack trace of the errors, re-run Maven with the -e switch.
Error:  Re-run Maven using the -X switch to enable full debug logging.
Error:  
Error:  For more information about the errors and possible solutions, please read the following articles:
Error:  [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
Error:  
Error:  After correcting the problems, you can resume the build with the command
Error:    mvn <args> -rf :spark-connect-shims_2.13
Error: Process completed with exit code 1.

Does this PR introduce any user-facing change?

No

How was this patch tested?

  • Pass GitHub Actions
  • local test:
build/mvn clean install -DskipTests -Phive

Before

[INFO] --- scala:4.9.1:doc-jar (attach-scaladocs) @ spark-connect-shims_2.13 ---
scaladoc error: fatal error: object scala in compiler mirror not found.
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Spark Project Parent POM 4.0.0-SNAPSHOT:
[INFO] 
[INFO] Spark Project Parent POM ........................... SUCCESS [  2.833 s]
[INFO] Spark Project Tags ................................. SUCCESS [  5.292 s]
[INFO] Spark Project Sketch ............................... SUCCESS [  5.675 s]
[INFO] Spark Project Common Utils ......................... SUCCESS [ 16.762 s]
[INFO] Spark Project Local DB ............................. SUCCESS [  7.735 s]
[INFO] Spark Project Networking ........................... SUCCESS [ 11.389 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [  9.159 s]
[INFO] Spark Project Variant .............................. SUCCESS [  3.618 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [  9.692 s]
[INFO] Spark Project Connect Shims ........................ FAILURE [  2.478 s]
[INFO] Spark Project Launcher ............................. SKIPPED
[INFO] Spark Project Core ................................. SKIPPED
[INFO] Spark Project ML Local Library ..................... SKIPPED
[INFO] Spark Project GraphX ............................... SKIPPED
[INFO] Spark Project Streaming ............................ SKIPPED
[INFO] Spark Project SQL API .............................. SKIPPED
[INFO] Spark Project Catalyst ............................. SKIPPED
[INFO] Spark Project SQL .................................. SKIPPED
[INFO] Spark Project ML Library ........................... SKIPPED
[INFO] Spark Project Tools ................................ SKIPPED
[INFO] Spark Project Hive ................................. SKIPPED
[INFO] Spark Project Connect Common ....................... SKIPPED
[INFO] Spark Avro ......................................... SKIPPED
[INFO] Spark Protobuf ..................................... SKIPPED
[INFO] Spark Project REPL ................................. SKIPPED
[INFO] Spark Project Connect Server ....................... SKIPPED
[INFO] Spark Project Connect Client ....................... SKIPPED
[INFO] Spark Project Assembly ............................. SKIPPED
[INFO] Kafka 0.10+ Token Provider for Streaming ........... SKIPPED
[INFO] Spark Integration for Kafka 0.10 ................... SKIPPED
[INFO] Kafka 0.10+ Source for Structured Streaming ........ SKIPPED
[INFO] Spark Project Examples ............................. SKIPPED
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  01:15 min
[INFO] Finished at: 2024-10-09T23:43:58+08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:4.9.1:doc-jar (attach-scaladocs) on project spark-connect-shims_2.13: MavenReportException: Error while creating archive: wrap: Process exited with an error: 1 (Exit value: 1) -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <args> -rf :spark-connect-shims_2.13

After

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Spark Project Parent POM 4.0.0-SNAPSHOT:
[INFO] 
[INFO] Spark Project Parent POM ........................... SUCCESS [  2.766 s]
[INFO] Spark Project Tags ................................. SUCCESS [  5.398 s]
[INFO] Spark Project Sketch ............................... SUCCESS [  6.361 s]
[INFO] Spark Project Common Utils ......................... SUCCESS [ 16.919 s]
[INFO] Spark Project Local DB ............................. SUCCESS [  8.083 s]
[INFO] Spark Project Networking ........................... SUCCESS [ 11.240 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [  9.438 s]
[INFO] Spark Project Variant .............................. SUCCESS [  3.697 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [  9.939 s]
[INFO] Spark Project Connect Shims ........................ SUCCESS [  2.938 s]
[INFO] Spark Project Launcher ............................. SUCCESS [  6.502 s]
[INFO] Spark Project Core ................................. SUCCESS [01:33 min]
[INFO] Spark Project ML Local Library ..................... SUCCESS [ 18.220 s]
[INFO] Spark Project GraphX ............................... SUCCESS [ 20.923 s]
[INFO] Spark Project Streaming ............................ SUCCESS [ 29.949 s]
[INFO] Spark Project SQL API .............................. SUCCESS [ 25.842 s]
[INFO] Spark Project Catalyst ............................. SUCCESS [02:02 min]
[INFO] Spark Project SQL .................................. SUCCESS [02:18 min]
[INFO] Spark Project ML Library ........................... SUCCESS [01:38 min]
[INFO] Spark Project Tools ................................ SUCCESS [  3.365 s]
[INFO] Spark Project Hive ................................. SUCCESS [ 45.357 s]
[INFO] Spark Project Connect Common ....................... SUCCESS [ 33.636 s]
[INFO] Spark Avro ......................................... SUCCESS [ 22.040 s]
[INFO] Spark Protobuf ..................................... SUCCESS [ 24.557 s]
[INFO] Spark Project REPL ................................. SUCCESS [ 13.843 s]
[INFO] Spark Project Connect Server ....................... SUCCESS [ 35.587 s]
[INFO] Spark Project Connect Client ....................... SUCCESS [ 33.929 s]
[INFO] Spark Project Assembly ............................. SUCCESS [  5.121 s]
[INFO] Kafka 0.10+ Token Provider for Streaming ........... SUCCESS [ 12.623 s]
[INFO] Spark Integration for Kafka 0.10 ................... SUCCESS [ 16.908 s]
[INFO] Kafka 0.10+ Source for Structured Streaming ........ SUCCESS [ 23.664 s]
[INFO] Spark Project Examples ............................. SUCCESS [ 30.777 s]
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SUCCESS [  6.997 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  15:40 min
[INFO] Finished at: 2024-10-09T23:27:20+08:00
[INFO] ------------------------------------------------------------------------

Was this patch authored or co-authored using generative AI tooling?

No

@LuciferYang
Contributor Author

https://github.com/apache/spark/actions/runs/11255598249/job/31295526084

scaladoc error: fatal error: object scala in compiler mirror not found.
Error:  Failed to execute goal net.alchim31.maven:scala-maven-plugin:4.9.1:doc-jar (attach-scaladocs) on project spark-connect-shims_2.13: MavenReportException: Error while creating archive: wrap: Process exited with an error: 1 (Exit value: 1) -> [Help 1]
Error:  
Error:  To see the full stack trace of the errors, re-run Maven with the -e switch.
Error:  Re-run Maven using the -X switch to enable full debug logging.
Error:  
Error:  For more information about the errors and possible solutions, please read the following articles:
Error:  [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
Error:  
Error:  After correcting the problems, you can resume the build with the command
Error:    mvn <args> -rf :spark-connect-shims_2.13
Error: Process completed with exit code 1.


@LuciferYang LuciferYang changed the title [SPARK-49569][BUILD][FOLLOWUP] Add maven dep: scala-compiler [SPARK-49569][BUILD][FOLLOWUP] Adds scala-library maven dependency to the spark-connect-shims module to fix Maven build errors Oct 9, 2024
Contributor

@hvanhovell hvanhovell left a comment


LGTM - sorry about that.

@LuciferYang
Contributor Author

Thanks @hvanhovell ~

@LuciferYang
Contributor Author

+++ dirname /home/runner/work/spark/spark/core/../R/install-dev.sh
++ cd /home/runner/work/spark/spark/core/../R
++ pwd
+ FWDIR=/home/runner/work/spark/spark/R
+ LIB_DIR=/home/runner/work/spark/spark/R/lib
+ mkdir -p /home/runner/work/spark/spark/R/lib
+ pushd /home/runner/work/spark/spark/R
+ . /home/runner/work/spark/spark/R/find-r.sh
++ '[' -z '' ']'
++ '[' '!' -z '' ']'
+++ command -v R
++ '[' '!' ']'
++ echo 'Cannot find '\''R_HOME'\''. Please specify '\''R_HOME'\'' or make sure R is properly installed.'
++ exit 1
[info] 122 file(s) merged using strategy 'First' (Run the task at debug level to see the details)
[info] 169 file(s) merged using strategy 'Discard' (Run the task at debug level to see the details)
[info] Built: /home/runner/work/spark/spark/connector/connect/client/jvm/target/scala-2.13/spark-connect-client-jvm-assembly-4.0.0-SNAPSHOT.jar
[info] Jar hash: a6a29d5b6a1d273d5c44b8fd95591311b37a5876
[error] java.lang.RuntimeException: Nonzero exit value: 1
[error] 	at scala.sys.package$.error(package.scala:30)
[error] 	at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.slurp(ProcessBuilderImpl.scala:138)
[error] 	at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.$bang$bang(ProcessBuilderImpl.scala:108)
[error] 	at SparkR$.$anonfun$settings$138(SparkBuild.scala:1342)
[error] 	at SparkR$.$anonfun$settings$138$adapted(SparkBuild.scala:1339)
[error] 	at scala.Function1.$anonfun$compose$1(Function1.scala:49)
[error] 	at sbt.internal.util.$tilde$greater.$anonfun$$u2219$1(TypeFunctions.scala:63)
[error] 	at sbt.std.Transform$$anon$4.work(Transform.scala:69)
[error] 	at sbt.Execute.$anonfun$submit$2(Execute.scala:283)
[error] 	at sbt.internal.util.ErrorHandling$.wideConvert(ErrorHandling.scala:24)
[error] 	at sbt.Execute.work(Execute.scala:292)
[error] 	at sbt.Execute.$anonfun$submit$1(Execute.scala:283)
[error] 	at sbt.ConcurrentRestrictions$$anon$4.$anonfun$submitValid$1(ConcurrentRestrictions.scala:265)
[error] 	at sbt.CompletionService$$anon$2.call(CompletionService.scala:65)
[error] 	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[error] 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
[error] 	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[error] 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
[error] 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[error] 	at java.base/java.lang.Thread.run(Thread.java:840)
[error] (core / buildRPackage) Nonzero exit value: 1
[error] Total time: 227 s (03:47), completed Oct 9, 2024, 3:41:57 PM
Error: Process completed with exit code 1.

I think this error is not related to the current PR; it seems the R environment is missing in the k8s IT environment. Could you help confirm this when you have time? @dongjoon-hyun Thanks ~

@LuciferYang
Contributor Author


No, this is an issue with local packaging with `-Psparkr`. I'll merge this fix in first.

@LuciferYang
Contributor Author

Merged into master to fix the Maven compile issue. Thanks @hvanhovell ~

@LuciferYang
Contributor Author

Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala:121: value makeRDD is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:75: value id is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.columnar.CachedBatch]
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:82: value env is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:88: value env is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:185: value parallelize is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:481: value cleaner is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:500: value parallelize is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:940: value addSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:943: value listenerBus is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:947: value removeSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1667: value listenerBus is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1668: value addSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1673: value partitions is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]

Although I can build successfully after this PR locally, the retriggered Maven daily test still failed, with a different error than before. Further investigation is needed; it feels like another odd classpath issue. Do you have any suggestions on the above error? @hvanhovell

@LuciferYang
Contributor Author


I am trying a further fix: https://github.com/apache/spark/pull/48403/files

LuciferYang added a commit that referenced this pull request Oct 10, 2024
…l/core` module

### What changes were proposed in this pull request?
This PR excludes `spark-connect-shims` from the `sql/core` module to further fix the Maven daily test.

### Why are the changes needed?
To fix the Maven daily test:

After #48399, although the Maven build was successful in my local environment, the Maven daily test pipeline still failed to build:

- https://github.com/apache/spark/actions/runs/11255598249/job/31311358712

```
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala:121: value makeRDD is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:75: value id is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.columnar.CachedBatch]
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:82: value env is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:88: value env is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:185: value parallelize is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:481: value cleaner is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:500: value parallelize is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:940: value addSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:943: value listenerBus is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:947: value removeSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1667: value listenerBus is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1668: value addSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1673: value partitions is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1674: value listenerBus is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1682: value partitions is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1683: value listenerBus is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1687: value removeSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1708: value partitions is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]
...
```

After using the `mvn dependency:tree` command to check, I found that `sql/core` cascadingly introduced `org.apache.spark:spark-connect-shims_2.13:jar:4.0.0-SNAPSHOT:test` through `org.apache.spark:spark-sql-api_2.13:test-jar:tests:4.0.0-SNAPSHOT:test`.

```
[INFO] ------------------< org.apache.spark:spark-sql_2.13 >-------------------
[INFO] Building Spark Project SQL 4.0.0-SNAPSHOT                        [18/42]
[INFO]   from sql/core/pom.xml
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- dependency:3.6.1:tree (default-cli) @ spark-sql_2.13 ---
[INFO] org.apache.spark:spark-sql_2.13:jar:4.0.0-SNAPSHOT
...
[INFO] +- org.apache.spark:spark-catalyst_2.13:test-jar:tests:4.0.0-SNAPSHOT:test
[INFO] +- org.apache.spark:spark-sql-api_2.13:test-jar:tests:4.0.0-SNAPSHOT:test
[INFO] |  +- org.scala-lang.modules:scala-parser-combinators_2.13:jar:2.4.0:compile
[INFO] |  +- org.apache.spark:spark-connect-shims_2.13:jar:4.0.0-SNAPSHOT:test
[INFO] |  +- org.antlr:antlr4-runtime:jar:4.13.1:compile
[INFO] |  +- org.apache.arrow:arrow-vector:jar:17.0.0:compile
[INFO] |  |  +- org.apache.arrow:arrow-format:jar:17.0.0:compile
[INFO] |  |  +- org.apache.arrow:arrow-memory-core:jar:17.0.0:compile
[INFO] |  |  +- com.fasterxml.jackson.datatype:jackson-datatype-jsr310:jar:2.18.0:compile
[INFO] |  |  \- com.google.flatbuffers:flatbuffers-java:jar:24.3.25:compile
[INFO] |  \- org.apache.arrow:arrow-memory-netty:jar:17.0.0:compile
[INFO] |     \- org.apache.arrow:arrow-memory-netty-buffer-patch:jar:17.0.0:compile
```

This is unexpected.
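
An exclusion of the kind described above could be sketched as the following pom fragment. This is an illustration only, not the actual diff in #48403; the exact dependency it is attached to and the use of `${scala.binary.version}`/`${project.version}` are assumptions based on typical Spark pom usage:

```xml
<!-- Sketch (hypothetical placement in sql/core/pom.xml): exclude the shims jar
     from the test-jar dependency that transitively pulls it onto the classpath. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql-api_${scala.binary.version}</artifactId>
  <version>${project.version}</version>
  <type>test-jar</type>
  <scope>test</scope>
  <exclusions>
    <exclusion>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-connect-shims_${scala.binary.version}</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```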

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Pass maven test on GitHub Actions: https://github.com/LuciferYang/spark/runs/31314342332

<img width="996" alt="image" src="https://github.com/user-attachments/assets/6d2707f9-0d58-4c80-af27-cccdfa899e87">

All Maven tests passed.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #48403 from LuciferYang/test-maven-build.

Lead-authored-by: YangJie <yangjie01@baidu.com>
Co-authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: yangjie01 <yangjie01@baidu.com>
himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
…to the `spark-connect-shims` module to fix Maven build errors

### What changes were proposed in this pull request?
This PR adds `scala-library` maven dependency to the `spark-connect-shims` module to fix Maven build errors.

### Why are the changes needed?
Maven daily test pipeline build failed:

- https://github.com/apache/spark/actions/runs/11255598249
- https://github.com/apache/spark/actions/runs/11256610976

[INFO] Spark Project Local DB ............................. SUCCESS [  8.083 s]
[INFO] Spark Project Networking ........................... SUCCESS [ 11.240 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [  9.438 s]
[INFO] Spark Project Variant .............................. SUCCESS [  3.697 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [  9.939 s]
[INFO] Spark Project Connect Shims ........................ SUCCESS [  2.938 s]
[INFO] Spark Project Launcher ............................. SUCCESS [  6.502 s]
[INFO] Spark Project Core ................................. SUCCESS [01:33 min]
[INFO] Spark Project ML Local Library ..................... SUCCESS [ 18.220 s]
[INFO] Spark Project GraphX ............................... SUCCESS [ 20.923 s]
[INFO] Spark Project Streaming ............................ SUCCESS [ 29.949 s]
[INFO] Spark Project SQL API .............................. SUCCESS [ 25.842 s]
[INFO] Spark Project Catalyst ............................. SUCCESS [02:02 min]
[INFO] Spark Project SQL .................................. SUCCESS [02:18 min]
[INFO] Spark Project ML Library ........................... SUCCESS [01:38 min]
[INFO] Spark Project Tools ................................ SUCCESS [  3.365 s]
[INFO] Spark Project Hive ................................. SUCCESS [ 45.357 s]
[INFO] Spark Project Connect Common ....................... SUCCESS [ 33.636 s]
[INFO] Spark Avro ......................................... SUCCESS [ 22.040 s]
[INFO] Spark Protobuf ..................................... SUCCESS [ 24.557 s]
[INFO] Spark Project REPL ................................. SUCCESS [ 13.843 s]
[INFO] Spark Project Connect Server ....................... SUCCESS [ 35.587 s]
[INFO] Spark Project Connect Client ....................... SUCCESS [ 33.929 s]
[INFO] Spark Project Assembly ............................. SUCCESS [  5.121 s]
[INFO] Kafka 0.10+ Token Provider for Streaming ........... SUCCESS [ 12.623 s]
[INFO] Spark Integration for Kafka 0.10 ................... SUCCESS [ 16.908 s]
[INFO] Kafka 0.10+ Source for Structured Streaming ........ SUCCESS [ 23.664 s]
[INFO] Spark Project Examples ............................. SUCCESS [ 30.777 s]
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SUCCESS [  6.997 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  15:40 min
[INFO] Finished at: 2024-10-09T23:27:20+08:00
[INFO] ------------------------------------------------------------------------

```
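The fix itself is small. As a sketch (the exact module path and version property are assumptions based on the standard Spark build layout, where versions are managed by the parent POM), the `scala-library` dependency added to the shims module's `pom.xml` would look like:

```xml
<!-- Hypothetical sketch of the dependency added to the spark-connect-shims
     module's pom.xml; ${scala.version} is assumed to be managed by the
     parent POM, as is conventional in the Spark build. -->
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>${scala.version}</version>
</dependency>
```

With the Scala standard library on the module's compile classpath, `scaladoc` can resolve `object scala` and the `attach-scaladocs` goal succeeds.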

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#48399 from LuciferYang/SPARK-49569-FOLLOWUP.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: yangjie01 <yangjie01@baidu.com>
himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
…l/core` module

### What changes were proposed in this pull request?
This PR excludes `spark-connect-shims` from the `sql/core` module to further fix the Maven daily test.

### Why are the changes needed?
To fix the Maven daily test:

After apache#48399, although the Maven build succeeded in my local environment, the Maven daily test pipeline still failed to build:

- https://github.com/apache/spark/actions/runs/11255598249/job/31311358712

```
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala:121: value makeRDD is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:75: value id is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.columnar.CachedBatch]
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:82: value env is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:88: value env is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:185: value parallelize is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:481: value cleaner is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:500: value parallelize is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:940: value addSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:943: value listenerBus is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:947: value removeSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1667: value listenerBus is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1668: value addSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1673: value partitions is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1674: value listenerBus is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1682: value partitions is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1683: value listenerBus is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1687: value removeSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1708: value partitions is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]
...
```

Checking with the `mvn dependency:tree` command, I found that `sql/core` transitively pulls in `org.apache.spark:spark-connect-shims_2.13:jar:4.0.0-SNAPSHOT:test` through `org.apache.spark:spark-sql-api_2.13:test-jar:tests:4.0.0-SNAPSHOT:test`.

```
[INFO] ------------------< org.apache.spark:spark-sql_2.13 >-------------------
[INFO] Building Spark Project SQL 4.0.0-SNAPSHOT                        [18/42]
[INFO]   from sql/core/pom.xml
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- dependency:3.6.1:tree (default-cli)  spark-sql_2.13 ---
[INFO] org.apache.spark:spark-sql_2.13:jar:4.0.0-SNAPSHOT
...
[INFO] +- org.apache.spark:spark-catalyst_2.13:test-jar:tests:4.0.0-SNAPSHOT:test
[INFO] +- org.apache.spark:spark-sql-api_2.13:test-jar:tests:4.0.0-SNAPSHOT:test
[INFO] |  +- org.scala-lang.modules:scala-parser-combinators_2.13:jar:2.4.0:compile
[INFO] |  +- org.apache.spark:spark-connect-shims_2.13:jar:4.0.0-SNAPSHOT:test
[INFO] |  +- org.antlr:antlr4-runtime:jar:4.13.1:compile
[INFO] |  +- org.apache.arrow:arrow-vector:jar:17.0.0:compile
[INFO] |  |  +- org.apache.arrow:arrow-format:jar:17.0.0:compile
[INFO] |  |  +- org.apache.arrow:arrow-memory-core:jar:17.0.0:compile
[INFO] |  |  +- com.fasterxml.jackson.datatype:jackson-datatype-jsr310:jar:2.18.0:compile
[INFO] |  |  \- com.google.flatbuffers:flatbuffers-java:jar:24.3.25:compile
[INFO] |  \- org.apache.arrow:arrow-memory-netty:jar:17.0.0:compile
[INFO] |     \- org.apache.arrow:arrow-memory-netty-buffer-patch:jar:17.0.0:compile
```

This transitive dependency is unintended: with the shims jar on the test classpath, the stubbed `SparkContext` and `RDD` classes shadow the real ones, producing the "value ... is not a member of" errors above.
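The exclusion described above could be sketched as follows in `sql/core/pom.xml`. This is a hypothetical illustration, not the exact diff: the dependency coordinates are taken from the `dependency:tree` output above, while the property names (`${scala.binary.version}`, `${project.version}`) are assumptions based on common Spark build conventions.

```xml
<!-- Hypothetical sketch: exclude spark-connect-shims from the sql-api
     test-jar dependency in sql/core/pom.xml so the stub classes never
     reach the test classpath. Property names are assumed, not verbatim. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql-api_${scala.binary.version}</artifactId>
  <version>${project.version}</version>
  <type>test-jar</type>
  <scope>test</scope>
  <exclusions>
    <exclusion>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-connect-shims_${scala.binary.version}</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Re-running `mvn dependency:tree -pl sql/core` afterwards should show the `spark-connect-shims` entry gone from the `spark-sql-api` subtree.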

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Pass Maven tests on GitHub Actions: https://github.com/LuciferYang/spark/runs/31314342332

<img width="996" alt="image" src="https://github.com/user-attachments/assets/6d2707f9-0d58-4c80-af27-cccdfa899e87">

All Maven tests passed.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#48403 from LuciferYang/test-maven-build.

Lead-authored-by: YangJie <yangjie01@baidu.com>
Co-authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: yangjie01 <yangjie01@baidu.com>