[SPARK-30601][BUILD] Add a Google Maven Central as a primary repository #27307

Closed
wants to merge 4 commits into from
8 changes: 0 additions & 8 deletions .github/workflows/master.yml
@@ -66,14 +66,6 @@ jobs:
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=1g -Dorg.slf4j.simpleLogger.defaultLogLevel=WARN"
export MAVEN_CLI_OPTS="--no-transfer-progress"
mkdir -p ~/.m2
# `Maven Central` is too flaky in terms of downloading artifacts in `GitHub Action` environment.
# `Google Maven Central Mirror` is too slow in terms of syncing upstream. To get the best combination,
# 1) we set `Google Maven Central` as a mirror of `central` in `GitHub Action` environment only.
# 2) we duplicate `Maven Central` in pom.xml with ID `central_without_mirror`.
# In other words, in GitHub Action environment, `central` is mirrored by `Google Maven Central` first.
# If `Google Maven Central` doesn't provide the artifact due to its slowness, `central_without_mirror` will be used.
# Note that we aim to achieve the above while keeping the existing behavior of non-`GitHub Action` environment unchanged.
echo "<settings><mirrors><mirror><id>google-maven-central</id><name>GCS Maven Central mirror</name><url>https://maven-central.storage-download.googleapis.com/repos/central/data/</url><mirrorOf>central</mirrorOf></mirror></mirrors></settings>" > ~/.m2/settings.xml
./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pmesos -Pkubernetes -Phive -P${{ matrix.hive }} -Phive-thriftserver -P${{ matrix.hadoop }} -Phadoop-cloud -Djava.version=${{ matrix.java }} install
rm -rf ~/.m2/repository/org/apache/spark
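
For readers of the removed workflow step, the one-line `echo` above is easier to follow when the same `settings.xml` is written out with a heredoc. This is only a sketch: the XML content is copied from the `echo` line, but the scratch directory and the `m2_dir` / `settings_file` variable names are hypothetical, chosen so the example does not overwrite a real `~/.m2/settings.xml`.

```shell
#!/bin/sh
# Sketch only: write the mirror settings to a scratch directory (hypothetical)
# instead of ~/.m2, so an existing user settings file is left untouched.
m2_dir="$(mktemp -d)"
settings_file="$m2_dir/settings.xml"

# Same content as the one-line echo in the workflow step, pretty-printed.
# <mirrorOf>central</mirrorOf> makes requests for the repository with
# ID `central` go to the Google mirror URL instead.
cat > "$settings_file" <<'EOF'
<settings>
  <mirrors>
    <mirror>
      <id>google-maven-central</id>
      <name>GCS Maven Central mirror</name>
      <url>https://maven-central.storage-download.googleapis.com/repos/central/data/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>
  </mirrors>
</settings>
EOF

echo "wrote $settings_file"
```

To try such a file without touching `~/.m2`, Maven's `-s` (`--settings`) option should accept the path, assuming `./build/mvn` forwards its arguments to Maven unchanged.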

32 changes: 24 additions & 8 deletions pom.xml
@@ -246,10 +246,13 @@
</properties>
<repositories>
<repository>
<id>central</id>
<!-- This should be at top, it makes maven try the central repo first and then others and hence faster dep resolution -->
<name>Maven Repository</name>
<url>https://repo.maven.apache.org/maven2</url>
<id>gcs-maven-central-mirror</id>
<!--
Google Mirror of Maven Central, placed first so that it's used instead of flaky Maven Central.
See https://storage-download.googleapis.com/maven-central/index.html
-->
<name>GCS Maven Central mirror</name>
<url>https://maven-central.storage-download.googleapis.com/repos/central/data/</url>
<releases>
<enabled>true</enabled>
</releases>
@@ -258,12 +261,10 @@
</snapshots>
</repository>
<repository>
<id>central_without_mirror</id>
<!--
This is used as a fallback when a mirror to `central` fails.
For example, when we use Google Maven Central in GitHub Action as a mirror of `central`,
this will be used when Google Maven Central is out of sync due to its late sync cycle.
This is used as a fallback when the first try fails.
-->
<id>central</id>
<name>Maven Repository</name>
<url>https://repo.maven.apache.org/maven2</url>
<releases>
@@ -275,6 +276,21 @@
</repository>
</repositories>
<pluginRepositories>
<pluginRepository>
[Review comment, Member] Yep. +1 for piggybacking this. I also agree that this is required to unblock #27279.

<id>gcs-maven-central-mirror</id>
<!--
Google Mirror of Maven Central, placed first so that it's used instead of flaky Maven Central.
See https://storage-download.googleapis.com/maven-central/index.html
-->
<name>GCS Maven Central mirror</name>
<url>https://maven-central.storage-download.googleapis.com/repos/central/data/</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</pluginRepository>
<pluginRepository>
<id>central</id>
<url>https://repo.maven.apache.org/maven2</url>
3 changes: 3 additions & 0 deletions project/SparkBuild.scala
@@ -224,6 +224,9 @@ object SparkBuild extends PomBuild {

// Override SBT's default resolvers:
resolvers := Seq(
// Google Mirror of Maven Central, placed first so that it's used instead of flaky Maven Central.
// See https://storage-download.googleapis.com/maven-central/index.html for more info.
"gcs-maven-central-mirror" at "https://maven-central.storage-download.googleapis.com/repos/central/data/",
[Review comment, Member Author] I added the repo to the SBT side as well to match (and fixed some comments).

DefaultMavenRepository,
Resolver.mavenLocal,
Resolver.file("local", file(Path.userHome.absolutePath + "/.ivy2/local"))(Resolver.ivyStylePatterns)