Commit 552cbaf

dongjoon-hyun authored and ericm-db committed
[SPARK-44914][BUILD] Upgrade Apache Ivy to 2.5.2
### What changes were proposed in this pull request?

This PR aims to upgrade Apache Ivy to 2.5.2 and to protect old Ivy-based systems, such as old Spark versions, from Apache Ivy 2.5.2's incompatibility by introducing a new `.ivy2.5.2` directory.

- Apache Spark 4.0.0 will create this directory once and reuse it, while all other systems, such as old Spark versions, keep using the old one, `.ivy2`. So the behavior is the same as when Apache Spark 4.0.0 is installed and used on a new machine.
- In environments with user-provided Ivy paths, users might still hit the incompatibility. However, they can mitigate it because they already have full control over those Ivy paths.

### Why are the changes needed?

This was tried once and logically reverted due to Java 11 and Java 17 failures in the daily CIs.

- apache#42613
- apache#42668

The PR Builder also fails as of now. If this PR passes the CIs, we can achieve the following.

- [Release notes](https://lists.apache.org/thread/9gcz4xrsn8c7o9gb377xfzvkb8jltffr)
- FIX: CVE-2022-46751: Apache Ivy Is Vulnerable to XML External Entity Injections

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs including `HiveExternalCatalogVersionsSuite`.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#45075 from dongjoon-hyun/SPARK-44914.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
1 parent 25bc625 commit 552cbaf

7 files changed: 24 additions and 12 deletions

common/utils/src/main/scala/org/apache/spark/util/MavenUtils.scala

Lines changed: 14 additions & 3 deletions
@@ -324,6 +324,14 @@ private[spark] object MavenUtils extends Logging {
     val ivySettings: IvySettings = new IvySettings
     try {
       ivySettings.load(file)
+      if (ivySettings.getDefaultIvyUserDir == null && ivySettings.getDefaultCache == null) {
+        // To protect old Ivy-based systems like old Spark from Apache Ivy 2.5.2's incompatibility.
+        // `processIvyPathArg` can overwrite these later.
+        val alternateIvyDir = System.getProperty("ivy.home",
+          System.getProperty("user.home") + File.separator + ".ivy2.5.2")
+        ivySettings.setDefaultIvyUserDir(new File(alternateIvyDir))
+        ivySettings.setDefaultCache(new File(alternateIvyDir, "cache"))
+      }
     } catch {
       case e @ (_: IOException | _: ParseException) =>
         throw new SparkException(s"Failed when loading Ivy settings from $settingsFile", e)
@@ -335,10 +343,13 @@ private[spark] object MavenUtils extends Logging {

   /* Set ivy settings for location of cache, if option is supplied */
   private def processIvyPathArg(ivySettings: IvySettings, ivyPath: Option[String]): Unit = {
-    ivyPath.filterNot(_.trim.isEmpty).foreach { alternateIvyDir =>
-      ivySettings.setDefaultIvyUserDir(new File(alternateIvyDir))
-      ivySettings.setDefaultCache(new File(alternateIvyDir, "cache"))
+    val alternateIvyDir = ivyPath.filterNot(_.trim.isEmpty).getOrElse {
+      // To protect old Ivy-based systems like old Spark from Apache Ivy 2.5.2's incompatibility.
+      System.getProperty("ivy.home",
+        System.getProperty("user.home") + File.separator + ".ivy2.5.2")
     }
+    ivySettings.setDefaultIvyUserDir(new File(alternateIvyDir))
+    ivySettings.setDefaultCache(new File(alternateIvyDir, "cache"))
   }

   /* Add any optional additional remote repositories */
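
The lookup order that `processIvyPathArg` applies after this change can be read as: a non-blank user-provided path first, then the `ivy.home` system property, then `.ivy2.5.2` under the user's home directory. The following is a minimal Python sketch of that order, for illustration only. `resolve_ivy_dir`, `cache_dir`, and their parameters are hypothetical names that do not exist in Spark, and the `props` dict merely stands in for JVM system properties.

```python
import os


def resolve_ivy_dir(ivy_path, props):
    """Mirror the lookup order in the hunk above (illustrative, not Spark's API).

    ivy_path: the user-provided Ivy path, or None (stands in for Option[String]).
    props: a dict standing in for JVM system properties.
    """
    # A blank or whitespace-only path is treated as "not provided".
    if ivy_path is not None and ivy_path.strip():
        return ivy_path
    # Otherwise prefer `ivy.home` when it is set ...
    if "ivy.home" in props:
        return props["ivy.home"]
    # ... and fall back to <user.home>/.ivy2.5.2.
    return props["user.home"] + os.sep + ".ivy2.5.2"


def cache_dir(ivy_dir):
    # The default cache is the `cache` subdirectory of the Ivy user directory.
    return os.path.join(ivy_dir, "cache")
```

Under these assumptions, an explicit path always wins, so environments that already pass their own Ivy path are unaffected by the new default.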

common/utils/src/test/scala/org/apache/spark/util/IvyTestUtils.scala

Lines changed: 2 additions & 1 deletion
@@ -374,7 +374,8 @@ private[spark] object IvyTestUtils {
       f(repo.toURI.toString)
     } finally {
       // Clean up
-      if (repo.toString.contains(".m2") || repo.toString.contains(".ivy2")) {
+      if (repo.toString.contains(".m2") || repo.toString.contains(".ivy2") ||
+          repo.toString.contains(".ivy2.5.2")) {
         val groupDir = getBaseGroupDirectory(artifact, useIvyLayout)
         FileUtils.deleteDirectory(new File(repo, groupDir + File.separator + artifact.artifactId))
         deps.foreach { _.foreach { dep =>

core/src/main/scala/org/apache/spark/internal/config/package.scala

Lines changed: 2 additions & 2 deletions
@@ -2491,10 +2491,10 @@ package object config {
       .doc("Path to specify the Ivy user directory, used for the local Ivy cache and " +
         "package files from spark.jars.packages. " +
         "This will override the Ivy property ivy.default.ivy.user.dir " +
-        "which defaults to ~/.ivy2.")
+        "which defaults to ~/.ivy2.5.2")
       .version("1.3.0")
       .stringConf
-      .createOptional
+      .createWithDefault("~/.ivy2.5.2")

   private[spark] val JAR_IVY_SETTING_PATH =
     ConfigBuilder(MavenUtils.JAR_IVY_SETTING_PATH_KEY)

dev/deps/spark-deps-hadoop-3-hive-2.3

Lines changed: 1 addition & 1 deletion
@@ -102,7 +102,7 @@ httpcore/4.4.16//httpcore-4.4.16.jar
 icu4j/72.1//icu4j-72.1.jar
 ini4j/0.5.4//ini4j-0.5.4.jar
 istack-commons-runtime/3.0.8//istack-commons-runtime-3.0.8.jar
-ivy/2.5.1//ivy-2.5.1.jar
+ivy/2.5.2//ivy-2.5.2.jar
 jackson-annotations/2.16.1//jackson-annotations-2.16.1.jar
 jackson-core-asl/1.9.13//jackson-core-asl-1.9.13.jar
 jackson-core/2.16.1//jackson-core-2.16.1.jar

dev/run-tests.py

Lines changed: 2 additions & 0 deletions
@@ -478,6 +478,8 @@ def main():
     rm_r(os.path.join(SPARK_HOME, "work"))
     rm_r(os.path.join(USER_HOME, ".ivy2", "local", "org.apache.spark"))
     rm_r(os.path.join(USER_HOME, ".ivy2", "cache", "org.apache.spark"))
+    rm_r(os.path.join(USER_HOME, ".ivy2.5.2", "local", "org.apache.spark"))
+    rm_r(os.path.join(USER_HOME, ".ivy2.5.2", "cache", "org.apache.spark"))

     os.environ["CURRENT_BLOCK"] = str(ERROR_CODES["BLOCK_GENERAL"])

docs/core-migration-guide.md

Lines changed: 2 additions & 0 deletions
@@ -36,6 +36,8 @@ license: |

 - Since Spark 4.0, Spark uses `ReadWriteOncePod` instead of `ReadWriteOnce` access mode in persistence volume claims. To restore the legacy behavior, you can set `spark.kubernetes.legacy.useReadWriteOnceAccessMode` to `true`.

+- Since Spark 4.0, Spark uses `~/.ivy2.5.2` as Ivy user directory by default to isolate the existing systems from Apache Ivy's incompatibility. To restore the legacy behavior, you can set `spark.jars.ivy` to `~/.ivy2`.
+
 ## Upgrading from Core 3.4 to 3.5

 - Since Spark 3.5, `spark.yarn.executor.failuresValidityInterval` is deprecated. Use `spark.executor.failuresValidityInterval` instead.
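
To restore the legacy directory described in the migration note, `spark.jars.ivy` can be passed as a `--conf` option on the command line. Below is a minimal Python sketch that builds those arguments. `legacy_ivy_conf` is a hypothetical helper, not part of Spark, and resolving the home directory up front instead of passing a literal `~` is an assumption made here, because this commit does not show whether `~` is expanded when the value is read.

```python
import os


def legacy_ivy_conf(home=None):
    """Return `--conf` arguments pointing spark.jars.ivy at the legacy `.ivy2` directory.

    home: base directory; defaults to the current user's home directory.
    """
    base = home if home is not None else os.path.expanduser("~")
    return ["--conf", "spark.jars.ivy=" + os.path.join(base, ".ivy2")]


# Example: arguments that could be appended to a spark-submit command line.
print(" ".join(legacy_ivy_conf("/home/alice")))
```

The same key can also be set in a configuration file; the sketch only shows the command-line form.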

pom.xml

Lines changed: 1 addition & 5 deletions
@@ -146,11 +146,7 @@
     <jetty.version>10.0.19</jetty.version>
     <jakartaservlet.version>4.0.3</jakartaservlet.version>
     <chill.version>0.10.0</chill.version>
-    <!--
-      SPARK-44968: don't upgrade Ivy to version 2.5.2 until the test aborted of
-      `HiveExternalCatalogVersionsSuite` in Java 11/17 daily tests is resolved.
-    -->
-    <ivy.version>2.5.1</ivy.version>
+    <ivy.version>2.5.2</ivy.version>
     <oro.version>2.0.8</oro.version>
     <!--
       If you change codahale.metrics.version, you also need to change
