Skip to content

Test Ivy 2.5.2 #44477

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from
Closed

Test Ivy 2.5.2 #44477

wants to merge 11 commits into from

Conversation

LuciferYang
Copy link
Contributor

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

@github-actions github-actions bot removed the SQL label Dec 25, 2023
@LuciferYang LuciferYang marked this pull request as draft December 25, 2023 06:27
@github-actions github-actions bot added the SQL label Dec 25, 2023
@github-actions github-actions bot removed the SQL label Dec 25, 2023
@LuciferYang
Copy link
Contributor Author

When master use ivy 2.5.2 and the test targets use ivy 2.5.1:

[info]   : java.lang.RuntimeException: problem during retrieve of org.apache.spark#spark-submit-parent-8bd00540-3ae3-45c0-b8cb-adf54c547a85: java.lang.RuntimeException: Multiple artifacts of the module log4j#log4j;1.2.17 are retrieved to the same file! Update the retrieve pattern to fix this error.
[info]   	at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:238)

@github-actions github-actions bot added the SQL label Dec 25, 2023
@@ -211,6 +213,10 @@ class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils {
tryDownloadSpark(version, sparkTestingDir.getCanonicalPath)
}

Files.deleteIfExists(Paths.get(sparkHome.getCanonicalPath, "jars", "ivy-2.5.1.jar"))
val ivyUrl = new URL("https://repo1.maven.org/maven2/org/apache/ivy/ivy/2.5.2/ivy-2.5.2.jar")
IOUtils.copy(ivyUrl, new File(s"${sparkHome.getCanonicalPath}/jars", "ivy-2.5.2.jar"))
Copy link
Contributor Author

@LuciferYang LuciferYang Dec 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been investigating the case of the previous Ivy upgrade failure recently. I found that if the Spark version being tested in HiveExternalCatalogVersionsSuite also uses Ivy 2.5.2, the error like #44477 (comment) will not occur again... But I haven't come up with a better solution yet.

cc @bjornjorgensen @dongjoon-hyun

@@ -220,6 +221,7 @@ class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils {
"--conf", s"${MASTER_REST_SERVER_ENABLED.key}=false",
"--conf", s"${HiveUtils.HIVE_METASTORE_VERSION.key}=$hiveMetastoreVersion",
"--conf", s"${HiveUtils.HIVE_METASTORE_JARS.key}=maven",
"--conf", s"${JAR_IVY_REPO_PATH.key}=${ivyTestDir.getCanonicalPath}",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try isolating the Ivy local repo paths for ivy-2.5.1 and 2.5.2 to see if it works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still failed

This reverts commit 15e9159.
@dongjoon-hyun
Copy link
Member

Thank you for the investigation, @LuciferYang .

@github-actions github-actions bot added the CORE label Dec 26, 2023
@github-actions github-actions bot removed the CORE label Dec 26, 2023
@LuciferYang
Copy link
Contributor Author

LuciferYang commented Dec 27, 2023

@dongjoon-hyun @bjornjorgensen Synchronization:

  1. The test failure appears to be due to incompatibility between the metadata created by Ivy versions 2.5.1 and 2.5.2 in the ivy.cache.dir. This can be resolved by using a separate ivy.cache.dir for the older versions of Spark being tested. The current pull request has already been tested successfully.

  2. However, I think this issue is not just limited to testing. I conducted additional functional tests on my local machine as follows:

  • Prepared a Spark client using Ivy 2.5.2 with version 4.0.0-SNAPSHOT and another Spark 3.x version, such as Spark 3.4.2, using Ivy 2.5.1.
  • After clearing the ~/.ivy2 directory, submitted jobs in the order of 3.4.2, 4.0.0-SNAPSHOT with the config spark.sql.hive.metastore.jars=maven. Both versions were able to run successfully.
  • After clearing the ~/.ivy2 directory again, submitted jobs in the order of 4.0.0-SNAPSHOT, 3.4.2 with the config spark.sql.hive.metastore.jars=maven. The execution of Spark 3.4.2 failed with the following error:
[info]   : java.lang.RuntimeException: problem during retrieve of org.apache.spark#spark-submit-parent-8bd00540-3ae3-45c0-b8cb-adf54c547a85: java.lang.RuntimeException: Multiple artifacts of the module log4j#log4j;1.2.17 are retrieved to the same file! Update the retrieve pattern to fix this error.
[info]    at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:238)`

Therefore, it seems that upgrading to Ivy 2.5.2 can cause issues for end users. If a client with Ivy 2.5.2 is used first and then a client with Ivy 2.5.1 is intended to be used again, it is necessary to specify a different ivy.cache.dir or manually clear the ivy.cache.dir updated by Ivy 2.5.2 in order to use it again properly.

@LuciferYang
Copy link
Contributor Author

Apart from backporting the upgrade to branch-3.4 and branch-3.5, I can't think of a better way to reduce this compatibility impact now. So, shall we skip the upgrade to Ivy 2.5.2? Although there is a CVE issue with a score of 8.2 before Ivy 2.5.1 (CVE-2022-46751), it doesn't seem to have a significant impact on Apache Spark.

@bjornjorgensen
Copy link
Contributor

Although there is a CVE issue with a score of 8.2 before Ivy 2.5.1

it's Ivy 2.5.2 not 2.5.1

@LuciferYang
Copy link
Contributor Author

Although there is a CVE issue with a score of 8.2 before Ivy 2.5.1

it's Ivy 2.5.2 not 2.5.1

Yes, you are right

@bjornjorgensen
Copy link
Contributor

So the main issue here is that spark have a to old hive version? Are we going to update that to a newer version for spark 4.0?

"Although there is a CVE issue with a score of 8.2 before Ivy 2.5.1 (GHSA-2jc4-r94c-rp7h), it doesn't seem to have a significant impact on Apache Spark."

"SECURITY-2924 / CVE-2022-46751
Ivy Plugin 2.5 and earlier bundles versions of Apache Ivy vulnerable to
CVE-2022-46751.

This allows attackers able to control the input file for the "Trigger the
build of other projects based on the Ivy dependency management system"
post-build step to have Jenkins parse a crafted XML document that uses
external entities for extraction of secrets from the Jenkins controller or
server-side request forgery."
https://www.openwall.com/lists/oss-security/2023/09/06/9

@LuciferYang
Copy link
Contributor Author

So the main issue here is that spark have a to old hive version? Are we going to update that to a newer version for spark 4.0?

No, the main issue is incompatibility between the metadata created by Ivy versions 2.5.1 and 2.5.2

@LuciferYang
Copy link
Contributor Author

"SECURITY-2924 / GHSA-2jc4-r94c-rp7h
Ivy Plugin 2.5 and earlier bundles versions of Apache Ivy vulnerable to
GHSA-2jc4-r94c-rp7h.

This allows attackers able to control the input file for the "Trigger the
build of other projects based on the Ivy dependency management system"
post-build step to have Jenkins parse a crafted XML document that uses
external entities for extraction of secrets from the Jenkins controller or
server-side request forgery."
https://www.openwall.com/lists/oss-security/2023/09/06/9

For Spark, what specific harm will it suffer if it is not upgraded? Since I don't have a clear understanding of the specific results, I choose a way that does not break compatibility now. Due to my personal knowledge limitations, this decision may be wrong. If upgrading is necessary, could you submit a PR to fix it and explain in detail the specific harm to Spark if it is not upgraded? @bjornjorgensen thanks ~

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants