
[SPARK-55136][BUILD] Bump Db2 JDBC driver 12.1.3.0_special_74723 #53920

Closed
pan3793 wants to merge 1 commit into apache:master from pan3793:SPARK-55136

Conversation

@pan3793
Member

@pan3793 pan3793 commented Jan 22, 2026

What changes were proposed in this pull request?

Previously, lz4-java was upgraded to 1.10.1 in Spark to address CVEs.

This PR upgrades the Db2 JDBC driver to a special version, 12.1.3.0_special_74723, provided by the Db2 team, which also bundles lz4-java 1.10.1.

Why are the changes needed?

To address a NoSuchMethodError issue (test only); see the whole story at #53454.

Does this PR introduce any user-facing change?

No, the Db2 JDBC driver is only used in testing.

How was this patch tested?

See #53454

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions

JIRA Issue Information

=== Dependency upgrade SPARK-55136 ===
Summary: Bump Db2 JDBC driver to 12.1.3.0_special_74723
Assignee: None
Status: Open
Affected: ["4.2.0"]


This comment was automatically generated by GitHub Actions

LuciferYang previously approved these changes Jan 23, 2026
@LuciferYang LuciferYang dismissed their stale review January 23, 2026 03:33

I've found that the new jar still only shades the lz4-java without performing relocation. In this case, when the version of lz4-java used by Spark differs from that used by DB2, there's still a possibility of classpath contamination. It seems that this hasn't addressed the root cause of the problem...

@pan3793
Member Author

pan3793 commented Jan 25, 2026

... the new jar still only shades the lz4-java without performing relocation.

Unfortunately, that's true. I also advised the author to relocate the lz4-java classes, or to provide a vanilla Db2 JDBC driver without bundled 3rd-party libs, but have not received an acknowledgment.

In this case, when the version of lz4-java used by Spark differs from that used by DB2, there's still a possibility of classpath contamination. It seems that this hasn't addressed the root cause of the problem...

Yes, it only mitigates the current specific lz4-java conflict. Still, it does not make things worse; the final solution depends on the artifacts provided by the author, and we don't have many options here.
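For reference, the relocation being requested here is the standard Maven Shade Plugin technique. A minimal sketch (the shaded package name is illustrative, not the Db2 team's actual configuration) could look like:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <relocation>
        <!-- lz4-java's package -->
        <pattern>net.jpountz</pattern>
        <!-- rewritten into a driver-private namespace (name is hypothetical) -->
        <shadedPattern>com.ibm.db2.shaded.net.jpountz</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```

With such a relocation in place, the driver's bundled lz4-java classes live in a private package, so they can never shadow the copy of lz4-java on Spark's classpath.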

@LuciferYang
Contributor

Currently, Spark is using lz4-java 1.10.3. Is it necessary for DB2 to release a new version?

@pan3793
Member Author

pan3793 commented Jan 27, 2026

From the compatibility aspect, it seems unnecessary; we can simply upgrade when new versions become available.

As I said before, we don't have many options here except waiting for the author to fix the relocation issue or deploy a vanilla jar. I think we should move forward: by default (I mean in the official binary release), the Db2 JDBC driver is only used for testing, so it only affects devs, and the issue only happens if a user puts this jar into their SPARK_HOME/jars.

Additionally, the Db2 JDBC driver is not the only jar with shading issues. When I checked Spark's dependencies, I found that io.get-coursier:interface and net.snowflake:snowflake-jdbc also bundle zstd classes without proper relocation, which might introduce the same potential class conflict issues.
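Jars that bundle well-known library packages without relocation can be detected mechanically. As a minimal sketch in Python (the helper name and prefix list are my own, assuming only the standard library):

```python
import zipfile

# Packages whose presence in a third-party jar indicates bundled
# (possibly unrelocated) copies of common compression libraries.
SUSPECT_PREFIXES = (
    "net/jpountz/",       # lz4-java
    "com/github/luben/",  # zstd-jni
)

def find_unrelocated(jar_path, prefixes=SUSPECT_PREFIXES):
    """Return .class entries in the jar that live under well-known library packages."""
    prefixes = tuple(prefixes)
    with zipfile.ZipFile(jar_path) as jar:
        return [name for name in jar.namelist()
                if name.startswith(prefixes) and name.endswith(".class")]
```

Running this against a driver jar would list entries such as net/jpountz/lz4/LZ4Factory.class when the library is bundled verbatim; a relocated build would show nothing under the original package names.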

@pan3793
Member Author

pan3793 commented Jan 28, 2026

cc @dbtsai, do you have ideas here?

@pan3793
Member Author

pan3793 commented Feb 12, 2026

@viirya, this is the actual blocker for us to move forward on mitigating the lz4 perf regression.

@pan3793 pan3793 requested a review from viirya February 12, 2026 06:49
Member

@dongjoon-hyun dongjoon-hyun left a comment


To reviewers:

If the new jar file still has a dependency issue, I'd like to propose disabling DB2 test coverage itself for a while, with an IDed TODO, in order to unblock @pan3793's PR.

@LuciferYang
Contributor

@pan3793 , could you please forward the email communication with the maintainer of the Db2 JDBC driver to @dongjoon-hyun ? If I remember correctly, it seems that with each upgrade, we need to upgrade to a corresponding version of the Db2 JDBC driver to ensure that the expected lz4 code is used in the relevant tests, right? Please correct me if I've misunderstood.

@pan3793
Member Author

pan3793 commented Feb 26, 2026

@LuciferYang, I forwarded the original mail to both you and @dongjoon-hyun.

with each upgrade, we need to upgrade to a corresponding version of the Db2 JDBC driver to ensure that the expected lz4 code is used in the relevant tests, right?

This is ideal, but it totally depends on the Db2 JDBC driver's author; I don't think they can deliver an up-to-date release in time.

@dongjoon-hyun
Member

Thank you for sharing.

AFAIK, DB2 is simply a test dependency (for test coverage). The Apache Spark distribution has no documentation, assumptions, or requirements for JDBC drivers so far.

spark/pom.xml

Lines 1360 to 1365 in 1d56813

<dependency>
  <groupId>com.ibm.db2</groupId>
  <artifactId>jcc</artifactId>
  <version>${db2.jcc.version}</version>
  <scope>test</scope>
</dependency>

That's why I suggested simply disabling Db2 test coverage for a while.

We don't want to give an extra recommendation, like using 12.1.3.0_special_74723, which may contaminate Spark's classpath. IBM may want to give a recommendation from their side.

From Apache Spark side, I believe the following is more important to achieve.

@LuciferYang
Contributor


+1 for simply disabling Db2 test coverage for a while

@dongjoon-hyun
Member

Let me make a PR for that independently since my suggestion was a new one.

@dongjoon-hyun
Member

This is what I suggested.

@pan3793
Member Author

pan3793 commented Feb 26, 2026

Closing in favor of SPARK-55706 (#54505).

@pan3793 pan3793 closed this Feb 26, 2026