Conversation

@TongWei1105 (Contributor) commented May 28, 2025

What changes were proposed in this pull request?

This PR fixes a bug where submitting a Spark job with the --files option and then calling SparkContext.addFile() for a file with the same name causes Spark to throw an exception:
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: File a.text was already registered with a different path (old path = /tmp/spark-6aa5129d-5bbb-464a-9e50-5b6ffe364ffb/a.text, new path = /opt/spark/work-dir/a.text)

Why are the changes needed?

  1. Submit a Spark application using spark-submit with the --files option:
    bin/spark-submit --files s3://bucket/a.text --class testDemo app.jar
  2. In the testDemo application code, call:
    sc.addFile("a.text", true)
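
For reference, a minimal sketch of what such a testDemo application might look like (illustrative only; the body of the class is an assumption based on the repro steps above):

import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

object testDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("testDemo"))
    // "a.text" was already distributed via --files; registering it again
    // by name is what triggers the duplicate-path check described below.
    sc.addFile("a.text", true)
    println(SparkFiles.get("a.text"))
    sc.stop()
  }
}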

This works correctly in YARN mode, but throws an error in Kubernetes mode.
After SPARK-33782, in Kubernetes mode, --files, --jars, --archiveFiles, and --pyFiles are all downloaded to the working directory.

However, in the code, args.files is set to filesLocalFiles, and filesLocalFiles refers to the temporary download path, not the working directory.
This causes issues when user code like testDemo calls sc.addFile("a.text", true), resulting in an error such as:
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: File a.text was already registered with a different path (old path = /tmp/spark-6aa5129d-5bbb-464a-9e50-5b6ffe364ffb/a.text, new path = /opt/spark/work-dir/a.text)
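
Conceptually, the fix is to record the working-directory copy, not the temporary download, as each resource's location once it has been localized. A minimal sketch of that idea (the localizedUri wrapper and its parameter names are illustrative, though the copy-then-return-dest shape mirrors the diff discussed below):

import java.io.File
import java.nio.file.Files

// source: the temporary download under /tmp/spark-.../
// dest:   the copy placed in the container working directory
def localizedUri(source: File, dest: File): java.net.URI = {
  Files.copy(source.toPath, dest.toPath)
  // Return the working-directory URI so that args.files points at the same
  // path a later sc.addFile("a.text") will resolve to.
  dest.toURI
}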

Does this PR introduce any user-facing change?

Yes. After this PR, the scenario above no longer throws an exception in Kubernetes mode.

How was this patch tested?

Existing unit tests, plus a new unit test added during review.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions bot added the CORE label May 28, 2025
@TongWei1105 (Contributor, Author) commented:

Hi @dongjoon-hyun,
Could you please review this PR when you have time? Many thanks!

@TongWei1105 changed the title [SPARK-52334][CORE][K8S] update all files, jars, archiveFiles, and pyFiles to reference the working directory after they are downloaded → [SPARK-52334][CORE][K8S] update all files, jars, and pyFiles to reference the working directory after they are downloaded May 28, 2025
@dongjoon-hyun (Member) left a comment:

Do you think you can write a test case, @TongWei1105?

@TongWei1105 (Contributor, Author) commented May 29, 2025

Do you think you can write a test case, @TongWei1105?

@dongjoon-hyun Thanks for the suggestion! I've added a unit test; please feel free to review it when convenient. Appreciate your feedback!

@TongWei1105 requested a review from dongjoon-hyun May 29, 2025 08:54
@mridulm (Contributor) commented Jun 2, 2025:

Can you update this test to handle archives as well?

Contributor Author

Can you update this test to handle archive as well ?

Done

Contributor

log4j2.properties is not an archive file, and so it ends up getting copied to the destination (a variant of the existing cases).
I am trying to ensure that if (isArchive) { works as expected when the file actually gets unpacked.

Contributor Author

log4j2.properties is not an archive file, and so it ends up getting copied to the destination (a variant of the existing cases). I am trying to ensure that if (isArchive) { works as expected when the file actually gets unpacked.

The isArchive logic does get triggered in this case — the test has been updated to cover that scenario accordingly.
Thank you for your suggestion.
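
For context, the branch under discussion has roughly this shape (a simplified sketch, not the verbatim Spark source; the localize wrapper and its parameters are illustrative):

import java.io.File
import java.nio.file.Files
import org.apache.spark.util.Utils

def localize(source: File, dest: File, isArchive: Boolean): java.net.URI = {
  if (isArchive) {
    // Archives are unpacked into the working directory, so the test needs
    // a real archive (e.g. a zip) to exercise this branch; a plain file
    // like log4j2.properties only hits the copy path below.
    Utils.unpack(source, dest)
  } else {
    Files.copy(source.toPath, dest.toPath)
  }
  dest.toURI
}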

@TongWei1105 requested a review from mridulm June 3, 2025 04:54
@TongWei1105 force-pushed the SPARK-52334 branch 5 times, most recently from 818ff74 to 4195283 on June 4, 2025 10:19
@TongWei1105 (Contributor, Author) commented:

When you have a moment, could you please take another look at this PR? Thanks!
@dongjoon-hyun @mridulm

@TongWei1105 closed this Jun 9, 2025
@mridulm reopened this Jun 21, 2025
@mridulm (Contributor) commented Jun 21, 2025

Sorry for the delay, @TongWei1105. I assume you closed the PR due to lack of traction; if so, I am reopening it.
The change looks reasonable to me.
I would prefer that someone more familiar with K8s also takes a look.

+CC @dongjoon-hyun, @HyukjinKwon

localResources
} else {
  Files.copy(source.toPath, dest.toPath)
  dest.toURI
Member

Shouldn't it be source.toURI?

Member

Using dest.toURI, we are picking up the file names from the current working directory. I mean, yeah, this should also work, but I wonder why we should do it.

Member

Can you explain how this fixes the issue?

Contributor Author

Can you explain how this fixes the issue?

Thank you for your reply.
In Kubernetes mode, when using --files or --jars, Spark first stores a copy under the local /tmp directory and also copies it to /opt/spark/work-dir/. When addFile(file) is then called inside the SparkContext, the file path resolves to /opt/spark/work-dir/file. In NettyStreamManager, however, the file entry is still recorded under the original /tmp path, so the guard against duplicate file registrations sees a mismatch and throws the exception above.

Therefore, I believe that in Kubernetes mode, these paths should be unified to /opt/spark/work-dir/.
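
For reference, the duplicate-registration guard in NettyStreamManager has roughly this shape (a paraphrased sketch, not the verbatim source):

import java.io.File
import java.util.concurrent.ConcurrentHashMap

// Paraphrased shape of NettyStreamManager's file registry.
val files = new ConcurrentHashMap[String, File]()

def addFile(file: File): Unit = {
  val existing = files.putIfAbsent(file.getName, file)
  // If the same name arrives with a different path (the /tmp download vs.
  // the /opt/spark/work-dir copy), this requirement fails with the
  // exception quoted in the PR description.
  require(existing == null || existing == file,
    s"File ${file.getName} was already registered with a different path " +
      s"(old path = $existing, new path = $file)")
}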

@HyukjinKwon (Member) left a comment:

OK, I think it looks fine to me, but I will leave this to @dongjoon-hyun, who uses K8S in production more.

@TongWei1105 (Contributor, Author) commented:

@dongjoon-hyun, when you're free, could you help review this?

@dongjoon-hyun (Member) commented:

Sorry, guys, for missing the ping here. I can test this today.

@dongjoon-hyun self-assigned this Dec 5, 2025
@dongjoon-hyun (Member) left a comment:

The main body itself looks good to me.

Please try to avoid adding a new binary file like archive1.zip. You can create one simply, like the following, @TongWei1105.

val zipFile1 = File.createTempFile("test1_", ".zip", dir)
TestUtils.createJar(Seq(text1, json1), zipFile1)
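
(This works because a jar is itself a zip archive, so TestUtils.createJar yields a file the archive-unpacking path accepts, without checking a binary fixture into the repository.)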

Commits: "…rking directory after they are downloaded." · "fix" · "add ut" · "add ut for archives" · "fix"
@TongWei1105 (Contributor, Author) commented:

The main body itself looks good to me.

Please try to avoid adding a new binary file like archive1.zip. You can create one simply, like the following, @TongWei1105.

val zipFile1 = File.createTempFile("test1_", ".zip", dir)
TestUtils.createJar(Seq(text1, json1), zipFile1)

@dongjoon-hyun done

@dongjoon-hyun (Member) left a comment:

+1, LGTM. Thank you, @TongWei1105 and all.

Merged to master/4.1 for Apache Spark 4.1.0.

dongjoon-hyun pushed a commit that referenced this pull request Dec 5, 2025
…ence the working directory after they are downloaded

Closes #51037 from TongWei1105/SPARK-52334.

Authored-by: TongWei1105 <vvtwow@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit dd418e3)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun removed their assignment Dec 5, 2025