@pull pull bot commented Dec 5, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

TongWei1105 and others added 2 commits December 5, 2025 09:25
…ence the working directory after they are downloaded

### What changes were proposed in this pull request?

This PR fixes a bug where submitting a Spark job with the `--files` option and then calling `SparkContext.addFile()` for a file with the same name causes Spark to throw an exception:
`Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: File a.text was already registered with a different path (old path = /tmp/spark-6aa5129d-5bbb-464a-9e50-5b6ffe364ffb/a.text, new path = /opt/spark/work-dir/a.text`

### Why are the changes needed?

1. Submit a Spark application using spark-submit with the --files option:
`bin/spark-submit --files s3://bucket/a.text --class testDemo app.jar`
2. In the testDemo application code, call:
`sc.addFile("a.text", true)`

This works correctly in YARN mode but throws an error in Kubernetes mode.
After [SPARK-33782](https://issues.apache.org/jira/browse/SPARK-33782), in Kubernetes mode, the resources passed via `--files`, `--jars`, `--archives`, and `--py-files` are all downloaded to the working directory.

However, in the code, `args.files = filesLocalFiles`, and `filesLocalFiles` refers to the temporary download path, not the working directory.
As a result, when user code such as testDemo calls `sc.addFile("a.text", true)`, the same file name is registered under two different paths, and Spark fails with:
`Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: File a.text was already registered with a different path (old path = /tmp/spark-6aa5129d-5bbb-464a-9e50-5b6ffe364ffb/a.text, new path = /opt/spark/work-dir/a.text`
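
The conflict can be illustrated with a small Spark-free model. This is a sketch only, not Spark's actual implementation: `FileRegistry`, `inWorkDir`, and `RegistryDemo` are hypothetical names, the temp path is a placeholder, and the second half assumes the fix makes the `--files` entries point at their working-directory copies, as the description above implies.

```scala
import java.io.File
import scala.collection.mutable

// Simplified stand-in for a driver-side file registry (hypothetical, not Spark's class).
final class FileRegistry {
  private val files = mutable.Map.empty[String, File]

  // Registers a file under its base name. A repeat registration is accepted
  // only when it points at the same path as the first one.
  def addFile(file: File): Unit = {
    val existing = files.getOrElseUpdate(file.getName, file)
    // Message text mirrors the exception quoted above.
    require(existing == file,
      s"File ${file.getName} was already registered with a different path " +
        s"(old path = $existing, new path = $file")
  }
}

object RegistryDemo {
  // Hypothetical helper: map a downloaded file to its copy in the working directory.
  def inWorkDir(downloaded: File, workDir: File): File =
    new File(workDir, downloaded.getName)

  def main(args: Array[String]): Unit = {
    val tmp = new File("/tmp/spark-download/a.text") // placeholder temp download path
    val workDir = new File("/opt/spark/work-dir")

    // Before the fix: --files registers the temp path, user code the working-dir path.
    val before = new FileRegistry
    before.addFile(tmp)
    val conflicted =
      try { before.addFile(new File(workDir, "a.text")); false }
      catch { case _: IllegalArgumentException => true }
    println(s"conflict with temp path: $conflicted")

    // Assumed fix: both registrations resolve to the working directory.
    val after = new FileRegistry
    after.addFile(inWorkDir(tmp, workDir))
    after.addFile(new File(workDir, "a.text"))
    println("no conflict when both paths are in the working directory")
  }
}
```

Because the model keys files by base name only, two registrations of `a.text` collide unless they resolve to the same path, which is why pointing the `--files` entry at the working directory removes the error in this sketch.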

### Does this PR introduce _any_ user-facing change?

Yes. After this PR, the scenario described above no longer fails in Kubernetes mode.

### How was this patch tested?

Existing unit tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #51037 from TongWei1105/SPARK-52334.

Authored-by: TongWei1105 <vvtwow@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

### What changes were proposed in this pull request?

This was enabled by default via SPARK-45771, but the docs were not updated to match.

### Why are the changes needed?

Keep docs and code consistent.

### Does this PR introduce _any_ user-facing change?

No, except for correcting the docs.

### How was this patch tested?

Review, as it's a docs-only change.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #53339 from pan3793/SPARK-54606.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@pull pull bot locked and limited conversation to collaborators Dec 5, 2025
@pull pull bot added the ⤵️ pull label Dec 5, 2025
@pull pull bot merged commit ed6c2a8 into huangxiaopingRD:master Dec 5, 2025
2 checks passed