Commit
[SPARK-47495][CORE] Fix primary resource jar added to spark.jars twice under k8s cluster mode

### What changes were proposed in this pull request?

In `SparkSubmit`, for the `isKubernetesClusterModeDriver` code path, stop appending the primary resource to `spark.jars`, so that the primary resource jar is not duplicated in `spark.jars`.

### Why are the changes needed?

#### Context:

To submit a Spark job to Kubernetes in cluster mode, spark-submit is called twice.

The first time, `SparkSubmit` runs in k8s cluster mode. It appends the primary resource to `spark.jars` and calls `KubernetesClientApplication::start` to create a driver pod.

The driver pod then runs spark-submit again with the updated configuration (the same application jar, which is now also listed in `spark.jars`). This time `SparkSubmit` runs in client mode with `spark.kubernetes.submitInDriver` set to `true`. In this mode, all the jars in `spark.jars` are downloaded to the driver and their URLs are replaced by driver-local paths. Afterwards, `SparkSubmit` appends the primary resource to `spark.jars` again. As a result, `spark.jars` contains two paths to the same primary resource: one with the original URL the user submitted, the other with the driver-local file path.

Later, when the driver starts the `SparkContext`, it copies all entries of `spark.jars` to `spark.app.initial.jar.urls` and replaces the driver-local jar paths in `spark.app.initial.jar.urls` with driver file service paths, from which the executors can download those driver-local jars.

#### Issues:

The executor downloads two duplicate copies of the primary resource, one from the original URL the user submitted and one from the driver-local file path, which wastes resources.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Unit test added.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#45607 from leletan/fix_k8s_submit_jar_distribution.
Lead-authored-by: jiale_tan <jiale_tan@apple.com>
Co-authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
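
The two-pass flow described in the commit message can be illustrated with a small toy model. This is only a sketch of the list manipulation, not the actual `SparkSubmit` code: the function names (`first_submit`, `driver_submit`, `localize`), the `append_primary` flag, the example URLs, and the work directory are all hypothetical and chosen for illustration.

```python
from typing import List


def localize(jars: List[str], work_dir: str) -> List[str]:
    """Pretend to download each jar and return its driver-local path."""
    return [f"file:{work_dir}/{url.rsplit('/', 1)[-1]}" for url in jars]


def first_submit(primary: str, jars: List[str]) -> List[str]:
    """Pass 1 (cluster mode): the primary resource is appended to spark.jars."""
    return jars + [primary]


def driver_submit(primary: str, jars: List[str], append_primary: bool) -> List[str]:
    """Pass 2 (client mode inside the driver pod).

    append_primary=True models the behavior before the fix,
    append_primary=False models the behavior after it.
    """
    local = localize(jars, "/opt/spark/work-dir")  # hypothetical directory
    return local + [primary] if append_primary else local


def copies_of(jar_name: str, jars: List[str]) -> int:
    """Count entries that refer to the same jar file name."""
    return sum(1 for url in jars if url.rsplit("/", 1)[-1] == jar_name)


primary = "s3a://bucket/app.jar"  # hypothetical primary resource URL
jars_after_pass1 = first_submit(primary, ["s3a://bucket/dep.jar"])

before_fix = driver_submit(primary, jars_after_pass1, append_primary=True)
after_fix = driver_submit(primary, jars_after_pass1, append_primary=False)

print(before_fix)
print(after_fix)
```

In this model, `before_fix` lists `app.jar` twice (once as a driver-local path, once as the original URL), while `after_fix` lists it once, which mirrors the duplication the commit removes.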