[SPARK-52399][Kubernetes] Support local ephemeral-storage resource #51096
Conversation
```html
<td>
  Specify ephemeral storage <a href="https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#local-ephemeral-storage">request and limit</a> for the driver pod.
</td>
<td>4.1.0</td>
```
Not sure which release I should target (same question for the other doc/config sections).
Thank you for making a PR, @ashangit.
However, since 3.0 Apache Spark has provided a more general way to control this kind of K8s-specific setting via Pod Template, instead of adding configurations one by one. There are many settings like this to control, for example priorityClassName.
Please create and use a pod template like the following, pod.yml:
```yaml
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: pod
    resources:
      limits:
        ephemeral-storage: 10Gi
```
You can use it like this in Apache Spark:

```
-c spark.kubernetes.driver.podTemplateFile=pod.yml
-c spark.kubernetes.executor.podTemplateFile=pod.yml
```
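For completeness, a full spark-submit invocation using the template might look like this sketch; the master URL, container image, and application jar are placeholders, not values from this PR (`-c` is the short form of `--conf`):

```
spark-submit \
  --master k8s://https://<k8s-apiserver-host>:443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=<spark-image> \
  --conf spark.kubernetes.driver.podTemplateFile=pod.yml \
  --conf spark.kubernetes.executor.podTemplateFile=pod.yml \
  local:///opt/spark/examples/jars/spark-examples.jar
```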
Hi, relying on a pod template doesn't seem to work when trying to modify the spark-kubernetes-driver/spark-kubernetes-executor container: Spark adds its generated container config to the template's list of containers, see spark/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala at 80657974ee691a36c5a761629260a65b2fb7fa03 · apache/spark and spark/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala at 80657974ee691a36c5a761629260a65b2fb7fa03 · apache/spark. Because of this, it ends up submitting a pod that contains the spark-kubernetes-* container twice.
I'm interested in your bug reports. Could you file a JIRA issue describing the above, with the procedure you followed, please? FYI, Apache Spark has a pod template integration test here. We can improve the test coverage first.
Hi, found the issue.
Relying on this one works fine:
Great! Thank you for sharing and closing the PR, @ashangit.
What changes were proposed in this pull request?
Add the capability to set the local ephemeral-storage resource on driver and executor pods.
Why are the changes needed?
On Kubernetes, when a node runs low on ephemeral storage, the kubelet evicts pods that use more ephemeral storage than they requested. Since the driver and executor pods do not set any ephemeral-storage request, they can be evicted even when most of the usage comes from other pods.
Being able to set the ephemeral-storage request ensures these pods are not evicted first.
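For reference, this is the shape of an ephemeral-storage request and limit in a plain pod spec, per the Kubernetes documentation linked above (names and sizes are illustrative only):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: storage-demo            # illustrative
spec:
  containers:
  - name: app
    image: busybox
    resources:
      requests:
        ephemeral-storage: 2Gi  # under node disk pressure, pods staying below their
                                # request are the last candidates for eviction
      limits:
        ephemeral-storage: 4Gi  # exceeding the limit gets the pod evicted outright
```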
Does this PR introduce any user-facing change?
Yes, this PR adds two new configs:
spark.kubernetes.driver.request.ephemeral.storage
spark.kubernetes.executor.request.ephemeral.storage
If these configs are not set, Spark jobs work as before, without setting any ephemeral storage.
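As a sketch, assuming these configs accept standard Kubernetes quantity values (e.g. 4Gi) like the other Spark resource-request configs, a submission could look like this; the master URL, image, and jar are placeholders:

```
spark-submit \
  --master k8s://https://<k8s-apiserver-host>:443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=<spark-image> \
  --conf spark.kubernetes.driver.request.ephemeral.storage=4Gi \
  --conf spark.kubernetes.executor.request.ephemeral.storage=4Gi \
  local:///opt/spark/examples/jars/spark-examples.jar
```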
How was this patch tested?
Added some unit tests, and also deployed a Spark job on our Kubernetes cluster using these parameters.
The ephemeral storage was correctly added to the driver and executor pods:
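With the configs above, the effect should be visible when inspecting the pod spec, roughly like this excerpt (container name and value are illustrative, assuming a 4Gi request):

```yaml
# excerpt of `kubectl get pod <driver-pod> -o yaml` (hypothetical)
spec:
  containers:
  - name: spark-kubernetes-driver
    resources:
      requests:
        ephemeral-storage: 4Gi
```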
Was this patch authored or co-authored using generative AI tooling?
No