Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support pod template for Spark 3.x applications #2141

Merged
merged 5 commits into from
Oct 24, 2024

Conversation

ChenYi015
Copy link
Contributor

@ChenYi015 ChenYi015 commented Aug 22, 2024

Purpose of this PR

Close #2101
Close #1690

Proposed changes:

  • Add fields spec.driver.template and spec.executor.template to SparkApplication CRD
  • Webhook will only mutate Spark pods with label sparkoperator.k8s.io/mutated-by-spark-operator="true", and this label will be added by controller as needed.
  • For Spark 2.x applications, the webhook will be used to mutate Spark pods.
  • For Spark 3.x applications, the webhook will be used to mutate Spark driver/executor pods only if driver/executor pod template is not defined.
  • For all spark applications, the webhook will still be used to mutate Spark pods no matter whether pod template is defined or not.
  • Add example spark-pi-pod-template.yaml

Change Category

Indicate the type of change by marking the applicable boxes:

  • Bugfix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that could affect existing functionality)
  • Documentation update

Rationale

Checklist

Before submitting your PR, please review the following:

  • I have conducted a self-review of my own code.
  • I have updated documentation accordingly.
  • I have added tests that prove my changes are effective or that my feature works.
  • Existing unit tests pass locally with my changes.

Additional Notes

@ChenYi015
Copy link
Contributor Author

/hold
/assign @yuchaoran2011 @vara-bonthu @jacobsalway

@ChenYi015 ChenYi015 force-pushed the feature/pod-template branch from 27696b4 to a7c5002 Compare August 27, 2024 05:53
@@ -302,6 +304,12 @@ func driverConfOption(app *v1beta2.SparkApplication) ([]string, error) {
property = fmt.Sprintf(common.SparkKubernetesDriverLabelTemplate, common.LabelLaunchedBySparkOperator)
args = append(args, "--conf", fmt.Sprintf("%s=%s", property, "true"))

// If Spark version is less than 3.0.0 or driver pod template is not defined, then the driver pod needs to be mutated by the webhook.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed during the community call, we can think about a future release (maybe 2.1) where we announce the deprecation of web hook and support for Spark 2.x

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we take the approach of moving to users only being able to specify a pod template, or map the existing struct fields and construct a pod template on the fly during submission?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like mapping existing field to a pod template on the fly + new parameter templateFile go be the easiest migration path for existing user

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I would personally find it useful to support webhook mutations and template on the same SparkApplication.

Copy link
Contributor

@yuchaoran2011 yuchaoran2011 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Copy link
Contributor

@vara-bonthu vara-bonthu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChenYi015 Thanks for adding this much needed feature! PR looks good to me. You may need to resolve the conflicts to move forward.

@google-oss-prow google-oss-prow bot removed the lgtm label Sep 29, 2024
@ChenYi015
Copy link
Contributor Author

@vara-bonthu Thanks for the review. I have resolved the merge conflicts.

@ChenYi015
Copy link
Contributor Author

/unhold

@ChenYi015 ChenYi015 force-pushed the feature/pod-template branch 2 times, most recently from bfd00ac to bdc9eca Compare October 18, 2024 12:44
Copy link
Contributor

@vara-bonthu vara-bonthu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@ImpSy
Copy link
Contributor

ImpSy commented Oct 19, 2024

I know that the PR has been approved but I feel like introducing a new key is not the right implementation here

We should either build the template with the current CR if spark version >= 3 or have the possibility to specify a template file name to use

This implementation force a migration for our user if they want to get rid of webhook but it should be free for them (which is not the case here)

@ChenYi015
Copy link
Contributor Author

I know that the PR has been approved but I feel like introducing a new key is not the right implementation here

We should either build the template with the current CR if spark version >= 3 or have the possibility to specify a template file name to use

This implementation force a migration for our user if they want to get rid of webhook but it should be free for them (which is not the case here)

@ImpSy

  1. It is somewhat difficult to build a template file form current CRD, it would be more easy and straight to use the template defined in spec.[driver|executor].template
  2. If one want to specify a pod template file, he can use the spark conf spark.kubernetes.[driver|executor].podTemplateFile.
  3. Maybe we should make webhook and pod template both work in this PR so that users can have time to migrate their applications to use pod templates.

@Tom-Newton
Copy link
Contributor

My user perspective is that I like introducing the new template fields now to enable new functionality but I'm not ready yet, to fully migrate away from the webhook, so I currently would like to use both.

Before the webhook is deprecated though I think the template should be built from the current CR as ImpSy suggests.

@ChenYi015 ChenYi015 force-pushed the feature/pod-template branch from bdc9eca to 4dae1fd Compare October 21, 2024 12:47
@ChenYi015
Copy link
Contributor Author

I have updated this PR, the webhook will still be used to mutate Spark pods no matter pod template is defined or not. I will try to build the pod template file from the current CRD when we try to deprecate the webhook.

Copy link
Contributor

@Tom-Newton Tom-Newton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, though my review probably doesn't mean much 🙂

@ChenYi015 ChenYi015 force-pushed the feature/pod-template branch from 4dae1fd to acae606 Compare October 23, 2024 13:14
Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Yi Chen <github@chenyicn.net>
@ChenYi015 ChenYi015 force-pushed the feature/pod-template branch from acae606 to 157462d Compare October 23, 2024 13:26
Copy link
Contributor

@vara-bonthu vara-bonthu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@google-oss-prow google-oss-prow bot added the lgtm label Oct 24, 2024
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Tom-Newton, vara-bonthu, yuchaoran2011

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [vara-bonthu,yuchaoran2011]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit d0daf2f into kubeflow:master Oct 24, 2024
11 checks passed
@ChenYi015 ChenYi015 deleted the feature/pod-template branch October 24, 2024 02:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants