
[SPARK-49329] Support user provided spec for SparkCluster #80

Closed

Conversation

jiangzho (Contributor)

What changes were proposed in this pull request?

This PR introduces support for user-provided metadata and spec for Spark Clusters.

Why are the changes needed?

Similar to the pod template spec support for Apps, this is desired when users want to customize the Cluster master and worker specs.
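
For illustration, a hedged sketch of user-provided metadata for the master Service; the `masterSpec.masterServiceMetadata` nesting follows the accessor chain discussed in the review below, and every field name here is an assumption rather than the exact CRD schema:

```yaml
# Illustrative only: field names are assumptions about the SparkCluster schema.
spec:
  masterSpec:
    masterServiceMetadata:
      labels:
        team: data-platform          # custom label propagated to the master Service
      annotations:
        prometheus.io/scrape: "true" # custom annotation, e.g. for metrics scraping
```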

Does this PR introduce any user-facing change?

No, this feature is not released yet.

How was this patch tested?

Unit tests and integration test

Was this patch authored or co-authored using generative AI tooling?

No

dongjoon-hyun (Member) left a comment

Thank you, @jiangzho. Ya, I excluded that intentionally. :)

dongjoon-hyun (Member) left a comment

Please make CI happy by resolving conflicts and adding license headers.

initWorkers: 3
minWorkers: 3
maxWorkers: 3
clusterTolerations:
dongjoon-hyun (Member) commented on this diff:
BTW, this is irrelevant to the user-provided spec. Please proceed with this separately, because it means that we need observers. That's the real reason why I decided to exclude this part initially.

buildMasterService(
clusterName,
namespace,
cluster.getSpec().getMasterSpec().getMasterServiceMetadata(),
dongjoon-hyun (Member) commented on this diff:
nit. Just a coding-style comment. It's not good to repeat the same thing multiple times. For example, we already have `spec` as a local variable, so let's not add a new chain like `cluster.getSpec().getMasterSpec()` here; a rough sketch of the refactor is shown below.
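
For illustration, a rough sketch of the suggested refactor, assuming `cluster`, `clusterName`, and `namespace` are in scope as in the diff above (the variable names are illustrative):

```java
// Hoist the repeated accessor chain into local variables instead of
// calling cluster.getSpec().getMasterSpec() at every use site.
var spec = cluster.getSpec();
var masterSpec = spec.getMasterSpec();

buildMasterService(
    clusterName,
    namespace,
    masterSpec.getMasterServiceMetadata()
    /* remaining arguments unchanged */);
```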

@AllArgsConstructor
@Builder
@JsonInclude(JsonInclude.Include.NON_NULL)
public class ClusterInstanceConfig {
dongjoon-hyun (Member) commented on this diff:
I'd recommend using `WorkerInstanceConfig` because we are not creating multiple cluster instances.
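
For illustration only, a sketch of the class after the suggested rename; the field set mirrors the `initWorkers`/`minWorkers`/`maxWorkers` keys from the YAML above and is an assumption about the real class:

```java
import com.fasterxml.jackson.annotation.JsonInclude;
import lombok.AllArgsConstructor;
import lombok.Builder;

// Sketch of the rename suggested above; the fields are assumed, not authoritative.
@AllArgsConstructor
@Builder
@JsonInclude(JsonInclude.Include.NON_NULL)
public class WorkerInstanceConfig {
  @Builder.Default protected Integer initWorkers = 0;
  @Builder.Default protected Integer minWorkers = 0;
  @Builder.Default protected Integer maxWorkers = 0;
}
```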

dongjoon-hyun (Member)

Could you rebase this PR, @jiangzho?

dongjoon-hyun (Member)

Thank you for updating, @jiangzho.

mark instance config as required

Decouple Worker Instance Config
dongjoon-hyun (Member) left a comment

+1, LGTM. Merged to main.
Thank you, @jiangzho .

dongjoon-hyun added a commit that referenced this pull request Aug 26, 2024
### What changes were proposed in this pull request?

This PR aims to add `cluster-with-template.yaml` example.

### Why are the changes needed?

Apache Spark K8s Operator starts to support template-based Spark Cluster creation via the following.
- #80

We had better provide a concrete example covering use cases like the following (a rough sketch is shown after this list).
- Custom annotations and labels.
- Custom full spec like `priorityClassName`, `securityContext`, `sidecar` containers
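
For illustration, a hedged sketch of what such a template-based `SparkCluster` manifest might contain; the `apiVersion` and the nesting of fields like `masterSpec.podTemplate` and `workerSpec.instanceConfig` are assumptions, not the exact CRD schema:

```yaml
# Illustrative sketch only: apiVersion and field nesting are assumptions.
apiVersion: spark.apache.org/v1alpha1
kind: SparkCluster
metadata:
  name: cluster-with-template
  labels:
    team: data-platform                 # custom label
  annotations:
    example.com/owner: analytics-team   # custom annotation
spec:
  masterSpec:
    podTemplate:
      spec:
        priorityClassName: high-priority
        securityContext:
          runAsNonRoot: true
        containers:
          - name: sidecar-logger        # sidecar container
            image: busybox:1.36
            command: ["sh", "-c", "tail -F /var/log/spark/*.log"]
  workerSpec:
    instanceConfig:
      initWorkers: 3
      minWorkers: 3
      maxWorkers: 3
```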

### Does this PR introduce _any_ user-facing change?

No. This is an unreleased new feature.

### How was this patch tested?

Manual review.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #101 from dongjoon-hyun/SPARK-49380.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun added a commit that referenced this pull request Aug 26, 2024
### What changes were proposed in this pull request?

This PR aims to support `master` container and `worker` container templates.

### Why are the changes needed?

Previously, a pod template threw exceptions for containers named `master` or `worker` because two `master`/`worker` containers were created. After this PR, the problem is resolved; a hedged sketch of such a template follows the reference below.
- #80
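
For illustration, a minimal pod-template fragment that customizes the `master` container's resources (the values match the manual test output below); the wrapping field names are assumptions about the SparkCluster schema:

```yaml
# Illustrative fragment only: the wrapping field names are assumptions.
masterSpec:
  podTemplate:
    spec:
      containers:
        - name: master          # applied to the operator-managed master container
          resources:
            requests:
              cpu: "2"
              memory: 2Gi
            limits:
              cpu: "2"
              memory: 2Gi
```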

### Does this PR introduce _any_ user-facing change?

No, this is a new unreleased feature.

### How was this patch tested?

For now, we need to run manual tests.
```
$ ./gradlew build buildDockerImage spark-operator-api:relocateGeneratedCRD -x check
$ helm install spark-kubernetes-operator -f build-tools/helm/spark-kubernetes-operator/values.yaml build-tools/helm/spark-kubernetes-operator/
$ kubectl apply -f examples/cluster-with-template.yaml
$ kubectl get pod -l spark-role=master -oyaml | yq '.items[0].spec.containers[0].resources'
limits:
  cpu: "2"
  memory: 2Gi
requests:
  cpu: "2"
  memory: 2Gi
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #104 from dongjoon-hyun/SPARK-49389.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>