Skip to content

Conversation

@faiq
Copy link
Contributor

@faiq faiq commented Oct 10, 2025

**What problem does this PR solve?**:
adds failure domians api to AWS/EKS Machine deployments

Which issue(s) this PR fixes:
https://jira.nutanix.com/browse/NCN-110350

How Has This Been Tested?:

Special notes for your reviewer:

- name: workerConfig
value:
aws:
failureDomain: us-west-2a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were you able to verify this manually for EKS? From the docs it uses failureDomain: "1"
https://cluster-api-aws.sigs.k8s.io/topics/failure-domains/worker-nodes#failure-domains-in-worker-nodes

But in the CAPA code it does look like it should be the AZ like you have it documented

Co-authored-by: Dimitri Koshkin <dimitri.koshkin@gmail.com>
@@ -0,0 +1,73 @@
+++
title = "AWS Failure Domain"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
title = "AWS Failure Domain"
title = "EKS Failure Domain"

variables:
- name: workerConfig
value:
aws:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
aws:
eks:

overrides:
- name: workerConfig
value:
aws:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
aws:
eks:

overrides:
- name: workerConfig
value:
aws:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
aws:
eks:

overrides:
- name: workerConfig
value:
aws:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
aws:
eks:

Comment on lines +84 to +99
if obj.GetKind() != "MachineDeployment" || obj.GetAPIVersion() != clusterv1.GroupVersion.String() {
log.V(5).Info("not a MachineDeployment, skipping")
return nil
}

log.WithValues(
"patchedObjectKind", obj.GetKind(),
"patchedObjectName", client.ObjectKeyFromObject(obj),
).Info("setting failure domain in worker MachineDeployment spec")

if err := unstructured.SetNestedField(
obj.Object,
failureDomainVar,
"spec", "template", "spec", "failureDomain",
); err != nil {
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be using patches.MutateIfApplicable instead for consistency between all the handlers and use concrete types instead of unstructured

variableFieldPath []string
}

func NewWorkerPatch() *awsFailureDomainWorkerPatchHandler {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add failure domain patch for AWS control plane too.

Copy link
Contributor

@supershal supershal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggested few changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants