Skip named nodes in the CNR that don't exist #83

Merged: 8 commits, Aug 19, 2024
9 changes: 9 additions & 0 deletions deploy/crds/atlassian.com_cyclenoderequests_crd.yaml
@@ -44,6 +44,8 @@ spec:
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
clusterName:
Collaborator:
Where does this field come from? Why is it only in the CycleNodeRequest resource?

Member Author:
It was part of the operator-sdk render, i.e. make generate-crds.

type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
@@ -101,6 +103,13 @@ spec:
- Drain
- Wait
type: string
strictValidation:
description: StrictValidation is a boolean which determines whether
named nodes selected in a CNR must exist and be valid nodes
before cycling can begin. If set to true when invalid nodes
are selected the CNR will be transitioned to the "Failed" phase
before cycling can begin again.
type: boolean
required:
- method
type: object
7 changes: 7 additions & 0 deletions deploy/crds/atlassian.com_cyclenodestatuses_crd.yaml
@@ -101,6 +101,13 @@ spec:
- Drain
- Wait
type: string
strictValidation:
Collaborator (@awprice, Aug 14, 2024):
So by default if not set this will be false?

> Added a new StrictValidation option to the CNR to keep backwards compatible functionality.

I guess this does keep backwards-compatible behaviour, but it is technically a breaking change, or at least a behaviour change; we'll need to document this in the release notes.

We'll also need to ensure cluster owners update the CRDs to include this new field.

Member Author (@vincentportella, Aug 14, 2024):
Yes, that is the intention. I would like the new behaviour to be the default rather than the exception, because it reduces how often a cycle fails by default.

However, I'm happy to change it if we decide it's better to default the flag to true.
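
For illustration only, here is a minimal sketch of the zero-value behaviour discussed above. The local type is a stand-in for the real CycleSettings in pkg/apis/atlassian/v1 and is not code from this PR:

package main

import "fmt"

// cycleSettings is a local stand-in for the CycleSettings type touched by this
// PR; only the new field is reproduced here.
type cycleSettings struct {
	StrictValidation bool `json:"strictValidation,omitempty"`
}

func main() {
	// A CNR that omits strictValidation decodes to the zero value (false), so
	// existing CNRs pick up the new skip-missing-nodes behaviour by default.
	var settings cycleSettings

	if settings.StrictValidation {
		fmt.Println("strict: fail the CNR when a named node cannot be found")
	} else {
		fmt.Println("default: log and skip the missing named node")
	}
}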

description: StrictValidation is a boolean which determines whether
named nodes selected in a CNR must exist and be valid nodes
before cycling can begin. If set to true when invalid nodes
are selected the CNR will be transitioned to the "Failed" phase
before cycling can begin again.
type: boolean
required:
- method
type: object
7 changes: 7 additions & 0 deletions deploy/crds/atlassian.com_nodegroups_crd.yaml
@@ -93,6 +93,13 @@ spec:
- Drain
- Wait
type: string
strictValidation:
description: StrictValidation is a boolean which determines whether
named nodes selected in a CNR must exist and be valid nodes
before cycling can begin. If set to true when invalid nodes
are selected the CNR will be transitioned to the "Failed" phase
before cycling can begin again.
type: boolean
required:
- method
type: object
5 changes: 5 additions & 0 deletions pkg/apis/atlassian/v1/common_types.go
@@ -46,6 +46,11 @@ type CycleSettings struct {
// in-progress CNS request timeout from the time it's worked on by the controller.
// If no cyclingTimeout is provided, CNS will use the default controller CNS cyclingTimeout.
CyclingTimeout *metav1.Duration `json:"cyclingTimeout,omitempty"`

// StrictValidation is a boolean which determines whether named nodes selected in a CNR must
// exist and be valid nodes before cycling can begin. If set to true when invalid nodes are
// selected the CNR will be transitioned to the "Failed" phase before cycling can begin again.
StrictValidation bool `json:"strictValidation,omitempty"`
Collaborator:
Do we ever see StrictValidation being used for other things?

At the moment, the description of this field suggests it's only for this one thing: when invalid nodes are found, transition to Failed. But the naming of the field suggests it generically enables/disables strict validation.

Naming is hard, but I'd suggest renaming the flag to something that describes specifically what it enables/disables.

Collaborator:
Does it also make sense to put this field into CycleSettings?

Doing this means it's available in a CycleNodeStatus object; do we need it there?

Should we move it elsewhere within CycleNodeRequest to make it not propagate down?

Could we have a "ValidationOptions" field within CycleNodeRequest for holding other validation/configuration options a user might want to configure?

Member Author (@vincentportella, Aug 14, 2024):
I can see your point. I initially intended for more validations to be affected by this PR, but then narrowed the scope to just checking for named nodes. Having said that, having ValidationOptions in the CNR would be better because we could then have more granular settings. 🤔

With this PR, we could make it something like ValidationOptions.SkipMissingNamedNodes. The existing behaviour would remain the default, and setting the flag would opt into the new functionality.
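
For reference, a rough sketch of what the ValidationOptions idea could look like; none of these names exist in the codebase and they are purely illustrative:

package main

import (
	"encoding/json"
	"fmt"
)

// ValidationOptions is a hypothetical grouping of validation toggles that
// would live on the CycleNodeRequest spec rather than in CycleSettings, so it
// would not propagate into CycleNodeStatus objects.
type ValidationOptions struct {
	// SkipMissingNamedNodes would let a CNR proceed when some of its named
	// nodes no longer exist in the cluster, instead of failing validation.
	SkipMissingNamedNodes bool `json:"skipMissingNamedNodes,omitempty"`
}

func main() {
	// A CNR spec fragment opting into the lenient behaviour.
	raw := []byte(`{"skipMissingNamedNodes": true}`)

	var opts ValidationOptions
	if err := json.Unmarshal(raw, &opts); err != nil {
		panic(err)
	}

	fmt.Printf("skip missing named nodes: %v\n", opts.SkipMissingNamedNodes)
}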

}

// HealthCheck defines the health check configuration for the NodeGroup
1 change: 1 addition & 0 deletions pkg/apis/atlassian/v1/zz_generated.deepcopy.go

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions pkg/controller/cyclenoderequest/transitioner/checks.go
@@ -189,7 +189,7 @@ func (t *CycleNodeRequestTransitioner) performHealthCheck(node v1.CycleNodeReque

// performInitialHealthChecks on the nodes selected to be terminated before cycling begins. If any health
// check fails, return an error to prevent cycling from starting
func (t *CycleNodeRequestTransitioner) performInitialHealthChecks(kubeNodes []corev1.Node) error {
func (t *CycleNodeRequestTransitioner) performInitialHealthChecks(kubeNodes map[string]corev1.Node) error {
// Build a set of ready nodes from which to check below
readyNodesSet := make(map[string]v1.CycleNodeRequestNode)

@@ -241,7 +241,7 @@ func (t *CycleNodeRequestTransitioner) performInitialHealthChecks(kubeNodes []co

// performCyclingHealthChecks before terminating an instance selected for termination. Cycling pauses
// until all health checks pass for the new instance before terminating the old one
func (t *CycleNodeRequestTransitioner) performCyclingHealthChecks(kubeNodes []corev1.Node) (bool, error) {
func (t *CycleNodeRequestTransitioner) performCyclingHealthChecks(kubeNodes map[string]corev1.Node) (bool, error) {
var allHealthChecksPassed bool = true

// Find new instances attached to the nodegroup and perform health checks on them
90 changes: 53 additions & 37 deletions pkg/controller/cyclenoderequest/transitioner/node.go
@@ -11,12 +11,15 @@ import (

// listReadyNodes lists nodes that are "ready". By default lists nodes that have also not been touched by Cyclops.
// A label is used to determine whether nodes have been touched by this CycleNodeRequest.
func (t *CycleNodeRequestTransitioner) listReadyNodes(includeInProgress bool) (nodes []corev1.Node, err error) {
func (t *CycleNodeRequestTransitioner) listReadyNodes(includeInProgress bool) (map[string]corev1.Node, error) {
nodes := make(map[string]corev1.Node)

// Get the nodes
selector, err := t.cycleNodeRequest.NodeLabelSelector()
if err != nil {
return nodes, err
}

nodeList, err := t.rm.ListNodes(selector)
if err != nil {
return nodes, err
@@ -30,14 +33,16 @@ func (t *CycleNodeRequestTransitioner) listReadyNodes(includeInProgress bool) (n
continue
}
}

// Only add "Ready" nodes
for _, cond := range node.Status.Conditions {
if cond.Type == corev1.NodeReady && cond.Status == corev1.ConditionTrue {
nodes = append(nodes, node)
nodes[node.Spec.ProviderID] = node
break
}
}
}

return nodes, nil
}

@@ -56,29 +61,34 @@ func (t *CycleNodeRequestTransitioner) getNodesToTerminate(numNodes int64) (node
}

for _, kubeNode := range kubeNodes {
// Skip nodes that are already being worked on so we don't duplicate our work
if value, ok := kubeNode.Labels[cycleNodeLabel]; ok && value == t.cycleNodeRequest.Name {
numNodesInProgress++
}
}

for _, nodeToTerminate := range t.cycleNodeRequest.Status.NodesToTerminate {
kubeNode, found := kubeNodes[nodeToTerminate.ProviderID]

if !found {
continue
}

for _, nodeToTerminate := range t.cycleNodeRequest.Status.NodesToTerminate {
// Add nodes that need to be terminated but have not yet been actioned
if kubeNode.Name == nodeToTerminate.Name && kubeNode.Spec.ProviderID == nodeToTerminate.ProviderID {
nodes = append(nodes, &kubeNode)
// Skip nodes that are already being worked on so we don't duplicate our work
if value, ok := kubeNode.Labels[cycleNodeLabel]; ok && value == t.cycleNodeRequest.Name {
continue
}

for i := 0; i < len(t.cycleNodeRequest.Status.NodesAvailable); i++ {
if kubeNode.Name == t.cycleNodeRequest.Status.NodesAvailable[i].Name {
// Remove nodes from available if they are also scheduled for termination
// Slice syntax removes this node at `i` from the array
t.cycleNodeRequest.Status.NodesAvailable = append(
t.cycleNodeRequest.Status.NodesAvailable[:i],
t.cycleNodeRequest.Status.NodesAvailable[i+1:]...,
)
// Add nodes that need to be terminated but have not yet been actioned
nodes = append(nodes, &kubeNode)

break
}
}
for i := 0; i < len(t.cycleNodeRequest.Status.NodesAvailable); i++ {
if kubeNode.Name == t.cycleNodeRequest.Status.NodesAvailable[i].Name {
// Remove nodes from available if they are also scheduled for termination
// Slice syntax removes this node at `i` from the array
t.cycleNodeRequest.Status.NodesAvailable = append(
t.cycleNodeRequest.Status.NodesAvailable[:i],
t.cycleNodeRequest.Status.NodesAvailable[i+1:]...,
)

break
}
@@ -94,32 +104,38 @@
}

// addNamedNodesToTerminate adds the named nodes for this CycleNodeRequest to the list of nodes to terminate.
// Returns an error if any named node does not exist in the node group for this CycleNodeRequest.
func (t *CycleNodeRequestTransitioner) addNamedNodesToTerminate(kubeNodes []corev1.Node, nodeGroupInstances map[string]cloudprovider.Instance) error {
for _, namedNode := range t.cycleNodeRequest.Spec.NodeNames {
foundNode := false
for _, kubeNode := range kubeNodes {
if kubeNode.Name == namedNode {
foundNode = true
// Skips any named node that does not exist in the node group for this CycleNodeRequest.
func (t *CycleNodeRequestTransitioner) addNamedNodesToTerminate(kubeNodes map[string]corev1.Node, nodeGroupInstances map[string]cloudprovider.Instance) error {
nodeLookupByName := make(map[string]corev1.Node)

t.cycleNodeRequest.Status.NodesAvailable = append(
t.cycleNodeRequest.Status.NodesAvailable,
newCycleNodeRequestNode(&kubeNode, nodeGroupInstances[kubeNode.Spec.ProviderID].NodeGroupName()),
)
for _, node := range kubeNodes {
nodeLookupByName[node.Name] = node
}

t.cycleNodeRequest.Status.NodesToTerminate = append(
t.cycleNodeRequest.Status.NodesToTerminate,
newCycleNodeRequestNode(&kubeNode, nodeGroupInstances[kubeNode.Spec.ProviderID].NodeGroupName()),
)
for _, namedNode := range t.cycleNodeRequest.Spec.NodeNames {
kubeNode, found := nodeLookupByName[namedNode]

break
if !found {
t.rm.Logger.Info("could not find node by name, skipping", "nodeName", namedNode)

if t.cycleNodeRequest.Spec.CycleSettings.StrictValidation {
return fmt.Errorf("could not find node by name: %v", namedNode)
}
}

if !foundNode {
return fmt.Errorf("could not find node by name: %v", namedNode)
continue
}

t.cycleNodeRequest.Status.NodesAvailable = append(
t.cycleNodeRequest.Status.NodesAvailable,
newCycleNodeRequestNode(&kubeNode, nodeGroupInstances[kubeNode.Spec.ProviderID].NodeGroupName()),
)

t.cycleNodeRequest.Status.NodesToTerminate = append(
t.cycleNodeRequest.Status.NodesToTerminate,
newCycleNodeRequestNode(&kubeNode, nodeGroupInstances[kubeNode.Spec.ProviderID].NodeGroupName()),
)
}

return nil
}

52 changes: 40 additions & 12 deletions pkg/controller/cyclenoderequest/transitioner/transitions.go
@@ -64,10 +64,12 @@ func (t *CycleNodeRequestTransitioner) transitionUndefined() (reconcile.Result,
func (t *CycleNodeRequestTransitioner) transitionPending() (reconcile.Result, error) {
// Fetch the node names for the cycleNodeRequest, using the label selector provided
t.rm.LogEvent(t.cycleNodeRequest, "SelectingNodes", "Selecting nodes with label selector")

kubeNodes, err := t.listReadyNodes(true)
if err != nil {
return t.transitionToHealing(err)
}

if len(kubeNodes) == 0 {
Collaborator:
We're losing the ability to detect user error here. Is there a way we can keep user-error detection while also not failing on nonexistent nodes?

One strategy could be: if our selector did match some nodes, but none of them are still in the cluster, we can call it a success. If it matches no nodes at all, we don't. The only problem with this approach is node groups scaled to zero. Thoughts?

Member Author:
I don't really see how that's possible unless we have a way of knowing whether a node name in the CNR matches a node that existed before the CNR was created.

A flag here would help toggle strict node validation and enable both use cases, but I'm wary of adding more settings to a CNR.
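
To make the trade-off above concrete, here is a hedged sketch of the strategy being discussed; the function and its inputs are hypothetical and not part of this PR:

package main

import "fmt"

// classifySelection illustrates distinguishing "the selector matched nothing
// at all" (likely user error) from "named nodes were requested but none remain
// in the cluster". It deliberately ignores the node-groups-scaled-to-zero
// caveat raised above.
func classifySelection(matchedNodes, requestedNamedNodes int) string {
	switch {
	case matchedNodes > 0:
		return "proceed with cycling"
	case requestedNamedNodes > 0:
		return "named nodes already gone: could arguably be treated as success"
	default:
		return "selector matched no nodes: treat as user error and fail"
	}
}

func main() {
	fmt.Println(classifySelection(0, 3)) // named nodes requested but already gone
	fmt.Println(classifySelection(0, 0)) // nothing matched: likely user error
}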

return t.transitionToHealing(fmt.Errorf("no nodes matched selector"))
}
@@ -83,14 +85,12 @@ func (t *CycleNodeRequestTransitioner) transitionPending() (reconcile.Result, er
if err != nil {
return t.transitionToHealing(errors.Wrap(err, "failed to check instances that exist from cloud provider"))
}
var existingKubeNodes []corev1.Node

for _, node := range kubeNodes {
for _, validProviderID := range existingProviderIDs {
if node.Spec.ProviderID == validProviderID {
existingKubeNodes = append(existingKubeNodes, node)
break
}
existingKubeNodes := make(map[string]corev1.Node)

for _, validProviderID := range existingProviderIDs {
if node, found := kubeNodes[validProviderID]; found {
existingKubeNodes[node.Spec.ProviderID] = node
}
}

@@ -120,19 +120,40 @@ func (t *CycleNodeRequestTransitioner) transitionPending() (reconcile.Result, er
// Do some sanity checking before we start filtering things
// Check the instance count of the node group matches the number of nodes found in Kubernetes
if len(kubeNodes) != len(nodeGroupInstances) {
nodesNotInCPNodeGroup, nodesNotInKube := findOffendingNodes(kubeNodes, nodeGroupInstances)
var offendingNodesInfo string

nodesNotInCPNodeGroup, nodesNotInKube := findOffendingNodes(kubeNodes, nodeGroupInstances)

if len(nodesNotInCPNodeGroup) > 0 {
providerIDs := make([]string, 0)

for providerID := range nodesNotInCPNodeGroup {
providerIDs = append(providerIDs,
fmt.Sprintf("id %q", providerID),
)
}

offendingNodesInfo += "nodes not in node group: "
offendingNodesInfo += strings.Join(nodesNotInCPNodeGroup, ",")
offendingNodesInfo += strings.Join(providerIDs, ",")
}

if len(nodesNotInKube) > 0 {
if offendingNodesInfo != "" {
offendingNodesInfo += ";"
}

providerIDs := make([]string, 0)

for providerID, node := range nodesNotInKube {
providerIDs = append(providerIDs,
fmt.Sprintf("id %q in %q", providerID, node.NodeGroupName()),
)
}

offendingNodesInfo += "nodes not inside cluster: "
offendingNodesInfo += strings.Join(nodesNotInKube, ",")
offendingNodesInfo += strings.Join(providerIDs, ",")
}

t.rm.LogEvent(t.cycleNodeRequest, "NodeCountMismatch",
"node group: %v, kube: %v. %v",
len(nodeGroupInstances), len(kubeNodes), offendingNodesInfo)
@@ -142,12 +163,16 @@ func (t *CycleNodeRequestTransitioner) transitionPending() (reconcile.Result, er
if err != nil {
return t.transitionToHealing(err)
}

if timedOut {
err := fmt.Errorf(
"node count mismatch, number of kubernetes of nodes does not match number of cloud provider instances after %v",
nodeEquilibriumWaitLimit)
"node count mismatch, number of kubernetes nodes does not match number of cloud provider instances after %v",
nodeEquilibriumWaitLimit,
)

return t.transitionToHealing(err)
}

return reconcile.Result{Requeue: true, RequeueAfter: requeueDuration}, nil
}

@@ -162,6 +187,7 @@ func (t *CycleNodeRequestTransitioner) transitionPending() (reconcile.Result, er
} else {
// Otherwise just add all the nodes in the node group
t.rm.LogEvent(t.cycleNodeRequest, "SelectingNodes", "Adding all node group nodes to NodesToTerminate")

for _, kubeNode := range kubeNodes {
// Check to ensure the kubeNode object maps to an existing node in the ASG
// If this isn't the case, this is a phantom node. Fail the cnr to be safe.
@@ -213,7 +239,9 @@ func (t *CycleNodeRequestTransitioner) transitionInitialised() (reconcile.Result
// The maximum nodes we can select are bounded by our concurrency. We take into account the number
// of nodes we are already working on, and only introduce up to our concurrency cap more nodes in this step.
maxNodesToSelect := t.cycleNodeRequest.Spec.CycleSettings.Concurrency - t.cycleNodeRequest.Status.ActiveChildren

t.rm.Logger.Info("Selecting nodes to terminate", "numNodes", maxNodesToSelect)

nodes, numNodesInProgress, err := t.getNodesToTerminate(maxNodesToSelect)
if err != nil {
return t.transitionToHealing(err)