Cassandra scale in action #289
base: master
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment. Assign the PR to them by writing /assign in a comment. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Force-pushed from 07bea31 to 9d2152f
Thanks @kragniz
- Add some documentation.
- Add an E2E test.
- See if there's a way we can query the Cassandra process to learn whether this node has been decommissioned. I'm nervous about setting the Decommissioned flag in the pilot status based only on the success of the nodetool decommission sub-command. E.g. we'd like to be able to report a decommissioned node even if someone has manually run nodetool decommission. (See the sketch after this list.)
- I'm also nervous about decommissioning all the nodes in parallel. There are a few people saying they should be done one at a time.
- Maybe it's time to rethink the liveness check. The decommissionInProgress check on the pilot will break if the pilot process gets restarted.
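On the third point, a minimal sketch of one way to ask the node itself rather than trusting our own bookkeeping: nodetool netstats reports the node's operation mode, which should read DECOMMISSIONED after a decommission regardless of who ran it. The helper name and parsing below are my assumption, not part of this PR.

import (
	"fmt"
	"os/exec"
	"strings"
)

// detectDecommissioned (hypothetical helper) shells out to nodetool and
// parses the "Mode:" line, e.g. "Mode: NORMAL" or "Mode: DECOMMISSIONED".
func detectDecommissioned() (bool, error) {
	out, err := exec.Command("nodetool", "netstats").CombinedOutput()
	if err != nil {
		return false, fmt.Errorf("nodetool netstats failed: %v: %s", err, out)
	}
	for _, line := range strings.Split(string(out), "\n") {
		trimmed := strings.TrimSpace(line)
		if strings.HasPrefix(trimmed, "Mode:") {
			mode := strings.TrimSpace(strings.TrimPrefix(trimmed, "Mode:"))
			return mode == "DECOMMISSIONED", nil
		}
	}
	return false, fmt.Errorf("no Mode line in netstats output: %s", out)
}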
return fmt.Errorf(
	"Not enough pilots to scale down: %d",
	len(pilots),
)
Maybe log an error here and return nil. The controller interprets an error to mean that the action should be retried, but retrying will never succeed in this situation... I think.
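A sketch of the log-and-swallow shape I mean, assuming glog is available as elsewhere in the PR (desiredReplicas is a hypothetical name for the target count):

if len(pilots) <= int(desiredReplicas) {
	// This condition is permanent: retrying cannot conjure more pilots,
	// so log it and return nil so the controller does not requeue the action.
	glog.Errorf("not enough pilots to scale down: %d", len(pilots))
	return nil
}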
	corev1.EventTypeNormal,
	a.Name(),
	"Marked cassandra pilot %s for decommission", p.Name,
)
Log the namespace and name of the decommissioned pilot here.
And maybe the event should be recorded against the cluster object, so that an administrator can see all the events for a particular cluster.
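A sketch of what I mean, assuming the action holds a record.EventRecorder (the recorder field and cluster variable names are guesses):

// Recording against the CassandraCluster means `kubectl describe` on the
// cluster shows the decommission history; include the pilot's namespace/name.
a.recorder.Eventf(
	cluster,
	corev1.EventTypeNormal,
	a.Name(),
	"Marked cassandra pilot %s/%s for decommission", p.Namespace, p.Name,
)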
"is greater than the existing StatefulSet.Replicas value (%d)", | ||
a.NodePool.Replicas, *ss.Spec.Replicas, | ||
) | ||
} |
- Might be neater to start the function with checks for all the bad inputs and exit early, then end with the happy path (see the sketch after this list).
- And I think this error should be logged rather than returned, to prevent the controller retrying this doomed action.
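A rough sketch of the shape, reusing the names visible in this diff; the glog calls and early returns are my assumption about how a non-retryable failure should be reported:

// Reject all the doomed inputs up front, then fall through to the happy path.
if ss.Spec.Replicas == nil {
	glog.Errorf("StatefulSet %s/%s has no Replicas value", ss.Namespace, ss.Name)
	return nil
}
if a.NodePool.Replicas > *ss.Spec.Replicas {
	glog.Errorf(
		"NodePool.Replicas value (%d) is greater than the existing StatefulSet.Replicas value (%d)",
		a.NodePool.Replicas, *ss.Spec.Replicas,
	)
	return nil
}
// ...happy path: mark the surplus pilots and shrink the StatefulSet.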
	}
}
errs = append(errs, fmt.Errorf("pilot %q not found", pilotName))
}
I couldn't figure out why we need this inner loop.
I imagined we'd get a list of all pilots for the nodepool and decommission all the ones with an index higher than the desired replica count, as in the sketch below.
Maybe this can be simplified? Or else add a function doc explaining the algorithm.
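A sketch of the algorithm I imagined; pilotOrdinal and markForDecommission are hypothetical helpers, and the parsing assumes StatefulSet-style pod names ending in -<index>:

import (
	"fmt"
	"strconv"
	"strings"
)

// pilotOrdinal extracts the trailing index from a StatefulSet-style name,
// e.g. "cluster-np-3" -> 3.
func pilotOrdinal(name string) (int, error) {
	i := strings.LastIndex(name, "-")
	if i == -1 {
		return 0, fmt.Errorf("no ordinal in pilot name %q", name)
	}
	return strconv.Atoi(name[i+1:])
}

// Then, given all pilots for the nodepool, no inner loop is required:
for _, p := range pilots {
	ordinal, err := pilotOrdinal(p.Name)
	if err != nil {
		errs = append(errs, err)
		continue
	}
	if ordinal >= int(desiredReplicas) {
		markForDecommission(p) // hypothetical
	}
}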
		Message: utilerror.NewAggregate(errs).Error(),
		Reason:  metav1.StatusReasonNotFound,
	},
}
Never seen k8sErrors.StatusError used before. What's the purpose? I looked to see if we check for this error elsewhere, but couldn't see any such checks.
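For reference, the usual reason to construct a StatusError is so callers can branch on the reason via the apimachinery helpers; if nothing in this repo does the equivalent of the check below, a plain aggregated error might be simpler:

import k8sErrors "k8s.io/apimachinery/pkg/api/errors"

// IsNotFound matches any error carrying metav1.StatusReasonNotFound,
// including the StatusError constructed in this diff.
if k8sErrors.IsNotFound(err) {
	// handle the missing-pilot case specially
}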
return &actions.ScaleIn{
	Cluster:  c,
	NodePool: &np,
}, nil
I'm keen to keep this function unchanged :-)
I think it should be possible to instead check nps.ReadyReplicas here... if not, let's discuss.
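Roughly what I have in mind, with nps standing for the node pool's status entry (names guessed from context):

// Scale in when fewer replicas are desired than are currently ready.
if np.Replicas < nps.ReadyReplicas {
	return &actions.ScaleIn{
		Cluster:  c,
		NodePool: &np,
	}, nil
}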
a, err := cassandra.NextAction(c, state.StatefulSetLister)
if err != nil {
	t.Errorf("error calculating next action: %v", err)
}
And if we could keep the NextAction signature unchanged, we wouldn't need to add this extra stuff to the test.
func run(args ...string) error {
	cmd := exec.Command(args[0], args[1:]...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
I think we should capture the stdout / stderr here and also add it to the error that we return.
Otherwise I think it'll be difficult to diagnose decommission failures.
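A minimal sketch using CombinedOutput, which buffers stdout and stderr together so the error from a failed nodetool decommission carries its own diagnostics:

import (
	"fmt"
	"os/exec"
)

func run(args ...string) error {
	cmd := exec.Command(args[0], args[1:]...)
	out, err := cmd.CombinedOutput()
	if err != nil {
		// Fold the captured output into the returned error for diagnosis.
		return fmt.Errorf("%v failed: %v: output: %q", args, err, out)
	}
	return nil
}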
return run("nodetool", "decommission")
}

return nil
I think we should return an error if the node state != NodeStateNormal.
Otherwise the unhealthy node will be marked as decommissioned in the pilot status, right?
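Something like this, assuming a nodeState accessor and the NodeStateNormal constant referenced elsewhere in this diff:

state, err := p.nodeState() // hypothetical accessor for the current Cassandra node state
if err != nil {
	return err
}
if state != NodeStateNormal {
	// Refuse to decommission from an abnormal state; otherwise the unhealthy
	// node would be reported as cleanly decommissioned in the pilot status.
	return fmt.Errorf("node state is %q, expected %q", state, NodeStateNormal)
}
return run("nodetool", "decommission")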
if p.decommissionInProgress || true {
	glog.Info("decommission in progress, reporting success for liveness")
	return nil
}
Yikes, sorry about this.
I remember you saying that jolokia nodetool requests fail if the node has been decommissioned.
I wonder if we need to rethink this check.
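One hedged idea: make the check rely on durable state instead of process memory, e.g. a decommissioned marker persisted in the Pilot status (the field and method names below are guesses, not this PR's API):

// The in-memory flag dies with the process; a status field written by the
// decommission action survives a pilot restart.
if p.status.Decommissioned { // hypothetical persisted status field
	glog.Info("node decommissioned, reporting success for liveness")
	return nil
}
return p.jolokiaLivenessCheck() // hypothetical: the existing nodetool/jolokia probe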
@kragniz: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
What this PR does / why we need it: add an action for scaling in cassandra clusters
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #285
Special notes for your reviewer: this is based on top of #256, will be rebased on master when that gets merged