Configurable machine replacement #10946
Comments
/triage accepted |
q: is this about replacing nodes (the node at Kubernetes level) or the entire machine where the node is hosted? |
Hi @fabriziopandini, it's about machines; I'll update the issue. |
ACK, thanks for the clarification |
I would like to contribute to this feature. If no one has started on it already, shall I pick this up? cc: @sbueringer @fabriziopandini @Meecr0b I can do the initial research and share the API modeling and high-level code changes for a first review. |
@dineshba Feel free to go ahead |
/assign dineshba |
I would like to share my initial idea and get feedback on the approach. cc: @sbueringer @fabriziopandini (tagging @ykakarap as well, since he contributed the MachineDeployment rolloutAfter feature #8216, which is similar to this)
Feature Description
We want to specify a duration in the spec after which machines should get replaced.
Related existing features
// RolloutAfter is a field to indicate a rollout should be performed
// after the specified time even if no changes have been made to the
// KubeadmControlPlane.
// Example: In the YAML the time can be specified in the RFC3339 format.
// To specify the rolloutAfter target as March 9, 2023, at 9 am UTC
// use "2023-03-09T09:00:00Z".
// +optional
RolloutAfter *metav1.Time `json:"rolloutAfter,omitempty"`
type KubeadmControlPlaneSpec struct {
	// RolloutBefore is a field to indicate a rollout should be performed
	// if the specified criteria is met.
	// +optional
	RolloutBefore *RolloutBefore `json:"rolloutBefore,omitempty"`
}
// RolloutBefore describes when a rollout should be performed on the KCP machines.
type RolloutBefore struct {
	// CertificatesExpiryDays indicates a rollout needs to be performed if the
	// certificates of the machine will expire within the specified days.
	// +optional
	CertificatesExpiryDays *int32 `json:"certificatesExpiryDays,omitempty"`
}
API Change Proposal
Option 1: Specify machine expiry under RolloutBefore
In this approach, we extend the existing API. We want to roll out once the machine has reached the specified duration. Adding it under
Option 2: Specify machine expiry under a new struct named Rollout (or a better name)
type KubeadmControlPlaneSpec struct {
	// Rollout indicates different capabilities to
	// roll out the machine when the specified conditions are met.
	Rollout *Rollout `json:"rollout,omitempty"`
}
type Rollout struct {
	// MachineExpiryDays indicates the duration after which the machine
	// will be rolled out. If current time - creation time > duration,
	// then roll out and replace the expired machine.
	MachineExpiryDays *int `json:"machineexpiry,omitempty"`
}
In this new
Code Changes:
|
On my todo list, but unfortunately I don't have much bandwidth currently 😓 |
What would you like to be added (User Story)?
As an operator, I would like to be able to configure a time after which machines are replaced automatically, for testing and security reasons.
Detailed Description
Problem Statement:
Regularly replacing machines helps test application behavior during rolling updates and ensures machines are refreshed periodically, which is especially important after security incidents.
Proposed Solution:
Implement a rolloutBefore.machineExpiry{Minutes,Hours,Days} parameter within the Cluster API (like rolloutBefore.certificatesExpiryDays implemented for KCP), allowing users to specify the maximum time a machine should exist before being automatically replaced.
Benefits:
Impact:
Anything else you would like to add?
Current workarounds:
spec.rolloutAfter periodically via CronJob for MachineDeployment
clusterctl alpha rollout restart machinedeployment/my-md-0 periodically
Label(s) to be applied
/kind feature
One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.