Description
Is your feature request related to a problem? Please describe.
Some upgrade use cases require that the cluster "be healthy" before incurring the disruption of a node upgrade. It would be nice to configure a Plan so that some settling has occurred before it continues with the next node. This could be achieved by some sort of health measurement, for example ensuring that all ReplicaSets and DaemonSets have a minimum number of pods running.
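To make the "minimum number of pods running" idea concrete, here is a rough sketch of what such a settling check might look like. It is not based on the controller's actual code: the `Workload` type, the `unsettled` function, and the `minReadyFraction` knob are all hypothetical, and a real check would read these counts from the cluster rather than from a hard-coded slice.

```go
package main

import "fmt"

// Workload is a hypothetical, simplified view of a ReplicaSet or DaemonSet.
// It carries only the counts needed for a settling check.
type Workload struct {
	Kind    string // e.g. "ReplicaSet" or "DaemonSet"
	Name    string
	Desired int // pods the workload wants
	Ready   int // pods currently ready
}

// unsettled returns the workloads whose ready count is below the given
// fraction of their desired count. minReadyFraction is an assumed knob
// in the range 0.0 to 1.0; it does not exist in the controller today.
func unsettled(workloads []Workload, minReadyFraction float64) []Workload {
	var out []Workload
	for _, w := range workloads {
		if float64(w.Ready) < minReadyFraction*float64(w.Desired) {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	// Hard-coded observations stand in for data read from the cluster.
	observed := []Workload{
		{Kind: "ReplicaSet", Name: "web", Desired: 3, Ready: 3},
		{Kind: "DaemonSet", Name: "log-agent", Desired: 5, Ready: 4},
	}
	for _, w := range unsettled(observed, 1.0) {
		fmt.Printf("waiting on %s/%s: %d of %d ready\n", w.Kind, w.Name, w.Ready, w.Desired)
	}
}
```

An empty result would mean the cluster has settled enough to continue with the next node; anything else names what the Plan is still waiting on.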
Describe the solution you'd like
A parameter or two on the Plan spec indicating that some health measurement should pass before commencing node upgrade(s), and which pre-canned strategy to use for making that determination. The presence of a strategy choice other than "none" might be enough on its own (so, one parameter).
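As a sketch of the one-parameter variant, assuming a hypothetical `healthStrategy` field: none of the names below (`HealthStrategy`, `workloadsReady`, `readyForNextNode`) exist in the controller, and the `PlanSpec` shown contains only the proposed addition rather than the real spec.

```go
package main

import "fmt"

// HealthStrategy names a pre-canned health measurement.
// The type and its values are hypothetical.
type HealthStrategy string

const (
	HealthStrategyNone           HealthStrategy = "none"
	HealthStrategyWorkloadsReady HealthStrategy = "workloadsReady"
)

// PlanSpec shows only the hypothetical addition, not the real Plan spec.
type PlanSpec struct {
	HealthStrategy HealthStrategy `json:"healthStrategy,omitempty"`
}

// HealthCheck reports whether the cluster is healthy enough to proceed.
type HealthCheck func() (bool, error)

// readyForNextNode gates the next node upgrade on the configured strategy.
// An unset strategy or "none" keeps today's behavior: proceed immediately.
func readyForNextNode(spec PlanSpec, checks map[HealthStrategy]HealthCheck) (bool, error) {
	if spec.HealthStrategy == "" || spec.HealthStrategy == HealthStrategyNone {
		return true, nil
	}
	check, ok := checks[spec.HealthStrategy]
	if !ok {
		return false, fmt.Errorf("unknown health strategy %q", spec.HealthStrategy)
	}
	return check()
}

func main() {
	// A stub check that always reports "not healthy yet".
	checks := map[HealthStrategy]HealthCheck{
		HealthStrategyWorkloadsReady: func() (bool, error) { return false, nil },
	}
	for _, spec := range []PlanSpec{
		{},
		{HealthStrategy: HealthStrategyWorkloadsReady},
	} {
		ok, err := readyForNextNode(spec, checks)
		fmt.Printf("strategy=%q proceed=%v err=%v\n", spec.HealthStrategy, ok, err)
	}
}
```

Treating the absence of the field the same as "none" would keep existing Plans working unchanged, which is the main appeal of the single-parameter approach.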
Describe alternatives you've considered
Relying on the eviction algorithm that respects pod disruption budgets (i.e. NOT specifying .spec.disableEviction) will likely not be adequate for all upgrade needs, because eviction can hang indefinitely in resource-constrained clusters. Because of this, we must assume that some disruptions can and will happen when upgrade plans are applied. Is this enough to warrant new logic in the controller? 🤷
Additional context