Description
-
Kops version 1.8.0
-
Kubernetes version 1.8.6
-
AWS (3 masters and 3 nodes)
-
Ran kops edit, followed by kops update and kops rolling-update. The kops edit step added configuration flags for the apiserver (dex related). Also tried kops rolling-update --instance-group <master...> to update only one master at a time.
-
Nodes become "not ready" unpredictably. Sometimes no node is affected. Sometimes one node becomes "not ready" and recovers after a few minutes. Sometimes all nodes are "not ready" for a longer period, up to 15 minutes, while the masters report ready. During this time the workload on the cluster is not accessible.
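To put timestamps on the "not ready" windows described above, something like the following could be run during the rolling update. It is only a sketch: it assumes kubectl is on PATH and points at the affected cluster, and the helper names (not_ready_nodes, fetch_nodes) are made up for illustration. It reads the Ready condition from the output of kubectl get nodes -o json.

```python
import json
import subprocess


def not_ready_nodes(node_list):
    """Return the names of nodes whose Ready condition is not "True".

    node_list is the parsed JSON from `kubectl get nodes -o json`.
    """
    names = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        # A missing Ready condition, or a status of "False" or "Unknown",
        # is treated as not ready.
        if ready is None or ready.get("status") != "True":
            names.append(node["metadata"]["name"])
    return names


def fetch_nodes():
    """Assumption: kubectl is installed and configured for this cluster."""
    out = subprocess.check_output(["kubectl", "get", "nodes", "-o", "json"])
    return json.loads(out)
```

Calling not_ready_nodes(fetch_nodes()) every few seconds and logging the result with a timestamp would show which nodes drop out and for how long, which could then be lined up against the time each master is replaced.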
-
Nothing: a non-disruptive rolling update that affects neither the nodes nor the workload.
-
Starting config: https://gist.github.com/recollir/9e9b4b0b426ef77014083f1839c123d6
Added via kops edit before the rolling-update: https://gist.github.com/recollir/da9fd8a123b58f555f2e4321093e9d46
-
https://gist.github.com/recollir/5b19d543adaa50b1889aabafeb77b847
-
A couple of times I observed that, after the rolling update, the ELB for the API server was missing AZs that had been attached to it.
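A quick way to confirm the missing AZs after each rolling update might look like the sketch below. It assumes the API ELB is a classic ELB and that its description was obtained as JSON from aws elb describe-load-balancers, where each entry carries an AvailabilityZones list. The function name missing_zones and the zone names in the usage note are hypothetical.

```python
def missing_zones(elb_description, expected_zones):
    """Return the expected AZs that are not attached to the ELB, sorted.

    elb_description is one entry from the "LoadBalancerDescriptions" list
    in the output of `aws elb describe-load-balancers`.
    expected_zones is an assumption supplied by the caller, for example
    the zones the masters run in.
    """
    attached = set(elb_description.get("AvailabilityZones", []))
    return sorted(set(expected_zones) - attached)
```

For example, with an ELB description listing only "eu-west-1a" and "eu-west-1b" and an expected list that also contains "eu-west-1c", the function returns ["eu-west-1c"]. An empty result means all expected zones are still attached.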