Flagger Initialization causes disruption #374

Closed
@jtolsma

When a canary object is initialized for a deployment whose pods take a while to pull their images, there is an outage: it looks like the previous pods are terminated before the new primary pods are up.

{"level":"info","ts":"2019-11-18T19:39:35.794Z","caller":"controller/controller.go:235","msg":"Synced sandbox/canary-test2"}
{"level":"info","ts":"2019-11-18T19:39:43.863Z","caller":"canary/tracker.go:314","msg":"Secret canary-test2-primary synced","canary":"canary-test2.sandbox"}
{"level":"info","ts":"2019-11-18T19:39:43.871Z","caller":"canary/deployer.go:302","msg":"Deployment canary-test2-primary.sandbox created","canary":"canary-test2.sandbox"}
{"level":"info","ts":"2019-11-18T19:39:43.871Z","caller":"canary/deployer.go:50","msg":"Scaling down canary-test2.sandbox","canary":"canary-test2.sandbox"}
{"level":"info","ts":"2019-11-18T19:39:43.898Z","caller":"canary/deployer.go:355","msg":"HorizontalPodAutoscaler canary-test2-primary.sandbox created","canary":"canary-test2.sandbox"}
{"level":"info","ts":"2019-11-18T19:39:43.953Z","caller":"router/kubernetes.go:156","msg":"Service canary-test2 updated","canary":"canary-test2.sandbox"}
{"level":"info","ts":"2019-11-18T19:39:43.998Z","caller":"router/kubernetes.go:132","msg":"Service canary-test2-canary.sandbox created","canary":"canary-test2.sandbox"}
{"level":"info","ts":"2019-11-18T19:39:44.045Z","caller":"router/kubernetes.go:132","msg":"Service canary-test2-primary.sandbox created","canary":"canary-test2.sandbox"}
{"level":"info","ts":"2019-11-18T19:39:44.056Z","caller":"router/istio.go:78","msg":"DestinationRule canary-test2-canary.sandbox created","canary":"canary-test2.sandbox"}
{"level":"info","ts":"2019-11-18T19:39:44.067Z","caller":"router/istio.go:78","msg":"DestinationRule canary-test2-primary.sandbox created","canary":"canary-test2.sandbox"}
{"level":"info","ts":"2019-11-18T19:39:44.079Z","caller":"router/istio.go:230","msg":"VirtualService canary-test2.sandbox updated","canary":"canary-test2.sandbox"}
{"level":"info","ts":"2019-11-18T19:39:44.082Z","caller":"controller/controller.go:271","msg":"Halt advancement canary-test2-primary.sandbox waiting for rollout to finish: 0 of 3 updated replicas are available","canary":"canary-test2.sandbox"}
{"level":"info","ts":"2019-11-18T19:39:58.824Z","caller":"canary/deployer.go:50","msg":"Scaling down canary-test2.sandbox","canary":"canary-test2.sandbox"}
{"level":"info","ts":"2019-11-18T19:39:58.880Z","caller":"router/kubernetes.go:156","msg":"Service canary-test2 updated","canary":"canary-test2.sandbox"}
{"level":"info","ts":"2019-11-18T19:39:58.910Z","caller":"controller/controller.go:261","msg":"Initialization done! canary-test2.sandbox","canary":"canary-test2.sandbox"}

But by the time the following message is logged, the old deployment's pods are already being terminated. This causes an outage until the new pods come up, and because they become ready at different times, the first ready pods can be overloaded until all of them are up.
{"level":"info","ts":"2019-11-18T19:39:44.082Z","caller":"controller/controller.go:271","msg":"Halt advancement canary-test2-primary.sandbox waiting for rollout to finish: 0 of 3 updated replicas are available","canary":"canary-test2.sandbox"}

Kubernetes objects during the initialization:
sandbox get pods |grep canary-test2 ; sandbox get deployment canary-test2 -o json |jq .status ; sandbox get deployment canary-test2-primary -o json |jq .status

canary-test2-554d47bf6b-b25n6           2/2   Terminating       0   1m
canary-test2-554d47bf6b-fstxf           2/2   Terminating       0   1m
canary-test2-554d47bf6b-lj4tl           2/2   Terminating       0   1m
canary-test2-primary-576b775ddb-8smlp   1/2   Running           0   4s
canary-test2-primary-576b775ddb-jxcz8   1/2   Running           0   4s
canary-test2-primary-576b775ddb-w8j8q   0/2   PodInitializing   0   4s

{
  "conditions": [
    {
      "lastTransitionTime": "2019-11-18T19:46:02Z",
      "lastUpdateTime": "2019-11-18T19:46:02Z",
      "message": "Deployment has minimum availability.",
      "reason": "MinimumReplicasAvailable",
      "status": "True",
      "type": "Available"
    }
  ],
  "observedGeneration": 2
}

{
  "conditions": [
    {
      "lastTransitionTime": "2019-11-18T19:47:03Z",
      "lastUpdateTime": "2019-11-18T19:47:03Z",
      "message": "Deployment does not have minimum availability.",
      "reason": "MinimumReplicasUnavailable",
      "status": "False",
      "type": "Available"
    }
  ],
  "observedGeneration": 1,
  "replicas": 3,
  "unavailableReplicas": 3,
  "updatedReplicas": 3
}
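For illustration, the kind of check I would expect to gate the scale-down can be sketched from the second status object above (the one for canary-test2-primary). This is not Flagger's code; `is_available` and `safe_to_scale_down` are hypothetical names, and the sketch assumes the standard Deployment status fields `conditions`, `replicas` and `availableReplicas`.

```python
# Status of canary-test2-primary as captured above (timestamps omitted).
PRIMARY_STATUS = {
    "conditions": [
        {
            "message": "Deployment does not have minimum availability.",
            "reason": "MinimumReplicasUnavailable",
            "status": "False",
            "type": "Available",
        }
    ],
    "observedGeneration": 1,
    "replicas": 3,
    "unavailableReplicas": 3,
    "updatedReplicas": 3,
}


def is_available(status):
    """Return True only if the Deployment reports an Available condition of "True"."""
    for cond in status.get("conditions", []):
        if cond.get("type") == "Available":
            return cond.get("status") == "True"
    return False


def safe_to_scale_down(primary_status):
    """Hypothetical gate: scale the original deployment down only once every
    primary replica is reported available."""
    want = primary_status.get("replicas", 0)
    have = primary_status.get("availableReplicas", 0)  # absent from the status when 0
    return is_available(primary_status) and want > 0 and have >= want


print(safe_to_scale_down(PRIMARY_STATUS))
```

With the status captured during the outage this returns False, so under such a gate the original canary-test2 pods would have kept serving until the primary replicas were available.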

Labels: kind/bug (Something isn't working)
