
Configuration for 0 downtime deployments #660

Closed
@benwilson512

Description

Hey folks, I'm having a fair bit of trouble getting 0 downtime deployments to work. The issue:

After a new pod passes its readiness check, the ALB target group places the pod's IP in an "initial" state, which can last for a couple of seconds.

However, since the new pod is ready as far as Kubernetes is concerned, the rollout begins terminating an old pod, which immediately enters a "draining" state in the target group. At that point there are no pods available to answer requests.
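On the draining side, the usual mitigation I've seen is a `preStop` sleep so an old pod keeps serving while the ALB deregisters it. A minimal sketch, where the name, image, and sleep duration are all illustrative assumptions (and the image needs a `sleep` binary):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                  # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      terminationGracePeriodSeconds: 60   # must exceed the preStop sleep
      containers:
        - name: app
          image: my-app:1.0.0             # illustrative image
          lifecycle:
            preStop:
              exec:
                # Hold back SIGTERM so the pod keeps answering requests
                # while the target group moves its IP through "draining".
                command: ["sleep", "30"]
```

That only covers termination, though; it does nothing about the "initial" window on the new pod.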

To some extent this can be handled by simply increasing the number of pods. That isn't really a solution, though; it just lowers the probability that the rolling deployment outpaces the ALB's ability to keep up. If the AWS API were to suffer any kind of delay or outage, the deployment could complete without any live pods actually registered in the target group.
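For completeness, the surge behaviour can at least be pinned so the rollout never removes capacity before adding it. A minimal sketch of the strategy block (values are assumptions):

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never drop below the desired replica count
      maxSurge: 1         # bring up one replacement at a time
```

This narrows the window but doesn't close it, because Kubernetes "Ready" still precedes ALB "healthy".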

Is there any known way to require that a pod show up as "healthy" in the target group before Kubernetes considers it ready?
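For concreteness, what I'm after looks like the pod readiness gates that newer versions of the AWS Load Balancer Controller can inject from a namespace label. A minimal sketch, assuming a controller version that supports the injection (the namespace name is hypothetical):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-app                  # hypothetical namespace
  labels:
    # Ask the controller's webhook to inject a readiness gate tied to
    # target health into every pod created in this namespace.
    elbv2.k8s.aws/pod-readiness-gate-inject: enabled
```

With that in place, a new pod only counts as Ready once the ALB reports its target healthy, so the rolling update can't terminate an old pod before the replacement is actually receiving traffic.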
