Add persistent-drainable option for affinity-mode ingress annotation to support draining sticky server sessions #13484

Open
@gulachek

Background

Legacy web applications that use sticky session affinity are likely to store important application information in memory on the web server associated with the session. This statefulness can improve application performance by reducing the amount of contextual information that must be supplied by the browser or loaded from a persistent database to handle a request. The performance gain comes with a maintenance cost: the state on the web server needs to be handled carefully so that users are not disrupted during web server maintenance.

In a traditional model, where web server application management is hands-on, sysadmins can mark web servers as being in a "maintenance" or "draining" state to coordinate offloading user sessions from the draining web servers to other servers that are available. This may take several additional requests to the draining web server, but eventually the number of sessions on the web server will approach zero. This allows the admin to safely install OS updates, etc., on the server without disrupting users.

Problem

When these applications are ported to Kubernetes, there is a challenge. Kubernetes may mark a pod for deletion at any time. Although pods can define a preStop hook to terminate gracefully, if the pods' service is exposed via an ingress-nginx controller, even with affinity-mode: persistent, they immediately stop receiving the additional requests that may be necessary for gracefully migrating user sessions to other pods. This is because ingress-nginx removes an endpoint from the set of available upstreams when its Endpoint condition is no longer Ready, which is the case as soon as the pod is Terminating.
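To make the condition semantics concrete, here is a minimal sketch in Go of how the three endpoint condition flags relate to a pod's state. The type and function names are hypothetical and are not taken from ingress-nginx or from the Kubernetes client libraries (the real `k8s.io/api/discovery/v1` struct uses pointer fields); the sketch only illustrates why filtering on Ready drops a pod the moment it starts terminating, even while it is still able to serve.

```go
package main

// EndpointConditions is a hypothetical local stand-in for the condition
// flags Kubernetes publishes for each endpoint in an EndpointSlice.
type EndpointConditions struct {
	Ready       bool
	Serving     bool
	Terminating bool
}

// conditionsFor derives the flags from two facts about a pod: whether it
// currently passes its readiness checks, and whether it has a deletion
// timestamp (that is, whether it is terminating).
func conditionsFor(passesReadiness, hasDeletionTimestamp bool) EndpointConditions {
	return EndpointConditions{
		// Ready is never true for a terminating pod.
		Ready: passesReadiness && !hasDeletionTimestamp,
		// Serving reflects readiness and ignores the terminating state.
		Serving:     passesReadiness,
		Terminating: hasDeletionTimestamp,
	}
}

// keptByReadyFilter models an upstream filter that only looks at Ready,
// which is the behavior described in the paragraph above.
func keptByReadyFilter(c EndpointConditions) bool {
	return c.Ready
}
```

With these definitions, a pod running its preStop hook yields `conditionsFor(true, true)`, which is Serving and Terminating but not Ready, so `keptByReadyFilter` rejects it even though it could still answer requests.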

Proposed Feature

I would like to submit a PR for a new affinity-mode: persistent-drainable option. I'll paste my proposed documentation for the annotation here:

The annotation nginx.ingress.kubernetes.io/affinity-mode defines the stickiness of a session.

  • balanced (default)

    Setting this to balanced will redistribute some sessions if a deployment gets scaled up, therefore rebalancing the load on the servers.

  • persistent

    Setting this to persistent will not rebalance sessions to new servers, therefore providing greater stickiness. Sticky sessions will continue to be routed to the same server as long as its Endpoint's condition remains Ready. If the Endpoint stops being Ready, such as when a server pod receives a deletion timestamp, sessions will be rebalanced to another server.

  • persistent-drainable <-- NEW

    Setting this to persistent-drainable behaves like persistent, but sticky sessions will continue to be routed to the same server as long as its Endpoint's condition remains Serving, even after the server pod receives a deletion timestamp. This allows graceful session draining during the preStop lifecycle hook. New sessions will not be directed to these draining servers and will only be routed to a server whose Endpoint is Ready, except potentially when all servers are draining.
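The routing rules described for the three modes can be sketched as follows. This is an illustration in Go under stated assumptions, not the implementation in the associated PR: the `Endpoint` type and both function names are hypothetical, and the fallback to draining servers when no server is Ready is only one possible reading of "except potentially when all servers are draining".

```go
package main

// Endpoint is a hypothetical, simplified view of an upstream server.
type Endpoint struct {
	Address string
	Ready   bool // false once the pod has a deletion timestamp
	Serving bool // may remain true while the pod is draining
}

// routable reports whether a request may be sent to ep. sticky is true
// when the request's session is already bound to ep.
func routable(mode string, ep Endpoint, sticky bool) bool {
	if mode == "persistent-drainable" && sticky {
		// Existing sessions keep their server while it is still Serving.
		return ep.Serving
	}
	// balanced, persistent, and all new sessions require Ready.
	return ep.Ready
}

// newSessionCandidates returns the endpoints that may receive a new
// session: the Ready ones, or (assumed fallback) the Serving ones when
// no endpoint is Ready.
func newSessionCandidates(eps []Endpoint) []Endpoint {
	var ready, serving []Endpoint
	for _, ep := range eps {
		if ep.Ready {
			ready = append(ready, ep)
		} else if ep.Serving {
			serving = append(serving, ep)
		}
	}
	if len(ready) > 0 {
		return ready
	}
	return serving
}
```

For a draining endpoint (`Ready: false, Serving: true`), `routable("persistent", ep, true)` is false while `routable("persistent-drainable", ep, true)` is true, which is the only behavioral difference between the two modes in this sketch.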

This issue has been discussed since 2018 in nginx/kubernetes-ingress#5962, which should provide further motivation for this feature. Many of those involved in that discussion have resorted to using jcmoraisjr/haproxy-ingress, which has a drain-support feature.

It appears that the EndpointConditions API, which exposes the Serving and Terminating conditions alongside Ready, is available as of Kubernetes v1.26.

Is this a feature that the maintainers would generally be interested in merging? If so, I can get the CLA figured out and we can move more detailed review comments to the associated PR #13480.

Labels: kind/feature, needs-priority, needs-triage
