## Background
Legacy web applications that rely on sticky session affinity often store important application state in memory on the web server associated with the session. This statefulness can improve performance by reducing the contextual information that must be supplied by the browser or loaded from a persistent database to handle a request. The trade-off is a maintenance cost: the state on the web server must be handled carefully so that users are not disrupted during web server maintenance.

In the traditional model, where web server management is hands-on, sysadmins can place web servers in a "maintenance" or "draining" state to coordinate offloading user sessions from the draining servers onto other servers that remain available. Draining may require serving several additional requests on the draining server, but eventually the number of sessions on it approaches zero. At that point the admin can safely install OS updates and perform other maintenance on the server without disrupting users.
## Problem
When these applications are ported to Kubernetes, there is a challenge. Kubernetes may dynamically mark a pod for deletion. Pods are given a `preStop` hook to terminate gracefully, but if the pods' Service is exposed via an ingress-nginx controller, then even with `affinity-mode: persistent` they immediately stop receiving the additional requests that may be necessary to gracefully migrate user sessions to other pods. This is because ingress-nginx removes Endpoint objects from the set of available upstreams when their Endpoint condition is no longer `Ready`, or in other words when it is `Terminating`.
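To make the mechanics concrete, here is a minimal sketch of what an EndpointSlice can look like while a pod is shutting down (all names and addresses are hypothetical). During termination the `ready` condition goes `false` while `serving` can remain `true`, which is exactly the window this proposal wants to keep using for established sessions:

```yaml
# Hypothetical EndpointSlice for a pod that has received a deletion
# timestamp but is still completing its preStop hook.
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: legacy-app-abc12          # hypothetical name
  labels:
    kubernetes.io/service-name: legacy-app
addressType: IPv4
ports:
  - name: http
    port: 8080
    protocol: TCP
endpoints:
  - addresses:
      - "10.0.1.17"               # hypothetical pod IP
    conditions:
      ready: false        # ingress-nginx drops the upstream on this
      serving: true       # yet the pod can still serve requests while draining
      terminating: true
```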
## Proposed Feature
I would like to submit a PR for a new `affinity-mode: persistent-drainable` option. I'll paste my proposed documentation for the annotation here:
The annotation `nginx.ingress.kubernetes.io/affinity-mode` defines the stickiness of a session.

- `balanced` (default): Setting this to `balanced` will redistribute some sessions if a deployment gets scaled up, therefore rebalancing the load on the servers.
- `persistent`: Setting this to `persistent` will not rebalance sessions to new servers, therefore providing greater stickiness. Sticky sessions will continue to be routed to the same server as long as its Endpoint's condition remains `Ready`. If the Endpoint stops being `Ready`, such as when a server pod receives a deletion timestamp, sessions will be rebalanced to another server.
- `persistent-drainable` <-- NEW: Setting this to `persistent-drainable` behaves like `persistent`, but sticky sessions will continue to be routed to the same server as long as its Endpoint's condition remains `Serving`, even after the server pod receives a deletion timestamp. This allows graceful session draining during the `preStop` lifecycle hook (see the example manifests after this list). New sessions will not be directed to these draining servers and will only be routed to a server whose Endpoint is `Ready`, except potentially when all servers are draining.
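To illustrate how the proposed option would be used, here is a minimal sketch of an Ingress and a matching Deployment. All names, the image, the drain script, and the timeout values are hypothetical, and `persistent-drainable` is the value this issue proposes, not something ingress-nginx accepts today:

```yaml
# Hypothetical Ingress using the proposed annotation value.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: legacy-app
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/affinity-mode: "persistent-drainable"  # proposed value
    nginx.ingress.kubernetes.io/session-cookie-name: "route"
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: legacy-app
                port:
                  number: 8080
---
# Hypothetical Deployment snippet: the preStop hook blocks until the
# application has migrated its in-memory sessions, keeping the endpoint
# Serving (but not Ready) while it drains.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: legacy-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: legacy-app
  template:
    metadata:
      labels:
        app: legacy-app
    spec:
      terminationGracePeriodSeconds: 600   # must exceed the expected drain time
      containers:
        - name: app
          image: example/legacy-app:1.0    # hypothetical image
          ports:
            - containerPort: 8080
          lifecycle:
            preStop:
              exec:
                # Hypothetical drain script shipped with the app.
                command: ["/bin/sh", "-c", "/app/drain-sessions.sh"]
```

During the `preStop` window the pod's endpoint reports `serving: true` with `ready: false`, so under `persistent-drainable` existing sticky sessions would keep reaching the draining pod while new sessions go only to `Ready` pods.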
This issue has been discussed since 2018 in nginx/kubernetes-ingress#5962, which should provide further motivation for this feature. Many of those involved in the discussion have resorted to using jcmoraisjr/haproxy-ingress, which has a drain-support feature.
It appears that the `EndpointConditions` API is available in `v1.26`.
Is this a feature that the maintainers would generally be interested in merging? If so, I can get the CLA figured out and we can move more detailed review comments to the associated PR #13480.