Skip to content

Conversation

@garmato
Copy link

@garmato garmato commented Nov 17, 2025

Motivation

According to the gRPC documentation documentation when a connection exceeds the graceful shutdown timeout, the server should begin a secondary forceful shutdown timer. If the client still does not close the connection within this period, the server must terminate the connection.

Problem

Tonic currently does not enforce this second-phase shutdown. As a result, in long-running applications where clients become congested, stuck or simply ignore the GOAWAY frame, connections can linger indefinitely. This leads to unbounded growth in memory usage and a continuously increasing number of live Tokio tasks.

Solution

This PR introduces a max_connection_age_grace timeout that starts after max_connection_age expires. If the client does not close the connection within this grace period, the server forcefully closes it. This mirrors the behavior of the gRPC Go implementation.

A similar idea was previously mentioned in a closed PR but it seems the discussion stalled before the feature was implemented.

@garmato
Copy link
Author

garmato commented Nov 24, 2025

@LucioFranco any chance I could get a review from someone? Thank you!

@sauravzg
Copy link
Collaborator

Hi, If I understand correctly https://github.com/grpc/proposal/blob/master/A9-server-side-conn-mgt.md and https://grpc.io/docs/guides/server-graceful-stop/ are not particularly related.

https://github.com/grpc/proposal/blob/master/A9-server-side-conn-mgt.md is about garbage collection of old connections, while graceful shutdown is about server shutdowns.
The server will proceed to terminate open connections after a timeout regardless of the underlying connection management after the server shutdown timeout forecefully.

So, while this PR itself looks okay, I am not particularly sure this'll solve issues around server shutdown you may be facing, but it should help reduce unbounded memory growth due to unresponsible clients.

@garmato
Copy link
Author

garmato commented Nov 25, 2025

Hey @sauravzg, thank you.

I'm not facing any server shutdown issue, when I say shutdown I imply a connection shutdown and not a server shutdown.

As you already said the main purpose is to reduce the memory growth due to unresponsible clients.

Copy link
Collaborator

@sauravzg sauravzg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Let's wait for another review from another maintainer before merge.

Copy link
Collaborator

@arjan-bal arjan-bal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. To clarify, this change relates to the MAX_CONNECTION_AGE_GRACE configuration (documented here: https://grpc.io/docs/guides/keepalive/#keepalive-configuration-specification).

This is distinct from the Graceful Shutdown documentation, which describes shutting down the entire server instance.

This file seems to lack good unit test coverage. @LucioFranco are there existing tests that can be updated adding coverage the new field?

Copy link
Collaborator

@tottoto tottoto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants