Add http2 timeouts to close bad TCP connection #31672

Closed
wants to merge 5 commits into from

wangzlei
Contributor

Add http2 timeouts to close bad TCP connection

Description:

During ALB maintenance, the TCP connection from the Daemon to the X-Ray ALB is sometimes not closed properly. The Daemon client then has no way of knowing that the connection no longer works and keeps using it for new HTTP2 requests, because HTTP2 multiplexes all requests over a single TCP connection. The result is repeated request timeouts and, eventually, lost data.

In this change, two timeouts have been added:

  • ReadIdleTimeout time.Duration

      ReadIdleTimeout is the timeout after which a health check using a ping frame will be carried out if no frame is received on the connection. Note that a ping response is considered a received frame, so if there is no other traffic on the connection, the health check will be performed every ReadIdleTimeout interval. If zero, no health check is performed.
    
  • PingTimeout time.Duration

     PingTimeout is the timeout after which the connection will be closed if a response to Ping is not received. Defaults to 3s (read idle timeout + ping timeout).
    

So when a TCP connection is not closed properly, the http2 transport detects that no frames are being received and sends a Ping. If no response arrives within PingTimeout, it closes the connection. Pending requests receive a connection error and are retried, and subsequent requests cause a new connection to be created.

Link to tracking Issue:

Testing:

Verified the changes in https://github.com/aws/aws-xray-daemon

Documentation:
aws/aws-xray-daemon#216

Add http2 timeouts to close bad TCP connection
@wangzlei wangzlei closed this Mar 10, 2024