Skip to content

Conversation

@antonfirsov
Copy link
Contributor

This PR implements an additional policy for limiting RTT/BDP PINGs to comply with the PING accounting mechanism of a specific server in google's backends.

Long story short, we need to make sure we do not send more than 4 PINGs without also sending DATA, HEADERS or WINDOW_UPDATE.

Thanks to the update mentioned in grpc/grpc-dotnet#1765, I believe #62216 would only occur in extreme corner cases which I was able to emulate in a functional test, but cannot reproduce in real life. I think the fix is still valuable to make 100% sure we are compliant with google's backends and avoid rare sporadic bugs in the future, but I don't think we need to backport it to 6.0.

Fixes #62216.

Contributes to #69870 by making SocketsHttpHandler_Http2FlowControl_Test tests conditional on SocketsHttpHandler.IsSupported.

@ghost ghost added the area-System.Net.Http label Jun 20, 2022
@ghost ghost assigned antonfirsov Jun 20, 2022
@ghost
Copy link

ghost commented Jun 20, 2022

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

This PR implements an additional policy for limiting RTT/BDP PINGs to comply with the PING accounting mechanism of a specific server in google's backends.

Long story short, we need to make sure we do not send more than 4 PINGs without also sending DATA, HEADERS or WINDOW_UPDATE.

Thanks to the update mentioned in grpc/grpc-dotnet#1765, I believe #62216 would only occur in extreme corner cases which I was able to emulate in a functional test, but cannot reproduce in real life. I think the fix is still valuable to make 100% sure we are compliant with google's backends and avoid rare sporadic bugs in the future, but I don't think we need to backport it to 6.0.

Fixes #62216.

Contributes to #69870 by making SocketsHttpHandler_Http2FlowControl_Test tests conditional on SocketsHttpHandler.IsSupported.

Author: antonfirsov
Assignees: -
Labels:

area-System.Net.Http

Milestone: -

if (NetEventSource.Log.IsEnabled()) s.thisRef.Trace(s.http2Stream.StreamId, $"Wrote HEADERS frame. Length={current.Length}, flags={flags}");

// RTT PING bookeeping
s.thisRef._rttEstimator.OnDataOrHeadersOrWindowUpdateSent();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OnDataOrHeadersOrWindowUpdateSent is implemented with an interlocked operation. I assume the perf impact is marginal compared to the buffer copying code above, but I may be wrong.

@antonfirsov antonfirsov requested a review from a team June 20, 2022 14:09
@antonfirsov
Copy link
Contributor Author

/azp run runtime-libraries-coreclr outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

BinaryPrimitives.WriteInt64BigEndian(span.Slice(FrameHeader.Size), state.pingContent);

// RTT PING bookeeping
state.thisRef._rttEstimator.OnPingSent();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand this correctly, this will count ACKs as well. Is that wanted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. My assumption was that all pings are being accounted on the server side, regardless of the ACK flag. Even if it isn't the case, this stricter policy shouldn't cause a problem here, unless the server is also BDP/RTT measuring us intensively with PINGs and AFAIK servers don't do it.

@jtattermusch do you have any insights which are OK to share publicly?

internal void OnDataOrHeadersReceived(Http2Connection connection)
{
if (_state != State.Waiting) return;
if (_sendPolicyCounter >= MaxPingsWithoutSending) return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supposed to be a hard limit?
Am I reading the code correctly? This will check the value before putting the PING into the write queue, but there might be some unprocessed pending PING frames which would increment the counter, thus leading to more than MaxPingsWithoutSending PINGs sent. Or I'm missing something here?

Copy link
Contributor Author

@antonfirsov antonfirsov Jun 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I will try to implement a second check in SendPingAsync to deal with this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this, it looks like there is no easy way to get rid of these races without introducing a lot of additional complexity around this code, and I'm not sure if it's worth it.

I'm tempted to close this PR and the issue #62216, since I'm not sure we are trying to solve a problem that actually exist in practice after google's backend tuning. We can come back and work on a more sophisticated solution if we get new user reports.

@karelz
Copy link
Member

karelz commented Jun 28, 2022

Triage: We decided to not pursue the PR further, given the race conditions discovered above.
More details in: #62216 (comment)

@karelz karelz closed this Jun 28, 2022
@karelz karelz added this to the 7.0.0 milestone Jul 19, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Aug 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[HTTP/2] RTT pings shutting down connection in gRPC CI - regression in .NET 6

3 participants