xds: Update error handling for ADS stream close and failure scenarios #11596

Merged
6 commits merged into grpc:master from xds-fail-mode-behavior
Oct 9, 2024

Contributor

@DNVindhya DNVindhya commented Oct 3, 2024

According to gRFC A57:

If the ADS stream is closed without ever having received a response from the server, then the XdsClient should consider that a connectivity error. It should log the error and report the error to all watchers of resources that were subscribed to on that stream.

For status == OK, updated the new status to Status.UNAVAILABLE with description "ADS stream closed with OK before receiving a response".

Note that we do not consider it an error if the ADS stream was closed after having received a response on the stream. This is because there are legitimate reasons why the server may need to close the stream during normal operations, such as needing to rebalance load or the underlying connection hitting its max connection age limit (see gRFC A9).

For status != OK on a stream that has already received a response, updated the new status to Status.OK.

Updated handleStreamClosed to clean up resources and, for status != OK, report the error to all watchers of resources that were subscribed to on that stream.
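
A minimal sketch of the status mapping described above. It uses a hypothetical `SimpleStatus` stand-in rather than the real `io.grpc.Status`, so it runs without gRPC on the classpath. The names `AdsCloseStatusSketch` and `statusToReport` are illustrative and are not taken from the PR. Passing a non-OK status through unchanged when no response was received is an assumption based on the gRFC A57 text quoted above.

```java
// Hypothetical stand-in types; the actual change operates on io.grpc.Status.
final class AdsCloseStatusSketch {
  enum Code { OK, UNAVAILABLE, INTERNAL }

  static final class SimpleStatus {
    final Code code;
    final String description;

    SimpleStatus(Code code, String description) {
      this.code = code;
      this.description = description;
    }

    boolean isOk() {
      return code == Code.OK;
    }
  }

  // Maps the status the stream closed with to the status reported onward.
  static SimpleStatus statusToReport(SimpleStatus closeStatus, boolean responseReceived) {
    if (responseReceived) {
      // A close after a response is not treated as an error, even if the
      // stream itself closed with a non-OK status.
      return new SimpleStatus(Code.OK, null);
    }
    if (closeStatus.isOk()) {
      // Closed with OK before any response: treated as a connectivity error.
      return new SimpleStatus(
          Code.UNAVAILABLE, "ADS stream closed with OK before receiving a response");
    }
    // Assumption: a non-OK close with no response is reported unchanged.
    return closeStatus;
  }

  public static void main(String[] args) {
    SimpleStatus ok = new SimpleStatus(Code.OK, null);
    SimpleStatus reported = statusToReport(ok, false);
    System.out.println(reported.code + ": " + reported.description);
  }
}
```

With this shape, the caller only needs to look at the returned status to decide whether watchers should see an error.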

Contributor Author

@DNVindhya DNVindhya left a comment

Moving the PR back to draft; I need to update the unit tests.

@DNVindhya DNVindhya marked this pull request as draft October 4, 2024 20:44
@DNVindhya DNVindhya marked this pull request as ready for review October 7, 2024 16:58
// close streams for various reasons during normal operation, such as load balancing or
// underlying connection hitting its max connection age limit (see gRFC A9).
if (!status.isOk()) {
newStatus = Status.UNAVAILABLE.withDescription(
Contributor

Why isn't newStatus being set to Status.OK? If you did that, then the if statement below and the one in XdsClientImpl could simply be (!newStatus.isOk()). For XdsClientImpl in particular, it would be much clearer.

Contributor Author

That is what I initially had.
After looking at Go's error handling for the same case, I changed it to return an error instead.
I think we need to differentiate between an ADS stream being closed with Status.OK and one being considered Status.OK because it received a response, as stated in gRFC A57.

Contributor

You could do Status.OK.withDescription(

Contributor Author

Updated. The new status is Status.OK when an ADS stream is closed with an error but a response has been received.
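
A rough sketch of the simplification discussed in this thread: once the stream-close path hands over Status.OK for the "closed after a response" case, the receiving side only needs a single `!newStatus.isOk()` check before notifying watchers. Everything here is a hypothetical stand-in (`StreamClosedSketch`, `SimpleStatus`, and the `Consumer`-based watchers); it is not the XdsClientImpl code from the PR, and resource cleanup is reduced to a comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the watcher-notifying side of the client.
final class StreamClosedSketch {
  // Minimal status stand-in: a code name plus an optional description.
  static final class SimpleStatus {
    final String code;
    final String description;

    SimpleStatus(String code, String description) {
      this.code = code;
      this.description = description;
    }

    boolean isOk() {
      return "OK".equals(code);
    }
  }

  private final List<Consumer<SimpleStatus>> watchers = new ArrayList<>();

  void addWatcher(Consumer<SimpleStatus> watcher) {
    watchers.add(watcher);
  }

  // Returns how many watchers were notified of an error.
  int handleStreamClosed(SimpleStatus newStatus) {
    // Resource cleanup would happen here in a real client (omitted).
    if (newStatus.isOk()) {
      // OK means the close is not an error, so watchers are left alone.
      return 0;
    }
    for (Consumer<SimpleStatus> watcher : watchers) {
      watcher.accept(newStatus);
    }
    return watchers.size();
  }

  public static void main(String[] args) {
    StreamClosedSketch client = new StreamClosedSketch();
    client.addWatcher(s -> System.out.println("watcher saw " + s.code));
    client.handleStreamClosed(new SimpleStatus("OK", null));
    client.handleStreamClosed(new SimpleStatus("UNAVAILABLE", "no response received"));
  }
}
```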

@larry-safran larry-safran self-requested a review October 8, 2024 00:48
@DNVindhya DNVindhya merged commit 2e9c3e1 into grpc:master Oct 9, 2024
15 checks passed
@DNVindhya DNVindhya deleted the xds-fail-mode-behavior branch October 9, 2024 00:28
3 participants