xds: ADS stream failure triggers wildcard subscriptions on new stream #7013

Closed
@atollena

What version of gRPC are you using?

master branch/1.61.1

What version of Go are you using (go version)?

1.22.0

What operating system (Linux, Windows, …) and version?

Linux

What did you do?

  1. Create a channel with an xds:///dest target.
  2. Make sure it is properly initialized (send some successful requests over it). As a result, the gRPC client has an open ADS stream with a set of subscriptions for the API listener corresponding to the target.
  3. Shut this channel down via Close(), or make it enter IDLE state.
  4. At this point the gRPC client still has an open ADS stream to the xDS server, with empty subscriptions for all resource types (e.g. LDS, CDS, EDS): resource_names is empty in the last DiscoveryRequests. Note that the server sends no response to these requests, so it has no chance to send a new version to the client. The management server does not treat this empty subscription as a wildcard because there was at least one explicit subscription on the stream.
  5. The management server closes the stream, e.g. for rebalancing. According to A57, the client should simply reconnect. Note that grpc-go issues two warnings in this case (here and here), which is unnecessarily worrying to users -- this should probably just be at info level. Happy to submit a PR for this.
  6. gRPC reconnects to the management server. It creates an ADS stream, and sends discovery requests for all resource types (LDS, CDS, EDS) with empty resource names. The management server correctly interprets this as a wildcard subscription (see this section of the xDS protocol spec). What exactly LDS wildcard entails is up to the management server, but to illustrate the issue, let's assume the management server sends the API listener for xds:///dest as part of it. Note that gRPC discards all resources because it doesn't subscribe to any of them yet.
  7. The channel with xds:/// is recreated, or exits idle mode. This causes a new explicit subscription to the corresponding listener, but since the management server already sent this LDS resource as part of the wildcard subscription, it considers the client up to date and does not send a response.
  8. The client times out and considers that the resource does not exist. RPCs fail.
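
To illustrate why the same empty resource_names is handled differently in steps 4 and 6, here is a minimal sketch of the per-stream wildcard decision as I understand it from the xDS protocol spec. The `streamState` type and `isWildcard` method are hypothetical names for illustration, not code from grpc-go or any management server, and the sketch ignores the explicit `*` form of wildcard subscription.

```go
package main

import "fmt"

// streamState is a hypothetical model of what a management server tracks
// for one resource type on one ADS stream.
type streamState struct {
	sawExplicit bool // true once any request on this stream named a resource
}

// isWildcard reports whether a request carrying resourceNames should be
// treated as a wildcard subscription. An empty list is a wildcard only if
// no explicit subscription was ever made on this stream.
func (s *streamState) isWildcard(resourceNames []string) bool {
	if len(resourceNames) > 0 {
		s.sawExplicit = true
		return false
	}
	return !s.sawExplicit
}

func main() {
	// Old stream: explicit subscription first, then empty after Close()/IDLE.
	old := &streamState{}
	fmt.Println(old.isWildcard([]string{"dest"})) // false
	fmt.Println(old.isWildcard(nil))              // false: unsubscribe, step 4

	// New stream after reconnect: the first request is already empty.
	fresh := &streamState{}
	fmt.Println(fresh.isWildcard(nil)) // true: wildcard, step 6
}
```

The stream state is lost when the stream is closed, which is why the reconnect in step 6 turns an "unsubscribed from everything" client into a wildcard subscriber.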

What did you expect to see?

I expected one of the two following behaviours. Either:

  1. When the last xDS channel is closed or enters idle mode, then the xDS transport is closed and current version information discarded. That would sidestep the issue.
  2. Another reasonable behavior would be to never create empty subscriptions on a new ADS stream, to avoid triggering the protocol's wildcard subscription logic. I think upon reconnection, for CDS and LDS, if there is no subscription then there should be no request sent. Looking at the code, there seems to be some code trying to handle this case, but I think it incorrectly iterates over the resource types instead of the resources themselves.

What did you see instead?

gRPC sends discovery requests with an empty resource_names field upon reconnecting, triggering a wildcard subscription.
