Backport of Fix xDS missing endpoint race condition. into release/1.17.x #19874


hc-github-team-consul-core
Contributor

Backport

This PR is auto-generated from #19866 to be assessed for backporting due to the inclusion of the label backport/1.17.

The below text is copied from the body of the original PR.


The following PR is mostly a clone of work done by @ksmiley with some minor tweaks. I would like to thank him for tracking down and describing this complicated situation in such great detail. His work is greatly appreciated.

See the following issues for more context:

#17640
#17641

This fixes the following race condition:

  • Send update endpoints
  • Send update cluster
  • Recv ACK endpoints
  • Recv ACK cluster

Prior to this fix, this sequence left Envoy without the endpoints. The cluster update implicitly clears the endpoints in Envoy, but we never re-sent the endpoint data to compensate for the loss, because we incorrectly ACKed the now-invalid old endpoint hash. Since the endpoints' hash had not actually changed, they were not resent.

The fix for this is to effectively clear out the invalid pending ACKs for child resources whenever the parent changes. This ensures that we do not store the child's hash as accepted when the race occurs.
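
To make the race and the fix concrete, here is a minimal, self-contained sketch. It is a toy model, not the actual Consul xDS code: the `xdsState` type, its fields, the nonces, and the resource names are all hypothetical and exist only for illustration. It replays the four steps listed above and reports whether the endpoints would be resent afterwards.

```go
package main

import "fmt"

// Hypothetical resource type labels used only by this sketch.
const (
	clusterType  = "cluster"
	endpointType = "endpoint"
)

// pendingUpdate is a response that was sent and is still waiting for an ACK.
type pendingUpdate struct {
	typ    string
	hashes map[string]string // resource name -> hash that was sent
}

// xdsState is a toy model of the per-proxy bookkeeping.
type xdsState struct {
	legacy   bool                         // true = old behavior (no clearing)
	accepted map[string]map[string]string // type -> name -> ACKed hash
	pending  map[string]pendingUpdate     // nonce -> in-flight update
	children map[string][]string          // cluster name -> endpoint names
}

func newState(legacy bool) *xdsState {
	return &xdsState{
		legacy:   legacy,
		accepted: map[string]map[string]string{},
		pending:  map[string]pendingUpdate{},
		children: map[string][]string{},
	}
}

// send records an in-flight update. When a parent (cluster) changes, the
// fixed behavior forgets both accepted and pending hashes of its children.
func (s *xdsState) send(nonce, typ string, hashes map[string]string) {
	s.pending[nonce] = pendingUpdate{typ: typ, hashes: hashes}
	if typ != clusterType || s.legacy {
		return
	}
	for cluster := range hashes {
		for _, child := range s.children[cluster] {
			delete(s.accepted[endpointType], child)
			for _, p := range s.pending {
				if p.typ == endpointType {
					delete(p.hashes, child) // invalidate the pending ACK
				}
			}
		}
	}
}

// ack marks every hash still attached to the nonce as accepted.
func (s *xdsState) ack(nonce string) {
	p, ok := s.pending[nonce]
	if !ok {
		return
	}
	delete(s.pending, nonce)
	if s.accepted[p.typ] == nil {
		s.accepted[p.typ] = map[string]string{}
	}
	for name, hash := range p.hashes {
		s.accepted[p.typ][name] = hash
	}
}

// needsResend reports whether the current hash differs from the accepted one.
func (s *xdsState) needsResend(typ, name, hash string) bool {
	return s.accepted[typ][name] != hash
}

// runRace replays: send endpoints, send cluster, ACK endpoints, ACK cluster.
func runRace(legacy bool) bool {
	s := newState(legacy)
	s.children["web"] = []string{"web-eps"}
	s.send("n1", endpointType, map[string]string{"web-eps": "h-eps"})
	s.send("n2", clusterType, map[string]string{"web": "h-cluster"})
	s.ack("n1")
	s.ack("n2")
	return s.needsResend(endpointType, "web-eps", "h-eps")
}

func main() {
	fmt.Println("legacy: endpoints resent?", runRace(true))
	fmt.Println("fixed:  endpoints resent?", runRace(false))
}
```

In the legacy run the late endpoint ACK stores the unchanged hash as accepted, so nothing triggers a resend even though the cluster update wiped the endpoints. In the fixed run the cluster send drops the child's pending ACK, so the endpoint hash is never recorded as accepted and the endpoints are scheduled to be sent again.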

An escape-hatch environment variable, XDS_PROTOCOL_LEGACY_CHILD_RESEND, was added so that users can revert to the legacy behavior in the event that this change produces unforeseen side effects. See the following thread for extra context on why certainty around these race conditions is difficult: envoyproxy/envoy#13009
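
As a rough illustration of how such an escape hatch can be consulted, the sketch below reads the variable once and falls back to the new behavior when it is unset or unparsable. Only the variable name comes from this PR. The helper `useLegacyChildResend`, the injected lookup function, and the assumption that the value is boolean-like (parsed with `strconv.ParseBool`) are hypothetical; check the actual change for the values Consul accepts.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Name taken from the PR description.
const legacyChildResendEnv = "XDS_PROTOCOL_LEGACY_CHILD_RESEND"

// useLegacyChildResend is a hypothetical helper. It assumes the variable
// holds a boolean-like value; anything unset or unparsable keeps the fix on.
// The lookup function is injected so the helper can be tested without
// touching the process environment.
func useLegacyChildResend(lookup func(string) (string, bool)) bool {
	raw, ok := lookup(legacyChildResendEnv)
	if !ok {
		return false
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return false
	}
	return enabled
}

func main() {
	fmt.Println("legacy child resend enabled:", useLegacyChildResend(os.LookupEnv))
}
```

Defaulting to the fixed behavior on a parse error is a design choice of this sketch: a typo in the variable then cannot silently re-enable the race.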


Overview of commits


Auto approved Consul Bot automated PR

@github-actions github-actions bot added the theme/envoy/xds Related to Envoy support label Dec 8, 2023
@hashi-derek hashi-derek merged commit f80fc2b into release/1.17.x Dec 8, 2023
90 checks passed
@hashi-derek hashi-derek deleted the backport/derekm/NET-6565/resend-endpoints/generally-intent-boar branch December 8, 2023 17:59