Description
What version of gRPC are you using?
This most likely affects all versions since #5338, which added the ring hash behavior of always attempting to connect to at least one endpoint.
What version of Go are you using (go version)?
go version go1.22.4 darwin/arm64
What operating system (Linux, Windows, …) and version?
macOS 14.5
What did you do?
Use the ring_hash balancer with 2 priorities:
- The highest priority has 3+ endpoints, and none is available
- The lower priority has at least one available endpoint.
What did you expect to see?
If some, but not all, of the endpoints in the highest priority become available (technically, at least 2 are still unavailable), the traffic should go back to the endpoints in the highest priority.
What did you see instead?
The traffic sometimes continues to go to the lower priority indefinitely, until the endpoints change.
Additional notes
The ring hash balancer is specifically designed to avoid this situation, as outlined in A42: xDS Ring Hash LB Policy - Aggregated Connectivity States:
In addition, once the ring_hash policy reports TRANSIENT_FAILURE, it needs some way to recover from that state. The ring_hash policy normally requires pick requests to trigger subchannel connection attempts, but if it is being used as a child of the priority policy, it will not be getting any picks once it reports TRANSIENT_FAILURE. To work around this, it will make sure that it is attempting to connect (after applicable backoff period) to at least one subchannel at any given time. After a given subchannel fails a connection attempt, it will move on to the next subchannel in the ring. It will keep doing this until one of the subchannels successfully connects, at which point it will report READY and stop proactively trying to connect.
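To make the quoted behavior concrete, here is a minimal sketch of a walk that advances by ring position, so each failure moves on to the next ring entry belonging to a different endpoint. This is an illustration only and not the grpc-go code: the ring is modeled as a plain slice of endpoint names, nextIndex and probeOrder are hypothetical helpers, skipping entries for the same endpoint is my assumption, and backoff and connectivity state are left out.

```go
package main

import "fmt"

// nextIndex returns the position of the next ring entry after cur that
// belongs to a different endpoint than ring[cur], wrapping around the ring.
// If no other endpoint exists, it returns cur unchanged.
// Hypothetical helper: the ring is modeled as a slice of endpoint names.
func nextIndex(ring []string, cur int) int {
	for step := 1; step < len(ring); step++ {
		i := (cur + step) % len(ring)
		if ring[i] != ring[cur] {
			return i
		}
	}
	return cur
}

// probeOrder returns the endpoints that would be attempted over n
// consecutive connection failures, starting at ring position start.
func probeOrder(ring []string, start, n int) []string {
	if len(ring) == 0 {
		return nil
	}
	tried := make([]string, 0, n)
	idx := start
	for i := 0; i < n; i++ {
		tried = append(tried, ring[idx])
		idx = nextIndex(ring, idx) // the attempt failed, move on
	}
	return tried
}

func main() {
	// Endpoint A appears twice in this made-up ring.
	fmt.Println(probeOrder([]string{"A", "B", "A", "C"}, 0, 4)) // [A B A C]
}
```

Because the walk remembers its position in the ring rather than only the endpoint that failed, every endpoint eventually gets a connection attempt even when some endpoints appear more than once.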
I believe that the solution implemented in Go in #5338 is incomplete. Specifically, it always walks the ring from the start forward, looking for an endpoint other than the one it is currently trying to connect to. As a result, if the ring contains the same endpoint twice before every endpoint has appeared once (i.e. the ring looks like [A B A C] rather than [A B C A], which is very likely), it will cycle through the endpoints at the beginning of the ring up to the duplicate (in this case [A B]) without ever trying the remaining endpoints (C).
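The cycle can be reproduced with a small model. This is a simplified sketch under my own assumptions, not the grpc-go source: the ring is a slice of endpoint names, and nextAfterFirstOccurrence and attempts are hypothetical helpers. The model locates the failed endpoint by scanning from the start of the ring (so it always finds its first occurrence) and then steps forward to the next entry for a different endpoint, which is one reading of the behavior that matches both example rings.

```go
package main

import "fmt"

// nextAfterFirstOccurrence finds the FIRST ring entry for the failed
// endpoint by scanning from index 0, then returns the next entry (wrapping)
// that belongs to a different endpoint. Later duplicates of the failed
// endpoint are never used as the starting point.
// Simplified model for illustration, not the actual implementation.
func nextAfterFirstOccurrence(ring []string, failed string) string {
	for i, ep := range ring {
		if ep != failed {
			continue
		}
		for step := 1; step < len(ring); step++ {
			if next := ring[(i+step)%len(ring)]; next != failed {
				return next
			}
		}
		return failed // the ring holds a single distinct endpoint
	}
	if len(ring) > 0 {
		return ring[0] // failed endpoint is not in the ring
	}
	return ""
}

// attempts returns the endpoints tried over n consecutive failures,
// starting with the first ring entry.
func attempts(ring []string, n int) []string {
	if len(ring) == 0 {
		return nil
	}
	tried := make([]string, 0, n)
	cur := ring[0]
	for i := 0; i < n; i++ {
		tried = append(tried, cur)
		cur = nextAfterFirstOccurrence(ring, cur)
	}
	return tried
}

func main() {
	fmt.Println(attempts([]string{"A", "B", "A", "C"}, 6)) // [A B A B A B]
	fmt.Println(attempts([]string{"A", "B", "C", "A"}, 6)) // [A B C A B C]
}
```

With [A B A C], a failure of A at its second position is resolved back to the first A, so the walk returns to B and C is never attempted. With [A B C A], the duplicate comes after every endpoint has appeared once, so all three are tried.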