balancer: fix connectivity state aggregation algorithm to follow the spec #5473
The connectivity state aggregation algorithm as specified in the spec is as follows:
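A minimal Go sketch of that precedence, assuming the order given in the gRPC load balancing documentation: `READY` first, then `CONNECTING`, then `IDLE`, and `TRANSIENT_FAILURE` only when nothing else applies. The `State` type and `Aggregate` function are hypothetical names chosen for illustration, not the grpc-go API or the code in this PR.

```go
package main

// State is a hypothetical stand-in for grpc-go's connectivity.State,
// redefined here so the sketch compiles without external dependencies.
type State int

const (
	Idle State = iota
	Connecting
	Ready
	TransientFailure
)

// Aggregate collapses the states of a set of subConns into one state.
// Precedence, highest first: Ready, Connecting, Idle, TransientFailure.
// Assumption: an empty input reports TransientFailure, since no subConn
// is available to serve RPCs.
func Aggregate(states []State) State {
	counts := make(map[State]int)
	for _, s := range states {
		counts[s]++
	}
	switch {
	case counts[Ready] > 0:
		return Ready
	case counts[Connecting] > 0:
		return Connecting
	case counts[Idle] > 0:
		// Idle outranks TransientFailure.
		return Idle
	default:
		return TransientFailure
	}
}
```

For example, calling `Aggregate([]State{TransientFailure, Idle})` from a `main` function yields `Idle` under this ordering, whereas an ordering that ranks `TransientFailure` above `Idle` would report `TransientFailure`.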
In our existing implementation, we were giving `TRANSIENT_FAILURE` precedence over `IDLE`. This has not caused any issues so far, because the algorithm was only used by the `round_robin` LB policy, where `IDLE` is a fleeting state (the policy triggers a connection attempt whenever the subConn enters `IDLE`). We plan to use this connectivity state aggregation algorithm in the `weightedtarget` LB policy, where `IDLE` is not a fleeting state and needs to be prioritized over `TRANSIENT_FAILURE`.

Also, this algorithm does not take care of suppressing connectivity state changes from `TRANSIENT_FAILURE` to non-`READY` states. This is currently taken care of by the LB policy implementations and will continue to remain that way.

Fixes #5458
RELEASE NOTES:
`IDLE` over `TRANSIENT_FAILURE` when aggregating connectivity state