Skip to content

Make Ring Hash LB a petiole policy#10610

Merged
larry-safran merged 26 commits intogrpc:masterfrom
larry-safran:ring_hash_dualstack
Nov 9, 2023
Merged

Make Ring Hash LB a petiole policy#10610
larry-safran merged 26 commits intogrpc:masterfrom
larry-safran:ring_hash_dualstack

Conversation

@larry-safran
Copy link
Contributor

@larry-safran larry-safran commented Oct 17, 2023

Implementation of A61 for Ring Hash

Major changes to behavior:

  • Lazy create PickFirstLoadBalancers rather than lazily connect subchannels
  • When picking, look for first entry in ring not in TF rather than focus on first 2 entries in the ring after hash

Copy link
Member

@YifeiZhuang YifeiZhuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not finished yet, put what I have now.

/**
* The default implementation not only shuts down the lb policy, it also removes this child
* from the map of children maintained by the petiole policy. Therefore, if you plan to
* reactivate this child later, then you must override this method in your extending policy.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the exact scenario that we need to re-activate? Is that for the subchannels in the picker that is different from the updated picker due to handling new addresses?

The address update that removes the old addresses and deactivate childLbState would immediately remove the key from the childLbState. Those deactivated childLbState would never be reused in the perspective of handling an update(acceptResolvedAddress), because the reusing is based on the childLbStates map key.

I wish we do not need this override, for code readability purpose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the logic other than resetting the flag and left it to subclasses to do what they want with complex deactivate/reactivate pairs. Right now, I'm using reactivate in RingHash tests and ClusterManagerLB has a delayed shutdown on deactivate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left it to subclasses to do what they want with complex deactivate/reactivate pairs.

I see. That sounds fine.
Could you elaborate on why not keep shutdown in the default MultiChildLB.ChildLbState.deactivate() and ringhash use the default?

If we decide to remove shutdown from the default deactivate(), then you have to update other LBs, which inherit MultiChildLB, on their ChildLBState.deactivate().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only class that overrode deactivate was ClusterManagerLoadBalancer which specifically wants the shutdown to be triggered by a timer rather than the removal from the address list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only place that calls deactivate is in MultiChildLB's acceptResolvedAddressesInternal, so that method explicitly does a shutdown rather than expecting deactivate to do it. Note that logic moved to the new MultiChildLB.updateLbStateAndShutdownRemoved and is conditional on the protected method shutdownInAcceptResolvedAddresses. Therefore, the other children don't need to be updated.

@larry-safran larry-safran added the kokoro:force-run Add this label to a PR to tell Kokoro to re-run all tests. Not generally necessary label Nov 8, 2023
…irst 2 elements, but rather takes the first ring element not in TF and uses that.

Also, added some comments, fixed a couple of bugs in MultiChildLoadBalancer and replaced SubchannelView with direct usage of RingHashChildLbState.
… created and respond to some code review comments.
…es explicitly do the shutdowns after updating picker by default
…state and removed child cleanup in acceptResolvedAddresses.
…child cleanup when processing acceptResolvedAddresses.
…y to start a connection in updateBalancingState when there aren't any active subchannels.
…and there are 2+ TF then report TF. 1 TF + idle/connecting report connecting.
@grpc-kokoro grpc-kokoro removed the kokoro:force-run Add this label to a PR to tell Kokoro to re-run all tests. Not generally necessary label Nov 9, 2023
larry-safran and others added 4 commits November 9, 2023 11:26
The methods other than forTarget() were no longer used after 493af03.
forTarget() was no longer used after cda0e9d.
This class is of questionable utility and generally not used.
@larry-safran larry-safran merged commit dfdd50b into grpc:master Nov 9, 2023
@larry-safran larry-safran deleted the ring_hash_dualstack branch November 9, 2023 21:46
@ejona86
Copy link
Member

ejona86 commented Nov 9, 2023

We need to figure out what is going on with your rebase process. The rebase also caused weirdness with the commit message.

@larry-safran larry-safran restored the ring_hash_dualstack branch November 9, 2023 23:15
@larry-safran larry-safran deleted the ring_hash_dualstack branch November 9, 2023 23:15
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 8, 2024
@ejona86 ejona86 modified the milestone: 1.79 Jan 6, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants