Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

balancer APIs (even v2) are not composable #2909

Open
jhump opened this issue Jul 16, 2019 · 5 comments
Open

balancer APIs (even v2) are not composable #2909

jhump opened this issue Jul 16, 2019 · 5 comments
Labels
P2 Type: Feature New features or improvements in behavior

Comments

@jhump
Copy link
Member

jhump commented Jul 16, 2019

The balancer APIs are not at all amenable to composition. For example, if I want to add affinity support to an existing balancer (where a context key indicates an affinity hint and otherwise it routes per the underlying/decorated balancer), I have to wrap almost every type (balancer.Builder, balancer.Balancer, balancer.ClientConn, balancer.Picker, and balancer.SubConn) as well as keep track of a lot of redundant state (basically mirror much of the internal state of the underlying balancer).

A non-trivial change that would go a long way towards fixing this would be to separate the concerns of sub-connection management (instructing the client conn what connections to create and delete based on resolver updates) vs. picking (given a set of ready connections, choose one). The latter half, picking a connection, would be composable, allowing me to create a custom picker that blends or decorates existing strategies. That also allows for the connection management piece to be more easily composed/decorated (for example, adding sophistication to the round-robin strategy other than one sub-conn per resolved address).

A small change that could help for now (though it would not be compatible so would require "V3" interfaces) would be to have Balancer.UpdateSubConnState return a Picker, or maybe (Picker, connectivity.State) instead of requiring it to call ClientConn.UpdateBalancerState. While this doesn't allow blending strategies (since two strategies would "compete" over creation/deletion of sub-connections), it does allow decoration: a decorator could wrap the picker. (Though it must still manage its own set of ready sub-connections.)

Slight aside: it also seems that these APIs aren't amenable to smart retry patterns. Maybe a picker's signature should also receive additional info, like addresses already tried by prior attempts. (This seems particularly important for implementing hedging if/when that ever happens.)

@jhump jhump changed the title balancer APIs (even v2) are not composable and do not accept a context balancer APIs (even v2) are not composable Jul 17, 2019
@dfawley
Copy link
Member

dfawley commented Jul 18, 2019

@jhump thanks for the suggestions. Responses inline:

A non-trivial change that would go a long way towards fixing this would be to separate the concerns of sub-connection management (instructing the client conn what connections to create and delete based on resolver updates) vs. picking (given a set of ready connections, choose one).

Can you provide a concrete proposal for this? (Not formal/finalized or even complete -- I'm just looking for something that shows what you are thinking.) I'm not sure I know what you're talking about, so having specific things to point to would help.

A small change that could help for now (though it would not be compatible so would require "V3" interfaces) would be to have Balancer.UpdateSubConnState return a Picker, or maybe (Picker, connectivity.State) instead of requiring it to call ClientConn.UpdateBalancerState.

My concern with this idea is that updates to the state of a subconn don't necessarily indicate the picker needs to be updated. E.g. if a subconn goes from CONNECTING to TRANSIENT_FAILURE, most will consider that a nop. However, the picker being updated is a meaningful event to grpc, so now we need a way to communicate back to the ClientConn that no picker change should occur.

Slight aside: it also seems that these APIs aren't amenable to smart retry patterns. Maybe a picker's signature should also receive additional info, like addresses already tried by prior attempts. (This seems particularly important for implementing hedging if/when that ever happens.)

I think the API we have is fine for this, it's just that we aren't yet populating what is needed for this. We have the balancer.PickOptions parameter to hold this data once we decide how to include it.

@dfawley dfawley added Type: Feature New features or improvements in behavior and removed Type: Bug labels Aug 15, 2019
@dfawley dfawley added the P2 label Aug 29, 2019
@stale
Copy link

stale bot commented Sep 6, 2019

This issue is labeled as requiring an update from the reporter, and no update has been received after 7 days. If no update is provided in the next 7 days, this issue will be automatically closed.

@jhump
Copy link
Member Author

jhump commented Sep 7, 2019

Sorry I never replied. I do appreciate your response, but I ended up hacking around this for my own needs, so it's priority is lower. (I just held my nose and re-implemented round-robin pick strategy as a hard-coded fallback/default strategy for something that supported affinity.)

I think the ideal API would allow different strategies to be picked for connection management (given results from a resolver, what sub-conns need to be created/destroyed and how many) vs picking (given a set of active sub-conns, select one). Also, the ideal API would be amenable to the decorator pattern -- such that I could easily wrap an existing balancer implementation and augment some aspect of it, such as augmenting the picking logic to conditionally support affinity (e.g. use context value to pick a sub-conn, but able to fall-back to underlying picker logic).

Affinity is just one such possible use case. Another is being able to implement connection management logic that does not need to be strictly tied to a picker algorithm (so being able to easily "mix in" round-robin, select-first, random, etc pick strategy into something that provides a layer-4-HWLB-specific connection strategy).

Sorry if that is too vague.

@jhump
Copy link
Member Author

jhump commented Oct 18, 2019

@dfawley, I've been writing some new balancers to solve some issues we've seen with the existing round-robin strategy. We've also been looking into custom connection management strategies since we have some services with 100s of backends, and having every single client establish 100s of sub-conns is a little silly (and will only get worse as the service scales up over time and needs more replicas).

So this issue has reared its head for me again. This time, I took a little more time to examine the existing balancer/base and found some fairly minor changes that can be made there so that it provides the separation of concerns and composability I was looking for.

Let me know what you think: #3108.

@zasweq
Copy link
Contributor

zasweq commented Nov 28, 2022

Discussed and had back and forth in #3108. Opened a new issue with decision and tracked by #3425.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 Type: Feature New features or improvements in behavior
Projects
None yet
Development

No branches or pull requests

3 participants