Skip to content

Conversation

@marek-szews
Copy link

Introduce a new LB policy, random_subsetting. This policy selects a subset of addresses and passes them to the child LB policy to correct the resulting server-side load balancing.

Signed-off-by: marek-szews <szews@google.com>
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Oct 14, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@easwars easwars self-assigned this Oct 15, 2025
@easwars easwars self-requested a review October 15, 2025 19:17
@easwars easwars added Type: Feature New features or improvements in behavior Area: xDS Includes everything xDS related, including LB policies used with xDS. Area: Resolvers/Balancers Includes LB policy & NR APIs, resolver/balancer/picker wrappers, LB policy impls and utilities. and removed Area: xDS Includes everything xDS related, including LB policies used with xDS. labels Oct 15, 2025
@easwars
Copy link
Contributor

easwars commented Oct 15, 2025

@marek-szews : Could you please get the tests to pass. Thanks.

Comment on lines 79 to 81
SubsetSize uint64 `json:"subset_size,omitempty"`

ChildPolicy *iserviceconfig.BalancerConfig `json:"child_policy,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally use camelCase for the json annotations since the protobuf library uses that when marshalling protobuf messages to JSON. So, please change these to use camelCase. Thanks.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I have changed the name of variable to lower case I've got:
--- FAIL: Test (0.00s)
--- FAIL: Test/ParseConfig (0.00s)
--- FAIL: Test/ParseConfig/happy-case-default (0.00s)
panic: cannot handle unexported field at {*randomsubsetting.LBConfig}.subsetSize:
"google.golang.org/grpc/balancer/randomsubsetting".LBConfig
consider using cmpopts.EquateComparable to compare comparable Go types [recovered]
panic: cannot handle unexported field at {*randomsubsetting.LBConfig}.subsetSize:
"google.golang.org/grpc/balancer/randomsubsetting".LBConfig
consider using cmpopts.EquateComparable to compare comparable Go types

return b
}

// LBConfig is the config for the outlier detection balancer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment needs to be updated.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}

// LBConfig is the config for the outlier detection balancer.
type LBConfig struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for this type to be exported. I don't see a reason here. So, if there is no reason, please unexport it. The fields can remain exported since you might want that for JSON marshal/unmarshal.

Copy link
Author

@marek-szews marek-szews Oct 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we need to use this type in the test function.

Comment on lines 86 to 87
// Default top layer values.
SubsetSize: 10,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this default value specified?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the previous proposal of config I have found somthing like this (PROP-423):
message RandomSubsettingLbConfig {
// subset_size indicates how many backends every client will be connected to.
// Default is 20.
google.protobuf.UInt32Value subset_size = 1;

But I saw in the most current version (PROTO-157) that this information has been removed. My suggestion is to set subsetSize to 2 by default, because 1 should suggest to use pick_first instead.

}

// if someonw needs subsetSize == 1, he should use pick_first instead
if lbCfg.SubsetSize < 2 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The gRFC only says the following:

    // The value is required and must be greater than 0.

Is the condition that the size be at least 2 specified somewhere else?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the discussion at line 86

return addressesSet[i].hash < addressesSet[j].hash
})

b.logger.Infof("resulting subset: %v", addressesSet[:b.cfg.SubsetSize])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please guard this log line with a verbosity check of 2.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


// implements the subsetting algorithm, as described in A68: https://github.com/grpc/proposal/pull/423
func (b *subsettingBalancer) prepareChildResolverState(s resolver.State) resolver.State {
addresses := s.Addresses
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be using the Endpoints field in resolver.State for everything here (except for computing the hash where we use the first address inside of the endpoint).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 174 to 184
// Convert back to resolver.addresses
addressesSubset := make([]resolver.Address, b.cfg.SubsetSize)
for _, eh := range addressesSet[:b.cfg.SubsetSize] {
addressesSubset = append(addressesSubset, eh.addr)
}

return resolver.State{
Addresses: addressesSubset,
ServiceConfig: s.ServiceConfig,
Attributes: s.Attributes,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this should deal with endpoints and not addresses.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +187 to +201
func (b *subsettingBalancer) ResolverError(err error) {
b.child.ResolverError(err)
}

func (b *subsettingBalancer) UpdateSubConnState(sc balancer.SubConn, state balancer.SubConnState) {
b.child.UpdateSubConnState(sc, state)
}

func (b *subsettingBalancer) Close() {
b.child.Close()
}

func (b *subsettingBalancer) ExitIdle() {
b.child.ExitIdle()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we embed the gracefulswitch.Balancer in the subsettingBalancer, and thereby get rid of all these forwarding methods?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How we can do this ?

ChildPolicy *iserviceconfig.BalancerConfig `json:"child_policy,omitempty"`
}

func (bb) ParseConfig(s json.RawMessage) (serviceconfig.LoadBalancingConfig, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to add unit tests for this method. Thanks.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@github-actions
Copy link

This PR is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.

@github-actions github-actions bot added stale and removed stale labels Oct 23, 2025
@dfawley
Copy link
Member

dfawley commented Oct 29, 2025

Hi @marek-szews - I think we're waiting on you to make some updates so we can continue the review. We can discuss Friday but I wanted to ping this in part to prevent the stale bot from closing it.

@arjan-bal arjan-bal added this to the 1.78 Release milestone Oct 30, 2025
Copy link
Author

@marek-szews marek-szews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rework done, tests passed

Comment on lines 79 to 81
SubsetSize uint64 `json:"subset_size,omitempty"`

ChildPolicy *iserviceconfig.BalancerConfig `json:"child_policy,omitempty"`
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I have changed the name of variable to lower case I've got:
--- FAIL: Test (0.00s)
--- FAIL: Test/ParseConfig (0.00s)
--- FAIL: Test/ParseConfig/happy-case-default (0.00s)
panic: cannot handle unexported field at {*randomsubsetting.LBConfig}.subsetSize:
"google.golang.org/grpc/balancer/randomsubsetting".LBConfig
consider using cmpopts.EquateComparable to compare comparable Go types [recovered]
panic: cannot handle unexported field at {*randomsubsetting.LBConfig}.subsetSize:
"google.golang.org/grpc/balancer/randomsubsetting".LBConfig
consider using cmpopts.EquateComparable to compare comparable Go types

Comment on lines 86 to 87
// Default top layer values.
SubsetSize: 10,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the previous proposal of config I have found somthing like this (PROP-423):
message RandomSubsettingLbConfig {
// subset_size indicates how many backends every client will be connected to.
// Default is 20.
google.protobuf.UInt32Value subset_size = 1;

But I saw in the most current version (PROTO-157) that this information has been removed. My suggestion is to set subsetSize to 2 by default, because 1 should suggest to use pick_first instead.

ChildPolicy *iserviceconfig.BalancerConfig `json:"child_policy,omitempty"`
}

func (bb) ParseConfig(s json.RawMessage) (serviceconfig.LoadBalancingConfig, error) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}

if err := json.Unmarshal(s, lbCfg); err != nil { // Validates child config if present as well.
return nil, fmt.Errorf("subsetting: unable to unmarshal LBconfig: %s, error: %v", string(s), err)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

SubsetSize: 10,
}

if err := json.Unmarshal(s, lbCfg); err != nil { // Validates child config if present as well.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


// implements the subsetting algorithm, as described in A68: https://github.com/grpc/proposal/pull/423
func (b *subsettingBalancer) prepareChildResolverState(s resolver.State) resolver.State {
addresses := s.Addresses
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

addressesSet := make([]AddressWithHash, backendCount)
// calculate hash for each endpoint
for i, endpoint := range addresses {

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return addressesSet[i].hash < addressesSet[j].hash
})

b.logger.Infof("resulting subset: %v", addressesSet[:b.cfg.SubsetSize])
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 174 to 184
// Convert back to resolver.addresses
addressesSubset := make([]resolver.Address, b.cfg.SubsetSize)
for _, eh := range addressesSet[:b.cfg.SubsetSize] {
addressesSubset = append(addressesSubset, eh.addr)
}

return resolver.State{
Addresses: addressesSubset,
ServiceConfig: s.ServiceConfig,
Attributes: s.Attributes,
}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +187 to +201
func (b *subsettingBalancer) ResolverError(err error) {
b.child.ResolverError(err)
}

func (b *subsettingBalancer) UpdateSubConnState(sc balancer.SubConn, state balancer.SubConnState) {
b.child.UpdateSubConnState(sc, state)
}

func (b *subsettingBalancer) Close() {
b.child.Close()
}

func (b *subsettingBalancer) ExitIdle() {
b.child.ExitIdle()
}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How we can do this ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: Resolvers/Balancers Includes LB policy & NR APIs, resolver/balancer/picker wrappers, LB policy impls and utilities. Type: Feature New features or improvements in behavior

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants