Skip to content

Commit

Permalink
Fix least request lb not fair (envoyproxy#29873)
Browse files Browse the repository at this point in the history
* Add new idea for selecting hosts among those not selected yet.

Signed-off-by: Leonardo da Mata <ldamata@spotify.com>

* Change how we choose full table scan

Signed-off-by: Leonardo da Mata <ldamata@spotify.com>

* Remove cout

Signed-off-by: Leonardo da Mata <ldamata@spotify.com>

* Fix Tests for load_balancer_impl_test

Signed-off-by: Leonardo da Mata <ldamata@spotify.com>

* Fix format and make sure full scan happens only when selected or the number of choices is larger than the size.

Signed-off-by: Leonardo da Mata <ldamata@spotify.com>

* Enable new option on extesions api only

Signed-off-by: Leonardo da Mata <ldamata@spotify.com>

* Fix Integration tests.

Signed-off-by: Leonardo da Mata <ldamata@spotify.com>

* Add release notes for full scan in least request LB.

Signed-off-by: Leonardo da Mata <ldamata@spotify.com>

* Fix ref for release note.

Signed-off-by: Leonardo da Mata <ldamata@spotify.com>

* Fix release notes

Signed-off-by: Leonardo da Mata <ldamata@spotify.com>

* Update release note

Signed-off-by: Leonardo da Mata <ldamata@spotify.com>

---------

Signed-off-by: Leonardo da Mata <ldamata@spotify.com>
Signed-off-by: Leonardo da Mata <barroca@gmail.com>
Co-authored-by: Leonardo da Mata <ldamata@spotify.com>
  • Loading branch information
barroca and barroca authored Oct 26, 2023
1 parent 41fa676 commit 3ea2bc4
Show file tree
Hide file tree
Showing 7 changed files with 133 additions and 21 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ option (udpa.annotations.file_status).package_version_status = ACTIVE;
// This configuration allows the built-in LEAST_REQUEST LB policy to be configured via the LB policy
// extension point. See the :ref:`load balancing architecture overview
// <arch_overview_load_balancing_types>` for more information.
// [#next-free-field: 6]
message LeastRequest {
// The number of random healthy hosts from which the host with the fewest active requests will
// be chosen. Defaults to 2 so that we perform two-choice selection if the field is not set.
Expand Down Expand Up @@ -58,4 +59,9 @@ message LeastRequest {

// Configuration for local zone aware load balancing or locality weighted load balancing.
common.v3.LocalityLbConfig locality_lb_config = 4;

// Configuration for performing full scan on the list of hosts.
// If this configuration is set, when selecting the host a full scan on the list hosts will be
// used to select the one with least requests instead of using random choices.
google.protobuf.BoolValue enable_full_scan = 5;
}
14 changes: 14 additions & 0 deletions changelogs/current.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,12 @@ behavior_changes:
issue https://github.com/envoyproxy/envoy/issues/29461 for more specific detail and examples.
minor_behavior_changes:
- area: upstream
change: |
Fixed a reported issue (https://github.com/envoyproxy/envoy/issues/11004) that causes the Least
Request load balancer policy to be unfair when the number of hosts are very small, when the number
of hosts is smaller than the choice_count, instead of randomly selection hosts from the list, we
perform a full scan on it to choose the host with least requests.
# *Changes that may cause incompatibilities for some users, but should not for most*
- area: local_rate_limit
change: |
Expand Down Expand Up @@ -56,10 +62,18 @@ removed_config_or_runtime:
runtime flag and legacy code path.
new_features:
- area: upstream
change: |
Added :ref:`enable_full_scan <envoy_v3_api_msg_extensions.load_balancing_policies.least_request.v3.LeastRequest>`
option to the least requested load balancer. If set to true, Envoy will perform a full scan on the list of hosts
instead of using :ref:`choice_count
<envoy_v3_api_msg_extensions.load_balancing_policies.least_request.v3.LeastRequest>`
to select the hosts.
- area: stats
change: |
added :ref:`per_endpoint_stats <envoy_v3_api_field_config.cluster.v3.TrackClusterStats.per_endpoint_stats>` to get some metrics
for each endpoint in a cluster.
- area: jwt
change: |
The jwt filter can now serialize non-primitive custom claims when maping claims to headers.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,9 @@ same or different weights.
approach is nearly as good as an O(N) full scan). This is also known as P2C (power of two
choices). The P2C load balancer has the property that a host with the highest number of active
requests in the cluster will never receive new requests. It will be allowed to drain until it is
less than or equal to all of the other hosts.
less than or equal to all of the other hosts. The number of hosts chosen can be changed by setting
``choice_count``.

* *all weights not equal*: If two or more hosts in the cluster have different load balancing
weights, the load balancer shifts into a mode where it uses a weighted round robin schedule in
which weights are dynamically adjusted based on the host's request load at the time of selection.
Expand Down
19 changes: 19 additions & 0 deletions source/common/upstream/load_balancer_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1299,6 +1299,25 @@ HostConstSharedPtr LeastRequestLoadBalancer::unweightedHostPick(const HostVector
const HostsSource&) {
HostSharedPtr candidate_host = nullptr;

// Do full scan if it's required explicitly or the number of choices is equal to or larger than
// the hosts size.
if ((hosts_to_use.size() <= choice_count_) || enable_full_scan_) {
for (const auto& sampled_host : hosts_to_use) {
if (candidate_host == nullptr) {
// Make a first choice to start the comparisons.
candidate_host = sampled_host;
continue;
}

const auto candidate_active_rq = candidate_host->stats().rq_active_.value();
const auto sampled_active_rq = sampled_host->stats().rq_active_.value();
if (sampled_active_rq < candidate_active_rq) {
candidate_host = sampled_host;
}
}
return candidate_host;
}

for (uint32_t choice_idx = 0; choice_idx < choice_count_; ++choice_idx) {
const int rand_idx = random_.random() % hosts_to_use.size();
const HostSharedPtr& sampled_host = hosts_to_use[rand_idx];
Expand Down
5 changes: 4 additions & 1 deletion source/common/upstream/load_balancer_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -710,7 +710,9 @@ class LeastRequestLoadBalancer : public EdfLoadBalancerBase {
least_request_config.has_active_request_bias()
? absl::optional<Runtime::Double>(
{least_request_config.active_request_bias(), runtime})
: absl::nullopt) {
: absl::nullopt),
enable_full_scan_(
PROTOBUF_GET_WRAPPED_OR_DEFAULT(least_request_config, enable_full_scan, false)) {
initialize();
}

Expand Down Expand Up @@ -746,6 +748,7 @@ class LeastRequestLoadBalancer : public EdfLoadBalancerBase {
double active_request_bias_{};

const absl::optional<Runtime::Double> active_request_bias_runtime_;
const bool enable_full_scan_{};
};

/**
Expand Down
101 changes: 83 additions & 18 deletions test/common/upstream/load_balancer_impl_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -2787,20 +2787,20 @@ TEST_P(LeastRequestLoadBalancerTest, SingleHost) {

// Host weight is 1.
{
EXPECT_CALL(random_, random()).WillOnce(Return(0)).WillOnce(Return(2)).WillOnce(Return(3));
EXPECT_CALL(random_, random()).WillOnce(Return(0));
EXPECT_EQ(hostSet().healthy_hosts_[0], lb_.chooseHost(nullptr));
}

// Host weight is 100.
{
EXPECT_CALL(random_, random()).WillOnce(Return(0)).WillOnce(Return(2)).WillOnce(Return(3));
EXPECT_CALL(random_, random()).WillOnce(Return(0));
EXPECT_EQ(hostSet().healthy_hosts_[0], lb_.chooseHost(nullptr));
}

HostVector empty;
{
hostSet().runCallbacks(empty, empty);
EXPECT_CALL(random_, random()).WillOnce(Return(0)).WillOnce(Return(2)).WillOnce(Return(3));
EXPECT_CALL(random_, random()).WillOnce(Return(0));
EXPECT_EQ(hostSet().healthy_hosts_[0], lb_.chooseHost(nullptr));
}

Expand All @@ -2823,37 +2823,44 @@ TEST_P(LeastRequestLoadBalancerTest, Normal) {

hostSet().healthy_hosts_[0]->stats().rq_active_.set(1);
hostSet().healthy_hosts_[1]->stats().rq_active_.set(2);
EXPECT_CALL(random_, random()).WillOnce(Return(0)).WillOnce(Return(2)).WillOnce(Return(3));
EXPECT_CALL(random_, random()).WillOnce(Return(0));
EXPECT_EQ(hostSet().healthy_hosts_[0], lb_.chooseHost(nullptr));

hostSet().healthy_hosts_[0]->stats().rq_active_.set(2);
hostSet().healthy_hosts_[1]->stats().rq_active_.set(1);
EXPECT_CALL(random_, random()).WillOnce(Return(0)).WillOnce(Return(2)).WillOnce(Return(3));
EXPECT_CALL(random_, random()).WillOnce(Return(0));
EXPECT_EQ(hostSet().healthy_hosts_[1], lb_.chooseHost(nullptr));
}

TEST_P(LeastRequestLoadBalancerTest, PNC) {
hostSet().healthy_hosts_ = {makeTestHost(info_, "tcp://127.0.0.1:80", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:81", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:82", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:83", simTime())};
makeTestHost(info_, "tcp://127.0.0.1:83", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:84", simTime())};
hostSet().hosts_ = hostSet().healthy_hosts_;
hostSet().runCallbacks({}, {}); // Trigger callbacks. The added/removed lists are not relevant.

hostSet().healthy_hosts_[0]->stats().rq_active_.set(4);
hostSet().healthy_hosts_[1]->stats().rq_active_.set(3);
hostSet().healthy_hosts_[2]->stats().rq_active_.set(2);
hostSet().healthy_hosts_[3]->stats().rq_active_.set(1);
hostSet().healthy_hosts_[4]->stats().rq_active_.set(5);

// Creating various load balancer objects with different choice configs.
envoy::config::cluster::v3::Cluster::LeastRequestLbConfig lr_lb_config;
lr_lb_config.mutable_choice_count()->set_value(2);
LeastRequestLoadBalancer lb_2{priority_set_, nullptr, stats_, runtime_,
random_, common_config_, lr_lb_config, simTime()};
lr_lb_config.mutable_choice_count()->set_value(5);
LeastRequestLoadBalancer lb_5{priority_set_, nullptr, stats_, runtime_,
lr_lb_config.mutable_choice_count()->set_value(3);
LeastRequestLoadBalancer lb_3{priority_set_, nullptr, stats_, runtime_,
random_, common_config_, lr_lb_config, simTime()};
lr_lb_config.mutable_choice_count()->set_value(4);
LeastRequestLoadBalancer lb_4{priority_set_, nullptr, stats_, runtime_,
random_, common_config_, lr_lb_config, simTime()};
lr_lb_config.mutable_choice_count()->set_value(6);
LeastRequestLoadBalancer lb_6{priority_set_, nullptr, stats_, runtime_,
random_, common_config_, lr_lb_config, simTime()};

// Verify correct number of choices.

// 0 choices configured should default to P2C.
Expand All @@ -2864,20 +2871,78 @@ TEST_P(LeastRequestLoadBalancerTest, PNC) {
EXPECT_CALL(random_, random()).Times(3).WillRepeatedly(Return(0));
EXPECT_EQ(hostSet().healthy_hosts_[0], lb_2.chooseHost(nullptr));

// 5 choices configured results in P5C.
EXPECT_CALL(random_, random()).Times(6).WillRepeatedly(Return(0));
EXPECT_EQ(hostSet().healthy_hosts_[0], lb_5.chooseHost(nullptr));

// Verify correct host chosen in P5C scenario.
// Verify correct host chosen in P3C scenario.
EXPECT_CALL(random_, random())
.Times(6)
.Times(4)
.WillOnce(Return(0))
.WillOnce(Return(3))
.WillOnce(Return(1))
.WillOnce(Return(2));
EXPECT_EQ(hostSet().healthy_hosts_[3], lb_3.chooseHost(nullptr));

// Verify correct host chosen in P4C scenario.
EXPECT_CALL(random_, random())
.Times(5)
.WillOnce(Return(0))
.WillOnce(Return(3))
.WillOnce(Return(2))
.WillOnce(Return(1));
EXPECT_EQ(hostSet().healthy_hosts_[3], lb_5.chooseHost(nullptr));
.WillOnce(Return(1))
.WillOnce(Return(1))
.WillOnce(Return(2));
EXPECT_EQ(hostSet().healthy_hosts_[3], lb_4.chooseHost(nullptr));

// When the number of hosts is smaller or equal to the number of choices we don't call
// random() since we do a full table scan.
EXPECT_CALL(random_, random()).WillOnce(Return(9999));
EXPECT_EQ(hostSet().healthy_hosts_[3], lb_6.chooseHost(nullptr));
}

TEST_P(LeastRequestLoadBalancerTest, FullScan) {
hostSet().healthy_hosts_ = {makeTestHost(info_, "tcp://127.0.0.1:80", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:81", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:82", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:83", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:84", simTime())};
hostSet().hosts_ = hostSet().healthy_hosts_;
hostSet().runCallbacks({}, {}); // Trigger callbacks. The added/removed lists are not relevant.

hostSet().healthy_hosts_[0]->stats().rq_active_.set(4);
hostSet().healthy_hosts_[1]->stats().rq_active_.set(3);
hostSet().healthy_hosts_[2]->stats().rq_active_.set(2);
hostSet().healthy_hosts_[3]->stats().rq_active_.set(1);
hostSet().healthy_hosts_[4]->stats().rq_active_.set(5);

// Creating various load balancer objects with different choice configs.
envoy::extensions::load_balancing_policies::least_request::v3::LeastRequest lr_lb_config;
lr_lb_config.mutable_choice_count()->set_value(2);
// Enable full table scan on hosts
lr_lb_config.mutable_enable_full_scan()->set_value(true);
common_config_.mutable_healthy_panic_threshold()->set_value(0);

LeastRequestLoadBalancer lb_2{priority_set_, nullptr, stats_, runtime_,
random_, 1, lr_lb_config, simTime()};
lr_lb_config.mutable_choice_count()->set_value(3);
LeastRequestLoadBalancer lb_3{priority_set_, nullptr, stats_, runtime_,
random_, 1, lr_lb_config, simTime()};
lr_lb_config.mutable_choice_count()->set_value(4);
LeastRequestLoadBalancer lb_4{priority_set_, nullptr, stats_, runtime_,
random_, 1, lr_lb_config, simTime()};
lr_lb_config.mutable_choice_count()->set_value(6);
LeastRequestLoadBalancer lb_6{priority_set_, nullptr, stats_, runtime_,
random_, 1, lr_lb_config, simTime()};

// random is called only once every time and is not to select the host.

EXPECT_CALL(random_, random()).WillOnce(Return(9999));
EXPECT_EQ(hostSet().healthy_hosts_[3], lb_2.chooseHost(nullptr));

EXPECT_CALL(random_, random()).WillOnce(Return(9999));
EXPECT_EQ(hostSet().healthy_hosts_[3], lb_3.chooseHost(nullptr));

EXPECT_CALL(random_, random()).WillOnce(Return(9999));
EXPECT_EQ(hostSet().healthy_hosts_[3], lb_4.chooseHost(nullptr));

EXPECT_CALL(random_, random()).WillOnce(Return(9999));
EXPECT_EQ(hostSet().healthy_hosts_[3], lb_6.chooseHost(nullptr));
}

TEST_P(LeastRequestLoadBalancerTest, WeightImbalance) {
Expand Down
5 changes: 4 additions & 1 deletion test/integration/http_subset_lb_integration_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -176,7 +176,10 @@ class HttpSubsetLbIntegrationTest
}
}

if (is_hash_lb_) {
// The default number of choices for the LEAST_REQUEST policy is 2 hosts, if the number of hosts
// is equal to the number of choices, a full scan happens instead, this means that the same host
// will be chosen.
if (is_hash_lb_ || (GetParam() == envoy::config::cluster::v3::Cluster::LEAST_REQUEST)) {
EXPECT_EQ(hosts.size(), 1) << "Expected a single unique host to be selected for "
<< envoy::config::cluster::v3::Cluster::LbPolicy_Name(GetParam());
} else {
Expand Down

0 comments on commit 3ea2bc4

Please sign in to comment.