load balancing: Check each host at most once in LeastRequests LB #11006

Closed
wants to merge 12 commits
1 change: 1 addition & 0 deletions docs/root/version_history/current.rst
@@ -27,6 +27,7 @@ Changes
* listener: added in place filter chain update flow for tcp listener update which doesn't close connections if the corresponding network filter chain is equivalent during the listener update.
Can be disabled by setting runtime feature `envoy.reloadable_features.listener_in_place_filterchain_update` to false.
Also added additional draining filter chain stat for :ref:`listener manager <config_listener_manager_stats>` to track the number of draining filter chains and the number of in place update attempts.
* load balancing: :ref:`least requests load balancing <arch_overview_load_balancing_types_least_request>` was improved to check each host at most one time per pick attempt.
Member
I would put this under "minor behavioral changes" since it's not really a new feature and you cannot toggle this behavior.

* logger: added :ref:`--log-format-prefix-with-location <operations_cli>` command line option to prefix '%v' with file path and line number.
* lrs: added new *envoy_api_field_service.load_stats.v2.LoadStatsResponse.send_all_clusters* field
in LRS response, which allows management servers to avoid explicitly listing all clusters it is
37 changes: 23 additions & 14 deletions source/common/upstream/load_balancer_impl.cc
@@ -717,7 +717,7 @@ void EdfLoadBalancerBase::refresh(uint32_t priority) {
const auto add_hosts_source = [this](HostsSource source, const HostVector& hosts) {
// Nuke existing scheduler if it exists.
auto& scheduler = scheduler_[source] = Scheduler{};
refreshHostSource(source);
refreshHostSource(source, hosts);

// Check if the original host weights are equal and skip EDF creation if they are. When all
// original weights are equal we can rely on unweighted host pick to do optimal round robin and
@@ -805,22 +805,31 @@ HostConstSharedPtr EdfLoadBalancerBase::chooseHostOnce(LoadBalancerContext* cont
}

HostConstSharedPtr LeastRequestLoadBalancer::unweightedHostPick(const HostVector& hosts_to_use,
const HostsSource&) {
const HostsSource& source) {
// Choose random choice_count_ hosts and pick one of them with the lowest number of active
// requests. At each iteration a random host is picked among ones which were not considered yet.
HostSharedPtr candidate_host = nullptr;
for (uint32_t choice_idx = 0; choice_idx < choice_count_; ++choice_idx) {
const int rand_idx = random_.random() % hosts_to_use.size();
HostSharedPtr sampled_host = hosts_to_use[rand_idx];

if (candidate_host == nullptr) {
// Make a first choice to start the comparisons.
candidate_host = sampled_host;
continue;
}
uint64_t candidate_active_rq = 0;
Member
Assuming we keep this implementation, can you add more comments about what is going on in this function? It would help readability.

auto& host_indexes_to_use = unweighted_host_indexes_[source];
ASSERT(hosts_to_use.size() == host_indexes_to_use.size());
uint32_t size = host_indexes_to_use.size();
// As each host is considered at most once - number of iterations should not surpass number of
// hosts.
const uint32_t choice_count = std::min(choice_count_, size);

for (uint32_t choice_idx = 0; choice_idx < choice_count; ++choice_idx) {
ASSERT(size >= 1);
const int rand_idx = random_.random() % size;
HostSharedPtr sampled_host = hosts_to_use[host_indexes_to_use[rand_idx]];
// Move the host to the end of the list and shrink the list by one by decrementing the size.
// It makes this host inaccessible for next iterations, which guarantees no host duplicates
// across iterations.
std::swap(host_indexes_to_use[rand_idx], host_indexes_to_use[--size]);

const auto candidate_active_rq = candidate_host->stats().rq_active_.value();
const auto sampled_active_rq = sampled_host->stats().rq_active_.value();
if (sampled_active_rq < candidate_active_rq) {
candidate_host = sampled_host;
if (candidate_host == nullptr || sampled_active_rq < candidate_active_rq) {
candidate_host = std::move(sampled_host);
candidate_active_rq = sampled_active_rq;
}
}

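For readers skimming the diff: the swap-and-shrink loop in unweightedHostPick above is a partial Fisher-Yates shuffle, which is what guarantees each host is sampled at most once per pick. A minimal standalone sketch of the same idea (the function name and the std::mt19937 source below are illustrative only, not Envoy's random generator):

```cpp
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <random>
#include <utility>
#include <vector>

// Pick `k` distinct indices out of `n` via a partial Fisher-Yates shuffle: each
// drawn index is swapped to the back of the live range and the range shrinks by
// one, so no index can be drawn twice.
std::vector<uint32_t> sampleWithoutReplacement(uint32_t n, uint32_t k, std::mt19937& rng) {
  std::vector<uint32_t> indexes(n);
  for (uint32_t i = 0; i < n; ++i) {
    indexes[i] = i;
  }
  std::vector<uint32_t> picked;
  uint32_t size = n;
  const uint32_t draws = std::min(k, n); // never draw more often than there are hosts
  for (uint32_t i = 0; i < draws; ++i) {
    const uint32_t rand_idx = rng() % size;
    picked.push_back(indexes[rand_idx]);
    std::swap(indexes[rand_idx], indexes[--size]); // retire the drawn index
  }
  return picked;
}

int main() {
  std::mt19937 rng(42);
  for (uint32_t idx : sampleWithoutReplacement(10, 3, rng)) {
    std::cout << idx << "\n"; // three distinct indices in [0, 10)
  }
  return 0;
}
```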
19 changes: 16 additions & 3 deletions source/common/upstream/load_balancer_impl.h
@@ -376,7 +376,7 @@ class EdfLoadBalancerBase : public ZoneAwareLoadBalancerBase {

private:
void refresh(uint32_t priority);
virtual void refreshHostSource(const HostsSource& source) PURE;
virtual void refreshHostSource(const HostsSource& source, const HostVector& hosts) PURE;
virtual double hostWeight(const Host& host) PURE;
virtual HostConstSharedPtr unweightedHostPick(const HostVector& hosts_to_use,
const HostsSource& source) PURE;
@@ -401,7 +401,7 @@ class RoundRobinLoadBalancer : public EdfLoadBalancerBase {
}

private:
void refreshHostSource(const HostsSource& source) override {
void refreshHostSource(const HostsSource& source, const HostVector&) override {
// insert() is used here on purpose so that we don't overwrite the index if the host source
// already exists. Note that host sources will never be removed, but given how uncommon this
// is it probably doesn't matter.
@@ -455,7 +455,17 @@ class LeastRequestLoadBalancer : public EdfLoadBalancerBase {
}

private:
void refreshHostSource(const HostsSource&) override {}
void refreshHostSource(const HostsSource& source, const HostVector& hosts) override {
// Initialize per-source list of host indexes.
// This list is used later on to guarantee that pick call considers each host at most once.
// Note that host sources will never be removed, but given how uncommon this
// is it probably doesn't matter.
auto& host_indexes = unweighted_host_indexes_[source];
host_indexes.resize(hosts.size(), 0);
for (uint32_t i = 0; i < host_indexes.size(); ++i) {
host_indexes[i] = i;
}
}
double hostWeight(const Host& host) override {
// Here we scale host weight by the number of active requests at the time we do the pick. We
// always add 1 to avoid division by 0. It might be possible to do better by picking two hosts
@@ -470,6 +480,9 @@ class LeastRequestLoadBalancer : public EdfLoadBalancerBase {
HostConstSharedPtr unweightedHostPick(const HostVector& hosts_to_use,
const HostsSource& source) override;
const uint32_t choice_count_;

// List of host indexes per HostsSource for fair random sampling.
std::unordered_map<HostsSource, std::vector<uint32_t>, HostsSourceHash> unweighted_host_indexes_;
Contributor
I would recommend enhancing test/common/upstream/load_balancer_benchmark.cc so that memory and memory_per_host are tracked by a LeastRequest build benchmark similar to BM_RoundRobinLoadBalancerBuild.

};

/**
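Following up on the benchmark suggestion above, here is a rough, self-contained sketch of what a build benchmark reporting memory and memory_per_host counters could look like. Everything in it is a stand-in (FakeLeastRequestLb, approxMemoryBytes); the real change would reuse the setup and memory accounting that BM_RoundRobinLoadBalancerBuild already uses in load_balancer_benchmark.cc.

```cpp
#include <cstdint>
#include <vector>

#include "benchmark/benchmark.h"

// Stand-in for the real load balancer build: the only point here is the shape of a
// benchmark that reports "memory" and "memory_per_host" counters. None of these names
// come from Envoy; the real version would construct a LeastRequestLoadBalancer the same
// way BM_RoundRobinLoadBalancerBuild builds its round-robin LB.
struct FakeLeastRequestLb {
  explicit FakeLeastRequestLb(uint64_t num_hosts) : host_indexes_(num_hosts) {}
  std::vector<uint32_t> host_indexes_;
};

// Placeholder for a real allocation probe (e.g. Envoy's memory stats); here we only
// estimate the container footprint so the sketch compiles on its own.
static uint64_t approxMemoryBytes(const FakeLeastRequestLb& lb) {
  return lb.host_indexes_.capacity() * sizeof(uint32_t);
}

static void BM_LeastRequestLoadBalancerBuild(benchmark::State& state) {
  const uint64_t num_hosts = state.range(0);
  for (auto _ : state) {
    FakeLeastRequestLb lb(num_hosts); // "build" step under measurement
    benchmark::DoNotOptimize(lb);
    const double memory = static_cast<double>(approxMemoryBytes(lb));
    state.counters["memory"] = memory;
    state.counters["memory_per_host"] = memory / static_cast<double>(num_hosts);
  }
}
BENCHMARK(BM_LeastRequestLoadBalancerBuild)->Arg(500)->Arg(2500)->Arg(10000);

BENCHMARK_MAIN();
```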
23 changes: 11 additions & 12 deletions test/common/upstream/load_balancer_impl_test.cc
@@ -1395,22 +1395,22 @@ TEST_P(LeastRequestLoadBalancerTest, SingleHost) {

// Host weight is 1.
{
EXPECT_CALL(random_, random()).WillOnce(Return(0)).WillOnce(Return(2)).WillOnce(Return(3));
EXPECT_CALL(random_, random()).WillOnce(Return(0)).WillOnce(Return(2));
stats_.max_host_weight_.set(1UL);
EXPECT_EQ(hostSet().healthy_hosts_[0], lb_.chooseHost(nullptr));
}

// Host weight is 100.
{
EXPECT_CALL(random_, random()).WillOnce(Return(0)).WillOnce(Return(2)).WillOnce(Return(3));
EXPECT_CALL(random_, random()).WillOnce(Return(0)).WillOnce(Return(2));
stats_.max_host_weight_.set(100UL);
EXPECT_EQ(hostSet().healthy_hosts_[0], lb_.chooseHost(nullptr));
}

HostVector empty;
{
hostSet().runCallbacks(empty, empty);
EXPECT_CALL(random_, random()).WillOnce(Return(0)).WillOnce(Return(2)).WillOnce(Return(3));
EXPECT_CALL(random_, random()).WillOnce(Return(0)).WillOnce(Return(2));
EXPECT_EQ(hostSet().healthy_hosts_[0], lb_.chooseHost(nullptr));
}

@@ -1469,25 +1469,24 @@ TEST_P(LeastRequestLoadBalancerTest, PNC) {

// 0 choices configured should default to P2C.
EXPECT_CALL(random_, random()).Times(3).WillRepeatedly(Return(0));
EXPECT_EQ(hostSet().healthy_hosts_[0], lb_.chooseHost(nullptr));
EXPECT_EQ(hostSet().healthy_hosts_[3], lb_.chooseHost(nullptr));
Member
Nit, but can you check that lb_ has 0 choices configured?


// 2 choices configured results in P2C.
EXPECT_CALL(random_, random()).Times(3).WillRepeatedly(Return(0));
EXPECT_EQ(hostSet().healthy_hosts_[0], lb_2.chooseHost(nullptr));
EXPECT_EQ(hostSet().healthy_hosts_[3], lb_2.chooseHost(nullptr));
Member
Host 3 is returned because you're swapping the first element with whatever is at the end. I'd clarify in comments here that this depends on the way you're doing this partial Fisher-Yates shuffle, and that this case might fail if that behavior changes, not because P2C is broken.
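Concretely, with four healthy hosts and random() pinned to 0 the index list starts as [0, 1, 2, 3]: the first draw samples host 0 and swaps its index to the back, leaving [3, 1, 2, 0] with an effective size of 3, so the second draw (0 % 3 = 0) lands on host 3, which then wins the active-request comparison against host 0 in this test's setup. The expectation therefore encodes the swap-to-the-back bookkeeping, not a property of P2C itself.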


// 5 choices configured results in P5C.
EXPECT_CALL(random_, random()).Times(6).WillRepeatedly(Return(0));
EXPECT_EQ(hostSet().healthy_hosts_[0], lb_5.chooseHost(nullptr));
EXPECT_CALL(random_, random()).Times(5).WillRepeatedly(Return(0));
EXPECT_EQ(hostSet().healthy_hosts_[3], lb_5.chooseHost(nullptr));
Member
Also clarify in comments here that even though we're selecting the first item in the host set over and over, it's shuffling through the whole set of 4 hosts, comparing each host's active rq count, and stopping early (hence random() is called 5 times instead of 6).
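(For the arithmetic: the pick loop now draws min(choice_count, host count) = min(5, 4) = 4 random indexes, and the remaining call presumably comes from the host-set selection that happens before unweightedHostPick; the same pattern gives the Times(3) = 1 + 2 expectation in the P2C case above.)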


// Verify correct host chosen in P5C scenario.
EXPECT_CALL(random_, random())
.Times(6)
.Times(5)
.WillOnce(Return(0))
.WillOnce(Return(3))
.WillOnce(Return(1))
.WillOnce(Return(0))
.WillOnce(Return(3))
.WillOnce(Return(2))
.WillOnce(Return(1));
.WillOnce(Return(1))
.WillOnce(Return(2));
EXPECT_EQ(hostSet().healthy_hosts_[3], lb_5.chooseHost(nullptr));
}
