Skip to content

Commit

Permalink
c/health: validate target_node_id when collecting health report
Browse files Browse the repository at this point in the history
The health report is used to determine if a cluster node is online and
available. When a node id changes but the RPC endpoint does not change
the requester may incorrectly assume that the node with the previous
node_id but the same endpoint is still operational. Added validation of
the node that the request was sent to before collecting the health
report. This way a sender will have correct information about the node
availability as only the request targeted to the node with the correct
node id will be replied with success.

Fixes: CORE-5766

Signed-off-by: Michał Maślanka <michal@redpanda.com>
  • Loading branch information
mmaslankaprv committed Aug 9, 2024
1 parent 7886aec commit 90eafa8
Show file tree
Hide file tree
Showing 2 changed files with 17 additions and 3 deletions.
4 changes: 2 additions & 2 deletions src/v/cluster/health_monitor_backend.cc
Original file line number Diff line number Diff line change
Expand Up @@ -360,9 +360,9 @@ health_monitor_backend::collect_remote_node_health(model::node_id id) {
ss::this_shard_id(),
id,
max_metadata_age(),
[timeout](controller_client_protocol client) mutable {
[timeout, id](controller_client_protocol client) mutable {
return client.collect_node_health_report(
get_node_health_request{}, rpc::client_opts(timeout));
get_node_health_request(id), rpc::client_opts(timeout));
})
.then(&rpc::get_ctx_data<get_node_health_reply>)
.then([this, id](result<get_node_health_reply> reply) {
Expand Down
16 changes: 15 additions & 1 deletion src/v/cluster/service.cc
Original file line number Diff line number Diff line change
Expand Up @@ -514,7 +514,21 @@ ss::future<get_cluster_health_reply> service::get_cluster_health_report(
}

ss::future<get_node_health_reply>
service::do_collect_node_health_report(get_node_health_request) {
service::do_collect_node_health_report(get_node_health_request req) {
// validate if the receiving node is the one that that the request is
// addressed to
if (
req.get_target_node_id() != get_node_health_request::node_id_not_set
&& req.get_target_node_id() != _controller->self()) {
vlog(
clusterlog.debug,
"Received a get_node_health request addressed to different node. "
"Requested node id: {}, current node id: {}",
req.get_target_node_id(),
_controller->self());
co_return get_node_health_reply{.error = errc::invalid_target_node_id};
}

auto res = co_await _hm_frontend.local().get_current_node_health();
if (res.has_error()) {
co_return get_node_health_reply{
Expand Down

0 comments on commit 90eafa8

Please sign in to comment.