Fix dissemination large shard count #22977
Conversation
```cpp
vlog(clusterlog.trace, "Received a metadata update");
co_await ss::parallel_for_each(
  boost::irange<ss::shard_id>(0, ss::smp::count),
  [this, leaders = std::move(leaders)](ss::shard_id shard) {
```
Is there a specific reason you switched to ss::do_with? Just doing &leaders should do the same, no? The coroutine will keep it alive until ss::parallel_for_each returns?
This part is called very often; I dropped coroutines to prevent allocations. In this case the coroutine doesn't help much with readability.
Why not use _leaders.invoke_on_all(...)?
This part is called very often; I dropped coroutines to prevent allocations. In this case the coroutine doesn't help much with readability.
I really wouldn't overthink this in general. do_with will need an extra alloc as well to store the state.
This path isn't covered by any microbench right now, right? (Just asking)
No, it is not.
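For context, here is a minimal sketch of the two lifetime-management options discussed above, assuming Seastar's `ss::do_with` and `ss::parallel_for_each`; the `leaders_t` alias and `apply_on_shard` helper are placeholders, not the handler's real types:

```cpp
#include <seastar/core/coroutine.hh>
#include <seastar/core/do_with.hh>
#include <seastar/core/future.hh>
#include <seastar/core/loop.hh>
#include <seastar/core/smp.hh>
#include <boost/range/irange.hpp>
#include <vector>

using leaders_t = std::vector<int>; // placeholder for the leader-update vector

// Hypothetical per-shard work: reads the vector through a const reference.
seastar::future<> apply_on_shard(seastar::shard_id shard, const leaders_t& leaders) {
    return seastar::smp::submit_to(shard, [&leaders] { (void)leaders.size(); });
}

// Option 1: ss::do_with moves the state into one extra heap allocation and keeps
// it alive until the continuation chain resolves; no coroutine frame is needed.
seastar::future<> with_do_with(leaders_t leaders) {
    return seastar::do_with(std::move(leaders), [](const leaders_t& l) {
        return seastar::parallel_for_each(
          boost::irange<seastar::shard_id>(0, seastar::smp::count),
          [&l](seastar::shard_id shard) { return apply_on_shard(shard, l); });
    });
}

// Option 2: a coroutine local; the coroutine frame (itself an allocation) keeps
// `leaders` alive until parallel_for_each completes, so capturing &leaders is safe.
seastar::future<> with_coroutine(leaders_t leaders) {
    co_await seastar::parallel_for_each(
      boost::irange<seastar::shard_id>(0, seastar::smp::count),
      [&leaders](seastar::shard_id shard) {
          return apply_on_shard(shard, leaders);
      });
}
```

Either way the vector is only read on the other shards, which is what makes the by-reference access safe; the difference is whether the allocation paying for its lifetime is the do_with state or the coroutine frame.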
The `update_leader_request` may only contain a few ntp leader updates. In this case, using a `fragmented_vector` with a large chunk size is a memory waste.

Signed-off-by: Michał Maślanka <michal@redpanda.com>
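A small self-contained illustration (a toy container, not Redpanda's actual `fragmented_vector`) of why a large fragment size is wasteful when a request carries only a few updates, assuming fragments are allocated a whole chunk at a time:

```cpp
#include <cstddef>
#include <cstdio>
#include <memory>
#include <vector>

// Toy fragmented vector: elements are stored in fixed-size fragments that are
// allocated whole, regardless of how many elements they actually hold.
template<typename T, size_t FragmentBytes>
class toy_fragmented_vector {
public:
    static constexpr size_t elems_per_fragment = FragmentBytes / sizeof(T);

    void push_back(T v) {
        if (_size % elems_per_fragment == 0) {
            // a whole fragment is allocated up front
            _fragments.push_back(std::make_unique<T[]>(elems_per_fragment));
        }
        _fragments.back()[_size % elems_per_fragment] = v;
        ++_size;
    }
    size_t size() const { return _size; }
    size_t allocated_bytes() const {
        return _fragments.size() * elems_per_fragment * sizeof(T);
    }

private:
    std::vector<std::unique_ptr<T[]>> _fragments;
    size_t _size{0};
};

int main() {
    struct leader_update { long ntp_id; long term; long leader; }; // placeholder

    toy_fragmented_vector<leader_update, 128 * 1024> big_chunks; // ~128 KiB fragments
    toy_fragmented_vector<leader_update, 1024> small_chunks;     // ~1 KiB fragments

    for (int i = 0; i < 3; ++i) { // a request with only a few updates
        big_chunks.push_back({i, 1, 0});
        small_chunks.push_back({i, 1, 0});
    }
    std::printf("3 updates, large fragments: %zu bytes allocated\n",
                big_chunks.allocated_bytes());
    std::printf("3 updates, small fragments: %zu bytes allocated\n",
                small_chunks.allocated_bytes());
}
```

With a handful of updates, the large-fragment variant still allocates a full chunk (roughly 128 KiB here) for a few dozen bytes of payload.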
Previously, each time a leadership update request was received by the `metadata_dissemination_handler`, it created a copy for each shard, on the shard handling the request. This is inefficient and may lead to OOM on the handler shard (especially on very large machines). Instead of creating a copy for each shard we can simply use a const reference, as the updates vector doesn't change and it is safe to access it from other cores.

Signed-off-by: Michał Maślanka <michal@redpanda.com>
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/53280#01917533-e197-4b3c-981b-c6da111be185
/backport v24.2.x
/backport v24.1.x
Previously, each time a leadership update request was received by the
metadata_dissemination_handler, it created a copy for each shard, on the
shard handling the request. This is inefficient and may lead to OOM on
the handler shard (especially on very large machines).
Instead of creating a copy for each shard we can simply use a const
reference, as the updates vector doesn't change and it is safe to access
it from other cores.
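A hedged sketch of the before/after described here, using illustrative names (`updates_t`, `update_shard`) rather than the handler's real API, and assuming Seastar's `smp::submit_to` and `parallel_for_each`:

```cpp
#include <seastar/core/future.hh>
#include <seastar/core/loop.hh>
#include <seastar/core/smp.hh>
#include <boost/range/irange.hpp>
#include <vector>

struct ntp_leader_update { long ntp_id; long term; long leader; }; // placeholder
using updates_t = std::vector<ntp_leader_update>;

// Hypothetical per-shard work: reads the updates on the current shard.
void update_shard(const updates_t& updates) { (void)updates.size(); }

// Before: the handling shard materializes one copy per shard, so peak memory on
// that single shard grows roughly with sizeof(updates) * ss::smp::count.
seastar::future<> disseminate_with_copies(const updates_t& updates) {
    return seastar::parallel_for_each(
      boost::irange<seastar::shard_id>(0, seastar::smp::count),
      [&updates](seastar::shard_id shard) {
          return seastar::smp::submit_to(
            shard, [copy = updates] { update_shard(copy); });
      });
}

// After: every shard reads the same vector through a const reference; no
// per-shard copies are made. This is safe because the vector is not mutated
// while the cross-shard tasks run and it outlives the returned future.
seastar::future<> disseminate_by_reference(const updates_t& updates) {
    return seastar::parallel_for_each(
      boost::irange<seastar::shard_id>(0, seastar::smp::count),
      [&updates](seastar::shard_id shard) {
          return seastar::smp::submit_to(
            shard, [&updates] { update_shard(updates); });
      });
}
```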
Backports Required
Release Notes
Improvements