Fork remote-cluster response handling #97922
Conversation
Today all responses to remote-cluster requests are deserialized and handled on the transport worker thread. Some of these responses can be sizeable, so with this commit we add the facility for callers to specify a different executor to handle this work.
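The mechanism the PR describes can be sketched roughly as follows. This is a hypothetical simplification, not the actual Elasticsearch transport API: the class, the method names, and the `byte[]` payload are invented for illustration.

```java
import java.util.concurrent.Executor;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch, not the real Elasticsearch API: a response handler
// that forks deserialization of the raw response onto a caller-supplied
// executor instead of doing it on the transport worker thread.
class ForkingResponseHandler<T> {
    private final Executor responseExecutor;   // where to deserialize and handle
    private final Function<byte[], T> reader;  // the (possibly expensive) deserializer
    private final Consumer<T> listener;        // completion callback

    ForkingResponseHandler(Executor responseExecutor, Function<byte[], T> reader, Consumer<T> listener) {
        this.responseExecutor = responseExecutor;
        this.reader = reader;
        this.listener = listener;
    }

    // Invoked on the transport worker thread; only the cheap hand-off happens
    // here, keeping the worker free for network I/O.
    void handleRawResponse(byte[] rawBytes) {
        responseExecutor.execute(() -> listener.accept(reader.apply(rawBytes)));
    }
}
```

Passing a direct executor (`Runnable::run`) recovers the old behaviour of handling the response in place on the calling thread, which is roughly what the `SAME` executor mentioned later in the discussion does.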
Pinging @elastic/es-distributed (Team:Distributed)
One thing to note here is that we are forking to pools that have bounded queues. Semantically that's ok, because we force-execute these tasks (see …). Edit: also, I'm upgrading this to …
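The force-execution behaviour mentioned above can be illustrated with a minimal sketch. This is not Elasticsearch's actual executor implementation; the class and the blocking `put` are simplifications invented for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a pool with a bounded queue where ordinary submission
// can be rejected under load, but a "force execute" path never drops the task.
class BoundedForcingExecutor {
    private final LinkedBlockingQueue<Runnable> queue;
    private final ThreadPoolExecutor pool;

    BoundedForcingExecutor(int threads, int queueSize) {
        queue = new LinkedBlockingQueue<>(queueSize);
        pool = new ThreadPoolExecutor(threads, threads, 0, TimeUnit.SECONDS, queue);
        pool.prestartAllCoreThreads(); // workers poll the queue even for directly-enqueued tasks
    }

    // Ordinary path: may throw RejectedExecutionException when the queue is full.
    void execute(Runnable task) {
        pool.execute(task);
    }

    // Forced path: never rejects. Here it blocks until there is queue space
    // (a simplification; the point is that the task is guaranteed to run).
    void forceExecute(Runnable task) {
        try {
            queue.put(task);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    void shutdown() {
        pool.shutdown();
    }
}
```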
Hi @DaveCTurner, I've created a changelog YAML for you.
Looks good. I have a doubt about one of the thread pools, though.
```diff
@@ -466,6 +466,7 @@ static void ccsRemoteReduce(
     ActionListener<SearchResponse> listener,
     BiConsumer<SearchRequest, ActionListener<SearchResponse>> localSearchConsumer
 ) {
+    final var remoteClientResponseExecutor = threadPool.executor(ThreadPool.Names.SEARCH_COORDINATION);
```
I wonder if we should use the SEARCH pool instead, since that is the pool other work here is spawned on. I think the coordination pool is mainly for the node-level can-match/field-caps actions. Might not matter a whole lot, but I'd rather induce any rejections due to this on the SEARCH thread pool than the SEARCH_COORDINATION pool.
I will revert this to SAME to make progress here, and raise it with the search folks to follow up. I think what we're doing here is more like coordination work, so enqueueing it behind some IO-heavy search activity might do bad things for performance, but there are definitely tradeoffs either way.
```diff
@@ -373,10 +373,6 @@ public Optional<EngineFactory> getEngineFactory(final IndexSettings indexSetting
 
     @SuppressWarnings("HiddenField")
     public List<ExecutorBuilder<?>> getExecutorBuilders(Settings settings) {
-        if (enabled == false) {
```
Not that it matters, but was this change necessary, or did you just find the "optimization" of not having the extra thread pool unnecessary?
We apparently do some CCR-related things even if CCR is disabled, or at least there are tests which now fail if the CCR threadpool's existence is contingent on whether it's enabled or not. Seemed simplest to just create the threadpool either way rather than pick all that apart.
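The simplification being described here is, in essence, the following. This is a hypothetical reduction of the real plugin code, with the class and method names invented for illustration.

```java
import java.util.List;

// Hypothetical sketch of the change discussed: the CCR thread pool is now
// registered unconditionally, rather than only when the feature is enabled,
// because some CCR-related code paths run even with the feature disabled.
class CcrPluginSketch {
    private final boolean enabled;

    CcrPluginSketch(boolean enabled) {
        this.enabled = enabled;
    }

    List<String> executorBuilderNames() {
        // Before: if (enabled == false) return List.of();
        // After: create the pool either way, so lookups never fail.
        return List.of("ccr");
    }
}
```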
LGTM.
LGTM2 nice :)
Thanks both :)
Today we deserialize the chunks received when creating a follower on the CCR pool (prior to elastic#97922 this was the transport thread) and then manually fork off to the GENERIC pool to write the chunk to disk. It's simpler just to do all this on the GENERIC pool to start with, so this commit does that.
Today all responses to remote-cluster requests are deserialized and handled on the transport worker thread. Some of these responses can be sizeable, so with this commit we add the facility for callers to specify a different executor to handle this work. It also adjusts several callers to use more appropriate threads, including:
- ccr
- search_coordination