-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move class caching from ComputeStepSyntaxElement to CompiledPipeline #11482
Conversation
leaf -> outputDataset(leaf, flatten(Collections.emptyList(), leaf)) | ||
).collect(Collectors.toList())); | ||
} | ||
} | ||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually just moved code from DatasetCompiler
here to keep consistency with the other filterDataset
, outputDataset
, ... methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this solves the problem, and does so more elegantly than the other PR.
I've added a couple of suggestions in-line, with links to branches implementing what I talk about.
It would be nice if we could also find a way to avoid synchronizing on COMPILER
, since this means only a single thread can be working on this at a time, but I don't understand exactly what is going on with how our single shared COMPILER
is being mutated.
logstash-core/src/main/java/org/logstash/config/ir/CompiledPipeline.java
Outdated
Show resolved
Hide resolved
@@ -119,6 +137,15 @@ public CompiledPipeline( | |||
* @return Compiled {@link Dataset} representation of the underlying {@link PipelineIR} topology | |||
*/ | |||
public Dataset buildExecution() { | |||
synchronized (this) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
synchronization could be avoided by using a final AtomicReference<CompiledExecution> warmedCompiledExecution
that is primed with the warm compiled execution: yaauie@017051f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I like that, good suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added.
logstash-core/src/main/java/org/logstash/config/ir/CompiledPipeline.java
Outdated
Show resolved
Hide resolved
logstash-core/src/main/java/org/logstash/config/ir/compiler/ComputeStepSyntaxElement.java
Outdated
Show resolved
Hide resolved
@yaauie Thanks for the review; all great suggestions. For the |
I think we can avoid a global lock compiling java classes with yaauie@a8a5da1 From what I can tell, the single |
I'm glad to follow up with lock removal in a later PR. This one should remove significant rework anyway. |
@yaauie Included your suggestions. Great stuff figuring the |
unrelated build error; restarting. |
jenkins test this |
I think we should be good-to-go here, but would you mind letting it sit over the weekend before merging so I can continue ruminating? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, @colinsurprenant 🎉
LGTM 👍
Thanks @yaauie - great review, as usual 👍 |
bfb60cc
to
f8e895f
Compare
Merged in master, backports for 7.6.0 and 7.5.2. |
This PR addressed the multiple workers slowdown but introduced in 7.5.2 a regression for the multiple pipelines use-case. This PR will be reverted and replaced by #11564 which should correctly solve both problems. |
The regression has been addressed by #11564 and will be available in 7.6 (and 7.5.3 if such a version is released). Note that the "new" fix is slightly less efficient than this one but still much much more efficient than 7.5.1 and does not slow down the multi-pipelines use-case. |
Fixes #11105
Relates to #11175
This is an alternate implementation of #11479 where the compiles class caching is kept per-pipeline into the
CompiledPipeline
class.See #11479 for the problem details.
This approach has the following benefits: