
Move class caching from ComputeStepSyntaxElement to CompiledPipeline #11482

Merged 1 commit into master on Jan 13, 2020

Conversation

colinsurprenant (Contributor):

Fixes #11105
Relates to #11175

This is an alternate implementation of #11479 where the compiled-class caching is kept per-pipeline in the CompiledPipeline class.

See #11479 for the problem details.

This approach has the following benefits (a minimal sketch of the per-pipeline cache follows the list):

  • Solves a potential leak of a global cache upon pipeline reloads.
  • Solves a potential class-name collision for identical plugin IDs across multiple pipelines.
  • Avoids dealing with cache invalidation altogether, because the cache is cleared every time a pipeline is created or reloaded.
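
Concretely, the per-pipeline cache can be sketched as follows. This is a hypothetical illustration, not the actual Logstash code: the class name, method names, and the string cache key are all made up here; the real change keeps the cache inside CompiledPipeline.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical per-pipeline class cache. Because the map is an instance field,
// it is discarded together with its pipeline on reload: no global state can
// leak, no invalidation is needed, and identical plugin IDs in different
// pipelines cannot collide on a shared cache key.
final class PerPipelineClassCache {
    private final Map<String, Class<?>> classCache = new ConcurrentHashMap<>();

    Class<?> getOrCompile(final String sourceCode, final Function<String, Class<?>> compile) {
        // computeIfAbsent compiles a given source at most once per pipeline instance.
        return classCache.computeIfAbsent(sourceCode, compile);
    }
}
```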

Review thread on a diff hunk:

    leaf -> outputDataset(leaf, flatten(Collections.emptyList(), leaf))
    ).collect(Collectors.toList()));
    }
    }
    /**
colinsurprenant (Contributor Author):

This is actually just code moved from DatasetCompiler, kept here for consistency with the other filterDataset, outputDataset, ... methods.

@yaauie (Member) left a comment:

I think this solves the problem, and does so more elegantly than the other PR.

I've added a couple of suggestions in-line, with links to branches implementing what I talk about.

It would be nice if we could also find a way to avoid synchronizing on COMPILER, since this means only a single thread can be working on this at a time, but I don't understand exactly what is going on with how our single shared COMPILER is being mutated.

Review thread on a diff hunk:

    @@ -119,6 +137,15 @@ public CompiledPipeline(
         * @return Compiled {@link Dataset} representation of the underlying {@link PipelineIR} topology
         */
        public Dataset buildExecution() {
            synchronized (this) {
@yaauie (Member):

Synchronization could be avoided by using a final AtomicReference<CompiledExecution> warmedCompiledExecution that is primed with the warm compiled execution: yaauie@017051f (a sketch of the idea follows).
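
A minimal, self-contained sketch of that suggestion, with illustrative names (ExecutionHolder and compileFresh are invented here; see yaauie@017051f for the real branch):

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Illustrative holder primed with a warm compiled execution at construction.
// getAndSet(null) hands the warm copy to exactly one caller, so
// buildExecution() needs no synchronized block.
final class ExecutionHolder<T> {
    private final AtomicReference<T> warmedCompiledExecution;
    private final Supplier<T> compileFresh;

    ExecutionHolder(final T warmedExecution, final Supplier<T> compileFresh) {
        this.warmedCompiledExecution = new AtomicReference<>(warmedExecution);
        this.compileFresh = compileFresh;
    }

    T buildExecution() {
        // The first caller consumes the warm execution; later callers compile anew.
        final T warmed = warmedCompiledExecution.getAndSet(null);
        return warmed != null ? warmed : compileFresh.get();
    }
}
```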

colinsurprenant (Contributor Author):

Yes, I like that, good suggestion.

colinsurprenant (Contributor Author):

Added.

@colinsurprenant (Contributor Author):

@yaauie Thanks for the review; all great suggestions.

For the COMPILER synchronization, I would prefer not to remove it just yet, as I am also not 100% sure whether it is still needed and for exactly what. Also, this is not central to the goal of this PR, and we can follow up on it as a potential extra optimization in a new issue. WDYT?

@yaauie (Member) commented on Jan 10, 2020:

I think we can avoid a global lock when compiling Java classes with yaauie@a8a5da1.

From what I can tell, the single SimpleCompiler COMPILER was hacked to be reused by setting its parent classloader to the current classloader (which is the result of the most recent cook), nearly ensuring that the current classloader would always be able to load the class it had just cooked, even if another thread had cooked another input in the meantime. The only reason this hack could have worked in practice is that we already happened to be synchronized on COMPILER for other reasons. A sketch of the per-compilation alternative follows.
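
Hedged sketch of that direction, assuming Janino's SimpleCompiler API (the ClassCompiler helper is hypothetical and not the actual change in yaauie@a8a5da1):

```java
import org.codehaus.janino.SimpleCompiler;

// Hypothetical sketch: one throwaway SimpleCompiler per compilation. Nothing
// shared is mutated, so no global lock is needed; each compiler's classloader
// can load exactly the class it just cooked.
final class ClassCompiler {
    static Class<?> compileAndLoad(final String className, final String sourceCode) throws Exception {
        final SimpleCompiler compiler = new SimpleCompiler();
        // Chain to a stable parent instead of re-parenting a shared compiler
        // onto the result of the most recent cook.
        compiler.setParentClassLoader(ClassCompiler.class.getClassLoader());
        compiler.cook(sourceCode);
        return compiler.getClassLoader().loadClass(className);
    }
}
```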

@yaauie (Member) commented on Jan 10, 2020:

I'm glad to follow up with the lock removal in a later PR; that avoids significant rework of this one anyway.

@colinsurprenant (Contributor Author):

@yaauie Included your suggestions. Great work figuring out the COMPILER lock 💪

@colinsurprenant (Contributor Author):

Unrelated build error; restarting.

@colinsurprenant (Contributor Author):

jenkins test this

@yaauie (Member) commented on Jan 11, 2020:

I think we should be good-to-go here, but would you mind letting it sit over the weekend before merging so I can continue ruminating?

@yaauie (Member) left a comment:

Great work, @colinsurprenant 🎉

LGTM 👍

@colinsurprenant (Contributor Author):

Thanks @yaauie - great review, as usual 👍
:shipit:

@colinsurprenant (Contributor Author):

Merged into master; backported for 7.6.0 and 7.5.2.

@colinsurprenant (Contributor Author):

This PR addressed the multiple-workers slowdown but introduced a regression in 7.5.2 for the multiple-pipelines use case. It will be reverted and replaced by #11564, which should correctly solve both problems.

colinsurprenant added commits referencing this pull request on Feb 4, 2020.
@colinsurprenant (Contributor Author):

The regression has been addressed by #11564 and will be available in 7.6 (and 7.5.3, if such a version is released). Note that the "new" fix is slightly less efficient than this one, but it is still much more efficient than 7.5.1 and does not slow down the multi-pipelines use case.

Closes: Start time much longer after upgrade from 6.x to 7.x