revert #11482 and fix redundant code generation #11564
Conversation
logstash-core/src/main/java/org/logstash/config/ir/compiler/ComputeStepSyntaxElement.java
I think this is the right approach for now, to get a fix for both scenarios out the door in a timely manner, but agree with your other comments that we should take a step back and reevaluate code generation in general in a separate effort.
I've left a note about maybe getting instantiation out of the global lock, and another requesting we keep or enhance commentary about the need for synchronization.
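On the point about getting instantiation out of the global lock, here is one possible shape of that idea as a minimal, self-contained sketch (hypothetical names throughout; not code from this PR): the lock is held only for the cache lookup and compilation, and the generated class is instantiated after the lock is released.

```java
import java.util.HashMap;
import java.util.Map;

interface Dataset {}  // stand-in for org.logstash.config.ir.compiler.Dataset

// Hypothetical sketch, not this PR's code: the global lock guards only
// compile-and-cache; the generated class is instantiated after the lock
// is released, so a slow constructor cannot serialize other pipelines.
final class CompileThenInstantiate {
    private static final Object COMPILE_LOCK = new Object();
    private static final Map<String, Class<? extends Dataset>> CACHE = new HashMap<>();

    Dataset instantiate(final String normalizedSource) throws ReflectiveOperationException {
        final Class<? extends Dataset> clazz;
        synchronized (COMPILE_LOCK) {  // held only for lookup/compile/cache
            clazz = CACHE.computeIfAbsent(normalizedSource, this::compile);
        }
        // instantiation happens outside the global lock
        return clazz.getDeclaredConstructor().newInstance();
    }

    private Class<? extends Dataset> compile(final String source) {
        throw new UnsupportedOperationException("code generation elided in this sketch");
    }
}
```

The trade-off is the usual one: the narrower the critical section, the less contention across pipelines starting up concurrently.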
Here's a test implementation using a read/write lock with lock downgrading:

```java
private Class<? extends Dataset> compile() {
    try {
        COMPILE_LOCK.readLock().lock();
        Class<? extends Dataset> clazz = CLASS_CACHE.get(this);
        if (clazz == null) {
            // must release read lock before acquiring write lock
            COMPILE_LOCK.readLock().unlock();
            COMPILE_LOCK.writeLock().lock();
            try {
                // recheck state because another thread might have
                // acquired the write lock and changed state before we did
                clazz = CLASS_CACHE.get(this);
                if (clazz == null) {
                    final String name = String.format("CompiledDataset%d", CLASS_CACHE.size());
                    final String code = CLASS_NAME_PLACEHOLDER_REGEX.matcher(normalizedSource).replaceAll(name);
                    if (SOURCE_DIR != null) {
                        final Path sourceFile = SOURCE_DIR.resolve(String.format("%s.java", name));
                        Files.write(sourceFile, code.getBytes(StandardCharsets.UTF_8));
                        COMPILER.cookFile(sourceFile.toFile());
                    } else {
                        COMPILER.cook(code);
                    }
                    COMPILER.setParentClassLoader(COMPILER.getClassLoader());
                    clazz = (Class<T>) COMPILER.getClassLoader().loadClass(
                        String.format("org.logstash.generated.%s", name)
                    );
                    CLASS_CACHE.put(this, clazz);
                }
                // downgrade by acquiring read lock before releasing write lock
                COMPILE_LOCK.readLock().lock();
            } finally {
                // unlock write, still hold read
                COMPILE_LOCK.writeLock().unlock();
            }
        }
        COMPILE_LOCK.readLock().unlock();
        return clazz;
    } catch (final CompileException | ClassNotFoundException | IOException ex) {
        throw new IllegalStateException(ex);
    }
}
```

So far it seems to be working correctly, but I have not seen any significant performance difference in my local tests. WDYT? I will try a variation where the actual compilation happens outside the write lock, which will help avoid holding the write lock for too long, at the potential cost of performing multiple compilations.
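For reference, one shape that variation could take (a sketch only, assuming a hypothetical `doCompile()` helper; this is not the code that was ultimately tried): compile with no lock held, then take the write lock only to publish, tolerating duplicate compilations.

```java
// Hypothetical sketch of the variation: compile without holding the write
// lock, then take it only to publish. Two threads may compile the same
// source concurrently; the first one to publish wins.
private Class<? extends Dataset> compileOutsideWriteLock() {
    COMPILE_LOCK.readLock().lock();
    try {
        final Class<? extends Dataset> cached = CLASS_CACHE.get(this);
        if (cached != null) {
            return cached;
        }
    } finally {
        COMPILE_LOCK.readLock().unlock();
    }
    // expensive compilation runs with no lock held
    final Class<? extends Dataset> compiled = doCompile();  // assumed helper
    COMPILE_LOCK.writeLock().lock();
    try {
        final Class<? extends Dataset> winner = CLASS_CACHE.get(this);
        if (winner != null) {
            return winner;  // another thread published first; discard ours
        }
        CLASS_CACHE.put(this, compiled);
        return compiled;
    } finally {
        COMPILE_LOCK.writeLock().unlock();
    }
}
```

One subtlety: since the generated class name in the snippet above derives from `CLASS_CACHE.size()`, this variation would also need a collision-safe naming scheme (e.g., an atomic counter) once compilation can race.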
@yaauie I think we should move forward with the current simpler, straightforward synchronization (pending LGTM) for the sake of shipping a correct solution that solves the regression and unblocks 7.6. Further optimization work can happen in another PR.
+1 to merging as-is and chasing down locking optimization in a separate effort.
Merging into master, with backports to 7.7, 7.6, and 7.5.3.
Fixes #11560
The fix for the Java execution slowdown relative to the number of workers in #11482 introduced a regression when using multiple pipelines. Analyzing the regression, we realized that although that fix solved the problem for a single pipeline with multiple workers, it did not correctly address the root cause: by removing the global class cache, it prevented cross-pipeline class caching, which is what caused the regression.
This PR reverts the previous fix and correctly addresses the root cause. The real problem was that the configuration Java code was being generated multiple times: the `hashCode()` and `equals()` methods were each calling the `normalizedSource()` method, regenerating the source on every invocation. The fix memoizes the `hashCode()` and `normalizedSource()` methods, and a few other optimizations have also been made. The result is that the multiple-worker slowdown remains solved and the regression is also fixed.
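For illustration, here is a minimal sketch of what this kind of memoization looks like (assumed shape; the actual fields and helpers in `ComputeStepSyntaxElement` may differ). Because the generated source is deterministic and `String` is immutable, a simple lazy initialization suffices:

```java
// Hypothetical sketch of memoizing normalizedSource() and hashCode():
// each value is computed once and reused, so equals()/hashCode() no
// longer regenerate the configuration source on every call.
final class ComputeStepSketch {
    private String normalizedSource;  // cached after first computation
    private int hash;
    private boolean hashed;

    String normalizedSource() {
        if (normalizedSource == null) {
            normalizedSource = generateSource();  // expensive; runs once
        }
        return normalizedSource;
    }

    @Override
    public int hashCode() {
        if (!hashed) {
            hash = normalizedSource().hashCode();
            hashed = true;
        }
        return hash;
    }

    @Override
    public boolean equals(final Object other) {
        // equality now compares the cached source instead of regenerating it
        return other instanceof ComputeStepSketch
            && normalizedSource().equals(((ComputeStepSketch) other).normalizedSource());
    }

    private String generateSource() {
        return "...";  // code generation elided in this sketch
    }
}
```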