-
Notifications
You must be signed in to change notification settings - Fork 323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for perf degradation in method calls on polyglot arrays #3781
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure I was drawing attention to this in the PR @JaroslavTulach introduced this in: the method should just take the UnresolvedSymbol, not a string. This here is fine, but it solves a problem that should not exist. Change the method in question to take UnresolvedSymbol
in parameter.
Yeah, I was, because I also still believe the code @JaroslavTulach introduced is incorrect. We didn't find a repro, but there's a bug lurking somewhere cause by that line, I'm sure (it is totally wrong to construct that symbol in the builtin scope) |
This change stems from the investigation into elusive performance degradation that became visible when implementing `Builder.to_vector` with `slice` to not copy the store in #3744. We were seeing about 40% drop for a simple `filter` method calls without any reason. IGV debugging was fun, but didn't reveal much. Profiling with VisualVM revealed that we were previously inlining most of nodes, while with the new version that was no longer the case. Still, it was hard to figure out why. Decided to use ``` -Dpolyglot.engine.TraceDeoptimizeFrame=true -Dpolyglot.engine.BackgroundCompilation=false -Dpolyglot.engine.TraceCompilation=true -Dpolyglot.engine.TraceInlining=true ``` which revealed a more than expected number of deoptimizations. With ``` -Dpolyglot.engine.TraceTransferToInterpreter=true ``` we were able to see the source of it: ``` [info] [2022-10-07T13:13:12.98Z] [engine] transferToInterpreter at Builder.to_vector<arg-2>(built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso:1091) <split-c6d56df> Builder.to_vector(built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso:1091) <split-6500912a> Vector.filter(built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso:347) <split-660ff9b4> ... ... org.graalvm.truffle/com.oracle.truffle.api.CompilerDirectives.transferToInterpreterAndInvalidate(CompilerDirectives.java:90) org.enso.interpreter.node.callable.resolver.MethodResolverNodeGen.execute(MethodResolverNodeGen.java:47) org.enso.interpreter.node.callable.resolver.HostMethodCallNode.getPolyglotCallType(HostMethodCallNode.java:146) org.enso.interpreter.node.callable.InvokeMethodNodeGen.execute(InvokeMethodNodeGen.java:86) org.enso.interpreter.node.callable.InvokeCallableNode.invokeDynamicSymbol(InvokeCallableNode.java:254) org.enso.interpreter.node.callable.InvokeCallableNodeGen.execute(InvokeCallableNodeGen.java:54) org.enso.interpreter.node.callable.ApplicationNode.executeGeneric(ApplicationNode.java:99) org.enso.interpreter.node.ClosureRootNode.execute(ClosureRootNode.java:90) ``` So the problem was in the new implementation of `to_vector` but not in `slice`, as we expected, but in `list.size`. Problem was that `MethodResolver` would always deoptimize for polyglot array method calls because newly created `UnresolvedSymbol` in [HostMethodCallNode](https://github.com/enso-org/enso/blob/ea60cd5fab48d9f6b16923cea21620b97216a898/engine/runtime/src/main/java/org/enso/interpreter/node/callable/resolver/HostMethodCallNode.java#L145) wouldn't `==` to the cached one in [MethodResolver](https://github.com/enso-org/enso/blob/ea60cd5fab48d9f6b16923cea21620b97216a898/engine/runtime/src/main/java/org/enso/interpreter/node/callable/resolver/MethodResolverNode.java#L38). Fixed by providing a custom `equals` implementation for `UnresolvedSymbol` and `s/==/equals` in `MethodResolverNode` cache check.
Rather than creating a new `UnresolvedSymbol` from the symbol's name and then overriding `equals` we can just apply the original `UnresolvedSymbol`. Also, minor change in Bench to avoid unnecessary transfer to interpreter as it showed up in the logs.
8fa2eed
to
1e760bc
Compare
@kustosz Thanks, that works as well! |
This change stems from the investigation into elusive performance degradation that became visible when implementing
Builder.to_vector
withslice
to not copy the store in#3744.
We were seeing about 40% drop for a simple
filter
method calls without any reason.IGV debugging was fun, but didn't reveal much. Profiling with VisualVM revealed that we were previously inlining most of nodes, while with the new version that was no longer the case. Still, it was hard to figure out why.
Decided to use
which revealed a more than expected number of deoptimizations. With
we were able to see the source of it:
So the problem was in the new implementation of
to_vector
but not inslice
, as we expected, but inlist.size
.Problem was that
MethodResolver
would always deoptimize for polyglot array method calls because newly createdUnresolvedSymbol
in HostMethodCallNode wouldn't==
to the cached one inMethodResolver.
Fixed by providing a custom
equals
implementation forUnresolvedSymbol
ands/==/equals
inMethodResolverNode
cache check.Relates to https://www.pivotaltracker.com/story/show/183488659