-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
Backport to 1.9: Improve performance of global code by emitting fewer atomic barriers. #49411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backport to 1.9: Improve performance of global code by emitting fewer atomic barriers. #49411
Conversation
As expected:
I haven't been able to catch this in rr yet. |
I can't reproduce this locally anymore, only with the build from CI, so I'm having CI produce me a debug build. The bug also seems very sensitive to how the stack is initialized (e.g., piping output or setting env vars hides the bug). EDIT: that doesn't work on CI either, because of #47938. |
Reproducer:
cc @vtjnash |
11b8dc1
to
85a869d
Compare
Why are we backporting this? It seems very featury. |
What feature? It fixes the regression in #47561. |
didn't realize the performance improvement was regression fixing. Never mind. |
Very cool repro–fast and reliable. The object looks like we never initialized it at the time where we segfault
The code here looks like:
with an initial ssavalue of:
where that value came from (as sret)
This looks emitted okay
but the post-optimization result doesn't have stores to initialize the object
|
It looks like refstore is incorrectly computed, so it doesn't block julia-loop-licm of this value, resulting in an uninitialized value in the julia GC frame (@pchintalapudi) |
85a869d
to
8d40445
Compare
OK, let's try reverting #43057 then. |
8d40445
to
31415a3
Compare
The 32-bit error that happens here is different:
It also seems to happen on master, so maybe the GC corruption that happened before is fixed? |
It seems to reliably fail in a ccall test, both here and on other places |
(and on 1.9) AFAIU this started happening after a kernel upgrade on the buildbots. |
So here's the full function, just before we run escape analysis on %46: https://gist.github.com/pchintalapudi/e73cff1ff78af737076fa9745de048e5 %49 is a load of an i64 from %11, which then gets stored to %46 as initialization. However, %11 is actually a bitcast of our gc frame to a i64*, so %49 is actually a |
It looked like we did a memcpy to initialize the object, since we knew it wasn't visible to the application. But the hoisting pass is moving the allocation without keeping the initialization. |
We could wipe objects when we hoist them so that initialization can no longer pose an issue there, but alloc-opt also relies on the ability to distinguish between a pointer field and a non-pointer field. If by the time alloc-opt sees the IR it's just a store of an i64, then alloc-opt could do the wrong thing and e.g. remove an allocation without rerooting the object. |
Yeah, codegen may be most significantly at fault there for not respecting the memory type, and treating |
31415a3
to
0be97bb
Compare
I backported #49584 here too, which IIUC should fix the 32bit segfault. |
CI looks good! @KristofferC I'll let you merge this in |
@KristofferC Should this target the |
Manual backport of #47636. We should be careful with this change, as in #47636 there used to be a mysterious failure with the i686 tester during LinearAlgebra/special (a segfault) that just disappeared at some point.