Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Add code to patch unresolved trampolines for x86-64 #11426

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

dmitry-ten
Copy link
Contributor

No description provided.

@dmitry-ten dmitry-ten changed the title Add code to patch unresolved trampolines for x86-64 WIP: Add code to patch unresolved trampolines for x86-64 Dec 9, 2020
@dmitry-ten
Copy link
Contributor Author

@mpirvu I'm having trouble putting cpIndex onto the stack in this change.
Looking at the caller code it seems that %rsi must contain the RA in call snippet and thus %rsi + 20 is the address of cpIndex:
https://github.com/eclipse/openj9/blob/e031840a90cc00fe7503efa4803e05a376c30b75/runtime/compiler/x/runtime/X86Unresolveds.nasm#L906-L935
However, I get a crash in my code when trying to dereference %rsi from interpreterStaticAndSpecialGlue because it contains an invalid address. Would you know why this is happening?

; Call to adjust trampoline reservations, since this callsite is now resolved
mov edx, dword [rsi+20]
push rdx ; descriptor+24: CP index
mov rdx, qword [rdi + J9TR_MethodFlagsOffset]
Copy link
Contributor

@mpirvu mpirvu Dec 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be mov rdx, dword [rsi+12] to get to the cpAddr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, but this also works, because %rdi is the J9Method addres.s and J9TR_MethodFlagsOFfset is the offset to constant pool from start of J9Method

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mov edx, dword [rsi+20] looks ok to me and I have no clue why the address is invalid.
Is it always invalid, or only in some particular cases? maybe you could trace it instruction by instruction using the debugger and see how registers change

@dmitry-ten
Copy link
Contributor Author

Ok, I see why this is not working. We call interpreterStaticAndSpecialGlue through interpreterUnresolvedSpecial(Static)Glue only the first time the method is called. Right before interpreterStaticAndSpecialGlue is called, the caller patches itself out and replaces its instruction with move of J9Method into %rdi. So %rsi contains snippet RA only on the first call, and has invalid values on subsequent ones.

@dmitry-ten
Copy link
Contributor Author

A way to fix this would be to perform trampoline patching from callers of interpreterStaticAndSpecialGlue

@dmitry-ten
Copy link
Contributor Author

I was wrong, we can't just patch trampoline from callers of interpreterStaticAndSpecialGlue because at that point J9Methodf is not resolved yet, so we can't patch the trampoline. We need to patch inside the method itself, after resolution check.
I'm not sure how to pass it the snippet RA though.

@mpirvu mpirvu linked an issue Dec 10, 2020 that may be closed by this pull request
@dmitry-ten
Copy link
Contributor Author

Here is a possible solution: store cpIndex and cpAddress in registers right before calling interpreterStaticAndSpecialGlue. This requires modifying interpreterUnresolvedSpecial(Static)Glue and places in TR::X86CallSnippet::emitSnippetBody() where interpreterStaticAndSpecialGlue is called. I have code for this version that seems to work, will push it shortly.

@mpirvu
Copy link
Contributor

mpirvu commented Dec 10, 2020

Shouldn't the trampoline reservation happen at the point where method is discovered as resolved?
I would put that new code into interpreterUnresolvedStaticGlue: and interpreterUnresolvedSpecialGlue: which are executed only until the method is discovered as unresolved. After that we patch that mov rdi, j9mtehod instruction and don't enter those "Unresolved" routines, but we also don't need to reserve another trampoline.

@dmitry-ten
Copy link
Contributor Author

The new code doesn't reserve trampolines, it replaces unresolved trampoline entry with resolved one at the point where the corresponding method becomes resolved. So we can't put the new code in unresolved routines, since in many cases there is no resolved trampoline entry yet.

@mpirvu
Copy link
Contributor

mpirvu commented Dec 11, 2020

So we can't put the new code in unresolved routines, since in many cases there is no resolved trampoline entry yet.

But the call to the helper in the new code can delete the unresolved trampoline and create a new resolved trampoline.
Your new code intended to call mcc_reservationAdjustment_unwrapper which calls adjustTrampolineReservation that does just that. I am just saying that this can be done at the point where the we discover the identity of the j9method which is in the "interpreterUnresolved*" routines after call jitResolveStaticMethod

@dmitry-ten
Copy link
Contributor Author

Alright, so implementing it according to @mpirvu's advice seems to solve the issue.
However, now I see a different problem: running java -Xjit:count=0,stressTrampolines,verbose -version causes a hang after some small number of compilations. Looking at CPU usage for the process, it still consumes 100% CPU, so it's doing something. So it's most likely some kind of infinite loop caused by doing method calls through trampolines.

Looking at stack traces, this is the only thread that's doing anything:

Thread 2 (Thread 0x7ffff7193700 (LWP 27475)):
#0  0x00007fffef55e33c in mergeResolveVPicClass () from /home/dmitry-ten/src/openj9-openjdk-jdk8/build/linux-x86_64-normal-server-release/images/j2sdk-image/jre/lib/amd64/compressedrefs/libj9jit29.so
#1  0x0000000000041300 in ?? ()
#2  0x00007ffff7190800 in ?? ()
#3  0x00000000390a0a00 in ?? ()
#4  0x0000000000017ec8 in ?? ()
#5  0x00007fffda6ed8a5 in ?? ()
#6  0x0000000000000001 in ?? ()
#7  0x00000000390a0a50 in ?? ()
#8  0x000000000004a808 in ?? ()
#9  0x0000000000000011 in ?? ()
#10 0x0000000000000000 in ?? ()

Copy link
Contributor

@mpirvu mpirvu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assembly code looks right to me

@dmitry-ten dmitry-ten force-pushed the trampoline-patch branch 2 times, most recently from b1e1c1d to cd95d69 Compare December 15, 2020 21:49
Previously, we didn't have JITServer specific code
for handling method trampoline reservations because we assumed
that all methods can be called without trampolines.
This is true for x86, where code repository is enabled by default,
but leads to crashes on Power.

There are 2 types of method trampolines: resolved and unresolved,
corresponding to the types of method calls they enable.
In baseline JIT compilation (no JITServer, no AOT) we reserve
trampolines during codegen and they are patched during method execution.
However, for AOT and JITServer reservation during codegen would not
work, since compiled code is executed by a different VM that doesn't
have trampolines reserved.
AOT uses 2 relocation types to handle this:

1. For unresolved trampolines, we create a `TR_Trampolines` relocation.
This relocation is also added for JITServer compilations so this case
is already handled.

2. For resolved trampolines, AOT only allows their use (and thus resolved
call dispatch) when SVM is enabled because `TR_ResolvedTrampolines`
relocation requires SVM. This is a problem for JITServer, since non-AOT
compilations do not use SVM and we don't want to use unresolved dispatch
for every call.

This commit keeps track of methods for which resolved trampoline is
needed during compilation on the server and sends the list to the client
at the end of the compilation. Once the client receives compiled method
and allocates code cache, it will reserve trampolines.

Signed-off-by: Dmitry Ten <Dmitry.Ten@ibm.com>
Signed-off-by: Dmitry Ten <Dmitry.Ten@ibm.com>
@genie-openj9
Copy link

Can one of the admins verify this patch?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Segmentation fault with TR_StressTrampolines enabled
3 participants