-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Move JIT_InitPInvokeFrame implementation to assembler on x86 and ARM32 #113791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it actually help perf to have this method in assembly on Linux arm32? I would expect that the C/C++ version is going to have better perf thanks to better optimized TLS access that will outweigh the few extra instructions in JITed code to deal with the standard calling convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dtto for Linux x86
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They use custom calling convention, unfortunately, and I didn't want to modify JIT to account for that. On x86 this generated equivalent code to the old stub. On arm32 I moved some register saves to prolog/epilog because it was easier and more efficient.
Moreover, the C version currently behaves differently in regards to who pushes the frame so that would need to be unified too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(We can optimize the TLS access; the
JIT_PInvokeBegin
helper used on R2R would benefit from that too; and it's very similar code also written in assembly.)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess is that it is a left-over from fragile NGen where we were happy to do a lot of complicated things to save a few bytes of binary size here and there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the asm implementation is ok for win-x86 since win-x86 is weird in many different ways, but I would switch to regular C++ for everything else.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be a much larger change to do that. I'm not necessarily opposed to that in principle, but I went with a change that is less disruptive for a few reasons. Namely, there's already assembler code for this in the R2R PInvoke helpers, so this puts similar code in a similar place which makes any changes to it easier to keep in sync. It removes some of the odd cases for x86/ARM32 even if not all. It produces slightly more optimal code on ARM32 at no expense to readability. It removes unnecessary startup cost (albeit a tiny one) and one place with dynamic code generation.
If we were to revisit this with JIT changes then I would prefer to address more things. Currently we have disparity with usage of
USE_PER_FRAME_PINVOKE_INIT
on various platforms. There's also the fact thatJIT_PInvokeBegin
andJIT_InitPInvokeFrame
helpers do almost the same thing, except in the later case the JIT has to know more of the internal layout ofInlinedCallFrame
. I tried to switch the JIT to use the R2R P/Invoke helpers for all P/Invokes but that resulted in triggering hidden JIT bug (fix submitted in PR) and it painfully made me aware of the fact thatJIT_PInvokeBegin
/JIT_PInvokeEnd
is not aware ofSuppressGCTransition
. Compared to all these open questions I consider the custom calling convention just a small problem.(Lastly, the reason I run into this in the first place is that in my x86/new-EH experiment I need to errect a SEH frame on P/Invoke transitions. That's difficult to address with the current state of the code. At very least this removes the need to write a custom code generator for that.)
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second look it seems that going with
JIT_PInvokeBegin
/JIT_PInvokeEnd
for all P/Invokes may be a viable alternative for a long term solution (unless it causes some unforeseen regressions; possibly restricted to 32-bit platforms if there are reasons to keep the frame optimizations in 64-bit JIT). There's still the issue with the behavior not being the same withSuppressGCTransition
, but it seems that the GC starvation issues I have seen are unrelated (caused by the recentJIT_PollGC
rewrite, issue #113807).That said, I would still like to get this in as an improvement over status quo that is unlikely to cause any regressions.