Skip to content

[NativeAOT] Port x86 ResumeSP logic from CoreCLR #99866

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 19, 2024

Conversation

filipnavara
Copy link
Member

@filipnavara filipnavara commented Mar 16, 2024

On x86 with funclets that's a scenario where exception can happen with argument registers pushed on the stack. One such case is in the Invariant.Tests's method Invariant_Tests!Invariant_Tests_System_Globalization_Tests_InvariantModeTests__PredefinedCulturesOnly:

006a4cea e8a1910600     call    Invariant_Tests!System.Type__GetProperty_0 (70de90)
006a4cef 6a00           push    0
006a4cf1 6a00           push    0
006a4cf3 6a00           push    0
006a4cf5 6a00           push    0
006a4cf7 8bc8           mov     ecx, eax
006a4cf9 33d2           xor     edx, edx
006a4cfb 8b00           mov     eax, dword ptr [eax] <---
006a4cfd ff506c         call    dword ptr [eax+6Ch]

The exception is followed by calling a catch funclet and then jumping to the epilog of the main method. This currently fails on NativeAOT since there's no code that restores the correct SP for the epilog that accounts for the pushed arguments. On CoreCLR this is handled by the EECodeManager::GetResumeSp method and fixing up the context after calling the catch funclet and before proceeding to jump to the epilog.

Following the CoreCLR method exactly is possible but not necessarily easy and efficient due to separation between managed and native code. In this PR I opted to save the ResumeSP in StackFrameIterator into separate field in REGDISPLAY structure. RhCallCatchFunclet reads the ResumeSP field instead of SP now.

Additionally, this restores locAllocSPvar in the NativeAOT ABI to save the SP offset in methods that use localloc. This is necessary to calculate the correct SP value after the catch block.

Copy link
Contributor

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Mar 16, 2024
@jkotas
Copy link
Member

jkotas commented Mar 16, 2024

how is this handled on x64 and why are we doing something different?

This problem is specific to callee-pop calling convention. x86 is the only arch that uses callee-pop calling convention.

@filipnavara
Copy link
Member Author

filipnavara commented Mar 16, 2024

This problem is specific to callee-pop calling convention. x86 is the only arch that uses callee-pop calling convention.

I'm not sure how specifically it related to the callee-pop calling convention. We already handle that in unwinding with PCTAddr and SP being two different things in the REGDISPLAY context.

(Also, so far I was not able to reproduce the error on locally built binaries because they don't contain the "push 0" opcodes as in the original example, which I downloaded from CI build in Helix. I'll have to clean everything and try to build it from scratch on the same branch as the CI, something is odd there. I may have triggered some incorrect codegen by enabling FEATURE_EH_CALLFINALLY_THUNKS.)

@jkotas
Copy link
Member

jkotas commented Mar 16, 2024

The SP is constant throughout the main method body with caller pop (except for localloc). You never need to
restore the correct SP for the epilog that accounts for the pushed arguments with caller pop.

@filipnavara
Copy link
Member Author

The SP is constant throughout the main method body with caller pop (except for localloc).

I do see the different approach in CodeGen::genPutArgStk but I always associated that more with the relative efficiency of push/pop on x86 vs [esp+X] access in the instruction set rather than the calling convention (and the calling convention being a consequence of that). Anyway, I compared the disassembly of x86 and x64 for some function call cases and it makes more sense now. Thanks.

@filipnavara filipnavara marked this pull request as ready for review March 18, 2024 10:58
@filipnavara filipnavara changed the title [NativeAOT] RFC: Port most of x86 ResumeSP logic from CoreCLR [NativeAOT] Port x86 ResumeSP logic from CoreCLR Mar 18, 2024
Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @dotnet/jit-contrib for the JIT change

@jkotas jkotas merged commit be76968 into dotnet:main Mar 19, 2024
@filipnavara filipnavara deleted the win-x86-resume-sp branch March 19, 2024 14:03
@github-actions github-actions bot locked and limited conversation to collaborators Apr 19, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-x86 area-NativeAOT-coreclr community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants