-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Adjust stack level while calculating offset for Vector.getElement #38311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@dotnet/jit-contrib |
bool isEBPbased; | ||
unsigned offs = compiler->lvaFrameAddress(simdInitTempVarNum, &isEBPbased); | ||
|
||
// Adjust the offset with stack level, if any. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could use a comment. I generally find it confusing to reason about the stack level, but I believe it's the case that 1) it will always be zero if there's a frame pointer (I was wondering why you weren't checking isEBPbased
), and 2) it will always be zero when FEATURE_FIXED_OUT_ARGS
is true. Can you confirm or correct?
Also, I'm wondering why this isn't needed further down when a temp is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally, all local access is passed down to the emitter by storing the local number and offset within the local, and emitOutputSV
does the appropriate offset translation using:
#if !FEATURE_FIXED_OUT_ARGS
// Adjust the offset by the amount currently pushed on the CPU stack
dsp += emitCurStackLvl;
#endif
It looks like the two cases that don't do this are the calls to lvaFrameAddress
in this function, genSIMDIntrinsicGetItem
.
So it seems like this is ok, but you need the #if !FEATURE_FIXED_OUT_ARGS
case (for Linux x86 ABI) or prove Carol's #2. It would probably be better to only do the adjustment for the !isEBPbased
case, for clarity.
Also, I think the op1 case above that calls lvaFrameAddress
also needs this logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Carol and Bruce for your valuable input. I thought about adding it under isEBPbased
but I guess I was assuming that for EBP it will always be zero. I have instead added an assert to check if stacklevel is zero for isEBPbased
and FEATURE_FIXED_OUT_ARGS
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, so just verified that for isEBPbased == true
, genStackLevel
can be non-zero. I was hitting previously added assert when crossgening System.Numerics.Plane:CreateFromVertices
. I have now updated the assert(genStackLevel == 0)
to only include when FEATURE_FIXED_OUT_ARGS
is true
. Below is the code that was getting generated when I hit the assert:
83EC0C sub esp, 12
F20F110424 movsd qword ptr [esp], xmm0
660F70C802 pshufd xmm1, xmm0, 2
F30F114C2408 movss dword ptr [esp+08H], xmm1
F30F108568FFFFFF movss xmm0, dword ptr [ebp-98H] ; <-- This line
In above highlighted line, offset returned by lvaFrameAddress
was 0x98 and genStackLevel was 0n12 inside genSimdIntrinsicGetItem
. Since it was isEBPBased
, the stack level didn't get added to the offset 0x98 and everything now works fine. So what I can conclude is for EBP based, the varDsc->lvStkOff
holds accurate value but that doesn't happen when ESP based?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what I can conclude is for EBP based, the varDsc->lvStkOff holds accurate value but that doesn't happen when ESP based?
Yes, that's right - because genStackLevel
is relative to SP. EBP remains fixed, so the offset relative to that doesn't change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to add a test case or two to the coreclr tests, to cover this.
I spent some time thinking about how to add a test for this. From my understanding, this only happens if |
I was able to come up with standalone case. Earlier I couldn't repro it because I was not using particular stress modes needed to repro it. |
Include stack level while calculating the offset from stack pointer when emitting
vmovss
for getting element of SIMD intrinsic.This started showing up in stress run because
STRESS_DBL_ALN
was set for the method which was forcing the stack to be stack pointer based rather than frame pointer.Fixes: #36614