-
Notifications
You must be signed in to change notification settings - Fork 2.6k
GC info fix: correctly adjust argCnt #17363
Conversation
When there are nested calls, and there is a non-ptr on the stack below the last ptr popped by the inner call, the `argHigh` and `argCnt` values can get out of sync.
|
test Windows_NT x86 Checked gcstress0xc_minopts_heapverify1 |
|
|
||
| if (hasPartialArgInfo) | ||
| { | ||
| // We always leave argHigh pointing to the next ptr arg. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this how it always was, or was this changed with RyuJIT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not very familiar with this code, and had to do a fair bit of spelunking to figure out how this is supposed to work, but looking at the code above, it assumes that argHigh is always pointing to the next ptr arg. The code has been like this since at least 2013, and I'm guessing that this particular case (partially interruptible ebp frame, nested calls, and non-ptr arg(s) below the last popped arg of the inner call) is rare enough that this hasn't been hit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, my plan was to do some gcstress testing on this, and then get reviews from people who might be more familiar with this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am familiar with the x86 GC encodings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@briansull - that's great! Could you have a look?
As I suspected, for this same test on desktop, jit32 un-nests the calls, so that it doesn't run into this issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, and in addition to the fix, it would be great if you could look at the comments I added and verify that they are correct. Figuring this out would have been a lot easier with more comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of the code dealing with partialArgInfo was recently introduced on the CoreCLR side of the world:
19b1ce56f95 (Jonghyun Park 2017-04-11 07:49:08 +0900 2435) bool hasPartialArgInfo;
42924c80d75 (Jonghyun Park 2017-04-05 09:30:08 +0900 2436)
42924c80d75 (Jonghyun Park 2017-04-05 09:30:08 +0900 2437) #ifndef UNIX_X86_ABI
19b1ce56f95 (Jonghyun Park 2017-04-11 07:49:08 +0900 2438) hasPartialArgInfo = info->ebpFrame;
42924c80d75 (Jonghyun Park 2017-04-05 09:30:08 +0900 2439) #else
42924c80d75 (Jonghyun Park 2017-04-05 09:30:08 +0900 2440) // For x86/Linux, interruptible code always has full arg info
42924c80d75 (Jonghyun Park 2017-04-05 09:30:08 +0900 2441) //
42924c80d75 (Jonghyun Park 2017-04-05 09:30:08 +0900 2442) // This should be aligned with emitFullArgInfo setting at
42924c80d75 (Jonghyun Park 2017-04-05 09:30:08 +0900 2443) // emitter::emitEndCodeGen (in JIT)
19b1ce56f95 (Jonghyun Park 2017-04-11 07:49:08 +0900 2444) hasPartialArgInfo = false;
42924c80d75 (Jonghyun Park 2017-04-05 09:30:08 +0900 2445) #endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but the logic was the same previously, it was just using info->ebpFrame directly instead of hasPartialArgInfo
|
Is there a bug number this fixes? Existing or new repro case? |
|
ok, i'm guessing https://github.com/dotnet/coreclr/issues/17242 |
|
@BruceForstall yes, #17242 - sorry I forgot to mention it |
|
test Windows_NT x86 Checked gcstress0xc |
1 similar comment
|
test Windows_NT x86 Checked gcstress0xc |
|
test Windows_NT x86 Checked gcstress0xc |
|
The tests failures are basically the same as without this fix, except that the |
I believe it is correct. Here is how I read/understand the encoding for an ebp frame:
So, this is how I understand the GC info: If this is not how it works, then perhaps someone could explain how it's supposed to work. |
| ptrArgTP ptrArgs(0); // The mask of stack values that contain pointers. | ||
| ptrArgTP iptrArgs(0); // The subset of ptrArgs that are interior pointers. | ||
| ptrArgTP argHigh(0); // The current mask position that corresponds to the top of the stack. | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments look correct to me
|
Yes, I did figure that the GC info was correct after I posted that. |
briansull
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
A little more info:
In PR #10782, for x86, checks for was previously Which reversed the sense (why was checking a
My preference would be the latter, as I think it provides a small amount of additional sanity checking. However, tracking |
|
Also see this definition: Note that this got mangled by the upgrade to auto-format our code. |
|
That assert on three conditions needs an explaination as to what it is checking for. |
|
On the Desktop the assert says:
|
|
I believe that the old assert was wrong once nested ebp call sites were allowed. (i.e RyuJIT only) So I believe that going with your latter choice is appropriate.
|
|
The various x86 gcstress failures all look like pre-existing issues (you can x-ref vs list in #17330). |
Right, I mention that above, but I'd like to see the arm tests pass, since it also uses the method I changed. Not sure why they're not completing |
|
@janvorli - could you have a look at this? |
This assert was formerly enabled only for the "full arg info case", i.e. when there was not a frame on x86. WIth PR dotnet#10782 the sense of the first condition was inadvertently changed. With PR dotnet#17363 the tracking of `argCnt` vs. `argHigh` was fixed so that the assert is correct for the "partial arg info" case. The first condition should be removed to make it cover the "full argo info case" as well (as before dotnet#10782).
When there are nested calls, and there is a non-ptr on the stack below the last ptr popped by the inner call, the
argHighandargCntvalues can get out of sync.