-
Notifications
You must be signed in to change notification settings - Fork 5k
JIT: enhance escape analysis to understand local struct fields #113977
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: enhance escape analysis to understand local struct fields #113977
Conversation
Continuation of dotnet#113141 Implement a very simplistic "field sensitive" analysis for gc structs where we model each struct as simply its gc field(s). That is, given a gc struct `G` with GC field `f`, we model ``` G.f = ... ``` as an assignment to `G` and ``` = G.f ``` as a read from `G`. Since we know `G` itself cannot escape, "escaping" of `G` means `G.f` can escape. Note this conflates the fields of `G`: either they all escape or none of them do. We could make this more granular but it's not clear that doing so is worthwhile, and it requires more up-front work to determine how to track each field's status. If the struct or struct fields are only consumed locally, we may be able to prove the gc referents don't escape. This is a subset/elaboration of dotnet#112543 that does not try and reason interprocedurally. Contributes to dotnet#104936 / dotnet#108913 Fixes dotnet#113236 (once the right inlines happen)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request enhances the JIT escape analysis by introducing field‐sensitive handling of GC structs and expanding test coverage for various scenarios involving spans and struct escapes. Key changes include the addition of new GC struct definitions and related tests, multiple span capture/escape test methods, and adjustments to the allocation verification logic in the tests.
Files not reviewed (6)
- src/coreclr/jit/compiler.h: Language not supported
- src/coreclr/jit/jitconfigvalues.h: Language not supported
- src/coreclr/jit/layout.cpp: Language not supported
- src/coreclr/jit/layout.h: Language not supported
- src/coreclr/jit/objectalloc.cpp: Language not supported
- src/coreclr/jit/objectalloc.h: Language not supported
Comments suppressed due to low confidence (2)
src/tests/JIT/opt/ObjectStackAllocation/ObjectStackAllocationTests.cs:476
- The method 'SpanCaptureArray3' is defined but never invoked by any test. Consider adding a test call (via CallTestAndVerifyAllocation) or removing the method if it is not needed.
static int SpanCaptureArray3()
src/tests/JIT/opt/ObjectStackAllocation/ObjectStackAllocationTests.cs:512
- The 'SpanEscapeArrayReturn' method is defined but not used in any test invocation. Consider adding a corresponding test or removing the method if it is unnecessary.
static int SpanEscapeArrayReturn() => SpanEscapeArrayReturnHelper()[10];
g.a[0] = "Goodbye"; | ||
g.b[0] = "Hello"; | ||
|
||
ref string[] rs = ref g.b; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of a ref assignment between g.b and later g.a in a conditional block could benefit from a clarifying comment to explain the intended behavior.
Copilot uses AI. Check for mistakes.
@dotnet/jit-contrib PTAL This refreshes #113141 based on recent BV refactoring and other changes. I have not incorporated all the feedback from that PR yet, so feel free to re-raise. SPMI should do a decent job here demonstrating impact. I also haven't yet looked through all the regressions in detail. |
{ | ||
gcType = TYPE_GC_BYREF; | ||
} | ||
SetGCPtr(startSlot + slot, gcType); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as previously.. This feels awfully specific to escape analysis to add in the generic builder type.
Current diffs. Looks like there is a bug to sort out first. |
Still hitting some asserts on x86 locally (32 bit is a stress case because we try and track all pointer-sized types). |
Allowing
On 32 bit architectures this will add a connection graph edge Now if To fix this I have extracted the var retyping as a prepass, so that when we go to write the IR all vars have their "final" types. And we now will not retype any non-GC vars. We should look into the implications of not modelling |
Not too bad -- handled in #114130 @jakobbotsch I am tempted to defer futher refactoring until this is merged, so I can do that as zero diff on top. Thoughts? |
Sounds fine to me. There's also some more refactoring that I think would be worth it, like the change to use |
if (layoutName != nullptr) | ||
{ | ||
size_t len = strlen(prefix) + strlen(layoutName) + 1; | ||
newName = m_compiler->getAllocator(CMK_DebugOnly).allocate<char>(len); | ||
sprintf_s(newName, len, "%s%s", prefix, layoutShortName); | ||
} | ||
|
||
if (layoutShortName != nullptr) | ||
{ | ||
size_t len = strlen(prefix) + strlen(layoutName) + 1; | ||
newShortName = m_compiler->getAllocator(CMK_DebugOnly).allocate<char>(len); | ||
sprintf_s(newShortName, len, "%s%s", prefix, layoutShortName); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can use printfAlloc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, will do that in the follow-up.
// Do the normal compatibility check first, when possible to do so. | ||
// | ||
if ((IsCustomLayout() == layout->IsCustomLayout()) || (!HasGCPtr() && !layout->HasGCPtr())) | ||
{ | ||
const bool areCompatible = AreCompatible(this, layout); | ||
|
||
if (areCompatible) | ||
{ | ||
return true; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't the logic below subsume this check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to make this a superset of what AreCompatible
does but that ended up being a bit clunky since I need to guard the call to avoid asserts.
I can either just give up on that idea or else give AreCompatible
and optional early out to avoid asserting and just return false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the assert?
I would probably vote to just take the small amount of duplication. Seems simpler to reason about the operation that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
runtime/src/coreclr/jit/layout.cpp
Lines 712 to 713 in f79c098
assert(clsHnd1 != NO_CLASS_HANDLE); | |
assert(clsHnd2 != NO_CLASS_HANDLE); |
Without the guarding we can get here with a custom (byref) source and a class-based (ref) dest. The "early" out would be just before this, since in this situation an exact gc slot compare will fail; we need the more relaxed one that allows mixed types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that assert is just outdated, it should have been removed as part of #112064 since we can now have layouts without class handles but with GC pointers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case I can just remove the guarding and call it unconditionally.
if (lhs == tree) | ||
{ | ||
if (rhs->IsIntegralConst(0)) | ||
{ | ||
rhs->ChangeType(newType); | ||
} | ||
} | ||
else if (rhs == tree) | ||
{ | ||
if (lhs->IsIntegralConst(0)) | ||
{ | ||
lhs->ChangeType(newType); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about other objects, like frozen objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like it could be possible (at least EQ/NE).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So is a fix needed? It seems like object stack allocation can now start producing mixed TYP_REF
/TYP_BYREF
compares. Does it mean the rest of the JIT will have to be taught to handle these kind of mixes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question.
I think it is reasonable to assert that relops (and maybe some other kinds of binops) should have same-typed operands. But (as you noted earlier) it also seems wrong to forcibly retype unrelated computation.
We are running into cases where the current type system can't really express what we are able to represent. A reference to an on-stack object is still an object reference, but we must use TYP_I_IMPL to describe its referent, and an operand that might refer to a stack or heap object likewise must now be TYP_BYREF.
Introducing new types doesn't seem like the right answer either, though maybe something more like subtypes of TYP_REF is appropriate? Say TYP_REF_HEAP, TYP_REF_STACK, TYP_REF_ANYWHERE, etc... But given the widespread knowledge of the basic types any change here would be very messy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think that's part of the broader IR change where object stack allocation introduces implicit conversions between TYP_REF
and TYP_BYREF
that other things in the JIT may not be expecting. I still worry we will end up with a bug tail around these mismatches.
I wonder if the "special GC segment" idea would also give us a way to avoid these typing woes in the JIT. For example, I could imagine only stack allocating and retyping accesses for objects where we can see all uses, and otherwise use the special GC segment and keep accesses TYP_REF
typed. Still comes with some questions for write barriers, though.
Should we run jitstress/gcstress for this? |
Azure Pipelines successfully started running 3 pipeline(s). |
Failures all seem unrelated. But let's rerun since some of it was cleaned up. |
/azp run runtime-coreclr jitstress, runtime-coreclr gcstress0x3-gcstress0xc, runtime-coreclr libraries-pgo |
Azure Pipelines successfully started running 3 pipeline(s). |
|
@jakobbotsch this is the "simpler" version where we don't try and track any scalar locals that are address-taken. |
/azp run runtime-coreclr jitstress, runtime-coreclr gcstress0x3-gcstress0xc, runtime-coreclr libraries-pgo |
Azure Pipelines successfully started running 3 pipeline(s). |
jitstress failure is timeout |
@jakobbotsch this latest has the &scalar fixes in it. |
// Change the layout to one that may not be compatible. | ||
void ChangeLayout(ClassLayout* layout) | ||
{ | ||
assert(varTypeIsStruct(lvType)); | ||
m_layout = layout; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace GrowBlockLayout
with this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure ... neither is widely used, both likely get inlined and evaporate in release.
// layout1 - the first layout (copy destination) | ||
// layout2 - the second layout (copy source) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Undo this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will address this in the follow-up.
// Is this a store to a field of a local struct...? | ||
// | ||
if (parent->OperIs(GT_STOREIND)) | ||
{ | ||
// Are we storing to a local field? | ||
// | ||
if (addr->OperIs(GT_FIELD_ADDR)) | ||
{ | ||
// Simple check for which local. | ||
// | ||
GenTree* const base = addr->AsOp()->gtGetOp1(); | ||
|
||
if (base->OperIs(GT_LCL_ADDR)) | ||
{ | ||
unsigned const dstLclNum = base->AsLclVarCommon()->GetLclNum(); | ||
LclVarDsc* const dstDsc = comp->lvaGetDesc(dstLclNum); | ||
|
||
if (IsTrackedLocal(dstLclNum)) | ||
{ | ||
JITDUMP("... local.field store\n"); | ||
// Add an edge to the connection graph. | ||
AddConnGraphEdge(dstLclNum, lclNum); | ||
canLclVarEscapeViaParentStack = false; | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought you were going with the more conservative approach, but evidently not completely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conservative for non-struct locals, but not for struct locals.
If it turns out we have a field addr on a non-struct local, the lcl addr will have caused that local to escape.
Will fields of a stack-allocated ref object be considered as "fields of local structs" here as well? |
No, this is strictly for local struct fields. For fields of stack allocated objects we can do something similar, but we'll need to distinguish between the object and its fields and allocate separate connection graph nodes for the fields. And to do that we need to know how many allocation sites there are up front, which either entails an IL traversal prepass or else flagging the newobj/newarr temps during importation so we can count them cheaply when we get into the object allocation phase. Jakob and I have been kicking around the idea that we should run parts of local morph here to put local addresses into simpler forms so perhaps that gives us motivation to look at running a prepass... |
Continuation of #113141
Implement a very simplistic "field sensitive" analysis for gc structs where we model each struct as simply its gc field(s).
That is, given a gc struct
G
with GC fieldf
, we modelas an assignment to
G
andas a read from
G
. Since we knowG
itself cannot escape, "escaping" ofG
meansG.f
can escape.Note this conflates the fields of
G
: either they all escape or none of them do. We could make this more granular but it's not clear that doing so is worthwhile, and it requires more up-front work to determine how to track each field's status.If the struct or struct fields are only consumed locally, we may be able to prove the gc referents don't escape.
This is a subset/elaboration of #112543 that does not try and reason interprocedurally.
Contributes to #104936 / #108913
Fixes #113236 (once the right inlines happen)