Skip to content

JIT: enhance escape analysis to understand local struct fields #113977

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

AndyAyersMS
Copy link
Member

Continuation of #113141

Implement a very simplistic "field sensitive" analysis for gc structs where we model each struct as simply its gc field(s).

That is, given a gc struct G with GC field f, we model

G.f = ...

as an assignment to G and

   = G.f

as a read from G. Since we know G itself cannot escape, "escaping" of G means G.f can escape.

Note this conflates the fields of G: either they all escape or none of them do. We could make this more granular but it's not clear that doing so is worthwhile, and it requires more up-front work to determine how to track each field's status.

If the struct or struct fields are only consumed locally, we may be able to prove the gc referents don't escape.

This is a subset/elaboration of #112543 that does not try and reason interprocedurally.

Contributes to #104936 / #108913
Fixes #113236 (once the right inlines happen)

Continuation of dotnet#113141

Implement a very simplistic "field sensitive" analysis for gc structs where we model each struct as simply its gc field(s).

That is, given a gc struct `G` with GC field `f`, we model
```
G.f = ...
```
as an assignment to `G` and
```
   = G.f
```
as a read from `G`. Since we know `G` itself cannot escape, "escaping" of `G` means `G.f` can escape.

Note this conflates the fields of `G`: either they all escape or none of them do. We could make this more granular
but it's not clear that doing so is worthwhile, and it requires more up-front work to determine how to track each field's status.

If the struct or struct fields are only consumed locally, we may be able to prove the gc referents don't escape.

This is a subset/elaboration of dotnet#112543 that does not try and reason interprocedurally.

Contributes to dotnet#104936 / dotnet#108913
Fixes dotnet#113236 (once the right inlines happen)
@Copilot Copilot AI review requested due to automatic review settings March 27, 2025 17:08
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 27, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request enhances the JIT escape analysis by introducing field‐sensitive handling of GC structs and expanding test coverage for various scenarios involving spans and struct escapes. Key changes include the addition of new GC struct definitions and related tests, multiple span capture/escape test methods, and adjustments to the allocation verification logic in the tests.

Files not reviewed (6)
  • src/coreclr/jit/compiler.h: Language not supported
  • src/coreclr/jit/jitconfigvalues.h: Language not supported
  • src/coreclr/jit/layout.cpp: Language not supported
  • src/coreclr/jit/layout.h: Language not supported
  • src/coreclr/jit/objectalloc.cpp: Language not supported
  • src/coreclr/jit/objectalloc.h: Language not supported
Comments suppressed due to low confidence (2)

src/tests/JIT/opt/ObjectStackAllocation/ObjectStackAllocationTests.cs:476

  • The method 'SpanCaptureArray3' is defined but never invoked by any test. Consider adding a test call (via CallTestAndVerifyAllocation) or removing the method if it is not needed.
static int SpanCaptureArray3()

src/tests/JIT/opt/ObjectStackAllocation/ObjectStackAllocationTests.cs:512

  • The 'SpanEscapeArrayReturn' method is defined but not used in any test invocation. Consider adding a corresponding test or removing the method if it is unnecessary.
static int SpanEscapeArrayReturn() => SpanEscapeArrayReturnHelper()[10];

g.a[0] = "Goodbye";
g.b[0] = "Hello";

ref string[] rs = ref g.b;
Copy link
Preview

Copilot AI Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of a ref assignment between g.b and later g.a in a conditional block could benefit from a clarifying comment to explain the intended behavior.

Copilot uses AI. Check for mistakes.

@AndyAyersMS
Copy link
Member Author

@dotnet/jit-contrib PTAL

This refreshes #113141 based on recent BV refactoring and other changes. I have not incorporated all the feedback from that PR yet, so feel free to re-raise.

SPMI should do a decent job here demonstrating impact. I also haven't yet looked through all the regressions in detail.

{
gcType = TYPE_GC_BYREF;
}
SetGCPtr(startSlot + slot, gcType);
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as previously.. This feels awfully specific to escape analysis to add in the generic builder type.

@AndyAyersMS
Copy link
Member Author

Current diffs.

Looks like there is a bug to sort out first.

@AndyAyersMS
Copy link
Member Author

Still hitting some asserts on x86 locally (32 bit is a stress case because we try and track all pointer-sized types).

@AndyAyersMS
Copy link
Member Author

Allowing TYP_I_IMPL in the connection graph creates some potential confusion. EG if we have

[000458] DA--G------                         *  STORE_LCL_VAR int    V44 tmp43        
[000457] n---G------                         \--*  IND       int   
[000456] -----------                            \--*  FIELD_ADDR byref  System.ReadOnlySpan`1[int]:_length
[000455] -----------                               \--*  LCL_ADDR  byref  V43 tmp42        [+0]

On 32 bit architectures this will add a connection graph edge V44 <- V43 since V43 is a struct with a GC field and we coalesce fields.

Now if V43's GC field ends up being assigned a mixture of stack and heap allocations, we will retype the GC field in V43 as TYP_BYREF. And then given the connection above, we would deduce V44 is also TYP_BYREF and would try and rewrite V44 as TYP_BYREF, which is clearly nonsensical.

To fix this I have extracted the var retyping as a prepass, so that when we go to write the IR all vars have their "final" types. And we now will not retype any non-GC vars.

We should look into the implications of not modelling TYP_I_IMPL.

@AndyAyersMS
Copy link
Member Author

We should look into the implications of not modelling TYP_I_IMPL.

Not too bad -- handled in #114130

@jakobbotsch I am tempted to defer futher refactoring until this is merged, so I can do that as zero diff on top. Thoughts?

@jakobbotsch
Copy link
Member

@jakobbotsch I am tempted to defer futher refactoring until this is merged, so I can do that as zero diff on top. Thoughts?

Sounds fine to me. There's also some more refactoring that I think would be worth it, like the change to use ClassLayout more widely and the change to get rid of getTypeForBoxOnStack in favor of a custom layout.

Comment on lines +468 to +480
if (layoutName != nullptr)
{
size_t len = strlen(prefix) + strlen(layoutName) + 1;
newName = m_compiler->getAllocator(CMK_DebugOnly).allocate<char>(len);
sprintf_s(newName, len, "%s%s", prefix, layoutShortName);
}

if (layoutShortName != nullptr)
{
size_t len = strlen(prefix) + strlen(layoutName) + 1;
newShortName = m_compiler->getAllocator(CMK_DebugOnly).allocate<char>(len);
sprintf_s(newShortName, len, "%s%s", prefix, layoutShortName);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can use printfAlloc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will do that in the follow-up.

Comment on lines +818 to +828
// Do the normal compatibility check first, when possible to do so.
//
if ((IsCustomLayout() == layout->IsCustomLayout()) || (!HasGCPtr() && !layout->HasGCPtr()))
{
const bool areCompatible = AreCompatible(this, layout);

if (areCompatible)
{
return true;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the logic below subsume this check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to make this a superset of what AreCompatible does but that ended up being a bit clunky since I need to guard the call to avoid asserts.

I can either just give up on that idea or else give AreCompatible and optional early out to avoid asserting and just return false.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the assert?

I would probably vote to just take the small amount of duplication. Seems simpler to reason about the operation that way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert(clsHnd1 != NO_CLASS_HANDLE);
assert(clsHnd2 != NO_CLASS_HANDLE);

Without the guarding we can get here with a custom (byref) source and a class-based (ref) dest. The "early" out would be just before this, since in this situation an exact gc slot compare will fail; we need the more relaxed one that allows mixed types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that assert is just outdated, it should have been removed as part of #112064 since we can now have layouts without class handles but with GC pointers

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case I can just remove the guarding and call it unconditionally.

Comment on lines +1924 to +1937
if (lhs == tree)
{
if (rhs->IsIntegralConst(0))
{
rhs->ChangeType(newType);
}
}
else if (rhs == tree)
{
if (lhs->IsIntegralConst(0))
{
lhs->ChangeType(newType);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about other objects, like frozen objects?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like it could be possible (at least EQ/NE).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is a fix needed? It seems like object stack allocation can now start producing mixed TYP_REF/TYP_BYREF compares. Does it mean the rest of the JIT will have to be taught to handle these kind of mixes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question.

I think it is reasonable to assert that relops (and maybe some other kinds of binops) should have same-typed operands. But (as you noted earlier) it also seems wrong to forcibly retype unrelated computation.

We are running into cases where the current type system can't really express what we are able to represent. A reference to an on-stack object is still an object reference, but we must use TYP_I_IMPL to describe its referent, and an operand that might refer to a stack or heap object likewise must now be TYP_BYREF.

Introducing new types doesn't seem like the right answer either, though maybe something more like subtypes of TYP_REF is appropriate? Say TYP_REF_HEAP, TYP_REF_STACK, TYP_REF_ANYWHERE, etc... But given the widespread knowledge of the basic types any change here would be very messy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that's part of the broader IR change where object stack allocation introduces implicit conversions between TYP_REF and TYP_BYREF that other things in the JIT may not be expecting. I still worry we will end up with a bug tail around these mismatches.

I wonder if the "special GC segment" idea would also give us a way to avoid these typing woes in the JIT. For example, I could imagine only stack allocating and retyping accesses for objects where we can see all uses, and otherwise use the special GC segment and keep accesses TYP_REF typed. Still comes with some questions for write barriers, though.

@jakobbotsch
Copy link
Member

Should we run jitstress/gcstress for this?

Copy link

Azure Pipelines successfully started running 3 pipeline(s).

@AndyAyersMS
Copy link
Member Author

Failures all seem unrelated. But let's rerun since some of it was cleaned up.

@AndyAyersMS
Copy link
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr gcstress0x3-gcstress0xc, runtime-coreclr libraries-pgo

Copy link

Azure Pipelines successfully started running 3 pipeline(s).

@AndyAyersMS
Copy link
Member Author

@AndyAyersMS
Copy link
Member Author

@jakobbotsch this is the "simpler" version where we don't try and track any scalar locals that are address-taken.

@AndyAyersMS
Copy link
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr gcstress0x3-gcstress0xc, runtime-coreclr libraries-pgo

Copy link

Azure Pipelines successfully started running 3 pipeline(s).

@AndyAyersMS
Copy link
Member Author

jitstress failure is timeout
osx gcstress likely also will timeout.

@AndyAyersMS
Copy link
Member Author

@jakobbotsch this latest has the &scalar fixes in it.

@jakobbotsch jakobbotsch self-requested a review April 14, 2025 14:35
Comment on lines +1039 to +1044
// Change the layout to one that may not be compatible.
void ChangeLayout(ClassLayout* layout)
{
assert(varTypeIsStruct(lvType));
m_layout = layout;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace GrowBlockLayout with this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure ... neither is widely used, both likely get inlined and evaporate in release.

Comment on lines +711 to +712
// layout1 - the first layout (copy destination)
// layout2 - the second layout (copy source)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Undo this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will address this in the follow-up.

Comment on lines +1739 to +1764
// Is this a store to a field of a local struct...?
//
if (parent->OperIs(GT_STOREIND))
{
// Are we storing to a local field?
//
if (addr->OperIs(GT_FIELD_ADDR))
{
// Simple check for which local.
//
GenTree* const base = addr->AsOp()->gtGetOp1();

if (base->OperIs(GT_LCL_ADDR))
{
unsigned const dstLclNum = base->AsLclVarCommon()->GetLclNum();
LclVarDsc* const dstDsc = comp->lvaGetDesc(dstLclNum);

if (IsTrackedLocal(dstLclNum))
{
JITDUMP("... local.field store\n");
// Add an edge to the connection graph.
AddConnGraphEdge(dstLclNum, lclNum);
canLclVarEscapeViaParentStack = false;
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought you were going with the more conservative approach, but evidently not completely?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conservative for non-struct locals, but not for struct locals.

If it turns out we have a field addr on a non-struct local, the lcl addr will have caused that local to escape.

@AndyAyersMS AndyAyersMS merged commit 212b120 into dotnet:main Apr 15, 2025
150 of 158 checks passed
@hez2010
Copy link
Contributor

hez2010 commented Apr 16, 2025

Will fields of a stack-allocated ref object be considered as "fields of local structs" here as well?

@AndyAyersMS
Copy link
Member Author

Will fields of a stack-allocated ref object be considered as "fields of local structs" here as well?

No, this is strictly for local struct fields.

For fields of stack allocated objects we can do something similar, but we'll need to distinguish between the object and its fields and allocate separate connection graph nodes for the fields.

And to do that we need to know how many allocation sites there are up front, which either entails an IL traversal prepass or else flagging the newobj/newarr temps during importation so we can count them cheaply when we get into the object allocation phase.

Jakob and I have been kicking around the idea that we should run parts of local morph here to put local addresses into simpler forms so perhaps that gives us motivation to look at running a prepass...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Escape Analysis: Elide an array allocation in BitConverter.GetBytes pattern
4 participants