Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Optimize out write barriers for fields in ref-like structs #103503

Merged
merged 2 commits into from
Jun 27, 2024

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Jun 15, 2024

Closes #9512

Remove write barriers for fields of ref ByrefLikeStruct arguments since we know that such refs always point to stack, Example:

void Test(ref MyStruct s, object o1, object o2)
{
    s.A = o1;
    s.B = o2;
}

ref struct MyStruct
{ 
    public object A;
    public object B;
}

Codegen for Test(...) on Main:

; Method Prog:Test(byref,System.Object,System.Object):this (FullOpts)
G_M1451_IG01:
       push     rsi
       push     rbx
       mov      rbx, rdx
       mov      rsi, r9

G_M1451_IG02:
       mov      rcx, rbx
       mov      rdx, r8
       call     CORINFO_HELP_CHECKED_ASSIGN_REF
       lea      rcx, bword ptr [rbx+0x08]
       mov      rdx, rsi
       call     CORINFO_HELP_CHECKED_ASSIGN_REF
       nop      

G_M1451_IG03:
       pop      rbx
       pop      rsi
       ret      
; Total bytes of code: 35

This PR:

; Method Prog:Test(byref,System.Object,System.Object):this (FullOpts)
G_M1451_IG01:
G_M1451_IG02:
       mov      gword ptr [rdx], r8
       mov      gword ptr [rdx+0x08], r9

G_M1451_IG03:
       ret      
; Total bytes of code: 8

Similar case when ref struct is "this":

ref struct RefLike
{
    public object o;

    public void Test(object o)
    {
        this.o = o; // Main: checked-write-barrier, PR: no-barrier
    }
}

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 15, 2024
@EgorBo
Copy link
Member Author

EgorBo commented Jun 15, 2024

@MihuBot

@EgorBo
Copy link
Member Author

EgorBo commented Jun 15, 2024

@EgorBot

using BenchmarkDotNet.Attributes;
using System.Runtime.CompilerServices;

public class MyBench
{
    [Benchmark]
    public void Bench()
    {
        MyStruct s = default;
        Test(ref s, null, null);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    void Test(ref MyStruct s, object o1, object o2)
    {
        s.A = o1;
        s.B = o2;
    }

    ref struct MyStruct
    {
        public object A;
        public object B;
    }
}

@EgorBot
Copy link

EgorBot commented Jun 15, 2024

Benchmark results on Intel
BenchmarkDotNet v0.13.12, Ubuntu 22.04.4 LTS (Jammy Jellyfish)
Intel Xeon Platinum 8370C CPU 2.80GHz, 1 CPU, 16 logical and 8 physical cores
  Job-YFOKOK : .NET 9.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-EVRGON : .NET 9.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
Method Toolchain Mean Error Ratio
Bench Main 4.631 ns 0.0076 ns 1.00
Bench PR 1.720 ns 0.0002 ns 0.37

BDN_Artifacts.zip

@hez2010
Copy link
Contributor

hez2010 commented Jun 15, 2024

I think we can also do this for objects allocated on the stack due to escape analysis.

@EgorBo
Copy link
Member Author

EgorBo commented Jun 15, 2024

I think this can also be done for objects allocated on the stack due to escape analysis.

We already do that where it's legal afair. Although, in some cases we have to switch to slower "checked" write barriers when "object on stack" feature is enabled since TYP_REF no longer gives the "always on heap" guarantees.

@EgorBo EgorBo marked this pull request as ready for review June 15, 2024 09:11
@EgorBo
Copy link
Member Author

EgorBo commented Jun 15, 2024

PTAL @jakobbotsch @AndyAyersMS cc @dotnet/jit-contrib (CI failures are unrelated). Diffs

@EgorBo
Copy link
Member Author

EgorBo commented Jun 15, 2024

@EgorBot -arm64 -profiler

using BenchmarkDotNet.Attributes;
using System.Runtime.CompilerServices;

public class MyBench
{
    [Benchmark]
    public void Bench()
    {
        MyStruct s = default;
        Test(ref s, null, null);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    void Test(ref MyStruct s, object o1, object o2)
    {
        s.A = o1;
        s.B = o2;
    }

    ref struct MyStruct
    {
        public object A;
        public object B;
    }
}

@EgorBot
Copy link

EgorBot commented Jun 15, 2024

Benchmark results on Arm64
BenchmarkDotNet v0.13.12, Ubuntu 22.04.4 LTS (Jammy Jellyfish)
Unknown processor
  Job-IRHBKJ : .NET 9.0.0 (42.42.42.42424), Arm64 RyuJIT AdvSIMD
  Job-UINBZM : .NET 9.0.0 (42.42.42.42424), Arm64 RyuJIT AdvSIMD
Method Toolchain Mean Error Ratio
Bench Main 3.982 ns 0.0029 ns 1.00
Bench PR 2.696 ns 0.0425 ns 0.68

BDN_Artifacts.zip

Flame graphs: Main vs PR 🔥
Hot asm: Main vs PR
Hot functions: Main vs PR

For clean perf results, make sure you have just one [Benchmark] in your app.

src/coreclr/jit/lclvars.cpp Outdated Show resolved Hide resolved
@@ -2051,6 +2051,12 @@ unsigned Compiler::compMap2ILvarNum(unsigned varNum) const
return (unsigned)ICorDebugInfo::RETBUF_ILNUM;
}

if (info.compMethodInfo->args.hasImplicitThis() && varNum == info.compThisArg)
{
// implicit this doesn't map to any IL arg
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// implicit this doesn't map to any IL arg
// implicit this doesn't map to any method signature arg

assuming this change is otherwise correct that I am not sure about

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted those changes since compMap2ILvarNum is also used in debug info generation and I don't want to break it accidentally. I put the workaround directly to my function.

Copy link
Member

@jakobbotsch jakobbotsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change has the same change in our aliasing model as #97997 that we rejected did (relevant comment: #97997 (comment)).

Basically, before this change, it is ok in our aliasing model to write code like this:

struct A { public object X; }
ref struct B { public object X; }

void Foo(B* b)
{
  ((A*)b)->X = "abc";
}

void Bar(A* a)
{
  Foo((B*)a);
}

The optimization made by this PR is only valid if we consider this kind of aliasing to be UB. I am fine with doing that if the diffs justify it, but it is something to be aware about and explicit around if we take this change.

Comment on lines 5059 to 5124
// Check if we deal with `ref ByrefLikeStruct` argument
unsigned lclNum = (unsigned)vnStore->CoercedConstantValue<ssize_t>(funcApp.m_args[0]);
assert(lclNum != BAD_VAR_NUM);
if (comp->lvaIsParameter(lclNum))
{
if (comp->info.compThisArg == lclNum)
{
// No barrier needed if current class is byref-like
// and the destination is an implicit "this"
return comp->eeIsByrefLike(comp->info.compClassHnd) ? GCInfo::WriteBarrierForm::WBF_NoBarrier
: GCInfo::WriteBarrierForm::WBF_BarrierUnknown;
}

// Find the corresponding argument index. Basically, we need to ignore the hidden args
// such as generic context, return buffer, etc..
unsigned ilArg = comp->compMap2ILvarNum(lclNum);
switch (static_cast<int>(ilArg))
{
// Ignore non-user args
case ICorDebugInfo::RETBUF_ILNUM:
case ICorDebugInfo::VARARGS_HND_ILNUM:
case ICorDebugInfo::TYPECTXT_ILNUM:
case ICorDebugInfo::UNKNOWN_ILNUM:
return GCInfo::WriteBarrierForm::WBF_BarrierUnknown;

default:
break;
}
assert(ilArg != BAD_VAR_NUM);

// A small inconsistency: various signature walking VM APIs (e.g. getArgClass)
// do not expose implicit "this" args, while compMap2ILvarNum does.
if (comp->info.compMethodInfo->args.hasImplicitThis())
{
assert(ilArg > 0);
ilArg--;
}

// First, get corresponding CORINFO_ARG_LIST_HANDLE arg data
CORINFO_ARG_LIST_HANDLE currentArg = comp->info.compMethodInfo->args.args;
for (unsigned i = 0; i < ilArg; i++)
{
currentArg = comp->info.compCompHnd->getArgNext(currentArg);
}

// Now get its type and strip `ref` so we can get CORINFO_CLASS_HANDLE of the
// underlying type
CORINFO_CLASS_HANDLE byrefCls = comp->eeGetArgClass(&comp->info.compMethodInfo->args, currentArg);
if (byrefCls != NO_CLASS_HANDLE)
{
CORINFO_CLASS_HANDLE actualCls;
CorInfoType actualClsType = comp->info.compCompHnd->getChildType(byrefCls, &actualCls);

// Only structs can be byref-like
if (actualClsType == CORINFO_TYPE_VALUECLASS)
{
assert(actualCls != NO_CLASS_HANDLE);
if (comp->eeIsByrefLike(actualCls))
{
// It's byref-like, barrier is not needed
return GCInfo::WriteBarrierForm::WBF_NoBarrier;
}
}
}
}
return GCInfo::WriteBarrierForm::WBF_BarrierUnknown;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This look potentially quite expensive. You may want to try running an SPMI collection on this PR to get more accurate diffs, there are a lot of misses.

Comment on lines 5104 to 5107
// Now get its type and strip `ref` so we can get CORINFO_CLASS_HANDLE of the
// underlying type
CORINFO_CLASS_HANDLE byrefCls = comp->eeGetArgClass(&comp->info.compMethodInfo->args, currentArg);
if (byrefCls != NO_CLASS_HANDLE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is actually checking that this is a byref to that type? Would this potentially pass for an array as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire code sequence is a bit unfortunate... it would be much cleaner/less scary looking if we had just saved enough information in LclVarDsc (or a side table, similar to lvaParameterPassingInfo) to get the full class handle for the parameter. I suppose the main problem around that is that we don't actually call getArgClass on all parameters today, only on TYP_REF parameters.

I wonder if we should just bite the bullet and call getArgClass on all parameters, or perhaps change getArgType to unconditionally return a class handle (today it only returns the class handle for value classes).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree -- we should compute this early when we first process the method and local sigs and cache it on the local var.

@jkotas
Copy link
Member

jkotas commented Jun 16, 2024

The optimization made by this PR is only valid if we consider this kind of aliasing to be UB.

I think it is fine to make this form of aliasing UB, in particular when it involves managed types.

@EgorBo
Copy link
Member Author

EgorBo commented Jun 16, 2024

if the diffs justify it

Current diffs are not too big, but I am wondering if we can, sort of, mark more internal structs as stack-only where it's allowed - I already tried that for CoreLib and found 50 structs eligable for it.

Also, perhaps, e.g. NativeAOT can do this as part of IL analysis as an optimization

@EgorBo
Copy link
Member Author

EgorBo commented Jun 16, 2024

@MihuBot

@EgorBo
Copy link
Member Author

EgorBo commented Jun 22, 2024

@EgorBot -profiler

using BenchmarkDotNet.Attributes;
using System.Runtime.CompilerServices;

public class MyBench
{
    [Benchmark]
    public void Bench()
    {
        MyStruct s = default;
        Test(ref s, null, null);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    void Test(ref MyStruct s, object o1, object o2)
    {
        s.A = o1;
        s.B = o2;
    }

    ref struct MyStruct
    {
        public object A;
        public object B;
    }
}

@EgorBot
Copy link

EgorBot commented Jun 22, 2024

Benchmark results on Intel
BenchmarkDotNet v0.13.12, Ubuntu 22.04.4 LTS (Jammy Jellyfish)
Intel Xeon Platinum 8370C CPU 2.80GHz, 1 CPU, 16 logical and 8 physical cores
  Job-VSVRXS : .NET 9.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-CUMSFN : .NET 9.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
Method Toolchain Mean Error Ratio
Bench Main 4.366 ns 0.0080 ns 1.00
Bench PR 1.437 ns 0.0006 ns 0.33

BDN_Artifacts.zip

Flame graphs: Main vs PR 🔥
Hot asm: Main vs PR
Hot functions: Main vs PR

For clean perf results, make sure you have just one [Benchmark] in your app.

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might also be able to make the conservative VNs for these structs more aggressive (they can match the liberal vns), since they can't be modified cross-thread.

Comment on lines 5104 to 5107
// Now get its type and strip `ref` so we can get CORINFO_CLASS_HANDLE of the
// underlying type
CORINFO_CLASS_HANDLE byrefCls = comp->eeGetArgClass(&comp->info.compMethodInfo->args, currentArg);
if (byrefCls != NO_CLASS_HANDLE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree -- we should compute this early when we first process the method and local sigs and cache it on the local var.

@EgorBo
Copy link
Member Author

EgorBo commented Jun 26, 2024

@jakobbotsch @AndyAyersMS can you take a look again? I ended up reserving a bitfield in LclVarDesc

win-x64 Release layout
4>class LclVarDsc	size(72):
4>	+---
4> 0.	| var_types lvType (bitstart=0,nbits=5)
4> 0.	| lvIsParam (bitstart=5,nbits=1)
4> 0.	| lvIsRegArg (bitstart=6,nbits=1)
4> 0.	| lvFramePointerBased (bitstart=7,nbits=1)
4> 1.	| lvOnFrame (bitstart=0,nbits=1)
4> 1.	| lvRegister (bitstart=1,nbits=1)
4> 1.	| lvTracked (bitstart=2,nbits=1)
4> 1.	| lvPinned (bitstart=3,nbits=1)
4> 1.	| lvMustInit (bitstart=4,nbits=1)
4> 1.	| m_addrExposed (bitstart=5,nbits=1)
4> 1.	| lvDoNotEnregister (bitstart=6,nbits=1)
4> 1.	| lvFieldAccessed (bitstart=7,nbits=1)
4> 2.	| lvLiveInOutOfHndlr (bitstart=0,nbits=1)
4> 2.	| lvInSsa (bitstart=1,nbits=1)
4> 2.	| lvIsCSE (bitstart=2,nbits=1)
4> 2.	| lvHasLdAddrOp (bitstart=3,nbits=1)
4> 2.	| lvHasILStoreOp (bitstart=4,nbits=1)
4> 2.	| lvHasMultipleILStoreOp (bitstart=5,nbits=1)
4> 2.	| lvIsTemp (bitstart=6,nbits=1)
4> 2.	| lvIsImplicitByRef (bitstart=7,nbits=1)
4> 3.	| lvIsLastUseCopyOmissionCandidate (bitstart=0,nbits=1)
4> 3.	| lvSingleDef (bitstart=1,nbits=1)
4> 3.	| lvSingleDefRegCandidate (bitstart=2,nbits=1)
4> 3.	| lvDisqualifySingleDefRegCandidate (bitstart=3,nbits=1)
4> 3.	| lvSpillAtSingleDef (bitstart=4,nbits=1)
4> 3.	| lvHasExceptionalUsesHint (bitstart=5,nbits=1)
4> 3.	| lvQuirkToLong (bitstart=6,nbits=1)
4> 3.	| lvIsPtr (bitstart=7,nbits=1)
4> 4.	| lvIsUnsafeBuffer (bitstart=0,nbits=1)
4> 4.	| lvPromoted (bitstart=1,nbits=1)
4> 4.	| lvIsStructField (bitstart=2,nbits=1)
4> 4.	| lvContainsHoles (bitstart=3,nbits=1)
4> 4.	| lvAnySignificantPadding (bitstart=4,nbits=1)
4> 4.	| lvIsMultiRegArg (bitstart=5,nbits=1)
4> 4.	| lvIsMultiRegRet (bitstart=6,nbits=1)
4> 4.	| lvLRACandidate (bitstart=7,nbits=1)
4> 5.	| lvUsedInSIMDIntrinsic (bitstart=0,nbits=1)
4> 5.	| lvRegStruct (bitstart=1,nbits=1)
4> 5.	| lvClassIsExact (bitstart=2,nbits=1)
4> 5.	| lvImplicitlyReferenced (bitstart=3,nbits=1)
4> 5.	| lvSuppressedZeroInit (bitstart=4,nbits=1)
4> 5.	| lvHasExplicitInit (bitstart=5,nbits=1)
4> 5.	| lvIsOSRLocal (bitstart=6,nbits=1)
4> 5.	| lvIsOSRExposedLocal (bitstart=7,nbits=1)
4> 6.	| lvRedefinedInEmbeddedStatement (bitstart=0,nbits=1)
4> 6.	| lvIsNeverNegative (bitstart=1,nbits=1)
4> 6.	| lvIsSpan (bitstart=2,nbits=1)
4>  	| <alignment member> (size=1)             <-------------------- HERE!
4> 8	| lvFieldLclStart
4> 8	| lvParentLcl
4>12	| lvFieldCnt
4>13	| lvFldOffset
4>14	| lvFldOrdinal
4>15.	| lvAllDefsAreNoGc (bitstart=0,nbits=1)
4>16	| _lvRegNum
4>17	| _lvArgReg
4>18	| _lvArgInitReg
4>  	| <alignment member> (size=1)
4>20	| lvVarIndex
4>22	| m_lvRefCnt
4>24	| m_lvRefCntWtd
4>32	| lvStkOffs
4>36	| lvSlotNum
4>40	| lvClassHnd
4>48	| m_layout
4>56	| ?$SsaDefArray@VLclSsaVarDsc@@ lvPerSsaData
4>	+---

I also considered re-using lvClassHnd or m_Layout fields for it, but that required a bit more changes. Same Diffs with many missing contexts (jit-diff is the same).
I was thinking to play with "interproc" analysis and mark more structs as "never leave stack" to get bigger diffs (there are many cases), but will do it separately

Comment on lines 9427 to 9429
case ELEMENT_TYPE_BYREF:
typeHnd = ptr.GetTypeHandleThrowing(pModule, &typeContext);
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do it for all types? If not, can we at least do it for ELEMENT_TYPE_PTR as well?

The documentation of the JIT-EE API should be updated (vcTypeRet should probably be renamed too). Also, what about crossgen2/NAOT implementations of this API?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does doing this cause the operand type of the byref/pointer to be loaded eagerly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what about crossgen2/NAOT implementations of this API?

They seem to already always return it, so my test snippet is optimized there too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does doing this cause the operand type of the byref/pointer to be loaded eagerly?

How do I check that? I presume since it's a parameter it's going to be loaded anyway?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do it for all types?

Any particular reason to do that? Isn't, like you mentioned, it might lead to redundant type load events we don't need?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do I check that? I presume since it's a parameter it's going to be loaded anyway?

Maybe @jkotas can answer this question?

Any particular reason to do that? Isn't, like you mentioned, it might lead to redundant type load events we don't need?

The reason would be to make the JIT-EE API more regular in its contract, and avoid us having to call back to the EE side in the other cases too (e.g. we want this information for class types as well, and we currently do another JIT-EE call to get it). We could remove getArgClass entirely if getArgType just returned the class handle always.

Copy link
Member

@jkotas jkotas Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might lead to redundant type load events we don't need?

Once we load the type, the future attempts to load the type are cheap. There is not such a thing as "redundant type load events".

The point is to avoid loading the type in the first place. As you can see in the ELEMENT_TYPE_PTR comment below, eagerly loading types that do not need to be loaded can even break stuff. Avoiding unnecessary type loading is goodness for startup.

I think the JIT should never need to load the unmanaged pointer types. It should treat all unmanaged pointers as opaque void*.

This optimization should be based on the type provided in the stfld metadata token.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This optimization should be based on the type provided in the stfld metadata token.

Actually, this might simplify changes a lot, let me try

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This optimization should be based on the type provided in the stfld metadata token.

This would also be much more in line with how our normal modelling of aliasing and stores/loads work and effectively would mean that we aren't changing the aliasing model, so if we can do that it seems like it would be much cleaner overall.

{
CORINFO_CLASS_HANDLE clsHnd = info.compCompHnd->getArgClass(&info.compMethodInfo->args, argLst);
lvaSetClass(varDscInfo->varNum, clsHnd);
}
else if (type == CORINFO_TYPE_BYREF)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we already avoid the write barrier if this was an unmanaged pointer?

Copy link
Member Author

@EgorBo EgorBo Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean something like this:

unsafe void Test(string* a, string b)
{
    *a = b;
}

isn't it not safe to remove WB here? And, hopefully, nobody does this anyway 🙂
Currenly it does emit a checked WB.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it would be unsafe to remove the write barrier there.

I think it means we should make this feature work for CORINFO_TYPE_PTR too, for parity between managed and unmanaged pointers. E.g.

    [MethodImpl(MethodImplOptions.NoInlining)]
    void Test(MyStruct* s, object o1, object o2)
    {
        s->A = o1;
        s->B = o2;
    }

    ref struct MyStruct
    {
        public object A;
        public object B;
    }

should avoid the write barrier as well, like if it was a managed pointer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE that MyStruct* s is issueing a warning in Roslyn as an attempt to take an address of a managed type. Do we want to also eagerly load all types under unmanaged pointers in hope somebody do this (I presume it's rare/never).

cc @jkotas

Copy link
Member

@jkotas jkotas Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is fine to assume that both ref MyByRefLikeStruct and MyByRefLikeStruct* never point into GC heap here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm after is that we have consistency in optimizations like these. Switching a byref to a pointer because the target is on the stack should not come with surprising perf regressions like new write barriers.

@EgorBo
Copy link
Member Author

EgorBo commented Jun 26, 2024

@MihuBot

@EgorBo
Copy link
Member Author

EgorBo commented Jun 26, 2024

@MihuBot

@EgorBo
Copy link
Member Author

EgorBo commented Jun 26, 2024

Thanks for reviews and the suggestion, stsfld really eliminated all the complexity, no idea why I never thought about it 🙂

@EgorBo EgorBo merged commit 520af01 into dotnet:main Jun 27, 2024
101 of 107 checks passed
@EgorBo EgorBo deleted the optimize-byref-structs branch June 27, 2024 19:27
@github-actions github-actions bot locked and limited conversation to collaborators Jul 28, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize out write barriers for fields in ref-like structs
6 participants