Skip to content

GC polling in unboxing JIT helpers #32353

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Feb 16, 2020
Merged

GC polling in unboxing JIT helpers #32353

merged 7 commits into from
Feb 16, 2020

Conversation

VSadov
Copy link
Member

@VSadov VSadov commented Feb 15, 2020

  • moved fast part of Unbox to a managed helper
  • added GC polling to the native helper that handles rare cases of Unbox.
  • added GC polling to JIT_Unbox_Nullable native helpers
    (it does not look like it is worth the trouble to move any parts of this one to managed)

@VSadov VSadov requested a review from jkotas February 15, 2020 01:49
@VSadov
Copy link
Member Author

VSadov commented Feb 15, 2020

Initially I assumed that JIT inlines Unbox in most cases and the change would be mostly for the purpose of completeness and predictability.

Turns out Unbox helper calls are common because JIT uses helpers when compiling with MinOpts.

@jkotas
Copy link
Member

jkotas commented Feb 15, 2020

Is there a targeted microbenchmark that shows the improved GC pause times in pathological cases?

@jkotas jkotas added the tenet-performance Performance related issue label Feb 15, 2020
[DebuggerStepThrough]
private static ref byte Unbox(void* toTypeHnd, object obj)
{
// this will NRE if obj is null, as expected.
Copy link
Member

@jkotas jkotas Feb 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// this will NRE if obj is null, as expected.
// this will throw NullReferenceException if obj is null, as expected.

The number of people with a decoder ring for NRE is pretty small... #Resolved

(pMT1->IsEnum() || pMT1->IsTruePrimitive()) &&
(pMT2->IsEnum() || pMT2->IsTruePrimitive()))
{
// we allow enums and their primtive type to be interchangable
Copy link
Member

@jkotas jkotas Feb 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// we allow enums and their primtive type to be interchangable
// we allow enums and their primitive type to be interchangeable
``` #Resolved

OBJECTREF objRef = ObjectToOBJECTREF(obj);
HELPER_METHOD_FRAME_BEGIN_RET_1(objRef);
if (TypeHandle(type).IsEquivalentTo(objRef->GetTypeHandle()))

if (pMT1->GetInternalCorElementType() == pMT2->GetInternalCorElementType() &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this inside HELPER_METHOD_FRAME will regress performance for enums...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We return a ref to the boxed value. If I want to poll, I need to protect the box. Is there a way to protect without a frame?


In reply to: 379777249 [](ancestors = 379777249)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way to do this is to use fast path without HMF when g_TrapReturningThreads.LoadWithoutBarrier() is zero, and use polling show path with HMF otherwise. It would duplicate a bit of the logic, but it should not be a huge problem for the unbox helper.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to do this would be to introduce FC_GC_POLL_EX variant for byrefs, but that would be quite a bit of plumbing. I do not think it is worth it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. A flag check would not trash the refs and should be much cheaper.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With suggested changes (g_TrapReturningThreads and GetVerifierCorElementType) the enum case is slightly faster than originally.

That is - as long as g_TrapReturningThreads is checked after comparing GetVerifierCorElementType.
Not sure why the order of compares matters. It is nearly the same codegen. I am guessing the rarely taken branch is more predictable this way.

@VSadov
Copy link
Member Author

VSadov commented Feb 15, 2020

There are ways to ensure that a helper is called (unbox enum from boxed underlying type, prevent inlining, prevent rejitting), so I assumed a similar repro as with casts can be constructed. I’ll try making one.

@VSadov
Copy link
Member Author

VSadov commented Feb 15, 2020

The following reports pauses > 100 ms. Sometimes up to half second on my machine.
After the fix it is nearly always 0 ms and nothing is printed.

using System;
using System.Diagnostics;
using System.Threading;

class Program
{
    enum E1
    {
        e
    }

    static object o = E1.e;

    static void Work()
    {
        for(; ;)
        {
            int x = (int)o + (int)o + (int)o + (int)o + (int)o + (int)o + (int)o + (int)o;
        }
    }

    static void Main()
    {
        new Thread(() =>
        {
            Work();
        }).Start();

        for (; ; ) 
        { 
            var sw = Stopwatch.StartNew(); 
            GC.Collect(); 
            sw.Stop();

            if (sw.ElapsedMilliseconds > 100)
            {
                Console.WriteLine(sw.ElapsedMilliseconds);
            }

            Thread.Sleep(1); 
        }
    }
}
109
282
207
109
767
175
417
130
230
131
503
786
120
131
273
240
130
340
131
197
131

OBJECTREF objRef = ObjectToOBJECTREF(obj);
HELPER_METHOD_FRAME_BEGIN_RET_1(objRef);
if (TypeHandle(type).IsEquivalentTo(objRef->GetTypeHandle()))

if (pMT1->GetInternalCorElementType() == pMT2->GetInternalCorElementType() &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be also able to make the enum case faster by using GetVerifierCorElementType. It should make the IsEnum/IsTruePrimitiveType conditions unnecessary.

}
// we allow enums and their primitive type to be interchangable.
// if suspension is requested, defer to the framed helper.
if (pMT1->GetVerifierCorElementType() == pMT2->GetVerifierCorElementType() &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs to check that the CorElementType is primitive (ie != ELEMENTTYPE_VALUETYPE?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the fastest check is

    if (pMT1->GetInternalCorElementType() == pMT2->GetInternalCorElementType() &&
        (pMT1->IsEnum() || pMT1->IsTruePrimitive()) &&
        (pMT2->IsEnum() || pMT2->IsTruePrimitive()))
    {

Copy link
Member

@jkotas jkotas Feb 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not this?

CorElementType corElementType = pMT1->GetVerifierCorElementType();
if (corElementType != ELEMENT_TYPE_VALUETYPE &&  corElementType == pMT2->GetVerifierCorElementType())
{

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is yet a bit faster. :-)

@@ -8698,7 +8698,7 @@ void Interpreter::UnboxAny()
CorElementType type1 = pMT1->GetInternalCorElementType();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change it here to GetVerifierCorElementType as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the interpreter too. Do we have any tests for that?

}
// we allow enums and their primitive type to be interchangable.
// if suspension is requested, defer to the framed helper.
CorElementType corElementType = pMT1->GetInternalCorElementType();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CorElementType corElementType = pMT1->GetInternalCorElementType();
CorElementType corElementType = pMT1->GetVerifierCorElementType();

Multiple places. I had the same bug in my comment and fixed it, but I guess you have seen the original buggy version of the comment.

Copy link
Member Author

@VSadov VSadov Feb 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seemed that GetInternalCorElementType could be actually correct here and cheaper than GetVerifierCorElementType

  • When we get here, since this is Unbox, the target type is a struct, enum or a true primitive
  • We are not interested in structs, since they either should have matched already or need an equivalency check.
  • For others we just need the GetInternalCorElementType to match.

From what I see the subtleness is about types that are primitives, but not true primitives. I am not sure what they are (byrefs?, void?), but I guess we should not see them here.


In reply to: 379897103 [](ancestors = 379897103)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

types that are primitives, but not true primitives

These are structs that wrap a primitive type on x86.

In the early days of .NET Framework, we had a performance problem with structs that wrap a primitive type (there is a lot of struct like this). The JIT had close to none optimizations for structs at that time, and fixing it the right way was very hard. So we invented this quick hack: Treat them as the primitive type that they are wrapping. This is the type that GetInternalCorElementType returns.

GetInternalCorElementType should be only ever used in places that deal with calling convention.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you seeing GetVerifierCorElementType to be more expensive than GetInternalCorElementType , you should be able to make GetVerifierCorElementType just return GetInternalCorElementType everywhere except x86. The two should return identical element type everywhere except x86.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Yes, GetVerifierCorElementType is noticeably more expensive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, we are using the fake cor element types on !x86 as well in a few cases:

else if (strcmp(name, g_RuntimeArgumentHandleName) == 0)
. So using GetInternalCorElementType on !x86 may not be a good idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tried

    if (pMT1->GetInternalCorElementType() == pMT2->GetInternalCorElementType() &&
            (pMT1->IsEnum() || pMT1->IsTruePrimitive()) &&
            (pMT2->IsEnum() || pMT2->IsTruePrimitive()) &&

vs

    CorElementType corElementType = pMT1->GetVerifierCorElementType();
    if (corElementType != ELEMENT_TYPE_VALUETYPE &&  corElementType == pMT2->GetVerifierCorElementType() &&

The first variant is faster. (20%-30% on an Unbox microbenchmark)
I think the checks are equivalent, unless I am missing something.

@@ -8534,7 +8534,7 @@ void Interpreter::Unbox()
CorElementType type1 = pMT1->GetInternalCorElementType();
CorElementType type2 = pMT2->GetInternalCorElementType();

// we allow enums and their primtive type to be interchangable
// we allow enums and their primitive type to be interchangable
if (type1 == type2)
{
if ((pMT1->IsEnum() || pMT1->IsTruePrimitive()) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This place too?

}

// Even less common cases (type equivalence) go to a framed helper.
ENDFORBIDGC();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to keep ENDFORBIDGC(); before forwarding the helper call. It will fix some of the test failures hit by the CI.

}

HELPER_METHOD_POLL();
Copy link
Member

@jkotas jkotas Feb 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Result is not protected here. Move the POLL right after HELPER_METHOD_FRAME_BEGIN_RET_1 #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or protect the result, but moving the poll earlier is simpler.


In reply to: 379897330 [](ancestors = 379897330)

@VSadov VSadov merged commit dfa5d03 into dotnet:master Feb 16, 2020
@VSadov VSadov deleted the unbox branch February 16, 2020 23:42
@VSadov
Copy link
Member Author

VSadov commented Feb 16, 2020

Thanks!!!

@ghost ghost locked as resolved and limited conversation to collaborators Dec 10, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
tenet-performance Performance related issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants