-
Notifications
You must be signed in to change notification settings - Fork 5.2k
[NativeAOT] Initialize COM before initializing FLS slot. #113194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
// The thread was already initialized, so it is already attached | ||
if (pAttachingThread->IsDetached()) | ||
{ | ||
ASSERT_UNCONDITIONALLY("The thread has been detached already"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make the error message to be identical to the CoreCLR one? https://github.com/dotnet/runtime/pull/112809/files#diff-63935b2e5a324d25feff725f83200eaa716c69fdef7529d2bd3cfa35620105aeR1776
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make the error message to be identical to the CoreCLR one?
Yes, we can do that.
Technically, a NativeAOT thread does not have a "destroyed" state. There is "detached" when we can't run managed code any more and then there is "thread is gone, because native TLS is gone". It seems we do not need a state in between and things get simpler without it.
For the purpose of error message the difference is too subtle though.
I'll make this "Ready for review" because the change is basically complete. What is left is to construct a repro that fails with no fix and does not fail after. Unlike with Core we do not have an existing scenario. |
I have confirmed that the problematic scenario (running managed code on detached thread) is reachable on baseline NativeAOT SDK and that this fix prevents it. Here is my repro reduced to only what is necessary: using System.Collections;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
namespace ConsoleApp57
{
internal class Program
{
[DllImport("oleaut32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
private static extern int SetErrorInfo(uint dwReserved, IntPtr perrinfo);
static void Main(string[] args)
{
Thread thr1 = new Thread(Test);
thr1.Start();
System.Console.ReadLine();
}
private static void Test()
{
var cw = new ComWrappersImpl();
IntPtr iUnknown = cw.GetOrCreateComInterfaceForObject(new object(), CreateComInterfaceFlags.None);
SetErrorInfo(0, iUnknown);
System.Console.WriteLine(Marshal.Release(iUnknown));
}
}
}
public class ComWrappersImpl : ComWrappers
{
public const string IID_IErrorInfo = "1CF2B120-547D-101B-8E65-08002B2BD119";
protected unsafe override ComInterfaceEntry* ComputeVtables(object obj, CreateComInterfaceFlags flags, out int count)
{
IntPtr fpQueryInterface = default;
IntPtr fpAddRef = default;
IntPtr fpRelease = default;
ComWrappers.GetIUnknownImpl(out fpQueryInterface, out fpAddRef, out fpRelease);
var vtblRaw = (IntPtr*)RuntimeHelpers.AllocateTypeAssociatedMemory(typeof(ComWrappersImpl), IntPtr.Size * 3);
vtblRaw[0] = fpQueryInterface;
vtblRaw[1] = fpAddRef;
vtblRaw[2] = fpRelease;
var entryRaw = (ComInterfaceEntry*)RuntimeHelpers.AllocateTypeAssociatedMemory(typeof(ComWrappersImpl), sizeof(ComInterfaceEntry));
entryRaw->IID = new Guid(IID_IErrorInfo);
entryRaw->Vtable = (IntPtr)vtblRaw;
count = 1;
return entryRaw;
}
protected override object CreateObject(IntPtr externalComObject, CreateObjectFlags flag)
=> throw new NotImplementedException();
protected override void ReleaseObjects(IEnumerable objects)
=> throw new NotImplementedException();
} If I run this compiled with baseline SDK, COM FLS callback fires after ours and runs managed In this PR we are introducing a FailFast to catch such scenario and deterministically fail. If I run same repro with the current PR, but with the call to
And if I run the repro sample with complete fix, I see no crash. With the complete fix COM calls the Release while the thread is still attached and has no problems with running managed code. |
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thank you!
{ | ||
// The current fiber is the home fiber of a thread, so the thread is shutting down | ||
RuntimeThreadShutdown(lpFlsData); | ||
ASSERT_UNCONDITIONALLY("Could not allocate an FLS slot."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to return false from the runtime init method when this happens, like before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to return false from the runtime init method when this happens?
We cannot handle a failure to initialize FLS slot, so we have to bail one way or another.
But, yes, we could handle this failure the same way as an OOM on init path - no failfast, just return from InitializeGC
with a failure/false.
uint32_t WINAPI FinalizerStart(void* pContext) | ||
{ | ||
#ifdef TARGET_WINDOWS | ||
g_FlsInitSucceeded = PalInitComAndFlsSlot(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to exit the finalizer thread when this fails? Otherwise, we call ThreadStore::AttachCurrentThread
below that is going to call FlsSetValue
on uninitialized g_flsIndex
and things will just go bad from there.
|
||
// Making finalizer thread MTA early ensures that COM is initialized before we initialize our thread | ||
// termination callback. | ||
CoInitializeEx(NULL, COINIT_MULTITHREADED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can check for bad HRESULT from CoInitializeEx as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can COM be somehow unavailable to the app? Like turned off for the platform or the app?
I think in such case we'd want to ignore a failure and continue, since we do not actually require that init succeeded, just want to be sure it succeeds early enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
COM is part of Windows. Windows without COM is not a thing.
The current COM init logic propagates the errors from CoInitialize:
Lines 320 to 335 in 8849480
int hr = Interop.Ole32.CoInitializeEx(IntPtr.Zero, | |
(state == ApartmentState.STA) ? Interop.Ole32.COINIT_APARTMENTTHREADED | |
: Interop.Ole32.COINIT_MULTITHREADED); | |
#endif | |
if (hr < 0) | |
{ | |
// RPC_E_CHANGED_MODE indicates this thread has been already initialized with a different | |
// concurrency model. We stay away and let whoever else initialized the COM to be in control. | |
if (hr == HResults.RPC_E_CHANGED_MODE) | |
return; | |
// CoInitializeEx returns E_NOTIMPL on Windows Nano Server for STA | |
if (hr == HResults.E_NOTIMPL) | |
throw new PlatformNotSupportedException(); | |
throw new OutOfMemoryException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I think forbidding COM (or in particular COINIT_MULTITHREADED mode) might be too breaking, so it will always work.
Thanks! |
Fixes: #113040
Order FLS slot initialization after ensuring that COM is initialized. This is to ensure that deinitialization of COM happen before detachment of a thread - as COM deinitialization may want to run managed code.