-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NativeAOT] Adding CET support #102680
[NativeAOT] Adding CET support #102680
Conversation
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas |
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
// NOTE: When thunks are present, the thunk sequence may report a conservative GC reporting | ||
// lower bound that must be applied when processing the managed frame. | ||
|
||
ReturnAddressCategory category = CategorizeUnadjustedReturnAddress(m_ControlPC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure how this is related to the CET stuff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
InternalInit is supposed to leave the iterator in managed code. Until this change the InternalInit that takes NATIVE_CONTEXT
was always starting from a location that is already in managed code. (unlike the one that takes PInvokeTransitionFrame
)
Now, when we receive hijack exception, we pop to the caller and perform suspension in the caller's context.
Typically managed method that is not a reverse PInvoke must only be called by another managed method, so that would require no changes here. There are rare cases though in NativeAOT when ordinary managed methods are called from asm thunks. So it becomes possible for this code to see a location in a thunk. That is handled in the same way as in the other InternalInit method. (i.e. via UnwindNonEHThunkSequence
)
Perhaps I should unify this code into a helper to not duplicate it in both InternalInit methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've unified the adjustment for asm thunks
Turns out my CET-compatible machine already enables usermode shadow stacks for executables that are CET-compatible. (i.e. msedge.exe, conhost.exe, ...). No extra steps necessary. This probably came with some OS update. If we pass The next step is to make sure we are not relying on some compat/fixup behavior in exception handling, and whether we need |
@@ -99,6 +99,10 @@ The .NET Foundation licenses this file to you under the MIT license. | |||
<LinkerArg Condition="'$(OutputType)' == 'WinExe' or '$(OutputType)' == 'Exe'" Include="/STACK:$(IlcDefaultStackSize)" /> | |||
<!-- Do not warn if someone declares UnmanagedCallersOnly with an entrypoint of 'DllGetClassObject' and similar --> | |||
<LinkerArg Include="/IGNORE:4104" /> | |||
<!-- Opt into CETCOMPAT by default. --> | |||
<LinkerArg Condition="'$(ShadowStacksCompatible)' != 'false'" Include="/CETCOMPAT" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should call this property CETCompat
. It is what it is called in .vcxproj files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I think we may want make it x64-only, just in case it starts doing something for x86.
According to docs /CETCOMPAT
is accepted for x86 and currently does nothing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may want make it x64-only,
Sounds good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I will start doing something for arm64 eventually.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also /CETCOMPAT
seems not accepted on arm64. Not yet, I guess.
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
if (translateToManagedException)
{
pExPtrs->ContextRecord->SetIp(PCODEToPINSTR((PCODE)&RhpThrowHwEx));
pExPtrs->ContextRecord->SetArg0Reg(faultCode);
pExPtrs->ContextRecord->SetArg1Reg(faultingIP);
return EXCEPTION_CONTINUE_EXECUTION;
} I think |
The other interesting part is that For now I will assume that we will be enabling |
73f484b
to
051c2d7
Compare
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
src/coreclr/vm/gcinfodecoder.cpp
Outdated
@@ -786,7 +786,7 @@ bool GcInfoDecoder::EnumerateLiveSlots( | |||
|
|||
UINT32 normStart = lastNormStop + normStartDelta; | |||
UINT32 normStop = normStart + normStopDelta; | |||
if(normBreakOffset >= normStart && normBreakOffset < normStop) | |||
if(normBreakOffset >= normStart && normBreakOffset <= normStop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an ugly hack to workaround a rare case where a call to a hijacked method is immediately followed by a NoGC region thus the caller IP is outside of interruptible code.
It is always safe to pop to the caller of a regular managed method and start stackwalk at the call site, but when we do that as a part of hijack and the above condition is present, it triggers asserts if caller is fully-interruptible because we end up trying a stackwalk in a NoGC region.
A proper fix for this condition is in PR: #103540
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
I think the technical part of this PR is complete - we can enable CET for NativeAOT. What remains is to settle on enablement policy and how to test this. What I can suggest for enablement:
It may be possible that some environments will be configured to treat this as "not enough CET" and require EHCONT as well.
Basically:
I am basically optimizing for minimum number of knobs and assume that CET without EHCONT could be a good default. |
For the testing part - it looks like we have a bunch of tests that are incompatible with CFG. We need to do something about that - either make tests compatible with CFG (where possible) or conditionally disable some tests/scenarios if can't made compatible. |
Sounds good to me. |
At least some of these failures look like product bugs to me. We want to fix them - CFG is a supported config for naot. Have you looked into where the problems are?
I agree that we can use more outer loops tests with CFG enabled given these results. |
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thank you!
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Thanks!!! |
Fixes: #101942
STATUS_RETURN_ADDRESS_HIJACK_ATTEMPT
/CETCOMPAT:NO
RhpCallCatchFunclet