Arm64: [PAC-RET] NativeAOT changes#128950
Conversation
As suggested in [comment](dotnet#125436 (comment)), this PR covers subset of changes from dotnet#125436 related to NativeAOT. More details on PAC and its role in software security can be found ([here](https://llsoftsec.github.io/llsoftsecbook/#sec:pointer-authentication)).
|
Tagging subscribers to this area: @agocke, @dotnet/ilc-contrib |
|
/azp run runtime-nativeaot-outerloop |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Tests failing on linux-arm64 with |
|
Tests are crashing on linux-arm64 and osx-arm64 |
Add assert to ensure PAC is being emitted before stack adjustment Change-Id: Ifb636d2bb3fd3c78d33274ea7786291b5ee83fd6
Change-Id: Ic64dd02db14e691d3ea88212544059e9d9892062
These crashes came from changes to bail out if we interrupt on RET when pac is enabled. Previously PR was handling RET incorrect in |
| return false; | ||
| } | ||
|
|
||
| // At RET, FP/LR/SP have already been restored to the caller state so |
There was a problem hiding this comment.
Should this check be in TrailingEpilogueInstructionsCount so that all epilog-related checks are together?
There was a problem hiding this comment.
True, it was in there before and returned 1 on RET instruction (cannot return 0/-1 as it means not in in epilog/unknown). However when PAC is disabled, it allowed hijacking the current frame when interrupted on the RET instruction. That caused the crashes that we saw earlier. To keep PAC related changes separate, I added them in PAC specific area. I can extract the RET check into a separate function for better readability.
There was a problem hiding this comment.
when PAC is disabled, it allowed hijacking the current frame when interrupted on the RET instruction
Where is hijacking on RET rejected in main before these changes? We are not able to hijack on RET instruction - we can only hijack when the return address is spilled on stack.
returned 1 on RET instruction (cannot return 0/-1 as it means not in in epilog/unknown).
TrailingEpilogueInstructionsCount never returns 1 on Arm64, even for instructions that are clearly epilog. It always returns -1 for those. I do not see a problem with returning -1 for RET to follow the pattern.
There was a problem hiding this comment.
Apologies, my wording was wrong. I meant that on main we do not detect the RET instruction in TrailingEpilogueInstructionsCount(), so it returns 0. That lets GetReturnAddressHijackInfo() continue into the existing unwind path, which unwinds the current frame and may then identify the caller return-address slot for hijacking.
Returning -1 for RET would stop that path, even when PAC is disabled. I initially kept the RET check PAC-specific to avoid changing non-PAC behaviour.
I agreed your point that Arm64 already uses -1 for unsafe epilog states, and RET fits that pattern since FP/LR/SP have already been restored. Updated to return -1 for RET from TrailingEpilogueInstructionsCount().
There was a problem hiding this comment.
That lets GetReturnAddressHijackInfo() continue into the existing unwind path, which unwinds the current frame and may then identify the caller return-address slot for hijacking.
There is no return-address slot to hijacking when we are stopped at the RET instruction. The return address is in LR register at that point. I am still missing where this gets rejected in main today. I think it must be getting rejected somehow.
There was a problem hiding this comment.
I think it's line 1524 that rejects hijacks on RET.
if (pRegisterSet->GetReturnAddressRegisterLocation() == oldLocation)
{
// This is the case when we are either:
//
// 1) In a leaf method that does not push return address register on stack, OR
// 2) In the prolog/epilog of a non-leaf method that has not yet pushed return address register on stack
// or has return address register already popped off.
return false;
}
There was a problem hiding this comment.
If this is rejecting RET, it should be rejecting it both with and without PAC changes. So why do you need to reject the RET explicitly in your change?
|
Failure seems related to the tagged issues, not with the PR. |
|
Failures seem unrelated to the PR. |
| #endif | ||
|
|
||
| #if defined(TARGET_AMD64) | ||
| *pSpForArm64PacSign = 0; |
There was a problem hiding this comment.
Clear it as the first thing in the method - same as for Unix
This PR covers the final subset of changes from #125436 related to NativeAOT as suggested in comment.
It follows the previous work from- #127949, #127838 and #128147.
More details on PAC and its role in software security can be found (here).
cc: @dotnet/arm64-contrib @a74nh @jkotas @dhartglassMSFT