[PAC][InstCombine] Replace auth+sign with resign #130807

Open · wants to merge 3 commits into main from pauth-instcombine-auth-sign

Conversation

atrosinenko (Contributor)

Prevent introducing signing oracles by increasing the chances that the pointer resign pattern is detected and kept as a single operation until it is expanded in AsmPrinter.

With separate auth and sign intrinsics, it is possible for the intermediate authenticated pointer to be spilled to the stack and reloaded before it is signed again, making it possible for an attacker to trick the PAC* instruction into signing an arbitrary pointer.
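
As a minimal sketch of the transform (hypothetical IR with made-up value names, keys, and discriminators; the actual patterns are in the tests below), the combine turns

  %authed = call i64 @llvm.ptrauth.auth(i64 %p, i32 1, i64 %auth.disc)
  %signed = call i64 @llvm.ptrauth.sign(i64 %authed, i32 2, i64 %sign.disc)

into a single intrinsic call that stays intact until AsmPrinter expands it:

  %signed = call i64 @llvm.ptrauth.resign(i64 %p, i32 1, i64 %auth.disc, i32 2, i64 %sign.disc)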

@atrosinenko atrosinenko marked this pull request as ready for review March 11, 2025 18:16
@atrosinenko atrosinenko requested a review from nikic as a code owner March 11, 2025 18:16
@llvmbot (Member) commented Mar 11, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Anatoly Trosinenko (atrosinenko)

Changes

Prevent introducing signing oracles by increasing the chances that the pointer resign pattern is detected and kept as a single operation until it is expanded in AsmPrinter.

With separate auth and sign intrinsics, it is possible for the intermediate authenticated pointer to be spilled to the stack and reloaded before it is signed again, making it possible for an attacker to trick the PAC* instruction into signing an arbitrary pointer.


Full diff: https://github.com/llvm/llvm-project/pull/130807.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp (+23)
  • (modified) llvm/test/Transforms/InstCombine/ptrauth-intrinsics.ll (+58)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
index 63f2fd0a733ce..a4b3c0afdb2f8 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
@@ -2978,6 +2978,29 @@ Instruction *InstCombinerImpl::visitCallInst(CallInst &CI) {
         Intrinsic::getOrInsertDeclaration(II->getModule(), NewIntrin);
     return CallInst::Create(NewFn, CallArgs);
   }
+  case Intrinsic::ptrauth_sign: {
+    // auth + sign can be replaced with resign, which prevents unsafe
+    // spills and reloads of intermediate authenticated value.
+    Value *Ptr = II->getArgOperand(0);
+    Value *SignKey = II->getArgOperand(1);
+    Value *SignDisc = II->getArgOperand(2);
+
+    const auto *CI = dyn_cast<CallBase>(Ptr);
+    if (!CI || CI->getIntrinsicID() != Intrinsic::ptrauth_auth)
+      break;
+
+    Value *BasePtr = CI->getOperand(0);
+    Value *AuthKey = CI->getArgOperand(1);
+    Value *AuthDisc = CI->getArgOperand(2);
+
+    // Not replacing auth+sign using the same schema with nop, as auth+sign
+    // pair traps on authentication failure.
+
+    Function *NewFn = Intrinsic::getOrInsertDeclaration(
+        II->getModule(), Intrinsic::ptrauth_resign);
+    return CallInst::Create(NewFn,
+                            {BasePtr, AuthKey, AuthDisc, SignKey, SignDisc});
+  }
   case Intrinsic::arm_neon_vtbl1:
   case Intrinsic::aarch64_neon_tbl1:
     if (Value *V = simplifyNeonTbl1(*II, Builder))
diff --git a/llvm/test/Transforms/InstCombine/ptrauth-intrinsics.ll b/llvm/test/Transforms/InstCombine/ptrauth-intrinsics.ll
index 208e162ac9416..15dd8bff19f2c 100644
--- a/llvm/test/Transforms/InstCombine/ptrauth-intrinsics.ll
+++ b/llvm/test/Transforms/InstCombine/ptrauth-intrinsics.ll
@@ -160,6 +160,64 @@ define i64 @test_ptrauth_resign_ptrauth_constant(ptr %p) {
   ret i64 %authed
 }
 
+define i64 @test_ptrauth_auth_sign_same_schema(ptr %p) {
+; CHECK-LABEL: @test_ptrauth_auth_sign_same_schema(
+; CHECK-NEXT:    [[P_INT:%.*]] = ptrtoint ptr [[P:%.*]] to i64
+; CHECK-NEXT:    [[RESIGNED:%.*]] = call i64 @llvm.ptrauth.resign(i64 [[P_INT]], i32 1, i64 1234, i32 1, i64 1234)
+; CHECK-NEXT:    ret i64 [[RESIGNED]]
+;
+  %p.int = ptrtoint ptr %p to i64
+  %authed = call i64 @llvm.ptrauth.auth(i64 %p.int, i32 1, i64 1234)
+  %resigned = call i64 @llvm.ptrauth.sign(i64 %authed, i32 1, i64 1234)
+  ret i64 %resigned
+}
+
+define i64 @test_ptrauth_auth_sign_opaque_disc_same_schema(ptr %p, i64 %disc) {
+; CHECK-LABEL: @test_ptrauth_auth_sign_opaque_disc_same_schema(
+; CHECK-NEXT:    [[P_INT:%.*]] = ptrtoint ptr [[P:%.*]] to i64
+; CHECK-NEXT:    [[RESIGNED:%.*]] = call i64 @llvm.ptrauth.resign(i64 [[P_INT]], i32 1, i64 [[DISC:%.*]], i32 1, i64 [[DISC]])
+; CHECK-NEXT:    ret i64 [[RESIGNED]]
+;
+  %p.int = ptrtoint ptr %p to i64
+  %authed = call i64 @llvm.ptrauth.auth(i64 %p.int, i32 1, i64 %disc)
+  %resigned = call i64 @llvm.ptrauth.sign(i64 %authed, i32 1, i64 %disc)
+  ret i64 %resigned
+}
+
+define i64 @test_ptrauth_auth_sign_different_disc(ptr %p, i64 %disc) {
+; CHECK-LABEL: @test_ptrauth_auth_sign_different_disc(
+; CHECK-NEXT:    [[P_INT:%.*]] = ptrtoint ptr [[P:%.*]] to i64
+; CHECK-NEXT:    [[RESIGNED:%.*]] = call i64 @llvm.ptrauth.resign(i64 [[P_INT]], i32 1, i64 [[DISC:%.*]], i32 1, i64 1234)
+; CHECK-NEXT:    ret i64 [[RESIGNED]]
+;
+  %p.int = ptrtoint ptr %p to i64
+  %authed = call i64 @llvm.ptrauth.auth(i64 %p.int, i32 1, i64 %disc)
+  %resigned = call i64 @llvm.ptrauth.sign(i64 %authed, i32 1, i64 1234)
+  ret i64 %resigned
+}
+
+define i64 @test_ptrauth_auth_sign_different_key(ptr %p) {
+; CHECK-LABEL: @test_ptrauth_auth_sign_different_key(
+; CHECK-NEXT:    [[P_INT:%.*]] = ptrtoint ptr [[P:%.*]] to i64
+; CHECK-NEXT:    [[RESIGNED:%.*]] = call i64 @llvm.ptrauth.resign(i64 [[P_INT]], i32 0, i64 1234, i32 1, i64 1234)
+; CHECK-NEXT:    ret i64 [[RESIGNED]]
+;
+  %p.int = ptrtoint ptr %p to i64
+  %authed = call i64 @llvm.ptrauth.auth(i64 %p.int, i32 0, i64 1234)
+  %resigned = call i64 @llvm.ptrauth.sign(i64 %authed, i32 1, i64 1234)
+  ret i64 %resigned
+}
+
+define i64 @test_ptrauth_sign_nonauth_nonconst_disc(i64 %disc) {
+; CHECK-LABEL: @test_ptrauth_sign_nonauth_nonconst_disc(
+; CHECK-NEXT:    [[SIGNED:%.*]] = call i64 @llvm.ptrauth.sign(i64 ptrtoint (ptr @foo to i64), i32 1, i64 [[DISC:%.*]])
+; CHECK-NEXT:    ret i64 [[SIGNED]]
+;
+  %foo.int = ptrtoint ptr @foo to i64
+  %signed = call i64 @llvm.ptrauth.sign(i64 %foo.int, i32 1, i64 %disc)
+  ret i64 %signed
+}
+
 declare i64 @llvm.ptrauth.auth(i64, i32, i64)
 declare i64 @llvm.ptrauth.sign(i64, i32, i64)
 declare i64 @llvm.ptrauth.resign(i64, i32, i64, i32, i64)

@kbeyls (Collaborator) left a comment

Naively, it would seem to me that it is a user error not to use the resign intrinsic and to use auth and sign in sequence instead.
I can see that having this peephole optimization might reduce the occurrence of auth + sign operations in sequence, but it seems this is a "best effort" fix.
I wonder if you could share any insights into where these auth and sign sequences typically originate. I also wonder if other mitigations (e.g. generating warnings when such sequences are seen) wouldn't also be needed or sometimes be better?

@@ -2978,6 +2978,29 @@ Instruction *InstCombinerImpl::visitCallInst(CallInst &CI) {
         Intrinsic::getOrInsertDeclaration(II->getModule(), NewIntrin);
     return CallInst::Create(NewFn, CallArgs);
   }
   case Intrinsic::ptrauth_sign: {
     // auth + sign can be replaced with resign, which prevents unsafe
     // spills and reloads of intermediate authenticated value.
Collaborator:

I wonder if this comment would be easier to understand for a reader who hasn't worked much on pointer authentication if it explained things in a bit more detail, for example: After auth, a raw, unsigned pointer resides in a register. Later compiler passes may end up spilling and reloading this raw pointer to memory, which undermines the security property that pointer authentication aims to provide (not having certain classes of pointers reside unsigned/raw in memory). By rewriting an authentication followed by a signing operation as a single resign, we eliminate the possibility of the intermediate unsigned pointer being spilled to memory.

Contributor Author:

Will update the comment, thanks!

Contributor Author:

Updated in 3b4d9b5, though with different wording; hopefully this is even easier to understand:

    // Replace auth+sign with a single resign intrinsic.
    // When auth and sign operations are performed separately, later compiler
    // passes may spill the intermediate result to memory as a raw, unprotected
    // pointer, which makes it possible for an attacker to replace it under
    // the PAuth threat model. On the other hand, the resign intrinsic is not
    // expanded until AsmPrinter, where it is emitted as a contiguous,
    // non-attackable sequence of instructions.

Collaborator:

Thanks!
After I wrote my initial comment above, I wondered whether this is the first peephole pattern added to InstCombine for security reasons rather than for performance reasons.
Anyway, thanks for the improved comment!

@atrosinenko (Contributor Author)

> but it seems this is a "best effort" fix

This is only a best-effort fix, and it can sometimes even increase code size, as it effectively turns this code

tmp = auth(old, schema1)
new1 = sign(tmp, schema2)
new2 = sign(tmp, schema3)
new3 = sign(tmp, schema4)

into this

tmp = auth(old, schema1)
new1 = sign(tmp, schema2)
tmp = auth(old, schema1)
new2 = sign(tmp, schema3)
tmp = auth(old, schema1)
new3 = sign(tmp, schema4)

That said, the correct approach of emitting resign from the start would have the same effect on code size, I think.
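
In IR terms, the duplication is roughly the following (a hypothetical sketch with made-up value names and discriminators; each resign re-authenticates %old with the original schema before signing with a new one):

  %new1 = call i64 @llvm.ptrauth.resign(i64 %old, i32 1, i64 %d1, i32 1, i64 %d2)
  %new2 = call i64 @llvm.ptrauth.resign(i64 %old, i32 1, i64 %d1, i32 1, i64 %d3)
  %new3 = call i64 @llvm.ptrauth.resign(i64 %old, i32 1, i64 %d1, i32 1, i64 %d4)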

> I wonder if you could share any insights into where these auth and sign sequences typically originate.

As for these signing oracles being observed "in the wild": I experimented with your gadget-scanner prototype to search for other kinds of gadgets. For the entire llvm-test-suite (excluding Bitcode/ and a few other places) built at the O2 optimization level, it found 349 signing oracles that are fixed by this PR, but they all seem to originate either from MicroBenchmarks/libs/benchmark/src/sysinfo.cc (6 reports per executable, in benchmark::CPUInfo::CPUInfo()) or from MicroBenchmarks/libs/benchmark/test/output_test_helper.cc (5 reports per executable: 3 in RunOutputTests(int, char**)::ReporterTest::~ReporterTest() and 2 in GetFileReporterOutput(int, char**)).

I tried tracing the origins of these auth and sign intrinsics when this InstCombine rule triggers and found the following call stacks: CodeGenFunction::InitializeVTablePointer creates the auth intrinsic call here:

and the sign is created here:

Note that this was an arbitrary auth+sign pair combined by this rule, not necessarily one of those actually usable as a signing oracle. By the way, rr helps significantly by making it possible to reverse-continue from the breakpoint with watchpoints set at II->SubclassID and CI->SubclassID.

@asl (Collaborator) commented Mar 14, 2025

Tagging @ojhunt

@ojhunt (Contributor) commented Mar 14, 2025

Tagging @ahmedbougacha, as it's surprising if the backend isn't combining auth+sign intrinsics.

@atrosinenko (Contributor Author)

For the record, in the discussion of #130809, the following example was proposed, which is more relevant here:

void* f3(void *p) {
  void* authed = __builtin_ptrauth_auth(p, 2, 1234);
  __asm__(""::"m"(authed));
  return __builtin_ptrauth_sign_unauthenticated(authed, 3, 42);
}

It turned out that, with this PR, the following is emitted:

f3:
        sub     sp, sp, #16
        mov     x16, x0
        add     x8, sp, #8
        mov     x17, #1234
        autda   x16, x17
        mov     x17, x16
        xpacd   x17
        cmp     x16, x17
        b.eq    .Lauth_success_0
        brk     #0xc472
.Lauth_success_0:
        str     x16, [sp, #8]
        mov     x0, x16
        //APP
        //NO_APP
        mov     w8, #42
        pacdb   x0, x8
        add     sp, sp, #16
        ret

which does not seem safe. However, the input to InstCombine (at its last invocation) is:

; Function Attrs: nounwind uwtable
define dso_local ptr @f3(ptr noundef %p) local_unnamed_addr #0 {
entry:
  %authed = alloca ptr, align 8
  call void @llvm.lifetime.start.p0(i64 8, ptr nonnull %authed) #4
  %0 = ptrtoint ptr %p to i64
  %1 = tail call i64 @llvm.ptrauth.auth(i64 %0, i32 2, i64 1234)
  %2 = inttoptr i64 %1 to ptr
  store ptr %2, ptr %authed, align 8, !tbaa !9
  call void asm sideeffect "", "*m"(ptr nonnull elementtype(ptr) %authed) #4, !srcloc !13
  %3 = load ptr, ptr %authed, align 8, !tbaa !9
  %4 = ptrtoint ptr %3 to i64
  %5 = call i64 @llvm.ptrauth.sign(i64 %4, i32 3, i64 42)
  %6 = inttoptr i64 %5 to ptr
  call void @llvm.lifetime.end.p0(i64 8, ptr nonnull %authed) #4
  ret ptr %6
}

This means it is probably not an issue with this PR, as even replacing const auto *CI = dyn_cast<CallBase>(Ptr) with const auto *CI = dyn_cast<CallBase>(Ptr->stripPointerCasts()) would not help. On the other hand, simple cases that would be handled by stripPointerCasts() seem to be handled before intrinsic calls are simplified anyway:

define i64 @test_ptrauth_auth_sign_same_schema(ptr %p) {
; CHECK-LABEL: @test_ptrauth_auth_sign_same_schema(
; CHECK-NEXT:    [[P_INT:%.*]] = ptrtoint ptr [[P:%.*]] to i64
; CHECK-NEXT:    [[RESIGNED:%.*]] = call i64 @llvm.ptrauth.resign(i64 [[P_INT]], i32 1, i64 1234, i32 1, i64 1234)
; CHECK-NEXT:    ret i64 [[RESIGNED]]
;
  %p.int = ptrtoint ptr %p to i64
  %authed = call i64 @llvm.ptrauth.auth(i64 %p.int, i32 1, i64 1234)
  %authed.ptr = inttoptr i64 %authed to ptr
  %authed.ptr.int = ptrtoint ptr %authed.ptr to i64
  %resigned = call i64 @llvm.ptrauth.sign(i64 %authed.ptr.int, i32 1, i64 1234)
  ret i64 %resigned
}

@atrosinenko atrosinenko force-pushed the pauth-instcombine-auth-sign branch from 3b4d9b5 to 0910bcc on April 21, 2025 16:38
@atrosinenko (Contributor Author)

Ping.

Rebased onto the current main branch.

@atrosinenko atrosinenko force-pushed the pauth-instcombine-auth-sign branch from 06d0e14 to 981ce94 on May 23, 2025 08:44
@asl asl moved this to In Progress in Pointer Authentication Tasks Jun 2, 2025
@asl asl added this to the LLVM 21.x Release milestone Jun 2, 2025