Skip to content

[BOLT][AArch64] Detect veneers with missing data markers #142069

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 30, 2025

Conversation

maksfb
Copy link
Contributor

@maksfb maksfb commented May 30, 2025

The linker may omit data markers for long absolute veneers causing BOLT to treat data as code. Detect such veneers and introduce data markers artificially before BOLT's disassembler kicks in.

The linker may omit data markers for long absolute veneers causing BOLT
to treat data as code. Detect such veneers and introduce data markers
artificially before BOLT's disassembler kicks in.
@llvmbot
Copy link
Member

llvmbot commented May 30, 2025

@llvm/pr-subscribers-bolt

Author: Maksim Panchenko (maksfb)

Changes

The linker may omit data markers for long absolute veneers causing BOLT to treat data as code. Detect such veneers and introduce data markers artificially before BOLT's disassembler kicks in.


Full diff: https://github.com/llvm/llvm-project/pull/142069.diff

2 Files Affected:

  • (modified) bolt/lib/Rewrite/RewriteInstance.cpp (+13)
  • (modified) bolt/test/AArch64/veneer-lld-abs.s (+12-1)
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index ad062ea3622d1..69c39cf5e5bac 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -1341,6 +1341,19 @@ void RewriteInstance::discoverFileObjects() {
         }
       }
     }
+
+    // The linker may omit data markers for absolute long veneers. Introduce
+    // those markers artificially to assist the disassembler.
+    for (BinaryFunction &BF :
+         llvm::make_second_range(BC->getBinaryFunctions())) {
+      if (BF.getOneName().starts_with("__AArch64AbsLongThunk_") &&
+          BF.getSize() == 16 && !BF.getSizeOfDataInCodeAt(8)) {
+        BC->errs() << "BOLT-WARNING: missing data marker detected in veneer "
+                   << BF << '\n';
+        BF.markDataAtOffset(8);
+        BC->AddressToConstantIslandMap[BF.getAddress() + 8] = &BF;
+      }
+    }
   }
 
   if (!BC->IsLinuxKernel) {
diff --git a/bolt/test/AArch64/veneer-lld-abs.s b/bolt/test/AArch64/veneer-lld-abs.s
index 7e6fe2d127060..b22301db66c54 100644
--- a/bolt/test/AArch64/veneer-lld-abs.s
+++ b/bolt/test/AArch64/veneer-lld-abs.s
@@ -6,10 +6,21 @@
 # RUN:    -fuse-ld=lld -Wl,-q
 # RUN: llvm-objdump -d %t.exe | FileCheck --check-prefix=CHECK-INPUT %s
 # RUN: llvm-objcopy --remove-section .rela.mytext %t.exe
-# RUN: llvm-bolt %t.exe -o %t.bolt --elim-link-veneers=true --lite=0
+# RUN: llvm-bolt %t.exe -o %t.bolt
 # RUN: llvm-objdump -d -j .text  %t.bolt | \
 # RUN:   FileCheck --check-prefix=CHECK-OUTPUT %s
 
+## Occasionally, we see the linker not generating $d symbols for long veneers
+## causing BOLT to fail veneer elimination.
+# RUN: llvm-objcopy --remove-symbol-prefix=\$d %t.exe %t.no-marker.exe
+# RUN: llvm-bolt %t.no-marker.exe -o %t.no-marker.bolt \
+# RUN:   2>&1 | FileCheck %s --check-prefix=CHECK-BOLT
+# RUN: llvm-objdump -d -j .text  %t.no-marker.bolt | \
+# RUN:   FileCheck --check-prefix=CHECK-OUTPUT %s
+
+# CHECK-BOLT-NOT: BOLT-WARNING: unable to disassemble instruction
+# CHECK-BOLT: BOLT-WARNING: missing data marker detected in veneer __AArch64AbsLongThunk_far_function
+
 .text
 .balign 4
 .global far_function

@maksfb
Copy link
Contributor Author

maksfb commented May 30, 2025

@MaskRay, any idea why LLD will sometimes omit $d for long absolute thunks?

Copy link
Contributor

@aaupov aaupov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG

// those markers artificially to assist the disassembler.
for (BinaryFunction &BF :
llvm::make_second_range(BC->getBinaryFunctions())) {
if (BF.getOneName().starts_with("__AArch64AbsLongThunk_") &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious where does the name come from? Is it in the symbol table, or a synthetic name created by us during disassembly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the linker, i.e. the symbol table.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the linker knows these are veneers, and just omits data symbols even if they're mandated by ABI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The linker creates them, so - yes.

@efriedma-quic
Copy link
Collaborator

You might want to file an lld bug if it's dropping markers?

@maksfb maksfb merged commit c9022a2 into llvm:main May 30, 2025
10 of 11 checks passed
@MaskRay
Copy link
Member

MaskRay commented May 30, 2025

@MaskRay, any idea why LLD will sometimes omit $d for long absolute thunks?

This should not happen.... Was there a binary manipulation that stripped the mapping symbols?

@maksfb
Copy link
Contributor Author

maksfb commented May 30, 2025

@MaskRay, any idea why LLD will sometimes omit $d for long absolute thunks?

This should not happen.... Was there a binary manipulation that stripped the mapping symbols?

No... This only happens for a few (4) thunks out of > 15K. As you can imagine, it's hard to repro on open-source binaries.

@maksfb
Copy link
Contributor Author

maksfb commented Jun 2, 2025

You might want to file an lld bug if it's dropping markers?

#142326

sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
The linker may omit data markers for long absolute veneers causing BOLT
to treat data as code. Detect such veneers and introduce data markers
artificially before BOLT's disassembler kicks in.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants