Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Instrumentation] Mark instrumented calls as implicit #106447

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

serge-sans-paille
Copy link
Collaborator

Make sure that instrumentation code doesn't interferes with with prologue endpoint computation when generating debug information.

Fix #54873

@llvmbot
Copy link
Collaborator

llvmbot commented Aug 28, 2024

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-compiler-rt-sanitizer

@llvm/pr-subscribers-debuginfo

Author: None (serge-sans-paille)

Changes

Make sure that instrumentation code doesn't interferes with with prologue endpoint computation when generating debug information.

Fix #54873


Full diff: https://github.com/llvm/llvm-project/pull/106447.diff

7 Files Affected:

  • (modified) llvm/include/llvm/Transforms/Instrumentation.h (+17-3)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+4-2)
  • (modified) llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp (+1-1)
  • (modified) llvm/test/Instrumentation/AddressSanitizer/missing_dbg.ll (+1-1)
  • (modified) llvm/test/Instrumentation/SanitizerCoverage/crit-edge-sancov.ll (+3-3)
  • (modified) llvm/test/Instrumentation/SanitizerCoverage/missing_dbg.ll (+2-2)
  • (modified) llvm/test/Instrumentation/ThreadSanitizer/missing_dbg.ll (+2-2)
diff --git a/llvm/include/llvm/Transforms/Instrumentation.h b/llvm/include/llvm/Transforms/Instrumentation.h
index 1a4824a806dc6e..7f7e107f41e776 100644
--- a/llvm/include/llvm/Transforms/Instrumentation.h
+++ b/llvm/include/llvm/Transforms/Instrumentation.h
@@ -194,15 +194,29 @@ static inline uint32_t scaleBranchCount(uint64_t Count, uint64_t Scale) {
 // if the enclosing function does.
 struct InstrumentationIRBuilder : IRBuilder<> {
   static void ensureDebugInfo(IRBuilder<> &IRB, const Function &F) {
-    if (IRB.getCurrentDebugLocation())
+    if (DebugLoc L = IRB.getCurrentDebugLocation()) {
+      if (!L.isImplicitCode()) {
+        L.setImplicitCode(true);
+        IRB.SetCurrentDebugLocation(L);
+      }
       return;
-    if (DISubprogram *SP = F.getSubprogram())
-      IRB.SetCurrentDebugLocation(DILocation::get(SP->getContext(), 0, 0, SP));
+    }
+    if (DISubprogram *SP = F.getSubprogram()) {
+      DebugLoc L = DILocation::get(SP->getContext(), 0, 0, SP, nullptr,
+                                   /*ImplicitCode=*/true);
+      IRB.SetCurrentDebugLocation(L);
+    }
   }
 
   explicit InstrumentationIRBuilder(Instruction *IP) : IRBuilder<>(IP) {
     ensureDebugInfo(*this, *IP->getFunction());
   }
+
+  void SetCurrentImplicitDebugLocation(DebugLoc L) {
+    if (!L.isImplicitCode())
+      L.setImplicitCode(true);
+    SetCurrentDebugLocation(L);
+  }
 };
 } // end namespace llvm
 
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index f111b4aea06f1b..457eab6ed2df33 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -2145,14 +2145,16 @@ static std::pair<DebugLoc, bool> findPrologueEndLoc(const MachineFunction *MF) {
   for (const auto &MBB : *MF) {
     for (const auto &MI : MBB) {
       if (!MI.isMetaInstruction()) {
-        if (!MI.getFlag(MachineInstr::FrameSetup) && MI.getDebugLoc()) {
+        if (!MI.getFlag(MachineInstr::FrameSetup) &&
+            !MI.getDebugLoc().isImplicitCode()) {
           // Scan forward to try to find a non-zero line number. The
           // prologue_end marks the first breakpoint in the function after the
           // frame setup, and a compiler-generated line 0 location is not a
           // meaningful breakpoint. If none is found, return the first
           // location after the frame setup.
-          if (MI.getDebugLoc().getLine())
+          if (MI.getDebugLoc().getLine()) {
             return std::make_pair(MI.getDebugLoc(), IsEmptyPrologue);
+          }
 
           LineZeroLoc = MI.getDebugLoc();
         }
diff --git a/llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index 6a89cee9aaf6cc..cfbc21b69faebe 100644
--- a/llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -958,7 +958,7 @@ void ModuleSanitizerCoverage::InjectCoverageAtBlock(Function &F, BasicBlock &BB,
 
   InstrumentationIRBuilder IRB(&*IP);
   if (EntryLoc)
-    IRB.SetCurrentDebugLocation(EntryLoc);
+    IRB.SetCurrentImplicitDebugLocation(EntryLoc);
   if (Options.TracePC) {
     IRB.CreateCall(SanCovTracePC)
         ->setCannotMerge(); // gets the PC using GET_CALLER_PC.
diff --git a/llvm/test/Instrumentation/AddressSanitizer/missing_dbg.ll b/llvm/test/Instrumentation/AddressSanitizer/missing_dbg.ll
index f6553322fd6787..5bb1c01e9d1231 100644
--- a/llvm/test/Instrumentation/AddressSanitizer/missing_dbg.ll
+++ b/llvm/test/Instrumentation/AddressSanitizer/missing_dbg.ll
@@ -34,4 +34,4 @@ entry:
 !4 = !DISubroutineType(types: !5)
 !5 = !{}
 
-; CHECK:       [[DBG]] = !DILocation(line: 0, scope: !4)
+; CHECK:       [[DBG]] = !DILocation(line: 0, scope: !4, isImplicitCode: true)
diff --git a/llvm/test/Instrumentation/SanitizerCoverage/crit-edge-sancov.ll b/llvm/test/Instrumentation/SanitizerCoverage/crit-edge-sancov.ll
index f42fa7139fd585..cbab3ac9fdf302 100644
--- a/llvm/test/Instrumentation/SanitizerCoverage/crit-edge-sancov.ll
+++ b/llvm/test/Instrumentation/SanitizerCoverage/crit-edge-sancov.ll
@@ -21,9 +21,9 @@ define void @update_shadow(i1 %c) !dbg !3 {
 ; CHECK-NEXT:    call void @__sanitizer_cov_trace_pc() #[[ATTR0]], !dbg [[DBG7:![0-9]+]]
 ; CHECK:       if.end22.i:
 ; CHECK-NEXT:    call void @__sanitizer_cov_trace_pc() #[[ATTR0]], !dbg [[DBG8:![0-9]+]]
-; CHECK:       [[DBG6]] = !DILocation(line: 192, scope: !3)
-; CHECK:       [[DBG7]] = !DILocation(line: 0, scope: !3)
-; CHECK:       [[DBG8]] = !DILocation(line: 129, column: 2, scope: !3)
+; CHECK:       [[DBG6]] = !DILocation(line: 192, scope: !3, isImplicitCode: true)
+; CHECK:       [[DBG7]] = !DILocation(line: 0, scope: !3, isImplicitCode: true)
+; CHECK:       [[DBG8]] = !DILocation(line: 129, column: 2, scope: !3, isImplicitCode: true)
 entry:
   br i1 %c, label %for.inc.i, label %if.end22.i
 
diff --git a/llvm/test/Instrumentation/SanitizerCoverage/missing_dbg.ll b/llvm/test/Instrumentation/SanitizerCoverage/missing_dbg.ll
index 21c6fcdb3a84b0..67af5663ceff64 100644
--- a/llvm/test/Instrumentation/SanitizerCoverage/missing_dbg.ll
+++ b/llvm/test/Instrumentation/SanitizerCoverage/missing_dbg.ll
@@ -46,5 +46,5 @@ entry:
 !6 = !DILocation(line: 192, scope: !3)
 !7 = !DILocation(line: 0, scope: !3)
 
-; CHECK:       [[DBG1]] = !DILocation(line: 192, scope: !3)
-; CHECK:       [[DBG2]] = !DILocation(line: 0, scope: !3)
+; CHECK:       [[DBG1]] = !DILocation(line: 192, scope: !3, isImplicitCode: true)
+; CHECK:       [[DBG2]] = !DILocation(line: 0, scope: !3, isImplicitCode: true)
diff --git a/llvm/test/Instrumentation/ThreadSanitizer/missing_dbg.ll b/llvm/test/Instrumentation/ThreadSanitizer/missing_dbg.ll
index a9d6bcd5191487..a5432fa97e2b96 100644
--- a/llvm/test/Instrumentation/ThreadSanitizer/missing_dbg.ll
+++ b/llvm/test/Instrumentation/ThreadSanitizer/missing_dbg.ll
@@ -51,5 +51,5 @@ entry:
 !5 = !{}
 !7 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 195, type: !4, scopeLine: 199, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition | DISPFlagOptimized, unit: !0)
 
-; CHECK:       [[DBG1]] = !DILocation(line: 0, scope: !3)
-; CHECK:       [[DBG2]] = !DILocation(line: 0, scope: !7)
+; CHECK:       [[DBG1]] = !DILocation(line: 0, scope: !3, isImplicitCode: true)
+; CHECK:       [[DBG2]] = !DILocation(line: 0, scope: !7, isImplicitCode: true)

Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hola; this looks like a good fix, however I wonder whether existing facilities for implicit code would be sufficient to achieve this. Nowadays, by convention, a DILocation with a line-number of zero is interpreted as being a compiler-generated source location and is ignored by various pieces of LLVM (and DWARF consumers too). Would using that achieve your aims instead?

Specifically, I gave this a quick spin:

diff --git a/llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index 6a89cee9aaf6..48d00cfb9671 100644
--- a/llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -949,7 +949,7 @@ void ModuleSanitizerCoverage::InjectCoverageAtBlock(Function &F, BasicBlock &BB,
   DebugLoc EntryLoc;
   if (IsEntryBB) {
     if (auto SP = F.getSubprogram())
-      EntryLoc = DILocation::get(SP->getContext(), SP->getScopeLine(), 0, SP);
+      EntryLoc = DILocation::get(SP->getContext(), 0, 0, SP);
     // Keep static allocas and llvm.localescape calls in the entry block.  Even
     // if we aren't splitting the block, it's nice for allocas to be before
     // calls.

and the reproducer from the linked ticket/issue shifted the prologue_end flag back to the desired position. As a side-effect the instrumentation code then can't be stepped on (a feature, in my view).

Disclosure: I'd prefer to avoid more users of isImplicitCode, as it'd be quite nice to repurpose it in the future, and it's little-used right now.

@serge-sans-paille
Copy link
Collaborator Author

The change dates back to 201733b, and I agree with the simplicity of your approach, let me update the diff.

…uginfo

Doing so makes sure instrumentation code doesn't interferes with with
prologue endpoint computation when generating debug information, as
prologue_endpoint computation skips over such instructions.

Fix llvm#54873
@serge-sans-paille
Copy link
Collaborator Author

@jmorse : updated!

Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM with a couple of nits; it's not clear to me whether using line-zero on coverage-instrumentation calls breaks some convention or developer expectation, as coverage-instrumentation isn't an area I'm familiar with.

Comment on lines 17 to 18
; Test that __sanitizer_cov_trace_pc_guard call has !dbg pointing to the opening { of A::f().
; CHECK: call void @__sanitizer_cov_trace_pc_guard(ptr{{.*}}) #{{.*}}, !dbg [[A:!.*]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment will want updating as it's now not true I guess; it's unclear to me whether this indicates there are expectations that the coverage call gets a "real" source location?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent of the original commit (493df13) is unclear :-/

@@ -949,7 +949,7 @@ void ModuleSanitizerCoverage::InjectCoverageAtBlock(Function &F, BasicBlock &BB,
DebugLoc EntryLoc;
if (IsEntryBB) {
if (auto SP = F.getSubprogram())
EntryLoc = DILocation::get(SP->getContext(), SP->getScopeLine(), 0, SP);
EntryLoc = DILocation::get(SP->getContext(), 0, 0, SP);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO this warrants a comment along the lines of "Use a compile-generated source-location for all coverage calls", to aid future readers

@khuey
Copy link
Contributor

khuey commented Sep 6, 2024

This needs to be tested against actual users of this API before it's merged. The example given in the docs (https://clang.llvm.org/docs/SanitizerCoverage.html)

extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
  if (!*guard) return;  // Duplicate the guard check.
  // If you set *guard to 0 this code will not be called again for this edge.
  // Now you can get the PC and do whatever you want:
  //   store it somewhere or symbolize it and print right away.
  // The values of `*guard` are as you set them in
  // __sanitizer_cov_trace_pc_guard_init and so you can make them consecutive
  // and use them to dereference an array or a bit vector.
  void *PC = __builtin_return_address(0);
  char PcDescr[1024];
  // This function is a part of the sanitizer run-time.
  // To use it, link with AddressSanitizer or other sanitizer.
  __sanitizer_symbolize_pc(PC, "%p %F %L", PcDescr, sizeof(PcDescr));
  printf("guard: %p %x PC %s\n", guard, *guard, PcDescr);
}

Seems very likely to break (__sanitizer_symbolize_pc will end up finding the line 0 location with the instrumentation sequence in #54873).

@vitalybuka
Copy link
Collaborator

This needs to be tested against actual users of this API before it's merged. The example given in the docs (https://clang.llvm.org/docs/SanitizerCoverage.html)

extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
  if (!*guard) return;  // Duplicate the guard check.
  // If you set *guard to 0 this code will not be called again for this edge.
  // Now you can get the PC and do whatever you want:
  //   store it somewhere or symbolize it and print right away.
  // The values of `*guard` are as you set them in
  // __sanitizer_cov_trace_pc_guard_init and so you can make them consecutive
  // and use them to dereference an array or a bit vector.
  void *PC = __builtin_return_address(0);
  char PcDescr[1024];
  // This function is a part of the sanitizer run-time.
  // To use it, link with AddressSanitizer or other sanitizer.
  __sanitizer_symbolize_pc(PC, "%p %F %L", PcDescr, sizeof(PcDescr));
  printf("guard: %p %x PC %s\n", guard, *guard, PcDescr);
}

Seems very likely to break (__sanitizer_symbolize_pc will end up finding the line 0 location with the instrumentation sequence in #54873).

I wondering if https://clang.llvm.org/docs/SanitizerCoverage.html#id16 will still work?

Copy link
Collaborator

@vitalybuka vitalybuka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing from my pending review, as there are open concerns.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

-fsanitize-coverage produces incorrect debug info at function entry points
5 participants