Skip to content

[DLCov 2/5] Implement DebugLoc coverage tracking #107279

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 24, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 16 additions & 0 deletions clang/lib/CodeGen/BackendUtil.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -961,6 +961,22 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
Debugify.setOrigDIVerifyBugsReportFilePath(
CodeGenOpts.DIBugsReportFilePath);
Debugify.registerCallbacks(PIC, MAM);

#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
// If we're using debug location coverage tracking, mark all the
// instructions coming out of the frontend without a DebugLoc as being
// compiler-generated, to prevent both those instructions and new
// instructions that inherit their location from being treated as
// incorrectly empty locations.
for (Function &F : *TheModule) {
if (!F.getSubprogram())
continue;
for (BasicBlock &BB : F)
for (Instruction &I : BB)
if (!I.getDebugLoc())
I.setDebugLoc(DebugLoc::getCompilerGenerated());
}
#endif
}
// Attempt to load pass plugins and register their callbacks with PB.
for (auto &PluginFN : CodeGenOpts.PassPlugins) {
Expand Down
41 changes: 41 additions & 0 deletions llvm/docs/HowToUpdateDebugInfo.rst
Original file line number Diff line number Diff line change
Expand Up @@ -169,6 +169,47 @@ See the discussion in the section about
:ref:`merging locations<WhenToMergeLocation>` for examples of when the rule for
dropping locations applies.

.. _NewInstLocations:

Setting locations for new instructions
--------------------------------------

Whenever a new instruction is created and there is no suitable location for that
instruction, that instruction should be annotated accordingly. There are a set
of special ``DebugLoc`` values that can be set on an instruction to annotate the
reason that it does not have a valid location. These are as follows:

* ``DebugLoc::getCompilerGenerated()``: This indicates that the instruction is a
compiler-generated instruction, i.e. it is not associated with any user source
code.

* ``DebugLoc::getDropped()``: This indicates that the instruction has
intentionally had its source location removed, according to the rules for
:ref:`dropping locations<WhenToDropLocation>`; this is set automatically by
``Instruction::dropLocation()``.

* ``DebugLoc::getUnknown()``: This indicates that the instruction does not have
a known or currently knowable source location, e.g. that it is infeasible to
determine the correct source location, or that the source location is
ambiguous in a way that LLVM cannot currently represent.

* ``DebugLoc::getTemporary()``: This is used for instructions that we don't
expect to be emitted (e.g. ``UnreachableInst``), and so should not need a
valid location; if we ever try to emit a temporary location into an object/asm
file, this indicates that something has gone wrong.

Where applicable, these should be used instead of leaving an instruction without
an assigned location or explicitly setting the location as ``DebugLoc()``.
Ordinarily these special locations are identical to an absent location, but LLVM
built with coverage-tracking
(``-DLLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING="COVERAGE"``) will keep track of
these special locations in order to detect unintentionally-missing locations;
for this reason, the most important rule is to *not* apply any of these if it
isn't clear which, if any, is appropriate - an absent location can be detected
and fixed, while an incorrectly annotated instruction is much harder to detect.
On the other hand, if any of these clearly apply, then they should be used to
prevent false positives from being flagged up.

Rules for updating debug values
===============================

Expand Down
96 changes: 95 additions & 1 deletion llvm/include/llvm/IR/DebugLoc.h
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
#ifndef LLVM_IR_DEBUGLOC_H
#define LLVM_IR_DEBUGLOC_H

#include "llvm/Config/config.h"
#include "llvm/IR/TrackingMDRef.h"
#include "llvm/Support/DataTypes.h"

Expand All @@ -22,6 +23,73 @@ namespace llvm {
class LLVMContext;
class raw_ostream;
class DILocation;
class Function;

#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
// Used to represent different "kinds" of DebugLoc, expressing that the
// instruction it is part of is either normal and should contain a valid
// DILocation, or otherwise describing the reason why the instruction does
// not contain a valid DILocation.
enum class DebugLocKind : uint8_t {
// The instruction is expected to contain a valid DILocation.
Normal,
// The instruction is compiler-generated, i.e. it is not associated with any
// line in the original source.
CompilerGenerated,
// The instruction has intentionally had its source location removed,
// typically because it was moved outside of its original control-flow and
// presenting the prior source location would be misleading for debuggers
// or profilers.
Dropped,
// The instruction does not have a known or currently knowable source
// location, e.g. the attribution is ambiguous in a way that can't be
// represented, or determining the correct location is complicated and
// requires future developer effort.
Unknown,
// DebugLoc is attached to an instruction that we don't expect to be
// emitted, and so can omit a valid DILocation; we don't expect to ever try
// and emit these into the line table, and trying to do so is a sign that
// something has gone wrong (most likely a DebugLoc leaking from a transient
// compiler-generated instruction).
Temporary
};

// Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify
// to ignore intentionally-empty DebugLocs.
class DILocAndCoverageTracking : public TrackingMDNodeRef {
public:
DebugLocKind Kind;
// Default constructor for empty DebugLocs.
DILocAndCoverageTracking()
: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {}
// Valid or nullptr MDNode*, normal DebugLocKind.
DILocAndCoverageTracking(const MDNode *Loc)
: TrackingMDNodeRef(const_cast<MDNode *>(Loc)),
Kind(DebugLocKind::Normal) {}
DILocAndCoverageTracking(const DILocation *Loc);
// Explicit DebugLocKind, which always means a nullptr MDNode*.
DILocAndCoverageTracking(DebugLocKind Kind)
: TrackingMDNodeRef(nullptr), Kind(Kind) {}
};
template <> struct simplify_type<DILocAndCoverageTracking> {
using SimpleType = MDNode *;

static MDNode *getSimplifiedValue(DILocAndCoverageTracking &MD) {
return MD.get();
}
};
template <> struct simplify_type<const DILocAndCoverageTracking> {
using SimpleType = MDNode *;

static MDNode *getSimplifiedValue(const DILocAndCoverageTracking &MD) {
return MD.get();
}
};

using DebugLocTrackingRef = DILocAndCoverageTracking;
#else
using DebugLocTrackingRef = TrackingMDNodeRef;
#endif // ENABLE_DEBUGLOC_COVERAGE_TRACKING

/// A debug info location.
///
Expand All @@ -31,7 +99,8 @@ namespace llvm {
/// To avoid extra includes, \a DebugLoc doubles the \a DILocation API with a
/// one based on relatively opaque \a MDNode pointers.
class DebugLoc {
TrackingMDNodeRef Loc;

DebugLocTrackingRef Loc;

public:
DebugLoc() = default;
Expand All @@ -47,6 +116,31 @@ namespace llvm {
/// IR.
explicit DebugLoc(const MDNode *N);

#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
DebugLoc(DebugLocKind Kind) : Loc(Kind) {}
DebugLocKind getKind() const { return Loc.Kind; }
#endif

#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
static inline DebugLoc getTemporary() {
return DebugLoc(DebugLocKind::Temporary);
}
static inline DebugLoc getUnknown() {
return DebugLoc(DebugLocKind::Unknown);
}
static inline DebugLoc getCompilerGenerated() {
return DebugLoc(DebugLocKind::CompilerGenerated);
}
static inline DebugLoc getDropped() {
return DebugLoc(DebugLocKind::Dropped);
}
#else
static inline DebugLoc getTemporary() { return DebugLoc(); }
static inline DebugLoc getUnknown() { return DebugLoc(); }
static inline DebugLoc getCompilerGenerated() { return DebugLoc(); }
static inline DebugLoc getDropped() { return DebugLoc(); }
#endif // ENABLE_DEBUGLOC_COVERAGE_TRACKING

/// Get the underlying \a DILocation.
///
/// \pre !*this or \c isa<DILocation>(getAsMDNode()).
Expand Down
4 changes: 4 additions & 0 deletions llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2097,6 +2097,10 @@ void DwarfDebug::beginInstruction(const MachineInstr *MI) {
}

if (!DL) {
// FIXME: We could assert that `DL.getKind() != DebugLocKind::Temporary`
// here, or otherwise record any temporary DebugLocs seen to ensure that
// transient compiler-generated instructions aren't leaking their DLs to
// other instructions.
Comment on lines +2100 to +2103
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yes, reporting such cases somehow does seem valuable... though unclear to me which way is best, so perhaps best left as future work for now.

// We have an unspecified location, which might want to be line 0.
// If we have already emitted a line-0 record, don't repeat it.
if (LastAsmLine == 0)
Expand Down
8 changes: 5 additions & 3 deletions llvm/lib/IR/DebugInfo.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -987,8 +987,10 @@ void Instruction::updateLocationAfterHoist() { dropLocation(); }

void Instruction::dropLocation() {
const DebugLoc &DL = getDebugLoc();
if (!DL)
if (!DL) {
setDebugLoc(DebugLoc::getDropped());
return;
}

// If this isn't a call, drop the location to allow a location from a
// preceding instruction to propagate.
Expand All @@ -1000,7 +1002,7 @@ void Instruction::dropLocation() {
}

if (!MayLowerToCall) {
setDebugLoc(DebugLoc());
setDebugLoc(DebugLoc::getDropped());
return;
}

Expand All @@ -1019,7 +1021,7 @@ void Instruction::dropLocation() {
//
// One alternative is to set a line 0 location with the existing scope and
// inlinedAt info. The location might be sensitive to when inlining occurs.
setDebugLoc(DebugLoc());
setDebugLoc(DebugLoc::getDropped());
}

//===----------------------------------------------------------------------===//
Expand Down
6 changes: 6 additions & 0 deletions llvm/lib/IR/DebugLoc.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,12 @@
#include "llvm/IR/DebugInfo.h"
using namespace llvm;

#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
DILocAndCoverageTracking::DILocAndCoverageTracking(const DILocation *L)
: TrackingMDNodeRef(const_cast<DILocation *>(L)),
Kind(DebugLocKind::Normal) {}
#endif // ENABLE_DEBUGLOC_COVERAGE_TRACKING

//===----------------------------------------------------------------------===//
// DebugLoc Implementation
//===----------------------------------------------------------------------===//
Expand Down
19 changes: 12 additions & 7 deletions llvm/lib/Transforms/Utils/Debugify.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -290,6 +290,16 @@ bool llvm::stripDebugifyMetadata(Module &M) {
return Changed;
}

bool hasLoc(const Instruction &I) {
const DILocation *Loc = I.getDebugLoc().get();
#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
DebugLocKind Kind = I.getDebugLoc().getKind();
return Loc || Kind != DebugLocKind::Normal;
#else
return Loc;
#endif
}

bool llvm::collectDebugInfoMetadata(Module &M,
iterator_range<Module::iterator> Functions,
DebugInfoPerPass &DebugInfoBeforePass,
Expand Down Expand Up @@ -362,9 +372,7 @@ bool llvm::collectDebugInfoMetadata(Module &M,
LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n');
DebugInfoBeforePass.InstToDelete.insert({&I, &I});

const DILocation *Loc = I.getDebugLoc().get();
bool HasLoc = Loc != nullptr;
DebugInfoBeforePass.DILocations.insert({&I, HasLoc});
DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)});
}
}
}
Expand Down Expand Up @@ -607,10 +615,7 @@ bool llvm::checkDebugInfoMetadata(Module &M,

LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n');

const DILocation *Loc = I.getDebugLoc().get();
bool HasLoc = Loc != nullptr;

DebugInfoAfterPass.DILocations.insert({&I, HasLoc});
DebugInfoAfterPass.DILocations.insert({&I, hasLoc(I)});
}
}
}
Expand Down
Loading