Skip to content

Commit

Permalink
[MemProf] Avoid incorrect ICP symtab canonicalization (#115419)
Browse files Browse the repository at this point in the history
ICP builds a symtab from the symbols in the module allowing mapping from
the VP metadata GUIDs to the Function. MemProf uses this same symtab
handling for its ICP during cloning. When symbols are added to the
symtab, the handling adds both a GUID computed from the function name,
or from the attached PGOFuncName metadata for locals, as well as a GUID
computed from the "canonicalized" name, which strips all "." suffixes
other than ".__uniq". This was originally meant to remove the ".llvm.*"
suffix added to promoted locals (done earlier in the ThinLTO backend).
In theory, it should no longer be needed as locals should have
PGOFuncName metadata.

However, this was causing a linker unsat, in code that used coroutines.
For an original coroutine function, there were several additional
functions created that had the same name, but different "." suffixes.
Therefore the canonical name for these additional functions had the same
GUID as that of the original function, leading to extra entries in the
symtab, and to selecting the wrong function for promotion. For regular
ICP this can happen, but is just a performance issue. However, for
memprof the promoted direct call calls a memprof clone, and because we
called the wrong function, in this case it didn't have a memprof clone
and we got a linker unsat.

We may be able to remove the canonical name handling for ICP in general,
but for now disable it for MemProf. At worst this could lead to not
finding a GUID in the symtab and not performing an ICP, so should be
conservatively correct.
  • Loading branch information
teresajohnson authored Nov 8, 2024
1 parent bfa3ffb commit 594e11c
Show file tree
Hide file tree
Showing 4 changed files with 35 additions and 7 deletions.
7 changes: 5 additions & 2 deletions llvm/include/llvm/ProfileData/InstrProf.h
Original file line number Diff line number Diff line change
Expand Up @@ -504,7 +504,8 @@ class InstrProfSymtab {
// name-set = {PGOFuncName} union {getCanonicalName(PGOFuncName)}
// - In MD5NameMap: <MD5Hash(name), name> for name in name-set
// - In MD5FuncMap: <MD5Hash(name), &F> for name in name-set
Error addFuncWithName(Function &F, StringRef PGOFuncName);
// The canonical name is only added if \c AddCanonical is true.
Error addFuncWithName(Function &F, StringRef PGOFuncName, bool AddCanonical);

// Add the vtable into the symbol table, by creating the following
// map entries:
Expand Down Expand Up @@ -560,7 +561,9 @@ class InstrProfSymtab {
/// decls from module \c M. This interface is used by transformation
/// passes such as indirect function call promotion. Variable \c InLTO
/// indicates if this is called from LTO optimization passes.
Error create(Module &M, bool InLTO = false);
/// A canonical name, removing non-__uniq suffixes, is added if
/// \c AddCanonical is true.
Error create(Module &M, bool InLTO = false, bool AddCanonical = true);

/// Create InstrProfSymtab from a set of names iteratable from
/// \p IterRange. This interface is used by IndexedProfReader.
Expand Down
12 changes: 8 additions & 4 deletions llvm/lib/ProfileData/InstrProf.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -483,16 +483,16 @@ GlobalVariable *createPGOFuncNameVar(Function &F, StringRef PGOFuncName) {
return createPGOFuncNameVar(*F.getParent(), F.getLinkage(), PGOFuncName);
}

Error InstrProfSymtab::create(Module &M, bool InLTO) {
Error InstrProfSymtab::create(Module &M, bool InLTO, bool AddCanonical) {
for (Function &F : M) {
// Function may not have a name: like using asm("") to overwrite the name.
// Ignore in this case.
if (!F.hasName())
continue;
if (Error E = addFuncWithName(F, getIRPGOFuncName(F, InLTO)))
if (Error E = addFuncWithName(F, getIRPGOFuncName(F, InLTO), AddCanonical))
return E;
// Also use getPGOFuncName() so that we can find records from older profiles
if (Error E = addFuncWithName(F, getPGOFuncName(F, InLTO)))
if (Error E = addFuncWithName(F, getPGOFuncName(F, InLTO), AddCanonical))
return E;
}

Expand Down Expand Up @@ -630,7 +630,8 @@ StringRef InstrProfSymtab::getCanonicalName(StringRef PGOName) {
return PGOName;
}

Error InstrProfSymtab::addFuncWithName(Function &F, StringRef PGOFuncName) {
Error InstrProfSymtab::addFuncWithName(Function &F, StringRef PGOFuncName,
bool AddCanonical) {
auto NameToGUIDMap = [&](StringRef Name) -> Error {
if (Error E = addFuncName(Name))
return E;
Expand All @@ -640,6 +641,9 @@ Error InstrProfSymtab::addFuncWithName(Function &F, StringRef PGOFuncName) {
if (Error E = NameToGUIDMap(PGOFuncName))
return E;

if (!AddCanonical)
return Error::success();

StringRef CanonicalFuncName = getCanonicalName(PGOFuncName);
if (CanonicalFuncName != PGOFuncName)
return NameToGUIDMap(CanonicalFuncName);
Expand Down
12 changes: 11 additions & 1 deletion llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4143,7 +4143,17 @@ bool MemProfContextDisambiguation::initializeIndirectCallPromotionInfo(
Module &M) {
ICallAnalysis = std::make_unique<ICallPromotionAnalysis>();
Symtab = std::make_unique<InstrProfSymtab>();
if (Error E = Symtab->create(M, /*InLTO=*/true)) {
// Don't add canonical names, to avoid multiple functions to the symtab
// when they both have the same root name with "." suffixes stripped.
// If we pick the wrong one then this could lead to incorrect ICP and calling
// a memprof clone that we don't actually create (resulting in linker unsats).
// What this means is that the GUID of the function (or its PGOFuncName
// metadata) *must* match that in the VP metadata to allow promotion.
// In practice this should not be a limitation, since local functions should
// have PGOFuncName metadata and global function names shouldn't need any
// special handling (they should not get the ".llvm.*" suffix that the
// canonicalization handling is attempting to strip).
if (Error E = Symtab->create(M, /*InLTO=*/true, /*AddCanonical=*/false)) {
std::string SymtabFailure = toString(std::move(E));
M.getContext().emitError("Failed to create symtab: " + SymtabFailure);
return false;
Expand Down
11 changes: 11 additions & 0 deletions llvm/test/ThinLTO/X86/memprof-icp.ll
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,7 @@
; RUN: -enable-memprof-indirect-call-support=true \
; RUN: -supports-hot-cold-new \
; RUN: -r=%t/foo.o,_Z3fooR2B0j,plx \
; RUN: -r=%t/foo.o,_ZN2B03barEj.abc,plx \
; RUN: -r=%t/foo.o,_Z3xyzR2B0j, \
; RUN: -r=%t/main.o,_Z3fooR2B0j, \
; RUN: -r=%t/main.o,_Znwm, \
Expand Down Expand Up @@ -121,6 +122,7 @@
; RUN: -supports-hot-cold-new \
; RUN: -thinlto-distributed-indexes \
; RUN: -r=%t/foo.o,_Z3fooR2B0j,plx \
; RUN: -r=%t/foo.o,_ZN2B03barEj.abc,plx \
; RUN: -r=%t/foo.o,_Z3xyzR2B0j, \
; RUN: -r=%t/main.o,_Z3fooR2B0j, \
; RUN: -r=%t/main.o,_Znwm, \
Expand Down Expand Up @@ -155,6 +157,7 @@
; RUN: -enable-memprof-indirect-call-support=false \
; RUN: -supports-hot-cold-new \
; RUN: -r=%t/foo.noicp.o,_Z3fooR2B0j,plx \
; RUN: -r=%t/foo.noicp.o,_ZN2B03barEj.abc,plx \
; RUN: -r=%t/foo.noicp.o,_Z3xyzR2B0j, \
; RUN: -r=%t/main.o,_Z3fooR2B0j, \
; RUN: -r=%t/main.o,_Znwm, \
Expand Down Expand Up @@ -261,6 +264,14 @@ target triple = "x86_64-unknown-linux-gnu"

declare i32 @_Z3xyzR2B0j(ptr %b)

;; Add a function that has the same name as one of the indirect callees, but
;; with a suffix, to make sure we don't incorrectly pick the wrong one as the
;; promoted target (which happens if we attempt to canonicalize the names
;; when building the ICP symtab).
define i32 @_ZN2B03barEj.abc(ptr %this, i32 %s) {
ret i32 0
}

define i32 @_Z3fooR2B0j(ptr %b) {
entry:
%0 = load ptr, ptr %b, align 8
Expand Down

0 comments on commit 594e11c

Please sign in to comment.