[MemProf] Strip callsite metadata when inlining an unprofiled callsite (#110998)

We weren't correctly flagging inlined callee functions that have callsite
metadata but no memprof metadata. As a result, the callsite metadata was
not stripped when such a function was inlined into a callsite that didn't
itself have callsite metadata.

In practice, this meant that we went into the LTO link with many more
calls than necessary having callsite metadata / summary records, which
in turn made the graph larger than necessary.

Fixing this oversight resulted in huge reductions in the thin link of a
large target:
99% fewer duplicated context ids (recall we have to duplicate when
callsites containing the same stack ids are in different functions)
71% fewer graph edges
17% fewer graph nodes
13% fewer functions cloned
44% smaller peak memory
47% less time
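
The flagging fix described in this message can be illustrated with a small standalone model. This is only a sketch, not LLVM code: `ToyCall`, `containsMemProfMetadata`, and `inlineIntoUnprofiledCallsite` are hypothetical names invented for illustration, and the `IncludeCallsite` parameter is an assumption used to switch between the behavior before and after this change. It assumes that stripping is skipped entirely when the flag is false and the call being inlined has no callsite metadata.

```cpp
#include <vector>

// Hypothetical stand-in for a call instruction in the callee body. It only
// records which MemProf-related metadata kinds are attached to the call.
struct ToyCall {
  bool HasMemProf;  // models !memprof on the call
  bool HasCallsite; // models !callsite on the call
};

// Models the hasMemProfMetadata flag computed while cloning the callee body.
// IncludeCallsite == false models the old behavior (only !memprof counted);
// IncludeCallsite == true models the behavior after this change.
bool containsMemProfMetadata(const std::vector<ToyCall> &Body,
                             bool IncludeCallsite) {
  bool Flag = false;
  for (const ToyCall &C : Body) {
    Flag |= C.HasMemProf;
    if (IncludeCallsite)
      Flag |= C.HasCallsite;
  }
  return Flag;
}

// Models inlining the callee body into a call that itself has no !callsite
// metadata. The metadata is stripped from the inlined calls only when the
// flag says there is something to strip (assumed early exit otherwise).
std::vector<ToyCall> inlineIntoUnprofiledCallsite(std::vector<ToyCall> Body,
                                                  bool IncludeCallsite) {
  if (!containsMemProfMetadata(Body, IncludeCallsite))
    return Body; // Flag not set: inlined calls keep their metadata.
  for (ToyCall &C : Body) {
    C.HasMemProf = false;
    C.HasCallsite = false;
  }
  return Body;
}
```

In this model, a callee whose only call carries `!callsite` but not `!memprof` keeps that metadata after inlining when `IncludeCallsite` is false, and loses it when `IncludeCallsite` is true, which mirrors the difference the commit message describes.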
teresajohnson authored Oct 3, 2024
1 parent dce5bf8 commit 79b32bc
Showing 2 changed files with 10 additions and 0 deletions.
2 changes: 2 additions & 0 deletions llvm/lib/Transforms/Utils/CloneFunction.cpp
@@ -70,6 +70,7 @@ BasicBlock *llvm::CloneBasicBlock(const BasicBlock *BB, ValueToValueMapTy &VMap,
     if (isa<CallInst>(I) && !I.isDebugOrPseudoInst()) {
       hasCalls = true;
       hasMemProfMetadata |= I.hasMetadata(LLVMContext::MD_memprof);
+      hasMemProfMetadata |= I.hasMetadata(LLVMContext::MD_callsite);
     }
     if (const AllocaInst *AI = dyn_cast<AllocaInst>(&I)) {
       if (!AI->isStaticAlloca()) {
@@ -556,6 +557,7 @@ void PruningFunctionCloner::CloneBlock(
     if (isa<CallInst>(II) && !II->isDebugOrPseudoInst()) {
       hasCalls = true;
       hasMemProfMetadata |= II->hasMetadata(LLVMContext::MD_memprof);
+      hasMemProfMetadata |= II->hasMetadata(LLVMContext::MD_callsite);
     }
 
     CloneDbgRecordsToHere(NewInst, II);
8 changes: 8 additions & 0 deletions llvm/test/Transforms/Inline/memprof_inline2.ll
@@ -90,10 +90,18 @@ entry:
 ; CHECK-LABEL: define dso_local noundef ptr @notprofiled
 define dso_local noundef ptr @notprofiled() #0 !dbg !66 {
 entry:
+  ;; When foo is inlined, both the memprof and callsite metadata should be
+  ;; stripped from the inlined call to new, as there is no callsite metadata on
+  ;; the call.
   ; CHECK: call {{.*}} @_Znam
   ; CHECK-NOT: !memprof
   ; CHECK-NOT: !callsite
   %call = call noundef ptr @_Z3foov(), !dbg !67
+  ;; When baz is inlined, the callsite metadata should be stripped from the
+  ;; inlined call to foo2, as there is no callsite metadata on the call.
+  ; CHECK: call {{.*}} @_Z4foo2v
+  ; CHECK-NOT: !callsite
+  %call2 = call noundef ptr @_Z3bazv()
   ; CHECK-NEXT: ret
   ret ptr %call, !dbg !68
 }