-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[MemProf] Attach value profile metadata to the IR using CalleeGuids. #141164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MemProf] Attach value profile metadata to the IR using CalleeGuids. #141164
Conversation
Use the newly introduced CalleeGuids in CallSiteInfo to annotate the IR where necessary with value profile metadata. Use a synthetic count of 1 since we don't have actual counts in the profile collection.
This stack of pull requests is managed by Graphite. Learn more about stacking. |
@llvm/pr-subscribers-pgo @llvm/pr-subscribers-llvm-transforms Author: Snehasish Kumar (snehasish) ChangesUse the newly introduced CalleeGuids in CallSiteInfo to annotate the IR Full diff: https://github.com/llvm/llvm-project/pull/141164.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp b/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp
index 6538311571529..29206781f1743 100644
--- a/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp
@@ -1209,6 +1209,41 @@ readMemprof(Module &M, Function &F, IndexedInstrProfReader *MemProfReader,
InlinedCallStack)) {
NumOfMemProfMatchedCallSites++;
addCallsiteMetadata(I, InlinedCallStack, Ctx);
+
+ // Check if this is an indirect call and we have GUID information
+ // from CallSiteInfo to attach value profile metadata
+ if (!CalledFunction) {
+ // This is an indirect call, look for CallSites with matching stacks
+ // that have CalleeGuids information
+ for (auto &CS : MemProfRec->CallSites) {
+ if (!CS.CalleeGuids.empty() && stackFrameIncludesInlinedCallStack(
+ CS.Frames, InlinedCallStack)) {
+ // Create value profile data from the CalleeGuids
+ SmallVector<InstrProfValueData, 4> VDs;
+ uint64_t TotalCount = 0;
+
+ for (GlobalValue::GUID CalleeGUID : CS.CalleeGuids) {
+ // For MemProf, we don't have actual call counts, so we assign
+ // a weight of 1 to each potential target. This provides the
+ // information needed for indirect call promotion without
+ // specific count data.
+ InstrProfValueData VD;
+ VD.Value = CalleeGUID;
+ VD.Count = 1; // Weight for ICP decision making
+ VDs.push_back(VD);
+ TotalCount += VD.Count;
+ }
+
+ if (!VDs.empty()) {
+ // Attach value profile metadata for indirect call targets
+ annotateValueSite(M, I, VDs, TotalCount,
+ IPVK_IndirectCallTarget, VDs.size());
+ }
+ break;
+ }
+ }
+ }
+
// Only need to find one with a matching call stack and add a single
// callsite metadata.
diff --git a/llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call_yaml.test b/llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call_yaml.test
new file mode 100644
index 0000000000000..dcea8f14f5fe9
--- /dev/null
+++ b/llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call_yaml.test
@@ -0,0 +1,37 @@
+; Make sure that we can ingest the MemProf profile in YAML with CalleeGuids
+; and annotate an indirect call with value profile metadata.
+
+; RUN: split-file %s %t
+; RUN: llvm-profdata merge --memprof-version=4 %t/memprof_annotate_indirect_call_yaml.yaml -o %t/memprof_annotate_indirect_call_yaml.memprofdata
+; RUN: opt < %t/memprof_annotate_indirect_call_yaml.ll -passes='memprof-use<profile-filename=%t/memprof_annotate_indirect_call_yaml.memprofdata>' -S 2>&1 | FileCheck %s
+
+;--- memprof_annotate_indirect_call_yaml.yaml
+---
+HeapProfileRecords:
+ - GUID: _Z3barv
+ AllocSites: []
+ CallSites:
+ - Frames:
+ - { Function: _Z3barv, LineOffset: 3, Column: 5, IsInlineFrame: false }
+ CalleeGuids: [0x123456789abcdef0, 0x23456789abcdef01]
+...
+;--- memprof_annotate_indirect_call_yaml.ll
+define dso_local void @_Z3barv() !dbg !4 {
+entry:
+ %fp = alloca ptr, align 8
+ %0 = load ptr, ptr %fp, align 8
+ call void %0(), !dbg !5
+; CHECK: call void %0(), {{.*}} !prof ![[PROF:[0-9]+]]
+ ret void
+}
+
+; CHECK: ![[PROF]] = !{!"VP", i32 0, i64 2, i64 1311768467463790320, i64 1, i64 2541551405711093505, i64 1}
+
+!llvm.module.flags = !{!2, !3}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1)
+!1 = !DIFile(filename: "t", directory: "/")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = distinct !DISubprogram(name: "bar", linkageName: "_Z3barv", scope: !1, file: !1, line: 1, unit: !0)
+!5 = !DILocation(line: 4, column: 5, scope: !4)
|
Still need to verify on a large benchmark that it addresses the root cause. |
This isn't ready for review yet, converted to draft while I work on it some more. Apologies for the noise. |
d55dcbf
to
4abc503
Compare
4abc503
to
ca1cc24
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@teresajohnson Marked as ready for review. Ptal, thanks!
if (I.getMetadata(LLVMContext::MD_prof)) { | ||
uint64_t Unused; | ||
auto ExistingVD = | ||
getValueProfDataFromInst(I, IPVK_IndirectCallTarget, ~0U, Unused); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document constant parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -0,0 +1,89 @@ | |||
; RUN: split-file %s %t | |||
|
|||
; COM: Basic functionality with flag toggle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use COM here and below? Just the leading ";" means it should be interpreted as a comment, although I typically use ";;" to set off the comment. Typically COM is only needed to prevent FileCheck from misinterpreting a reference to a directive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
%0 = load ptr, ptr %fp, align 8 | ||
; This call already has FDO value profile metadata - should NOT be modified | ||
; CHECK-FDO: call void %0(), {{.*}} !prof !6 | ||
; CHECK-FDO-NOT: call void %0(), {{.*}} !prof !9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the NOT check needed, given the preceding check for the call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
InstrProfValueData VD; | ||
VD.Value = CalleeGUID; | ||
// For MemProf, we don't have actual call counts, so we assign | ||
// a weight of 1 to each potential target. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally this could be used to drive ICP too, but I'm not sure a weight of 1 will be hot enough. Should it either be a configurable value, or add a FIXME or TODO to adjust this to allow for ICP?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added TODO.
// Potential targets for indirect calls. | ||
ArrayRef<GlobalValue::GUID> CalleeGuids; | ||
|
||
// Only compare Frame contents. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe comment why you aren't using ArrayRef operator==. I guess that does elementwise equality and here you are looking for the underlying pointer being the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, added comment. It's also faster this way.
InlinedCallStack)) { | ||
NumOfMemProfMatchedCallSites++; | ||
addCallsiteMetadata(I, InlinedCallStack, Ctx); | ||
|
||
// Try to attach indirect call metadata if possible. | ||
if (!CalledFunction) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: remove braces around single line body
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the review, ptal.
if (I.getMetadata(LLVMContext::MD_prof)) { | ||
uint64_t Unused; | ||
auto ExistingVD = | ||
getValueProfDataFromInst(I, IPVK_IndirectCallTarget, ~0U, Unused); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
InlinedCallStack)) { | ||
NumOfMemProfMatchedCallSites++; | ||
addCallsiteMetadata(I, InlinedCallStack, Ctx); | ||
|
||
// Try to attach indirect call metadata if possible. | ||
if (!CalledFunction) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
InstrProfValueData VD; | ||
VD.Value = CalleeGUID; | ||
// For MemProf, we don't have actual call counts, so we assign | ||
// a weight of 1 to each potential target. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added TODO.
// Potential targets for indirect calls. | ||
ArrayRef<GlobalValue::GUID> CalleeGuids; | ||
|
||
// Only compare Frame contents. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, added comment. It's also faster this way.
@@ -0,0 +1,89 @@ | |||
; RUN: split-file %s %t | |||
|
|||
; COM: Basic functionality with flag toggle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
%0 = load ptr, ptr %fp, align 8 | ||
; This call already has FDO value profile metadata - should NOT be modified | ||
; CHECK-FDO: call void %0(), {{.*}} !prof !6 | ||
; CHECK-FDO-NOT: call void %0(), {{.*}} !prof !9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm with one question below
@@ -978,8 +978,9 @@ static void addVPMetadata(Module &M, Instruction &I, | |||
|
|||
if (I.getMetadata(LLVMContext::MD_prof)) { | |||
uint64_t Unused; | |||
auto ExistingVD = | |||
getValueProfDataFromInst(I, IPVK_IndirectCallTarget, ~0U, Unused); | |||
// ~0U means get all available value profile data without any count limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like overkill, why do we need them all given that we quit below if it is non-empty? At least for now why not just set this to 1? For eventual merging, it could be increased to one of the max values for this call in ICP. If you change this add a TODO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, done.
Merge activity
|
Use the newly introduced CalleeGuids in CallSiteInfo to annotate the IR
where necessary with value profile metadata. Use a synthetic count of 1
since we don't have actual counts in the profile collection.