Skip to content

[MemProf] Attach value profile metadata to the IR using CalleeGuids. #141164

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

snehasish
Copy link
Contributor

Use the newly introduced CalleeGuids in CallSiteInfo to annotate the IR
where necessary with value profile metadata. Use a synthetic count of 1
since we don't have actual counts in the profile collection.

Use the newly introduced CalleeGuids in CallSiteInfo to annotate the IR
where necessary with value profile metadata. Use a synthetic count of 1
since we don't have actual counts in the profile collection.
@llvmbot llvmbot added PGO Profile Guided Optimizations llvm:transforms labels May 23, 2025
Copy link
Contributor Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@llvmbot
Copy link
Member

llvmbot commented May 23, 2025

@llvm/pr-subscribers-pgo

@llvm/pr-subscribers-llvm-transforms

Author: Snehasish Kumar (snehasish)

Changes

Use the newly introduced CalleeGuids in CallSiteInfo to annotate the IR
where necessary with value profile metadata. Use a synthetic count of 1
since we don't have actual counts in the profile collection.


Full diff: https://github.com/llvm/llvm-project/pull/141164.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Instrumentation/MemProfiler.cpp (+35)
  • (added) llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call_yaml.test (+37)
diff --git a/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp b/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp
index 6538311571529..29206781f1743 100644
--- a/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp
@@ -1209,6 +1209,41 @@ readMemprof(Module &M, Function &F, IndexedInstrProfReader *MemProfReader,
                                                InlinedCallStack)) {
           NumOfMemProfMatchedCallSites++;
           addCallsiteMetadata(I, InlinedCallStack, Ctx);
+
+          // Check if this is an indirect call and we have GUID information
+          // from CallSiteInfo to attach value profile metadata
+          if (!CalledFunction) {
+            // This is an indirect call, look for CallSites with matching stacks
+            // that have CalleeGuids information
+            for (auto &CS : MemProfRec->CallSites) {
+              if (!CS.CalleeGuids.empty() && stackFrameIncludesInlinedCallStack(
+                                                 CS.Frames, InlinedCallStack)) {
+                // Create value profile data from the CalleeGuids
+                SmallVector<InstrProfValueData, 4> VDs;
+                uint64_t TotalCount = 0;
+
+                for (GlobalValue::GUID CalleeGUID : CS.CalleeGuids) {
+                  // For MemProf, we don't have actual call counts, so we assign
+                  // a weight of 1 to each potential target. This provides the
+                  // information needed for indirect call promotion without
+                  // specific count data.
+                  InstrProfValueData VD;
+                  VD.Value = CalleeGUID;
+                  VD.Count = 1; // Weight for ICP decision making
+                  VDs.push_back(VD);
+                  TotalCount += VD.Count;
+                }
+
+                if (!VDs.empty()) {
+                  // Attach value profile metadata for indirect call targets
+                  annotateValueSite(M, I, VDs, TotalCount,
+                                    IPVK_IndirectCallTarget, VDs.size());
+                }
+                break;
+              }
+            }
+          }
+
           // Only need to find one with a matching call stack and add a single
           // callsite metadata.
 
diff --git a/llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call_yaml.test b/llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call_yaml.test
new file mode 100644
index 0000000000000..dcea8f14f5fe9
--- /dev/null
+++ b/llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call_yaml.test
@@ -0,0 +1,37 @@
+; Make sure that we can ingest the MemProf profile in YAML with CalleeGuids
+; and annotate an indirect call with value profile metadata.
+
+; RUN: split-file %s %t
+; RUN: llvm-profdata merge --memprof-version=4 %t/memprof_annotate_indirect_call_yaml.yaml -o %t/memprof_annotate_indirect_call_yaml.memprofdata
+; RUN: opt < %t/memprof_annotate_indirect_call_yaml.ll -passes='memprof-use<profile-filename=%t/memprof_annotate_indirect_call_yaml.memprofdata>' -S 2>&1 | FileCheck %s
+
+;--- memprof_annotate_indirect_call_yaml.yaml
+---
+HeapProfileRecords:
+  - GUID:            _Z3barv
+    AllocSites:      []
+    CallSites:
+      - Frames:
+          - { Function: _Z3barv, LineOffset: 3, Column: 5, IsInlineFrame: false }
+        CalleeGuids:   [0x123456789abcdef0, 0x23456789abcdef01]
+...
+;--- memprof_annotate_indirect_call_yaml.ll
+define dso_local void @_Z3barv() !dbg !4 {
+entry:
+  %fp = alloca ptr, align 8
+  %0 = load ptr, ptr %fp, align 8
+  call void %0(), !dbg !5
+; CHECK: call void %0(), {{.*}} !prof ![[PROF:[0-9]+]]
+  ret void
+}
+
+; CHECK: ![[PROF]] = !{!"VP", i32 0, i64 2, i64 1311768467463790320, i64 1, i64 2541551405711093505, i64 1}
+
+!llvm.module.flags = !{!2, !3}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1)
+!1 = !DIFile(filename: "t", directory: "/")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = distinct !DISubprogram(name: "bar", linkageName: "_Z3barv", scope: !1, file: !1, line: 1, unit: !0)
+!5 = !DILocation(line: 4, column: 5, scope: !4)

Copy link
Contributor Author

Still need to verify on a large benchmark that it addresses the root cause.

@snehasish snehasish marked this pull request as draft May 23, 2025 15:46
Copy link
Contributor Author

This isn't ready for review yet, converted to draft while I work on it some more. Apologies for the noise.

@snehasish snehasish force-pushed the users/snehasish/05-22-_memprof_attach_value_profile_metadata_to_the_ir_using_calleeguids branch from d55dcbf to 4abc503 Compare May 30, 2025 18:23
@snehasish snehasish force-pushed the users/snehasish/05-22-_memprof_attach_value_profile_metadata_to_the_ir_using_calleeguids branch from 4abc503 to ca1cc24 Compare May 30, 2025 18:53
@snehasish snehasish marked this pull request as ready for review May 30, 2025 18:55
Copy link
Contributor Author

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@teresajohnson Marked as ready for review. Ptal, thanks!

@snehasish snehasish requested a review from kazutakahirata May 30, 2025 19:02
if (I.getMetadata(LLVMContext::MD_prof)) {
uint64_t Unused;
auto ExistingVD =
getValueProfDataFromInst(I, IPVK_IndirectCallTarget, ~0U, Unused);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document constant parameter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -0,0 +1,89 @@
; RUN: split-file %s %t

; COM: Basic functionality with flag toggle
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use COM here and below? Just the leading ";" means it should be interpreted as a comment, although I typically use ";;" to set off the comment. Typically COM is only needed to prevent FileCheck from misinterpreting a reference to a directive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

%0 = load ptr, ptr %fp, align 8
; This call already has FDO value profile metadata - should NOT be modified
; CHECK-FDO: call void %0(), {{.*}} !prof !6
; CHECK-FDO-NOT: call void %0(), {{.*}} !prof !9
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the NOT check needed, given the preceding check for the call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

InstrProfValueData VD;
VD.Value = CalleeGUID;
// For MemProf, we don't have actual call counts, so we assign
// a weight of 1 to each potential target.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally this could be used to drive ICP too, but I'm not sure a weight of 1 will be hot enough. Should it either be a configurable value, or add a FIXME or TODO to adjust this to allow for ICP?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added TODO.

// Potential targets for indirect calls.
ArrayRef<GlobalValue::GUID> CalleeGuids;

// Only compare Frame contents.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe comment why you aren't using ArrayRef operator==. I guess that does elementwise equality and here you are looking for the underlying pointer being the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, added comment. It's also faster this way.

InlinedCallStack)) {
NumOfMemProfMatchedCallSites++;
addCallsiteMetadata(I, InlinedCallStack, Ctx);

// Try to attach indirect call metadata if possible.
if (!CalledFunction) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: remove braces around single line body

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor Author

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the review, ptal.

if (I.getMetadata(LLVMContext::MD_prof)) {
uint64_t Unused;
auto ExistingVD =
getValueProfDataFromInst(I, IPVK_IndirectCallTarget, ~0U, Unused);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

InlinedCallStack)) {
NumOfMemProfMatchedCallSites++;
addCallsiteMetadata(I, InlinedCallStack, Ctx);

// Try to attach indirect call metadata if possible.
if (!CalledFunction) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

InstrProfValueData VD;
VD.Value = CalleeGUID;
// For MemProf, we don't have actual call counts, so we assign
// a weight of 1 to each potential target.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added TODO.

// Potential targets for indirect calls.
ArrayRef<GlobalValue::GUID> CalleeGuids;

// Only compare Frame contents.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, added comment. It's also faster this way.

@@ -0,0 +1,89 @@
; RUN: split-file %s %t

; COM: Basic functionality with flag toggle
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

%0 = load ptr, ptr %fp, align 8
; This call already has FDO value profile metadata - should NOT be modified
; CHECK-FDO: call void %0(), {{.*}} !prof !6
; CHECK-FDO-NOT: call void %0(), {{.*}} !prof !9
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with one question below

@@ -978,8 +978,9 @@ static void addVPMetadata(Module &M, Instruction &I,

if (I.getMetadata(LLVMContext::MD_prof)) {
uint64_t Unused;
auto ExistingVD =
getValueProfDataFromInst(I, IPVK_IndirectCallTarget, ~0U, Unused);
// ~0U means get all available value profile data without any count limit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like overkill, why do we need them all given that we quit below if it is non-empty? At least for now why not just set this to 1? For eventual merging, it could be increased to one of the max values for this call in ICP. If you change this add a TODO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done.

Copy link
Contributor Author

snehasish commented May 31, 2025

Merge activity

  • May 31, 7:51 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • May 31, 7:53 PM UTC: @snehasish merged this pull request with Graphite.

@snehasish snehasish merged commit c7b421d into main May 31, 2025
11 checks passed
@snehasish snehasish deleted the users/snehasish/05-22-_memprof_attach_value_profile_metadata_to_the_ir_using_calleeguids branch May 31, 2025 19:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:transforms PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants