-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[AIX] PGO codegen changes for function-sections. #139761
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[AIX] PGO codegen changes for function-sections. #139761
Conversation
The existing PGO implementation creates a dependence from the profiling data symbol to the function descriptors for every function defined within the compilation unit. This forces the binder to include either all or no functions from an object file when linking a profile-generate enabled object file, which can break links that require the aggressive "include only whats used" link semantics that XCOFF provides. To remedy this we break up the profiling data symbol into per-function CSECTs by using the rename directive to emit the profd symbol for a single function to a unique CSECT, then rename it to the common name afterwards. We also have to split the counters into per-function CSECTs as they need to have a .ref relocation from the counter section which is referenced by the functions, to the profiling-data symbol which is otherwise unreferenced.
@llvm/pr-subscribers-lto @llvm/pr-subscribers-backend-powerpc Author: Sean Fertile (mandlebug) ChangesThe existing PGO implementation creates a dependence from the profiling data symbol to the function descriptors for every function defined within the compilation unit. This forces the binder to include either all or no functions from an object file when linking a profile-generate enabled object file, which can break links that require the aggressive "include only whats used" link semantics that XCOFF provides. To remedy this we break up the profiling data symbol into per-function CSECTs by using the rename directive to emit the profd symbol for a single function to a unique CSECT, then rename it to the common name afterwards. We also have to split the counters into per-function CSECTs as they need to have a .ref relocation from the counter section which is referenced by the functions, to the profiling-data symbol which is otherwise unreferenced. Patch is 25.22 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139761.diff 6 Files Affected:
diff --git a/compiler-rt/test/profile/AIX/needs-garbage-collection.c b/compiler-rt/test/profile/AIX/needs-garbage-collection.c
new file mode 100644
index 0000000000000..e27467be0f05c
--- /dev/null
+++ b/compiler-rt/test/profile/AIX/needs-garbage-collection.c
@@ -0,0 +1,49 @@
+// RUN: split-file %s %t
+// RUN: cd %t
+// RUN: %clang_pgogen -ffunction-sections main.c -c -o main.o
+// RUN: %clang_pgogen -ffunction-sections needs_gc.c -c -o needs_gc.o
+// RUN: %clang_pgogen main.o needs_gc.o -o needs_gc.out
+// RUN: env LLVM_PROFILE_FILE=needs_gc.profraw %run ./needs_gc.out > /dev/null
+// RUN: llvm-profdata show --all-functions needs_gc.profraw | FileCheck %s
+
+// CHECK-DAG: main
+// CHECK-DAG: baz
+// CHECK-DAG: get_message
+
+
+//--- main.c
+const char* get_message(void) {
+ return "Hello World!";
+}
+
+int foo(void);
+double bar(void);
+const char* baz();
+
+int printf(const char*, ...);
+
+int main(void) {
+ printf("%s\n", baz());
+}
+
+//--- needs_gc.c
+extern int not_def_one(const char *);
+extern double not_def_two(void);
+
+extern const char* get_message(void);
+
+char buf[512];
+int foo(const char *ptr, unsigned long size) {
+ void *memcpy(void *, const void *, unsigned long);
+ memcpy(buf, ptr, size);
+ return not_def_one(buf);
+}
+
+double bar(void) {
+ return not_def_two();
+}
+
+
+const char* baz() {
+ return get_message();
+}
diff --git a/llvm/lib/CodeGen/GlobalMerge.cpp b/llvm/lib/CodeGen/GlobalMerge.cpp
index e58d7e344c28b..b3200ee4630d3 100644
--- a/llvm/lib/CodeGen/GlobalMerge.cpp
+++ b/llvm/lib/CodeGen/GlobalMerge.cpp
@@ -87,6 +87,7 @@
#include "llvm/InitializePasses.h"
#include "llvm/MC/SectionKind.h"
#include "llvm/Pass.h"
+#include "llvm/ProfileData/InstrProf.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"
@@ -155,6 +156,7 @@ class GlobalMergeImpl {
const TargetMachine *TM = nullptr;
GlobalMergeOptions Opt;
bool IsMachO = false;
+ bool IsAIX = false;
private:
bool doMerge(SmallVectorImpl<GlobalVariable *> &Globals, Module &M,
@@ -674,7 +676,9 @@ bool GlobalMergeImpl::run(Module &M) {
if (!EnableGlobalMerge)
return false;
- IsMachO = M.getTargetTriple().isOSBinFormatMachO();
+ Triple T(M.getTargetTriple());
+ IsMachO = T.isOSBinFormatMachO();
+ IsAIX = T.isOSBinFormatXCOFF();
auto &DL = M.getDataLayout();
MapVector<std::pair<unsigned, StringRef>, SmallVector<GlobalVariable *, 0>>
@@ -717,6 +721,14 @@ bool GlobalMergeImpl::run(Module &M) {
GV.getName().starts_with(".llvm.") || Section == "llvm.metadata")
continue;
+ // Do not merge profiling counters as it will prevent us from breaking
+ // the __llvm_prf_cnts section into subsections, which in turn creates
+ // extra symbol dependencies that can break otherwise valid link steps.
+ if (IsAIX && TM && TM->getFunctionSections() && GV.hasSection() &&
+ Section.starts_with(
+ getInstrProfSectionName(IPSK_cnts, Triple::XCOFF, false)))
+ continue;
+
// Ignore all "required" globals:
if (isMustKeepGlobalVariable(&GV))
continue;
diff --git a/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp b/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
index e1bdc7e09fdc0..eef9e52ca827a 100644
--- a/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
+++ b/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
@@ -2435,7 +2435,7 @@ MCSection *TargetLoweringObjectFileXCOFF::getExplicitSectionGlobal(
if (!GO->hasSection())
report_fatal_error("#pragma clang section is not yet supported");
- StringRef SectionName = GO->getSection();
+ std::string SectionName(GO->getSection());
// Handle the XCOFF::TD case first, then deal with the rest.
if (const GlobalVariable *GVar = dyn_cast<GlobalVariable>(GO))
@@ -2458,6 +2458,25 @@ MCSection *TargetLoweringObjectFileXCOFF::getExplicitSectionGlobal(
else
report_fatal_error("XCOFF other section types not yet implemented.");
+ // The profiling instrumentation symbols are special in that we want to
+ // emit a unique CSECT for each when function sections are enabeld, which
+ // are then renamed back to the CSECT name specified by the explicit section.
+ // This is to work around the limitation of not having section groups or a
+ // similar feature in XCOFF.
+ if (TM.getFunctionSections()) {
+ std::string ProfilingDataSectionName =
+ getInstrProfSectionName(IPSK_data, Triple::XCOFF, false);
+ std::string ProfilingCounterSectionName =
+ getInstrProfSectionName(IPSK_cnts, Triple::XCOFF, false);
+ if ((SectionName == ProfilingDataSectionName &&
+ GO->getName().starts_with("__profd_")) ||
+ (SectionName == ProfilingCounterSectionName &&
+ GO->getName().starts_with("__profc_"))) {
+ SectionName += ".";
+ SectionName += GO->getName();
+ }
+ }
+
return getContext().getXCOFFSection(
SectionName, Kind, XCOFF::CsectProperties(MappingClass, XCOFF::XTY_SD),
/* MultiSymbolsAllowed*/ true);
diff --git a/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp b/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
index 0fe615a95894f..bd642c9827d26 100644
--- a/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
+++ b/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
@@ -253,6 +253,22 @@ class PPCAIXAsmPrinter : public PPCAsmPrinter {
DenseMap<const GlobalObject *, SmallVector<const GlobalAlias *, 1>>
GOAliasMap;
+ // The __profd_* symbol for the profiling instrumentation data and the
+ // corresponding __profc_* counters it refernces.
+ struct ProfilingSubSection {
+ MCSectionXCOFF *ProfD;
+ MCSectionXCOFF *ProfC;
+ };
+
+ // Collect the 'sub-sections' of the profile-generate symbols
+ // so we can:
+ // 1) rename to the common CSECT name after emission.
+ // 2) emit the refs from the profc_ symbol to the related CSECTs.
+ SmallVector<ProfilingSubSection> ProfGenSubSections;
+
+ void emitSharedSectionPGORefs(Module &M);
+ void emitSplitSectionPGORefs();
+
uint16_t getNumberOfVRSaved();
void emitTracebackTable();
@@ -2810,6 +2826,57 @@ void PPCAIXAsmPrinter::emitGlobalVariableHelper(const GlobalVariable *GV) {
MCSectionXCOFF *Csect = cast<MCSectionXCOFF>(
getObjFileLowering().SectionForGlobal(GV, GVKind, TM));
+ // When compiling with function sections enabled, we need some special
+ // codegen to rename the CSECTs. For each profiling data symbol find its
+ // associated profiling counters.
+ if (TM.getFunctionSections() &&
+ Csect->getName().starts_with("__llvm_prf_data.")) {
+ // Unraveling the initializer to find the related counters variable. The
+ // initializer is a structure whose third member is a subtract expression
+ // between the counters label and the label for the start of this structure.
+ // Use the subtract expression to get the GlobalValue for the counters
+ // global.
+ assert(GV->hasInitializer() &&
+ "profiling data symbol must have an initializer");
+ assert(isa<ConstantStruct>(GV->getInitializer()) &&
+ "expect the initializer for a profiling data symbol to be a struct");
+ const ConstantStruct *Initializer =
+ cast<ConstantStruct>(GV->getInitializer());
+
+ // The initializer structure is: { i64, i64, i32, ptr, ptr, i32, [4 x i16] }
+ // and the reference to the global variable for the counters is in the
+ // first i32 member.
+ const Constant *Member = Initializer->getAggregateElement(2);
+ assert(Member && "profiling data symbol has more then 3 elements");
+
+ // Want to decompose a constant expression of the form:
+ // sub (ptrtoint (ptr @__profc_sym), ptrtoint (ptr @__profd_sym))
+ // to get the GlobalVariable for the '@__profc_sym` symbol.
+ assert(isa<ConstantExpr>(Member) &&
+ "expected member initializer is a constant expression.");
+ const ConstantExpr *CExpr = cast<ConstantExpr>(Member);
+ assert(CExpr->getOpcode() == Instruction::Sub &&
+ "expected member intializer is a sub expression.");
+
+ Value *V1 = CExpr->getOperand(0);
+ assert(V1 && isa<ConstantExpr>(V1) &&
+ "expected sub expression operand to be constant expr.");
+ ConstantExpr *PointerToIntExpr = cast<ConstantExpr>(V1);
+ assert(PointerToIntExpr->isCast() && "unexpected operand type.");
+
+ Value *PointerToIntOperand = PointerToIntExpr->getOperand(0);
+ assert(isa<GlobalVariable>(PointerToIntOperand) &&
+ "expected global variable of profc symbol");
+
+ const GlobalVariable *ProfCGV = cast<GlobalVariable>(PointerToIntOperand);
+ // Map the global variable to its CSECT.
+ SectionKind ProfCKind = getObjFileLowering().getKindForGlobal(GV, TM);
+ MCSectionXCOFF *ProfCCsect = cast<MCSectionXCOFF>(
+ getObjFileLowering().SectionForGlobal(ProfCGV, ProfCKind, TM));
+
+ ProfGenSubSections.push_back({Csect, ProfCCsect});
+ }
+
// Switch to the containing csect.
OutStreamer->switchSection(Csect);
@@ -2911,7 +2978,7 @@ void PPCAIXAsmPrinter::emitFunctionEntryLabel() {
getObjFileLowering().getFunctionEntryPointSymbol(Alias, TM));
}
-void PPCAIXAsmPrinter::emitPGORefs(Module &M) {
+void PPCAIXAsmPrinter::emitSharedSectionPGORefs(Module &M) {
if (!OutContext.hasXCOFFSection(
"__llvm_prf_cnts",
XCOFF::CsectProperties(XCOFF::XMC_RW, XCOFF::XTY_SD)))
@@ -2960,6 +3027,54 @@ void PPCAIXAsmPrinter::emitPGORefs(Module &M) {
}
}
+void PPCAIXAsmPrinter::emitSplitSectionPGORefs() {
+ MCSymbol *NamesSym = nullptr;
+ MCSymbol *VNDSSym = nullptr;
+
+ if (OutContext.hasXCOFFSection(
+ "__llvm_prf_names",
+ XCOFF::CsectProperties(XCOFF::XMC_RO, XCOFF::XTY_SD)))
+ NamesSym = OutContext.getOrCreateSymbol("__llvm_prf_names[RO]");
+
+ if (OutContext.hasXCOFFSection(
+ "__llvm_prf_vnds",
+ XCOFF::CsectProperties(XCOFF::XMC_RW, XCOFF::XTY_SD)))
+ VNDSSym = OutContext.getOrCreateSymbol("__llvm_prf_vnds[RW]");
+
+ for (auto SubSections : ProfGenSubSections) {
+ MCSectionXCOFF *ProfDCsect = SubSections.ProfD;
+ MCSectionXCOFF *ProfCCsect = SubSections.ProfC;
+
+ OutStreamer->switchSection(ProfCCsect);
+
+ if (NamesSym)
+ OutStreamer->emitXCOFFRefDirective(NamesSym);
+
+ if (VNDSSym)
+ OutStreamer->emitXCOFFRefDirective(VNDSSym);
+
+ OutStreamer->emitXCOFFRefDirective(ProfDCsect->getQualNameSymbol());
+
+ // Rename the subsection for the counters
+ OutStreamer->emitXCOFFRenameDirective(ProfCCsect->getQualNameSymbol(),
+ "__llvm_prf_cnts");
+ OutStreamer->addBlankLine();
+
+ // Rename the subsection for the data.
+ OutStreamer->switchSection(ProfDCsect);
+ OutStreamer->emitXCOFFRenameDirective(ProfDCsect->getQualNameSymbol(),
+ "__llvm_prf_data");
+ OutStreamer->addBlankLine();
+ }
+}
+
+void PPCAIXAsmPrinter::emitPGORefs(Module &M) {
+ if (!TM.getFunctionSections())
+ emitSharedSectionPGORefs(M);
+ else
+ emitSplitSectionPGORefs();
+}
+
void PPCAIXAsmPrinter::emitGCOVRefs() {
if (!OutContext.hasXCOFFSection(
"__llvm_gcov_ctr_section",
diff --git a/llvm/test/CodeGen/PowerPC/aix-pgo-function-sections.ll b/llvm/test/CodeGen/PowerPC/aix-pgo-function-sections.ll
new file mode 100644
index 0000000000000..bf3a5cf6dbe34
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/aix-pgo-function-sections.ll
@@ -0,0 +1,61 @@
+; RUN: llc --function-sections -mtriple powerpc-ibm-aix-xcoff < %s | \
+; RUN: FileCheck %s
+
+; RUN: llc --function-sections -mtriple powerpc64-ibm-aix-xcoff < %s | \
+; RUN: FileCheck %s
+
+@i = external local_unnamed_addr global i32, align 4
+@__llvm_profile_raw_version = weak hidden local_unnamed_addr constant i64 72057594037927944
+@__profc_func1 = private global [1 x i64] zeroinitializer, section "__llvm_prf_cnts", align 8
+@__profd_func1 = private global { i64, i64, i32, ptr, ptr, i32, [4 x i16] } { i64 -2545542355363006406, i64 742261418966908927, i32 sub (i32 ptrtoint (ptr @__profc_func1 to i32), i32 ptrtoint (ptr @__profd_func1 to i32)), ptr @func1.local, ptr null, i32 1, [4 x i16] zeroinitializer }, section "__llvm_prf_data", align 8
+@__profc_func2 = private global [1 x i64] zeroinitializer, section "__llvm_prf_cnts", align 8
+@__profd_func2 = private global { i64, i64, i32, ptr, ptr, i32, [4 x i16] } { i64 -4377547752858689819, i64 742261418966908927, i32 sub (i32 ptrtoint (ptr @__profc_func2 to i32), i32 ptrtoint (ptr @__profd_func2 to i32)), ptr @func2.local, ptr null, i32 1, [4 x i16] zeroinitializer }, section "__llvm_prf_data", align 8
+@__llvm_prf_nm = private constant [13 x i8] c"\0B\00func1\01func2", section "__llvm_prf_names", align 1
+@__llvm_profile_filename = weak hidden local_unnamed_addr constant [19 x i8] c"default_%m.profraw\00"
+@llvm.used = appending global [3 x ptr] [ptr @__llvm_prf_nm, ptr @__profd_func1, ptr @__profd_func2], section "llvm.metadata"
+
+@func1.local = private alias i32 (), ptr @func1
+@func2.local = private alias i32 (), ptr @func2
+
+define i32 @func1() {
+entry:
+ %pgocount = load i64, ptr @__profc_func1, align 8
+ %0 = add i64 %pgocount, 1
+ store i64 %0, ptr @__profc_func1, align 8
+ %1 = load i32, ptr @i, align 4
+ ret i32 %1
+}
+
+define i32 @func2() {
+entry:
+ %pgocount = load i64, ptr @__profc_func2, align 8
+ %0 = add i64 %pgocount, 1
+ store i64 %0, ptr @__profc_func2, align 8
+ %1 = load i32, ptr @i, align 4
+ %call = tail call i32 @external_func(i32 noundef %1)
+ ret i32 %call
+}
+
+declare i32 @external_func(i32 noundef)
+
+; CHECK-DAG: .csect __llvm_prf_cnts.__profc_func1[RW]
+; CHECK-DAG: .csect __llvm_prf_data.__profd_func1[RW]
+; CHECK-DAG: .csect __llvm_prf_cnts.__profc_func2[RW]
+; CHECK-DAG: .csect __llvm_prf_data.__profd_func2[RW]
+; CHECK-DAG: .csect __llvm_prf_names[RO]
+
+; CHECK: .csect __llvm_prf_cnts.__profc_func1[RW]
+; CHECK-NEXT: .ref __llvm_prf_names[RO]
+; CHECK-NEXT: .ref __llvm_prf_data.__profd_func1[RW]
+; CHECK-NEXT: .rename __llvm_prf_cnts.__profc_func1[RW],"__llvm_prf_cnts"
+
+; CHECK: .csect __llvm_prf_data.__profd_func1[RW]
+; CHECK-NEXT: .rename __llvm_prf_data.__profd_func1[RW],"__llvm_prf_data"
+
+; CHECK: .csect __llvm_prf_cnts.__profc_func2[RW]
+; CHECK-NEXT: .ref __llvm_prf_names[RO]
+; CHECK-NEXT: .ref __llvm_prf_data.__profd_func2[RW]
+; CHECK-NEXT: .rename __llvm_prf_cnts.__profc_func2[RW],"__llvm_prf_cnts"
+
+; CHECK: .csect __llvm_prf_data.__profd_func2[RW]
+; CHECK-NEXT: .rename __llvm_prf_data.__profd_func2[RW],"__llvm_prf_data"
diff --git a/llvm/test/LTO/PowerPC/pgo-function-sections-aix.ll b/llvm/test/LTO/PowerPC/pgo-function-sections-aix.ll
new file mode 100644
index 0000000000000..81e326add79b7
--- /dev/null
+++ b/llvm/test/LTO/PowerPC/pgo-function-sections-aix.ll
@@ -0,0 +1,210 @@
+; RUN: rm -rf %t
+; RUN: split-file %s %t
+; RUN: llvm-as %t/f1.ll -o %t/f1.bc
+; RUN: llvm-as %t/f2.ll -o %t/f2.bc
+; RUN: llvm-lto -filetype=asm -function-sections=1 %t/f1.bc %t/f2.bc -o %t/fc.s
+; RUN: cat %t/fc.s | FileCheck %s
+
+;; Test that the renaming mechanism for the profiling counters and data
+;; symbols section works when performing an LTO link with symbols with
+;; name clashes from different modules.
+
+
+;--- f1.ll
+target datalayout = "E-m:a-p:32:32-i64:64-n32"
+target triple = "powerpc-ibm-aix7.2.0.0"
+
+@__llvm_profile_raw_version = weak hidden constant i64 72057594037927944
+@__profc_func1 = private global [2 x i64] zeroinitializer, section "__llvm_prf_cnts", align 8
+@__profd_func1 = private global { i64, i64, i32, ptr, ptr, i32, [4 x i16] } { i64 -2545542355363006406, i64 146835647075900052, i32 sub (i32 ptrtoint (ptr @__profc_func1 to i32), i32 ptrtoint (ptr @__profd_func1 to i32)), ptr @func1.local, ptr null, i32 2, [4 x i16] zeroinitializer }, section "__llvm_prf_data", align 8
+@__profc_file1.c_internal_func = private global [1 x i64] zeroinitializer, section "__llvm_prf_cnts", align 8
+@__profd_file1.c_internal_func = private global { i64, i64, i32, ptr, ptr, i32, [4 x i16] } { i64 2905054957054668920, i64 742261418966908927, i32 sub (i32 ptrtoint (ptr @__profc_file1.c_internal_func to i32), i32 ptrtoint(ptr @__profd_file1.c_internal_func to i32)), ptr @internal_func, ptr null, i32 1, [4 x i16] zeroinitializer }, section "__llvm_prf_data", align 8
+@__profc_escape1 = private global [1 x i64] zeroinitializer, section "__llvm_prf_cnts", align 8
+@__profd_escape1 = private global { i64, i64, i32, ptr, ptr, i32, [4 x i16] } { i64 3473293639883741762, i64 742261418966908927, i32 sub (i32 ptrtoint (ptr @__profc_escape1 to i32), i32 ptrtoint (ptr @__profd_escape1 to i32)), ptr @escape1.local, ptr null, i32 1, [4 x i16] zeroinitializer }, section "__llvm_prf_data", align 8
+@__llvm_prf_nm = private constant [37 x i8] c"#\00func1\01file1.c:internal_func\01escape1", section "__llvm_prf_names", align 1
+@llvm.used = appending global [4 x ptr] [ptr @__profd_func1, ptr @__profd_file1.c_internal_func, ptr @__profd_escape1, ptr @__llvm_prf_nm], section "llvm.metadata"
+@__llvm_profile_filename = weak hidden constant [19 x i8] c"default_%m.profraw\00"
+
+@func1.local = private alias i64 (i32), ptr @func1
+@escape1.local = private alias ptr (), ptr @escape1
+
+; Function Attrs: noinline nounwind optnone
+define i64 @func1(i32 noundef %n) #0 {
+entry:
+ %retval = alloca i64, align 8
+ %n.addr = alloca i32, align 4
+ store i32 %n, ptr %n.addr, align 4
+ %0 = load i32, ptr %n.addr, align 4
+ %cmp = icmp slt i32 %0, 0
+ br i1 %cmp, label %if.then, label %if.end
+
+if.then: ; preds = %entry
+ %pgocount = load i64, ptr getelementptr inbounds ([2 x i64], ptr @__profc_func1, i32 0, i32 1), align 8
+ %1 = add i64 %pgocount, 1
+ store i64 %1, ptr getelementptr inbounds ([2 x i64], ptr @__profc_func1, i32 0, i32 1), align 8
+ store i64 0, ptr %retval, align 8
+ br label %return
+
+if.end: ; preds = %entry
+ %pgocount1 = load i64, ptr @__profc_func1, align 8
+ %2 = add i64 %pgocount1, 1
+ store i64 %2, ptr @__profc_func1, align 8
+ %3 = load i32, ptr %n.addr, align 4
+ %call = call i64 @internal_func(i32 noundef %3)
+ store i64 %call, ptr %retval, align 8
+ br label %return
+
+return: ; preds = %if.end, %if.then
+ %4 = load i64, ptr %retval, align 8
+ ret i64 %4
+}
+
+; Function Attrs: noinline nounwind optnone
+define internal i64 @internal_func(i32 noundef %n) #0 {
+entry:
+ %pgocount = load i64, ptr @__profc_file1.c_internal_func, align 8
+ %0 = add i64 %pgocount, 1
+ store i64 %0, ptr @__profc_file1.c_internal_func, align 8
+ %n.addr = alloca i32, align 4
+ store i32 %n, ptr %n.addr, align 4
+ %1 = load i32, ptr %n.addr, align 4
+ %conv = sext i32 %1 to i64
+ ret i64 %conv
+}
+
+; Function Attrs: noinline nounwind optnone
+define ptr @escape1() #0 {
+entry:
+ %pgocount = load i64, ptr @__profc_escape1, align 8
+ %0 = add i64 %pgocount, 1
+ store i64 %0, ptr @__profc_escape1, align 8
+ ret ptr @internal_func
+}
+
+; Function Attrs: nounwind
+declare void @llvm.instrprof.increment(ptr, i64, i32, i32) #1
+
+attributes #0 = { noinline nounwind optnone }
+attributes #1 = { nounwind }
+
+;--- f2.ll
+target datalayout = "E-m:a-p:32:32-i64:64-n32"
+target triple = "powerpc-ibm-aix7.2.0.0"
+
+@__llvm_profile_raw_version = weak hidden constant i64 72057594037927944
+@__profc_func2 = private global [2 x i64] zeroinitializer, section "__llvm_prf_cnts", align 8
+@__profd_func2 = private global { i64, i64, i32, ptr, ptr, i32, [4 x i16] } { i64 -4377547752858689819, i64 146835647075900052, i32 sub (i32 ptrtoint (ptr @__profc_func2 to i32), i32 ptrtoint (ptr @__profd_func2 to i32)), ptr @func2.local, ptr null, i32 2, [4 x i16] zeroinitializer }, section "__llvm_prf_data", align 8
+@__profc_file2.c_internal_func = private global [1 x i64] zeroinitializer, section "__llvm_prf_cnts", align 8
+@__profd_file2.c_internal_func = private global { i64, i64, i32, ptr, ptr, i32, [4 x i16] } { i64 4899437111831460578, i64 742261418966908927, i32 sub (i32 ptrtoint (ptr @__profc_file2.c_internal_func to i32), i32 ptrtoint (ptr @__profd_file2.c_internal_func to i32)), ptr @internal_func, ptr null, i32 1, [4 x i16] zeroinitializer }, section "__llvm_prf_data",...
[truncated]
|
You can test this locally with the following command:git-clang-format --diff HEAD~1 HEAD --extensions c,cpp -- compiler-rt/test/profile/AIX/needs-garbage-collection.c llvm/lib/CodeGen/GlobalMerge.cpp llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp View the diff from clang-format here.diff --git a/compiler-rt/test/profile/AIX/needs-garbage-collection.c b/compiler-rt/test/profile/AIX/needs-garbage-collection.c
index edb6710fa..8f190fb34 100644
--- a/compiler-rt/test/profile/AIX/needs-garbage-collection.c
+++ b/compiler-rt/test/profile/AIX/needs-garbage-collection.c
@@ -10,38 +10,28 @@
// CHECK-DAG: baz
// CHECK-DAG: get_message
-
//--- main.c
-const char* get_message(void) {
- return "Hello World!";
-}
+const char *get_message(void) { return "Hello World!"; }
-const char* baz();
+const char *baz();
-int printf(const char*, ...);
+int printf(const char *, ...);
-int main(void) {
- printf("%s\n", baz());
-}
+int main(void) { printf("%s\n", baz()); }
//--- needs_gc.c
extern int not_def_one(const char *);
extern double not_def_two(void);
-extern const char* get_message(void);
+extern const char *get_message(void);
char buf[512];
int foo(const char *ptr, unsigned long size) {
- void *memcpy(void *, const void *, unsigned long);
- memcpy(buf, ptr, size);
- return not_def_one(buf);
-}
-
-double bar(void) {
- return not_def_two();
+ void *memcpy(void *, const void *, unsigned long);
+ memcpy(buf, ptr, size);
+ return not_def_one(buf);
}
+double bar(void) { return not_def_two(); }
-const char* baz() {
- return get_message();
-}
+const char *baz() { return get_message(); }
|
int foo(void); | ||
double bar(void); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these declaration are dead in the frontend, can be removed
@@ -2458,6 +2458,25 @@ MCSection *TargetLoweringObjectFileXCOFF::getExplicitSectionGlobal( | |||
else | |||
report_fatal_error("XCOFF other section types not yet implemented."); | |||
|
|||
// The profiling instrumentation symbols are special in that we want to | |||
// emit a unique CSECT for each when function sections are enabeld, which |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enabled
std::string ProfilingCounterSectionName = | ||
getInstrProfSectionName(IPSK_cnts, Triple::XCOFF, false); | ||
if ((SectionName == ProfilingDataSectionName && | ||
GO->getName().starts_with("__profd_")) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llvm::getInstrProfDataVarPrefix() can be used in place of "__profd_"
// between the counters label and the label for the start of this structure. | ||
// Use the subtract expression to get the GlobalValue for the counters | ||
// global. | ||
assert(GV->hasInitializer() && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic is tied to the logic that generates the initializer. Can we extract it into a query function in InstrProf.h? Or if we can somehow use the profd's layout described through the INSTR_PROF_DATA
macro in InstrProfData.inc
to find the counter field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really dislike this part because we are so tightly coupled to the implementation that if anyone changed the initializer format they would have to simultaneously modify the PPC target which is a problem. For ELF the section keys on the Comdats carry the relationship info in the IR implicitly.
What do you think of an alternative approach where we could use a metadata node to explicitly attach to dependancy info to the data symbol? https://llvm.org/docs/LangRef.html#associated-metadata is a similar idea for ELF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, this is really brittle and may force developers changing InstrProfiling
to also change the PowerPC
backend and they may not have the necessary expertise or a way to test their changes.
- Fix spelling mistake in comment. - Use llvm functions to get the string prefixes for pgo data and counter objects instead of hardcoding them. - Remove some unused declarations from one of the test files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some minor nit comments.
@@ -253,6 +253,22 @@ class PPCAIXAsmPrinter : public PPCAsmPrinter { | |||
DenseMap<const GlobalObject *, SmallVector<const GlobalAlias *, 1>> | |||
GOAliasMap; | |||
|
|||
// The __profd_* symbol for the profiling instrumentation data and the | |||
// corresponding __profc_* counters it refernces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// corresponding __profc_* counters it refernces. | |
// corresponding __profc_* counters it references. |
// global. | ||
assert(GV->hasInitializer() && | ||
"profiling data symbol must have an initializer"); | ||
assert(isa<ConstantStruct>(GV->getInitializer()) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can probably pull out the getInitializer()
call since it's used more than once.
"expected member initializer is a constant expression."); | ||
const ConstantExpr *CExpr = cast<ConstantExpr>(Member); | ||
assert(CExpr->getOpcode() == Instruction::Sub && | ||
"expected member intializer is a sub expression."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"expected member intializer is a sub expression."); | |
"expected member initializer is a sub expression."); |
|
||
// Rename the subsection for the counters | ||
OutStreamer->emitXCOFFRenameDirective(ProfCCsect->getQualNameSymbol(), | ||
"__llvm_prf_cnts"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use getInstrProfSectionName
consistently everywhere rather than hardcoding the names as strings here? That gives us flexibility in case we need to change the name in the future.
The existing PGO implementation creates a dependence from the profiling data symbol to the function descriptors for every function defined within the compilation unit. This forces the binder to include either all or no functions from an object file when linking a profile-generate enabled object file, which can break links that require the aggressive "include only whats used" link semantics that XCOFF provides. To remedy this we break up the profiling data symbol into per-function CSECTs by using the rename directive to emit the profd symbol for a single function to a unique CSECT, then rename it to the common name afterwards. We also have to split the counters into per-function CSECTs as they need to have a .ref relocation from the counter section which is referenced by the functions, to the profiling-data symbol which is otherwise unreferenced.