-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[InstrPGO] Support cold function coverage instrumentation #109837
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
07a2cab
b3054f0
5a2848c
f07457f
66f837a
b1f0105
3c4e85b
d54fbe0
5e0d106
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
// Test -fprofile-generate-cold-function-coverage | ||
|
||
// RUN: rm -rf %t && split-file %s %t | ||
// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx/yyy/ -fprofile-sample-accurate -fprofile-sample-use=%t/pgo-cold-func.prof -S -emit-llvm -o - %t/pgo-cold-func.c | FileCheck %s | ||
|
||
// CHECK: @__llvm_profile_filename = {{.*}} c"/xxx/yyy/default_%m.profraw\00" | ||
|
||
// CHECK: store i8 0, ptr @__profc_bar, align 1 | ||
// CHECK-NOT: @__profc_foo | ||
|
||
//--- pgo-cold-func.prof | ||
foo:1:1 | ||
1: 1 | ||
|
||
//--- pgo-cold-func.c | ||
int bar(int x) { return x;} | ||
int foo(int x) { | ||
return x; | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
// RUN: %clang -### -c -fprofile-generate-cold-function-coverage %s 2>&1 | FileCheck %s | ||
// CHECK: "--instrument-cold-function-only-path=default_%m.profraw" | ||
// CHECK: "--pgo-function-entry-coverage" | ||
// CHECK-NOT: "-fprofile-instrument" | ||
// CHECK-NOT: "-fprofile-instrument-path= | ||
|
||
// RUN: %clang -### -c -fprofile-generate-cold-function-coverage=dir %s 2>&1 | FileCheck %s --check-prefix=CHECK-EQ | ||
// CHECK-EQ: "--instrument-cold-function-only-path=dir{{/|\\\\}}default_%m.profraw" |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -296,7 +296,12 @@ static cl::opt<bool> UseLoopVersioningLICM( | |
"enable-loop-versioning-licm", cl::init(false), cl::Hidden, | ||
cl::desc("Enable the experimental Loop Versioning LICM pass")); | ||
|
||
static cl::opt<std::string> InstrumentColdFuncOnlyPath( | ||
"instrument-cold-function-only-path", cl::init(""), | ||
cl::desc("File path for cold function only instrumentation"), cl::Hidden); | ||
|
||
extern cl::opt<std::string> UseCtxProfile; | ||
extern cl::opt<bool> PGOInstrumentColdFunctionOnly; | ||
|
||
namespace llvm { | ||
extern cl::opt<bool> EnableMemProfContextDisambiguation; | ||
|
@@ -1182,8 +1187,13 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, | |
const bool IsCtxProfUse = | ||
!UseCtxProfile.empty() && Phase == ThinOrFullLTOPhase::ThinLTOPreLink; | ||
|
||
// Enable cold function coverage instrumentation if | ||
// InstrumentColdFuncOnlyPath is provided. | ||
const bool IsColdFuncOnlyInstrGen = PGOInstrumentColdFunctionOnly = | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Setting this global variable here has introduced a data race detected by TSan when different threads set up different pass pipelines. Was this intentional? Did you mean to use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for pointing out. It's intentional to set the global variable, I wasn't aware it will cause data race. I will try to change it to local one. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This causes TSAN tests failures all over our infrastructure in anything that uses LLVM jit, i.e. Tensorflow/XLA One of the opens-source examples is https://github.com/google-deepmind/mujoco_menagerie/blob/main/test/model_test.py Traces look like:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should be fixed by #114364 |
||
IsPGOPreLink && !InstrumentColdFuncOnlyPath.empty(); | ||
|
||
if (IsPGOInstrGen || IsPGOInstrUse || IsMemprofUse || IsCtxProfGen || | ||
IsCtxProfUse) | ||
IsCtxProfUse || IsColdFuncOnlyInstrGen) | ||
addPreInlinerPasses(MPM, Level, Phase); | ||
|
||
// Add all the requested passes for instrumentation PGO, if requested. | ||
|
@@ -1205,6 +1215,11 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, | |
return MPM; | ||
addPostPGOLoopRotation(MPM, Level); | ||
MPM.addPass(PGOCtxProfLoweringPass()); | ||
} else if (IsColdFuncOnlyInstrGen) { | ||
addPGOInstrPasses( | ||
MPM, Level, /* RunProfileGen */ true, /* IsCS */ false, | ||
/* AtomicCounterUpdate */ false, InstrumentColdFuncOnlyPath, | ||
/* ProfileRemappingFile */ "", IntrusiveRefCntPtr<vfs::FileSystem>()); | ||
} | ||
|
||
if (IsPGOInstrGen || IsPGOInstrUse || IsCtxProfGen) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
; RUN: opt < %s --passes=pgo-instr-gen -pgo-instrument-cold-function-only -pgo-function-entry-coverage -S | FileCheck --check-prefixes=COLD %s | ||
; RUN: opt < %s --passes=pgo-instr-gen -pgo-instrument-cold-function-only -pgo-function-entry-coverage -pgo-cold-instrument-entry-threshold=1 -S | FileCheck --check-prefixes=ENTRY-COUNT %s | ||
; RUN: opt < %s --passes=pgo-instr-gen -pgo-instrument-cold-function-only -pgo-function-entry-coverage -pgo-treat-unknown-as-cold -S | FileCheck --check-prefixes=UNKNOWN-FUNC %s | ||
|
||
; COLD: call void @llvm.instrprof.cover(ptr @__profn_foo, i64 [[#]], i32 1, i32 0) | ||
; COLD-NOT: __profn_main | ||
; COLD-NOT: __profn_bar | ||
|
||
; ENTRY-COUNT: call void @llvm.instrprof.cover(ptr @__profn_foo, i64 [[#]], i32 1, i32 0) | ||
; ENTRY-COUNT: call void @llvm.instrprof.cover(ptr @__profn_main, i64 [[#]], i32 1, i32 0) | ||
|
||
; UNKNOWN-FUNC: call void @llvm.instrprof.cover(ptr @__profn_bar, i64 [[#]], i32 1, i32 0) | ||
; UNKNOWN-FUNC: call void @llvm.instrprof.cover(ptr @__profn_foo, i64 [[#]], i32 1, i32 0) | ||
|
||
|
||
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128" | ||
target triple = "x86_64-unknown-linux-gnu" | ||
|
||
define void @bar() { | ||
entry: | ||
ret void | ||
} | ||
|
||
define void @foo() !prof !0 { | ||
entry: | ||
ret void | ||
} | ||
|
||
define i32 @main() !prof !1 { | ||
entry: | ||
ret i32 0 | ||
} | ||
|
||
!0 = !{!"function_entry_count", i64 0} | ||
!1 = !{!"function_entry_count", i64 1} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to expose this through the clang frontend?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if my PR description is not clear. Note that there is no use for
-fprofile-generate
and-fprofile-instr-generate
here, so a driver flag here is to configure the instr file path and make linker to link the compiler_rt.profile object files (just similar to-fprofile-[instr]-generate=
).The reason for not using
-fprofile-[instr]-generate
is because those flags conflict with-fprofile-sample-use
, see PGOOptions,ProfileFile
is a shared file path which means the two flags are actually mutually exclusive.Another option is to make
-fprofile-generate
compatible with-fprofile-samle-use
(like refactoring PGOOptions or adding another flag to configure the file path things), this seems to me they are more complicated than the current one. But I’m open to any suggestions on this.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant, why not just use
clang ... -mllvm -instrument-cold-function-coverage
? Is this a clang - only feature (i.e. rust can't use it?) Is it just for symmetry with the current PGO flags?(This is not a pushback, mainly curious. Also the patch would be quite smaller if you didn't need to pass through the flags from clang to llvm)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks for the suggestion! We also need to link runtime lib(
compiler_rt.profile.a
)but yeah, I agree it's possible to pass directly to llvm and linker without through clang. Then the full command line would be like
So yes, adding the clang driver flag is kind of to mirror current IRPGO flags, for easy maintenance purpose(given that
-fprofile-generate
doesn't work with-fprofile-sample-use
) and also to centralize the configuration for the convenience. IMO, thecompiler_rt
is probably the main blocker here, I didn't find an easy way to bundle it with a llvm flag.Appreciate any further suggestions!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would
-Wl,-lclang_rt.profile
work?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 I think a frontend flag is useful. It allows us to identify incompatibilities early and provide clear error messages instead of more obscure failures down the line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would incompatibilities and ability to provide useful error messages be dependent on the flag being in the frontend? Is it that the backend can't actually reliably point the finger at the specific flag causing the conflict, so a message would be diluted to "sampling won't work with this"?
(probably a subject for a different forum tho)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, I think it should work, except it requires another linker flag: the
--u__llvm_runtime_variable
, we can configure it to linker too, but basically those instr PGO flags control addProfileRTLibs (which seems not equivalent to-Wl,-lclang_rt.profile
), I just wanted to mirror those flags so that we don't need to maintain if anything changes toaddProfileRTLibs
in the future. (Though I feel this code should be very stable, should not be a big problem, so mainly for the convenience for compiler user to use one flag instead of using/maintaining multiple flags )Overall, I agree that it's feasible to do it without clang flag. I'm not very familiar with the convention for adding driver flags, so if you think this doesn't justify it, I will drop it from this patch. Thanks for the discussion!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if we know that this instrumentation mode should not be mixed with e.g. sanitizers or something else we can enforce these checks early. I don't see a particular downside to adding a frontend flag. The convenience of bundling the 2 lld flags and 3 mllvm flags into a single frontend flag seems like a good enough motivation to do so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on single driver flag for convenience.