Commit bf18dc9

merge main into amd-staging (#784)
https://compiler-ci.amd.com/blue/organizations/jenkins/compiler-psdb-amd-staging/detail/compiler-psdb-amd-staging/3167/pipeline/727/ passed all tests except oclConf*. That failure appears to be due to a bad tester node, as it also affects unrelated PRs.
2 parents 0299335 + 879bde0 commit bf18dc9

78 files changed, +2291 -751 lines changed

Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
--checks='-*,llvm-namespace-comment'
+-checks=-*,llvm-namespace-comment
 --warnings-as-errors=llvm-namespace-comment

clang/docs/ReleaseNotes.rst

Lines changed: 1 addition & 0 deletions
@@ -338,6 +338,7 @@ Modified Compiler Flags
 -----------------------
 - The `-gkey-instructions` compiler flag is now enabled by default when DWARF is emitted for plain C/C++ and optimizations are enabled. (#GH149509)
 - The `-fconstexpr-steps` compiler flag now accepts value `0` to opt out of this limit. (#GH160440)
+- The `-fdevirtualize-speculatively` compiler flag is now supported to enable speculative devirtualization of virtual function calls, it's disabled by default. (#GH159685)
 
 Removed Compiler Flags
 -------------------------

clang/docs/UsersManual.rst

Lines changed: 52 additions & 0 deletions
@@ -2352,6 +2352,56 @@ are listed below.
    pure ThinLTO, as all split regular LTO modules are merged and LTO linked
    with regular LTO.
 
+.. option:: -fdevirtualize-speculatively
+
+   Enable speculative devirtualization optimization where a virtual call
+   can be transformed into a direct call under the assumption that its
+   object is of a particular type. A runtime check is inserted to validate
+   the assumption before making the direct call, and if the check fails,
+   the original virtual call is made instead. This optimization can enable
+   more inlining opportunities and better optimization of the direct call.
+   This is different from whole program devirtualization optimization
+   that rely on global analysis and hidden visibility of the objects to prove
+   that the object is always of a particular type at a virtual call site.
+   This optimization doesn't require global analysis or hidden visibility.
+   This optimization doesn't devirtualize all virtual calls, but only
+   when there's a single implementation of the virtual function in the module.
+   There could be a single implementation of the virtual function
+   either because the function is not overridden in any derived class,
+   or because all objects are instances of the same class/type.
+
+   Ex of IR before the optimization:
+
+   .. code-block:: llvm
+
+     %vtable = load ptr, ptr %BV, align 8, !tbaa !6
+     %0 = tail call i1 @llvm.public.type.test(ptr %vtable, metadata !"_ZTS4Base")
+     tail call void @llvm.assume(i1 %0)
+     %0 = load ptr, ptr %vtable, align 8
+     tail call void %0(ptr noundef nonnull align 8 dereferenceable(8) %BV)
+     ret void
+
+   IR after the optimization:
+
+   .. code-block:: llvm
+
+     %vtable = load ptr, ptr %BV, align 8, !tbaa !12
+     %0 = load ptr, ptr %vtable, align 8
+     %1 = icmp eq ptr %0, @_ZN4Base17virtual_function1Ev
+     br i1 %1, label %if.true.direct_targ, label %if.false.orig_indirect, !prof !15
+     if.true.direct_targ: ; preds = %entry
+     tail call void @_ZN4Base17virtual_function1Ev(ptr noundef nonnull align 8 dereferenceable(8) %BV)
+     br label %if.end.icp
+     if.false.orig_indirect: ; preds = %entry
+     tail call void %0(ptr noundef nonnull align 8 dereferenceable(8) %BV)
+     br label %if.end.icp
+     if.end.icp: ; preds = %if.false.orig_indirect, %if.true.direct_targ
+     ret void
+
+   This feature is temporarily ignored at the LLVM side when LTO is enabled.
+   TODO: Update the comment when the LLVM side supports this feature for LTO.
+   This feature is turned off by default.
+
 .. option:: -f[no-]unique-source-file-names
 
    When enabled, allows the compiler to assume that each object file
@@ -5216,6 +5266,8 @@ Execute ``clang-cl /?`` to see a list of supported options:
 -fstandalone-debug Emit full debug info for all types used by the program
 -fstrict-aliasing Enable optimizations based on strict aliasing rules
 -fsyntax-only Run the preprocessor, parser and semantic analysis stages
+-fdevirtualize-speculatively
+Enables speculative devirtualization optimization.
 -fwhole-program-vtables Enables whole-program vtable optimization. Requires -flto
 -gcodeview-ghash Emit type record hashes in a .debug$H section
 -gcodeview Generate CodeView debug information
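
Note on the documentation hunk above: the guarded call it describes can be pictured in plain C++. The sketch below is only an analogy, not what Clang emits. `Object`, `Slot`, `base_impl`, `other_impl`, `call_indirect`, `call_speculative`, and `check_same_result` are hypothetical names introduced for illustration, and a single function-pointer field stands in for the real vtable slot.

```cpp
#include <cassert>

// Hypothetical hand-rolled "vtable": one function-pointer slot per object.
struct Object;
using Slot = int (*)(Object *);

struct Object {
  Slot vfn;   // stands in for the vtable entry loaded in the IR
  int value;
};

// The single implementation the optimizer speculates on.
int base_impl(Object *o) { return o->value + 1; }

// A different target, used to exercise the fallback path.
int other_impl(Object *o) { return o->value * 2; }

// Shape of the call before the optimization: always indirect.
int call_indirect(Object *o) { return o->vfn(o); }

// Shape of the call after the optimization: compare the loaded slot with
// the expected target, call it directly on a match, otherwise fall back
// to the original indirect call.
int call_speculative(Object *o) {
  Slot target = o->vfn;
  if (target == &base_impl)
    return base_impl(o);  // direct call, visible to the inliner
  return target(o);       // speculation failed: original indirect call
}

// Both shapes must agree regardless of which branch is taken.
void check_same_result(Object *o) {
  assert(call_speculative(o) == call_indirect(o));
}
```

An object whose slot holds `base_impl` takes the direct branch; one whose slot holds `other_impl` takes the fallback. In both cases the result matches `call_indirect`, which is the property the runtime check in the documented IR is there to preserve.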

clang/include/clang/Basic/CodeGenOptions.def

Lines changed: 2 additions & 0 deletions
@@ -364,6 +364,8 @@ VALUE_CODEGENOPT(WarnStackSize , 32, UINT_MAX, Benign) ///< Set via -fwarn-s
 CODEGENOPT(NoStackArgProbe, 1, 0, Benign) ///< Set when -mno-stack-arg-probe is used
 CODEGENOPT(EmitLLVMUseLists, 1, 0, Benign) ///< Control whether to serialize use-lists.
 
+CODEGENOPT(DevirtualizeSpeculatively, 1, 0, Benign) ///< Whether to apply the speculative
+                                                    /// devirtualization optimization.
 CODEGENOPT(WholeProgramVTables, 1, 0, Benign) ///< Whether to apply whole-program
                                               /// vtable optimization.
 

clang/include/clang/Options/Options.td

Lines changed: 9 additions & 3 deletions
@@ -4627,6 +4627,13 @@ defm new_infallible : BoolFOption<"new-infallible",
   BothFlags<[], [ClangOption, CC1Option],
   " treating throwing global C++ operator new as always returning valid memory "
   "(annotates with __attribute__((returns_nonnull)) and throw()). This is detectable in source.">>;
+defm devirtualize_speculatively
+    : BoolFOption<"devirtualize-speculatively",
+                  CodeGenOpts<"DevirtualizeSpeculatively">, DefaultFalse,
+                  PosFlag<SetTrue, [], [],
+                          "Enables speculative devirtualization optimization.">,
+                  NegFlag<SetFalse>,
+                  BothFlags<[], [ClangOption, CLOption, CC1Option]>>;
 defm whole_program_vtables : BoolFOption<"whole-program-vtables",
   CodeGenOpts<"WholeProgramVTables">, DefaultFalse,
   PosFlag<SetTrue, [], [ClangOption, CC1Option],
@@ -7272,9 +7279,8 @@ defm variable_expansion_in_unroller : BooleanFFlag<"variable-expansion-in-unroll
     Group<clang_ignored_gcc_optimization_f_Group>;
 defm web : BooleanFFlag<"web">, Group<clang_ignored_gcc_optimization_f_Group>;
 defm whole_program : BooleanFFlag<"whole-program">, Group<clang_ignored_gcc_optimization_f_Group>;
-defm devirtualize : BooleanFFlag<"devirtualize">, Group<clang_ignored_gcc_optimization_f_Group>;
-defm devirtualize_speculatively : BooleanFFlag<"devirtualize-speculatively">,
-    Group<clang_ignored_gcc_optimization_f_Group>;
+defm devirtualize : BooleanFFlag<"devirtualize">,
+    Group<clang_ignored_gcc_optimization_f_Group>;
 
 // Generic gfortran options.
 def A_DASH : Joined<["-"], "A-">, Group<gfortran_Group>;

clang/lib/CodeGen/BackendUtil.cpp

Lines changed: 1 addition & 0 deletions
@@ -946,6 +946,7 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
   // non-integrated assemblers don't recognize .cgprofile section.
   PTO.CallGraphProfile = !CodeGenOpts.DisableIntegratedAS;
   PTO.UnifiedLTO = CodeGenOpts.UnifiedLTO;
+  PTO.DevirtualizeSpeculatively = CodeGenOpts.DevirtualizeSpeculatively;
 
   LoopAnalysisManager LAM;
   FunctionAnalysisManager FAM;

clang/lib/CodeGen/CGClass.cpp

Lines changed: 12 additions & 6 deletions
@@ -2830,10 +2830,15 @@ void CodeGenFunction::EmitTypeMetadataCodeForVCall(const CXXRecordDecl *RD,
     SourceLocation Loc) {
   if (SanOpts.has(SanitizerKind::CFIVCall))
     EmitVTablePtrCheckForCall(RD, VTable, CodeGenFunction::CFITCK_VCall, Loc);
-  else if (CGM.getCodeGenOpts().WholeProgramVTables &&
-           // Don't insert type test assumes if we are forcing public
-           // visibility.
-           !CGM.AlwaysHasLTOVisibilityPublic(RD)) {
+  // Emit the intrinsics of (type_test and assume) for the features of WPD and
+  // speculative devirtualization. For WPD, emit the intrinsics only for the
+  // case of non_public LTO visibility.
+  // TODO: refactor this condition and similar ones into a function (e.g.,
+  // ShouldEmitDevirtualizationMD) to encapsulate the details of the different
+  // types of devirtualization.
+  else if ((CGM.getCodeGenOpts().WholeProgramVTables &&
+            !CGM.AlwaysHasLTOVisibilityPublic(RD)) ||
+           CGM.getCodeGenOpts().DevirtualizeSpeculatively) {
     CanQualType Ty = CGM.getContext().getCanonicalTagType(RD);
     llvm::Metadata *MD = CGM.CreateMetadataIdentifierForType(Ty);
     llvm::Value *TypeId =
@@ -2991,8 +2996,9 @@ void CodeGenFunction::EmitVTablePtrCheck(const CXXRecordDecl *RD,
 }
 
 bool CodeGenFunction::ShouldEmitVTableTypeCheckedLoad(const CXXRecordDecl *RD) {
-  if (!CGM.getCodeGenOpts().WholeProgramVTables ||
-      !CGM.HasHiddenLTOVisibility(RD))
+  if ((!CGM.getCodeGenOpts().WholeProgramVTables ||
+       !CGM.HasHiddenLTOVisibility(RD)) &&
+      !CGM.getCodeGenOpts().DevirtualizeSpeculatively)
     return false;
 
   if (CGM.getCodeGenOpts().VirtualFunctionElimination)
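
Note on the second CGClass.cpp hunk: the added parentheses change when `ShouldEmitVTableTypeCheckedLoad` takes its early `return false`. The following standalone model covers only that boolean guard, not the rest of the function. `GuardInputs`, `bailsOutBefore`, and `bailsOutAfter` are hypothetical names; the real code reads these values from `CGM.getCodeGenOpts()` and `CGM.HasHiddenLTOVisibility(RD)`.

```cpp
// Hypothetical flattened view of the three inputs the guard reads.
struct GuardInputs {
  bool WholeProgramVTables;
  bool HasHiddenLTOVisibility;
  bool DevirtualizeSpeculatively;
};

// Early-return condition before the change: bail out unless WPD is on
// and the class has hidden LTO visibility.
constexpr bool bailsOutBefore(GuardInputs in) {
  return !in.WholeProgramVTables || !in.HasHiddenLTOVisibility;
}

// Early-return condition after the change: the same test, but speculative
// devirtualization alone is enough to skip the early return.
constexpr bool bailsOutAfter(GuardInputs in) {
  return (!in.WholeProgramVTables || !in.HasHiddenLTOVisibility) &&
         !in.DevirtualizeSpeculatively;
}

// Only the new flag set: the old guard bails out, the new one does not.
static_assert(bailsOutBefore(GuardInputs{false, false, true}),
              "old guard ignores the new flag");
static_assert(!bailsOutAfter(GuardInputs{false, false, true}),
              "new flag alone skips the early return");
// WPD with hidden visibility keeps its previous behaviour.
static_assert(!bailsOutAfter(GuardInputs{true, true, false}),
              "WPD path unchanged");
// Nothing enabled: still bails out.
static_assert(bailsOutAfter(GuardInputs{false, false, false}),
              "default configuration still returns early");
```

With the new flag off, `bailsOutAfter` reduces to `bailsOutBefore`, so existing whole-program-vtables builds should see the same early-return decision as before.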

clang/lib/CodeGen/CGVTables.cpp

Lines changed: 4 additions & 2 deletions
@@ -1363,10 +1363,12 @@ llvm::GlobalObject::VCallVisibility CodeGenModule::GetVCallVisibilityLevel(
 void CodeGenModule::EmitVTableTypeMetadata(const CXXRecordDecl *RD,
     llvm::GlobalVariable *VTable,
     const VTableLayout &VTLayout) {
-  // Emit type metadata on vtables with LTO or IR instrumentation.
+  // Emit type metadata on vtables with LTO or IR instrumentation or
+  // speculative devirtualization.
   // In IR instrumentation, the type metadata is used to find out vtable
   // definitions (for type profiling) among all global variables.
-  if (!getCodeGenOpts().LTOUnit && !getCodeGenOpts().hasProfileIRInstr())
+  if (!getCodeGenOpts().LTOUnit && !getCodeGenOpts().hasProfileIRInstr() &&
+      !getCodeGenOpts().DevirtualizeSpeculatively)
     return;
 
   CharUnits ComponentWidth = GetTargetTypeStoreSize(getVTableComponentType());

clang/lib/CodeGen/ItaniumCXXABI.cpp

Lines changed: 15 additions & 8 deletions
@@ -716,10 +716,14 @@ CGCallee ItaniumCXXABI::EmitLoadOfMemberFunctionPointer(
 
   bool ShouldEmitVFEInfo = CGM.getCodeGenOpts().VirtualFunctionElimination &&
                            CGM.HasHiddenLTOVisibility(RD);
+  // TODO: Update this name not to be restricted to WPD only
+  // as we now emit the vtable info info for speculative devirtualization as
+  // well.
   bool ShouldEmitWPDInfo =
-      CGM.getCodeGenOpts().WholeProgramVTables &&
-      // Don't insert type tests if we are forcing public visibility.
-      !CGM.AlwaysHasLTOVisibilityPublic(RD);
+      (CGM.getCodeGenOpts().WholeProgramVTables &&
+       // Don't insert type tests if we are forcing public visibility.
+       !CGM.AlwaysHasLTOVisibilityPublic(RD)) ||
+      CGM.getCodeGenOpts().DevirtualizeSpeculatively;
   llvm::Value *VirtualFn = nullptr;
 
   {
@@ -2110,17 +2114,20 @@ void ItaniumCXXABI::emitVTableDefinitions(CodeGenVTables &CGVT,
 
   // Always emit type metadata on non-available_externally definitions, and on
   // available_externally definitions if we are performing whole program
-  // devirtualization. For WPD we need the type metadata on all vtable
-  // definitions to ensure we associate derived classes with base classes
-  // defined in headers but with a strong definition only in a shared library.
+  // devirtualization or speculative devirtualization. We need the type metadata
+  // on all vtable definitions to ensure we associate derived classes with base
+  // classes defined in headers but with a strong definition only in a shared
+  // library.
   if (!VTable->isDeclarationForLinker() ||
-      CGM.getCodeGenOpts().WholeProgramVTables) {
+      CGM.getCodeGenOpts().WholeProgramVTables ||
+      CGM.getCodeGenOpts().DevirtualizeSpeculatively) {
     CGM.EmitVTableTypeMetadata(RD, VTable, VTLayout);
     // For available_externally definitions, add the vtable to
     // @llvm.compiler.used so that it isn't deleted before whole program
     // analysis.
     if (VTable->isDeclarationForLinker()) {
-      assert(CGM.getCodeGenOpts().WholeProgramVTables);
+      assert(CGM.getCodeGenOpts().WholeProgramVTables ||
+             CGM.getCodeGenOpts().DevirtualizeSpeculatively);
       CGM.addCompilerUsedGlobal(VTable);
     }
   }
