-
Notifications
You must be signed in to change notification settings - Fork 13.9k
[Clang] Enable -fpointer-tbaa by default. #117244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang-driver @llvm/pr-subscribers-clang Author: Florian Hahn (fhahn) ChangesSupport for more precise TBAA metadata has been added a while ago (behind the -fpointer-tbaa flag). The more precise TBAA metadata allows treating accesses of different pointer types as no-alias. This helps to remove more redundant loads and stores in a number of workloads. Some highlights on the impact across llvm-test-suite's MultiSource, SPEC2006 & SPEC2017 include:
This closes a relatively big gap to GCC, which has been supporting disambiguating based on pointer types for a long time. (https://clang.godbolt.org/z/K7Wbhrz4q) Pointer-TBAA support for pointers to builtin types has been added in #76612. Support for user-defined types has been added in There are 2 recent PRs with bug fixes for special cases uncovered during some of my testing:
Patch is 142.42 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/117244.diff 11 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 8c81de341937ca..f42e970f0e07f6 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -211,6 +211,10 @@ C++ Language Changes
- The builtin type alias ``__builtin_common_type`` has been added to improve the
performance of ``std::common_type``.
+- Clang now emits distinct type-based alias analysis tags for incompatible
+ pointers by default, enabling more powerful alias analysis when accessing
+ pointer types. The new default behavior can be disabled using ``-fno-pointer-tbaa``.
+
C++2c Feature Support
^^^^^^^^^^^^^^^^^^^^^
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index 4cf22c4ee08ce0..0f4ed13d5f3d8c 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -242,7 +242,7 @@ ENUM_CODEGENOPT(StructReturnConvention, StructReturnConventionKind, 2, SRCK_Defa
CODEGENOPT(RelaxAll , 1, 0) ///< Relax all machine code instructions.
CODEGENOPT(RelaxedAliasing , 1, 0) ///< Set when -fno-strict-aliasing is enabled.
-CODEGENOPT(PointerTBAA, 1, 0) ///< Whether or not to use distinct TBAA tags for pointers.
+CODEGENOPT(PointerTBAA , 1, 1) ///< Whether or not to use distinct TBAA tags for pointers.
CODEGENOPT(StructPathTBAA , 1, 0) ///< Whether or not to use struct-path TBAA.
CODEGENOPT(NewStructPathTBAA , 1, 0) ///< Whether or not to use enhanced struct-path TBAA.
CODEGENOPT(SaveTempLabels , 1, 0) ///< Save temporary labels.
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 5167c3c39e315a..35cdf789598936 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -7361,9 +7361,12 @@ def fuse_register_sized_bitfield_access: Flag<["-"], "fuse-register-sized-bitfie
def relaxed_aliasing : Flag<["-"], "relaxed-aliasing">,
HelpText<"Turn off Type Based Alias Analysis">,
MarshallingInfoFlag<CodeGenOpts<"RelaxedAliasing">>;
-def pointer_tbaa: Flag<["-"], "pointer-tbaa">,
- HelpText<"Turn on Type Based Alias Analysis for pointer accesses">,
- MarshallingInfoFlag<CodeGenOpts<"PointerTBAA">>;
+defm pointer_tbaa: BoolOption<"", "pointer-tbaa", CodeGenOpts<"PointerTBAA">,
+DefaultTrue,
+ PosFlag<SetTrue, [], [ClangOption], "Enable">,
+ NegFlag<SetFalse, [], [ClangOption], "Disable">,
+ BothFlags<[], [ClangOption], " that single precision floating-point divide and sqrt used in ">>
+ ;
def no_struct_path_tbaa : Flag<["-"], "no-struct-path-tbaa">,
HelpText<"Turn off struct-path aware Type Based Alias Analysis">,
MarshallingInfoNegativeFlag<CodeGenOpts<"StructPathTBAA">>;
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index d3eec9fea0d498..affe63642d2ffa 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5935,9 +5935,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
if (!Args.hasFlag(options::OPT_fstrict_aliasing, StrictAliasingAliasOption,
options::OPT_fno_strict_aliasing, !IsWindowsMSVC))
CmdArgs.push_back("-relaxed-aliasing");
- if (Args.hasFlag(options::OPT_fpointer_tbaa, options::OPT_fno_pointer_tbaa,
+ if (Args.hasFlag(options::OPT_fno_pointer_tbaa, options::OPT_fpointer_tbaa,
false))
- CmdArgs.push_back("-pointer-tbaa");
+ CmdArgs.push_back("-no-pointer-tbaa");
if (!Args.hasFlag(options::OPT_fstruct_path_tbaa,
options::OPT_fno_struct_path_tbaa, true))
CmdArgs.push_back("-no-struct-path-tbaa");
diff --git a/clang/test/CodeGen/attr-counted-by.c b/clang/test/CodeGen/attr-counted-by.c
index f70e552bca26ab..d62d1c412f1a31 100644
--- a/clang/test/CodeGen/attr-counted-by.c
+++ b/clang/test/CodeGen/attr-counted-by.c
@@ -1137,7 +1137,7 @@ struct test13_bar {
// SANITIZE-WITH-ATTR: cont5:
// SANITIZE-WITH-ATTR-NEXT: [[REVMAP:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP0]], i64 16
// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x ptr], ptr [[REVMAP]], i64 0, i64 [[INDEX]]
-// SANITIZE-WITH-ATTR-NEXT: store ptr null, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA14:![0-9]+]]
+// SANITIZE-WITH-ATTR-NEXT: store ptr null, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA15:![0-9]+]]
// SANITIZE-WITH-ATTR-NEXT: ret i32 0
//
// NO-SANITIZE-WITH-ATTR-LABEL: define dso_local noundef i32 @test13(
@@ -1146,7 +1146,7 @@ struct test13_bar {
// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr @test13_f, align 8, !tbaa [[TBAA8:![0-9]+]]
// NO-SANITIZE-WITH-ATTR-NEXT: [[REVMAP:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP0]], i64 16
// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x ptr], ptr [[REVMAP]], i64 0, i64 [[INDEX]]
-// NO-SANITIZE-WITH-ATTR-NEXT: store ptr null, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA11:![0-9]+]]
+// NO-SANITIZE-WITH-ATTR-NEXT: store ptr null, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA12:![0-9]+]]
// NO-SANITIZE-WITH-ATTR-NEXT: ret i32 0
//
// SANITIZE-WITHOUT-ATTR-LABEL: define dso_local noundef i32 @test13(
@@ -1164,7 +1164,7 @@ struct test13_bar {
// SANITIZE-WITHOUT-ATTR: cont5:
// SANITIZE-WITHOUT-ATTR-NEXT: [[REVMAP:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP0]], i64 16
// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x ptr], ptr [[REVMAP]], i64 0, i64 [[INDEX]]
-// SANITIZE-WITHOUT-ATTR-NEXT: store ptr null, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA14:![0-9]+]]
+// SANITIZE-WITHOUT-ATTR-NEXT: store ptr null, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA15:![0-9]+]]
// SANITIZE-WITHOUT-ATTR-NEXT: ret i32 0
//
// NO-SANITIZE-WITHOUT-ATTR-LABEL: define dso_local noundef i32 @test13(
@@ -1173,7 +1173,7 @@ struct test13_bar {
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr @test13_f, align 8, !tbaa [[TBAA8:![0-9]+]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[REVMAP:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP0]], i64 16
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x ptr], ptr [[REVMAP]], i64 0, i64 [[INDEX]]
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: store ptr null, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA11:![0-9]+]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: store ptr null, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA12:![0-9]+]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: ret i32 0
//
int test13(long index) {
@@ -1467,7 +1467,7 @@ int test24(int c, struct tests_foo *var) {
// SANITIZE-WITH-ATTR-LABEL: define dso_local i32 @test25(
// SANITIZE-WITH-ATTR-SAME: i32 noundef [[C:%.*]], ptr noundef [[VAR:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SANITIZE-WITH-ATTR-NEXT: entry:
-// SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[VAR]], align 8, !tbaa [[TBAA14]]
+// SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[VAR]], align 8, !tbaa [[TBAA17:![0-9]+]]
// SANITIZE-WITH-ATTR-NEXT: [[DOTCOUNTED_BY_LOAD:%.*]] = load i32, ptr [[TMP0]], align 4
// SANITIZE-WITH-ATTR-NEXT: [[TMP1:%.*]] = icmp ugt i32 [[DOTCOUNTED_BY_LOAD]], 10
// SANITIZE-WITH-ATTR-NEXT: br i1 [[TMP1]], label [[CONT5:%.*]], label [[HANDLER_OUT_OF_BOUNDS:%.*]], !prof [[PROF3]], !nosanitize [[META2]]
@@ -1482,7 +1482,7 @@ int test24(int c, struct tests_foo *var) {
// NO-SANITIZE-WITH-ATTR-LABEL: define dso_local i32 @test25(
// NO-SANITIZE-WITH-ATTR-SAME: i32 noundef [[C:%.*]], ptr nocapture noundef readonly [[VAR:%.*]]) local_unnamed_addr #[[ATTR8:[0-9]+]] {
// NO-SANITIZE-WITH-ATTR-NEXT: entry:
-// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[VAR]], align 8, !tbaa [[TBAA11]]
+// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[VAR]], align 8, !tbaa [[TBAA14:![0-9]+]]
// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 44
// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP1:%.*]] = load i32, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
// NO-SANITIZE-WITH-ATTR-NEXT: ret i32 [[TMP1]]
@@ -1490,7 +1490,7 @@ int test24(int c, struct tests_foo *var) {
// SANITIZE-WITHOUT-ATTR-LABEL: define dso_local i32 @test25(
// SANITIZE-WITHOUT-ATTR-SAME: i32 noundef [[C:%.*]], ptr noundef [[VAR:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SANITIZE-WITHOUT-ATTR-NEXT: entry:
-// SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[VAR]], align 8, !tbaa [[TBAA14]]
+// SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[VAR]], align 8, !tbaa [[TBAA17:![0-9]+]]
// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 44
// SANITIZE-WITHOUT-ATTR-NEXT: [[TMP1:%.*]] = load i32, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
// SANITIZE-WITHOUT-ATTR-NEXT: ret i32 [[TMP1]]
@@ -1498,7 +1498,7 @@ int test24(int c, struct tests_foo *var) {
// NO-SANITIZE-WITHOUT-ATTR-LABEL: define dso_local i32 @test25(
// NO-SANITIZE-WITHOUT-ATTR-SAME: i32 noundef [[C:%.*]], ptr nocapture noundef readonly [[VAR:%.*]]) local_unnamed_addr #[[ATTR7:[0-9]+]] {
// NO-SANITIZE-WITHOUT-ATTR-NEXT: entry:
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[VAR]], align 8, !tbaa [[TBAA11]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[VAR]], align 8, !tbaa [[TBAA14:![0-9]+]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 44
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP1:%.*]] = load i32, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: ret i32 [[TMP1]]
@@ -1600,7 +1600,7 @@ struct test27_foo {
// SANITIZE-WITH-ATTR: cont3:
// SANITIZE-WITH-ATTR-NEXT: [[ENTRIES:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 24
// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x ptr], ptr [[ENTRIES]], i64 0, i64 [[IDXPROM]]
-// SANITIZE-WITH-ATTR-NEXT: [[TMP2:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA14]]
+// SANITIZE-WITH-ATTR-NEXT: [[TMP2:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA19:![0-9]+]]
// SANITIZE-WITH-ATTR-NEXT: [[IDXPROM4:%.*]] = sext i32 [[J]] to i64
// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds [[STRUCT_TEST27_BAR:%.*]], ptr [[TMP2]], i64 [[IDXPROM4]]
// SANITIZE-WITH-ATTR-NEXT: ret ptr [[ARRAYIDX5]]
@@ -1611,7 +1611,7 @@ struct test27_foo {
// NO-SANITIZE-WITH-ATTR-NEXT: [[ENTRIES:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 24
// NO-SANITIZE-WITH-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[I]] to i64
// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x ptr], ptr [[ENTRIES]], i64 0, i64 [[IDXPROM]]
-// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA11]]
+// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA16:![0-9]+]]
// NO-SANITIZE-WITH-ATTR-NEXT: [[IDXPROM1:%.*]] = sext i32 [[J]] to i64
// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds [[STRUCT_TEST27_BAR:%.*]], ptr [[TMP0]], i64 [[IDXPROM1]]
// NO-SANITIZE-WITH-ATTR-NEXT: ret ptr [[ARRAYIDX2]]
@@ -1622,7 +1622,7 @@ struct test27_foo {
// SANITIZE-WITHOUT-ATTR-NEXT: [[ENTRIES:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 24
// SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[I]] to i64
// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x ptr], ptr [[ENTRIES]], i64 0, i64 [[IDXPROM]]
-// SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA14]]
+// SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA19:![0-9]+]]
// SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM3:%.*]] = sext i32 [[J]] to i64
// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX4:%.*]] = getelementptr inbounds [[STRUCT_TEST27_BAR:%.*]], ptr [[TMP0]], i64 [[IDXPROM3]]
// SANITIZE-WITHOUT-ATTR-NEXT: ret ptr [[ARRAYIDX4]]
@@ -1633,7 +1633,7 @@ struct test27_foo {
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ENTRIES:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 24
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[I]] to i64
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x ptr], ptr [[ENTRIES]], i64 0, i64 [[IDXPROM]]
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA11]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA16:![0-9]+]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM1:%.*]] = sext i32 [[J]] to i64
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds [[STRUCT_TEST27_BAR:%.*]], ptr [[TMP0]], i64 [[IDXPROM1]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: ret ptr [[ARRAYIDX2]]
@@ -1651,9 +1651,9 @@ struct test28_foo {
// SANITIZE-WITH-ATTR-LABEL: define dso_local i32 @test28(
// SANITIZE-WITH-ATTR-SAME: ptr noundef [[P:%.*]], i32 noundef [[I:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SANITIZE-WITH-ATTR-NEXT: entry:
-// SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P]], align 8, !tbaa [[TBAA14]]
-// SANITIZE-WITH-ATTR-NEXT: [[TMP1:%.*]] = load ptr, ptr [[TMP0]], align 8, !tbaa [[TBAA14]]
-// SANITIZE-WITH-ATTR-NEXT: [[TMP2:%.*]] = load ptr, ptr [[TMP1]], align 8, !tbaa [[TBAA14]]
+// SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P]], align 8, !tbaa [[TBAA21:![0-9]+]]
+// SANITIZE-WITH-ATTR-NEXT: [[TMP1:%.*]] = load ptr, ptr [[TMP0]], align 8, !tbaa [[TBAA21]]
+// SANITIZE-WITH-ATTR-NEXT: [[TMP2:%.*]] = load ptr, ptr [[TMP1]], align 8, !tbaa [[TBAA21]]
// SANITIZE-WITH-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[I]] to i64
// SANITIZE-WITH-ATTR-NEXT: [[DOTCOUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[TMP2]], i64 8
// SANITIZE-WITH-ATTR-NEXT: [[DOTCOUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOTCOUNTED_BY_GEP]], align 4
@@ -1672,9 +1672,9 @@ struct test28_foo {
// NO-SANITIZE-WITH-ATTR-LABEL: define dso_local i32 @test28(
// NO-SANITIZE-WITH-ATTR-SAME: ptr nocapture noundef readonly [[P:%.*]], i32 noundef [[I:%.*]]) local_unnamed_addr #[[ATTR8]] {
// NO-SANITIZE-WITH-ATTR-NEXT: entry:
-// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P]], align 8, !tbaa [[TBAA11]]
-// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP1:%.*]] = load ptr, ptr [[TMP0]], align 8, !tbaa [[TBAA11]]
-// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP2:%.*]] = load ptr, ptr [[TMP1]], align 8, !tbaa [[TBAA11]]
+// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P]], align 8, !tbaa [[TBAA18:![0-9]+]]
+// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP1:%.*]] = load ptr, ptr [[TMP0]], align 8, !tbaa [[TBAA18]]
+// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP2:%.*]] = load ptr, ptr [[TMP1]], align 8, !tbaa [[TBAA18]]
// NO-SANITIZE-WITH-ATTR-NEXT: [[ARR:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP2]], i64 12
// NO-SANITIZE-WITH-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[I]] to i64
// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARR]], i64 0, i64 [[IDXPROM]]
@@ -1684,9 +1684,9 @@ struct test28_foo {
// SANITIZE-WITHOUT-ATTR-LABEL: define dso_local i32 @test28(
// SANITIZE-WITHOUT-ATTR-SAME: ptr noundef [[P:%.*]], i32 noundef [[I:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SANITIZE-WITHOUT-ATTR-NEXT: entry:
-// SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P]], align 8, !tbaa [[TBAA14]]
-// SANITIZE-WITHOUT-ATTR-NEXT: [[TMP1:%.*]] = load ptr, ptr [[TMP0]], align 8, !tbaa [[TBAA14]]
-// SANITIZE-WITHOUT-ATTR-NEXT: [[TMP2:%.*]] = load ptr, ptr [[TMP1]], align 8, !tbaa [[TBAA14]]
+// SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P]], align 8, !tbaa [[TBAA21:![0-9]+]]
+// SANITIZE-WITHOUT-ATTR-NEXT: [[TMP1:%.*]] = load ptr, ptr [[TMP0]], align 8, !tbaa [[TBAA21]]
+// SANITIZE-WITHOUT-ATTR-NEXT: [[TMP2:%.*]] = load ptr, ptr [[TMP1]], align 8, !tbaa [[TBAA21]]
// SANITIZE-WITHOUT-ATTR-NEXT: [[ARR:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP2]], i64 12
// SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[I]] to i64
// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARR]], i64 0, i64 [[IDXPROM]]
@@ -1696,9 +1696,9 @@ struct test28_foo {
// NO-SANITIZE-WITHOUT-ATTR-LABEL: define dso_local i32 @test28(
// NO-SANITIZE-WITHOUT-ATTR-SAME: ptr nocapture noundef readonly [[P:%.*]], i32 noundef [[I:%.*]]) local_unnamed_addr #[[ATTR7]] {
// NO-SANITIZE-WITHOUT-ATTR-NEXT: entry:
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P]], align 8, !tbaa [[TBAA11]]
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP1:%.*]] = load ptr, ptr [[TMP0]], align 8, !tbaa [[TBAA11]]
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP2:%.*]] = load ptr, ptr [[TMP1]], align 8, !tbaa [[TBAA11]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P]], align 8, !tbaa [[TBAA18:![0-9]+]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP1:%.*]] = load ptr, ptr [[TMP0]], align 8, !tbaa [[TBAA18]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP2:%.*]] = load ptr, ptr [[TMP1]], align 8, !tbaa [[TBAA18]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARR:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP2]], i64 12
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[I]] to i64
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARR]], i64 0, i64 [[IDXPROM]]
@@ -1727,7 +1727,7 @@ struct annotated_struct_array {
// SANITIZE-WITH-ATTR-NEXT: unreachable, !nosanitize [[META2]]
// SANITIZE-WITH-ATTR: cont3:
// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [10 x ptr], ptr [[ANN]], i64 0, i64 [[TMP1]]
-// SANITIZE-WITH-ATTR-NEXT: [[TMP2:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA14]]
+// SANITIZE-WITH-ATTR-NEXT: [[TMP2:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA23:![0-9]+]]
// SANITIZE-WITH-ATTR-NEXT: [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[TMP2]], i64 8
// SANITIZE-WITH-ATTR-NEXT: [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
// SANITIZE-WITH-ATTR-NEXT: [[IDXPROM15:%.*]] = sext i32 [[IDX2]] to i64
@@ -1751,7 +1751,7 @@ struct annotated_struct_array {
// NO-SANITIZE-WITH-ATTR-NEXT: entry:
// NO-SANITIZE-WITH-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[IDX1]] to i64
// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [10 x ptr], ptr [[ANN]], i64 0, i64 [[IDXPROM]]
-// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA11]]
+// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA20:![0-9]+]]
// NO-SANITIZE-WITH-ATTR-NEXT: [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 8
// NO-SANITIZE-WITH-ATTR-NEXT: [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP1:%.*]] = shl i32 [[DOT_COUNTED_BY_LOAD]], 2
@@ -1774,7 +1774,7 @@ struct annotated_struct_array {
// SANITIZE-WITHOUT-ATTR-NEXT: unreachable, !nosanitize [[META9]]
// SANITIZE-WITHOUT-ATTR: cont21:
// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [10 x ptr], ptr [[ANN]], i64 0, i64 [[TMP1]]
-// SANITIZE-WITHOUT-ATTR-NEXT: [[TMP2:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA14]]
+// SANITIZE-WITHOUT-ATTR-NEXT: [[TMP2:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA23:![0-9]+]]
// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP2]], i64 12
// SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM18:%.*]] = sext i32 [[IDX2]] to i64
// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX19:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM18]]
@@ -1786,7 +1786,7 @@ struct annotated_struct_array {
// NO-SANITIZE-WITHOUT-ATTR-NEXT: entry:
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[IDX1]] to i64
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [10 x ptr], ptr [[ANN]], i64 0, i64 [[IDXPROM]]
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = load ptr, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA11]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:...
[truncated]
|
You can test this locally with the following command:git-clang-format --diff 157d847ba737b4136aeb1d92912f549ea1c96d4c 1222bbbd8afe57b870f6b91a40c2324279f91949 --extensions cpp,c -- clang/lib/Driver/ToolChains/Clang.cpp clang/test/CodeGen/attr-counted-by.c clang/test/CodeGen/tbaa-pointers.c clang/test/CodeGen/tbaa-reference.cpp clang/test/CodeGenCXX/template-instantiation.cpp clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp clang/unittests/CodeGen/TBAAMetadataTest.cpp View the diff from clang-format here.diff --git a/clang/unittests/CodeGen/TBAAMetadataTest.cpp b/clang/unittests/CodeGen/TBAAMetadataTest.cpp
index cad8783ea7..adc0218fea 100644
--- a/clang/unittests/CodeGen/TBAAMetadataTest.cpp
+++ b/clang/unittests/CodeGen/TBAAMetadataTest.cpp
@@ -39,11 +39,8 @@ auto OmnipotentCharC = MMTuple(
MConstInt(0, 64)
);
-auto AnyPtr = MMTuple(
- MMString("any pointer"),
- OmnipotentCharC,
- MConstInt(0, 64)
-);
+auto AnyPtr =
+ MMTuple(MMString("any pointer"), OmnipotentCharC, MConstInt(0, 64));
auto OmnipotentCharCXX = MMTuple(
MMString("omnipotent char"),
@@ -116,28 +113,19 @@ TEST(TBAAMetadataTest, BasicTypes) {
MConstInt(0))));
ASSERT_TRUE(I);
- I = matchNext(I,
+ I = matchNext(
+ I,
MInstruction(Instruction::Store,
- MValType(PointerType::getUnqual(Compiler.Context)),
- MMTuple(
- MMTuple(
- MMString("p1 void"),
- AnyPtr,
- MConstInt(0)),
- MSameAs(0),
- MConstInt(0))));
+ MValType(PointerType::getUnqual(Compiler.Context)),
+ MMTuple(MMTuple(MMString("p1 void"), AnyPtr, MConstInt(0)),
+ MSameAs(0), MConstInt(0))));
ASSERT_TRUE(I);
- I = matchNext(I,
- MInstruction(Instruction::Store,
- MValType(PointerType::getUnqual(Compiler.Context)),
- MMTuple(
- MMTuple(
- MMString("p1 int"),
- AnyPtr,
- MConstInt(0)),
- MSameAs(0),
- MConstInt(0))));
+ I = matchNext(
+ I, MInstruction(Instruction::Store,
+ MValType(PointerType::getUnqual(Compiler.Context)),
+ MMTuple(MMTuple(MMString("p1 int"), AnyPtr, MConstInt(0)),
+ MSameAs(0), MConstInt(0))));
ASSERT_TRUE(I);
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems reasonable to me. Since it's standards-compliant, has a specific opt-out flag, and is overridden by the general -fno-strict-aliasing
flag, I don't see any need for the RFC process. Still, you should definitely wait for more feedback, especially from @AaronBallman.
This is not a prerequisite for landing the patch — or a requirement at all, for that matter — but I think it'd be great if you could make a sort of capstone post on the forums talking about this work and your performance analysis in more detail.
58f3bcf
to
1222bbb
Compare
Thanks @rjmccall, I'll try to write something up, although it might take a few weeks. |
What are the chances that this will exploit undefined behavior in existing user code in ways that will make upgrading to the latest Clang more difficult because of the perception of "miscompiles?" If this is a likely scenario for users to hit, do other tools like UBSan (etc) help users to find and fix the issues? I'm not opposed to the changes, I just want to make sure we are setting users up for success when they upgrade. |
We don't yet have a reliable tool for sanitizing TBAA failures. There is a project to do that (the TypeSanitizer, see https://discourse.llvm.org/t/reviving-typesanitizer-a-sanitizer-to-catch-type-based-aliasing-violations/66092), but the last time I tried using the TypeSanitizer, it wasn't ready for production use. |
As @jcranmer-intel mentioned, there is work on a type sanitizer and there are multiple people interested in making improvements to it. Hopefully we will be able to get an initial version in However the existing Given that cutting the release branch is still ~2 months away, I am hoping that enabling this now we will get this exposed to a large number of users before going in the release candidate phase, which should give us a good amount of signal if this is going to be disruptive to a larger number of users. If that's the case, there would be plenty of time to restore to original behavior in time for the Clang 20 release. Another thing to note is that GCC (and I think other commercial compilers) already use this information in their optimizers, so it is probably likely that users of those compilers would already have any existing violations exposed. |
I'd also like to share a bit more info about the expected compile-time impact of the change:
TLDR: While there are some increase in compile-time, those are caused by a number of additional optimizations. The increase in clang build time yields a slightly faster stage2 clang while doing more optimizations. Lets take a closer look at some of the increases in compile-time. The statistics data below is for ARM64 macOS not X86 Linux like the compile-time tracker, but most impact is from IR optimizations, so I don’t expect the data to be vastly different. For stage1-O3, the biggest increase is Bullet +0.39%. This boils down to being able to apply more transformations, most notably probably +70% more loops being vectorized, which itself adds compile-time, as well as adding extra code to simplify. Top Statistic Changes for Bullet For stage1-ReleaseLTO-g the biggest increase is kimwitu++ +0.7%. The input bitcode size after merging all modules before LTO is the same. Looking at the impact on the transformations, we optimize quite a bit more around memory instructions, so likely the main contributors to compile-time being GVN, DSE and LICM. Top Statistic Changes for kimwitu++ Another increase to highlight is clang build time (+0.13%). I didn’t collect stats for this one, but it looks like the extra optimizations make the stage2 Clang about 0.08% faster on CTMark (stage1-O3 +0.01% vs stage2-O3 -0.07%), while doing a number of additional transformations. |
ping :) @AaronBallman WDYT re making this clear in the release notes (also that there's an easy way to disable) + the ongoing work in parallel for the type sanitizer? |
I think it's a reasonable experiment to try. I would call it a potentially breaking change because it's possible this will quietly change code behavior, though. |
Support for more precise TBAA metadata has been added a while ago (behind the -fpointer-tbaa flag). The more precise TBAA metadata allows treating accesses of different pointer types as no-alias. This helps to remove more redundant loads and stores in a number of workloads. Some highlights on the impact across llvm-test-suite's MultiSource, SPEC2006 & SPEC2017 include: * +2% more NoAlias results for memory access * +4% more loops vectorized * +3% more stores removed by DSE. This closes a relatively big gap to GCC, which has been supporting disambiguating based on pointer types for a long time. (https://clang.godbolt.org/z/K7Wbhrz4q) Pointer-TBAA support for pointers to builtin types has been added in llvm#76612. Support for user-defined types has been added in llvm#110569. There are 2 pending PRs with bug fixes for special cases uncovered during some of my testing: * llvm#116991 * llvm#116596
1222bbb
to
fa436bc
Compare
Thanks, I moved the release note to the 'Potentially Breaking Changes' section and included a sentence that this may silently change code behavior for code that has strict-aliasing violations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a strict improvement over the general 'fstrict-aliasing' mode we already have, and I see little value in maintaining separate levels of strict aliasing conformance long term.
I think it would not be a good situation for users to rely on some level of strict aliasing violations, by clang accepting some bugs, rejecting others, without a clear line separating them.
I'd favor not having the flag, with 'fno-strict-aliasing' still being enforced.
I don't have strong objections against the new flag, if it's only temporary, with a clear goalpost for removing it.
I agree that long-term we should remove the separate flag, maybe after one or two releases. Initially I expect it will help users during the transition phase (and also us during bug triage to quickly identify if an issue is related to the new behavior) |
@fhahn we have identified a bunch of tests at google that fail somewhere in the open source library https://code.videolan.org/videolan/x264. The Is this expected? Are the two mutually exclusive? |
This is likely caused by strict-aliasing violations in the source, could you share a reproducer or steps on how to reproduce? |
No reproducer yet. The dependency to |
[Clang] Enable -fpointer-tbaa by default. (llvm/llvm-project#117244)
[Clang] Enable -fpointer-tbaa by default. (llvm/llvm-project#117244)
This reverts commit 7954a05.
This reverts commit 7954a05.
Support for more precise TBAA metadata has been added a while ago (behind the -fpointer-tbaa flag). The more precise TBAA metadata allows treating accesses of different pointer types as no-alias.
This helps to remove more redundant loads and stores in a number of workloads.
Some highlights on the impact across llvm-test-suite's MultiSource, SPEC2006 & SPEC2017 include:
This closes a relatively big gap to GCC, which has been supporting disambiguating based on pointer types for a long time. (https://clang.godbolt.org/z/K7Wbhrz4q)
Pointer-TBAA support for pointers to builtin types has been added in #76612.
Support for user-defined types has been added in
#110569.
There are 2 recent PRs with bug fixes for special cases uncovered during some of my testing: