-
Notifications
You must be signed in to change notification settings - Fork 13.6k
Global string alignment #142346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Global string alignment #142346
Conversation
@uweigand FYI |
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-llvm-ir Author: Dominik Steenken (dominik-steenken) ChangesWhen creating global strings, some targets have requirements that need to be taken into account. Previously, the global strings created by This commit makes it so that the alignment is taken from the data layout instead, giving targets the chance to align global strings according to their preferences. This PR is motivated by (and should fix) #141491, where the 1-byte alignment in a global string created by a I don't know and would appreciate feedback on whether this is the correct place to fix something like that. Full diff: https://github.com/llvm/llvm-project/pull/142346.diff 2 Files Affected:
diff --git a/clang/test/CodeGen/SystemZ/align-systemz-globalstring.c b/clang/test/CodeGen/SystemZ/align-systemz-globalstring.c
new file mode 100644
index 0000000000000..f5178a079f898
--- /dev/null
+++ b/clang/test/CodeGen/SystemZ/align-systemz-globalstring.c
@@ -0,0 +1,14 @@
+// RUN: %clang_cc1 -O1 -triple s390x-linux-gnu -emit-llvm %s -o - | FileCheck %s
+
+// #include <stdio.h>
+
+// CHECK: @msg1 = local_unnamed_addr constant [13 x i8] c"Hello World\0A\00", align 2
+// CHECK: @str = private unnamed_addr constant [12 x i8] c"Hello World\00", align 2
+
+const char msg1 [] = "Hello World\n";
+
+extern int printf(const char *__restrict __format, ...);
+
+void foo() {
+ printf(msg1);
+}
\ No newline at end of file
diff --git a/llvm/lib/IR/IRBuilder.cpp b/llvm/lib/IR/IRBuilder.cpp
index 580b0af709337..ab06a587a861f 100644
--- a/llvm/lib/IR/IRBuilder.cpp
+++ b/llvm/lib/IR/IRBuilder.cpp
@@ -52,7 +52,7 @@ GlobalVariable *IRBuilderBase::CreateGlobalString(StringRef Str,
*M, StrConstant->getType(), true, GlobalValue::PrivateLinkage,
StrConstant, Name, nullptr, GlobalVariable::NotThreadLocal, AddressSpace);
GV->setUnnamedAddr(GlobalValue::UnnamedAddr::Global);
- GV->setAlignment(Align(1));
+ GV->setAlignment(M->getDataLayout().getPreferredAlign(GV));
return GV;
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is the right fix for the issue. Just like for allocas, the global variable alignment is a minimum required alignment, which can and should be raised by targets if it improves code generation. (The exception to this is if the global has a section.)
It looks like CGP has a shouldAlignPointerArgs hook for this purpose, but that seems overly specific for what is desired here. I'm surprised we don't seem have something more generic for this...
llvm/lib/IR/IRBuilder.cpp
Outdated
@@ -52,7 +52,7 @@ GlobalVariable *IRBuilderBase::CreateGlobalString(StringRef Str, | |||
*M, StrConstant->getType(), true, GlobalValue::PrivateLinkage, | |||
StrConstant, Name, nullptr, GlobalVariable::NotThreadLocal, AddressSpace); | |||
GV->setUnnamedAddr(GlobalValue::UnnamedAddr::Global); | |||
GV->setAlignment(Align(1)); | |||
GV->setAlignment(M->getDataLayout().getPreferredAlign(GV)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is just equivalent to not setting the alignment at all...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well it did change the alignment of the string produced by the PrintFOptimizer from 1
to 2
, which is why i thought i might be on the right track.
Like i said, i'm very open to moving this to someplace different, just lacking the background knowledge of where to put it :) I'm going to try and take a look at how we handle |
If there's some general rule that means we want to increase the alignment of globals, I think it makes sense to make CodeGenPrepare or some target-specific pass, instead of trying to fix every frontend and every optimization pass to use increased alignment. |
97f3c20
to
f646713
Compare
So i've been reading through the code and the history a bit, and there is this commit, which introduces the concept of a minimum alignment for globals to From what i can tell, this doesn't work here because the optimization for the I don't know what should happen in general for optimizations that end up adding global variables, but in this particular case i think there is an easier way. The point of this optimization is to convert the |
The clang patch was written the way it was because it was necessary to comply with the ABI rules. Strings passed to printf don't have any sort of alignment requirement, so you can't really appeal to the ABI rules here, I think? The problem with copying the alignment is that we don't really have any reason to believe the alignment of the original string is actually appropriate. For the original patch, I think the comment was that the patch was equivalent to deleting the setAlignment() call. And we really don't want to set the alignment that high; the current getPreferredAlign() prefer higher alignment. Maybe |
When creating global strings, some targets have requirements that need to be taken into account. Previously, the global strings created by `IRBuilder::createGlobalString` had a hard-coded alignment of `1`. This commit makes it so that the alignment is taken from the data layout instead, giving targets the chance to align global strings according to their preferences.
This commit adds a test to the `clang` test suite for the SystemZ backend that checks for the correct alignment of global strings created by the PrintFOptimizer for the case `printf("foo\n")` -> `puts("foo")`.
f646713
to
59f7f10
Compare
Ok. As far as i can tell, M->getDataLayout().getPrefTypeAlign(getInt8Ty()) should be functionally equivalent to M->getDataLayout().getPreferredAlign(GV) for strings, with the exception that the latter potentially preserves a preexisting alignment on the In any case, i've reverted my changes to copy the alignment, and changed |
Just to be clear, the ABI requirement is that every symbol has to be at least 2-byte aligned. The problem is not passing the address of the string to printf, the problem is loading that address into a register in the first place. The most efficient way to load an address uses a PC-relative load address instruction, and this requires that the target symbol is 2-byte aligned. |
When creating global strings, some targets have requirements that need to be taken into account. Previously, the global strings created by
IRBuilder::createGlobalString
had a hard-coded alignment of1
.This commit makes it so that the alignment is taken from the data layout instead, giving targets the chance to align global strings according to their preferences.
This PR is motivated by (and should fix) #141491, where the 1-byte alignment in a global string created by a
printf
optimization led to the resulting assembly in a-fno-PIC
compile to unexpectedly reference the GOT based on whether the code containedprintf
statements of the formprintf("foo\n");
.I don't know and would appreciate feedback on whether this is the correct place to fix something like that.