-
Notifications
You must be signed in to change notification settings - Fork 11.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang] Fix underlying type of EmbedExpr #99050
Conversation
This patch makes remaining cases of #embed to emit int type since there is an agreement to do that for C. C++ is being discussed, but in general we don't want to produce different types for C and C++.
@llvm/pr-subscribers-clang Author: Mariya Podchishchaeva (Fznamznon) ChangesThis patch makes remaining cases of #embed to emit int type since there is an agreement to do that for C. C++ is being discussed, but in general we don't want to produce different types for C and C++. Full diff: https://github.com/llvm/llvm-project/pull/99050.diff 3 Files Affected:
diff --git a/clang/lib/AST/Expr.cpp b/clang/lib/AST/Expr.cpp
index 6af1b5683e08b..9d5b8167d0ee6 100644
--- a/clang/lib/AST/Expr.cpp
+++ b/clang/lib/AST/Expr.cpp
@@ -2376,7 +2376,7 @@ APValue SourceLocExpr::EvaluateInContext(const ASTContext &Ctx,
EmbedExpr::EmbedExpr(const ASTContext &Ctx, SourceLocation Loc,
EmbedDataStorage *Data, unsigned Begin,
unsigned NumOfElements)
- : Expr(EmbedExprClass, Ctx.UnsignedCharTy, VK_PRValue, OK_Ordinary),
+ : Expr(EmbedExprClass, Ctx.IntTy, VK_PRValue, OK_Ordinary),
EmbedKeywordLoc(Loc), Ctx(&Ctx), Data(Data), Begin(Begin),
NumOfElements(NumOfElements) {
setDependence(ExprDependence::None);
diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index a27ed02fc73b8..2c963b28aeb2b 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -2007,7 +2007,7 @@ static bool canInitializeArrayWithEmbedDataString(ArrayRef<Expr *> ExprList,
if (InitType->isArrayType()) {
const ArrayType *InitArrayType = InitType->getAsArrayTypeUnsafe();
QualType InitElementTy = InitArrayType->getElementType();
- QualType EmbedExprElementTy = EE->getType();
+ QualType EmbedExprElementTy = EE->getDataStringLiteral()->getType();
const bool TypesMatch =
Context.typesAreCompatible(InitElementTy, EmbedExprElementTy) ||
(InitElementTy->isCharType() && EmbedExprElementTy->isCharType());
diff --git a/clang/test/Preprocessor/embed_codegen.cpp b/clang/test/Preprocessor/embed_codegen.cpp
index 201bf300bc669..fdbf4f2aa400f 100644
--- a/clang/test/Preprocessor/embed_codegen.cpp
+++ b/clang/test/Preprocessor/embed_codegen.cpp
@@ -13,9 +13,9 @@ int ca[] = {
};
// CHECK: %arrayinit.element = getelementptr inbounds i32, ptr %notca, i64 1
-// CHECK: store i8 106, ptr %arrayinit.element, align 4
+// CHECK: store i32 106, ptr %arrayinit.element, align 4
// CHECK: %arrayinit.element1 = getelementptr inbounds i32, ptr %notca, i64 2
-// CHECK: store i8 107, ptr %arrayinit.element1, align 4
+// CHECK: store i32 107, ptr %arrayinit.element1, align 4
int notca[] = {
a
#embed <jk.txt> prefix(,)
@@ -74,9 +74,9 @@ constexpr struct T t[] = {
// CHECK: %arrayinit.element7 = getelementptr inbounds %struct.T, ptr %tnonc, i64 1
// CHECK: call void @llvm.memset.p0.i64(ptr align 4 %arrayinit.element7, i8 0, i64 20, i1 false)
// CHECK: %arr8 = getelementptr inbounds %struct.T, ptr %arrayinit.element7, i32 0, i32 0
-// CHECK: store i8 106, ptr %arr8, align 4
+// CHECK: store i32 106, ptr %arr8, align 4
// CHECK: %arrayinit.element9 = getelementptr inbounds i32, ptr %arr8, i64 1
-// CHECK: store i8 107, ptr %arrayinit.element9, align 4
+// CHECK: store i32 107, ptr %arrayinit.element9, align 4
struct T tnonc[] = {
a, 300, 1, 2, 3
#embed <jk.txt> prefix(,)
|
@@ -13,9 +13,9 @@ int ca[] = { | |||
}; | |||
|
|||
// CHECK: %arrayinit.element = getelementptr inbounds i32, ptr %notca, i64 1 | |||
// CHECK: store i8 106, ptr %arrayinit.element, align 4 | |||
// CHECK: store i32 106, ptr %arrayinit.element, align 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually shows a bug in CodeGen for #embed, it should have always stored i32 because the target storage is i32. A cast emission is missing in CGExprAgg.cpp. Though I'm not sure it should be included into this PR.
Another problem is that I can't think of a good test case that shows that elements embedded in the initializer list are of int type and not of unsigned char type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to write a test case and found an unrelated embed crash https://compiler-explorer.com/z/bMWbGfh79
can we add the test here #97274 (comment) (with sizeof) ? Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM but I agree with @cor3ntin that we should add the test case from #97274 (comment) explicitly. Some other tests to consider adding at the same time:
static_assert(_Generic(
#embed __FILE__ limit(1)
, int : 1, default : 0));
static_assert(alignof(typeof(
#embed __FILE__ limit(1)
)) == alignof(int));
printf("%hhu", // Do we get a -Wformat diagnostic about the specifier mismatch?
#embed __FILE__ limit(1)
);
I can add test with sizeof for sure, the problem is that it is already working https://godbolt.org/z/K69v8KPYa and we have similar ones in codebase already. |
Ah, if we have sufficient coverage already, that's fine. I was just thinking about coverage in general, didn't check to see if it was redundant. :-)
If you have an embed file with a single byte in it whose value is 0xFF, you could use a test like:
which should give the same behavior as: https://godbolt.org/z/Tno5YPexM |
Thanks! That worked. I didn't add these though
for several reasons:
|
Posted #99624 to fix the assertion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This patch makes remaining cases of #embed to emit int type since there is an agreement to do that for C. C++ is being discussed, but in general we don't want to produce different types for C and C++.
Summary: This patch makes remaining cases of #embed to emit int type since there is an agreement to do that for C. C++ is being discussed, but in general we don't want to produce different types for C and C++. Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60251342
This patch makes remaining cases of #embed to emit int type since there is an agreement to do that for C. C++ is being discussed, but in general we don't want to produce different types for C and C++.