-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[clang][CompundLiteralExpr] Don't defer evaluation for CLEs #137163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-clang Author: kadir çetinkaya (kadircet) ChangesPreviously we would defer evaluation of CLEs until LValue to RValue This patch instead eagerly evaluates CLEs, within the scope requiring Full diff: https://github.com/llvm/llvm-project/pull/137163.diff 4 Files Affected:
diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h
index a83320a7ddec2..95c0f910c22f8 100644
--- a/clang/include/clang/AST/Expr.h
+++ b/clang/include/clang/AST/Expr.h
@@ -3489,6 +3489,11 @@ class CompoundLiteralExpr : public Expr {
/// The int part of the pair stores whether this expr is file scope.
llvm::PointerIntPair<TypeSourceInfo *, 1, bool> TInfoAndScope;
Stmt *Init;
+
+ /// Value of constant literals with static storage duration. Used only for
+ /// constant folding as CompoundLiteralExpr is not an ICE.
+ mutable APValue *StaticValue = nullptr;
+
public:
CompoundLiteralExpr(SourceLocation lparenloc, TypeSourceInfo *tinfo,
QualType T, ExprValueKind VK, Expr *init, bool fileScope)
@@ -3518,6 +3523,13 @@ class CompoundLiteralExpr : public Expr {
TInfoAndScope.setPointer(tinfo);
}
+ bool hasStaticStorage() const { return isFileScope() && isGLValue(); }
+ APValue *getOrCreateStaticValue(ASTContext& Ctx) const;
+ APValue &getStaticValue() const {
+ assert(StaticValue);
+ return *StaticValue;
+ }
+
SourceLocation getBeginLoc() const LLVM_READONLY {
// FIXME: Init should never be null.
if (!Init)
diff --git a/clang/lib/AST/Expr.cpp b/clang/lib/AST/Expr.cpp
index 59c0e47c7c195..442e85b892a51 100644
--- a/clang/lib/AST/Expr.cpp
+++ b/clang/lib/AST/Expr.cpp
@@ -5467,3 +5467,12 @@ ConvertVectorExpr *ConvertVectorExpr::Create(
return new (Mem) ConvertVectorExpr(SrcExpr, TI, DstType, VK, OK, BuiltinLoc,
RParenLoc, FPFeatures);
}
+
+APValue *CompoundLiteralExpr::getOrCreateStaticValue(ASTContext &Ctx) const {
+ assert(hasStaticStorage());
+ if (!StaticValue) {
+ StaticValue = new (Ctx) APValue;
+ Ctx.addDestruction(StaticValue);
+ }
+ return StaticValue;
+}
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 7c933f47bf7f0..2379e78c1631a 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -4596,10 +4596,6 @@ handleLValueToRValueConversion(EvalInfo &Info, const Expr *Conv, QualType Type,
return false;
}
- APValue Lit;
- if (!Evaluate(Lit, Info, CLE->getInitializer()))
- return false;
-
// According to GCC info page:
//
// 6.28 Compound Literals
@@ -4622,7 +4618,12 @@ handleLValueToRValueConversion(EvalInfo &Info, const Expr *Conv, QualType Type,
}
}
- CompleteObject LitObj(LVal.Base, &Lit, Base->getType());
+ APValue *Lit =
+ CLE->hasStaticStorage()
+ ? &CLE->getStaticValue()
+ : Info.CurrentCall->getTemporary(Base, LVal.Base.getVersion());
+
+ CompleteObject LitObj(LVal.Base, Lit, Base->getType());
return extractSubobject(Info, Conv, LitObj, LVal.Designator, RVal, AK);
} else if (isa<StringLiteral>(Base) || isa<PredefinedExpr>(Base)) {
// Special-case character extraction so we don't have to construct an
@@ -9125,9 +9126,25 @@ bool
LValueExprEvaluator::VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
assert((!Info.getLangOpts().CPlusPlus || E->isFileScope()) &&
"lvalue compound literal in c++?");
- // Defer visiting the literal until the lvalue-to-rvalue conversion. We can
- // only see this when folding in C, so there's no standard to follow here.
- return Success(E);
+ APValue *Lit;
+ // If CompountLiteral has static storage, its value can be used outside
+ // this expression. So evaluate it once and store it in ASTContext.
+ if (E->hasStaticStorage()) {
+ Lit = E->getOrCreateStaticValue(Info.Ctx);
+ Result.set(E);
+ // Reset any previously evaluated state, otherwise evaluation below might
+ // fail.
+ // FIXME: Should we just re-use the previously evaluated value instead?
+ *Lit = APValue();
+ } else {
+ Lit = &Info.CurrentCall->createTemporary(E, E->getInitializer()->getType(),
+ ScopeKind::FullExpression, Result);
+ }
+ if (!EvaluateInPlace(*Lit, Info, Result, E->getInitializer())) {
+ *Lit = APValue();
+ return false;
+ }
+ return true;
}
bool LValueExprEvaluator::VisitCXXTypeidExpr(const CXXTypeidExpr *E) {
diff --git a/clang/test/AST/static-compound-literals.cpp b/clang/test/AST/static-compound-literals.cpp
new file mode 100644
index 0000000000000..ceb8b985ab9dc
--- /dev/null
+++ b/clang/test/AST/static-compound-literals.cpp
@@ -0,0 +1,12 @@
+// Test that we can successfully compile this code, especially under ASAN.
+// RUN: %clang_cc1 -verify -std=c++20 -fsyntax-only %s
+// expected-no-diagnostics
+struct Foo {
+ Foo* f;
+ operator bool() const { return true; }
+};
+constexpr Foo f((Foo[]){});
+int foo() {
+ if (Foo(*f.f)) return 1;
+ return 0;
+}
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
Followup for #118480 |
c3efccd
to
99fa9a1
Compare
clang/include/clang/AST/Expr.h
Outdated
@@ -3489,6 +3489,11 @@ class CompoundLiteralExpr : public Expr { | |||
/// The int part of the pair stores whether this expr is file scope. | |||
llvm::PointerIntPair<TypeSourceInfo *, 1, bool> TInfoAndScope; | |||
Stmt *Init; | |||
|
|||
/// Value of constant literals with static storage duration. Used only for | |||
/// constant folding as CompoundLiteralExpr is not an ICE. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A compound literal expression is an ICE in C though?
An integer constant expression116) shall have integer type and shall only have operands that are
integer constants, named and compound literal constants of integer type, character constants,
sizeof expressions whose results are integer constants, alignof expressions, and floating, named,
or compound literal constants of arithmetic type that are the immediate operands of casts. Cast
operators in an integer constant expression shall only convert arithmetic types to integer types,
except as part of an operand to the typeof operators, sizeof operator, or alignof operator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, but that's probably from an old comment that confused us (at least me):
LValueExprEvaluator
used to say:
// only see this when folding in C, so there's no standard to follow here
I think we can just remove this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am happy to drop that comment, but I was inclined to preserve it as well since I came across logic aligned with that in couple of other places too, e.g.
llvm-project/clang/lib/AST/ExprConstant.cpp
Line 17278 in 5bad5d8
case Expr::CompoundLiteralExprClass: |
clang/include/clang/AST/Expr.h
Outdated
@@ -3489,6 +3489,11 @@ class CompoundLiteralExpr : public Expr { | |||
/// The int part of the pair stores whether this expr is file scope. | |||
llvm::PointerIntPair<TypeSourceInfo *, 1, bool> TInfoAndScope; | |||
Stmt *Init; | |||
|
|||
/// Value of constant literals with static storage duration. Used only for | |||
/// constant folding as CompoundLiteralExpr is not an ICE. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, but that's probably from an old comment that confused us (at least me):
LValueExprEvaluator
used to say:
// only see this when folding in C, so there's no standard to follow here
I think we can just remove this comment.
// Reset any previously evaluated state, otherwise evaluation below might | ||
// fail. | ||
// FIXME: Should we just re-use the previously evaluated value instead? | ||
*Lit = APValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The evaluation should produce the same result every time, so it doesn't really do any harm to do it twice. But if you do it right, it's also very easy to avoid evaluating it twice.
I suspect using EvaluateInPlace will cause state to leak into the evaluation when it shouldn't (like, we accidentally accept a use of a temporary inside the initializer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I don't fully understand the implications here, I'd rather keep the evaluation behavior similar to the previous version (e.g. evaluate every-time we need it).
Happy to evaluate once if you think that should be safe ~always and feel strongly about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm more concerned about the fact that you're using EvaluateInPlace, as opposed to using EvaluateAsInitializer. Which... I'm not sure it's actually possible to observe a difference at the moment due to the rule that static compound literals are required to have a constant initializer, but it's fragile.
On a sort of related note, I was experimenting with some testcases, and the following crashed:
struct A {int x[1]; };
A f();
typedef int *t[];
consteval int* f(int* x) { return x; }
int ** x = (t){f(f().x)};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think of the opposite. The code here mimics what MaterializeTemporaryExpr
does and I think it's a good thing because MaterializeTemporaryExpr
should be the most battle-tested code for similar cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also it seems to EvaluateAsInitializer
requires a VarDecl
, I am not sure if that makes sense in LValueExprEvaluator
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dug into the EvalMode thing... I remember why we added it now. See https://reviews.llvm.org/D151587 for context. Roughly, the issue is that, if you constant-evaluate the expression in isolation, you get a different result from what you'd get evaluating the whole expression. So we can't redo the evaluation in ConstantFold mode: if you do, you'll corrupt the precomputed value, for cases like clang/test/CodeGenCXX/const-init-cxx1y.cpp.
I think compound literals run into similar issues if you allow ConstantFold mode to overwrite the evaluated value.
Maybe we can add some workaround specifically for cases where we immediately do an lvalue-to-rvalue conversion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the late response here, it took me a while to wrap my head around it and then I got interrupted :(
Roughly, the issue is that, if you constant-evaluate the expression in isolation, you get a different result from what you'd get evaluating the whole expression. So we can't redo the evaluation in ConstantFold mode: if you do, you'll corrupt the precomputed value
I am not sure if this applies to CLEs and even if it does I don't think this patch is making it worse:
- CLEs can only have constant expressions as initializers, as you also highlighted above. Hence (i think) evaluating them multiple times or evaluating sub-parts of them shouldn't matter.
- As things stand clang is already evaluating CLEs from scratch every time it needs to (so if there's a problem here, it already exists).
I think compound literals run into similar issues if you allow ConstantFold mode to overwrite the evaluated value.
Sorry if I am being dense, but there's a good chance I didn't fully understand your point. So if you think what I said above doesn't make sense, I'd really appreciate if you can provide a code snippet that demonstrates such a case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did some experiments. The !CLETy.isConstant(Info.Ctx)
check mostly stops the bad interaction with examples like const-init-cxx1y.cpp. If it's not constant, we can't access it at all (unlike lifetime-extended temporaries), and if it is constant it can't be modified. Re-evaluating can change the value in some weird cases, though. For example:
struct RR { int r; };
struct Z { int x; const RR* y; int z; };
constinit Z z = { 10, (const RR[1]){__builtin_constant_p(z.x)}, z.y->r };
The field "r" in the compound literal is initialized to zero, but the read from the field returns 1.
But you're right that it isn't a regression. Maybe sufficient to add a FIXME explaining why it's wrong to evaluate in the current context.
On a related note, the following cases crash, but they're not regressions, I guess.
struct RR { int&& r; };
struct Z { RR* x; };
constinit Z z = { (RR[1]){1} };
struct RR { int r; };
struct Z { int x; const RR* y; int z; };
inline int f() { return 0; }
Z z2 = { 10, (const RR[1]){__builtin_constant_p(z2.x)}, z2.y->r+f() };
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you're right that it isn't a regression. Maybe sufficient to add a FIXME explaining why it's wrong to evaluate in the current context.
I added the example as a test case and left a FIXME. but I am not really sure if it's about re-evaluation. I fiddled with the code a little to not re-evaluate CLE's initializer again but use the previous one, it still initializes field to 1 🤷
I'll also create an issue for that, in case someone wants to investigate (FWIW, that behavior is same without this patch).
As for the crashes, again they also happen at HEAD, still added test cases for those too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not really about whether we re-evaluate it, it's about whether we evaluate it in the right context. If you implement something like VarDecl::evaluateValue(), it should evaluate to zero.
But I think I'm okay with this as an incremental improvement.
// Reset any previously evaluated state, otherwise evaluation below might | ||
// fail. | ||
// FIXME: Should we just re-use the previously evaluated value instead? | ||
*Lit = APValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm more concerned about the fact that you're using EvaluateInPlace, as opposed to using EvaluateAsInitializer. Which... I'm not sure it's actually possible to observe a difference at the moment due to the rule that static compound literals are required to have a constant initializer, but it's fragile.
On a sort of related note, I was experimenting with some testcases, and the following crashed:
struct A {int x[1]; };
A f();
typedef int *t[];
consteval int* f(int* x) { return x; }
int ** x = (t){f(f().x)};
ping @AaronBallman @efriedma-quic if you don't have more concerns here, I'd like to move forward with this patch |
Previously we would defer evaluation of CLEs until LValue to RValue conversions, which would result in creating values within wrong scope and triggering use-after-frees. This patch instead eagerly evaluates CLEs, within the scope requiring them. This requires storing an extra pointer for CLE expressions with static storage.
ee8e0f8
to
6ca1c58
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the review @efriedma-quic! LMK if you have more concerns, otherwise i'd like to move forward with the change.
// Reset any previously evaluated state, otherwise evaluation below might | ||
// fail. | ||
// FIXME: Should we just re-use the previously evaluated value instead? | ||
*Lit = APValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you're right that it isn't a regression. Maybe sufficient to add a FIXME explaining why it's wrong to evaluate in the current context.
I added the example as a test case and left a FIXME. but I am not really sure if it's about re-evaluation. I fiddled with the code a little to not re-evaluate CLE's initializer again but use the previous one, it still initializes field to 1 🤷
I'll also create an issue for that, in case someone wants to investigate (FWIW, that behavior is same without this patch).
As for the crashes, again they also happen at HEAD, still added test cases for those too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// Reset any previously evaluated state, otherwise evaluation below might | ||
// fail. | ||
// FIXME: Should we just re-use the previously evaluated value instead? | ||
*Lit = APValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not really about whether we re-evaluate it, it's about whether we evaluate it in the right context. If you implement something like VarDecl::evaluateValue(), it should evaluate to zero.
But I think I'm okay with this as an incremental improvement.
Previously we would defer evaluation of CLEs until LValue to RValue
conversions, which would result in creating values within wrong scope
and triggering use-after-frees.
This patch instead eagerly evaluates CLEs, within the scope requiring
them. This requires storing an extra pointer for CLE expressions with
static storage.
Fixes #137165