Skip to content

[CIR] Upstream lowering of conditional operators to TernaryOp #138156

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 3, 2025

Conversation

mmha
Copy link
Contributor

@mmha mmha commented May 1, 2025

This patch adds visitors for BinLAnd, BinLOr and AbstractConditionalOperator. Note that this patch still lacks visitation of OpaqueValueExpr which is needed for the GNU ?: operator.

@mmha mmha requested review from erichkeane and andykaylor May 1, 2025 15:49
@mmha mmha requested review from lanza and bcardosolopes as code owners May 1, 2025 15:49
@llvmbot llvmbot added clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project labels May 1, 2025
@llvmbot
Copy link
Member

llvmbot commented May 1, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clangir

Author: Morris Hafner (mmha)

Changes

This patch adds visitors for BinLAnd, BinLOr and AbstractConditionalOperator. Note that this patch still lacks visitation of OpaqueValueExpr which is needed for the GNU ?: operator.


Patch is 63.94 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/138156.diff

12 Files Affected:

  • (modified) clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h (+16)
  • (modified) clang/include/clang/CIR/MissingFeatures.h (+4-1)
  • (modified) clang/lib/CIR/CodeGen/CIRGenDecl.cpp (+1-2)
  • (modified) clang/lib/CIR/CodeGen/CIRGenExpr.cpp (+263-13)
  • (modified) clang/lib/CIR/CodeGen/CIRGenExprAggregate.cpp (+9-2)
  • (modified) clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp (+334)
  • (modified) clang/lib/CIR/CodeGen/CIRGenFunction.cpp (+14)
  • (modified) clang/lib/CIR/CodeGen/CIRGenFunction.h (+243-2)
  • (modified) clang/lib/CIR/CodeGen/CIRGenValue.h (+1)
  • (added) clang/test/CIR/CodeGen/binop.c (+13)
  • (modified) clang/test/CIR/CodeGen/binop.cpp (+225)
  • (added) clang/test/CIR/CodeGen/ternary.cpp (+120)
diff --git a/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h b/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
index d58ced6ec8bff..104cf9e3d3875 100644
--- a/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
+++ b/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
@@ -272,6 +272,22 @@ class CIRBaseBuilderTy : public mlir::OpBuilder {
     return createCast(loc, cir::CastKind::bitcast, src, newTy);
   }
 
+  // TODO(cir): the following function was introduced to keep in sync with LLVM
+  // codegen. CIR does not have "zext" operations. It should eventually be
+  // renamed or removed. For now, we just add whatever cast is required here.
+  mlir::Value createZExtOrBitCast(mlir::Location loc, mlir::Value src,
+                                  mlir::Type newTy) {
+    mlir::Type srcTy = src.getType();
+
+    if (srcTy == newTy)
+      return src;
+
+    if (mlir::isa<cir::BoolType>(srcTy) && mlir::isa<cir::IntType>(newTy))
+      return createBoolToInt(src, newTy);
+
+    llvm_unreachable("unhandled extension cast");
+  }
+
   //===--------------------------------------------------------------------===//
   // Binary Operators
   //===--------------------------------------------------------------------===//
diff --git a/clang/include/clang/CIR/MissingFeatures.h b/clang/include/clang/CIR/MissingFeatures.h
index 4d4951aa0e126..7b345cf1e8c7f 100644
--- a/clang/include/clang/CIR/MissingFeatures.h
+++ b/clang/include/clang/CIR/MissingFeatures.h
@@ -163,6 +163,8 @@ struct MissingFeatures {
   static bool setDSOLocal() { return false; }
   static bool foldCaseStmt() { return false; }
   static bool constantFoldSwitchStatement() { return false; }
+  static bool peepholeProtection() { return false; }
+  static bool instrumenation() { return false; }
 
   // Missing types
   static bool dataMemberType() { return false; }
@@ -188,8 +190,9 @@ struct MissingFeatures {
   static bool ptrStrideOp() { return false; }
   static bool selectOp() { return false; }
   static bool switchOp() { return false; }
-  static bool ternaryOp() { return false; }
+  static bool throwOp() { return false; }
   static bool tryOp() { return false; }
+  static bool vecTernaryOp() { return false; }
   static bool zextOp() { return false; }
 
   // Future CIR attributes
diff --git a/clang/lib/CIR/CodeGen/CIRGenDecl.cpp b/clang/lib/CIR/CodeGen/CIRGenDecl.cpp
index 90498cd18f46b..6f1ed2959c1dc 100644
--- a/clang/lib/CIR/CodeGen/CIRGenDecl.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenDecl.cpp
@@ -50,8 +50,7 @@ CIRGenFunction::emitAutoVarAlloca(const VarDecl &d) {
   // A normal fixed sized variable becomes an alloca in the entry block,
   mlir::Type allocaTy = convertTypeForMem(ty);
   // Create the temp alloca and declare variable using it.
-  address = createTempAlloca(allocaTy, alignment, loc, d.getName(),
-                             /*insertIntoFnEntryBlock=*/false);
+  address = createTempAlloca(allocaTy, alignment, loc, d.getName());
   declare(address.getPointer(), &d, ty, getLoc(d.getSourceRange()), alignment);
 
   emission.Addr = address;
diff --git a/clang/lib/CIR/CodeGen/CIRGenExpr.cpp b/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
index da5a0b97a395e..48102e8c56b3e 100644
--- a/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
@@ -15,6 +15,7 @@
 #include "CIRGenModule.h"
 #include "CIRGenValue.h"
 #include "mlir/IR/BuiltinAttributes.h"
+#include "mlir/IR/Value.h"
 #include "clang/AST/Attr.h"
 #include "clang/AST/CharUnits.h"
 #include "clang/AST/Decl.h"
@@ -22,6 +23,7 @@
 #include "clang/AST/ExprCXX.h"
 #include "clang/CIR/Dialect/IR/CIRDialect.h"
 #include "clang/CIR/MissingFeatures.h"
+#include <optional>
 
 using namespace clang;
 using namespace clang::CIRGen;
@@ -218,7 +220,7 @@ void CIRGenFunction::emitStoreThroughLValue(RValue src, LValue dst,
 
 static LValue emitGlobalVarDeclLValue(CIRGenFunction &cgf, const Expr *e,
                                       const VarDecl *vd) {
-  QualType T = e->getType();
+  QualType t = e->getType();
 
   // If it's thread_local, emit a call to its wrapper function instead.
   assert(!cir::MissingFeatures::opGlobalThreadLocal());
@@ -248,7 +250,7 @@ static LValue emitGlobalVarDeclLValue(CIRGenFunction &cgf, const Expr *e,
     cgf.cgm.errorNYI(e->getSourceRange(),
                      "emitGlobalVarDeclLValue: reference type");
   else
-    lv = cgf.makeAddrLValue(addr, T, AlignmentSource::Decl);
+    lv = cgf.makeAddrLValue(addr, t, AlignmentSource::Decl);
   assert(!cir::MissingFeatures::setObjCGCLValueClass());
   return lv;
 }
@@ -948,6 +950,165 @@ void CIRGenFunction::emitIgnoredExpr(const Expr *e) {
   emitLValue(e);
 }
 
+// Handle the case where the condition is a constant evaluatable simple integer,
+// which means we don't have to separately handle the true/false blocks.
+static std::optional<LValue> handleConditionalOperatorLValueSimpleCase(
+    CIRGenFunction &cgf, const AbstractConditionalOperator *e) {
+  const Expr *condExpr = e->getCond();
+  bool condExprBool;
+  if (cgf.constantFoldsToSimpleInteger(condExpr, condExprBool)) {
+    const Expr *live = e->getTrueExpr(), *dead = e->getFalseExpr();
+    if (!condExprBool)
+      std::swap(live, dead);
+
+    if (!cgf.containsLabel(dead)) {
+      // If the true case is live, we need to track its region.
+      if (condExprBool) {
+        assert(!cir::MissingFeatures::incrementProfileCounter());
+      }
+      // If a throw expression we emit it and return an undefined lvalue
+      // because it can't be used.
+      if (isa<CXXThrowExpr>(live->IgnoreParens())) {
+        assert(!cir::MissingFeatures::throwOp());
+        cgf.cgm.errorNYI(live->getSourceRange(),
+                         "throw expressions in conditional operator");
+        return std::nullopt;
+      }
+      return cgf.emitLValue(live);
+    }
+  }
+  return std::nullopt;
+}
+
+/// Emit the operand of a glvalue conditional operator. This is either a glvalue
+/// or a (possibly-parenthesized) throw-expression. If this is a throw, no
+/// LValue is returned and the current block has been terminated.
+static std::optional<LValue> emitLValueOrThrowExpression(CIRGenFunction &cgf,
+                                                         const Expr *operand) {
+  if (isa<CXXThrowExpr>(operand->IgnoreParens())) {
+    assert(!cir::MissingFeatures::throwOp());
+    cgf.cgm.errorNYI(operand->getSourceRange(),
+                     "throw expressions in conditional operator");
+    return std::nullopt;
+  }
+
+  return cgf.emitLValue(operand);
+}
+
+// Create and generate the 3 blocks for a conditional operator.
+// Leaves the 'current block' in the continuation basic block.
+template <typename FuncTy>
+CIRGenFunction::ConditionalInfo
+CIRGenFunction::emitConditionalBlocks(const AbstractConditionalOperator *e,
+                                      const FuncTy &branchGenFunc) {
+  ConditionalInfo info;
+  CIRGenFunction &cgf = *this;
+  ConditionalEvaluation eval(cgf);
+  mlir::Location loc = cgf.getLoc(e->getSourceRange());
+  CIRGenBuilderTy &builder = cgf.getBuilder();
+  Expr *trueExpr = e->getTrueExpr();
+  Expr *falseExpr = e->getFalseExpr();
+
+  mlir::Value condV = cgf.emitOpOnBoolExpr(loc, e->getCond());
+  SmallVector<mlir::OpBuilder::InsertPoint, 2> insertPoints{};
+  mlir::Type yieldTy{};
+
+  auto emitBranch = [&](mlir::OpBuilder &b, mlir::Location loc, Expr *expr,
+                        std::optional<LValue> &branchInfo) {
+    CIRGenFunction::LexicalScope lexScope{cgf, loc, b.getInsertionBlock()};
+    cgf.curLexScope->setAsTernary();
+
+    assert(!cir::MissingFeatures::incrementProfileCounter());
+    eval.begin(cgf);
+    branchInfo = branchGenFunc(cgf, expr);
+    mlir::Value branch = branchInfo->getPointer();
+    eval.end(cgf);
+
+    if (branch) {
+      yieldTy = branch.getType();
+      b.create<cir::YieldOp>(loc, branch);
+    } else {
+      // If LHS or RHS is a throw or void expression we need to patch
+      // arms as to properly match yield types.
+      insertPoints.push_back(b.saveInsertionPoint());
+    }
+  };
+
+  info.result = builder
+                    .create<cir::TernaryOp>(
+                        loc, condV, /*trueBuilder=*/
+                        [&](mlir::OpBuilder &b, mlir::Location loc) {
+                          emitBranch(b, loc, trueExpr, info.lhs);
+                        },
+                        /*falseBuilder=*/
+                        [&](mlir::OpBuilder &b, mlir::Location loc) {
+                          emitBranch(b, loc, falseExpr, info.rhs);
+                        })
+                    .getResult();
+
+  if (!insertPoints.empty()) {
+    // If both arms are void, so be it.
+    if (!yieldTy)
+      yieldTy = cgf.VoidTy;
+
+    // Insert required yields.
+    for (mlir::OpBuilder::InsertPoint &toInsert : insertPoints) {
+      mlir::OpBuilder::InsertionGuard guard(builder);
+      builder.restoreInsertionPoint(toInsert);
+
+      // Block does not return: build empty yield.
+      if (mlir::isa<cir::VoidType>(yieldTy)) {
+        builder.create<cir::YieldOp>(loc);
+      } else { // Block returns: set null yield value.
+        mlir::Value op0 = builder.getNullValue(yieldTy, loc);
+        builder.create<cir::YieldOp>(loc, op0);
+      }
+    }
+  }
+  return info;
+}
+
+LValue CIRGenFunction::emitConditionalOperatorLValue(
+    const AbstractConditionalOperator *expr) {
+  if (!expr->isGLValue()) {
+    // ?: here should be an aggregate.
+    assert(hasAggregateEvaluationKind(expr->getType()) &&
+           "Unexpected conditional operator!");
+    return emitAggExprToLValue(expr);
+  }
+
+  OpaqueValueMapping binding(*this, expr);
+  if (std::optional<LValue> res =
+          handleConditionalOperatorLValueSimpleCase(*this, expr))
+    return *res;
+
+  ConditionalInfo info =
+      emitConditionalBlocks(expr, [](CIRGenFunction &cgf, const Expr *e) {
+        return emitLValueOrThrowExpression(cgf, e);
+      });
+
+  if ((info.lhs && !info.lhs->isSimple()) ||
+      (info.rhs && !info.rhs->isSimple())) {
+    cgm.errorNYI(expr->getSourceRange(), "unsupported conditional operator");
+    return {};
+  }
+
+  if (info.lhs && info.rhs) {
+    Address lhsAddr = info.lhs->getAddress();
+    Address rhsAddr = info.rhs->getAddress();
+    Address result(info.result, lhsAddr.getElementType(),
+                   std::min(lhsAddr.getAlignment(), rhsAddr.getAlignment()));
+    AlignmentSource alignSource =
+        std::max(info.lhs->getBaseInfo().getAlignmentSource(),
+                 info.rhs->getBaseInfo().getAlignmentSource());
+    assert(!cir::MissingFeatures::opTBAA());
+    return makeAddrLValue(result, expr->getType(), LValueBaseInfo(alignSource));
+  }
+  assert((info.lhs || info.rhs) &&
+         "both operands of glvalue conditional are throw-expressions?");
+  return info.lhs ? *info.lhs : *info.rhs;
+}
+
 /// Emit an `if` on a boolean condition, filling `then` and `else` into
 /// appropriated regions.
 mlir::LogicalResult CIRGenFunction::emitIfOnBoolExpr(const Expr *cond,
@@ -1012,10 +1173,28 @@ mlir::Value CIRGenFunction::emitOpOnBoolExpr(mlir::Location loc,
   //  cir.ternary(!x, t, f) -> cir.ternary(x, f, t)
   assert(!cir::MissingFeatures::shouldReverseUnaryCondOnBoolExpr());
 
-  if (isa<ConditionalOperator>(cond)) {
-    cgm.errorNYI(cond->getExprLoc(), "Ternary NYI");
-    assert(!cir::MissingFeatures::ternaryOp());
-    return createDummyValue(loc, cond->getType());
+  if (const ConditionalOperator *condOp = dyn_cast<ConditionalOperator>(cond)) {
+    Expr *trueExpr = condOp->getTrueExpr();
+    Expr *falseExpr = condOp->getFalseExpr();
+    mlir::Value condV = emitOpOnBoolExpr(loc, condOp->getCond());
+
+    mlir::Value ternaryOpRes =
+        builder
+            .create<cir::TernaryOp>(
+                loc, condV, /*thenBuilder=*/
+                [this, trueExpr](mlir::OpBuilder &b, mlir::Location loc) {
+                  mlir::Value lhs = emitScalarExpr(trueExpr);
+                  b.create<cir::YieldOp>(loc, lhs);
+                },
+                /*elseBuilder=*/
+                [this, falseExpr](mlir::OpBuilder &b, mlir::Location loc) {
+                  mlir::Value rhs = emitScalarExpr(falseExpr);
+                  b.create<cir::YieldOp>(loc, rhs);
+                })
+            .getResult();
+
+    return emitScalarConversion(ternaryOpRes, condOp->getType(),
+                                getContext().BoolTy, condOp->getExprLoc());
   }
 
   if (isa<CXXThrowExpr>(cond)) {
@@ -1118,13 +1297,84 @@ mlir::Value CIRGenFunction::createDummyValue(mlir::Location loc,
   return builder.createDummyValue(loc, t, alignment);
 }
 
-/// This creates an alloca and inserts it into the entry block if
-/// \p insertIntoFnEntryBlock is true, otherwise it inserts it at the current
-/// insertion point of the builder.
+//===----------------------------------------------------------------------===//
+// CIR builder helpers
+//===----------------------------------------------------------------------===//
+
+Address CIRGenFunction::createMemTemp(QualType ty, mlir::Location loc,
+                                      const Twine &name, Address *alloca,
+                                      mlir::OpBuilder::InsertPoint ip) {
+  // FIXME: Should we prefer the preferred type alignment here?
+  return createMemTemp(ty, getContext().getTypeAlignInChars(ty), loc, name,
+                       alloca, ip);
+}
+
+Address CIRGenFunction::createMemTemp(QualType ty, CharUnits align,
+                                      mlir::Location loc, const Twine &name,
+                                      Address *alloca,
+                                      mlir::OpBuilder::InsertPoint ip) {
+  Address result = createTempAlloca(convertTypeForMem(ty), align, loc, name,
+                                    /*ArraySize=*/nullptr, alloca, ip);
+  if (ty->isConstantMatrixType()) {
+    assert(!cir::MissingFeatures::matrixType());
+    cgm.errorNYI(loc, "temporary matrix value");
+  }
+  return result;
+}
+
+/// This creates a alloca and inserts it into the entry block of the
+/// current region.
+Address CIRGenFunction::createTempAllocaWithoutCast(
+    mlir::Type ty, CharUnits align, mlir::Location loc, const Twine &name,
+    mlir::Value arraySize, mlir::OpBuilder::InsertPoint ip) {
+  cir::AllocaOp alloca = ip.isSet()
+                             ? createTempAlloca(ty, loc, name, ip, arraySize)
+                             : createTempAlloca(ty, loc, name, arraySize);
+  alloca.setAlignmentAttr(cgm.getSize(align));
+  return Address(alloca, ty, align);
+}
+
+/// This creates a alloca and inserts it into the entry block. The alloca is
+/// casted to default address space if necessary.
 Address CIRGenFunction::createTempAlloca(mlir::Type ty, CharUnits align,
                                          mlir::Location loc, const Twine &name,
-                                         bool insertIntoFnEntryBlock) {
-  mlir::Value alloca =
-      emitAlloca(name.str(), ty, loc, align, insertIntoFnEntryBlock);
-  return Address(alloca, ty, align);
+                                         mlir::Value arraySize,
+                                         Address *allocaAddr,
+                                         mlir::OpBuilder::InsertPoint ip) {
+  Address alloca =
+      createTempAllocaWithoutCast(ty, align, loc, name, arraySize, ip);
+  if (allocaAddr)
+    *allocaAddr = alloca;
+  mlir::Value v = alloca.getPointer();
+  // Alloca always returns a pointer in alloca address space, which may
+  // be different from the type defined by the language. For example,
+  // in C++ the auto variables are in the default address space. Therefore
+  // cast alloca to the default address space when necessary.
+  assert(!cir::MissingFeatures::addressSpace());
+  return Address(v, ty, align);
+}
+
+/// This creates an alloca and inserts it into the entry block if \p ArraySize
+/// is nullptr, otherwise inserts it at the current insertion point of the
+/// builder.
+cir::AllocaOp CIRGenFunction::createTempAlloca(mlir::Type ty,
+                                               mlir::Location loc,
+                                               const Twine &name,
+                                               mlir::Value arraySize,
+                                               bool insertIntoFnEntryBlock) {
+  return cast<cir::AllocaOp>(emitAlloca(name.str(), ty, loc, CharUnits(),
+                                        insertIntoFnEntryBlock, arraySize)
+                                 .getDefiningOp());
+}
+
+/// This creates an alloca and inserts it into the provided insertion point
+cir::AllocaOp CIRGenFunction::createTempAlloca(mlir::Type ty,
+                                               mlir::Location loc,
+                                               const Twine &name,
+                                               mlir::OpBuilder::InsertPoint ip,
+                                               mlir::Value arraySize) {
+  assert(ip.isSet() && "Insertion point is not set");
+  return cast<cir::AllocaOp>(
+      emitAlloca(name.str(), ty, loc, CharUnits(), ip, arraySize)
+          .getDefiningOp());
 }
diff --git a/clang/lib/CIR/CodeGen/CIRGenExprAggregate.cpp b/clang/lib/CIR/CodeGen/CIRGenExprAggregate.cpp
index e006a77c6e7d6..928a5eb8fe3be 100644
--- a/clang/lib/CIR/CodeGen/CIRGenExprAggregate.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenExprAggregate.cpp
@@ -155,8 +155,7 @@ void AggExprEmitter::emitArrayInit(Address destPtr, cir::ArrayType arrayTy,
     // Allocate the temporary variable
     // to store the pointer to first unitialized element
     const Address tmpAddr = cgf.createTempAlloca(
-        cirElementPtrType, cgf.getPointerAlign(), loc, "arrayinit.temp",
-        /*insertIntoFnEntryBlock=*/false);
+        cirElementPtrType, cgf.getPointerAlign(), loc, "arrayinit.temp");
     LValue tmpLV = cgf.makeAddrLValue(tmpAddr, elementPtrType);
     cgf.emitStoreThroughLValue(RValue::get(element), tmpLV);
 
@@ -275,3 +274,11 @@ void AggExprEmitter::visitCXXParenListOrInitListExpr(
 void CIRGenFunction::emitAggExpr(const Expr *e, AggValueSlot slot) {
   AggExprEmitter(*this, slot).Visit(const_cast<Expr *>(e));
 }
+
+LValue CIRGenFunction::emitAggExprToLValue(const Expr *e) {
+  assert(hasAggregateEvaluationKind(e->getType()) && "Invalid argument!");
+  Address temp = createMemTemp(e->getType(), getLoc(e->getSourceRange()));
+  LValue lv = makeAddrLValue(temp, e->getType());
+  emitAggExpr(e, AggValueSlot::forLValue(lv));
+  return lv;
+}
diff --git a/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp b/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
index aef5b125a2877..cd07a6cc59fc5 100644
--- a/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
@@ -302,6 +302,8 @@ class ScalarExprEmitter : public StmtVisitor<ScalarExprEmitter, mlir::Value> {
   }
 
   mlir::Value VisitUnaryExprOrTypeTraitExpr(const UnaryExprOrTypeTraitExpr *e);
+  mlir::Value
+  VisitAbstractConditionalOperator(const AbstractConditionalOperator *e);
 
   // Unary Operators.
   mlir::Value VisitUnaryPostDec(const UnaryOperator *e) {
@@ -875,6 +877,174 @@ class ScalarExprEmitter : public StmtVisitor<ScalarExprEmitter, mlir::Value> {
     // NOTE: We don't need to EnsureInsertPoint() like LLVM codegen.
     return Visit(e->getRHS());
   }
+
+  mlir::Value VisitBinLAnd(const clang::BinaryOperator *e) {
+    if (e->getType()->isVectorType()) {
+      assert(!cir::MissingFeatures::vectorType());
+      return {};
+    }
+
+    bool instrumentRegions = cgf.cgm.getCodeGenOpts().hasProfileClangInstr();
+    mlir::Type resTy = cgf.convertType(e->getType());
+    mlir::Location loc = cgf.getLoc(e->getExprLoc());
+
+    // If we have 0 && RHS, see if we can elide RHS, if so, just return 0.
+    // If we have 1 && X, just emit X without inserting the control flow.
+    bool lhsCondVal;
+    if (cgf.constantFoldsToSimpleInteger(e->getLHS(), lhsCondVal)) {
+      if (lhsCondVal) { // If we have 1 && X, just emit X.
+
+        mlir::Value rhsCond = cgf.evaluateExprAsBool(e->getRHS());
+
+        if (instrumentRegions) {
+          assert(!cir::MissingFeatures::instrumenation());
+          cgf.cgm.errorNYI(e->getExprLoc(), "instrumenation");
+        }
+        // ZExt result to int or bool.
+        return builder.createZExtOrBitCast(rhsCond.getLoc(), rhsCond, resTy);
+      }
+      // 0 && RHS: If it is safe, just elide the RHS, and return 0/false.
+ ...
[truncated]

SmallVector<mlir::OpBuilder::InsertPoint, 2> insertPoints{};
mlir::Type yieldTy{};

auto emitBranch = [&](mlir::OpBuilder &b, mlir::Location loc, Expr *expr,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the control flow here wrt. to the incubator. Note that I patch up the regions after the builder.create instead of doing this within the false expression builder. I think that's okay but I rather highlight this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the incubator code patches up the region as part of the ternary operation creation. Is it possible that doing it after could cause a verification failure?

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whew... this ends up being a sizable patch. I know you tried to break it up, but it is still pretty significant.

I have a couple of comments as drive-bys, else I'll let others do the approvals.

if (mlir::isa<cir::BoolType>(srcTy) && mlir::isa<cir::IntType>(newTy))
return createBoolToInt(src, newTy);

llvm_unreachable("unhandled extension cast");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem like it should be an unreachable, right? Perhaps an assert that the types are equal or its int-to-bool, but this unreachable seems odd? also, what function in classic-codegen is this intended to mirror? The only one I found was the CastInst one by the same name, and it doesn't seem to mirror its implementation much at all.

if (!condExprBool)
std::swap(live, dead);

if (!cgf.containsLabel(dead)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a comment that explains why 'label in dead' means we can't do this opt?


if (!cgf.containsLabel(dead)) {
// If the true case is live, we need to track its region.
if (condExprBool) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is suspicious. We've already swapped based on this, right? So the association of dead/live/true/false is broken.

Copy link
Contributor Author

@mmha mmha May 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know much about PGO, but from what I gather this code is correct. Clang/LLVM optimize branch coverage by only emitting counters for true branches. The false branch count is then calculated with parent_count - true_count See slide 12 here. If the condition is always false there's no counter to increment.

}
// If a throw expression we emit it and return an undefined lvalue
// because it can't be used.
if (isa<CXXThrowExpr>(live->IgnoreParens())) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this just be return emitLValueOrThrowExprssion?

Copy link
Contributor

@andykaylor andykaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you break this up a bit more? Maybe cut it back to just the code needed for the logical operators?

@@ -272,6 +272,22 @@ class CIRBaseBuilderTy : public mlir::OpBuilder {
return createCast(loc, cir::CastKind::bitcast, src, newTy);
}

// TODO(cir): the following function was introduced to keep in sync with LLVM
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably made sense when the incubator was in the early stages of development, but I don't think there's much value in staying structurally in sync with classic codegen when the CIR implementation is doing something entirely different. The "eventually" in the comment should be now. However, we should probably leave a bread crumb so anyone upstreaming code that calls this function can easily figure out what we did with it.

@@ -948,6 +950,165 @@ void CIRGenFunction::emitIgnoredExpr(const Expr *e) {
emitLValue(e);
}

// Handle the case where the condition is a constant evaluatable simple integer,
// which means we don't have to separately handle the true/false blocks.
static std::optional<LValue> handleConditionalOperatorLValueSimpleCase(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another of those cases where we're doing optimization in the front end. This has come up several times. I personally don't like the idea, but I guess the classic codegen does things like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to move this to a CIR pass eventually? The downside of that would be a bunch of extra dead code being generated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to see it happen in CIRSimplify. Especially in the case of a ternary operator, I wouldn't expect it to be common for there to be a lot of dead code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to see it happen in CIRSimplify

+1

CIRGenFunction &cgf = *this;
ConditionalEvaluation eval(cgf);
mlir::Location loc = cgf.getLoc(e->getSourceRange());
CIRGenBuilderTy &builder = cgf.getBuilder();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of cgf throughout this function is odd. In the classic codegen it was a standalone function that took CGF as a parameter, but since it's a member function here, there's no reason to do that. Also, builder here is shadowing the builder member variable in CIRGenFunction.

SmallVector<mlir::OpBuilder::InsertPoint, 2> insertPoints{};
mlir::Type yieldTy{};

auto emitBranch = [&](mlir::OpBuilder &b, mlir::Location loc, Expr *expr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the incubator code patches up the region as part of the ternary operation creation. Is it possible that doing it after could cause a verification failure?

static bool isCheapEnoughToEvaluateUnconditionally(const Expr *e,
CIRGenFunction &cgf) {
// Anything that is an integer or floating point constant is fine.
return e->IgnoreParens()->isEvaluatable(cgf.getContext());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We wouldn't want to speculatively evaluate constant foldable floating point expressions if they might raise an exception. I don't think this handles that correctly. We are a long way from having strict floating-point semantics in CIR, but the classic codegen gets this wrong: https://godbolt.org/z/nvfT4WxK4

Copy link
Contributor Author

@mmha mmha May 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I modified your example a bit because of a comment in Expr.h on Expr::EvalStatus::HasUndefinedBehavior: https://godbolt.org/z/Kvn3M9hr7

For example, 1.0 / 0.0 can be folded to Inf, but has undefined behavior.

Looks like Info.noteUndefinedBehavior() doesn't get called for FP division by zero in handleFloatFloatBinOp(). Easy enough fix, I'll create a separate PR for this :)

If we want to be strict about this wouldn't this exclude almost any speculative FP expression evaluation as the FP environment can be set at runtime to make this observable? E. g. trapping on inexact results happens pretty much all the time. If so, Expr::SideEffectsKind probably needs a new kind for FP evaluation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW even if isCheapEnoughToEvaluateUnconditionally returns true we still emit a TernaryOp instead of a SelectOp. This needs to be fixed here and in the incubator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is really a matter of isEvaluatable() giving us a bad answer. It should be taking FP exceptions into account when FP exceptions are enabled.

// If we have 0 && RHS, see if we can elide RHS, if so, just return 0.
// If we have 1 && X, just emit X without inserting the control flow.
bool lhsCondVal;
if (cgf.constantFoldsToSimpleInteger(e->getLHS(), lhsCondVal)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like we have code the code to check for this folding and then create a ternary op in a lot of places. Is there no way to reduce the duplication?

return info;
}

LValue CIRGenFunction::emitConditionalOperatorLValue(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like this is called from anywhere in this patch. In the incubator, it is called from emitLValue(), and the fact that you didn't need that here makes me suspect this is entirely duplicated somewhere else (maybe in the VisitAbstractConditionalOperator implementation?).

b.getInsertionBlock()};
cgf.curLexScope->setAsTernary();
mlir::Value rhsCondV = cgf.evaluateExprAsBool(e->getRHS());
auto res = b.create<cir::TernaryOp>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this second ternary is entirely unnecessary. This code is effectively doing this for a && b:

t1 = evaluateExprAsBool(a);
if (t1) {
  t2 = evaluateExprAsBool(b);
  if (t2) {
    result = true;
  } else {
    result = false;
  }
} else {
  result = false;
}

Couldn't the same result be accomplished like this?

t1 = evaluateExprAsBool(a);
if (t1) {
  result = evaluateExprAsBool(b);
} else {
  result = t1;
}

Similar reasoning applies to the LOr handling below.

auto res = b.create<cir::ConstantOp>(loc, builder.getFalseAttr());
b.create<cir::YieldOp>(loc, res.getRes());
});
return builder.createZExtOrBitCast(resOp.getLoc(), resOp.getResult(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VisitUnaryLNot does this:

  // ZExt result to the expr type.
  mlir::Type dstTy = cgf.convertType(e->getType());
  if (mlir::isa<cir::IntType>(dstTy))
    return builder.createBoolToInt(boolVal, dstTy);
  if (mlir::isa<cir::BoolType>(dstTy))
    return boolVal;

  cgf.cgm.errorNYI("destination type for logical-not unary operator is NYI");

A helper function seems reasonable, but I'd prefer to see it be named something like maybePromoteBoolResult.

mlir::Location loc = cgf.getLoc(e->getExprLoc());

// If we have 0 && RHS, see if we can elide RHS, if so, just return 0.
// If we have 1 && X, just emit X without inserting the control flow.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This results in inconsistent behavior. We generate a pair of cir.ternary ops for x && true but we simplify true && x.

https://godbolt.org/z/1rfYT8E46

It seems to me that it would be better to leave both alone and let the optimizer do the optimization.

: insertPt(cgf.builder.saveInsertionPoint()) {}
ConditionalEvaluation(mlir::OpBuilder::InsertPoint ip) : insertPt(ip) {}

void begin(CIRGenFunction &cgf) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we saved a reference to cgf in the constructor, we wouldn't need to pass it to begin() and end()

if (mlir::isa<cir::VoidType>(yieldTy)) {
builder.create<cir::YieldOp>(loc);
} else { // Block returns: set null yield value.
mlir::Value op0 = builder.getNullValue(yieldTy, loc);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is right. If a block contains a throw, the incubator generates a cir.throw followed by a cir.unreachable and then inserts a null constant and a cir.yield. This causes verification to fail because cir.unreachable isn't the last operation in the block.

Copy link
Contributor Author

@mmha mmha May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, you're basically asking about a case like this:

int val = 42;

int test4() {
  return 1 ? (throw val, 0) : val;
}

This does fail verification with the incubator, but I don't think this is an issue with the ternary op codegen but an incorrect assumption in the verification or CIRGenItaniumCXXABI::emitThrow depending on what exact semantics we would like cir.unreachable to have since this fails, too:

int f() {
    return (throw 0, 456);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bcardosolopes pushed a commit to llvm/clangir that referenced this pull request May 23, 2025
…valuate unconditionally (#1642)

This came up during the review of
llvm/llvm-project#138156

During codegen we check whether the LHS and RHS of the conditional
operator are cheap enough to evaluate uncondionally. Unlike classic
codegen we still emit `TernaryOp` instead of `SelectOp` and defer that
optimization to cir-simplify.

This patch changes codegen to directly emit `SelectOp` for `cond ?
constant : constant` expressions.
@mmha mmha force-pushed the cir-ternary-codegen branch from be8886f to 8d992ef Compare May 27, 2025 16:59
Copy link

github-actions bot commented May 27, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@andykaylor andykaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. My remaining concerns are mostly about dead code.

falseValue);
}

mlir::Value createLogicalAnd(mlir::Location loc, mlir::Value lhs,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears these will only work if lhs and rhs are bool values. I guess the assert above will force that, but it initially surprised me.

/// Emit the operand of a glvalue conditional operator. This is either a glvalue
/// or a (possibly-parenthesized) throw-expression. If this is a throw, no
/// LValue is returned and the current block has been terminated.
static std::optional<LValue> emitLValueOrThrowExpression(CIRGenFunction &cgf,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this being used in this PR?

const Twine &name,
mlir::Value arraySize,
bool insertIntoFnEntryBlock) {
return cast<cir::AllocaOp>(emitAlloca(name.str(), ty, loc, CharUnits(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this should be doing something to get the alignment based on the type. Looking at the equivalent classic codegen, if CreateTempAlloca is called without explicit alignment, it calls IRBuilder::CreateAlloca, which uses DataLayout::getPrefTypeAlign() to set the alignment.

// Leaves the 'current block' in the continuation basic block.
template <typename FuncTy>
CIRGenFunction::ConditionalInfo
CIRGenFunction::emitConditionalBlocks(const AbstractConditionalOperator *e,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this called in this PR?

/// If the specified expression does not fold
/// to a constant, or if it does but contains a label, return false. If it
/// constant folds return true and set the boolean result in `resultBool`.
bool CIRGenFunction::constantFoldsToSimpleInteger(const Expr *cond,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we already have this function, but it was renamed to constantFoldsToBool

// multiple VLA types can share the same size expression.
// FIXME: Maybe this could be a stack of maps that is pushed/popped as we
// enter/leave scopes.
llvm::DenseMap<const Expr *, mlir::Value> vlaSizeMap;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used in this PR?

ignoreResultAssign = false;

// Bind the common expression if necessary.
CIRGenFunction::OpaqueValueMapping binding(cgf, e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to be necessary yet. As far as I can tell, we'll need this when we add handling for OpaqueValueExprClass but not until then.

// LLVM: [[OR_CONT]]:
// LLVM: %[[ZEXT_OR:.*]] = zext i1 %[[OR_PHI]] to i8
// LLVM: store i8 %[[ZEXT_OR]], ptr %[[X]]
// LLVM: ret void
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add OGCG checks for comparison?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to write that I didn't do so because CIR and OGCG codegen differ too much for this to be worthwhile but with cir-simplify and llvm/clangir#1651 we should now generate exactly the same IR. I'll add the checks now.

@mmha mmha force-pushed the cir-ternary-codegen branch from b9566b7 to 6fa1d09 Compare June 2, 2025 16:58
Copy link
Contributor

@andykaylor andykaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

mmha and others added 6 commits June 3, 2025 11:42
This patch adds visitors for BinLAnd, BinLOr and AbstractConditionalOperator. Note that this patch still lacks visitation of OpaqueValueExpr which are needed for the GNU ?: operator.
Co-authored-by: Erich Keane <ekeane@nvidia.com>
@mmha mmha force-pushed the cir-ternary-codegen branch from 6fa1d09 to 69d3087 Compare June 3, 2025 09:57
@mmha mmha merged commit 4b2cb11 into llvm:main Jun 3, 2025
11 checks passed
return builder.createBoolToInt(value, dstTy);
if (mlir::isa<cir::BoolType>(dstTy))
return value;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the 'else' case here is problematic. We need a return here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants