Skip to content

[clang][CodeGen] Extends lifetime of coroutine promise(CWG2563) #151067

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

NewSigma
Copy link
Contributor

@NewSigma NewSigma commented Jul 29, 2025

This patch attempts to implement piece of the proposed solution to CWG2563:

[9.6.4 dcl.fct.def.coroutine.p8] This return exits the scope of gro. It exits the scope of promise only if the coroutine completed without suspending.

The patch contains two parts:

  1. Middle end part: Do not lower coro.end to ret void if its return value will be used. This allows front end separate coro start section and resume section without early return from resume function.
  2. Front end part: Clone and defer the destruction of coro promise if the coroutine completed without suspending. This is achieved by having the start section and the resume section take different branches.

I'm not quite sure I understood the wording correctly. Feel free to correct me.

Close #120200

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. coroutines C++20 coroutines llvm:transforms labels Jul 29, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 29, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-coroutines
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-llvm-transforms

Author: Weibo He (NewSigma)

Changes

This patch attempts to implement piece of the proposed solution to CWG2563:

> [9.6.4 dcl.fct.def.coroutine.p8] This return exits the scope of gro. It exits the scope of promise only if the coroutine completed without suspending.

The patch contains two parts:

  1. Middle end part: Do not lower coro.end to ret void if its return value will be used. This allows front end separate coro start section and resume section without early return from resume function.
  2. Front end part: Clone and defer the destruction of coro promise if the coroutine completed without suspending. This is achieved by branching just after destruction of the function parameter copies.

I'm not quite sure I understood the wording correctly. Feel free to correct me.

Close #120200


Patch is 20.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/151067.diff

11 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+3)
  • (modified) clang/lib/CodeGen/CGCoroutine.cpp (+111-16)
  • (modified) clang/test/CodeGenCXX/type-aware-coroutines.cpp (+5-5)
  • (modified) clang/test/CodeGenCoroutines/coro-dest-slot.cpp (+1-1)
  • (modified) clang/test/CodeGenCoroutines/coro-params.cpp (+11-4)
  • (added) clang/test/CodeGenCoroutines/coro-promise-dtor2.cpp (+53)
  • (modified) clang/test/CodeGenCoroutines/coro-symmetric-transfer-02.cpp (+2-2)
  • (modified) llvm/docs/Coroutines.rst (+6-7)
  • (modified) llvm/include/llvm/Transforms/Coroutines/CoroInstr.h (+1-1)
  • (modified) llvm/lib/Transforms/Coroutines/CoroSplit.cpp (+1-1)
  • (added) llvm/test/Transforms/Coroutines/coro-split-coroend.ll (+42)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index dd6b44dd0c657..e1ea3ec53244a 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -151,6 +151,9 @@ C++20 Feature Support
 - Implement constant evaluation of lambdas that capture structured bindings.
   (#GH145956)
 
+- Fixed premature coroutine promise destruction when the coroutine completes
+  without suspending, as part of CWG2563. (#GH120200)
+
 C++17 Feature Support
 ^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/clang/lib/CodeGen/CGCoroutine.cpp b/clang/lib/CodeGen/CGCoroutine.cpp
index 5ee908922b5a3..805670249c315 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -73,6 +73,15 @@ struct clang::CodeGen::CGCoroData {
   // the address of the coroutine frame of the current coroutine.
   llvm::CallInst *CoroBegin = nullptr;
 
+  // Stores call to promise destruction and the llvm.lifetime.end of promise alloca
+  // We will clone them into deferred promise free block.
+  llvm::CallInst *PromiseDtor = nullptr;
+  llvm::CallInst *PromiseEnd = nullptr;
+
+  // Stores the llvm.coro.end that identifies if the coroutine exit without
+  // suspend.
+  llvm::CallInst *InResume = nullptr;
+
   // Stores the last emitted coro.free for the deallocate expressions, we use it
   // to wrap dealloc code with if(auto mem = coro.free) dealloc(mem).
   llvm::CallInst *LastCoroFree = nullptr;
@@ -595,7 +604,7 @@ struct CallCoroEnd final : public EHScopeStack::Cleanup {
 namespace {
 // Make sure to call coro.delete on scope exit.
 struct CallCoroDelete final : public EHScopeStack::Cleanup {
-  Stmt *Deallocate;
+  const CoroutineBodyStmt *S;
 
   // Emit "if (coro.free(CoroId, CoroBegin)) Deallocate;"
 
@@ -605,16 +614,73 @@ struct CallCoroDelete final : public EHScopeStack::Cleanup {
   // builds a single call to a deallocation function which is safe to emit
   // multiple times.
   void Emit(CodeGenFunction &CGF, Flags) override {
+    bool Sinked = CGF.CurCoro.Data->InResume != nullptr;
+    if (!Sinked)
+      sinkPromiseBB(CGF);
+    EmitCoroFree(CGF);
+  }
+
+  explicit CallCoroDelete(const CoroutineBodyStmt *S) : S(S) {}
+
+  // Sink promise destructor to separate block
+  void sinkPromiseBB(CodeGenFunction &CGF) {
+    auto &Builder = CGF.Builder;
+    auto &CoroData = *CGF.CurCoro.Data;
+    auto *CleanupBB = Builder.GetInsertBlock()->getSinglePredecessor();
+    auto *Promise = CGF.GetAddrOfLocalVar(S->getPromiseDecl()).getBasePointer();
+    llvm::CallInst *PromiseDtor = nullptr;
+    BasicBlock *PromiseBB = nullptr;
+    for (llvm::User *U : Promise->users()) {
+      auto *I = dyn_cast<llvm::CallInst>(U);
+      if (!I || I->getParent() != CleanupBB)
+        continue;
+
+      if (I->isLifetimeStartOrEnd()) {
+        assert(!CoroData.PromiseEnd && "Unexpected multiple lifetime.end");
+        I->moveBefore(CleanupBB->getTerminator()->getIterator());
+        PromiseBB = CleanupBB->splitBasicBlock(I, "promise.free");
+        CoroData.PromiseEnd = I;
+      }
+      else {
+        assert(!PromiseDtor && "Unexpected multiple destructor call");
+        PromiseDtor = I;
+      }
+    }
+
+    if (PromiseDtor) {
+      PromiseDtor->moveBefore(CoroData.PromiseEnd->getIterator());
+      CoroData.PromiseDtor = PromiseDtor;
+    }
+
+    BasicBlock *SaveInsertBlock = CGF.Builder.GetInsertBlock();
+    CleanupBB->getTerminator()->eraseFromParent();
+    Builder.SetInsertPoint(CleanupBB);
+
+    llvm::Function *CoroEnd = CGF.CGM.getIntrinsic(llvm::Intrinsic::coro_end);
+    auto *NullPtr = llvm::ConstantPointerNull::get(CGF.Int8PtrTy);
+    CoroData.InResume = Builder.CreateCall(
+        CoroEnd,
+        {NullPtr, Builder.getFalse(),
+         llvm::ConstantTokenNone::get(CoroEnd->getContext())},
+        "InResume");
+
+    BasicBlock *EndBB = CoroData.CleanupJD.getBlock();
+    Builder.CreateCondBr(CoroData.InResume, PromiseBB, EndBB);
+    Builder.SetInsertPoint(SaveInsertBlock);
+  }
+
+  void EmitCoroFree(CodeGenFunction &CGF, const Twine &NameSuffix = "") {
     // Remember the current point, as we are going to emit deallocation code
     // first to get to coro.free instruction that is an argument to a delete
     // call.
     BasicBlock *SaveInsertBlock = CGF.Builder.GetInsertBlock();
 
-    auto *FreeBB = CGF.createBasicBlock("coro.free");
+    Stmt *Deallocate = S->getDeallocate();
+    auto *FreeBB = CGF.createBasicBlock("coro.free" + NameSuffix);
     CGF.EmitBlock(FreeBB);
     CGF.EmitStmt(Deallocate);
 
-    auto *AfterFreeBB = CGF.createBasicBlock("after.coro.free");
+    auto *AfterFreeBB = CGF.createBasicBlock("after.coro.free" + NameSuffix);
     CGF.EmitBlock(AfterFreeBB);
 
     // We should have captured coro.free from the emission of deallocate.
@@ -639,7 +705,6 @@ struct CallCoroDelete final : public EHScopeStack::Cleanup {
     InsertPt->eraseFromParent();
     CGF.Builder.SetInsertPoint(AfterFreeBB);
   }
-  explicit CallCoroDelete(Stmt *DeallocStmt) : Deallocate(DeallocStmt) {}
 };
 }
 
@@ -787,13 +852,13 @@ void CodeGenFunction::EmitCoroutineBody(const CoroutineBodyStmt &S) {
   auto *AllocBB = createBasicBlock("coro.alloc");
   auto *InitBB = createBasicBlock("coro.init");
   auto *FinalBB = createBasicBlock("coro.final");
-  auto *RetBB = createBasicBlock("coro.ret");
+  auto *EndBB = createBasicBlock("coro.end");
 
   auto *CoroId = Builder.CreateCall(
       CGM.getIntrinsic(llvm::Intrinsic::coro_id),
       {Builder.getInt32(NewAlign), NullPtr, NullPtr, NullPtr});
   createCoroData(*this, CurCoro, CoroId);
-  CurCoro.Data->SuspendBB = RetBB;
+  CurCoro.Data->SuspendBB = EndBB;
   assert(ShouldEmitLifetimeMarkers &&
          "Must emit lifetime intrinsics for coroutines");
 
@@ -829,23 +894,25 @@ void CodeGenFunction::EmitCoroutineBody(const CoroutineBodyStmt &S) {
 
   EmitBlock(InitBB);
 
-  // Pass the result of the allocation to coro.begin.
-  auto *Phi = Builder.CreatePHI(VoidPtrTy, 2);
-  Phi->addIncoming(NullPtr, EntryBB);
-  Phi->addIncoming(AllocateCall, AllocOrInvokeContBB);
-  auto *CoroBegin = Builder.CreateCall(
-      CGM.getIntrinsic(llvm::Intrinsic::coro_begin), {CoroId, Phi});
-  CurCoro.Data->CoroBegin = CoroBegin;
+  {
+    // Pass the result of the allocation to coro.begin.
+    auto *Phi = Builder.CreatePHI(VoidPtrTy, 2);
+    Phi->addIncoming(NullPtr, EntryBB);
+    Phi->addIncoming(AllocateCall, AllocOrInvokeContBB);
+    auto *CoroBegin = Builder.CreateCall(
+        CGM.getIntrinsic(llvm::Intrinsic::coro_begin), {CoroId, Phi});
+    CurCoro.Data->CoroBegin = CoroBegin;
+  }
 
   GetReturnObjectManager GroManager(*this, S);
   GroManager.EmitGroAlloca();
 
-  CurCoro.Data->CleanupJD = getJumpDestInCurrentScope(RetBB);
+  CurCoro.Data->CleanupJD = getJumpDestInCurrentScope(EndBB);
   {
     CGDebugInfo *DI = getDebugInfo();
     ParamReferenceReplacerRAII ParamReplacer(LocalDeclMap);
     CodeGenFunction::RunCleanupsScope ResumeScope(*this);
-    EHStack.pushCleanup<CallCoroDelete>(NormalAndEHCleanup, S.getDeallocate());
+    EHStack.pushCleanup<CallCoroDelete>(NormalAndEHCleanup, &S);
 
     // Create mapping between parameters and copy-params for coroutine function.
     llvm::ArrayRef<const Stmt *> ParamMoves = S.getParamMoves();
@@ -950,7 +1017,14 @@ void CodeGenFunction::EmitCoroutineBody(const CoroutineBodyStmt &S) {
     }
   }
 
-  EmitBlock(RetBB);
+  EmitBlock(EndBB);
+  auto *Phi = Builder.CreatePHI(Builder.getInt1Ty(), llvm::pred_size(EndBB),
+                                "never.suspend");
+  BasicBlock *CleanupBB = CurCoro.Data->InResume->getParent();
+  for (auto *Pred : llvm::predecessors(EndBB)) {
+    auto *V = (Pred == CleanupBB) ? Builder.getTrue() : Builder.getFalse();
+    Phi->addIncoming(V, Pred);
+  }
   // Emit coro.end before getReturnStmt (and parameter destructors), since
   // resume and destroy parts of the coroutine should not include them.
   llvm::Function *CoroEnd = CGM.getIntrinsic(llvm::Intrinsic::coro_end);
@@ -971,7 +1045,28 @@ void CodeGenFunction::EmitCoroutineBody(const CoroutineBodyStmt &S) {
     // shouldn't change the AST.
     if (PreviousRetValue)
       cast<ReturnStmt>(Ret)->setRetValue(PreviousRetValue);
+
+    // Defer destruction of promise if coroutine completed without suspending
+    auto *DeferPromiseBB = createBasicBlock("promise.free.defer");
+    auto *RetBB = EndBB->splitBasicBlock(EndBB->getTerminator(), "coro.ret");
+    EndBB->getTerminator()->eraseFromParent();
+
+    Builder.SetInsertPoint(EndBB);
+    Builder.CreateCondBr(Phi, DeferPromiseBB, RetBB);
+
+    EmitBlock(DeferPromiseBB);
+    auto *InsertPt = Builder.CreateBr(RetBB);
+    if (auto *I = CurCoro.Data->PromiseDtor)
+      I->clone()->insertBefore(InsertPt->getIterator());
+    CurCoro.Data->PromiseEnd->clone()->insertBefore(InsertPt->getIterator());
+    InsertPt->eraseFromParent();
+
+    CallCoroDelete(&S).EmitCoroFree(*this, ".defer");
+    Builder.CreateBr(RetBB);
+    Builder.ClearInsertionPoint();
   }
+  else
+    EndBB->setName("coro.ret");
 
   // LLVM require the frontend to mark the coroutine.
   CurFn->setPresplitCoroutine();
diff --git a/clang/test/CodeGenCXX/type-aware-coroutines.cpp b/clang/test/CodeGenCXX/type-aware-coroutines.cpp
index 0a19079d987e9..84b054ff8a214 100644
--- a/clang/test/CodeGenCXX/type-aware-coroutines.cpp
+++ b/clang/test/CodeGenCXX/type-aware-coroutines.cpp
@@ -55,7 +55,7 @@ extern "C" resumable f1(int) {
   co_return;
 // CHECK: coro.alloc:
 // CHECK: _ZN9resumable12promise_typenwEmi
-// CHECK: coro.free:
+// CHECK: coro.free.defer:
 // CHECK: _ZN9resumable12promise_typedlEPv
 }
 
@@ -64,7 +64,7 @@ extern "C" resumable f2(float) {
   co_return;
 // CHECK: coro.alloc:
 // CHECK: _ZN9resumable12promise_typenwEmf
-// CHECK: coro.free:
+// CHECK: coro.free.defer:
 // CHECK: _ZN9resumable12promise_typedlEPv
 }
 
@@ -74,7 +74,7 @@ extern "C" resumable2 f3(int, float, const char*, Allocator) {
    co_return;
 // CHECK: coro.alloc:
 // CHECK: _ZN10resumable212promise_typenwIJifPKc9AllocatorEEEPvmDpT_
-// CHECK: coro.free:
+// CHECK: coro.free.defer:
 // CHECK: _ZN10resumable212promise_typedlEPv
 // CHECK: _ZN10resumable212promise_typedlEPv
 }
@@ -84,7 +84,7 @@ extern "C" resumable f4(int n = 10) {
    for (int i = 0; i < n; i++) co_yield i;
 // CHECK: coro.alloc:
 // CHECK: call {{.*}}@_ZN9resumable12promise_typenwEmi(
-// CHECK: coro.free:
+// CHECK: coro.free.defer:
 // CHECK: call void @_ZN9resumable12promise_typedlEPv(
 // CHECK: call void @_ZN9resumable12promise_typedlEPv(
 }
@@ -110,7 +110,7 @@ extern "C" resumable3 f5(float) {
   co_return;
 // CHECK: coro.alloc:
 // CHECK: call {{.*}}@_Znwm(
-// CHECK: coro.free:
+// CHECK: coro.free.defer:
 // CHECK: call void @_ZdlPvm(
 // CHECK: call void @_ZdlPvm(
 }
diff --git a/clang/test/CodeGenCoroutines/coro-dest-slot.cpp b/clang/test/CodeGenCoroutines/coro-dest-slot.cpp
index d794c74cd52d6..b8bc2434f6fc1 100644
--- a/clang/test/CodeGenCoroutines/coro-dest-slot.cpp
+++ b/clang/test/CodeGenCoroutines/coro-dest-slot.cpp
@@ -25,7 +25,7 @@ extern "C" coro f(int) { co_return; }
 // CHECK: %[[CLEANUP_DEST0:.+]] = phi i32 [ 0, %[[INIT_READY]] ], [ 2, %[[INIT_CLEANUP]] ]
 
 // CHECK: %[[FINAL_SUSPEND:.+]] = call i8 @llvm.coro.suspend(
-// CHECK-NEXT: switch i8 %{{.*}}, label %coro.ret [
+// CHECK-NEXT: switch i8 %{{.*}}, label %coro.end [
 // CHECK-NEXT:   i8 0, label %[[FINAL_READY:.+]]
 // CHECK-NEXT:   i8 1, label %[[FINAL_CLEANUP:.+]]
 // CHECK-NEXT: ]
diff --git a/clang/test/CodeGenCoroutines/coro-params.cpp b/clang/test/CodeGenCoroutines/coro-params.cpp
index 719726cca29c5..1f41ddcf34ac0 100644
--- a/clang/test/CodeGenCoroutines/coro-params.cpp
+++ b/clang/test/CodeGenCoroutines/coro-params.cpp
@@ -104,17 +104,24 @@ void f(int val, MoveOnly moParam, MoveAndCopy mcParam, TrivialABI trivialParam)
   // CHECK: call void @_ZNSt16coroutine_traitsIJvi8MoveOnly11MoveAndCopy10TrivialABIEE12promise_type13final_suspendEv(
   // CHECK: call void @_ZN14suspend_always12await_resumeEv(
 
-  // Destroy promise, then parameter copies:
-  // CHECK: call void @_ZNSt16coroutine_traitsIJvi8MoveOnly11MoveAndCopy10TrivialABIEE12promise_typeD1Ev(ptr {{[^,]*}} %__promise)
-  // CHECK-NEXT: call void @llvm.lifetime.end.p0(
-  // CHECK-NEXT: call void @_ZN10TrivialABID1Ev(ptr {{[^,]*}} %[[TrivialCopy]])
+  // Parameter copies:
+  // CHECK: call void @_ZN10TrivialABID1Ev(ptr {{[^,]*}} %[[TrivialCopy]])
   // CHECK-NEXT: call void @llvm.lifetime.end.p0(
   // CHECK-NEXT: call void @_ZN11MoveAndCopyD1Ev(ptr {{[^,]*}} %[[McCopy]])
   // CHECK-NEXT: call void @llvm.lifetime.end.p0(
   // CHECK-NEXT: call void @_ZN8MoveOnlyD1Ev(ptr {{[^,]*}} %[[MoCopy]]
   // CHECK-NEXT: call void @llvm.lifetime.end.p0(
   // CHECK-NEXT: call void @llvm.lifetime.end.p0(
+  // CHECK-NEXT: %InResume = call i1 @llvm.coro.end(ptr null, i1 false, token none)
+  // CHECK-NEXT: br i1 %InResume, label %promise.free, label %coro.ret
+
+  // Destroy promise:
+  // CHECK: promise.free
+  // CHECK-NEXT: call void @_ZNSt16coroutine_traitsIJvi8MoveOnly11MoveAndCopy10TrivialABIEE12promise_typeD1Ev(ptr {{[^,]*}} %__promise)
+  // CHECK-NEXT: call void @llvm.lifetime.end.p0(
   // CHECK-NEXT: call ptr @llvm.coro.free(
+  // CHECK-NEXT: {{[^,]*}} = icmp ne ptr {{[^,]*}}, null
+  // CHECK-NEXT: br i1 {{[^,]*}}, label %coro.free, label %after.coro.free
 
   // The original trivial_abi parameter is destroyed when returning from the ramp.
   // CHECK: call i1 @llvm.coro.end
diff --git a/clang/test/CodeGenCoroutines/coro-promise-dtor2.cpp b/clang/test/CodeGenCoroutines/coro-promise-dtor2.cpp
new file mode 100644
index 0000000000000..0c1b347461acf
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-promise-dtor2.cpp
@@ -0,0 +1,53 @@
+// Test that destruction of promise is deferred if coroutine completed without suspending
+// RUN: %clang_cc1 -std=c++20 -triple=x86_64-unknown-linux-gnu -emit-llvm -o - %s -fexceptions -fcxx-exceptions -disable-llvm-passes | FileCheck %s
+
+#include "Inputs/coroutine.h"
+
+template<class T>
+struct coro {
+  struct RValueWrapper {
+    T* p;
+
+    operator T&&() const noexcept { return static_cast<T&&>(*p); }
+  };
+
+  using promise_type = T;
+
+  T& getDerived() noexcept { return *static_cast<T*>(this); }
+
+  auto get_return_object() noexcept { return RValueWrapper(&getDerived()); }
+  std::suspend_never initial_suspend() noexcept { return {}; }
+  std::suspend_never final_suspend() noexcept { return {}; }
+  void return_value(T&& x) noexcept { getDerived() = static_cast<T&&>(x); }
+  void unhandled_exception() {}
+};
+
+struct A : public coro<A> {
+  int a;
+
+  ~A() {}
+};
+
+A func() {
+  A aa{};
+  aa.a = 5;
+  co_return static_cast<A&&>(aa);
+}
+
+
+// CHECK-LABEL: define {{.*}} void @_Z4funcv(
+// CHECK: coro.end:
+// CHECK-NEXT: %never.suspend = phi i1 [ false, %cleanup.cont16 ], [ true, %cleanup12 ], [ false, %after.coro.free ], [ false, %final.suspend ], [ false, %init.suspend ]
+// CHECK-NEXT: call i1 @llvm.coro.end
+// CHECK-NEXT: ptr @_ZNK4coroI1AE13RValueWrappercvOS0_Ev
+// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i64
+// CHECK-NEXT: br i1 %never.suspend, label %promise.free.defer, label %coro.ret
+
+// CHECK: promise.free.defer:
+// CHECK-NEXT: call void @_ZN1AD1Ev
+// CHECK-NEXT: call void @llvm.lifetime.end.p0
+// CHECK-NEXT: call ptr @llvm.coro.free
+
+// CHECK: coro.ret:
+// CHECK-NEXT: call void @llvm.lifetime.end.p0
+// CHECK-NEXT: ret void
diff --git a/clang/test/CodeGenCoroutines/coro-symmetric-transfer-02.cpp b/clang/test/CodeGenCoroutines/coro-symmetric-transfer-02.cpp
index f36f89926505f..41012065f914a 100644
--- a/clang/test/CodeGenCoroutines/coro-symmetric-transfer-02.cpp
+++ b/clang/test/CodeGenCoroutines/coro-symmetric-transfer-02.cpp
@@ -91,7 +91,7 @@ Task bar() {
 // CHECK-NEXT:    %{{.+}} = call token @llvm.coro.save(ptr null)
 // CHECK-NEXT:    call void @llvm.coro.await.suspend.handle
 // CHECK-NEXT:    %{{.+}} = call i8 @llvm.coro.suspend
-// CHECK-NEXT:    switch i8 %{{.+}}, label %coro.ret [
+// CHECK-NEXT:    switch i8 %{{.+}}, label %coro.end [
 // CHECK-NEXT:      i8 0, label %[[CASE1_AWAIT_READY]]
 // CHECK-NEXT:      i8 1, label %[[CASE1_AWAIT_CLEANUP:.+]]
 // CHECK-NEXT:    ]
@@ -105,7 +105,7 @@ Task bar() {
 // CHECK-NEXT:    %{{.+}} = call token @llvm.coro.save(ptr null)
 // CHECK-NEXT:    call void @llvm.coro.await.suspend.handle
 // CHECK-NEXT:    %{{.+}} = call i8 @llvm.coro.suspend
-// CHECK-NEXT:    switch i8 %{{.+}}, label %coro.ret [
+// CHECK-NEXT:    switch i8 %{{.+}}, label %coro.end [
 // CHECK-NEXT:      i8 0, label %[[CASE2_AWAIT_READY]]
 // CHECK-NEXT:      i8 1, label %[[CASE2_AWAIT_CLEANUP:.+]]
 // CHECK-NEXT:    ]
diff --git a/llvm/docs/Coroutines.rst b/llvm/docs/Coroutines.rst
index 7472c68a70df4..c87657666c31e 100644
--- a/llvm/docs/Coroutines.rst
+++ b/llvm/docs/Coroutines.rst
@@ -1477,7 +1477,7 @@ The purpose of this intrinsic is to allow frontends to mark the cleanup and
 other code that is only relevant during the initial invocation of the coroutine
 and should not be present in resume and destroy parts.
 
-In returned-continuation lowering, ``llvm.coro.end`` fully destroys the
+* In returned-continuation lowering, ``llvm.coro.end`` fully destroys the
 coroutine frame.  If the second argument is `false`, it also returns from
 the coroutine with a null continuation pointer, and the next instruction
 will be unreachable.  If the second argument is `true`, it falls through
@@ -1485,13 +1485,12 @@ so that the following logic can resume unwinding.  In a yield-once
 coroutine, reaching a non-unwind ``llvm.coro.end`` without having first
 reached a ``llvm.coro.suspend.retcon`` has undefined behavior.
 
-The remainder of this section describes the behavior under switched-resume
-lowering.
-
-This intrinsic is lowered when a coroutine is split into
-the start, resume and destroy parts. In the start part, it is a no-op,
+* In switched-resume lowering, This intrinsic is lowered when a coroutine is
+split into the start, resume and destroy parts. In the start part, it is a no-op,
 in resume and destroy parts, it is replaced with `ret void` instruction and
-the rest of the block containing `coro.end` instruction is discarded.
+the rest of the block containing `coro.end` instruction is discarded if the
+return value of `coro.end` is unused.
+
 In landing pads it is replaced with an appropriate instruction to unwind to
 caller. The handling of coro.end differs depending on whether the target is
 using landingpad or WinEH exception model.
diff --git a/llvm/include/llvm/Transforms/Coroutines/CoroInstr.h b/llvm/include/llvm/Transforms/Coroutines/CoroInstr.h
index 0688068167ae6..7ec2913044e43 100644
--- a/llvm/include/llvm/Transforms/Coroutines/CoroInstr.h
+++ b/llvm/include/llvm/Transforms/Coroutines/CoroInstr.h
@@ -682,7 +682,7 @@ class AnyCoroEndInst : public IntrinsicInst {
   enum { FrameArg, UnwindArg, TokenArg };
 
 public:
-  bool isFallthrough() const { return !isUnwind(); }
+  bool isFallthrough() const { return !isUnwind() && user_empty(); }
   bool isUnwind() const {
     return cast<Constant>(getArgOperand(UnwindArg))->isOneValue();
   }
diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
index 64b33e46404f0..63ba80588a5d3 100644
--- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -225,7 +225,7 @@ static void replaceFallthroughCoroEnd(AnyCoroEndInst *End,
            "switch coroutine should not return any values");
     // coro.end doesn't immediately end the coroutine in the main function
     // in this lowering, because we need to deallocate the coroutine.
-    if (!InResume)
+    if (!InResume || !End->user_empty())
       return;
     Builder.CreateRetVoid();
     break;
diff --git a/llvm/test/Transforms/Coroutines/coro-split-coroend.ll b/llvm/test/Transforms/Coroutines/coro-split-coroend.ll
new file mode 100644
index 0000000000000..a8bd8504fc4c0
--- /dev/null
+++ b/llvm/test/Transforms/Coroutines/coro-split-coroend.ll
@@ -0,0 +1,42 @@
+; Check that we do not lower coro.end to 'ret void' if its return value is used
+; RUN: opt < %s -passes='cgscc(coro-split),simplifycfg,early-cse' -S | FileCheck %s
+
+define ptr @f1() presplitcoroutine {
+  ; CHECK-LABEL: define internal fastcc void @f1.resume(
+  ; CHECK-NOT: call void @a()
+entry:
+  %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
+  %hdl = call ptr @llvm.coro.begin(token %id, ptr null)
+  %0 = ...
[truncated]

@ChuanqiXu9
Copy link
Member

I don't understand this:

Do not lower coro.end to ret void if its return value will be used. This allows front end separate coro start section and resume section without early return from resume function.

Could elaborate it? Especially why it is safe to not lower coro.end to ret void?

@NewSigma
Copy link
Contributor Author

NewSigma commented Aug 1, 2025

Could elaborate it? Especially why it is safe to not lower coro.end to ret void?

The idea is: if we lower coro.end to ret void, users of coro.end will become unreachable. Thus, I assume that if the frontend uses the return value of coro.end, the intention is to determine whether we are in the start or resume part instead of returning.

Personally, I think the problem is that coro.end mixes two functionalities: querying where we are and lowering to some code. I think it might be reasonable to introduce a new intrinsic, llvm.coro.where, and drop the return value of coro.end. However, I’d prefer to hear others’ thoughts before actually doing it.

if (CoroEnds.size() > 1) {
if (CoroEnds.front()->isFallthrough())
report_fatal_error(
"Only one coro.end can be marked as fallthrough");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting that simplifycfg might duplicate coro.end in some cases, I suggest relaxing this restriction.

@ChuanqiXu9
Copy link
Member

Could elaborate it? Especially why it is safe to not lower coro.end to ret void?

The idea is: if we lower coro.end to ret void, users of coro.end will become unreachable. Thus, I assume that if the frontend uses the return value of coro.end, the intention is to determine whether we are in the start or resume part instead of returning.

Personally, I think the problem is that coro.end mixes two functionalities: querying where we are and lowering to some code. I think it might be reasonable to introduce a new intrinsic, llvm.coro.where, and drop the return value of coro.end. However, I’d prefer to hear others’ thoughts before actually doing it.

Since I didn't have time to look into details, the instinct reaction to my mind is, then why do we change the lowering of coro.end conditionally? I mean, if it is good, it will be better and simpler to do it unconditionally.

@NewSigma
Copy link
Contributor Author

NewSigma commented Aug 7, 2025

then why do we change the lowering of coro.end conditionally? I mean, if it is good, it will be better and simpler to do it unconditionally.

It is reasonable. I will use a new intrinsic instead of lower coro.end conditionally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category coroutines C++20 coroutines llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[clang] [coroutine] Initialize return value before destruction of coroutine state
3 participants