Skip to content

[C++20] [Modules] Don't generate call to an imported module that dont init anything #67638

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 28, 2023

Conversation

ChuanqiXu9
Copy link
Member

Close #56794

And see #67582 for a detailed backgrond for the issue.

As required by the Itanium ABI, the module units have to generate the initialization function. However, the importers are allowed to elide the call to the initialization function if they are sure the initialization function doesn't do anything.

This patch implemented this semantics.

@ChuanqiXu9 ChuanqiXu9 requested a review from iains September 28, 2023 07:37
@ChuanqiXu9 ChuanqiXu9 added clang:modules C++20 modules and Clang Header Modules ABI Application Binary Interface labels Sep 28, 2023
@ChuanqiXu9 ChuanqiXu9 requested a review from urnathan September 28, 2023 07:37
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Sep 28, 2023
@llvmbot
Copy link
Member

llvmbot commented Sep 28, 2023

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-modules

Changes

Close #56794

And see #67582 for a detailed backgrond for the issue.

As required by the Itanium ABI, the module units have to generate the initialization function. However, the importers are allowed to elide the call to the initialization function if they are sure the initialization function doesn't do anything.

This patch implemented this semantics.


Full diff: https://github.com/llvm/llvm-project/pull/67638.diff

7 Files Affected:

  • (modified) clang/include/clang/Basic/Module.h (+8)
  • (modified) clang/lib/Basic/Module.cpp (+1-1)
  • (modified) clang/lib/CodeGen/CGDeclCXX.cpp (+5-2)
  • (modified) clang/lib/Sema/Sema.cpp (+20)
  • (modified) clang/lib/Serialization/ASTReader.cpp (+2)
  • (modified) clang/lib/Serialization/ASTWriter.cpp (+3-1)
  • (modified) clang/test/CodeGenCXX/module-initializer-guard-elision.cpp (+41-12)
diff --git a/clang/include/clang/Basic/Module.h b/clang/include/clang/Basic/Module.h
index edbd2dfc2973e92..fc8c39719979ecf 100644
--- a/clang/include/clang/Basic/Module.h
+++ b/clang/include/clang/Basic/Module.h
@@ -358,6 +358,10 @@ class alignas(8) Module {
   /// to a regular (public) module map.
   unsigned ModuleMapIsPrivate : 1;
 
+  /// Whether this C++20 named modules doesn't need an initializer.
+  /// This is only meaningful for C++20 modules.
+  unsigned NamedModuleNoInit : 1;
+
   /// Describes the visibility of the various names within a
   /// particular module.
   enum NameVisibilityKind {
@@ -596,6 +600,10 @@ class alignas(8) Module {
     return Kind == ModuleInterfaceUnit || Kind == ModulePartitionInterface;
   }
 
+  bool isNamedModuleInterfaceNoInit() const {
+    return NamedModuleNoInit;
+  }
+
   /// Get the primary module interface name from a partition.
   StringRef getPrimaryModuleInterfaceName() const {
     // Technically, global module fragment belongs to global module. And global
diff --git a/clang/lib/Basic/Module.cpp b/clang/lib/Basic/Module.cpp
index 4e6fbfc754b9173..82769f95bca90aa 100644
--- a/clang/lib/Basic/Module.cpp
+++ b/clang/lib/Basic/Module.cpp
@@ -44,7 +44,7 @@ Module::Module(StringRef Name, SourceLocation DefinitionLoc, Module *Parent,
       InferSubmodules(false), InferExplicitSubmodules(false),
       InferExportWildcard(false), ConfigMacrosExhaustive(false),
       NoUndeclaredIncludes(false), ModuleMapIsPrivate(false),
-      NameVisibility(Hidden) {
+      NamedModuleNoInit(false), NameVisibility(Hidden) {
   if (Parent) {
     IsAvailable = Parent->isAvailable();
     IsUnimportable = Parent->isUnimportable();
diff --git a/clang/lib/CodeGen/CGDeclCXX.cpp b/clang/lib/CodeGen/CGDeclCXX.cpp
index 80543ba6311de45..a163069adf7c62c 100644
--- a/clang/lib/CodeGen/CGDeclCXX.cpp
+++ b/clang/lib/CodeGen/CGDeclCXX.cpp
@@ -675,6 +675,10 @@ void CodeGenModule::EmitCXXModuleInitFunc(Module *Primary) {
     // No Itanium initializer in header like modules.
     if (M->isHeaderLikeModule())
       continue; // TODO: warn of mixed use of module map modules and C++20?
+    // We're allowed to skip the initialization if we are sure it doesn't
+    // do any thing.
+    if (M->isNamedModuleInterfaceNoInit())
+      continue;
     llvm::FunctionType *FTy = llvm::FunctionType::get(VoidTy, false);
     SmallString<256> FnName;
     {
@@ -731,8 +735,7 @@ void CodeGenModule::EmitCXXModuleInitFunc(Module *Primary) {
     // If we have a completely empty initializer then we do not want to create
     // the guard variable.
     ConstantAddress GuardAddr = ConstantAddress::invalid();
-    if (!AllImports.empty() || !PrioritizedCXXGlobalInits.empty() ||
-        !CXXGlobalInits.empty()) {
+    if (!ModuleInits.empty()) {
       // Create the guard var.
       llvm::GlobalVariable *Guard = new llvm::GlobalVariable(
           getModule(), Int8Ty, /*isConstant=*/false,
diff --git a/clang/lib/Sema/Sema.cpp b/clang/lib/Sema/Sema.cpp
index a401017d4c1c0b8..d2a7fcb0ee27d89 100644
--- a/clang/lib/Sema/Sema.cpp
+++ b/clang/lib/Sema/Sema.cpp
@@ -1245,6 +1245,26 @@ void Sema::ActOnEndOfTranslationUnit() {
       }
     }
 
+    // Now we can decide whether the modules we're building need an initializer.
+    if (Module *CurrentModule = getCurrentModule(); CurrentModule && CurrentModule->isInterfaceOrPartition()) {      
+      auto DoesModNeedInit = [this] (Module *M) {
+        if (!getASTContext().getModuleInitializers(M).empty())
+          return false;
+        for (auto [Exported, _] : M->Exports)
+          if (!Exported->isNamedModuleInterfaceNoInit())
+            return false;
+        for (Module *I : M->Imports)
+          if (!I->isNamedModuleInterfaceNoInit())
+            return false;
+
+        return true;
+      };
+
+      CurrentModule->NamedModuleNoInit = DoesModNeedInit(CurrentModule);
+      for (Module *SubModules : CurrentModule->submodules())
+        CurrentModule->NamedModuleNoInit &= DoesModNeedInit(SubModules);
+    }
+
     // Warnings emitted in ActOnEndOfTranslationUnit() should be emitted for
     // modules when they are built, not every time they are used.
     emitAndClearUnusedLocalTypedefWarnings();
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..ccc201e57a6277b 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -5685,6 +5685,7 @@ llvm::Error ASTReader::ReadSubmoduleBlock(ModuleFile &F,
       bool InferExportWildcard = Record[Idx++];
       bool ConfigMacrosExhaustive = Record[Idx++];
       bool ModuleMapIsPrivate = Record[Idx++];
+      bool NamedModuleNoInit = Record[Idx++];
 
       Module *ParentModule = nullptr;
       if (Parent)
@@ -5735,6 +5736,7 @@ llvm::Error ASTReader::ReadSubmoduleBlock(ModuleFile &F,
       CurrentModule->InferExportWildcard = InferExportWildcard;
       CurrentModule->ConfigMacrosExhaustive = ConfigMacrosExhaustive;
       CurrentModule->ModuleMapIsPrivate = ModuleMapIsPrivate;
+      CurrentModule->NamedModuleNoInit = NamedModuleNoInit;
       if (DeserializationListener)
         DeserializationListener->ModuleRead(GlobalID, CurrentModule);
 
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index 216ca94111e156b..b7a8807fcb04ccf 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -2779,6 +2779,7 @@ void ASTWriter::WriteSubmodules(Module *WritingModule) {
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // InferExportWild...
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // ConfigMacrosExh...
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // ModuleMapIsPriv...
+  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // NamedModuleNoInit
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob)); // Name
   unsigned DefinitionAbbrev = Stream.EmitAbbrev(std::move(Abbrev));
 
@@ -2888,7 +2889,8 @@ void ASTWriter::WriteSubmodules(Module *WritingModule) {
                                          Mod->InferExplicitSubmodules,
                                          Mod->InferExportWildcard,
                                          Mod->ConfigMacrosExhaustive,
-                                         Mod->ModuleMapIsPrivate};
+                                         Mod->ModuleMapIsPrivate,
+                                         Mod->NamedModuleNoInit};
       Stream.EmitRecordWithBlob(DefinitionAbbrev, Record, Mod->Name);
     }
 
diff --git a/clang/test/CodeGenCXX/module-initializer-guard-elision.cpp b/clang/test/CodeGenCXX/module-initializer-guard-elision.cpp
index 2b01a670fb72764..58189ca949d8e01 100644
--- a/clang/test/CodeGenCXX/module-initializer-guard-elision.cpp
+++ b/clang/test/CodeGenCXX/module-initializer-guard-elision.cpp
@@ -8,14 +8,24 @@
 // RUN:  -o - | FileCheck %s --check-prefix=CHECK-O
 
 // RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 P.cpp \
-// RUN:    -emit-module-interface -fmodule-file=O.pcm -o P.pcm
+// RUN:    -emit-module-interface -fmodule-file=O=O.pcm -o P.pcm
 // RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 P.pcm -S -emit-llvm \
 // RUN:  -o - | FileCheck %s --check-prefix=CHECK-P
 
 // RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 Q.cpp \
-// RUN:    -emit-module-interface -fmodule-file=O.pcm -o Q.pcm
+// RUN:    -emit-module-interface -o Q.pcm
 // RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 Q.pcm -S -emit-llvm \
-// RUN:  -o - | FileCheck %s --check-prefix=CHECK-Q
+// RUN:    -o - | FileCheck %s --check-prefix=CHECK-Q
+
+// RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 R.cpp \
+// RUN:    -emit-module-interface -fmodule-file=Q=Q.pcm -o R.pcm
+// RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 R.pcm -S -emit-llvm \
+// RUN:    -o - | FileCheck %s --check-prefix=CHECK-R
+
+// RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 S.cpp \
+// RUN:    -emit-module-interface -fmodule-file=Q=Q.pcm -fmodule-file=R=R.pcm -o S.pcm
+// RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 S.pcm -S -emit-llvm \
+// RUN:    -o - | FileCheck %s --check-prefix=CHECK-S
 
 // Testing cases where we can elide the module initializer guard variable.
 
@@ -31,8 +41,8 @@ export int foo ();
 // CHECK-O-NEXT: ret void
 // CHECK-O-NOT: @_ZGIW1O__in_chrg
 
-// This has no global inits but imports a module, and therefore needs a guard
-// variable.
+// This has no global inits and all the imported modules don't need inits. So
+// guard variable is not needed.
 //--- P.cpp
 
 export module P;
@@ -41,16 +51,14 @@ export import O;
 export int bar ();
 
 // CHECK-P: define void @_ZGIW1P
-// CHECK-P-LABEL: init
-// CHECK-P: store i8 1, ptr @_ZGIW1P__in_chrg
-// CHECK-P: call void @_ZGIW1O()
-// CHECK-P-NOT: call void @__cxx_global_var_init
+// CHECK-P-LABEL: entry
+// CHECK-P-NEXT: ret void
+// CHECK-P-NOT: @_ZGIW1P__in_chrg
 
-// This imports a module and has global inits, so needs a guard.
+// This has global inits, so needs a guard.
 //--- Q.cpp
 
 export module Q;
-export import O;
 
 export struct Quack {
   Quack(){};
@@ -64,6 +72,27 @@ export int baz ();
 // CHECK-Q: call {{.*}} @_ZNW1Q5QuackC1Ev
 // CHECK-Q: define void @_ZGIW1Q
 // CHECK-Q: store i8 1, ptr @_ZGIW1Q__in_chrg
-// CHECK-Q: call void @_ZGIW1O()
 // CHECK-Q: call void @__cxx_global_var_init
 
+// This doesn't have a global init, but it imports a module which needs global
+// init, so needs a guard
+//--- R.cpp
+
+export module R;
+export import Q;
+
+// CHECK-R: define void @_ZGIW1R
+// CHECK-R: store i8 1, ptr @_ZGIW1R__in_chrg
+// CHECK-R: call{{.*}}@_ZGIW1Q
+
+// This doesn't have a global init and the imported module doesn't have variables needs
+// dynamic initialization.
+// But the imported module contains modules initialization. So needs a guard.
+//--- S.cpp
+
+export module S;
+export import R;
+
+// CHECK-S: define void @_ZGIW1S
+// CHECK-S: store i8 1, ptr @_ZGIW1S__in_chrg
+// CHECK-S: call{{.*}}@_ZGIW1R

@ChuanqiXu9 ChuanqiXu9 self-assigned this Sep 28, 2023
@github-actions
Copy link

github-actions bot commented Sep 28, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@ChuanqiXu9 ChuanqiXu9 force-pushed the modules-elide-init-calls branch from 2448304 to 51c53a1 Compare September 28, 2023 07:54
Copy link
Contributor

@iains iains left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this.

Generally, this LGTM, but perhaps we can choose names that are unambiguously related to state - the current one could be read like an action.

maybe something like:

NamedModuleHasNoInit
and
namedModuleHasNoInit()

(or I'm open to other suggestions) ..

…t init anything

Close llvm#56794

And see llvm#67582 for a detailed
backgrond for the issue.

As required by the Itanium ABI, the module units have to generate the
initialization function. However, the importers are allowed to elide the
call to the initialization function if they are sure the initialization
function doesn't do anything.

This patch implemented this semantics.
@ChuanqiXu9
Copy link
Member Author

Opinions addressed.

@ChuanqiXu9 ChuanqiXu9 force-pushed the modules-elide-init-calls branch from 51c53a1 to 02b65f0 Compare September 28, 2023 09:25
Copy link
Contributor

@iains iains left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, LGTM

@ChuanqiXu9 ChuanqiXu9 merged commit 989173c into llvm:main Sep 28, 2023
Comment on lines +1265 to +1266
for (Module *SubModules : CurrentModule->submodules())
CurrentModule->NamedModuleHasNoInit &= DoesModNeedInit(SubModules);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be, maybe:

CurrentModule->NamedModuleHasNoInit = DoesModNeedInit(CurrentModule) && llvm::all_of(CurrentModule->submodules(), DoesModNeedInit);

(be slightly more efficient, since it'd bail out and return false as soon as one module needed init, rather than checking later modules)
Though the DoesModNeedInit might need a better name, since you'd expect that to return true if the module needs init, but the "no" in "noinit" is confusing/inverting - maybe it'd be better to name that flag "NamedModuleHasInit" - to avoid double negatives, etc?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. It is addressed in the NFC patch: 7e8a0e4

legrosbuffle pushed a commit to legrosbuffle/llvm-project that referenced this pull request Sep 29, 2023
… init anything (llvm#67638)

Close llvm#56794

And see llvm#67582 for a detailed
backgrond for the issue.

As required by the Itanium ABI, the module units have to generate the
initialization function. However, the importers are allowed to elide the
call to the initialization function if they are sure the initialization
function doesn't do anything.

This patch implemented this semantics.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ABI Application Binary Interface clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C++20] [Modules] Named Module Units would generate initializers unconditionally
4 participants