Skip to content

[CIR] Upstream limited support for nested structures #136331

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 18, 2025

Conversation

andykaylor
Copy link
Contributor

Previously when we checked to see if it was safe to create CIR for a structure type, we were conservatively saying no if any structure was in the process of being converted. That prevented handling nested structures, even when it would have actually been safe to handle them. Handling structures which recursively reference structures currently being processed requires deferring the handling of the recursively referenced structure, and that still isn't implemented after this change.

This change adds the less conservative check that allows handling of non-recursive nested structures.

Previously when we checked to see if it was safe to create CIR for a
structure type, we were conservatively saying no if any structure was
in the process of being converted. That prevented handling nested
structures, even when it would have actually been safe to handle them.
Handling structures which recursively reference structures currently
being processed requires deferring the handling of the recursively
referenced structure, and that still isn't implemented after this
change.

This change adds the less conservative check that allows handling of
non-recursive nested structures.
@llvmbot llvmbot added clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project labels Apr 18, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 18, 2025

@llvm/pr-subscribers-clang

Author: Andy Kaylor (andykaylor)

Changes

Previously when we checked to see if it was safe to create CIR for a structure type, we were conservatively saying no if any structure was in the process of being converted. That prevented handling nested structures, even when it would have actually been safe to handle them. Handling structures which recursively reference structures currently being processed requires deferring the handling of the recursively referenced structure, and that still isn't implemented after this change.

This change adds the less conservative check that allows handling of non-recursive nested structures.


Full diff: https://github.com/llvm/llvm-project/pull/136331.diff

3 Files Affected:

  • (modified) clang/lib/CIR/CodeGen/CIRGenTypes.cpp (+81-4)
  • (modified) clang/lib/CIR/CodeGen/CIRGenTypes.h (+4)
  • (modified) clang/test/CIR/CodeGen/struct.c (+21)
diff --git a/clang/lib/CIR/CodeGen/CIRGenTypes.cpp b/clang/lib/CIR/CodeGen/CIRGenTypes.cpp
index a896b9dce14d6..7bd86cf0c7bcd 100644
--- a/clang/lib/CIR/CodeGen/CIRGenTypes.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenTypes.cpp
@@ -116,16 +116,93 @@ std::string CIRGenTypes::getRecordTypeName(const clang::RecordDecl *recordDecl,
   return builder.getUniqueRecordName(std::string(typeName));
 }
 
+/// Return true if the specified type is already completely laid out.
+bool CIRGenTypes::isRecordLayoutComplete(const Type *ty) const {
+  const auto it = recordDeclTypes.find(ty);
+  return it != recordDeclTypes.end() && it->second.isComplete();
+}
+
+// We have multiple forms of this function that call each other, so we need to
+// declare one in advance.
+static bool
+isSafeToConvert(QualType qt, CIRGenTypes &cgt,
+                llvm::SmallPtrSetImpl<const RecordDecl *> &alreadyChecked);
+
+/// Return true if it is safe to convert the specified record decl to CIR and
+/// lay it out, false if doing so would cause us to get into a recursive
+/// compilation mess.
+static bool
+isSafeToConvert(const RecordDecl *rd, CIRGenTypes &cgt,
+                llvm::SmallPtrSetImpl<const RecordDecl *> &alreadyChecked) {
+  // If we have already checked this type (maybe the same type is used by-value
+  // multiple times in multiple record fields, don't check again.
+  if (!alreadyChecked.insert(rd).second)
+    return true;
+
+  const Type *key = cgt.getASTContext().getTagDeclType(rd).getTypePtr();
+
+  // If this type is already laid out, converting it is a noop.
+  if (cgt.isRecordLayoutComplete(key))
+    return true;
+
+  // If this type is currently being laid out, we can't recursively compile it.
+  if (cgt.isRecordBeingLaidOut(key))
+    return false;
+
+  // If this type would require laying out bases that are currently being laid
+  // out, don't do it.  This includes virtual base classes which get laid out
+  // when a class is translated, even though they aren't embedded by-value into
+  // the class.
+  if (const CXXRecordDecl *crd = dyn_cast<CXXRecordDecl>(rd)) {
+    assert(!cir::MissingFeatures::cxxSupport());
+    cgt.getCGModule().errorNYI(rd->getSourceRange(),
+                               "isSafeToConvert: CXXRecordDecl");
+    return false;
+  }
+
+  // If this type would require laying out members that are currently being laid
+  // out, don't do it.
+  for (const FieldDecl *field : rd->fields())
+    if (!isSafeToConvert(field->getType(), cgt, alreadyChecked))
+      return false;
+
+  // If there are no problems, lets do it.
+  return true;
+}
+
+/// Return true if it is safe to convert this field type, which requires the
+/// record elements contained by-value to all be recursively safe to convert.
+static bool
+isSafeToConvert(QualType qt, CIRGenTypes &cgt,
+                llvm::SmallPtrSetImpl<const RecordDecl *> &alreadyChecked) {
+  // Strip off atomic type sugar.
+  if (const auto *at = qt->getAs<AtomicType>())
+    qt = at->getValueType();
+
+  // If this is a record, check it.
+  if (const auto *rt = qt->getAs<RecordType>())
+    return isSafeToConvert(rt->getDecl(), cgt, alreadyChecked);
+
+  // If this is an array, check the elements, which are embedded inline.
+  if (const auto *at = cgt.getASTContext().getAsArrayType(qt))
+    return isSafeToConvert(at->getElementType(), cgt, alreadyChecked);
+
+  // Otherwise, there is no concern about transforming this. We only care about
+  // things that are contained by-value in a record that can have another
+  // record as a member.
+  return true;
+}
+
 // Return true if it is safe to convert the specified record decl to CIR and lay
 // it out, false if doing so would cause us to get into a recursive compilation
 // mess.
-static bool isSafeToConvert(const RecordDecl *RD, CIRGenTypes &CGT) {
+static bool isSafeToConvert(const RecordDecl *rd, CIRGenTypes &cgt) {
   // If no records are being laid out, we can certainly do this one.
-  if (CGT.noRecordsBeingLaidOut())
+  if (cgt.noRecordsBeingLaidOut())
     return true;
 
-  assert(!cir::MissingFeatures::recursiveRecordLayout());
-  return false;
+  llvm::SmallPtrSet<const RecordDecl *, 16> alreadyChecked;
+  return isSafeToConvert(rd, cgt, alreadyChecked);
 }
 
 /// Lay out a tagged decl type like struct or union.
diff --git a/clang/lib/CIR/CodeGen/CIRGenTypes.h b/clang/lib/CIR/CodeGen/CIRGenTypes.h
index fc6e9cf621cb3..2bb78420700f8 100644
--- a/clang/lib/CIR/CodeGen/CIRGenTypes.h
+++ b/clang/lib/CIR/CodeGen/CIRGenTypes.h
@@ -89,7 +89,11 @@ class CIRGenTypes {
   mlir::MLIRContext &getMLIRContext() const;
   clang::ASTContext &getASTContext() const { return astContext; }
 
+  bool isRecordLayoutComplete(const clang::Type *ty) const;
   bool noRecordsBeingLaidOut() const { return recordsBeingLaidOut.empty(); }
+  bool isRecordBeingLaidOut(const clang::Type *ty) const {
+    return recordsBeingLaidOut.count(ty);
+  }
 
   const ABIInfo &getABIInfo() const { return theABIInfo; }
 
diff --git a/clang/test/CIR/CodeGen/struct.c b/clang/test/CIR/CodeGen/struct.c
index 55f316f27eeb1..3dc1655e15d2c 100644
--- a/clang/test/CIR/CodeGen/struct.c
+++ b/clang/test/CIR/CodeGen/struct.c
@@ -8,9 +8,13 @@
 // For LLVM IR checks, the structs are defined before the variables, so these
 // checks are at the top.
 // LLVM: %struct.CompleteS = type { i32, i8 }
+// LLVM: %struct.OuterS = type { %struct.InnerS, i32 }
+// LLVM: %struct.InnerS = type { i32, i8 }
 // LLVM: %struct.PackedS = type <{ i32, i8 }>
 // LLVM: %struct.PackedAndPaddedS = type <{ i32, i8, i8 }>
 // OGCG: %struct.CompleteS = type { i32, i8 }
+// OGCG: %struct.OuterS = type { %struct.InnerS, i32 }
+// OGCG: %struct.InnerS = type { i32, i8 }
 // OGCG: %struct.PackedS = type <{ i32, i8 }>
 // OGCG: %struct.PackedAndPaddedS = type <{ i32, i8, i8 }>
 
@@ -31,6 +35,23 @@ struct CompleteS {
 // LLVM:      @cs = dso_local global %struct.CompleteS zeroinitializer
 // OGCG:      @cs = global %struct.CompleteS zeroinitializer, align 4
 
+struct InnerS {
+  int a;
+  char b;
+};
+
+struct OuterS {
+  struct InnerS is;
+  int c;
+};
+
+struct OuterS os;
+
+// CIR:       cir.global external @os = #cir.zero : !cir.record<struct
+// CIR-SAME:      "OuterS" {!cir.record<struct "InnerS" {!s32i, !s8i}>, !s32i}>
+// LLVM:      @os = dso_local global %struct.OuterS zeroinitializer
+// OGCG:      @os = global %struct.OuterS zeroinitializer, align 4
+
 #pragma pack(push)
 #pragma pack(1)
 

@llvmbot
Copy link
Member

llvmbot commented Apr 18, 2025

@llvm/pr-subscribers-clangir

Author: Andy Kaylor (andykaylor)

Changes

Previously when we checked to see if it was safe to create CIR for a structure type, we were conservatively saying no if any structure was in the process of being converted. That prevented handling nested structures, even when it would have actually been safe to handle them. Handling structures which recursively reference structures currently being processed requires deferring the handling of the recursively referenced structure, and that still isn't implemented after this change.

This change adds the less conservative check that allows handling of non-recursive nested structures.


Full diff: https://github.com/llvm/llvm-project/pull/136331.diff

3 Files Affected:

  • (modified) clang/lib/CIR/CodeGen/CIRGenTypes.cpp (+81-4)
  • (modified) clang/lib/CIR/CodeGen/CIRGenTypes.h (+4)
  • (modified) clang/test/CIR/CodeGen/struct.c (+21)
diff --git a/clang/lib/CIR/CodeGen/CIRGenTypes.cpp b/clang/lib/CIR/CodeGen/CIRGenTypes.cpp
index a896b9dce14d6..7bd86cf0c7bcd 100644
--- a/clang/lib/CIR/CodeGen/CIRGenTypes.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenTypes.cpp
@@ -116,16 +116,93 @@ std::string CIRGenTypes::getRecordTypeName(const clang::RecordDecl *recordDecl,
   return builder.getUniqueRecordName(std::string(typeName));
 }
 
+/// Return true if the specified type is already completely laid out.
+bool CIRGenTypes::isRecordLayoutComplete(const Type *ty) const {
+  const auto it = recordDeclTypes.find(ty);
+  return it != recordDeclTypes.end() && it->second.isComplete();
+}
+
+// We have multiple forms of this function that call each other, so we need to
+// declare one in advance.
+static bool
+isSafeToConvert(QualType qt, CIRGenTypes &cgt,
+                llvm::SmallPtrSetImpl<const RecordDecl *> &alreadyChecked);
+
+/// Return true if it is safe to convert the specified record decl to CIR and
+/// lay it out, false if doing so would cause us to get into a recursive
+/// compilation mess.
+static bool
+isSafeToConvert(const RecordDecl *rd, CIRGenTypes &cgt,
+                llvm::SmallPtrSetImpl<const RecordDecl *> &alreadyChecked) {
+  // If we have already checked this type (maybe the same type is used by-value
+  // multiple times in multiple record fields, don't check again.
+  if (!alreadyChecked.insert(rd).second)
+    return true;
+
+  const Type *key = cgt.getASTContext().getTagDeclType(rd).getTypePtr();
+
+  // If this type is already laid out, converting it is a noop.
+  if (cgt.isRecordLayoutComplete(key))
+    return true;
+
+  // If this type is currently being laid out, we can't recursively compile it.
+  if (cgt.isRecordBeingLaidOut(key))
+    return false;
+
+  // If this type would require laying out bases that are currently being laid
+  // out, don't do it.  This includes virtual base classes which get laid out
+  // when a class is translated, even though they aren't embedded by-value into
+  // the class.
+  if (const CXXRecordDecl *crd = dyn_cast<CXXRecordDecl>(rd)) {
+    assert(!cir::MissingFeatures::cxxSupport());
+    cgt.getCGModule().errorNYI(rd->getSourceRange(),
+                               "isSafeToConvert: CXXRecordDecl");
+    return false;
+  }
+
+  // If this type would require laying out members that are currently being laid
+  // out, don't do it.
+  for (const FieldDecl *field : rd->fields())
+    if (!isSafeToConvert(field->getType(), cgt, alreadyChecked))
+      return false;
+
+  // If there are no problems, lets do it.
+  return true;
+}
+
+/// Return true if it is safe to convert this field type, which requires the
+/// record elements contained by-value to all be recursively safe to convert.
+static bool
+isSafeToConvert(QualType qt, CIRGenTypes &cgt,
+                llvm::SmallPtrSetImpl<const RecordDecl *> &alreadyChecked) {
+  // Strip off atomic type sugar.
+  if (const auto *at = qt->getAs<AtomicType>())
+    qt = at->getValueType();
+
+  // If this is a record, check it.
+  if (const auto *rt = qt->getAs<RecordType>())
+    return isSafeToConvert(rt->getDecl(), cgt, alreadyChecked);
+
+  // If this is an array, check the elements, which are embedded inline.
+  if (const auto *at = cgt.getASTContext().getAsArrayType(qt))
+    return isSafeToConvert(at->getElementType(), cgt, alreadyChecked);
+
+  // Otherwise, there is no concern about transforming this. We only care about
+  // things that are contained by-value in a record that can have another
+  // record as a member.
+  return true;
+}
+
 // Return true if it is safe to convert the specified record decl to CIR and lay
 // it out, false if doing so would cause us to get into a recursive compilation
 // mess.
-static bool isSafeToConvert(const RecordDecl *RD, CIRGenTypes &CGT) {
+static bool isSafeToConvert(const RecordDecl *rd, CIRGenTypes &cgt) {
   // If no records are being laid out, we can certainly do this one.
-  if (CGT.noRecordsBeingLaidOut())
+  if (cgt.noRecordsBeingLaidOut())
     return true;
 
-  assert(!cir::MissingFeatures::recursiveRecordLayout());
-  return false;
+  llvm::SmallPtrSet<const RecordDecl *, 16> alreadyChecked;
+  return isSafeToConvert(rd, cgt, alreadyChecked);
 }
 
 /// Lay out a tagged decl type like struct or union.
diff --git a/clang/lib/CIR/CodeGen/CIRGenTypes.h b/clang/lib/CIR/CodeGen/CIRGenTypes.h
index fc6e9cf621cb3..2bb78420700f8 100644
--- a/clang/lib/CIR/CodeGen/CIRGenTypes.h
+++ b/clang/lib/CIR/CodeGen/CIRGenTypes.h
@@ -89,7 +89,11 @@ class CIRGenTypes {
   mlir::MLIRContext &getMLIRContext() const;
   clang::ASTContext &getASTContext() const { return astContext; }
 
+  bool isRecordLayoutComplete(const clang::Type *ty) const;
   bool noRecordsBeingLaidOut() const { return recordsBeingLaidOut.empty(); }
+  bool isRecordBeingLaidOut(const clang::Type *ty) const {
+    return recordsBeingLaidOut.count(ty);
+  }
 
   const ABIInfo &getABIInfo() const { return theABIInfo; }
 
diff --git a/clang/test/CIR/CodeGen/struct.c b/clang/test/CIR/CodeGen/struct.c
index 55f316f27eeb1..3dc1655e15d2c 100644
--- a/clang/test/CIR/CodeGen/struct.c
+++ b/clang/test/CIR/CodeGen/struct.c
@@ -8,9 +8,13 @@
 // For LLVM IR checks, the structs are defined before the variables, so these
 // checks are at the top.
 // LLVM: %struct.CompleteS = type { i32, i8 }
+// LLVM: %struct.OuterS = type { %struct.InnerS, i32 }
+// LLVM: %struct.InnerS = type { i32, i8 }
 // LLVM: %struct.PackedS = type <{ i32, i8 }>
 // LLVM: %struct.PackedAndPaddedS = type <{ i32, i8, i8 }>
 // OGCG: %struct.CompleteS = type { i32, i8 }
+// OGCG: %struct.OuterS = type { %struct.InnerS, i32 }
+// OGCG: %struct.InnerS = type { i32, i8 }
 // OGCG: %struct.PackedS = type <{ i32, i8 }>
 // OGCG: %struct.PackedAndPaddedS = type <{ i32, i8, i8 }>
 
@@ -31,6 +35,23 @@ struct CompleteS {
 // LLVM:      @cs = dso_local global %struct.CompleteS zeroinitializer
 // OGCG:      @cs = global %struct.CompleteS zeroinitializer, align 4
 
+struct InnerS {
+  int a;
+  char b;
+};
+
+struct OuterS {
+  struct InnerS is;
+  int c;
+};
+
+struct OuterS os;
+
+// CIR:       cir.global external @os = #cir.zero : !cir.record<struct
+// CIR-SAME:      "OuterS" {!cir.record<struct "InnerS" {!s32i, !s8i}>, !s32i}>
+// LLVM:      @os = dso_local global %struct.OuterS zeroinitializer
+// OGCG:      @os = global %struct.OuterS zeroinitializer, align 4
+
 #pragma pack(push)
 #pragma pack(1)
 

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 oddity that I'm sure comes from 'the future' :D Else LGTM.

@andykaylor andykaylor merged commit 3191cfd into llvm:main Apr 18, 2025
14 checks passed
@andykaylor andykaylor deleted the cir-nested-structs branch April 18, 2025 18:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants