Skip to content

[CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined functions #75385

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 11, 2024

Conversation

dzhidzhoev
Copy link
Member

  • [DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)
  • [CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined functions

This is a follow-up for https://reviews.llvm.org/D144006, fixing a crash reported
in Chromium (https://reviews.llvm.org/D144006#4651955).

The first commit is added for convenience, as it has already been accepted.

If DISubpogram was not cloned (e.g. we are cloning a function that has other
functions inlined into it, and subprograms of the inlined functions are
not supposed to be cloned), it doesn't make sense to clone its
DILocalVariables as well.
Otherwise get duplicated DILocalVariables not tracked in their
subprogram's retainedNodes, that crash LTO with Chromium.

This is meant to be committed along with https://reviews.llvm.org/D144006.

@llvmbot llvmbot added clang Clang issues not falling into any other category debuginfo llvm:ir llvm:transforms labels Dec 13, 2023
@llvmbot
Copy link
Member

llvmbot commented Dec 13, 2023

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-debuginfo
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-clang

Author: Vladislav Dzhidzhoev (dzhidzhoev)

Changes
  • [DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)
  • [CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined functions

This is a follow-up for https://reviews.llvm.org/D144006, fixing a crash reported
in Chromium (https://reviews.llvm.org/D144006#4651955).

The first commit is added for convenience, as it has already been accepted.

If DISubpogram was not cloned (e.g. we are cloning a function that has other
functions inlined into it, and subprograms of the inlined functions are
not supposed to be cloned), it doesn't make sense to clone its
DILocalVariables as well.
Otherwise get duplicated DILocalVariables not tracked in their
subprogram's retainedNodes, that crash LTO with Chromium.

This is meant to be committed along with https://reviews.llvm.org/D144006.


Patch is 97.26 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/75385.diff

31 Files Affected:

  • (modified) clang/test/CodeGen/debug-info-codeview-unnamed.c (+8-8)
  • (modified) clang/test/CodeGen/debug-info-unused-types.c (+9-7)
  • (modified) clang/test/CodeGen/debug-info-unused-types.cpp (+8-6)
  • (modified) clang/test/CodeGenCXX/debug-info-access.cpp (+1-1)
  • (modified) clang/test/CodeGenCXX/debug-info-anon-union-vars.cpp (+6-6)
  • (modified) clang/test/CodeGenCXX/debug-info-codeview-unnamed.cpp (+62-48)
  • (modified) clang/test/CodeGenCXX/debug-info-gline-tables-only-codeview.cpp (+2-2)
  • (modified) clang/test/CodeGenCXX/debug-lambda-this.cpp (+2-2)
  • (modified) llvm/include/llvm/IR/DIBuilder.h (+3-3)
  • (modified) llvm/include/llvm/IR/DebugInfo.h (+9-1)
  • (modified) llvm/lib/Bitcode/Reader/MetadataLoader.cpp (+78-33)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp (+40-20)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h (+8-8)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+10-3)
  • (modified) llvm/lib/IR/DIBuilder.cpp (+27-10)
  • (modified) llvm/lib/IR/DebugInfo.cpp (+10-4)
  • (modified) llvm/lib/IR/Verifier.cpp (+3-3)
  • (modified) llvm/lib/Transforms/Utils/CloneFunction.cpp (+13-1)
  • (added) llvm/test/Bitcode/clone-local-types.ll (+46)
  • (modified) llvm/test/Bitcode/upgrade-cu-locals.ll (+41-27)
  • (modified) llvm/test/Bitcode/upgrade-cu-locals.ll.bc ()
  • (added) llvm/test/DebugInfo/Generic/inlined-local-type.ll (+128)
  • (added) llvm/test/DebugInfo/Generic/lexical-block-retained-types.ll (+55)
  • (added) llvm/test/DebugInfo/Generic/lexical-block-types.ll (+425)
  • (modified) llvm/test/DebugInfo/Generic/verifier-invalid-disubprogram.ll (+1-1)
  • (added) llvm/test/DebugInfo/X86/local-type-as-template-parameter.ll (+161)
  • (modified) llvm/test/DebugInfo/X86/set.ll (+2-2)
  • (renamed) llvm/test/DebugInfo/X86/split-dwarf-local-import.ll (-1)
  • (renamed) llvm/test/DebugInfo/X86/split-dwarf-local-import2.ll (-1)
  • (renamed) llvm/test/DebugInfo/X86/split-dwarf-local-import3.ll ()
  • (modified) llvm/unittests/Transforms/Utils/CloningTest.cpp (+105)
diff --git a/clang/test/CodeGen/debug-info-codeview-unnamed.c b/clang/test/CodeGen/debug-info-codeview-unnamed.c
index bd2a7543e56b2b..16ffb3682236f1 100644
--- a/clang/test/CodeGen/debug-info-codeview-unnamed.c
+++ b/clang/test/CodeGen/debug-info-codeview-unnamed.c
@@ -8,23 +8,23 @@ int main(int argc, char* argv[], char* arge[]) {
   //
   struct { int bar; } one = {42};
   //
-  // LINUX:      !{{[0-9]+}} = !DILocalVariable(name: "one"
-  // LINUX-SAME:     type: [[TYPE_OF_ONE:![0-9]+]]
-  // LINUX-SAME:     )
-  // LINUX:      [[TYPE_OF_ONE]] = distinct !DICompositeType(
+  // LINUX:      [[TYPE_OF_ONE:![0-9]+]] = distinct !DICompositeType(
   // LINUX-SAME:     tag: DW_TAG_structure_type
   // LINUX-NOT:      name:
   // LINUX-NOT:      identifier:
   // LINUX-SAME:     )
+  // LINUX:      !{{[0-9]+}} = !DILocalVariable(name: "one"
+  // LINUX-SAME:     type: [[TYPE_OF_ONE]]
+  // LINUX-SAME:     )
   //
-  // MSVC:       !{{[0-9]+}} = !DILocalVariable(name: "one"
-  // MSVC-SAME:      type: [[TYPE_OF_ONE:![0-9]+]]
-  // MSVC-SAME:      )
-  // MSVC:       [[TYPE_OF_ONE]] = distinct !DICompositeType
+  // MSVC:       [[TYPE_OF_ONE:![0-9]+]] = distinct !DICompositeType
   // MSVC-SAME:      tag: DW_TAG_structure_type
   // MSVC-NOT:       name:
   // MSVC-NOT:       identifier:
   // MSVC-SAME:      )
+  // MSVC:       !{{[0-9]+}} = !DILocalVariable(name: "one"
+  // MSVC-SAME:      type: [[TYPE_OF_ONE]]
+  // MSVC-SAME:      )
 
   return 0;
 }
diff --git a/clang/test/CodeGen/debug-info-unused-types.c b/clang/test/CodeGen/debug-info-unused-types.c
index 3e9f7b07658e36..31d608d92a06b4 100644
--- a/clang/test/CodeGen/debug-info-unused-types.c
+++ b/clang/test/CodeGen/debug-info-unused-types.c
@@ -18,13 +18,15 @@ void quux(void) {
 // CHECK: !DICompileUnit{{.+}}retainedTypes: [[RETTYPES:![0-9]+]]
 // CHECK: [[TYPE0:![0-9]+]] = !DICompositeType(tag: DW_TAG_enumeration_type, name: "bar"
 // CHECK: [[TYPE1:![0-9]+]] = !DIEnumerator(name: "BAR"
-// CHECK: [[TYPE2:![0-9]+]] = !DICompositeType(tag: DW_TAG_enumeration_type, name: "z"
-// CHECK: [[TYPE3:![0-9]+]] = !DIEnumerator(name: "Z"
-// CHECK: [[RETTYPES]] = !{[[TYPE4:![0-9]+]], [[TYPE5:![0-9]+]], [[TYPE0]], [[TYPE6:![0-9]+]], {{![0-9]+}}, [[TYPE7:![0-9]+]], [[TYPE2]], [[TYPE8:![0-9]+]]}
-// CHECK: [[TYPE4]] = !DIDerivedType(tag: DW_TAG_typedef, name: "my_int"
-// CHECK: [[TYPE5]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "foo"
-// CHECK: [[TYPE6]] = distinct !DICompositeType(tag: DW_TAG_union_type, name: "baz"
-// CHECK: [[TYPE7]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "y"
+// CHECK: [[RETTYPES]] = !{[[TYPE2:![0-9]+]], [[TYPE3:![0-9]+]], [[TYPE0]], [[TYPE4:![0-9]+]], {{![0-9]+}}}
+// CHECK: [[TYPE2]] = !DIDerivedType(tag: DW_TAG_typedef, name: "my_int"
+// CHECK: [[TYPE3]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "foo"
+// CHECK: [[TYPE4]] = distinct !DICompositeType(tag: DW_TAG_union_type, name: "baz"
+// CHECK: [[SP:![0-9]+]] = distinct !DISubprogram(name: "quux", {{.*}}, retainedNodes: [[SPRETNODES:![0-9]+]]
+// CHECK: [[SPRETNODES]] = !{[[TYPE5:![0-9]+]], [[TYPE6:![0-9]+]], [[TYPE8:![0-9]+]]}
+// CHECK: [[TYPE5]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "y"
+// CHECK: [[TYPE6]] = !DICompositeType(tag: DW_TAG_enumeration_type, name: "z"
+// CHECK: [[TYPE7:![0-9]+]] = !DIEnumerator(name: "Z"
 // CHECK: [[TYPE8]] = distinct !DICompositeType(tag: DW_TAG_union_type, name: "w"
 
 // Check that debug info is not emitted for the typedef, struct, enum, and
diff --git a/clang/test/CodeGen/debug-info-unused-types.cpp b/clang/test/CodeGen/debug-info-unused-types.cpp
index 023cac159faa4b..5b01c6dbb39414 100644
--- a/clang/test/CodeGen/debug-info-unused-types.cpp
+++ b/clang/test/CodeGen/debug-info-unused-types.cpp
@@ -13,12 +13,14 @@ void quux() {
 // CHECK: !DICompileUnit{{.+}}retainedTypes: [[RETTYPES:![0-9]+]]
 // CHECK: [[TYPE0:![0-9]+]] = !DICompositeType(tag: DW_TAG_enumeration_type, name: "baz"
 // CHECK: [[TYPE1:![0-9]+]] = !DIEnumerator(name: "BAZ"
-// CHECK: [[TYPE2:![0-9]+]] = !DICompositeType(tag: DW_TAG_enumeration_type, name: "z"
-// CHECK: [[TYPE3:![0-9]+]] = !DIEnumerator(name: "Z"
-// CHECK: [[RETTYPES]] = !{[[TYPE4:![0-9]+]], [[TYPE5:![0-9]+]], [[TYPE0]], {{![0-9]+}}, [[TYPE6:![0-9]+]], [[TYPE2]]}
-// CHECK: [[TYPE4]] = !DIDerivedType(tag: DW_TAG_typedef, name: "foo"
-// CHECK: [[TYPE5]] = distinct !DICompositeType(tag: DW_TAG_class_type, name: "bar"
-// CHECK: [[TYPE6]] = distinct !DICompositeType(tag: DW_TAG_class_type, name: "y"
+// CHECK: [[RETTYPES]] = !{[[TYPE2:![0-9]+]], [[TYPE3:![0-9]+]], [[TYPE0]], {{![0-9]+}}}
+// CHECK: [[TYPE2]] = !DIDerivedType(tag: DW_TAG_typedef, name: "foo"
+// CHECK: [[TYPE3]] = distinct !DICompositeType(tag: DW_TAG_class_type, name: "bar"
+// CHECK: [[SP:![0-9]+]] = distinct !DISubprogram(name: "quux", {{.*}}, retainedNodes: [[SPRETNODES:![0-9]+]]
+// CHECK: [[SPRETNODES]] = !{[[TYPE4:![0-9]+]], [[TYPE5:![0-9]+]]}
+// CHECK: [[TYPE4]] = distinct !DICompositeType(tag: DW_TAG_class_type, name: "y", scope: [[SP]]
+// CHECK: [[TYPE5]] = !DICompositeType(tag: DW_TAG_enumeration_type, name: "z", scope: [[SP]]
+// CHECK: [[TYPE6:![0-9]+]] = !DIEnumerator(name: "Z"
 
 // NODBG-NOT: !DI{{CompositeType|Enumerator|DerivedType}}
 
diff --git a/clang/test/CodeGenCXX/debug-info-access.cpp b/clang/test/CodeGenCXX/debug-info-access.cpp
index 9f2c044843d0f0..7c0bf8ccb03842 100644
--- a/clang/test/CodeGenCXX/debug-info-access.cpp
+++ b/clang/test/CodeGenCXX/debug-info-access.cpp
@@ -18,9 +18,9 @@ class B : public A {
   static int public_static;
 
 protected:
+  // CHECK-DAG: !DIDerivedType(tag: DW_TAG_typedef, name: "prot_using",{{.*}} line: [[@LINE+3]],{{.*}} flags: DIFlagProtected)
   // CHECK-DAG: !DIDerivedType(tag: DW_TAG_typedef, name: "prot_typedef",{{.*}} line: [[@LINE+1]],{{.*}} flags: DIFlagProtected)
   typedef int prot_typedef;
-  // CHECK-DAG: !DIDerivedType(tag: DW_TAG_typedef, name: "prot_using",{{.*}} line: [[@LINE+1]],{{.*}} flags: DIFlagProtected)
   using prot_using = prot_typedef;
   prot_using prot_member;
 
diff --git a/clang/test/CodeGenCXX/debug-info-anon-union-vars.cpp b/clang/test/CodeGenCXX/debug-info-anon-union-vars.cpp
index 61b3c7c0526c8e..c91cf83c0405fe 100644
--- a/clang/test/CodeGenCXX/debug-info-anon-union-vars.cpp
+++ b/clang/test/CodeGenCXX/debug-info-anon-union-vars.cpp
@@ -51,13 +51,13 @@ void instantiate(int x) {
 // CHECK: !DIGlobalVariable(name: "b",{{.*}} file: [[FILE]], line: 6,{{.*}} isLocal: true, isDefinition: true
 // CHECK: !DIGlobalVariable(name: "result", {{.*}} isLocal: false, isDefinition: true
 // CHECK: !DIGlobalVariable(name: "value", {{.*}} isLocal: false, isDefinition: true
-// CHECK: !DILocalVariable(name: "i", {{.*}}, flags: DIFlagArtificial
-// CHECK: !DILocalVariable(name: "c", {{.*}}, flags: DIFlagArtificial
-// CHECK: !DILocalVariable(
-// CHECK-NOT: name:
-// CHECK: type: ![[UNION:[0-9]+]]
-// CHECK: ![[UNION]] = distinct !DICompositeType(tag: DW_TAG_union_type,
+// CHECK: ![[UNION:[0-9]+]] = distinct !DICompositeType(tag: DW_TAG_union_type,
 // CHECK-NOT: name:
 // CHECK: elements
 // CHECK: !DIDerivedType(tag: DW_TAG_member, name: "i", scope: ![[UNION]],
 // CHECK: !DIDerivedType(tag: DW_TAG_member, name: "c", scope: ![[UNION]],
+// CHECK: !DILocalVariable(name: "i", {{.*}}, flags: DIFlagArtificial
+// CHECK: !DILocalVariable(name: "c", {{.*}}, flags: DIFlagArtificial
+// CHECK: !DILocalVariable(
+// CHECK-NOT: name:
+// CHECK: type: ![[UNION]]
diff --git a/clang/test/CodeGenCXX/debug-info-codeview-unnamed.cpp b/clang/test/CodeGenCXX/debug-info-codeview-unnamed.cpp
index b4c79936ab33e6..9602ac1b024973 100644
--- a/clang/test/CodeGenCXX/debug-info-codeview-unnamed.cpp
+++ b/clang/test/CodeGenCXX/debug-info-codeview-unnamed.cpp
@@ -3,6 +3,60 @@
 
 int main(int argc, char* argv[], char* arge[]) {
   //
+  // LINUX:      [[TYPE_OF_ONE:![0-9]+]] = distinct !DICompositeType(
+  // LINUX-SAME:     tag: DW_TAG_structure_type
+  // LINUX-NOT:      name:
+  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     )
+  //
+  // MSVC:       [[TYPE_OF_ONE:![0-9]+]] = distinct !DICompositeType
+  // MSVC-SAME:      tag: DW_TAG_structure_type
+  // MSVC-SAME:      name: "<unnamed-type-one>"
+  // MSVC-SAME:      identifier: ".?AU<unnamed-type-one>@?1??main@@9@"
+  // MSVC-SAME:      )
+
+
+  //
+  // LINUX:      [[TYPE_OF_TWO:![0-9]+]] = distinct !DICompositeType(
+  // LINUX-SAME:     tag: DW_TAG_structure_type
+  // LINUX-NOT:      name:
+  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     )
+  //
+  // MSVC:       [[TYPE_OF_TWO:![0-9]+]] = distinct !DICompositeType
+  // MSVC-SAME:      tag: DW_TAG_structure_type
+  // MSVC-SAME:      name: "<unnamed-type-two>"
+  // MSVC-SAME:      identifier: ".?AU<unnamed-type-two>@?2??main@@9@"
+  // MSVC-SAME:      )
+
+
+  //
+  // LINUX:      [[TYPE_OF_THREE:![0-9]+]] = distinct !DICompositeType(
+  // LINUX-SAME:     tag: DW_TAG_structure_type
+  // LINUX-SAME:     name: "named"
+  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     )
+  //
+  // MSVC:       [[TYPE_OF_THREE:![0-9]+]] = distinct !DICompositeType
+  // MSVC-SAME:      tag: DW_TAG_structure_type
+  // MSVC-SAME:      name: "named"
+  // MSVC-SAME:      identifier: ".?AUnamed@?1??main@@9@"
+  // MSVC-SAME:      )
+
+  //
+  // LINUX:      [[TYPE_OF_FOUR:![0-9]+]] = distinct !DICompositeType(
+  // LINUX-SAME:     tag: DW_TAG_class_type
+  // LINUX-NOT:      name:
+  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     )
+  //
+  // MSVC:       [[TYPE_OF_FOUR:![0-9]+]] = distinct !DICompositeType
+  // MSVC-SAME:      tag: DW_TAG_class_type
+  // MSVC-SAME:      name: "<lambda_0>"
+  // MSVC-SAME:      identifier: ".?AV<lambda_0>@?0??main@@9@"
+  // MSVC-SAME:      )
+
+
   // In CodeView, the LF_MFUNCTION entry for "bar()" refers to the forward
   // reference of the unnamed struct. Visual Studio requires a unique
   // identifier to match the LF_STRUCTURE forward reference to the definition.
@@ -10,21 +64,11 @@ int main(int argc, char* argv[], char* arge[]) {
   struct { void bar() {} } one;
   //
   // LINUX:      !{{[0-9]+}} = !DILocalVariable(name: "one"
-  // LINUX-SAME:     type: [[TYPE_OF_ONE:![0-9]+]]
-  // LINUX-SAME:     )
-  // LINUX:      [[TYPE_OF_ONE]] = distinct !DICompositeType(
-  // LINUX-SAME:     tag: DW_TAG_structure_type
-  // LINUX-NOT:      name:
-  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     type: [[TYPE_OF_ONE]]
   // LINUX-SAME:     )
   //
   // MSVC:       !{{[0-9]+}} = !DILocalVariable(name: "one"
-  // MSVC-SAME:      type: [[TYPE_OF_ONE:![0-9]+]]
-  // MSVC-SAME:      )
-  // MSVC:       [[TYPE_OF_ONE]] = distinct !DICompositeType
-  // MSVC-SAME:      tag: DW_TAG_structure_type
-  // MSVC-SAME:      name: "<unnamed-type-one>"
-  // MSVC-SAME:      identifier: ".?AU<unnamed-type-one>@?1??main@@9@"
+  // MSVC-SAME:      type: [[TYPE_OF_ONE]]
   // MSVC-SAME:      )
 
 
@@ -36,21 +80,11 @@ int main(int argc, char* argv[], char* arge[]) {
   int decltype(two)::*ptr2unnamed = &decltype(two)::bar;
   //
   // LINUX:      !{{[0-9]+}} = !DILocalVariable(name: "two"
-  // LINUX-SAME:     type: [[TYPE_OF_TWO:![0-9]+]]
-  // LINUX-SAME:     )
-  // LINUX:      [[TYPE_OF_TWO]] = distinct !DICompositeType(
-  // LINUX-SAME:     tag: DW_TAG_structure_type
-  // LINUX-NOT:      name:
-  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     type: [[TYPE_OF_TWO]]
   // LINUX-SAME:     )
   //
   // MSVC:       !{{[0-9]+}} = !DILocalVariable(name: "two"
-  // MSVC-SAME:      type: [[TYPE_OF_TWO:![0-9]+]]
-  // MSVC-SAME:      )
-  // MSVC:       [[TYPE_OF_TWO]] = distinct !DICompositeType
-  // MSVC-SAME:      tag: DW_TAG_structure_type
-  // MSVC-SAME:      name: "<unnamed-type-two>"
-  // MSVC-SAME:      identifier: ".?AU<unnamed-type-two>@?2??main@@9@"
+  // MSVC-SAME:      type: [[TYPE_OF_TWO]]
   // MSVC-SAME:      )
 
 
@@ -61,21 +95,11 @@ int main(int argc, char* argv[], char* arge[]) {
   struct named { int bar; int named::* p2mem; } three = { 42, &named::bar };
   //
   // LINUX:      !{{[0-9]+}} = !DILocalVariable(name: "three"
-  // LINUX-SAME:     type: [[TYPE_OF_THREE:![0-9]+]]
-  // LINUX-SAME:     )
-  // LINUX:      [[TYPE_OF_THREE]] = distinct !DICompositeType(
-  // LINUX-SAME:     tag: DW_TAG_structure_type
-  // LINUX-SAME:     name: "named"
-  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     type: [[TYPE_OF_THREE]]
   // LINUX-SAME:     )
   //
   // MSVC:       !{{[0-9]+}} = !DILocalVariable(name: "three"
-  // MSVC-SAME:      type: [[TYPE_OF_THREE:![0-9]+]]
-  // MSVC-SAME:      )
-  // MSVC:       [[TYPE_OF_THREE]] = distinct !DICompositeType
-  // MSVC-SAME:      tag: DW_TAG_structure_type
-  // MSVC-SAME:      name: "named"
-  // MSVC-SAME:      identifier: ".?AUnamed@?1??main@@9@"
+  // MSVC-SAME:      type: [[TYPE_OF_THREE]]
   // MSVC-SAME:      )
 
 
@@ -87,21 +111,11 @@ int main(int argc, char* argv[], char* arge[]) {
   auto four = [argc](int i) -> int { return argc == i ? 1 : 0; };
   //
   // LINUX:      !{{[0-9]+}} = !DILocalVariable(name: "four"
-  // LINUX-SAME:     type: [[TYPE_OF_FOUR:![0-9]+]]
-  // LINUX-SAME:     )
-  // LINUX:      [[TYPE_OF_FOUR]] = distinct !DICompositeType(
-  // LINUX-SAME:     tag: DW_TAG_class_type
-  // LINUX-NOT:      name:
-  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     type: [[TYPE_OF_FOUR]]
   // LINUX-SAME:     )
   //
   // MSVC:       !{{[0-9]+}} = !DILocalVariable(name: "four"
-  // MSVC-SAME:      type: [[TYPE_OF_FOUR:![0-9]+]]
-  // MSVC-SAME:      )
-  // MSVC:       [[TYPE_OF_FOUR]] = distinct !DICompositeType
-  // MSVC-SAME:      tag: DW_TAG_class_type
-  // MSVC-SAME:      name: "<lambda_0>"
-  // MSVC-SAME:      identifier: ".?AV<lambda_0>@?0??main@@9@"
+  // MSVC-SAME:      type: [[TYPE_OF_FOUR]]
   // MSVC-SAME:      )
 
   return 0;
diff --git a/clang/test/CodeGenCXX/debug-info-gline-tables-only-codeview.cpp b/clang/test/CodeGenCXX/debug-info-gline-tables-only-codeview.cpp
index 6b9c9a243decd1..122e4aa62ea7df 100644
--- a/clang/test/CodeGenCXX/debug-info-gline-tables-only-codeview.cpp
+++ b/clang/test/CodeGenCXX/debug-info-gline-tables-only-codeview.cpp
@@ -51,9 +51,9 @@ void test() {
   // CHECK-SAME:                              name: "<lambda_2_1>",
   c.lambda_params();
 
-  // CHECK: !DISubprogram(name: "operator()", scope: ![[LAMBDA1:[0-9]+]],
-  // CHECK: ![[LAMBDA1]] = !DICompositeType(tag: DW_TAG_class_type,
+  // CHECK: ![[LAMBDA1:[0-9]+]] = !DICompositeType(tag: DW_TAG_class_type,
   // CHECK-SAME:                            name: "<lambda_1>",
   // CHECK-SAME:                            flags: DIFlagFwdDecl
+  // CHECK: !DISubprogram(name: "operator()", scope: ![[LAMBDA1]],
   c.lambda2();
 }
diff --git a/clang/test/CodeGenCXX/debug-lambda-this.cpp b/clang/test/CodeGenCXX/debug-lambda-this.cpp
index eecbac6520ac97..3d659e7bfd004a 100644
--- a/clang/test/CodeGenCXX/debug-lambda-this.cpp
+++ b/clang/test/CodeGenCXX/debug-lambda-this.cpp
@@ -13,10 +13,10 @@ int D::d(int x) {
 }
 
 // CHECK: ![[D:[0-9]+]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "D",
-// CHECK: ![[POINTER:.*]] = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: ![[D]], size: 64)
 // CHECK: !DIDerivedType(tag: DW_TAG_member, name: "this",
 // CHECK-SAME:           line: 11
-// CHECK-SAME:           baseType: ![[POINTER]]
+// CHECK-SAME:           baseType: ![[POINTER:[0-9]+]]
 // CHECK-SAME:           size: 64
 // CHECK-NOT:            offset: 0
 // CHECK-SAME:           ){{$}}
+// CHECK: ![[POINTER]] = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: ![[D]], size: 64)
diff --git a/llvm/include/llvm/IR/DIBuilder.h b/llvm/include/llvm/IR/DIBuilder.h
index edec161b397155..166cd664aec1de 100644
--- a/llvm/include/llvm/IR/DIBuilder.h
+++ b/llvm/include/llvm/IR/DIBuilder.h
@@ -49,7 +49,7 @@ namespace llvm {
     Function *LabelFn;       ///< llvm.dbg.label
     Function *AssignFn;      ///< llvm.dbg.assign
 
-    SmallVector<TrackingMDNodeRef, 4> AllEnumTypes;
+    SmallVector<TrackingMDNodeRef, 4> EnumTypes;
     /// Track the RetainTypes, since they can be updated later on.
     SmallVector<TrackingMDNodeRef, 4> AllRetainTypes;
     SmallVector<DISubprogram *, 4> AllSubprograms;
@@ -64,8 +64,8 @@ namespace llvm {
     SmallVector<TrackingMDNodeRef, 4> UnresolvedNodes;
     bool AllowUnresolvedNodes;
 
-    /// Each subprogram's preserved local variables, labels and imported
-    /// entities.
+    /// Each subprogram's preserved local variables, labels, imported entities,
+    /// and types.
     ///
     /// Do not use a std::vector.  Some versions of libc++ apparently copy
     /// instead of move on grow operations, and TrackingMDRef is expensive to
diff --git a/llvm/include/llvm/IR/DebugInfo.h b/llvm/include/llvm/IR/DebugInfo.h
index 36ef77f9505bc1..d60011e14ed172 100644
--- a/llvm/include/llvm/IR/DebugInfo.h
+++ b/llvm/include/llvm/IR/DebugInfo.h
@@ -104,7 +104,7 @@ class DebugInfoFinder {
   void processInstruction(const Module &M, const Instruction &I);
 
   /// Process a DILocalVariable.
-  void processVariable(const Module &M, const DILocalVariable *DVI);
+  void processVariable(DILocalVariable *DVI);
   /// Process debug info location.
   void processLocation(const Module &M, const DILocation *Loc);
   // Process a DPValue, much like a DbgVariableIntrinsic.
@@ -120,6 +120,7 @@ class DebugInfoFinder {
   void processCompileUnit(DICompileUnit *CU);
   void processScope(DIScope *Scope);
   void processType(DIType *DT);
+  void processLocalVariable(DILocalVariable *DV);
   bool addCompileUnit(DICompileUnit *CU);
   bool addGlobalVariable(DIGlobalVariableExpression *DIG);
   bool addScope(DIScope *Scope);
@@ -134,6 +135,7 @@ class DebugInfoFinder {
       SmallVectorImpl<DIGlobalVariableExpression *>::const_iterator;
   using type_iterator = SmallVectorImpl<DIType *>::const_iterator;
   using scope_iterator = SmallVectorImpl<DIScope *>::const_iterator;
+  using local_variable_iterator = SmallVectorImpl<DILocalVariable *>::const_iterator;
 
   iterator_range<compile_unit_iterator> compile_units() const {
     return make_range(CUs.begin(), CUs.end());
@@ -147,6 +149,10 @@ class DebugInfoFinder {
     return make_range(GVs.begin(), GVs.end());
   }
 
+  iterator_range<local_variable_iterator> local_variables() const {
+    return make_range(LVs.begin(), LVs.end());
+  }
+
   iterator_range<type_iterator> types() const {
     return make_range(TYs.begin(), TYs.end());
   }
@@ -157,6 +163,7 @@ class DebugInfoFinder {
 
   unsigned compile_unit_count() const { return CUs.size(); }
   unsigned global_variable_count() const { return GVs.size(); }
+  unsigned local_variable_count() const { return LVs.size(); }
   unsigned subprogram_count() const { return SPs.size(); }
   unsigned type_count() const { return TYs.size(); }
   unsigned scope_count() const { return Scopes.size(); }
@@ -165,6 +172,7 @@ class DebugInfoFinder {
   SmallVector<DICompileUnit *, 8> CUs;
   SmallVector<DISubprogram *, 8> SPs;
   SmallVector<DIGlobalVariableExpression *, 8> GVs;
+  SmallVector<DILocalVariable *, 8> LVs;
   SmallVector<DIType *, 8> TYs;
   SmallVector<DIScope *, 8> Scopes;
   SmallPtrSet<const MDNode *, 32> NodesSeen;
diff --git a/llvm/lib/Bitcode/Reader/MetadataLoader.cpp b/llvm/lib/Bitcode/Reader/MetadataLoader.cpp
index 910e97489dbbe0..d118695b431c5a 100644
--- a/llvm/lib/Bitcode/Reader/MetadataLoader.cpp
+++ b/llvm/lib/Bitcode/Reader/MetadataLoader.cpp
@@ -547,6 +547,8 @@ class MetadataLoader::MetadataLoaderImpl {
 
   /// Move local imports from DICompileUnit's 'imports' field to
   /// DISubprogram's retainedNodes.
+  /// Move fucntion-local enums from DICompileUnit's enums
+  /// to DISubprogram's retainedNodes.
   void upgradeCULocals() {
     if (NamedMDNode *CUNodes = TheModule.getNamedMetadata("llvm.dbg.cu")) {
       for (unsigned I = 0, E = CUNodes->getNumOperands(); I != E; ++I) {
@@ -554,48 +556,66 @@ class MetadataLoader::MetadataLoaderImpl {
         if (!CU)
           continue;
 
-        if (CU->getRawImportedEntities()) {
-          // Collect a set of imported entities to be moved.
-          SetVector<Metadata *> EntitiesToRemove;
+        SetVector<Metadata *> MetadataToRemove;
+        // Collect imported entities to be moved.
+        if (CU->getRawImportedEntities())
           for (Metadata *Op : CU->getImportedEntities()->operands()) {
             auto *IE = cast<DIImportedEntity>(Op);
-            if (dyn_cast_or_null<DILocalScope>(IE->g...
[truncated]

Copy link

github-actions bot commented Dec 13, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -238,6 +238,13 @@ void llvm::CloneFunctionInto(Function *NewFunc, const Function *OldFunc,
}
}

// Avoid cloning local variables of subprograms that won't be cloned.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a sentence explaining why some subprograms won't be cloned? Is it because they are inlined and a copy already exists?

Copy link
Collaborator

@adrian-prantl adrian-prantl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this LGTM, but it would be good if someone else also took a look.

@dzhidzhoev
Copy link
Member Author

@dwblaikie
Could you please take a look at this when you have a chance?

Copy link
Collaborator

@dwblaikie dwblaikie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, looks OK to me - sorry for the delay.

…lock scopes (4/7)

RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544

Similar to imported declarations, the patch tracks function-local types in
DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with
the aforementioned metadata change and provided a support of function-local
types scoped within a lexical block.

The patch assumes that DICompileUnit's 'enums field' no longer tracks local
types and DwarfDebug would assert if any locally-scoped types get placed there.

Additionally, if DISubprogram is not cloned in CloneFunctionInto(),
cloning of its DILocalVariables is avoided.
Otherwise we get duplicated DILocalVariables not tracked in their
subprogram's retainedNodes, that crash LTO with Chromium.

Reviewed By: jmmartinez
Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144006
@dzhidzhoev dzhidzhoev merged commit fc6faa1 into llvm:main Jan 11, 2024
@jmorse
Copy link
Member

jmorse commented Jan 16, 2024

Hi,

We're seeing a crash with this commit and reproducer https://gist.github.com/jmorse/b0248c3c9f9195487ffd7c7431a8d15e

llc: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2338: virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *): Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms && "getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed.

I'd previously posted the reproducer on https://reviews.llvm.org/D144006#4656728 , however I'd anonymised the IR too much to the point where it was broken in unrelated ways. Revision 2 of the gist, as linked, should produce the crash. I suspect the extra lexical scopes reachable through the retained-nodes list also need to be explored when the LexicalScopes object gets constructed, to avoid scopes being added late and causing containers to invalidate iterators. (Which is what that assertion is there to detect).

@zmodem
Copy link
Collaborator

zmodem commented Jan 16, 2024

We're hitting the same assert (tracking in https://crbug.com/1518841) but don't have a good reproducer to share (it's during lto of a large binary). Jeremy's reproducer seems good though.

Can we revert this while it's being investigated, please?

@dcci
Copy link
Member

dcci commented Jan 17, 2024

Hitting the same problem on a bunch of services. I am going to revert this to unblock people, our repro is similar enough to the one Jeremy shared.

$ ./exe
clang -cc1 version 18.0.0git based upon LLVM 18.0.0git default target x86_64-redhat-linux-gnu
clang++: /home/davidino/llvm-project/llvm/include/llvm/Support/Casting.h:109: static bool llvm::isa_impl_cl<llvm::DILocalScope, const llvm::DIScope *>::doit(const From *) [To = llvm::DILocalScope, From = const llvm::DIScope *]: Assertion `Val && "isa<> used on a null pointer"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /home/davidino/llvm-build-upstream/bin/clang++ @/tmp/fbcc.4ddf7_1j/compiler.argsfile
1.      Running pass 'Function Pass Manager' on module 'buck-out/v2/gen/fbcode/349d8ab5ad6260b7/dsi/logger/configs/CacheClientFciRequestLoggerConfig/__logger__/__objects__/Logger.cpp.o'.
2.      Running pass 'X86 Assembly Printer' on function '@_ZN8facebook6logger27CacheClientFciRequestLogger9invokeLogIDnDnEEibT_PKciT0_'
 #0 0x00007f123a20a448 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/davidino/llvm-build-upstream/bin/../lib/libLLVMSupport.so.18git+0x20a448)
 #1 0x00007f123a207eee llvm::sys::RunSignalHandlers() (/home/davidino/llvm-build-upstream/bin/../lib/libLLVMSupport.so.18git+0x207eee)
 #2 0x00007f123a20ab18 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f1239854db0 __restore_rt (/lib64/libc.so.6+0x54db0)
 #4 0x00007f12398a365c __pthread_kill_implementation (/lib64/libc.so.6+0xa365c)
 #5 0x00007f1239854d06 gsignal (/lib64/libc.so.6+0x54d06)
 #6 0x00007f12398287f3 abort (/lib64/libc.so.6+0x287f3)
 #7 0x00007f123982871b _nl_load_domain.cold (/lib64/libc.so.6+0x2871b)
 #8 0x00007f123984dca6 (/lib64/libc.so.6+0x4dca6)
 #9 0x00007f123d3f54c3 getRetainedNodeScope(llvm::MDNode const*) DwarfDebug.cpp:0:0
#10 0x00007f123d3f686a llvm::DwarfDebug::endFunctionImpl(llvm::MachineFunction const*) (/home/davidino/llvm-build-upstream/bin/../lib/../lib/libLLVMAsmPrinter.so.18git+0x10086a)
#11 0x00007f123d3ce46f llvm::DebugHandlerBase::endFunction(llvm::MachineFunction const*) (/home/davidino/llvm-build-upstream/bin/../lib/../lib/libLLVMAsmPrinter.so.18git+0xd846f)
#12 0x00007f123d3a70fb llvm::AsmPrinter::emitFunctionBody() (/home/davidino/llvm-build-upstream/bin/../lib/../lib/libLLVMAsmPrinter.so.18git+0xb10fb)
#13 0x00007f1240c3db08 llvm::X86AsmPrinter::runOnMachineFunction(llvm::MachineFunction&) X86AsmPrinter.cpp:0:0
#14 0x00007f123e1de48d llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/home/davidino/llvm-build-upstream/bin/../lib/libLLVMCodeGen.so.18git+0x5de48d)
#15 0x00007f123a8a0842 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/davidino/llvm-build-upstream/bin/../lib/libLLVMCore.so.18git+0x4a0842)
#16 0x00007f123a8a7542 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/davidino/llvm-build-upstream/bin/../lib/libLLVMCore.so.18git+0x4a7542)
#17 0x00007f123a8a0f56 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/davidino/llvm-build-upstream/bin/../lib/libLLVMCore.so.18git+0x4a0f56)
#18 0x00007f123cbd248a codegen(llvm::lto::Config const&, llvm::TargetMachine*, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex const&) LTOBackend.cpp:0:0
#19 0x00007f123cbd35cb llvm::lto::thinBackend(llvm::lto::Config const&, unsigned int, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, llvm::Module&, llvm::ModuleSummaryIndex const&, llvm::DenseMap<llvm::StringRef, std::unordered_set<unsigned long, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<unsigned long>>, llvm::DenseMapInfo<llvm::StringRef, void>, llvm::detail::DenseMapPair<llvm::StringRef, std::unordered_set<unsigned long, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<unsigned long>>>> const&, llvm::DenseMap<unsigned long, llvm::GlobalValueSummary*, llvm::DenseMapInfo<unsigned long, void>, llvm::detail::DenseMapPair<unsigned long, llvm::GlobalValueSummary*>> const&, llvm::MapVector<llvm::StringRef, llvm::BitcodeModule, llvm::DenseMap<llvm::StringRef, unsigned int, llvm::DenseMapInfo<llvm::StringRef, void>, llvm::detail::DenseMapPair<llvm::StringRef, unsigned int>>, llvm::SmallVector<std::pair<llvm::StringRef, llvm::BitcodeModule>, 0u>>*, std::vector<unsigned char, std::allocator<unsigned char>> const&)::$_6::operator()(llvm::Module&, llvm::TargetMachine*, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile>>) const LTOBackend.cpp:0:0
#20 0x00007f123cbd3329 llvm::lto::thinBackend(llvm::lto::Config const&, unsigned int, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, llvm::Module&, llvm::ModuleSummaryIndex const&, llvm::DenseMap<llvm::StringRef, std::unordered_set<unsigned long, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<unsigned long>>, llvm::DenseMapInfo<llvm::StringRef, void>, llvm::detail::DenseMapPair<llvm::StringRef, std::unordered_set<unsigned long, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<unsigned long>>>> const&, llvm::DenseMap<unsigned long, llvm::GlobalValueSummary*, llvm::DenseMapInfo<unsigned long, void>, llvm::detail::DenseMapPair<unsigned long, llvm::GlobalValueSummary*>> const&, llvm::MapVector<llvm::StringRef, llvm::BitcodeModule, llvm::DenseMap<llvm::StringRef, unsigned int, llvm::DenseMapInfo<llvm::StringRef, void>, llvm::detail::DenseMapPair<llvm::StringRef, unsigned int>>, llvm::SmallVector<std::pair<llvm::StringRef, llvm::BitcodeModule>, 0u>>*, std::vector<unsigned char, std::allocator<unsigned char>> const&) (/home/davidino/llvm-build-upstream/bin/../lib/../lib/libLLVMLTO.so.18git+0x5a329)
#21 0x00007f123e8a1d0a clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/home/davidino/llvm-build-upstream/bin/../lib/libclangCodeGen.so.18git+0x2a1d0a)
#22 0x00007f123ed589dc clang::CodeGenAction::ExecuteAction() (/home/davidino/llvm-build-upstream/bin/../lib/libclangCodeGen.so.18git+0x7589dc)
#23 0x00007f123cd98aaf clang::FrontendAction::Execute() (/home/davidino/llvm-build-upstream/bin/../lib/libclangFrontend.so.18git+0x198aaf)
#24 0x00007f123ccfb21d clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/davidino/llvm-build-upstream/bin/../lib/libclangFrontend.so.18git+0xfb21d)
#25 0x00007f1240ff67fa clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/davidino/llvm-build-upstream/bin/../lib/libclangFrontendTool.so.18git+0x57fa)
#26 0x0000000000216d0f cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/davidino/llvm-build-upstream/bin/clang+++0x216d0f)
#27 0x00000000002131df ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#28 0x0000000000212497 clang_main(int, char**, llvm::ToolContext const&) (/home/davidino/llvm-build-upstream/bin/clang+++0x212497)

dcci added a commit that referenced this pull request Jan 17, 2024
@dcci
Copy link
Member

dcci commented Jan 17, 2024

Reverted in b6f922f

@dzhidzhoev
Copy link
Member Author

Hi,

We're seeing a crash with this commit and reproducer https://gist.github.com/jmorse/b0248c3c9f9195487ffd7c7431a8d15e

llc: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2338: virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *): Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms && "getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed.

I'd previously posted the reproducer on https://reviews.llvm.org/D144006#4656728 , however I'd anonymised the IR too much to the point where it was broken in unrelated ways. Revision 2 of the gist, as linked, should produce the crash. I suspect the extra lexical scopes reachable through the retained-nodes list also need to be explored when the LexicalScopes object gets constructed, to avoid scopes being added late and causing containers to invalidate iterators. (Which is what that assertion is there to detect).

Hello,

Sorry for bothering you. Could you please share the command you used as an interestingness test to reduce the crash? I've tried running llc/opt/ld.lld, but I haven't been able to reproduce it :(

jmorse added a commit to jmorse/llvm-project that referenced this pull request Jan 23, 2024
@jmorse
Copy link
Member

jmorse commented Jan 23, 2024

Hmmm, that's unexpected -- I reverted the revert (tree here, contains one unrelated commit: https://github.com/jmorse/llvm-project/tree/reapply-localvars-patch) and rebuilt. The assertion-failure occurs just with llc foobar.ll -o out.o -filetype=obj, where foobar.ll is the contents of the gist file linked. For the avoidance of doubt, it's:

$ wget https://gist.githubusercontent.com/jmorse/b0248c3c9f9195487ffd7c7431a8d15e/raw/e57ca0417b77fe04f9551a0a69ce58f0db697527/gistfile1.txt
$ mv gistfile1.txt foobar.ll
$ md5sum foobar.ll
d6c888d4e163545da596071fb5143377  foobar.ll

I don't think there's anything special about the build -- it's a CMAKE_BUILD_TYPE=Release build with LLVM_ENABLE_ASSERTIONS=On, built with clang-14 as shipped by ubuntu.

@dzhidzhoev
Copy link
Member Author

Hmmm, that's unexpected -- I reverted the revert (tree here, contains one unrelated commit: https://github.com/jmorse/llvm-project/tree/reapply-localvars-patch) and rebuilt. The assertion-failure occurs just with llc foobar.ll -o out.o -filetype=obj, where foobar.ll is the contents of the gist file linked. For the avoidance of doubt, it's:

$ wget https://gist.githubusercontent.com/jmorse/b0248c3c9f9195487ffd7c7431a8d15e/raw/e57ca0417b77fe04f9551a0a69ce58f0db697527/gistfile1.txt
$ mv gistfile1.txt foobar.ll
$ md5sum foobar.ll
d6c888d4e163545da596071fb5143377  foobar.ll

I don't think there's anything special about the build -- it's a CMAKE_BUILD_TYPE=Release build with LLVM_ENABLE_ASSERTIONS=On, built with clang-14 as shipped by ubuntu.

Thanks a lot! I've stubbornly tried to pass it through llvm-as before feeding it to llc.

dzhidzhoev added a commit to dzhidzhoev/llvm-project that referenced this pull request Jan 26, 2024
…functions (llvm#75385)

- [DebugMetadata][DwarfDebug] Support function-local types in lexical
block scopes (4/7)
- [CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined
functions

This is a follow-up for https://reviews.llvm.org/D144006, fixing a crash
reported
in Chromium (https://reviews.llvm.org/D144006#4651955).

The first commit is added for convenience, as it has already been
accepted.

If DISubpogram was not cloned (e.g. we are cloning a function that has
other
functions inlined into it, and subprograms of the inlined functions are
not supposed to be cloned), it doesn't make sense to clone its
DILocalVariables as well.
Otherwise get duplicated DILocalVariables not tracked in their
subprogram's retainedNodes, that crash LTO with Chromium.

This is meant to be committed along with
https://reviews.llvm.org/D144006.
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
…functions (llvm#75385)

- [DebugMetadata][DwarfDebug] Support function-local types in lexical
block scopes (4/7)
- [CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined
functions

This is a follow-up for https://reviews.llvm.org/D144006, fixing a crash
reported
in Chromium (https://reviews.llvm.org/D144006#4651955).

The first commit is added for convenience, as it has already been
accepted.

If DISubpogram was not cloned (e.g. we are cloning a function that has
other
functions inlined into it, and subprograms of the inlined functions are
not supposed to be cloned), it doesn't make sense to clone its
DILocalVariables as well.
Otherwise get duplicated DILocalVariables not tracked in their
subprogram's retainedNodes, that crash LTO with Chromium.

This is meant to be committed along with
https://reviews.llvm.org/D144006.
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
@dzhidzhoev
Copy link
Member Author

Hi,

We're seeing a crash with this commit and reproducer https://gist.github.com/jmorse/b0248c3c9f9195487ffd7c7431a8d15e

llc: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2338: virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *): Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms && "getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed.

I'd previously posted the reproducer on https://reviews.llvm.org/D144006#4656728 , however I'd anonymised the IR too much to the point where it was broken in unrelated ways. Revision 2 of the gist, as linked, should produce the crash. I suspect the extra lexical scopes reachable through the retained-nodes list also need to be explored when the LexicalScopes object gets constructed, to avoid scopes being added late and causing containers to invalidate iterators. (Which is what that assertion is there to detect).

It seems that the debug info metadata was already malformed before getting into the AsmPrinter pass, probably during the LTO process. I ran into trouble trying to find the root cause of the problem, since the information provided by the reproducer is not sufficient to investigate it.
Could you please provide an IR dump of the earlier stage of compilation or any source code?

@jmorse
Copy link
Member

jmorse commented Jan 31, 2024

Rats; we should be able to pin down where the problem originates from -- would you be able to describe what's malformed about the metadata, as it's not clear to us (sorry). We'll probably be able to llvm-reduce (or similar) around that description.

Unfortunately IIRC it comes from a reasonably large codebase that we can't share, possibly @zmodem @dcci have more accessible sources? (Short response sorry; currently up a mountain).

@jmorse
Copy link
Member

jmorse commented Feb 1, 2024

...ah, actually is it malformed because there's a DICompositeType in the retainedNodes list for a DISubprogram? I remember that the verifier considered that illegal before your patch landed, but not after.

@dzhidzhoev
Copy link
Member Author

...ah, actually is it malformed because there's a DICompositeType in the retainedNodes list for a DISubprogram? I remember that the verifier considered that illegal before your patch landed, but not after.

Having this commit applied, DICompositeTypes with the scope of DISubprogram must appear in the corresponding subprogram's retainedNodes list, so that's not the issue.
However, in the reproducer, DICompositeType node no !54 ("type4") has a scope value !55 ("plimsoles") but appears in the retainedNodes list of subprogram !49 ("crumpets").
!54 must have either scope of !49 or belong to the subprogram !55's retainedNodes list.
This discrepancy was likely to happen in CloneFunctionInto if LTO was used. However, I'm not sure about that.

@jmorse
Copy link
Member

jmorse commented Feb 6, 2024

I see what you're saying about the metadata being incorrect; I feel like I've seen it before, but can't pin it down. For the record, all the builds where we saw this assertion were thin/full LTO.

I've kicked off a reduction of a large reproducer that I have to hand; I'm not immensely confident that it'll complete in a reasonable timescale (i.e.: days) as it's a large project being reduced. If any of the other reporters have anything smaller they can reduce, that'd assist significantly.

@dzhidzhoev
Copy link
Member Author

I see what you're saying about the metadata being incorrect; I feel like I've seen it before, but can't pin it down. For the record, all the builds where we saw this assertion were thin/full LTO.

I've kicked off a reduction of a large reproducer that I have to hand; I'm not immensely confident that it'll complete in a reasonable timescale (i.e.: days) as it's a large project being reduced. If any of the other reporters have anything smaller they can reduce, that'd assist significantly.

Thanks a lot!

@jmorse
Copy link
Member

jmorse commented Feb 12, 2024

EDIT: my favourite part of github is its liberal interpretation of the enter key, will try to add a meaningful comment...

@jmorse
Copy link
Member

jmorse commented Feb 12, 2024

Several decades later; I applied this check to the verifier and reduced around it:

diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index b1cf81a3dbdc..9e2250b584b1 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -1421,6 +1421,9 @@ void Verifier::visitDISubprogram(const DISubprogram &N) {
               "invalid retained nodes, expected DILocalVariable, DILabel, "
               "DIImportedEntity or DIType",
               &N, Node, Op);
+      if (DIType *T = dyn_cast<DIType>(Op)) {
+        CheckDI(T->getScope() == &N, "wrong scope for thingy", T, T->getScope(), &N);
+      }
     }
   }
   CheckDI(!hasConflictingReferenceFlags(N.getFlags()),

Which produced these two IR files:
https://gist.github.com/jmorse/fc7d5479171b9943ae27d0f03cd9db5c
https://gist.github.com/jmorse/17040c19a096dd3780274f8e58d97b16
Which when linked with llvm-link aa.ll bb.ll -o cc.ll -S hit the verifier assertion, and have output IR which exhibits the problem you identified @dzhidzhoev , where the DICompositeType for handle ends up in one DISubprograms retainedNodes list but points its scope at a different DISubprogram. Seemingly this is something to do with the merging (or not merging) of distinct DISubprograms during the IR linker, unfortunately I've zero knowledge about that portion of LLVM.

EDIT: note that the order of the files on the command line seems to be important, it has to be the IR file with source_filename = aa.ll first, then bb.ll.

@jmorse
Copy link
Member

jmorse commented Jul 23, 2024

I think I've nailed down the source of this problem, and it's the matter to do with LLVMContext::enableDebugTypeODRUniquing that aeubanks mentioned on https://reviews.llvm.org/D144006. LLVM will unique DICompositeTypes via their ODR-name, which isn't always safe with function-local types. It's another circumstance where an ODR-violation that doesn't actually cause a linking error can instead cause debug-info to be corrupt. I don't have a reproducer that forces the crash found here but I can demonstrate the fundamental problem.

To reproduce, take the rebased patch from the tip of https://github.com/jmorse/llvm-project/tree/rebased-composite-type-fix-4 , and this code from llvm/test/Linker/odr-lambda-1.ll:

class Error {};
template <typename HandlerTs>
void handleAllErrors(HandlerTs Handlers) {}
inline void consumeError(Error Err) {
  handleAllErrors( []() {});
}
void ArchiveMemberHeader()
{
  consumeError(Error());
}

Compile thusly:

clang input.cpp -o out1.ll -emit-llvm -S -c  -g
clang input.cpp -o out2.ll -emit-llvm -S -c  -g
vi out1.ll # Delete all LLVM function definitions to avoid linking-errors
llvm-link out1.ll out2.ll -o out3.ll -S

The link should succeed because there are no symbol conflicts. In the output IR in out3.ll.txt, observe that there are:

  • Two DISubprogram records for the "consumeError" function,
  • One DICompositeType for _ZTSZ12consumeError5ErrorEUlvE_, "typeinfo name for consumeError(Error)::{lambda()#1}"
  • The DICompositeType is the retainedNodes member of one DISubprogram, but...
  • The DICompositeType's Scope pointer is to the other DISubprogram, that is otherwise unlinked to the rest of the metadata hierarchy.

It's this mismatch where the scope of an element in the retained nodes points somewhere unexpected, which causes the assertion failure reported. It's all because of the ODR-uniquing by type name that LLVM can do, a rationale is here:

// Determines if the debug info for this tag declaration needs a type

When DICompositeType::buildODRType de-duplicates on the basis of the type name, if it's a type inside a function then the scope pointer will point at the DISubprogram where the type was first seen. If there's an ODR violation, this might not be the same as the DISubprogram that is eventually selected for that function.

IIRC there's precedent in LLVM for throwing away lots of debug-info in the presence of ODR violations, but we should avoid hard errors.

@rnk
Copy link
Collaborator

rnk commented Jul 23, 2024

I'm not seeing how your example constitutes an ODR violation, or how merging these lambda types by mangled name is incorrect. They are equivalent in your example. It seems like the issue has more to do with the details of how exactly we do the merge, and where the metadata references from DICompositeType to DISubprogram point.

@jmorse
Copy link
Member

jmorse commented Oct 1, 2024

[This keeps on slipping to the back of my TODO list,]

I've been enlightened by the comments on #68929 about ODR rules, and that there isn't a violation in the example; it does at least exercise the code path of interest, my summary of which is that the ODR-uniquing of types is happening at such a low of a level that it causes surprises and can't be easily fixed. Here's a more well designed reproducer:

inline int foo() {
  class bar {
  private:
    int a = 0;
  public:
    int get_a() { return a; }
  };

  static bar baz;
  return baz.get_a();
}

int a() {
  return foo();
}

Compile and link this similar to above:
clang a.cpp -o b.ll -emit-llvm -S -g -c -O2
clang b.cpp -o b.ll -emit-llvm -S -g -c -O2
llvm-link a.ll b.ll -o c.ll -S
llc c.ll -o out.o -filetype=obj

Where b.cpp is a copy of the file above with the function renamed from 'a' to 'b' to ensure there aren't multiple conflicting definitions. In this code, we inline the body of "foo" into the 'a' and 'b' functions, and furthermore we inline the get_a method of foo::bar too. In each of the compilation units, this leads to a chain of lexical scopes for the most deeply inlined instruction of:

  • get_a method,
  • foo::bar class
  • foo function
  • 'a' or 'b' function.

The trouble comes when the two modules are linked together: the two collections of DILocations / DILexicalScopes / DISubprograms describing source-locations in each module are distinct and kept separate through linking. However the DICompositeType for foo::bar is unique'd based on its name, and its "scope" field will point into one of the metadata collections. Thus, where we used to have two distinct chains of lexical scopes we've now got a tree of them, joining at the unique'd DICompositeType, and llc is not prepared for this.

I don't know that this is a bug, more of a design mismatch: most of the rest of LLVM is probably OK with having the lexical-scope chain actually being a tree, given that it only ever looks up it. However LexicalScopes does a top down exploration of a function looking for lexical scopes, and is then surprised when it finds different scopes looking from the bottom up. We could adjust it to search upwards for more lexical scopes (it already does that for block-scopes), but then I imagine we would produce very confusing DWARF that contained two Subprogram scopes for the same function.

There's also no easy way of working around this in metadata: we can't describe any other metadata relationship because it's done at such a low level, and we can't selectively not-ODR-unique DICompositeTypes that are inside functions because the lexical scope metadata might not have been loaded yet, so can't be examined.

An immediate fix would be to not set the "identifier" field for the DICompositeType when it's created if it's inside a function scope to avoid ODRUniqing. I've only got a light understanding of what the identifier field is for, so there might be unexpected consequences, plus there'll be a metadata/DWARF size cost to that.

@dexonsmith
Copy link
Collaborator

An immediate fix would be to not set the "identifier" field for the DICompositeType when it's created if it's inside a function scope to avoid ODRUniqing. I've only got a light understanding of what the identifier field is for, so there might be unexpected consequences, plus there'll be a metadata/DWARF size cost to that.

Actually, that seems like probably the correct fix. The identifier field is precisely for "ODR-unique-based-on-this" and only for that, IIRC (@aprantl, can you confirm? these are old memories for me at this point...).

If the function-local types should not be ODR-uniqued, then dropping the identifier field sounds correct.

@rnk
Copy link
Collaborator

rnk commented Oct 1, 2024

If the function-local types should not be ODR-uniqued, then dropping the identifier field sounds correct.

I can't speak to the complexities of the alternative, but I'm immediately suspicious of this direction. We have stable manglings for static locals in inline functions and static data members of local classes and static locals in methods in local classes in inline functions... Surely if we can compute a unique mangled name for this local class, we should use it as the identifier and merge it. It seems important that, for size reasons, we figure out a way to do this for lambdas, which are common in inline functions in headers.

@jmorse
Copy link
Member

jmorse commented Oct 2, 2024

Hadn't realised lambda types will pop up everywhere, the cost might be too high. I'll test with building clang and our internal codebases to see if the fix works and what the cost is.

Surely if we can compute a unique mangled name for this local class, we should use it as the identifier and merge it.

Indeed, it's not the ODR-uniquing per-se that's causing the trouble here, it's the fact that we do not merge together the DISubprograms that are the containing scope for that type upon module linking. I'm not familiar with the design decisions behind that, but a courtesy glance suggests it's important to identify the compilation unit that a DISubprogram instance came from. We're then stuck with multiple DISubprograms and a single DIComposite type referring to one of them as being its scope, which presumably breaks a lot of assumptions.

Perhaps a worthwhile direction is separating DISubprograms into abstract and definition portions, much like DWARF does, which would allow us to put these types into the declaration and have a well defined scope hierarchy: types that get uniqued refer to the abstract-DISubprogram for their lexical scope. Perhaps that's can be achieved with the existing distinction between definition DISubprograms and "declaration" DISubprograms? A quick survey of callers to DISubprogram::getDefinition and isDefinition indicates there aren't many decisions made throughout LLVM based on the definition/declaration split, and there are comments in LLVMContextImpl.h suggesting declaration-DISubprograms are used in ODR'd types too.

If we took that route, the scope-chain would still become a tree of scopes instead of distinct chains, and we would still need to fix the problem of CodeGen LexicalScopes encountering the abstract/declaration DISubprogram. However, I think we'd have the reasonable solution of mapping declaration-DISubprograms in a function to their definition-DISubprogram (assuming the definition had been found already), which is much more reliable than seeking a DISubprogram by name or something.

@rnk
Copy link
Collaborator

rnk commented Oct 2, 2024

Perhaps a worthwhile direction is separating DISubprograms into abstract and definition portions, much like DWARF does, which would allow us to put these types into the declaration and have a well defined scope hierarchy: types that get uniqued refer to the abstract-DISubprogram for their lexical scope.

This sounds like a promising direction to me, but I have no idea what the relative costs are.

@dexonsmith
Copy link
Collaborator

I remember discovering that there are effectively two kinds of DISubprogram already... one for declarations (which usually get uniqued/deduped) and another for definitions (which I believe never do... IIRC they are always "distinct").

I imagine it would be possible/good to separate into two classes. If I'd understood there were two uses I would have done so originally. (The main effort is updating testcases.)

Even without two classes, you might be able to clean up the type graph, knowing that there are in fact two ways DISubprogram is used.

@jmorse
Copy link
Member

jmorse commented Dec 6, 2024

I've placed a re-application of this code in #119001, along with a commit that attempts to fix the matter discovered here. To avoid multiple DICompositeTypes in multiple DISubprograms being ODRUnique'd, we instead put them in the declaration DISubprogram which should be unique'd anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category debuginfo llvm:ir llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants