Skip to content

[Dwarf] Emit DW_AT_bit_size for non-byte-sized types #137118

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented Apr 24, 2025

When the base type is non-byte-sized (e.g., boolean), we should emit DW_AT_bit_size to make DW_OP_convert correct.
This patch was tested with gdb. lldb ignores DW_AT_bit_size field if DW_AT_byte_size is available. I will post a follow-up patch later.

This patch is stacked on #137013
Related issue: #71065

@llvmbot
Copy link
Member

llvmbot commented Apr 24, 2025

@llvm/pr-subscribers-debuginfo

Author: Yingwei Zheng (dtcxzyw)

Changes

When the base type is non-byte-sized (e.g., boolean), we should emit DW_AT_bit_size to make DW_OP_convert correct.
This patch was tested with gdb. lldb ignores DW_AT_bit_size field if DW_AT_byte_size is available. I will post a follow-up patch later.

This patch is stacked on #137013
Related issue: #71065


Full diff: https://github.com/llvm/llvm-project/pull/137118.diff

4 Files Affected:

  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp (+4)
  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+18)
  • (added) llvm/test/DebugInfo/X86/convert-bool.ll (+43)
  • (added) llvm/test/Transforms/InstCombine/debuginfo-invert.ll (+70)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 3939dae81841f..e73028d2e28d9 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -1749,6 +1749,10 @@ void DwarfCompileUnit::createBaseTypeDIEs() {
     // Round up to smallest number of bytes that contains this number of bits.
     addUInt(Die, dwarf::DW_AT_byte_size, std::nullopt,
             divideCeil(Btr.BitSize, 8));
+    // If the size is not a multiple of 8 (e.g., boolean), add the bit size
+    // field.
+    if (Btr.BitSize % 8 != 0)
+      addUInt(Die, dwarf::DW_AT_bit_size, std::nullopt, Btr.BitSize);
 
     Btr.Die = &Die;
   }
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index f807f5f4519fc..c47d32dc01cd3 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -1396,6 +1396,24 @@ void InstCombinerImpl::freelyInvertAllUsersOf(Value *I, Value *IgnoredUser) {
                        "canFreelyInvertAllUsersOf() ?");
     }
   }
+
+  // Update pre-existing debug value uses.
+  SmallVector<DbgValueInst *, 4> DbgValues;
+  SmallVector<DbgVariableRecord *, 4> DbgVariableRecords;
+  llvm::findDbgValues(DbgValues, I, &DbgVariableRecords);
+  auto ApplyNot = [](DIExpression *Src) {
+    SmallVector<uint64_t> Elements;
+    Elements.reserve(Src->getElements().size() + 1);
+    Elements.push_back(dwarf::DW_OP_not);
+    Elements.append(Src->getElements().begin(), Src->getElements().end());
+    return DIExpression::get(Src->getContext(), Elements);
+  };
+
+  for (auto *DVI : DbgValues)
+    DVI->setExpression(ApplyNot(DVI->getExpression()));
+
+  for (DbgVariableRecord *DVR : DbgVariableRecords)
+    DVR->setExpression(ApplyNot(DVR->getExpression()));
 }
 
 /// Given a 'sub' instruction, return the RHS of the instruction if the LHS is a
diff --git a/llvm/test/DebugInfo/X86/convert-bool.ll b/llvm/test/DebugInfo/X86/convert-bool.ll
new file mode 100644
index 0000000000000..0abd011c56668
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/convert-bool.ll
@@ -0,0 +1,43 @@
+; RUN: llc -mtriple=x86_64 -dwarf-version=5 -filetype=obj -O0 < %s | llvm-dwarfdump - | FileCheck %s
+
+; CHECK: DW_TAG_base_type
+; CHECK-NEXT: DW_AT_name      ("DW_ATE_unsigned_1")
+; CHECK-NEXT: DW_AT_encoding  (DW_ATE_unsigned)
+; CHECK-NEXT: DW_AT_byte_size (0x01)
+; CHECK-NEXT: DW_AT_bit_size  (0x01)
+
+define void @main() !dbg !18 {
+entry:
+    #dbg_value(i1 false, !22, !DIExpression(DW_OP_not, DW_OP_LLVM_convert, 1, DW_ATE_unsigned, DW_OP_LLVM_convert, 32, DW_ATE_unsigned, DW_OP_stack_value), !23)
+  ret void, !dbg !24
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!10, !11, !12, !13, !14, !15, !16}
+!llvm.ident = !{!17}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang version 21.0.0git", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, globals: !2, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "test.c", directory: "/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build", checksumkind: CSK_MD5, checksum: "100bdbce655d729c24c7c0e8523a58ae")
+!2 = !{!3, !6, !8}
+!3 = !DIGlobalVariableExpression(var: !4, expr: !DIExpression())
+!4 = distinct !DIGlobalVariable(name: "a", scope: !0, file: !1, line: 1, type: !5, isLocal: false, isDefinition: true)
+!5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!6 = !DIGlobalVariableExpression(var: !7, expr: !DIExpression())
+!7 = distinct !DIGlobalVariable(name: "b", scope: !0, file: !1, line: 1, type: !5, isLocal: false, isDefinition: true)
+!8 = !DIGlobalVariableExpression(var: !9, expr: !DIExpression())
+!9 = distinct !DIGlobalVariable(name: "c", scope: !0, file: !1, line: 1, type: !5, isLocal: false, isDefinition: true)
+!10 = !{i32 7, !"Dwarf Version", i32 5}
+!11 = !{i32 2, !"Debug Info Version", i32 3}
+!12 = !{i32 1, !"wchar_size", i32 4}
+!13 = !{i32 8, !"PIC Level", i32 2}
+!14 = !{i32 7, !"PIE Level", i32 2}
+!15 = !{i32 7, !"uwtable", i32 2}
+!16 = !{i32 7, !"debug-info-assignment-tracking", i1 true}
+!17 = !{!"clang version 21.0.0git"}
+!18 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 3, type: !19, scopeLine: 3, flags: DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !21)
+!19 = !DISubroutineType(types: !20)
+!20 = !{null}
+!21 = !{!22}
+!22 = !DILocalVariable(name: "l_4516", scope: !18, file: !1, line: 4, type: !5)
+!23 = !DILocation(line: 0, scope: !18)
+!24 = !DILocation(line: 9, column: 1, scope: !18)
diff --git a/llvm/test/Transforms/InstCombine/debuginfo-invert.ll b/llvm/test/Transforms/InstCombine/debuginfo-invert.ll
new file mode 100644
index 0000000000000..8c673e319f7f7
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/debuginfo-invert.ll
@@ -0,0 +1,70 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes=instcombine -S %s -o - | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define i32 @test(i32 noundef %x, i32 noundef %y) !dbg !10 {
+; CHECK-LABEL: define i32 @test(
+; CHECK-SAME: i32 noundef [[X:%.*]], i32 noundef [[Y:%.*]]) !dbg [[DBG10:![0-9]+]] {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:      #dbg_value(i32 [[X]], [[META15:![0-9]+]], !DIExpression(), [[META18:![0-9]+]])
+; CHECK-NEXT:      #dbg_value(i32 [[Y]], [[META16:![0-9]+]], !DIExpression(), [[META18]])
+; CHECK-NEXT:    [[CMP_NOT:%.*]] = icmp eq i32 [[X]], 0, !dbg [[DBG19:![0-9]+]]
+; CHECK-NEXT:      #dbg_value(i1 [[CMP_NOT]], [[META17:![0-9]+]], !DIExpression(DW_OP_not, DW_OP_LLVM_convert, 1, DW_ATE_unsigned, DW_OP_LLVM_convert, 32, DW_ATE_unsigned, DW_OP_stack_value), [[META18]])
+; CHECK-NEXT:    [[TMP0:%.*]] = and i32 [[Y]], 1, !dbg [[DBG20:![0-9]+]]
+; CHECK-NEXT:    [[AND:%.*]] = select i1 [[CMP_NOT]], i32 0, i32 [[TMP0]], !dbg [[DBG20]]
+; CHECK-NEXT:    ret i32 [[AND]], !dbg [[DBG21:![0-9]+]]
+;
+entry:
+  #dbg_value(i32 %x, !15, !DIExpression(), !18)
+  #dbg_value(i32 %y, !16, !DIExpression(), !18)
+  %cmp = icmp ne i32 %x, 0, !dbg !19
+  %conv = zext i1 %cmp to i32, !dbg !19
+  #dbg_value(i32 %conv, !17, !DIExpression(), !18)
+  %and = and i32 %conv, %y, !dbg !20
+  ret i32 %and, !dbg !21
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4, !5, !6, !7, !8}
+!llvm.ident = !{!9}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang version 21.0.0git", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "test.c", directory: "/", checksumkind: CSK_MD5, checksum: "b2d9ffc7905684d8b7c3b52a3136e57c")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{i32 8, !"PIC Level", i32 2}
+!6 = !{i32 7, !"PIE Level", i32 2}
+!7 = !{i32 7, !"uwtable", i32 2}
+!8 = !{i32 7, !"debug-info-assignment-tracking", i1 true}
+!9 = !{!"clang version 21.0.0git"}
+!10 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 1, type: !11, scopeLine: 1, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !14)
+!11 = !DISubroutineType(types: !12)
+!12 = !{!13, !13, !13}
+!13 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!14 = !{!15, !16, !17}
+!15 = !DILocalVariable(name: "x", arg: 1, scope: !10, file: !1, line: 1, type: !13)
+!16 = !DILocalVariable(name: "y", arg: 2, scope: !10, file: !1, line: 1, type: !13)
+!17 = !DILocalVariable(name: "z", scope: !10, file: !1, line: 2, type: !13)
+!18 = !DILocation(line: 0, scope: !10)
+!19 = !DILocation(line: 2, column: 13, scope: !10)
+!20 = !DILocation(line: 3, column: 12, scope: !10)
+!21 = !DILocation(line: 3, column: 3, scope: !10)
+;.
+; CHECK: [[META0:![0-9]+]] = distinct !DICompileUnit(language: DW_LANG_C11, file: [[META1:![0-9]+]], producer: "{{.*}}clang version {{.*}}", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+; CHECK: [[META1]] = !DIFile(filename: "test.c", directory: {{.*}})
+; CHECK: [[DBG10]] = distinct !DISubprogram(name: "test", scope: [[META1]], file: [[META1]], line: 1, type: [[META11:![0-9]+]], scopeLine: 1, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: [[META0]], retainedNodes: [[META14:![0-9]+]])
+; CHECK: [[META11]] = !DISubroutineType(types: [[META12:![0-9]+]])
+; CHECK: [[META12]] = !{[[META13:![0-9]+]], [[META13]], [[META13]]}
+; CHECK: [[META13]] = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+; CHECK: [[META14]] = !{[[META15]], [[META16]], [[META17]]}
+; CHECK: [[META15]] = !DILocalVariable(name: "x", arg: 1, scope: [[DBG10]], file: [[META1]], line: 1, type: [[META13]])
+; CHECK: [[META16]] = !DILocalVariable(name: "y", arg: 2, scope: [[DBG10]], file: [[META1]], line: 1, type: [[META13]])
+; CHECK: [[META17]] = !DILocalVariable(name: "z", scope: [[DBG10]], file: [[META1]], line: 2, type: [[META13]])
+; CHECK: [[META18]] = !DILocation(line: 0, scope: [[DBG10]])
+; CHECK: [[DBG19]] = !DILocation(line: 2, column: 13, scope: [[DBG10]])
+; CHECK: [[DBG20]] = !DILocation(line: 3, column: 12, scope: [[DBG10]])
+; CHECK: [[DBG21]] = !DILocation(line: 3, column: 3, scope: [[DBG10]])
+;.

@llvmbot
Copy link
Member

llvmbot commented Apr 24, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

When the base type is non-byte-sized (e.g., boolean), we should emit DW_AT_bit_size to make DW_OP_convert correct.
This patch was tested with gdb. lldb ignores DW_AT_bit_size field if DW_AT_byte_size is available. I will post a follow-up patch later.

This patch is stacked on #137013
Related issue: #71065


Full diff: https://github.com/llvm/llvm-project/pull/137118.diff

4 Files Affected:

  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp (+4)
  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+18)
  • (added) llvm/test/DebugInfo/X86/convert-bool.ll (+43)
  • (added) llvm/test/Transforms/InstCombine/debuginfo-invert.ll (+70)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 3939dae81841f..e73028d2e28d9 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -1749,6 +1749,10 @@ void DwarfCompileUnit::createBaseTypeDIEs() {
     // Round up to smallest number of bytes that contains this number of bits.
     addUInt(Die, dwarf::DW_AT_byte_size, std::nullopt,
             divideCeil(Btr.BitSize, 8));
+    // If the size is not a multiple of 8 (e.g., boolean), add the bit size
+    // field.
+    if (Btr.BitSize % 8 != 0)
+      addUInt(Die, dwarf::DW_AT_bit_size, std::nullopt, Btr.BitSize);
 
     Btr.Die = &Die;
   }
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index f807f5f4519fc..c47d32dc01cd3 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -1396,6 +1396,24 @@ void InstCombinerImpl::freelyInvertAllUsersOf(Value *I, Value *IgnoredUser) {
                        "canFreelyInvertAllUsersOf() ?");
     }
   }
+
+  // Update pre-existing debug value uses.
+  SmallVector<DbgValueInst *, 4> DbgValues;
+  SmallVector<DbgVariableRecord *, 4> DbgVariableRecords;
+  llvm::findDbgValues(DbgValues, I, &DbgVariableRecords);
+  auto ApplyNot = [](DIExpression *Src) {
+    SmallVector<uint64_t> Elements;
+    Elements.reserve(Src->getElements().size() + 1);
+    Elements.push_back(dwarf::DW_OP_not);
+    Elements.append(Src->getElements().begin(), Src->getElements().end());
+    return DIExpression::get(Src->getContext(), Elements);
+  };
+
+  for (auto *DVI : DbgValues)
+    DVI->setExpression(ApplyNot(DVI->getExpression()));
+
+  for (DbgVariableRecord *DVR : DbgVariableRecords)
+    DVR->setExpression(ApplyNot(DVR->getExpression()));
 }
 
 /// Given a 'sub' instruction, return the RHS of the instruction if the LHS is a
diff --git a/llvm/test/DebugInfo/X86/convert-bool.ll b/llvm/test/DebugInfo/X86/convert-bool.ll
new file mode 100644
index 0000000000000..0abd011c56668
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/convert-bool.ll
@@ -0,0 +1,43 @@
+; RUN: llc -mtriple=x86_64 -dwarf-version=5 -filetype=obj -O0 < %s | llvm-dwarfdump - | FileCheck %s
+
+; CHECK: DW_TAG_base_type
+; CHECK-NEXT: DW_AT_name      ("DW_ATE_unsigned_1")
+; CHECK-NEXT: DW_AT_encoding  (DW_ATE_unsigned)
+; CHECK-NEXT: DW_AT_byte_size (0x01)
+; CHECK-NEXT: DW_AT_bit_size  (0x01)
+
+define void @main() !dbg !18 {
+entry:
+    #dbg_value(i1 false, !22, !DIExpression(DW_OP_not, DW_OP_LLVM_convert, 1, DW_ATE_unsigned, DW_OP_LLVM_convert, 32, DW_ATE_unsigned, DW_OP_stack_value), !23)
+  ret void, !dbg !24
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!10, !11, !12, !13, !14, !15, !16}
+!llvm.ident = !{!17}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang version 21.0.0git", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, globals: !2, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "test.c", directory: "/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build", checksumkind: CSK_MD5, checksum: "100bdbce655d729c24c7c0e8523a58ae")
+!2 = !{!3, !6, !8}
+!3 = !DIGlobalVariableExpression(var: !4, expr: !DIExpression())
+!4 = distinct !DIGlobalVariable(name: "a", scope: !0, file: !1, line: 1, type: !5, isLocal: false, isDefinition: true)
+!5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!6 = !DIGlobalVariableExpression(var: !7, expr: !DIExpression())
+!7 = distinct !DIGlobalVariable(name: "b", scope: !0, file: !1, line: 1, type: !5, isLocal: false, isDefinition: true)
+!8 = !DIGlobalVariableExpression(var: !9, expr: !DIExpression())
+!9 = distinct !DIGlobalVariable(name: "c", scope: !0, file: !1, line: 1, type: !5, isLocal: false, isDefinition: true)
+!10 = !{i32 7, !"Dwarf Version", i32 5}
+!11 = !{i32 2, !"Debug Info Version", i32 3}
+!12 = !{i32 1, !"wchar_size", i32 4}
+!13 = !{i32 8, !"PIC Level", i32 2}
+!14 = !{i32 7, !"PIE Level", i32 2}
+!15 = !{i32 7, !"uwtable", i32 2}
+!16 = !{i32 7, !"debug-info-assignment-tracking", i1 true}
+!17 = !{!"clang version 21.0.0git"}
+!18 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 3, type: !19, scopeLine: 3, flags: DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !21)
+!19 = !DISubroutineType(types: !20)
+!20 = !{null}
+!21 = !{!22}
+!22 = !DILocalVariable(name: "l_4516", scope: !18, file: !1, line: 4, type: !5)
+!23 = !DILocation(line: 0, scope: !18)
+!24 = !DILocation(line: 9, column: 1, scope: !18)
diff --git a/llvm/test/Transforms/InstCombine/debuginfo-invert.ll b/llvm/test/Transforms/InstCombine/debuginfo-invert.ll
new file mode 100644
index 0000000000000..8c673e319f7f7
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/debuginfo-invert.ll
@@ -0,0 +1,70 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes=instcombine -S %s -o - | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define i32 @test(i32 noundef %x, i32 noundef %y) !dbg !10 {
+; CHECK-LABEL: define i32 @test(
+; CHECK-SAME: i32 noundef [[X:%.*]], i32 noundef [[Y:%.*]]) !dbg [[DBG10:![0-9]+]] {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:      #dbg_value(i32 [[X]], [[META15:![0-9]+]], !DIExpression(), [[META18:![0-9]+]])
+; CHECK-NEXT:      #dbg_value(i32 [[Y]], [[META16:![0-9]+]], !DIExpression(), [[META18]])
+; CHECK-NEXT:    [[CMP_NOT:%.*]] = icmp eq i32 [[X]], 0, !dbg [[DBG19:![0-9]+]]
+; CHECK-NEXT:      #dbg_value(i1 [[CMP_NOT]], [[META17:![0-9]+]], !DIExpression(DW_OP_not, DW_OP_LLVM_convert, 1, DW_ATE_unsigned, DW_OP_LLVM_convert, 32, DW_ATE_unsigned, DW_OP_stack_value), [[META18]])
+; CHECK-NEXT:    [[TMP0:%.*]] = and i32 [[Y]], 1, !dbg [[DBG20:![0-9]+]]
+; CHECK-NEXT:    [[AND:%.*]] = select i1 [[CMP_NOT]], i32 0, i32 [[TMP0]], !dbg [[DBG20]]
+; CHECK-NEXT:    ret i32 [[AND]], !dbg [[DBG21:![0-9]+]]
+;
+entry:
+  #dbg_value(i32 %x, !15, !DIExpression(), !18)
+  #dbg_value(i32 %y, !16, !DIExpression(), !18)
+  %cmp = icmp ne i32 %x, 0, !dbg !19
+  %conv = zext i1 %cmp to i32, !dbg !19
+  #dbg_value(i32 %conv, !17, !DIExpression(), !18)
+  %and = and i32 %conv, %y, !dbg !20
+  ret i32 %and, !dbg !21
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4, !5, !6, !7, !8}
+!llvm.ident = !{!9}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang version 21.0.0git", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "test.c", directory: "/", checksumkind: CSK_MD5, checksum: "b2d9ffc7905684d8b7c3b52a3136e57c")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{i32 8, !"PIC Level", i32 2}
+!6 = !{i32 7, !"PIE Level", i32 2}
+!7 = !{i32 7, !"uwtable", i32 2}
+!8 = !{i32 7, !"debug-info-assignment-tracking", i1 true}
+!9 = !{!"clang version 21.0.0git"}
+!10 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 1, type: !11, scopeLine: 1, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !14)
+!11 = !DISubroutineType(types: !12)
+!12 = !{!13, !13, !13}
+!13 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!14 = !{!15, !16, !17}
+!15 = !DILocalVariable(name: "x", arg: 1, scope: !10, file: !1, line: 1, type: !13)
+!16 = !DILocalVariable(name: "y", arg: 2, scope: !10, file: !1, line: 1, type: !13)
+!17 = !DILocalVariable(name: "z", scope: !10, file: !1, line: 2, type: !13)
+!18 = !DILocation(line: 0, scope: !10)
+!19 = !DILocation(line: 2, column: 13, scope: !10)
+!20 = !DILocation(line: 3, column: 12, scope: !10)
+!21 = !DILocation(line: 3, column: 3, scope: !10)
+;.
+; CHECK: [[META0:![0-9]+]] = distinct !DICompileUnit(language: DW_LANG_C11, file: [[META1:![0-9]+]], producer: "{{.*}}clang version {{.*}}", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+; CHECK: [[META1]] = !DIFile(filename: "test.c", directory: {{.*}})
+; CHECK: [[DBG10]] = distinct !DISubprogram(name: "test", scope: [[META1]], file: [[META1]], line: 1, type: [[META11:![0-9]+]], scopeLine: 1, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: [[META0]], retainedNodes: [[META14:![0-9]+]])
+; CHECK: [[META11]] = !DISubroutineType(types: [[META12:![0-9]+]])
+; CHECK: [[META12]] = !{[[META13:![0-9]+]], [[META13]], [[META13]]}
+; CHECK: [[META13]] = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+; CHECK: [[META14]] = !{[[META15]], [[META16]], [[META17]]}
+; CHECK: [[META15]] = !DILocalVariable(name: "x", arg: 1, scope: [[DBG10]], file: [[META1]], line: 1, type: [[META13]])
+; CHECK: [[META16]] = !DILocalVariable(name: "y", arg: 2, scope: [[DBG10]], file: [[META1]], line: 1, type: [[META13]])
+; CHECK: [[META17]] = !DILocalVariable(name: "z", scope: [[DBG10]], file: [[META1]], line: 2, type: [[META13]])
+; CHECK: [[META18]] = !DILocation(line: 0, scope: [[DBG10]])
+; CHECK: [[DBG19]] = !DILocation(line: 2, column: 13, scope: [[DBG10]])
+; CHECK: [[DBG20]] = !DILocation(line: 3, column: 12, scope: [[DBG10]])
+; CHECK: [[DBG21]] = !DILocation(line: 3, column: 3, scope: [[DBG10]])
+;.

// If the size is not a multiple of 8 (e.g., boolean), add the bit size
// field.
if (Btr.BitSize % 8 != 0)
addUInt(Die, dwarf::DW_AT_bit_size, std::nullopt, Btr.BitSize);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DWARF seems to say that either the bit- or byte-size should be specified. So rather than emit both sizes, I think it's preferable to emit just the bit size if it is not a multiple of 8. See DWARF 5 section 5.1 "Base Type Entries":

A base type entry has a DW_AT_byte_size attribute or a
DW_AT_bit_size attribute whose integer constant value (see Section 2.21 on
page 56) is the amount of storage needed to hold a value of the type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking at this again and there's some later text about a scenario where DW_AT_byte_size is used for the object size, but then the actual data is described with DW_AT_bit_size plus DW_AT_data_bit_offset. So while it's not totally correct to say that the base type must have one or the other, it's still the case that this particular patch should probably do that. IMO.

@dwblaikie
Copy link
Collaborator

Could you point me to where you've written up (or write it up here) where we need non-byte sized objects? A C++ bool is still byte sized (its sizeof is 1) so I wouldn't expect to want/need bit-sized things to implement C++ bool... is there something else going on here?

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Apr 25, 2025

Could you point me to where you've written up (or write it up here) where we need non-byte sized objects? A C++ bool is still byte sized (its sizeof is 1) so I wouldn't expect to want/need bit-sized things to implement C++ bool... is there something else going on here?

The sequence DW_OP_LLVM_convert, 1, DW_ATE_unsigned is generated by salvageDebugInfo:

auto ExtOps = DIExpression::getExtOps(FromTypeBitSize, ToTypeBitSize,
isa<SExtInst>(&I));
Ops.append(ExtOps.begin(), ExtOps.end());
return FromValue;

@dwblaikie
Copy link
Collaborator

Could you point me to where you've written up (or write it up here) where we need non-byte sized objects? A C++ bool is still byte sized (its sizeof is 1) so I wouldn't expect to want/need bit-sized things to implement C++ bool... is there something else going on here?

The sequence DW_OP_LLVM_convert, 1, DW_ATE_unsigned is generated by salvageDebugInfo:

auto ExtOps = DIExpression::getExtOps(FromTypeBitSize, ToTypeBitSize,
isa<SExtInst>(&I));
Ops.append(ExtOps.begin(), ExtOps.end());
return FromValue;

But the size of the actual C++ bool variable is one byte. We lost the high bits, so we should probably be describing it as 7 unknown high bits and 1 known low bit. At the LLVM level we probably don't want to embed language-specific knowledge that those high bits are all zero - but a DWARF consumer might have that knowledge & be able to render the value to the user as true/false because it knows the high bits are always all zero in valid code.

@dwblaikie
Copy link
Collaborator

Yeah, not sure we should create DW_OP_LLVM_convert, 1 and if we do, not sure we should lower it to an integer of size 1... doesn't seem likely to be what most consumers would expect?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants