Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DWARF] Use ULEB128 and not just one byte for directory indices #109067

Merged
merged 4 commits into from
Sep 24, 2024

Conversation

drodriguez
Copy link
Contributor

According to the standard DW_LNCT_directory_index can be data1, data2, or udata (see 6.2.4.1). The code was using data1, but this limits the number of directories to 256, even if the variable holding the directory index is a uint64_t. dsymutil was hitting an assertion when trying to write directory indices higher than 255.

Modify the classic and the parallel DWARF linkers to use udata and encode the directory indices as ULEB128 and provide a test that has more than 256 directories to check the changes are working as expected.

For people that were using dsymutil with CUs that had between 128-256 directories, this will mean that for those indices 2 bytes will be used now, instead of just one.

According to the standard `DW_LNCT_directory_index` can be `data1`, `data2`,
or `udata` (see 6.2.4.1). The code was using `data1`, but this limits the
number of directories to 256, even if the variable holding the directory
index is a `uint64_t`. `dsymutil` was hitting an assertion when trying
to write directory indices higher than 255.

Modify the classic and the parallel DWARF linkers to use `udata` and
encode the directory indices as ULEB128 and provide a test that has more
than 256 directories to check the changes are working as expected.

For people that were using `dsymutil` with CUs that had between 128-256
directories, this will mean that for those indices 2 bytes will be used
now, instead of just one.
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 18, 2024

@llvm/pr-subscribers-debuginfo

Author: Daniel Rodríguez Troitiño (drodriguez)

Changes

According to the standard DW_LNCT_directory_index can be data1, data2, or udata (see 6.2.4.1). The code was using data1, but this limits the number of directories to 256, even if the variable holding the directory index is a uint64_t. dsymutil was hitting an assertion when trying to write directory indices higher than 255.

Modify the classic and the parallel DWARF linkers to use udata and encode the directory indices as ULEB128 and provide a test that has more than 256 directories to check the changes are working as expected.

For people that were using dsymutil with CUs that had between 128-256 directories, this will mean that for those indices 2 bytes will be used now, instead of just one.


Patch is 144.49 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/109067.diff

4 Files Affected:

  • (modified) llvm/lib/DWARFLinker/Classic/DWARFStreamer.cpp (+2-3)
  • (modified) llvm/lib/DWARFLinker/Parallel/DebugLineSectionEmitter.h (+2-2)
  • (added) llvm/test/tools/dsymutil/X86/dwarf5-many-include-directories.ll (+2631)
  • (modified) llvm/test/tools/dsymutil/X86/lit.local.cfg (+1-1)
diff --git a/llvm/lib/DWARFLinker/Classic/DWARFStreamer.cpp b/llvm/lib/DWARFLinker/Classic/DWARFStreamer.cpp
index bca31253683530..947db9cbcd92d0 100644
--- a/llvm/lib/DWARFLinker/Classic/DWARFStreamer.cpp
+++ b/llvm/lib/DWARFLinker/Classic/DWARFStreamer.cpp
@@ -933,7 +933,7 @@ void DwarfStreamer::emitLineTablePrologueV5IncludeAndFileTable(
     LineSectionSize += MS->emitULEB128IntValue(StrForm);
 
     LineSectionSize += MS->emitULEB128IntValue(dwarf::DW_LNCT_directory_index);
-    LineSectionSize += MS->emitULEB128IntValue(dwarf::DW_FORM_data1);
+    LineSectionSize += MS->emitULEB128IntValue(dwarf::DW_FORM_udata);
 
     if (HasChecksums) {
       LineSectionSize += MS->emitULEB128IntValue(dwarf::DW_LNCT_MD5);
@@ -952,8 +952,7 @@ void DwarfStreamer::emitLineTablePrologueV5IncludeAndFileTable(
   // file_names (sequence of file name entries).
   for (auto File : P.FileNames) {
     emitLineTableString(P, File.Name, DebugStrPool, DebugLineStrPool);
-    MS->emitInt8(File.DirIdx);
-    LineSectionSize += 1;
+    LineSectionSize += MS->emitULEB128IntValue(File.DirIdx);
     if (HasChecksums) {
       MS->emitBinaryData(
           StringRef(reinterpret_cast<const char *>(File.Checksum.data()),
diff --git a/llvm/lib/DWARFLinker/Parallel/DebugLineSectionEmitter.h b/llvm/lib/DWARFLinker/Parallel/DebugLineSectionEmitter.h
index 38357c7f97314c..b035c4b1d6c30d 100644
--- a/llvm/lib/DWARFLinker/Parallel/DebugLineSectionEmitter.h
+++ b/llvm/lib/DWARFLinker/Parallel/DebugLineSectionEmitter.h
@@ -215,7 +215,7 @@ class DebugLineSectionEmitter {
       encodeULEB128(FileNameForm, Section.OS);
 
       encodeULEB128(dwarf::DW_LNCT_directory_index, Section.OS);
-      encodeULEB128(dwarf::DW_FORM_data1, Section.OS);
+      encodeULEB128(dwarf::DW_FORM_udata, Section.OS);
 
       if (HasChecksums) {
         encodeULEB128(dwarf::DW_LNCT_MD5, Section.OS);
@@ -242,7 +242,7 @@ class DebugLineSectionEmitter {
       // A null-terminated string containing the full or relative path name of a
       // source file.
       Section.emitString(FileNameForm, *FileNameStr);
-      Section.emitIntVal(File.DirIdx, 1);
+      encodeULEB128(File.DirIdx, Section.OS);
 
       if (HasChecksums) {
         assert((File.Checksum.size() == 16) &&
diff --git a/llvm/test/tools/dsymutil/X86/dwarf5-many-include-directories.ll b/llvm/test/tools/dsymutil/X86/dwarf5-many-include-directories.ll
new file mode 100644
index 00000000000000..74e84e22c03219
--- /dev/null
+++ b/llvm/test/tools/dsymutil/X86/dwarf5-many-include-directories.ll
@@ -0,0 +1,2631 @@
+; RUN: rm -rf %t && mkdir -p %t
+; RUN: %llc_dwarf -o %t/all.o -filetype=obj %s
+; Dummy debug map to get the results. Bare object doesn't seem to work
+; RUN: echo "---" > %t/debug.map
+; RUN: echo "triple: 'x86_64-apple-darwin'" >> %t/debug.map
+; RUN: echo "objects:" >> %t/debug.map
+; RUN: echo "  - filename: %t/all.o" >> %t/debug.map
+; RUN: echo "    symbols:" >> %t/debug.map
+; RUN: echo "      - { sym: _all, objAddr: 0x0, binAddr: 0x0, size: 0x0 }" >> %t/debug.map
+; RUN: echo "..." >> %t/debug.map
+; RUN: dsymutil -f -y %t/debug.map -o - | llvm-dwarfdump -debug-line - | FileCheck %s
+; RUN: dsymutil --linker parallel -f -y %t/debug.map -o - | llvm-dwarfdump -debug-line - | tee %t/output.txt | FileCheck %s
+
+; CHECK:      include_directories[255] = "/tmp/tmp.0HPkdttdoU/d254"
+; CHECK-NEXT: include_directories[256] = "/tmp/tmp.0HPkdttdoU/d255"
+; CHECK-NEXT: include_directories[257] = "/tmp/tmp.0HPkdttdoU/d256"
+
+; CHECK: dir_index: 255
+; CHECK: dir_index: 256
+; CHECK: dir_index: 257
+
+; ---
+; Generated doing (fish shell):
+; - for cnt in (seq 0 256); mkdir -p d$cnt ; printf "void func$cnd() {}\n#define FUNC$cnt func$cnt()\n" >> d$cnt/f$cnt.c ; end
+; - for cnt in (seq 0 256); printf "#include \"f$cnt.c\"" >> all.c ; end
+; - printf "void all() {\n" >> all.c
+; - for cnt in (seq 0 256); printf "FUNC$cnt;\n" >> all.c ; end
+; - printf "}\n" >> all.c
+; - clang -target x86_64-apple-macos -S -emit-llvm -gdwarf-5 -o all.ll all.c (for cnt in (seq 0 256); echo "-Id$cnt"; end)
+; - Edit all.ll manually and change all DIFile so the directory in filename is
+;   moved into the directory field.
+; ---
+; ModuleID = 'all.c'
+source_filename = "all.c"
+target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-apple-macosx10.4.0"
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func0() #0 !dbg !9 {
+  ret void, !dbg !13
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func1() #0 !dbg !14 {
+  ret void, !dbg !16
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func2() #0 !dbg !17 {
+  ret void, !dbg !19
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func3() #0 !dbg !20 {
+  ret void, !dbg !22
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func4() #0 !dbg !23 {
+  ret void, !dbg !25
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func5() #0 !dbg !26 {
+  ret void, !dbg !28
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func6() #0 !dbg !29 {
+  ret void, !dbg !31
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func7() #0 !dbg !32 {
+  ret void, !dbg !34
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func8() #0 !dbg !35 {
+  ret void, !dbg !37
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func9() #0 !dbg !38 {
+  ret void, !dbg !40
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func10() #0 !dbg !41 {
+  ret void, !dbg !43
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func11() #0 !dbg !44 {
+  ret void, !dbg !46
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func12() #0 !dbg !47 {
+  ret void, !dbg !49
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func13() #0 !dbg !50 {
+  ret void, !dbg !52
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func14() #0 !dbg !53 {
+  ret void, !dbg !55
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func15() #0 !dbg !56 {
+  ret void, !dbg !58
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func16() #0 !dbg !59 {
+  ret void, !dbg !61
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func17() #0 !dbg !62 {
+  ret void, !dbg !64
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func18() #0 !dbg !65 {
+  ret void, !dbg !67
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func19() #0 !dbg !68 {
+  ret void, !dbg !70
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func20() #0 !dbg !71 {
+  ret void, !dbg !73
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func21() #0 !dbg !74 {
+  ret void, !dbg !76
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func22() #0 !dbg !77 {
+  ret void, !dbg !79
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func23() #0 !dbg !80 {
+  ret void, !dbg !82
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func24() #0 !dbg !83 {
+  ret void, !dbg !85
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func25() #0 !dbg !86 {
+  ret void, !dbg !88
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func26() #0 !dbg !89 {
+  ret void, !dbg !91
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func27() #0 !dbg !92 {
+  ret void, !dbg !94
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func28() #0 !dbg !95 {
+  ret void, !dbg !97
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func29() #0 !dbg !98 {
+  ret void, !dbg !100
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func30() #0 !dbg !101 {
+  ret void, !dbg !103
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func31() #0 !dbg !104 {
+  ret void, !dbg !106
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func32() #0 !dbg !107 {
+  ret void, !dbg !109
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func33() #0 !dbg !110 {
+  ret void, !dbg !112
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func34() #0 !dbg !113 {
+  ret void, !dbg !115
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func35() #0 !dbg !116 {
+  ret void, !dbg !118
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func36() #0 !dbg !119 {
+  ret void, !dbg !121
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func37() #0 !dbg !122 {
+  ret void, !dbg !124
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func38() #0 !dbg !125 {
+  ret void, !dbg !127
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func39() #0 !dbg !128 {
+  ret void, !dbg !130
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func40() #0 !dbg !131 {
+  ret void, !dbg !133
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func41() #0 !dbg !134 {
+  ret void, !dbg !136
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func42() #0 !dbg !137 {
+  ret void, !dbg !139
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func43() #0 !dbg !140 {
+  ret void, !dbg !142
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func44() #0 !dbg !143 {
+  ret void, !dbg !145
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func45() #0 !dbg !146 {
+  ret void, !dbg !148
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func46() #0 !dbg !149 {
+  ret void, !dbg !151
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func47() #0 !dbg !152 {
+  ret void, !dbg !154
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func48() #0 !dbg !155 {
+  ret void, !dbg !157
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func49() #0 !dbg !158 {
+  ret void, !dbg !160
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func50() #0 !dbg !161 {
+  ret void, !dbg !163
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func51() #0 !dbg !164 {
+  ret void, !dbg !166
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func52() #0 !dbg !167 {
+  ret void, !dbg !169
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func53() #0 !dbg !170 {
+  ret void, !dbg !172
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func54() #0 !dbg !173 {
+  ret void, !dbg !175
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func55() #0 !dbg !176 {
+  ret void, !dbg !178
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func56() #0 !dbg !179 {
+  ret void, !dbg !181
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func57() #0 !dbg !182 {
+  ret void, !dbg !184
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func58() #0 !dbg !185 {
+  ret void, !dbg !187
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func59() #0 !dbg !188 {
+  ret void, !dbg !190
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func60() #0 !dbg !191 {
+  ret void, !dbg !193
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func61() #0 !dbg !194 {
+  ret void, !dbg !196
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func62() #0 !dbg !197 {
+  ret void, !dbg !199
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func63() #0 !dbg !200 {
+  ret void, !dbg !202
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func64() #0 !dbg !203 {
+  ret void, !dbg !205
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func65() #0 !dbg !206 {
+  ret void, !dbg !208
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func66() #0 !dbg !209 {
+  ret void, !dbg !211
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func67() #0 !dbg !212 {
+  ret void, !dbg !214
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func68() #0 !dbg !215 {
+  ret void, !dbg !217
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func69() #0 !dbg !218 {
+  ret void, !dbg !220
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func70() #0 !dbg !221 {
+  ret void, !dbg !223
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func71() #0 !dbg !224 {
+  ret void, !dbg !226
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func72() #0 !dbg !227 {
+  ret void, !dbg !229
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func73() #0 !dbg !230 {
+  ret void, !dbg !232
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func74() #0 !dbg !233 {
+  ret void, !dbg !235
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func75() #0 !dbg !236 {
+  ret void, !dbg !238
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func76() #0 !dbg !239 {
+  ret void, !dbg !241
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func77() #0 !dbg !242 {
+  ret void, !dbg !244
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func78() #0 !dbg !245 {
+  ret void, !dbg !247
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func79() #0 !dbg !248 {
+  ret void, !dbg !250
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func80() #0 !dbg !251 {
+  ret void, !dbg !253
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func81() #0 !dbg !254 {
+  ret void, !dbg !256
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func82() #0 !dbg !257 {
+  ret void, !dbg !259
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func83() #0 !dbg !260 {
+  ret void, !dbg !262
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func84() #0 !dbg !263 {
+  ret void, !dbg !265
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func85() #0 !dbg !266 {
+  ret void, !dbg !268
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func86() #0 !dbg !269 {
+  ret void, !dbg !271
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func87() #0 !dbg !272 {
+  ret void, !dbg !274
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func88() #0 !dbg !275 {
+  ret void, !dbg !277
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func89() #0 !dbg !278 {
+  ret void, !dbg !280
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func90() #0 !dbg !281 {
+  ret void, !dbg !283
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func91() #0 !dbg !284 {
+  ret void, !dbg !286
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func92() #0 !dbg !287 {
+  ret void, !dbg !289
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func93() #0 !dbg !290 {
+  ret void, !dbg !292
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func94() #0 !dbg !293 {
+  ret void, !dbg !295
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func95() #0 !dbg !296 {
+  ret void, !dbg !298
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func96() #0 !dbg !299 {
+  ret void, !dbg !301
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func97() #0 !dbg !302 {
+  ret void, !dbg !304
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func98() #0 !dbg !305 {
+  ret void, !dbg !307
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func99() #0 !dbg !308 {
+  ret void, !dbg !310
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func100() #0 !dbg !311 {
+  ret void, !dbg !313
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func101() #0 !dbg !314 {
+  ret void, !dbg !316
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func102() #0 !dbg !317 {
+  ret void, !dbg !319
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func103() #0 !dbg !320 {
+  ret void, !dbg !322
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func104() #0 !dbg !323 {
+  ret void, !dbg !325
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func105() #0 !dbg !326 {
+  ret void, !dbg !328
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func106() #0 !dbg !329 {
+  ret void, !dbg !331
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func107() #0 !dbg !332 {
+  ret void, !dbg !334
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func108() #0 !dbg !335 {
+  ret void, !dbg !337
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func109() #0 !dbg !338 {
+  ret void, !dbg !340
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func110() #0 !dbg !341 {
+  ret void, !dbg !343
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func111() #0 !dbg !344 {
+  ret void, !dbg !346
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func112() #0 !dbg !347 {
+  ret void, !dbg !349
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func113() #0 !dbg !350 {
+  ret void, !dbg !352
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func114() #0 !dbg !353 {
+  ret void, !dbg !355
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func115() #0 !dbg !356 {
+  ret void, !dbg !358
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func116() #0 !dbg !359 {
+  ret void, !dbg !361
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func117() #0 !dbg !362 {
+  ret void, !dbg !364
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func118() #0 !dbg !365 {
+  ret void, !dbg !367
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func119() #0 !dbg !368 {
+  ret void, !dbg !370
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func120() #0 !dbg !371 {
+  ret void, !dbg !373
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func121() #0 !dbg !374 {
+  ret void, !dbg !376
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func122() #0 !dbg !377 {
+  ret void, !dbg !379
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func123() #0 !dbg !380 {
+  ret void, !dbg !382
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func124() #0 !dbg !383 {
+  ret void, !dbg !385
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func125() #0 !dbg !386 {
+  ret void, !dbg !388
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func126() #0 !dbg !389 {
+  ret void, !dbg !391
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func127() #0 !dbg !392 {
+  ret void, !dbg !394
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func128() #0 !dbg !395 {
+  ret void, !dbg !397
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func129() #0 !dbg !398 {
+  ret void, !dbg !400
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func130() #0 !dbg !401 {
+  ret void, !dbg !403
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func131() #0 !dbg !404 {
+  ret void, !dbg !406
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func132() #0 !dbg !407 {
+  ret void, !dbg !409
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func133() #0 !dbg !410 {
+  ret void, !dbg !412
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define void @func134() #0 !dbg !413 {
+  ret void, !dbg !415
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define v...
[truncated]

- Use `.test` as the extension. Remove `.ll` from test extensions.
- The output generated by Python is the same, except for the checksum
  fields, which are constant, but does not seem to matter.
Copy link
Contributor

@kyulee-com kyulee-com left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good to me!
Thanks for creating a tricky test that requires large files from a script.
While I'm approving this, I would also like to give @JDevlieghere, a chance to review.

Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we know the number of indices, we could always emit the most efficient encoding. That said, we're emitting this only once per line table, so I'm not particularly worried about the additional byte and simplifies the implementation. LGTM.

@drodriguez
Copy link
Contributor Author

Given that we know the number of indices, we could always emit the most efficient encoding. That said, we're emitting this only once per line table, so I'm not particularly worried about the additional byte and simplifies the implementation. LGTM.

I was striving for implementation simplicity. I started to think about a better implementation with different formats, but I did not see clearly how it would be better.

| Range         | data1    | data2     | udata     |
| 0-127         | 1 byte * | 2 bytes   | 1 byte *  |
| 128-255       | 1 byte * | 2 bytes   | 2 bytes   |
| 255-16383     | NA       | 2 bytes * | 2 bytes * |
| 16384-65535   | NA       | 2 bytes * | 3 bytes   |
| 65536-2097151 | NA       | NA        | 3 bytes * | 

One thing to notice is that one can use the range to choose a format, and that works for getting the best format for 0-16383, but for 16384-65536, one needs to calculate the distribution of indices to see if the udata encoding or the data2 encoding will be smaller.

@drodriguez drodriguez merged commit 12285cc into llvm:main Sep 24, 2024
8 checks passed
augusto2112 pushed a commit to augusto2112/llvm-project that referenced this pull request Sep 26, 2024
…#109067)

According to the standard `DW_LNCT_directory_index` can be `data1`,
`data2`, or `udata` (see 6.2.4.1). The code was using `data1`, but this
limits the number of directories to 256, even if the variable holding
the directory index is a `uint64_t`. `dsymutil` was hitting an assertion
when trying to write directory indices higher than 255.

Modify the classic and the parallel DWARF linkers to use `udata` and
encode the directory indices as ULEB128 and provide a test that has more
than 256 directories to check the changes are working as expected.

For people that were using `dsymutil` with CUs that had between 128-256
directories, this will mean that for those indices 2 bytes will be used
now, instead of just one.
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Oct 4, 2024
…#109067)

According to the standard `DW_LNCT_directory_index` can be `data1`,
`data2`, or `udata` (see 6.2.4.1). The code was using `data1`, but this
limits the number of directories to 256, even if the variable holding
the directory index is a `uint64_t`. `dsymutil` was hitting an assertion
when trying to write directory indices higher than 255.

Modify the classic and the parallel DWARF linkers to use `udata` and
encode the directory indices as ULEB128 and provide a test that has more
than 256 directories to check the changes are working as expected.

For people that were using `dsymutil` with CUs that had between 128-256
directories, this will mean that for those indices 2 bytes will be used
now, instead of just one.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants