Skip to content

Commit 3f6fd58

Browse files
authored
DWARF debug line updating (#2545)
With this, we can update DWARF debug line info properly as we write a new binary. To do that we track binary locations as we write. Each instruction is mapped to the location it is written to. We must also adjust them as we move code around because of LEB optimization (we emit a function or a section with a 5-byte LEB placeholder, the maximal size; later we shrink it which is almost always possible). writeDWARFSections() now takes a second param, the new locations of instructions. It then maps debug line info from the original offsets in the binary to the new offsets in the binary being written. The core logic for updating the debug line section is in wasm-debug.cpp. It basically tracks state machine logic both to read the existing debug lines and to emit the new ones. I couldn't find a way to reuse LLVM code for this, but reading LLVM's code was very useful here. A final tricky thing we need to do is to update the DWARF section's internal size annotation. The LLVM YAML writing code doesn't do that for us. Luckily it's pretty easy, in fixEmittedSection we just update the first 4 bytes in place to have the section size, after we've emitted it and know the size. This ignores debug lines with a 0 in the line, col, or addr, see WebAssembly/debugging#9 (comment) This ignores debug line offsets into the middle of instructions, which LLVM sometimes emits for some reason, see WebAssembly/debugging#9 (comment) Handling that would likely at least double our memory usage, which is unfortunate - we are run in an LTO manner, where the entire app's DWARF is present, and it may be massive. I think we should see if such odd offsets are a bug in LLVM, and if we can fix or prevent that. This does not emit "special" opcodes for debug lines. Those are purely an optimization, which I wanted to leave for later. (Even without them we decrease the size quite a lot, btw, as many lines have 0s in them...) This adds some testing that shows we can load and save fib2.c and fannkuch.cpp properly. The latter includes more than one function and has nontrivial code. To actually emit correct offsets a few minor fixes are done here: * Fix the code section location tracking during reading - the correct offset we care about is the body of the code section, not including the section declaration and size. * Fix wasm-stack debug line emitting. We need to update in BinaryInstWriter::visit(), that is, right before writing bytes for the instruction. That differs from * BinaryenIRWriter::visit which is a recursive function that also calls the children - so the offset there would be of the first child. For some reason that is correct with source maps, I don't understand why, but it's wrong for DWARF... * Print code section offsets in hex, to match other tools. Remove DWARFUpdate pass, which was useful for testing temporarily, but doesn't make sense now (it just updates without writing a binary). cc @yurydelendik
1 parent c97d6e4 commit 3f6fd58

22 files changed

+6946
-173
lines changed

src/passes/DWARF.cpp

Lines changed: 0 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -35,14 +35,6 @@ struct DWARFDump : public Pass {
3535
}
3636
};
3737

38-
struct DWARFUpdate : public Pass {
39-
void run(PassRunner* runner, Module* module) override {
40-
Debug::writeDWARFSections(*module);
41-
}
42-
};
43-
4438
Pass* createDWARFDumpPass() { return new DWARFDump(); }
4539

46-
Pass* createDWARFUpdatePass() { return new DWARFUpdate(); }
47-
4840
} // namespace wasm

src/passes/Print.cpp

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1427,7 +1427,8 @@ struct PrintSExpression : public OverriddenVisitor<PrintSExpression> {
14271427
auto iter = currFunction->binaryLocations.find(curr);
14281428
if (iter != currFunction->binaryLocations.end()) {
14291429
Colors::grey(o);
1430-
o << ";; code offset: 0x" << iter->second << '\n';
1430+
o << ";; code offset: 0x" << std::hex << iter->second << std::dec
1431+
<< '\n';
14311432
restoreNormalColor(o);
14321433
doIndent(o, indent);
14331434
}

src/passes/RoundTrip.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -57,7 +57,7 @@ struct RoundTrip : public Pass {
5757
// Read
5858
ModuleUtils::clearModule(*module);
5959
ModuleReader reader;
60-
// TODO: enable debug info when relevant
60+
reader.setDWARF(runner->options.debugInfo);
6161
reader.read(tempName, *module);
6262
// Clean up
6363
std::remove(tempName.c_str());

src/passes/pass.cpp

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -110,8 +110,6 @@ void PassRegistry::registerPasses() {
110110
registerPass("dwarfdump",
111111
"dump DWARF debug info sections from the read binary",
112112
createDWARFDumpPass);
113-
registerPass(
114-
"dwarfupdate", "update DWARF debug info sections", createDWARFUpdatePass);
115113
registerPass("duplicate-import-elimination",
116114
"removes duplicate imports",
117115
createDuplicateImportEliminationPass);

src/passes/passes.h

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -36,7 +36,6 @@ Pass* createDataFlowOptsPass();
3636
Pass* createDeadCodeEliminationPass();
3737
Pass* createDirectizePass();
3838
Pass* createDWARFDumpPass();
39-
Pass* createDWARFUpdatePass();
4039
Pass* createDuplicateImportEliminationPass();
4140
Pass* createDuplicateFunctionEliminationPass();
4241
Pass* createEmitTargetFeaturesPass();

src/wasm-binary.h

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1044,6 +1044,19 @@ class WasmBinaryWriter {
10441044

10451045
std::unique_ptr<ImportInfo> importInfo;
10461046

1047+
// General debugging info: map every instruction to its original position in
1048+
// the binary, relative to the beginning of the code section. This is similar
1049+
// to binaryLocations on Function objects, which are filled as we load the
1050+
// functions from the binary. Here we track them as we write, and then
1051+
// the combination of the two can be used to update DWARF info for the new
1052+
// locations of things.
1053+
BinaryLocationsMap binaryLocations;
1054+
size_t binaryLocationsSizeAtSectionStart;
1055+
// Track the expressions that we added for the current function being
1056+
// written, so that we can update those specific binary locations when
1057+
// the function is written out.
1058+
std::vector<Expression*> binaryLocationTrackedExpressionsForFunc;
1059+
10471060
void prepare();
10481061
};
10491062

src/wasm-debug.h

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -37,7 +37,7 @@ bool hasDWARFSections(const Module& wasm);
3737
void dumpDWARF(const Module& wasm);
3838

3939
// Update the DWARF sections.
40-
void writeDWARFSections(Module& wasm);
40+
void writeDWARFSections(Module& wasm, const BinaryLocationsMap& newLocations);
4141

4242
} // namespace Debug
4343

src/wasm-stack.h

Lines changed: 14 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -83,8 +83,16 @@ class BinaryInstWriter : public OverriddenVisitor<BinaryInstWriter> {
8383
public:
8484
BinaryInstWriter(WasmBinaryWriter& parent,
8585
BufferWithRandomAccess& o,
86-
Function* func)
87-
: parent(parent), o(o), func(func) {}
86+
Function* func,
87+
bool sourceMap)
88+
: parent(parent), o(o), func(func), sourceMap(sourceMap) {}
89+
90+
void visit(Expression* curr) {
91+
if (func && !sourceMap) {
92+
parent.writeDebugLocation(curr, func);
93+
}
94+
OverriddenVisitor<BinaryInstWriter>::visit(curr);
95+
}
8896

8997
void visitBlock(Block* curr);
9098
void visitIf(If* curr);
@@ -144,6 +152,8 @@ class BinaryInstWriter : public OverriddenVisitor<BinaryInstWriter> {
144152
WasmBinaryWriter& parent;
145153
BufferWithRandomAccess& o;
146154
Function* func = nullptr;
155+
bool sourceMap;
156+
147157
std::vector<Name> breakStack;
148158

149159
// type => number of locals of that type in the compact form
@@ -758,7 +768,7 @@ class BinaryenIRToBinaryWriter
758768
Function* func = nullptr,
759769
bool sourceMap = false)
760770
: BinaryenIRWriter<BinaryenIRToBinaryWriter>(func), parent(parent),
761-
writer(parent, o, func), sourceMap(sourceMap) {}
771+
writer(parent, o, func, sourceMap), sourceMap(sourceMap) {}
762772

763773
void visit(Expression* curr) {
764774
BinaryenIRWriter<BinaryenIRToBinaryWriter>::visit(curr);
@@ -833,7 +843,7 @@ class StackIRToBinaryWriter {
833843
StackIRToBinaryWriter(WasmBinaryWriter& parent,
834844
BufferWithRandomAccess& o,
835845
Function* func)
836-
: writer(parent, o, func), func(func) {}
846+
: writer(parent, o, func, false /* sourceMap */), func(func) {}
837847

838848
void write();
839849

src/wasm.h

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1129,7 +1129,10 @@ struct Importable {
11291129
// Stack IR is a secondary IR to the main IR defined in this file (Binaryen
11301130
// IR). See wasm-stack.h.
11311131
class StackInst;
1132-
typedef std::vector<StackInst*> StackIR;
1132+
1133+
using StackIR = std::vector<StackInst*>;
1134+
1135+
using BinaryLocationsMap = std::unordered_map<Expression*, uint32_t>;
11331136

11341137
class Function : public Importable {
11351138
public:
@@ -1178,7 +1181,7 @@ class Function : public Importable {
11781181

11791182
// General debugging info: map every instruction to its original position in
11801183
// the binary, relative to the beginning of the code section.
1181-
std::unordered_map<Expression*, uint32_t> binaryLocations;
1184+
BinaryLocationsMap binaryLocations;
11821185

11831186
size_t getNumParams();
11841187
size_t getNumVars();

src/wasm/wasm-binary.cpp

Lines changed: 65 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -72,6 +72,14 @@ void WasmBinaryWriter::write() {
7272
writeSourceMapEpilog();
7373
}
7474

75+
#ifdef BUILD_LLVM_DWARF
76+
// Update DWARF user sections after writing the data referred to by them
77+
// (function bodies), and before writing the user sections themselves.
78+
if (Debug::hasDWARFSections(*wasm)) {
79+
Debug::writeDWARFSections(*wasm, binaryLocations);
80+
}
81+
#endif
82+
7583
writeLateUserSections();
7684
writeFeaturesSection();
7785

@@ -109,29 +117,52 @@ template<typename T> int32_t WasmBinaryWriter::startSection(T code) {
109117
if (sourceMap) {
110118
sourceMapLocationsSizeAtSectionStart = sourceMapLocations.size();
111119
}
120+
binaryLocationsSizeAtSectionStart = binaryLocations.size();
112121
return writeU32LEBPlaceholder(); // section size to be filled in later
113122
}
114123

115124
void WasmBinaryWriter::finishSection(int32_t start) {
116125
// section size does not include the reserved bytes of the size field itself
117126
int32_t size = o.size() - start - MaxLEB32Bytes;
118127
auto sizeFieldSize = o.writeAt(start, U32LEB(size));
119-
if (sizeFieldSize != MaxLEB32Bytes) {
128+
// We can move things back if the actual LEB for the size doesn't use the
129+
// maximum 5 bytes. In that case we need to adjust offsets after we move
130+
// things backwards.
131+
auto adjustmentForLEBShrinking = MaxLEB32Bytes - sizeFieldSize;
132+
if (adjustmentForLEBShrinking) {
120133
// we can save some room, nice
121134
assert(sizeFieldSize < MaxLEB32Bytes);
122135
std::move(&o[start] + MaxLEB32Bytes,
123136
&o[start] + MaxLEB32Bytes + size,
124137
&o[start] + sizeFieldSize);
125-
auto adjustment = MaxLEB32Bytes - sizeFieldSize;
126-
o.resize(o.size() - adjustment);
138+
o.resize(o.size() - adjustmentForLEBShrinking);
127139
if (sourceMap) {
128140
for (auto i = sourceMapLocationsSizeAtSectionStart;
129141
i < sourceMapLocations.size();
130142
++i) {
131-
sourceMapLocations[i].first -= adjustment;
143+
sourceMapLocations[i].first -= adjustmentForLEBShrinking;
132144
}
133145
}
134146
}
147+
148+
if (binaryLocationsSizeAtSectionStart != binaryLocations.size()) {
149+
// We added the binary locations, adjust them: they must be relative
150+
// to the code section.
151+
assert(binaryLocationsSizeAtSectionStart == 0);
152+
// The section type byte is right before the LEB for the size; we want
153+
// offsets that are relative to the body, which is after that section type
154+
// byte and the the size LEB.
155+
auto body = start + sizeFieldSize;
156+
for (auto& pair : binaryLocations) {
157+
// Offsets are relative to the body of the code section: after the
158+
// section type byte and the size.
159+
// Everything was moved by the adjustment, track that. After this,
160+
// we are at the right absolute address.
161+
pair.second -= adjustmentForLEBShrinking;
162+
// We are relative to the section start.
163+
pair.second -= body;
164+
}
165+
}
135166
}
136167

137168
int32_t
@@ -266,6 +297,7 @@ void WasmBinaryWriter::writeFunctions() {
266297
auto start = startSection(BinaryConsts::Section::Code);
267298
o << U32LEB(importInfo->getNumDefinedFunctions());
268299
ModuleUtils::iterDefinedFunctions(*wasm, [&](Function* func) {
300+
assert(binaryLocationTrackedExpressionsForFunc.empty());
269301
size_t sourceMapLocationsSizeAtFunctionStart = sourceMapLocations.size();
270302
BYN_TRACE("write one at" << o.size() << std::endl);
271303
size_t sizePos = writeU32LEBPlaceholder();
@@ -284,22 +316,31 @@ void WasmBinaryWriter::writeFunctions() {
284316
BYN_TRACE("body size: " << size << ", writing at " << sizePos
285317
<< ", next starts at " << o.size() << "\n");
286318
auto sizeFieldSize = o.writeAt(sizePos, U32LEB(size));
287-
if (sizeFieldSize != MaxLEB32Bytes) {
319+
// We can move things back if the actual LEB for the size doesn't use the
320+
// maximum 5 bytes. In that case we need to adjust offsets after we move
321+
// things backwards.
322+
auto adjustmentForLEBShrinking = MaxLEB32Bytes - sizeFieldSize;
323+
if (adjustmentForLEBShrinking) {
288324
// we can save some room, nice
289325
assert(sizeFieldSize < MaxLEB32Bytes);
290326
std::move(&o[start], &o[start] + size, &o[sizePos] + sizeFieldSize);
291-
auto adjustment = MaxLEB32Bytes - sizeFieldSize;
292-
o.resize(o.size() - adjustment);
327+
o.resize(o.size() - adjustmentForLEBShrinking);
293328
if (sourceMap) {
294329
for (auto i = sourceMapLocationsSizeAtFunctionStart;
295330
i < sourceMapLocations.size();
296331
++i) {
297-
sourceMapLocations[i].first -= adjustment;
332+
sourceMapLocations[i].first -= adjustmentForLEBShrinking;
298333
}
299334
}
335+
for (auto* curr : binaryLocationTrackedExpressionsForFunc) {
336+
// We added the binary locations, adjust them: they must be relative
337+
// to the code section.
338+
binaryLocations[curr] -= adjustmentForLEBShrinking;
339+
}
300340
}
301341
tableOfContents.functionBodies.emplace_back(
302342
func->name, sizePos + sizeFieldSize, size);
343+
binaryLocationTrackedExpressionsForFunc.clear();
303344
});
304345
finishSection(start);
305346
}
@@ -649,10 +690,19 @@ void WasmBinaryWriter::writeDebugLocation(const Function::DebugLocation& loc) {
649690
}
650691

651692
void WasmBinaryWriter::writeDebugLocation(Expression* curr, Function* func) {
652-
auto& debugLocations = func->debugLocations;
653-
auto iter = debugLocations.find(curr);
654-
if (iter != debugLocations.end()) {
655-
writeDebugLocation(iter->second);
693+
if (sourceMap) {
694+
auto& debugLocations = func->debugLocations;
695+
auto iter = debugLocations.find(curr);
696+
if (iter != debugLocations.end()) {
697+
writeDebugLocation(iter->second);
698+
}
699+
}
700+
// TODO: remove source map debugging support and refactor this method
701+
// to something that directly thinks about DWARF, instead of indirectly
702+
// looking at func->binaryLocations as a proxy for that etc.
703+
if (func && !func->binaryLocations.empty()) {
704+
binaryLocations[curr] = o.size();
705+
binaryLocationTrackedExpressionsForFunc.push_back(curr);
656706
}
657707
}
658708

@@ -809,6 +859,9 @@ void WasmBinaryBuilder::read() {
809859
readFunctionSignatures();
810860
break;
811861
case BinaryConsts::Section::Code:
862+
if (DWARF) {
863+
codeSectionLocation = pos;
864+
}
812865
readFunctions();
813866
break;
814867
case BinaryConsts::Section::Export:
@@ -1288,9 +1341,6 @@ void WasmBinaryBuilder::readFunctionSignatures() {
12881341

12891342
void WasmBinaryBuilder::readFunctions() {
12901343
BYN_TRACE("== readFunctions\n");
1291-
if (DWARF) {
1292-
codeSectionLocation = pos;
1293-
}
12941344
size_t total = getU32LEB();
12951345
if (total != functionSignatures.size()) {
12961346
throwError("invalid function section size, must equal types");

0 commit comments

Comments
 (0)