Skip to content

Commit e8adbdd

Browse files
authored
Source maps: Fix locations without debug info between ones that do (#5906)
Previously we did nothing for instructions without debug info. So if we had one that did, followed by one that didn't, the one that didn't could get "smeared" with the debug info of the first. Source map locations are the start of segments, apparently, and so if we say a location has info then all others after it will as well, until the next segment. To fix that, add support for source map entries with just one field, the binary location. Such entries have no debug info (no file:line:col), and though the source maps spec is not very clear on this, this seems like the right way to prevent this problem: to stop a segment with debug info by starting a new one without, when we know we don't want that info any more. That is, before this PR we could have this: ;; file.cpp:10:1 (nop) ;; binary offset 5 in wasm (nop) ;; binary offset 6 in wasm and the second nop would get the same debug annotation, since we just have one source map segment, [5, file.cpp, 10, 1] // start at offset 5 in wasm With this PR, we emit: [5, file.cpp, 10, 1] // start at offset 5 in wasm; file.cpp:10:1 [6] // start at offset 6 in wasm; no debug info This does add 7% to source map sizes, however, since we add those 1-length segments now, but that seems unavoidable to fix this bug. To implement this, add a new field that says if the next location in the source map has debug info or not, and use that.
1 parent 2cdc61f commit e8adbdd

File tree

3 files changed

+197
-14
lines changed

3 files changed

+197
-14
lines changed

src/wasm-binary.h

Lines changed: 20 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1490,8 +1490,11 @@ class WasmBinaryWriter {
14901490

14911491
MixedArena allocator;
14921492

1493-
// storage of source map locations until the section is placed at its final
1494-
// location (shrinking LEBs may cause changes there)
1493+
// Storage of source map locations until the section is placed at its final
1494+
// location (shrinking LEBs may cause changes there).
1495+
//
1496+
// A null DebugLocation* indicates we have no debug information for that
1497+
// location.
14951498
std::vector<std::pair<size_t, const Function::DebugLocation*>>
14961499
sourceMapLocations;
14971500
size_t sourceMapLocationsSizeAtSectionStart;
@@ -1522,13 +1525,26 @@ class WasmBinaryReader {
15221525
Module& wasm;
15231526
MixedArena& allocator;
15241527
const std::vector<char>& input;
1528+
15251529
std::istream* sourceMap;
1530+
15261531
struct NextDebugLocation {
15271532
uint32_t availablePos;
15281533
uint32_t previousPos;
15291534
Function::DebugLocation next;
1530-
};
1531-
NextDebugLocation nextDebugLocation;
1535+
} nextDebugLocation;
1536+
1537+
// Whether debug info is present next or not in the next debug location. A
1538+
// debug location can contain debug info (file:line:col) or it might not. We
1539+
// need to track this boolean alongside |nextDebugLocation| - that is, we
1540+
// can't just do something like std::optional<DebugLocation> or such - as we
1541+
// still need to track the values in |next|, as later positions are relative
1542+
// to them. That is, if we have line number 100, then no debug info, and then
1543+
// line number 500, then when we get to 500 we will see "+400" which is
1544+
// relative to the last existing line number (we "skip" over the place without
1545+
// debug info).
1546+
bool nextDebugLocationHasDebugInfo;
1547+
15321548
bool debugInfo = true;
15331549
bool DWARF = false;
15341550
bool skipFunctionBodies = false;

src/wasm/wasm-binary.cpp

Lines changed: 55 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -1186,12 +1186,16 @@ void WasmBinaryWriter::writeSourceMapEpilog() {
11861186
*sourceMap << ",";
11871187
}
11881188
writeBase64VLQ(*sourceMap, int32_t(offset - lastOffset));
1189-
writeBase64VLQ(*sourceMap, int32_t(loc->fileIndex - lastLoc.fileIndex));
1190-
writeBase64VLQ(*sourceMap, int32_t(loc->lineNumber - lastLoc.lineNumber));
1191-
writeBase64VLQ(*sourceMap,
1192-
int32_t(loc->columnNumber - lastLoc.columnNumber));
1193-
lastLoc = *loc;
11941189
lastOffset = offset;
1190+
if (loc) {
1191+
// There is debug information for this location, so emit the next 3
1192+
// fields and update lastLoc.
1193+
writeBase64VLQ(*sourceMap, int32_t(loc->fileIndex - lastLoc.fileIndex));
1194+
writeBase64VLQ(*sourceMap, int32_t(loc->lineNumber - lastLoc.lineNumber));
1195+
writeBase64VLQ(*sourceMap,
1196+
int32_t(loc->columnNumber - lastLoc.columnNumber));
1197+
lastLoc = *loc;
1198+
}
11951199
}
11961200
*sourceMap << "\"}";
11971201
}
@@ -1340,7 +1344,28 @@ void WasmBinaryWriter::writeDebugLocation(Expression* curr, Function* func) {
13401344
auto& debugLocations = func->debugLocations;
13411345
auto iter = debugLocations.find(curr);
13421346
if (iter != debugLocations.end()) {
1347+
// There is debug information here, write it out.
13431348
writeDebugLocation(iter->second);
1349+
} else {
1350+
// This expression has no debug location. We need to emit an indication
1351+
// of that (so that we do not get "smeared" with debug info from anything
1352+
// before or after us).
1353+
//
1354+
// We don't need to write repeated "no debug info" indications, as a
1355+
// single one is enough to make it clear that the debug information before
1356+
// us is valid no longer. We also don't need to write one if there is
1357+
// nothing before us.
1358+
if (!sourceMapLocations.empty() &&
1359+
sourceMapLocations.back().second != nullptr) {
1360+
sourceMapLocations.emplace_back(o.size(), nullptr);
1361+
1362+
// Initialize the state of debug info to indicate there is no current
1363+
// debug info relevant. This sets |lastDebugLocation| to a dummy value,
1364+
// so that later places with debug info can see that they differ from
1365+
// it (without this, if we had some debug info, then a nullptr for none,
1366+
// and then the same debug info, we could get confused).
1367+
initializeDebugInfo();
1368+
}
13441369
}
13451370
}
13461371
// If this is an instruction in a function, and if the original wasm had
@@ -1614,7 +1639,8 @@ WasmBinaryReader::WasmBinaryReader(Module& wasm,
16141639
FeatureSet features,
16151640
const std::vector<char>& input)
16161641
: wasm(wasm), allocator(wasm.allocator), input(input),
1617-
sourceMap(nullptr), nextDebugLocation{0, 0, {0, 0, 0}}, debugLocation() {
1642+
sourceMap(nullptr), nextDebugLocation{0, 0, {0, 0, 0}},
1643+
nextDebugLocationHasDebugInfo(false), debugLocation() {
16181644
wasm.features = features;
16191645
}
16201646

@@ -2776,13 +2802,18 @@ void WasmBinaryReader::readSourceMapHeader() {
27762802
return;
27772803
}
27782804
// read first debug location
2805+
// TODO: Handle the case where the very first one has only a position but not
2806+
// debug info. In practice that does not happen, which needs
2807+
// investigation (if it does, it will assert in readBase64VLQ, so it
2808+
// would not be a silent error at least).
27792809
uint32_t position = readBase64VLQ(*sourceMap);
27802810
uint32_t fileIndex = readBase64VLQ(*sourceMap);
27812811
uint32_t lineNumber =
27822812
readBase64VLQ(*sourceMap) + 1; // adjust zero-based line number
27832813
uint32_t columnNumber = readBase64VLQ(*sourceMap);
27842814
nextDebugLocation = {
27852815
position, position, {fileIndex, lineNumber, columnNumber}};
2816+
nextDebugLocationHasDebugInfo = true;
27862817
}
27872818

27882819
void WasmBinaryReader::readNextDebugLocation() {
@@ -2803,7 +2834,11 @@ void WasmBinaryReader::readNextDebugLocation() {
28032834
debugLocation.clear();
28042835
// use debugLocation only for function expressions
28052836
if (currFunction) {
2806-
debugLocation.insert(nextDebugLocation.next);
2837+
if (nextDebugLocationHasDebugInfo) {
2838+
debugLocation.insert(nextDebugLocation.next);
2839+
} else {
2840+
debugLocation.clear();
2841+
}
28072842
}
28082843

28092844
char ch;
@@ -2818,6 +2853,17 @@ void WasmBinaryReader::readNextDebugLocation() {
28182853

28192854
int32_t positionDelta = readBase64VLQ(*sourceMap);
28202855
uint32_t position = nextDebugLocation.availablePos + positionDelta;
2856+
2857+
nextDebugLocation.previousPos = nextDebugLocation.availablePos;
2858+
nextDebugLocation.availablePos = position;
2859+
2860+
auto peek = sourceMap->peek();
2861+
if (peek == ',' || peek == '\"') {
2862+
// This is a 1-length entry, so the next location has no debug info.
2863+
nextDebugLocationHasDebugInfo = false;
2864+
break;
2865+
}
2866+
28212867
int32_t fileIndexDelta = readBase64VLQ(*sourceMap);
28222868
uint32_t fileIndex = nextDebugLocation.next.fileIndex + fileIndexDelta;
28232869
int32_t lineNumberDelta = readBase64VLQ(*sourceMap);
@@ -2826,9 +2872,8 @@ void WasmBinaryReader::readNextDebugLocation() {
28262872
uint32_t columnNumber =
28272873
nextDebugLocation.next.columnNumber + columnNumberDelta;
28282874

2829-
nextDebugLocation = {position,
2830-
nextDebugLocation.availablePos,
2831-
{fileIndex, lineNumber, columnNumber}};
2875+
nextDebugLocation.next = {fileIndex, lineNumber, columnNumber};
2876+
nextDebugLocationHasDebugInfo = true;
28322877
}
28332878
}
28342879

Lines changed: 122 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,122 @@
1+
;; NOTE: Assertions have been generated by update_lit_checks.py and should not be edited.
2+
3+
;; RUN: wasm-opt %s -g -o %s.wasm -osm %s.wasm.map
4+
;; RUN: wasm-opt %s.wasm -ism %s.wasm.map -S -o - | filecheck %s
5+
6+
;; Verify that writing to a source map and reading it back does not "smear"
7+
;; debug info across adjacent instructions. The debug info in the output should
8+
;; be identical to the input.
9+
10+
(module
11+
;; CHECK: (func $test (param $0 i32) (result i32)
12+
;; CHECK-NEXT: (drop
13+
;; CHECK-NEXT: (call $test
14+
;; CHECK-NEXT: ;;@ waka:100:1
15+
;; CHECK-NEXT: (i32.const 1)
16+
;; CHECK-NEXT: )
17+
;; CHECK-NEXT: )
18+
;; CHECK-NEXT: ;;@ waka:200:2
19+
;; CHECK-NEXT: (i32.const 2)
20+
;; CHECK-NEXT: )
21+
(func $test (param i32) (result i32)
22+
;; The drop&call have no debug info, and should remain so. Specifically the
23+
;; instruction right before them in the binary (the const 1) should not
24+
;; smear its debug info on it. And the drop is between an instruction that
25+
;; has debug info (the const 1) and another (the i32.const 2): we should not
26+
;; receive the debug info of either. (This is a regression test for a bug
27+
;; that only happens in that state: removing the debug info either before or
28+
;; after would avoid that bug.)
29+
30+
(drop
31+
(call $test
32+
;;@ waka:100:1
33+
(i32.const 1)
34+
)
35+
)
36+
;;@ waka:200:2
37+
(i32.const 2)
38+
)
39+
40+
;; CHECK: (func $same-later (param $0 i32) (result i32)
41+
;; CHECK-NEXT: (drop
42+
;; CHECK-NEXT: (call $test
43+
;; CHECK-NEXT: ;;@ waka:100:1
44+
;; CHECK-NEXT: (i32.const 1)
45+
;; CHECK-NEXT: )
46+
;; CHECK-NEXT: )
47+
;; CHECK-NEXT: ;;@ waka:100:1
48+
;; CHECK-NEXT: (i32.const 2)
49+
;; CHECK-NEXT: )
50+
(func $same-later (param i32) (result i32)
51+
;; As the first, but now the later debug info is also 100:1. No debug info
52+
;; should change here.
53+
54+
(drop
55+
(call $test
56+
;;@ waka:100:1
57+
(i32.const 1)
58+
)
59+
)
60+
;;@ waka:100:1
61+
(i32.const 2)
62+
)
63+
64+
;; CHECK: (func $more-before (param $0 i32) (result i32)
65+
;; CHECK-NEXT: ;;@ waka:50:5
66+
;; CHECK-NEXT: (nop)
67+
;; CHECK-NEXT: ;;@ waka:50:5
68+
;; CHECK-NEXT: (drop
69+
;; CHECK-NEXT: (call $test
70+
;; CHECK-NEXT: ;;@ waka:100:1
71+
;; CHECK-NEXT: (i32.const 1)
72+
;; CHECK-NEXT: )
73+
;; CHECK-NEXT: )
74+
;; CHECK-NEXT: ;;@ waka:200:2
75+
;; CHECK-NEXT: (i32.const 2)
76+
;; CHECK-NEXT: )
77+
(func $more-before (param i32) (result i32)
78+
;; As the first, but now there is more debug info before the 100:1 (the
79+
;; very first debug info in a function has special handling, so we test it
80+
;; more carefully).
81+
;;
82+
;; The s-parser actually smears 50:5 on the drop and call after it, so the
83+
;; output here looks incorrect. This may be a bug there, TODO
84+
85+
;;@ waka:50:5
86+
(nop)
87+
(drop
88+
(call $test
89+
;;@ waka:100:1
90+
(i32.const 1)
91+
)
92+
)
93+
;;@ waka:200:2
94+
(i32.const 2)
95+
)
96+
97+
;; CHECK: (func $nothing-before (param $0 i32) (result i32)
98+
;; CHECK-NEXT: (nop)
99+
;; CHECK-NEXT: (drop
100+
;; CHECK-NEXT: (call $test
101+
;; CHECK-NEXT: ;;@ waka:100:1
102+
;; CHECK-NEXT: (i32.const 1)
103+
;; CHECK-NEXT: )
104+
;; CHECK-NEXT: )
105+
;; CHECK-NEXT: ;;@ waka:200:2
106+
;; CHECK-NEXT: (i32.const 2)
107+
;; CHECK-NEXT: )
108+
(func $nothing-before (param i32) (result i32)
109+
;; As before, but no debug info on the nop before us (so the first
110+
;; instruction in the function no longer has a debug annotation). Nothing
111+
;; should change in the debug info.
112+
(nop)
113+
(drop
114+
(call $test
115+
;;@ waka:100:1
116+
(i32.const 1)
117+
)
118+
)
119+
;;@ waka:200:2
120+
(i32.const 2)
121+
)
122+
)

0 commit comments

Comments
 (0)