Commit a0cbc9d

DeadArgumentElimination Pass (#1641)
This adds a pass to remove unnecessary call arguments in an LTO-like manner, that is:

 * If a parameter is not actually used in a function, we don't need to send anything, and can remove it from the function's declaration. Concretely,

    (func $a (param $x i32)
      ..no uses of $x..
    )
    (func $b
      (call $a (..))
    )

   =>

    (func $a
      ..no uses of $x..
    )
    (func $b
      (call $a)
    )

 * If a parameter is only ever sent the same constant value, we can just set that constant value in the function (which then means that the values sent from the outside are no longer used, as in the previous point). Concretely,

    (func $a (param $x i32)
      ..may use $x..
    )
    (func $b
      (call $a (i32.const 1))
      (call $a (i32.const 1))
    )

   =>

    (func $a
      (local $x i32)
      (set_local $x (i32.const 1))
      ..may use $x..
    )
    (func $b
      (call $a)
      (call $a)
    )

How much this helps depends on the codebase, but it can be quite useful: for example, it shrinks Unity by 0.72% and Mono by 0.37%. Note that those numbers include not just the optimization itself, but also the other optimizations it then enables. In particular, the second point above inlines a constant value, which often allows constant propagation, and removing parameters may enable more duplicate function elimination, etc. This explains how the pass can shrink Unity by almost 1%.

The implementation is pretty straightforward, but there is some work to make the heavy part of the pass parallel, and a bunch of corner cases to avoid (we can't change a function that is exported or in the table, etc.).

Like the Inlining pass, there is both a standard and an "optimizing" version of this pass. The latter also optimizes the functions it changes since, as with Inlining, it is useful not to have to re-run all function optimizations on the whole module.
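As a rough illustration of the second point in the commit message (a parameter that is only ever sent the same constant), the check can be sketched outside Binaryen on a toy model. Everything here is an assumption made for illustration: `Operand`, `CallSite` and `findConstantParams` are hypothetical names, and constants are plain `int`s rather than Binaryen `Literal`s.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical toy model: an operand is either a known integer constant
// or std::nullopt (meaning "not a constant expression").
using Operand = std::optional<int>;
using CallSite = std::vector<Operand>;

// For each parameter index, returns the constant that every call site
// passes, or std::nullopt if the calls disagree or pass a non-constant.
std::vector<Operand> findConstantParams(const std::vector<CallSite>& calls,
                                        std::size_t numParams) {
  std::vector<Operand> result(numParams);
  if (calls.empty()) {
    return result; // no calls seen, so nothing can be concluded
  }
  for (std::size_t i = 0; i < numParams; i++) {
    Operand value;
    bool same = true;
    for (const auto& call : calls) {
      assert(call.size() == numParams);
      const Operand& operand = call[i];
      // Give up on a non-constant, or on a constant that differs from
      // the one seen at an earlier call site.
      if (!operand || (value && *value != *operand)) {
        same = false;
        break;
      }
      value = operand;
    }
    if (same) {
      result[i] = value;
    }
  }
  return result;
}
```

With call sites `(1, <non-const>, 7)` and `(1, 5, 8)`, only parameter 0 qualifies. In the pass itself, a qualifying parameter is then set to the constant at the top of the function body and marked as unused.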
1 parent 8900ceb commit a0cbc9d

30 files changed, 1706 additions and 1738 deletions

build-js.sh

Lines changed: 1 addition & 0 deletions
@@ -90,6 +90,7 @@ echo "building shared bitcode"
   $BINARYEN_SRC/ir/LocalGraph.cpp \
   $BINARYEN_SRC/passes/pass.cpp \
   $BINARYEN_SRC/passes/CoalesceLocals.cpp \
+  $BINARYEN_SRC/passes/DeadArgumentElimination.cpp \
   $BINARYEN_SRC/passes/CodeFolding.cpp \
   $BINARYEN_SRC/passes/CodePushing.cpp \
   $BINARYEN_SRC/passes/ConstHoisting.cpp \

src/mixed_arena.h

Lines changed: 4 additions & 0 deletions
@@ -223,6 +223,10 @@ class ArenaVectorBase {
     usedElements -= size;
   }
 
+  void erase(Iterator it) {
+    erase(it, it + 1);
+  }
+
   void clear() {
     usedElements = 0;
   }

src/passes/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ SET(passes_SOURCES
   CodeFolding.cpp
   ConstHoisting.cpp
   DataFlowOpts.cpp
+  DeadArgumentElimination.cpp
   DeadCodeElimination.cpp
   DuplicateFunctionElimination.cpp
   ExtractFunction.cpp

src/passes/DeadArgumentElimination.cpp

Lines changed: 367 additions & 0 deletions
@@ -0,0 +1,367 @@
+/*
+ * Copyright 2018 WebAssembly Community Group participants
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+//
+// Optimizes call arguments in a whole-program manner, removing ones
+// that are not used (dead).
+//
+// Specifically, this does these things:
+//
+//  * Find functions for whom an argument is always passed the same
+//    constant. If so, we can just set that local to that constant
+//    in the function.
+//  * Find functions that don't use the value passed to an argument.
+//    If so, we can avoid even sending and receiving it. (Note how if
+//    the previous point was true for an argument, then the second
+//    must as well.)
+//
+// This pass does not depend on flattening, but it may be more effective,
+// as then call arguments never have side effects (which we need to
+// watch for here).
+//
+
+#include <unordered_map>
+#include <unordered_set>
+
+#include <wasm.h>
+#include <pass.h>
+#include <wasm-builder.h>
+#include <cfg/cfg-traversal.h>
+#include <ir/effects.h>
+#include <passes/opt-utils.h>
+#include <support/sorted_vector.h>
+
+namespace wasm {
+
+// Information for a function
+struct DAEFunctionInfo {
+  // The unused parameters, if any.
+  SortedVector unusedParams;
+  // Maps a function name to the calls going to it.
+  std::unordered_map<Name, std::vector<Call*>> calls;
+  // Whether the function can be called from places that
+  // affect what we can do. For now, any call we don't
+  // see inhibits our optimizations, but TODO: an export
+  // could be worked around by exporting a thunk that
+  // adds the parameter.
+  bool hasUnseenCalls = false;
+};
+
+typedef std::unordered_map<Name, DAEFunctionInfo> DAEFunctionInfoMap;
+
+// Information in a basic block
+struct DAEBlockInfo {
+  // A local may be read, written, or not accessed in this block.
+  // If it is both read and written, we just care about the first
+  // action (if it is read first, that's all the info we are
+  // looking for; if it is written first, it can't be read later).
+  enum LocalUse {
+    Read,
+    Written
+  };
+  std::unordered_map<Index, LocalUse> localUses;
+};
+
+struct DAEScanner : public WalkerPass<CFGWalker<DAEScanner, Visitor<DAEScanner>, DAEBlockInfo>> {
+  bool isFunctionParallel() override { return true; }
+
+  Pass* create() override { return new DAEScanner(infoMap); }
+
+  DAEScanner(DAEFunctionInfoMap* infoMap) : infoMap(infoMap) {}
+
+  DAEFunctionInfoMap* infoMap;
+  DAEFunctionInfo* info;
+
+  Index numParams;
+
+  // cfg traversal work
+
+  void visitGetLocal(GetLocal* curr) {
+    if (currBasicBlock) {
+      auto& localUses = currBasicBlock->contents.localUses;
+      auto index = curr->index;
+      if (localUses.count(index) == 0) {
+        localUses[index] = DAEBlockInfo::Read;
+      }
+    }
+  }
+
+  void visitSetLocal(SetLocal* curr) {
+    if (currBasicBlock) {
+      auto& localUses = currBasicBlock->contents.localUses;
+      auto index = curr->index;
+      if (localUses.count(index) == 0) {
+        localUses[index] = DAEBlockInfo::Written;
+      }
+    }
+  }
+
+  void visitCall(Call* curr) {
+    info->calls[curr->target].push_back(curr);
+  }
+
+  // main entry point
+
+  void doWalkFunction(Function* func) {
+    numParams = func->getNumParams();
+    info = &((*infoMap)[func->name]);
+    CFGWalker<DAEScanner, Visitor<DAEScanner>, DAEBlockInfo>::doWalkFunction(func);
+    // If there are relevant params, check if they are used. (If
+    // we can't optimize the function anyhow, there's no point.)
+    if (numParams > 0 && !info->hasUnseenCalls) {
+      findUnusedParams(func);
+    }
+  }
+
+  void findUnusedParams(Function* func) {
+    // Flow the incoming parameter values, see if they reach a read.
+    // Once we've seen a parameter at a block, we need never consider it there
+    // again.
+    std::unordered_map<BasicBlock*, SortedVector> seenBlockIndexes;
+    // Start with all the incoming parameters.
+    SortedVector initial;
+    for (Index i = 0; i < numParams; i++) {
+      initial.push_back(i);
+    }
+    // The used params, which we now compute.
+    std::unordered_set<Index> usedParams;
+    // An item of work is a block plus the values arriving there.
+    typedef std::pair<BasicBlock*, SortedVector> Item;
+    std::vector<Item> work;
+    work.emplace_back(entry, initial);
+    while (!work.empty()) {
+      auto item = std::move(work.back());
+      work.pop_back();
+      auto* block = item.first;
+      auto& indexes = item.second;
+      // Ignore things we've already seen, or we've already seen to be used.
+      auto& seenIndexes = seenBlockIndexes[block];
+      indexes.filter([&](const Index i) {
+        if (seenIndexes.has(i) || usedParams.count(i)) {
+          return false;
+        } else {
+          seenIndexes.insert(i);
+          return true;
+        }
+      });
+      if (indexes.empty()) {
+        continue; // nothing more to flow
+      }
+      auto& localUses = block->contents.localUses;
+      SortedVector remainingIndexes;
+      for (auto i : indexes) {
+        auto iter = localUses.find(i);
+        if (iter != localUses.end()) {
+          auto use = iter->second;
+          if (use == DAEBlockInfo::Read) {
+            usedParams.insert(i);
+          }
+          // Whether it was a read or a write, we can stop looking at that local here.
+        } else {
+          remainingIndexes.insert(i);
+        }
+      }
+      // If there are remaining indexes, flow them forward.
+      if (!remainingIndexes.empty()) {
+        for (auto* next : block->out) {
+          work.emplace_back(next, remainingIndexes);
+        }
+      }
+    }
+    // We can now compute the unused params.
+    for (Index i = 0; i < numParams; i++) {
+      if (usedParams.count(i) == 0) {
+        info->unusedParams.insert(i);
+      }
+    }
+  }
+};
+
+struct DAE : public Pass {
+  bool optimize = false;
+
+  void run(PassRunner* runner, Module* module) override {
+    DAEFunctionInfoMap infoMap;
+    // Ensure they all exist so the parallel threads don't modify the data structure.
+    for (auto& func : module->functions) {
+      infoMap[func->name];
+    }
+    // Check the influence of the table and exports.
+    for (auto& curr : module->exports) {
+      if (curr->kind == ExternalKind::Function) {
+        infoMap[curr->value].hasUnseenCalls = true;
+      }
+    }
+    for (auto& segment : module->table.segments) {
+      for (auto name : segment.data) {
+        infoMap[name].hasUnseenCalls = true;
+      }
+    }
+    // Scan all the functions.
+    {
+      PassRunner runner(module);
+      runner.setIsNested(true);
+      runner.add<DAEScanner>(&infoMap);
+      runner.run();
+    }
+    // Combine all the info.
+    std::unordered_map<Name, std::vector<Call*>> allCalls;
+    for (auto& pair : infoMap) {
+      auto& info = pair.second;
+      for (auto& pair : info.calls) {
+        auto name = pair.first;
+        auto& calls = pair.second;
+        auto& allCallsToName = allCalls[name];
+        allCallsToName.insert(allCallsToName.end(), calls.begin(), calls.end());
+      }
+    }
+    // We now have a mapping of all call sites for each function. Check which
+    // are always passed the same constant for a particular argument.
+    for (auto& pair : allCalls) {
+      auto name = pair.first;
+      // We can only optimize if we see all the calls and can modify
+      // them.
+      if (infoMap[name].hasUnseenCalls) continue;
+      auto& calls = pair.second;
+      auto* func = module->getFunction(name);
+      auto numParams = func->getNumParams();
+      for (Index i = 0; i < numParams; i++) {
+        Literal value;
+        for (auto* call : calls) {
+          assert(call->target == name);
+          assert(call->operands.size() == numParams);
+          auto* operand = call->operands[i];
+          if (auto* c = operand->dynCast<Const>()) {
+            if (value.type == none) {
+              // This is the first value seen.
+              value = c->value;
+            } else if (value != c->value) {
+              // Not identical, give up
+              value.type = none;
+              break;
+            }
+          } else {
+            // Not a constant, give up
+            value.type = none;
+            break;
+          }
+        }
+        if (value.type != none) {
+          // Success! We can just apply the constant in the function, which makes
+          // the parameter value unused, which lets us remove it later.
+          Builder builder(*module);
+          func->body = builder.makeSequence(
+            builder.makeSetLocal(i, builder.makeConst(value)),
+            func->body
+          );
+          // Mark it as unused, which we know it now is (no point to
+          // re-scan just for that).
+          infoMap[name].unusedParams.insert(i);
+        }
+      }
+    }
+    // Track which functions we changed, and optimize them later if necessary.
+    std::unordered_set<Function*> changed;
+    // We now know which parameters are unused, and can potentially remove them.
+    for (auto& pair : allCalls) {
+      auto name = pair.first;
+      auto& calls = pair.second;
+      auto* func = module->getFunction(name);
+      auto numParams = func->getNumParams();
+      if (numParams == 0) continue;
+      // Iterate downwards, as we may remove more than one.
+      Index i = numParams - 1;
+      while (1) {
+        if (infoMap[name].unusedParams.has(i)) {
+          // Great, it's not used. Check if none of the calls has a param with side
+          // effects, as that would prevent us removing them (flattening should
+          // have been done earlier).
+          bool canRemove = true;
+          for (auto* call : calls) {
+            auto* operand = call->operands[i];
+            if (EffectAnalyzer(runner->options, operand).hasSideEffects()) {
+              canRemove = false;
+              break;
+            }
+          }
+          if (canRemove) {
+            // Wonderful, nothing stands in our way! Do it.
+            // TODO: parallelize this?
+            removeParameter(func, i, calls);
+            changed.insert(func);
+          }
+        }
+        if (i == 0) break;
+        i--;
+      }
+    }
+    if (optimize && changed.size() > 0) {
+      OptUtils::optimizeAfterInlining(changed, module, runner);
+    }
+  }
+
+private:
+  void removeParameter(Function* func, Index i, std::vector<Call*> calls) {
+    // Clear the type, which is no longer accurate.
+    func->type = Name();
+    // It's cumbersome to adjust local names - TODO don't clear them?
+    Builder::clearLocalNames(func);
+    // Remove the parameter from the function. We must add a new local
+    // for uses of the parameter, but cannot make it use the same index
+    // (in general).
+    auto type = func->getLocalType(i);
+    func->params.erase(func->params.begin() + i);
+    Index newIndex = Builder::addVar(func, type);
+    // Update local operations.
+    struct LocalUpdater : public PostWalker<LocalUpdater> {
+      Index removedIndex;
+      Index newIndex;
+      LocalUpdater(Function* func, Index removedIndex, Index newIndex) : removedIndex(removedIndex), newIndex(newIndex) {
+        walk(func->body);
+      }
+      void visitGetLocal(GetLocal* curr) {
+        updateIndex(curr->index);
+      }
+      void visitSetLocal(SetLocal* curr) {
+        updateIndex(curr->index);
+      }
+      void updateIndex(Index& index) {
+        if (index == removedIndex) {
+          index = newIndex;
+        } else if (index > removedIndex) {
+          index--;
+        }
+      }
+    } localUpdater(func, i, newIndex);
+    // Remove the arguments from the calls.
+    for (auto* call : calls) {
+      call->operands.erase(call->operands.begin() + i);
+    }
+  }
+};
+
+Pass *createDAEPass() {
+  return new DAE();
+}
+
+Pass *createDAEOptimizingPass() {
+  auto* ret = new DAE();
+  ret->optimize = true;
+  return ret;
+}
+
+} // namespace wasm
+
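
The worklist flow in `findUnusedParams` may be easier to follow in isolation. Below is a minimal standalone sketch, assuming a toy CFG in place of Binaryen's `CFGWalker` blocks: `Block`, `Use` and this free-standing `findUnusedParams` are hypothetical stand-ins, and `std::set` replaces `SortedVector`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical toy CFG: per block, the first access to each local
// (absent means untouched) and the indexes of successor blocks.
enum class Use { Read, Written };
struct Block {
  std::map<std::size_t, Use> firstUse;
  std::vector<std::size_t> out;
};

// Returns the parameters whose incoming value never reaches a read.
std::set<std::size_t> findUnusedParams(const std::vector<Block>& blocks,
                                       std::size_t entry,
                                       std::size_t numParams) {
  assert(entry < blocks.size());
  std::set<std::size_t> used;
  // seen[b] holds the params that have already been flowed into block b.
  std::vector<std::set<std::size_t>> seen(blocks.size());
  std::set<std::size_t> initial;
  for (std::size_t i = 0; i < numParams; i++) initial.insert(i);
  // An item of work is a block plus the param values arriving there.
  std::vector<std::pair<std::size_t, std::set<std::size_t>>> work;
  work.emplace_back(entry, initial);
  while (!work.empty()) {
    auto item = std::move(work.back());
    work.pop_back();
    const Block& block = blocks[item.first];
    std::set<std::size_t> remaining;
    for (std::size_t i : item.second) {
      // Skip params already known to be used or already seen here.
      if (used.count(i) || !seen[item.first].insert(i).second) continue;
      auto it = block.firstUse.find(i);
      if (it == block.firstUse.end()) {
        remaining.insert(i); // untouched in this block, keep flowing
      } else if (it->second == Use::Read) {
        used.insert(i); // the incoming value is observed
      }
      // A write first kills the incoming value on this path.
    }
    if (!remaining.empty()) {
      for (std::size_t next : block.out) work.emplace_back(next, remaining);
    }
  }
  std::set<std::size_t> unused;
  for (std::size_t i = 0; i < numParams; i++) {
    if (!used.count(i)) unused.insert(i);
  }
  return unused;
}
```

The per-block `seen` sets are what guarantee termination on loops: each (block, parameter) pair is processed at most once, so a back edge just re-queues work that is filtered out immediately.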

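The index bookkeeping in `removeParameter` can likewise be sketched on a toy representation. This assumes a function body reduced to a flat list of local indexes; `remapLocal` and `removeParam` are hypothetical names that mirror `LocalUpdater::updateIndex`, not Binaryen API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: the new index of a local after the parameter at
// `removed` is erased and re-added as a fresh var at `newIndex`.
std::size_t remapLocal(std::size_t index, std::size_t removed,
                       std::size_t newIndex) {
  if (index == removed) return newIndex; // old param now lives in the new var
  if (index > removed) return index - 1; // everything above shifts down
  return index;                          // lower indexes are unaffected
}

// Applies the remapping to every local index referenced in a toy function
// body that has `numLocals` locals in total (params followed by vars).
void removeParam(std::vector<std::size_t>& localRefs, std::size_t numLocals,
                 std::size_t removed) {
  assert(removed < numLocals);
  // Erasing one param and appending one var keeps the total count the
  // same, so the appended var takes the last index.
  const std::size_t newIndex = numLocals - 1;
  for (auto& ref : localRefs) {
    ref = remapLocal(ref, removed, newIndex);
  }
}
```

For a function with four locals, removing parameter 1 turns references `0, 1, 2, 3` into `0, 3, 1, 2`. This is also why the pass walks parameter indexes downwards: removing a higher index leaves the lower, not-yet-processed indexes unchanged.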