Skip to content

[Strings] Automate string lifting and lowering in the optimization pipeline #7540

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 33 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,10 @@ full changeset diff at the end of each section.
Current Trunk
-------------

- `-O2` and above will now automatically lift and lower strings, if the string
feature is enabled. That is, `--enable-strings` will now optimize imported
JS strings (by lifting at the start into stringref, which we can optimize,
and lowering at the end back to imported strings).
- Add a `--string-lifting` pass that raises imported string operations and
constants into stringref in Binaryen IR (which can then be fully optimized,
and typically lowered back down with `--string-lowering`).
Expand Down
6 changes: 5 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -175,7 +175,11 @@ There are a few differences between Binaryen IR and the WebAssembly language:
output has the right type in the wasm spec. That may cause a few bytes of
extra size in rare cases (we avoid this overhead in the common case where
the `br_if` value is unused).
* Strings
* Strings: Binaryen has support for something close to the stringref proposal,
mainly for internal optimization purposes. When the strings feature is
enabled, we "lift" JS string imports into string instructions, which are then
optimized, and after optimizations we lower back into JS string imports. (If
you do not want the lowering, use `--no-string-lowering`.)
* Binaryen allows string views (`stringview_wtf16` etc.) to be cast using
`ref.cast`. This simplifies the IR, as it allows `ref.cast` to always be
used in all places (and it is lowered to `ref.as_non_null` where possible
Expand Down
6 changes: 5 additions & 1 deletion scripts/test/fuzzing.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,10 +19,14 @@
unfuzzable = [
# Float16 is still experimental.
'f16.wast',
# TODO: fuzzer and interpreter support for strings
# TODO: fuzzer and interpreter support for strings, including limitations
# like the fuzzer not handling (ref extern) imports (there is no way
# to create a replacement value)
'strings.wast',
'simplify-locals-strings.wast',
'string-lowering-instructions.wast',
'O2_strings.wast',
'O2_O3_strings.wast',
# TODO: fuzzer and interpreter support for extern conversions
'extern-conversions.wast',
# ignore DWARF because it is incompatible with multivalue atm
Expand Down
12 changes: 12 additions & 0 deletions src/pass.h
Original file line number Diff line number Diff line change
Expand Up @@ -395,6 +395,18 @@ struct PassRunner {
void handleAfterEffects(Pass* pass, Function* func = nullptr);

bool shouldPreserveDWARF();

// Whether we lifted strings during our work. If we did, the caller may want
// to lower later. Noting the lifting work helps us avoid always running the
// lowering pass at the end, which would add work. With this mechanism, if we
// lift strings at the beginning of -O2 or such then we will lower them at the
// end, and if we failed to see any strings to lift then we have no work to do
// (note: no -O2 optimization passes in the middle can add strings to lower).
bool liftedStrings = false;

public:
void setLiftedStrings(bool lifted) { liftedStrings = lifted; }
bool getLiftedStrings() { return liftedStrings; }
};

//
Expand Down
14 changes: 14 additions & 0 deletions src/passes/StringLifting.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,12 @@
namespace wasm {

struct StringLifting : public Pass {
// Whether this is paired with a later lowering operation. If so, we mark
// whether we lifted anything.
bool paired;

StringLifting(bool paired = false) : paired(paired) {}

// Maps the global name of an imported string to the actual string.
std::unordered_map<Name, Name> importedStrings;

Expand Down Expand Up @@ -82,6 +88,7 @@ struct StringLifting : public Pass {
for (auto& section : module->customSections) {
if (section.name == "string.consts") {
// We found the string consts section. Parse it.
found = true;
auto copy = section.data;
json::Value array;
array.parse(copy.data(), json::Value::WTF16);
Expand Down Expand Up @@ -193,6 +200,12 @@ struct StringLifting : public Pass {
return;
}

if (paired) {
// We found strings to lift. Mark this on the PassRunner, so that the
// corresponding lowering will occur.
getPassRunner()->setLiftedStrings(true);
}

struct StringApplier : public WalkerPass<PostWalker<StringApplier>> {
bool isFunctionParallel() override { return true; }

Expand Down Expand Up @@ -292,5 +305,6 @@ struct StringLifting : public Pass {
};

Pass* createStringLiftingPass() { return new StringLifting(); }
Pass* createStringLiftingPairedPass() { return new StringLifting(true); }

} // namespace wasm
39 changes: 34 additions & 5 deletions src/passes/StringLowering.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -204,8 +204,18 @@ struct StringLowering : public StringGathering {
// imports.
bool assertUTF8;

StringLowering(bool useMagicImports = false, bool assertUTF8 = false)
: useMagicImports(useMagicImports), assertUTF8(assertUTF8) {
// Whether we are "paired" to a lifting pass. The lifting pass happens early
// in the optimization pipeline in this mode, and we happen late, and we only
// want to lower if something was lifted. This mode also does not disable the
// strings feature, as we don't want the pair of passes to alter the features
// (we run it during the optimization pipeline, which should not do such
// things).
bool paired;

StringLowering(bool useMagicImports = false,
bool assertUTF8 = false,
bool paired = false)
: useMagicImports(useMagicImports), assertUTF8(assertUTF8), paired(paired) {
// If we are asserting valid UTF-8, we must be using magic imports.
assert(!assertUTF8 || useMagicImports);
}
Expand All @@ -215,6 +225,18 @@ struct StringLowering : public StringGathering {
return;
}

auto* runner = getPassRunner();

if (paired) {
// We only want to lower if we previously lifted.
if (!runner->getLiftedStrings()) {
return;
}

// Reset the flag for later optimization cycles.
runner->setLiftedStrings(false);
}

// First, run the gathering operation so all string.consts are in one place.
StringGathering::run(module);

Expand All @@ -231,10 +253,12 @@ struct StringLowering : public StringGathering {
replaceNulls(module);

// ReFinalize to apply all the above changes.
ReFinalize().run(getPassRunner(), module);
ReFinalize().run(runner, module);

// Disable the feature here after we lowered everything away.
module->features.disable(FeatureSet::Strings);
if (!paired) {
// Disable the feature here after we lowered everything away.
module->features.disable(FeatureSet::Strings);
}
}

void makeImports(Module* module) {
Expand Down Expand Up @@ -553,10 +577,15 @@ struct StringLowering : public StringGathering {
};

Pass* createStringGatheringPass() { return new StringGathering(); }

Pass* createStringLoweringPass() { return new StringLowering(); }
Pass* createStringLoweringMagicImportPass() { return new StringLowering(true); }
Pass* createStringLoweringMagicImportAssertPass() {
return new StringLowering(true, true);
}
Pass* createStringLoweringPairedPass() {
// Use magic imports; no asserts; paired.
return new StringLowering(true, false, true);
}

} // namespace wasm
26 changes: 20 additions & 6 deletions src/passes/pass.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -525,6 +525,9 @@ void PassRegistry::registerPasses() {
registerPass("string-lifting",
"lift string imports to wasm strings",
createStringLiftingPass);
registerTestPass("string-lifting-paired",
"same as string-lifting, but paired with a later lowering",
createStringLiftingPairedPass);
registerPass("string-lowering",
"lowers wasm strings and operations to imports",
createStringLoweringPass);
Expand All @@ -536,6 +539,11 @@ void PassRegistry::registerPasses() {
"same as string-lowering-magic-imports, but raise a fatal error "
"if there are invalid strings",
createStringLoweringMagicImportAssertPass);
registerTestPass(
"string-lowering-paired",
"same as string-lowering, but paired with an earlier lifting (only lowers "
"if we lifted, and does not reset the strings feature)",
createStringLoweringPairedPass);
registerPass(
"strip", "deprecated; same as strip-debug", createStripDebugPass);
registerPass("stack-check",
Expand Down Expand Up @@ -724,6 +732,11 @@ void PassRunner::addDefaultFunctionOptimizationPasses() {
}

void PassRunner::addDefaultGlobalOptimizationPrePasses() {
// Start by lifting strings into the optimizable IR form, which can help
// everything else.
if (wasm->features.hasStrings() && options.optimizeLevel >= 2) {
addIfNoDWARFIssues("string-lifting-paired");
}
// Removing duplicate functions is fast and saves work later.
addIfNoDWARFIssues("duplicate-function-elimination");
// Do a global cleanup before anything heavy, as it is fairly fast and can
Expand Down Expand Up @@ -794,16 +807,17 @@ void PassRunner::addDefaultGlobalOptimizationPostPasses() {
} else {
addIfNoDWARFIssues("simplify-globals");
}
addIfNoDWARFIssues("remove-unused-module-elements");
if (options.optimizeLevel >= 2 && wasm->features.hasStrings()) {
// Gather strings to globals right before reorder-globals, which will then
// sort them properly.
addIfNoDWARFIssues("string-gathering");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need string-gathering at all any more, or can we remove its code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need it for users, though it is nice for testing. I can make it a testing pass as a followup?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me.

// Lower away strings at the very end. We do this before
// remove-unused-module-elements so we don't add unused imports, and also
// before reorder-globals, which will sort the new globals.
if (wasm->features.hasStrings() && options.optimizeLevel >= 2) {
addIfNoDWARFIssues("string-lowering-paired");
}
addIfNoDWARFIssues("remove-unused-module-elements");
if (options.optimizeLevel >= 2 || options.shrinkLevel >= 1) {
addIfNoDWARFIssues("reorder-globals");
}
// may allow more inlining/dae/etc., need --converge for that
// May allow more inlining/dae/etc., need --converge for that
addIfNoDWARFIssues("directize");
}

Expand Down
2 changes: 2 additions & 0 deletions src/passes/passes.h
Original file line number Diff line number Diff line change
Expand Up @@ -162,9 +162,11 @@ Pass* createSimplifyLocalsNoTeeNoStructurePass();
Pass* createStackCheckPass();
Pass* createStringGatheringPass();
Pass* createStringLiftingPass();
Pass* createStringLiftingPairedPass();
Pass* createStringLoweringPass();
Pass* createStringLoweringMagicImportPass();
Pass* createStringLoweringMagicImportAssertPass();
Pass* createStringLoweringPairedPass();
Pass* createStripDebugPass();
Pass* createStripDWARFPass();
Pass* createStripProducersPass();
Expand Down
14 changes: 14 additions & 0 deletions src/tools/optimization-options.h
Original file line number Diff line number Diff line change
Expand Up @@ -314,6 +314,20 @@ struct OptimizationOptions : public ToolOptions {
Options::Arguments::N,
[this](Options*, const std::string& pass) {
passOptions.passesToSkip.insert(pass);
})
.add("--no-string-lowering",
"",
"Disables the automatic lowering of stringref to imported strings "
"(that normally happens at the end of the optimization pipeline, if "
"-O2+ is specified)",
OptimizationOptionsCategory,
Options::Arguments::Zero,
[&](Options*, const std::string&) {
// string-lowering-paired is internal (it only makes sense as part
// of a pair of passes in the optimization pipeline), and so it
// does not show up in --help. To make it convenient for users to
// disable the lowering, we provide this flag.
passOptions.passesToSkip.insert("string-lowering-paired");
});

// add passes in registry
Expand Down
2 changes: 1 addition & 1 deletion src/wasm.h
Original file line number Diff line number Diff line change
Expand Up @@ -2137,7 +2137,7 @@ struct BinaryLocations {
std::unordered_map<Function*, FunctionLocations> functions;
};

// Forward declaration for FuncEffectsMap.
// Forward declaration for function effects.
class EffectAnalyzer;

class Function : public Importable {
Expand Down
6 changes: 6 additions & 0 deletions test/lit/help/wasm-metadce.test
Original file line number Diff line number Diff line change
Expand Up @@ -663,6 +663,12 @@
;; CHECK-NEXT:
;; CHECK-NEXT: --skip-pass,-sp Skip a pass (do not run it)
;; CHECK-NEXT:
;; CHECK-NEXT: --no-string-lowering Disables the automatic lowering
;; CHECK-NEXT: of stringref to imported strings
;; CHECK-NEXT: (that normally happens at the
;; CHECK-NEXT: end of the optimization
;; CHECK-NEXT: pipeline, if -O2+ is specified)
;; CHECK-NEXT:
;; CHECK-NEXT:
;; CHECK-NEXT: Tool options:
;; CHECK-NEXT: -------------
Expand Down
6 changes: 6 additions & 0 deletions test/lit/help/wasm-opt.test
Original file line number Diff line number Diff line change
Expand Up @@ -675,6 +675,12 @@
;; CHECK-NEXT:
;; CHECK-NEXT: --skip-pass,-sp Skip a pass (do not run it)
;; CHECK-NEXT:
;; CHECK-NEXT: --no-string-lowering Disables the automatic lowering
;; CHECK-NEXT: of stringref to imported strings
;; CHECK-NEXT: (that normally happens at the
;; CHECK-NEXT: end of the optimization
;; CHECK-NEXT: pipeline, if -O2+ is specified)
;; CHECK-NEXT:
;; CHECK-NEXT:
;; CHECK-NEXT: Tool options:
;; CHECK-NEXT: -------------
Expand Down
6 changes: 6 additions & 0 deletions test/lit/help/wasm2js.test
Original file line number Diff line number Diff line change
Expand Up @@ -627,6 +627,12 @@
;; CHECK-NEXT:
;; CHECK-NEXT: --skip-pass,-sp Skip a pass (do not run it)
;; CHECK-NEXT:
;; CHECK-NEXT: --no-string-lowering Disables the automatic lowering
;; CHECK-NEXT: of stringref to imported strings
;; CHECK-NEXT: (that normally happens at the
;; CHECK-NEXT: end of the optimization
;; CHECK-NEXT: pipeline, if -O2+ is specified)
;; CHECK-NEXT:
;; CHECK-NEXT:
;; CHECK-NEXT: Tool options:
;; CHECK-NEXT: -------------
Expand Down
55 changes: 55 additions & 0 deletions test/lit/passes/O2_O3_strings.wast
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
;; NOTE: Assertions have been generated by update_lit_checks.py --all-items and should not be edited.
;; NOTE: This test was ported using port_passes_tests_to_lit.py and could be cleaned up.

;; Run -O2 and then -O3. Both the -O2 and -O3 should lift and lower strings. If
;; the -O2 somehow prevented the -O3 from doing so (say, forgetting to reset the
;; state between the paired lifting and lowering pass) then we would see that
;; -O3 failed to fully optimize, as the function here requires
;; --precompute-propagate which only -O3 does.

;; RUN: foreach %s %t wasm-opt -O2 -O3 -all -S -o - | filecheck %s

(module
(type $array16 (array (mut i16)))

(import "\'" "foo" (global $foo (ref extern)))

(import "\'" "bar" (global $bar (ref extern)))

(import "wasm:js-string" "concat" (func $concat (param externref externref) (result (ref extern))))

;; CHECK: (type $0 (func (result (ref extern))))

;; CHECK: (import "\'" "foobarbar" (global $"string.const_\"foobarbar\"" (ref extern)))

;; CHECK: (export "string.concat" (func $string.concat))

;; CHECK: (func $string.concat (type $0) (result (ref extern))
;; CHECK-NEXT: (global.get $"string.const_\"foobarbar\"")
;; CHECK-NEXT: )
(func $string.concat (export "string.concat") (result (ref extern))
(local $x (ref extern))
(local $y (ref extern))
;; -O2 does not propagate the string constants through the locals, but -O3
;; will, allowing this function to return the concatenated final string.
(local.set $x
(global.get $foo)
)
(local.set $y
(global.get $bar)
)
;; Add an extra concat that is written to $x, so the local has multiple
;; writes (otherwise, simplify-locals manages to remove the locals, and -O2
;; does just as well as -O3).
(local.set $x
(call $concat
(local.get $x)
(local.get $y)
)
)
(call $concat
(local.get $x)
(local.get $y)
)
)
)
Loading
Loading