Splitting into multiple lazily-loaded modules #3939
Comments
This is super cool work. FWIW, we would be happy to take PRs implementing the call-graph-based split analysis for wasm-split. Combined with PRs to add the ability to pass wasm-split function name patterns rather than complete function names to seed the list of functions to split out, I think we could recover all of the functionality you describe above without the need for a separate post-processing tool besides wasm-split. This could also simplify your macro somewhat because you wouldn't need to worry about disconnecting the call graph at the source level. One downside is that you wouldn't be able to do the analysis of the vtables in wasm-split, but it would be extremely useful for projects in other languages as well if that information could also be passed to wasm-split. |
I'll try to get my prototype into a git repo so that others can take a look. As far as integrating it into wasm-split vs wasm-bindgen --- I think that for users of wasm-bindgen it would work better to integrate it into wasm-bindgen, because wasm-bindgen already post-processes the .wasm and to properly split the wasm-bindgen-generated javascript will also require integration. However, the same strategy could certainly be employed for other languages. The only thing I've done that is rust-specific is the async integration. But I think you could expose it in a similar (maybe slightly less ergonomic) way to C and C++ via a C preprocessor macro. Note that the analysis of data symbols like vtables is not rust-specific at all --- it is entirely based on the relocation information emitted by LLVM wasm-ld, so I don't think there would be any problem integrating that into wasm-split. Disconnecting the call graph at the split points is not just to help the call graph analysis. Initially I tried leaving the call graph connected and just marking the function noinline. The problem was that the compiler/linker still inferred information across the call boundary, which made the splitting ineffective in the test case: in particular, in the example snippet I showed above, the split function returns |
The code is now available here: https://github.com/jbms/wasm-split-prototype |
@jbms I threw together a quick prototype adaptation of this using Leptos to lazy-load a second page. It took me about 15-20 mins to get it to work perfectly with our reactive system. I ran into a bunch of odd little things along the way that I didn't investigate (panics when trying to use one type as return type vs. another, etc.) but I just have to say: I've been waiting for this moment for about 4 years of using WASM with Rust. This is honestly a game-changer for Rust front-end frameworks: code splitting is one of the last big drawbacks relative to JS. Thanks so much for your work on it. Seeing our reactive system work in both directions across a split binary... Truly awesome. If there are ways people in the community can help out with testing or development let me know. Screen.Recording.2024-05-01.at.7.41.42.PM.mov |
@gbj Glad that you were able to get a test with leptos working. Indeed I also saw code splitting as a pretty critical limitation for moving certain types of applications to WebAssembly in rust, and put together this prototype in order to verify that the limitation could be eliminated in the future. In fact I don't have a ton of time to work on this, so if you or others are interested in helping turn this prototype into a real usable thing that would be awesome. The first step would be to determine whether this should be integrated into wasm-bindgen or made into a standalone library/tool -- I think integrating it into wasm-bindgen is the best option but that depends on the wasm-bindgen maintainers being supportive of its inclusion. |
By the way, in the prototype, functions marked as split points need to have a body that is compatible with a sync function, i.e. no use of Making real async function bodies work should be easy enough (though it may require boxing the Future). Making existential return types work might be possible, but they would run the risk of reducing the effectiveness of the splitting by propagating code dependencies across the split point. |
I encountered the no I was just messing around at this point so didn't bother experimenting further. If reproductions would be helpful I'm happy to share them but I was pulling in so much outside library code from the framework I don't know whether an MRE would be easy to make. |
Overall I didn't put in a lot of effort to make the implementation robust, because I was just trying to create a proof-of-concept and I expect all or most of the code will end up being rewritten before this gets integrated into wasm-bindgen. However, issues that indicate limitations of the current splitting strategy would be interesting to look at. Panics during the build are probably relatively easy to analyze even without a reduced example. Crashes at runtime are significantly more annoying to debug... |
One place I've been wanting a split wasm is in an Audio Worklet. It gets its own special thread that doesn't have everything (no TextEncoder/Decoder). Will this code help with that? |
I don't have experience with audio worklets, but I think the prototype won't work as is because it generates some JavaScript that uses To support multi-threading with a SharedArrayBuffer memory we would actually benefit from something similar, to avoid redundantly fetching and compiling the same module in more than one worker. |
Just went back and checked and both panics I experienced were actually the assert at
|
Yeah previously the prototype did not properly handle references to imported functions from split modules. I just pushed out a fix for that. |
@daxpedda Would you be open to a PR that integrates module splitting into wasm-bindgen? |
Thanks, that fixed all my issues. With that additional commit, this is working perfectly for me. Here it is, lazily loading code for each of three separate routes, as well as code shared between the three routes. code-split-routing.mov
I'll stop spamming in this issue now :-) This is just very exciting, since WASM code-splitting has been an unrealized dream for a long time. I will be writing our next release to support async functions in reasonable places, so that this can be a drop-in enhancement -- whether it is included officially in wasm-bindgen (which would certainly have my vote) or whether we need to add it to our own build tooling separately.
ETA: Thinking out loud: it would be very useful to have a way to output a manifest of all the WASM bundles per split. You can see one waterfall in my example above, where it needs to load the shared chunk and also the "view A" chunk, but doesn't load A until after the shared chunk. Not sure if those can be done concurrently, but in any case with server-side rendering if we have access to a manifest (just a JSON file that says "entry point function A requires these 3 WASM files") we can start preloading them all immediately. |
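A manifest along these lines could be sketched as follows. This is only an illustration of the idea in the comment above, not something the prototype emits: the function names `build_manifest` and `to_json`, the `<name>.wasm` and `chunk_<a>_<b>.wasm` file naming, and the JSON layout are all assumptions made for the example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Maps each split module to the files that must be loaded before its split
/// points can run: its own file plus every shared chunk it participates in.
/// File naming is hypothetical.
pub fn build_manifest<'a>(
    splits: &[&'a str],
    chunks: &[BTreeSet<&'a str>],
) -> BTreeMap<&'a str, Vec<String>> {
    let mut manifest = BTreeMap::new();
    for &split in splits {
        let mut files = vec![format!("{}.wasm", split)];
        // A chunk is identified by the set of split modules that depend on it.
        for chunk in chunks.iter().filter(|c| c.contains(split)) {
            let parts: Vec<&str> = chunk.iter().copied().collect();
            files.push(format!("chunk_{}.wasm", parts.join("_")));
        }
        manifest.insert(split, files);
    }
    manifest
}

/// Serializes the manifest as compact JSON. Names are assumed to be plain
/// identifiers, so no string escaping is performed.
pub fn to_json(manifest: &BTreeMap<&str, Vec<String>>) -> String {
    let entries: Vec<String> = manifest
        .iter()
        .map(|(name, files)| {
            let quoted: Vec<String> = files.iter().map(|f| format!("\"{}\"", f)).collect();
            format!("\"{}\":[{}]", name, quoted.join(","))
        })
        .collect();
    format!("{{{}}}", entries.join(","))
}
```

With a manifest like this, a server could emit preload hints for every file listed under the entry point being rendered, rather than discovering the shared chunk first and the view chunk afterwards.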
It would be pretty easy to fix the generated JavaScript to fetch and instantiate all the modules in parallel. The start functions of the split modules just add entries to the indirect function table and there are no ordering constraints currently, though if support for splitting c++ dynamic initialization code for global variables were added, then those would have ordering constraints and need to be handled differently, probably by putting them in a separate function to be called after all the modules are instantiated. |
I pushed a change to fetch and instantiate the common chunks concurrently. |
Short answer
Unfortunately not. Especially seeing how well this can work in an external tool.
Long answer
In general I had to decline a lot of very useful features to Unfortunately
My 2¢
I'm quite amazed by this ad-hoc tool you built and how well it functions and I wish that this could be integrated in a bigger tooling ecosystem ( The reason why we want to keep the limited scope in I also think that Ultimately, my hope is that the component model can make |
Thanks for your response @daxpedda. If it is a separate tool that gets integrated into wasm-pack, then we have the following situation:
In general I understand the desire for modularity. On the other hand you can have "false modularity", where things that are nominally separate modules are in fact highly coupled in fragile ways. I worry that trying to do splitting separately from wasm-bindgen may introduce some "false modularity", where changes to wasm-bindgen may easily break wasm-split. Perhaps there is some way we can try to mitigate that risk? |
I agree, but unfortunately we have to aim for modularity to allow others to implement features that we can't afford to maintain ourselves.
I'm very much open to any suggestions! |
If your goal is to foster the ecosystem around wasm-bindgen, then it's only responsible to be aware of the ramifications of your changes. Vue's solution to this problem is to maintain an ecosystem-ci that routinely checks for breaking changes throughout their ecosystem. This sort of solution shortens feedback time and could even serve to centralize reporting / discussion for such issues if desired. It's hard for others to invest in what they cannot trust. Awareness through early detection and rapid iteration is often better than praying for ecosystem stability without measurement. Perhaps such a system may serve as the cornerstone for building that trust? Very exciting thread to follow, hope something great comes out of it! |
has anyone tried to get this working with C++ macro for emscripten like mentioned by @jbms here, cause this sounds interesting for doing stuff like porting & splitting CLIs (written in C++) to work in-the-browser in a lazy-loaded small & fast fashion.
Because for this use case it doesn't sound like wasm-split that emscripten provides is the best tool for splitting CLIs.
do you have a link to the code that does the vtables lookup & splitting? i'm more familiar with rust than C/C++, but do you have any idea what a macro implemented would look like (or how it'd work)? |
I think this is a great idea, I would support such an addition to our CI! |
That's great. Unfortunately I don't expect I'll have a lot of time to further develop wasm-split in the near future, but if you do, please go ahead and fork it. I'm happy to offer advice here and there. |
Motivation
See motivation here:
rustwasm/team#52
Proposed Solution
I have implemented a (limited/hacky) prototype, based on the following components:
`#[wasm_split(xyz)]` function attribute macro that serves to annotate a function as a split point. `xyz` is an identifier for the module that this function should be "split off" into. The same identifier can be used multiple times, in which case multiple functions will be "split off" into the same module. In my prototype the function must be non-async, and this macro turns it into an async function, but it wouldn't be hard to support both sync and async split points.
For example, the macro converts:
into
Note that the real body of the function is moved to a separate exported function (`__wasm_split_00zstd00_export_56925a789e8e525628ef50b9c566f070_get_zstd_decoder`) that is never called. The original function body is replaced by code that ensures the module is asynchronously loaded, and then calls a separate imported function (`__wasm_split_00zstd00_import_56925a789e8e525628ef50b9c566f070_get_zstd_decoder`). In a post-processing step, `__wasm_split_00zstd00_import_56925a789e8e525628ef50b9c566f070_get_zstd_decoder` will be changed to refer to a function that does an indirect call of `__wasm_split_00zstd00_export_56925a789e8e525628ef50b9c566f070_get_zstd_decoder`.
This effectively disconnects the call graph at this split point, which is important for the post-processing.
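The shape of this transformation can be sketched in plain Rust, without any wasm tooling. This is a hypothetical simulation and not the macro's actual output: `FUNCTION_TABLE` stands in for the wasm indirect function table, `load_zstd_module` stands in for the asynchronous module load (done synchronously here to stay within the standard library), and `decoded_len` with its `usize` result is an invented split point used only for illustration.

```rust
use std::sync::Mutex;

/// Signature shared by the split point body and its stub (hypothetical).
type Decoder = fn(&[u8]) -> usize;

/// Stand-in for the indirect function table; slot 0 is reserved for the
/// split point's real implementation.
static FUNCTION_TABLE: Mutex<[Option<Decoder>; 1]> = Mutex::new([None]);

/// The real body, moved out of the annotated function. In the prototype this
/// corresponds to the exported `..._export_...` symbol, which the main module
/// never calls directly.
fn zstd_export_decoded_len(input: &[u8]) -> usize {
    input.len() * 2 // placeholder for real work
}

/// Simulates instantiating the split module: it fills in its table slot.
fn load_zstd_module() {
    let mut table = FUNCTION_TABLE.lock().unwrap();
    if table[0].is_none() {
        table[0] = Some(zstd_export_decoded_len);
    }
}

/// Stand-in for the `..._import_...` symbol after post-processing: a stub
/// that performs an indirect call through the table.
fn zstd_import_decoded_len(input: &[u8]) -> usize {
    let f = FUNCTION_TABLE.lock().unwrap()[0].expect("split module not loaded");
    f(input)
}

/// What the annotated function becomes: ensure the module is loaded, then
/// call the stub. There is no direct call edge to the real body.
pub fn decoded_len(input: &[u8]) -> usize {
    load_zstd_module();
    zstd_import_decoded_len(input)
}
```

The point of the indirection is that `decoded_len` has no static reference to `zstd_export_decoded_len`, so neither the compiler nor the dependency analysis can pull the real body back into the main module.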
Then we compile and link the program using `-Clink-args=--emit-relocs`.
The post-processing reads in the linked `.wasm` file (before running wasm-bindgen, since wasm-bindgen does not preserve relocation information), identifies the split points based on the symbol names, and then determines the dependency graph of all symbols based on the relocation information.
Note that the dependency graph includes both functions and data symbols, since data symbols such as vtables refer to functions via the indirect function table.
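Identifying split points from symbol names might look like the sketch below. The layout it assumes, `__wasm_split_00<module>00_<import|export>_<hex hash>_<function>`, is inferred from the two example symbols above and may not match every name the prototype generates; `SplitSymbol`, `Kind`, and `parse_split_symbol` are names invented for this example.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Kind {
    Import,
    Export,
}

#[derive(Debug, PartialEq, Eq)]
pub struct SplitSymbol<'a> {
    pub module: &'a str,
    pub kind: Kind,
    pub hash: &'a str,
    pub function: &'a str,
}

/// Parses a symbol name of the assumed form
/// `__wasm_split_00<module>00_<import|export>_<hex hash>_<function>`.
/// Returns `None` for any symbol that does not follow that layout.
pub fn parse_split_symbol(name: &str) -> Option<SplitSymbol<'_>> {
    let rest = name.strip_prefix("__wasm_split_00")?;
    // The module name ends at whichever marker appears first.
    let (at, kind) = match (rest.find("00_export_"), rest.find("00_import_")) {
        (Some(e), Some(i)) if i < e => (i, Kind::Import),
        (Some(e), _) => (e, Kind::Export),
        (None, Some(i)) => (i, Kind::Import),
        (None, None) => return None,
    };
    let module = &rest[..at];
    // Both markers have the same length, so either can be used to skip ahead.
    let rest = &rest[at + "00_export_".len()..];
    let (hash, function) = rest.split_once('_')?;
    let hash_ok = !hash.is_empty() && hash.bytes().all(|b| b.is_ascii_hexdigit());
    if module.is_empty() || function.is_empty() || !hash_ok {
        return None;
    }
    Some(SplitSymbol { module, kind, hash, function })
}
```

Pairing each `Import` symbol with the `Export` symbol that has the same module, hash, and function name would then give the list of split points to disconnect.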
We then compute the contents of the "main" module as the transitive dependencies of:
For each split module, we then compute the transitive dependencies of the real implementation function (such as `__wasm_split_00zstd00_export_56925a789e8e525628ef50b9c566f070_get_zstd_decoder`) for each split point assigned to the module. When computing transitive dependencies here, we can stop once we encounter a symbol that is assigned to the main module.
Symbols that are uniquely in the transitive dependencies of a single split module are assigned to that split module. Symbols that are in the transitive dependencies of more than one split module are assigned to a separate "chunk" module identified by the set of two or more split modules that have the symbol as a transitive dependency. Thus we may in general produce a large number of chunk modules. Various heuristics could be used to combine them.
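The assignment rule described above can be sketched as a small graph computation. This is a simplified model rather than the prototype's code: symbols are plain strings, the dependency graph is an adjacency map assumed to have been derived from the relocations already, and the names `Graph`, `Owner`, `reachable`, and `assign` are invented for the example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Symbol -> symbols it references (assumed to come from relocation info).
pub type Graph<'a> = BTreeMap<&'a str, Vec<&'a str>>;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Owner<'a> {
    Main,
    /// Needed by exactly one split module.
    Split(&'a str),
    /// Needed by two or more split modules; identified by that set.
    Chunk(BTreeSet<&'a str>),
}

/// Collects everything reachable from `roots`, without entering `stop`.
fn reachable<'a>(
    graph: &Graph<'a>,
    roots: &[&'a str],
    stop: &BTreeSet<&'a str>,
) -> BTreeSet<&'a str> {
    let mut seen = BTreeSet::new();
    let mut stack: Vec<&'a str> = roots.to_vec();
    while let Some(sym) = stack.pop() {
        if stop.contains(&sym) || !seen.insert(sym) {
            continue;
        }
        if let Some(deps) = graph.get(sym) {
            stack.extend(deps.iter().copied());
        }
    }
    seen
}

/// Assigns every reachable symbol to the main module, a split module, or a
/// shared chunk. `splits` maps a split module to its implementation roots.
pub fn assign<'a>(
    graph: &Graph<'a>,
    main_roots: &[&'a str],
    splits: &BTreeMap<&'a str, Vec<&'a str>>,
) -> BTreeMap<&'a str, Owner<'a>> {
    let main = reachable(graph, main_roots, &BTreeSet::new());
    // For each non-main symbol, record which split modules depend on it.
    let mut users: BTreeMap<&'a str, BTreeSet<&'a str>> = BTreeMap::new();
    for (&module, roots) in splits {
        for sym in reachable(graph, roots, &main) {
            users.entry(sym).or_default().insert(module);
        }
    }
    let mut owners: BTreeMap<&'a str, Owner<'a>> =
        main.iter().map(|&s| (s, Owner::Main)).collect();
    for (sym, modules) in users {
        let owner = if modules.len() == 1 {
            Owner::Split(modules.into_iter().next().unwrap())
        } else {
            Owner::Chunk(modules)
        };
        owners.insert(sym, owner);
    }
    owners
}
```

Because each chunk is keyed by the exact set of split modules that need it, the number of chunks can grow with the number of distinct sets, which is where combining heuristics would come in.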
The split point implementation functions, and any function that is called from more than one module, get added to the `__indirect_function_table`.
We then emit each module, using the relocation information to remap functions. In the prototype, although we compute dependencies as if data symbols are split out, in fact all of the data segments remain in the main module, but it should be feasible to split the data as well. Calls to functions defined in other modules are replaced by calls to a stub function that does an indirect call. Each split module has no `start` function but has an active element segment that initializes a portion of the `__indirect_function_table`.
The support javascript for loading the module looks something like:
Alternatives
This implementation was inspired by the description of the emscripten wasm-split tool (https://emscripten.org/docs/optimizing/Module-Splitting.html#module-splitting). The emscripten wasm-split tool differs in the following ways:
While the emscripten wasm-split approach could presumably be adapted to rust fairly easily, I think there are a lot of advantages to explicitly-annotated, asynchronously-loaded split points.
Another alternative would be to provide something closer to `dlopen`, which I think may be along the lines of what is being proposed for a webassembly dynamic linking mechanism. The advantage of what I'm proposing here over a dlopen-style interface is:
`wasm_split` macro provides a very ergonomic interface
Additional Context
The current prototype implementation is basically independent of wasm-bindgen --- it works with an unmodified wasm-bindgen but the module loading depends slightly on implementation details of wasm-bindgen.
Ultimately, though, as a feature for which I think there is quite a lot of interest in the community, it would probably be better to integrate this into wasm-bindgen itself --- that would allow the javascript code to be split along with the wasm module.
Towards that goal, I'd appreciate some guidance on whether this feature would likely be accepted, and if so, any comments on how best to integrate it.
The current prototype implementation uses wasm_encoder and wasmparser directly. I initially attempted to use walrus but found that its abstractions didn't work very well given the need to make use of the relocation information. Possibly walrus could be modified to provide the necessary functionality. Alternatively, the splitting could be done first using wasm_encoder and wasmparser directly, and then the remaining wasm-bindgen processing could be done using walrus.