Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce main module size #1574

Closed
wants to merge 1 commit into from
Closed

Conversation

Y--
Copy link
Contributor

@Y-- Y-- commented Jan 8, 2024

Note: This PR is more a PoC to start a conversation: it is not in a mergeable state.

Today, when building DuckDB WASM with the "loadable extension" flag, the bundles are quite big because when building with MAIN_MODULE=1, Emscripten exports all absolutely the symbols:

Bundle Uncompressed Brotli GZip
duckdb-eh.js 45M 1.4M 2.0M
duckdb-eh.wasm 32M 4.1M 6.2M

This PR is an attempt to reduce these. Here are the size when built with this PR:

Bundle Uncompressed Brotli GZip
duckdb-eh.js 2.7M 193K 250K
duckdb-eh.wasm 19M 2.7M 4.2M

I was able to load all the in-tree DuckDB extensions, as well as ours with this build.

Of course, this is not that easy, here are the caveats:

Happy to hear your thoughts and whether it is a road worth pursuing, or if you have better ideas on how to solve this?

@carlopi
Copy link
Collaborator

carlopi commented Jan 9, 2024

Hi! Thanks a lot for bringing this up, and the exploration you did.

I think this is a bit too brittle, but if needed this can be considered as a bridge (potentially we might need a different list for each bundle, since functions might end up having different signatures, plus there are ad hoc functions like the personality function that do are missing if exceptions are not there, see https://github.com/duckdb/duckdb-wasm/actions/runs/7453872285/job/20280188337?pr=1574#step:7:5872).

The problem is partially that the functions are exported, but mainly the fact that for each exported functions it's name (as a string) it's repeated multiple times:
1 time in the wasm file, see the output of wasm2wat duckdb-eh.wasm --enable-all | grep export:

  (export "_ZNSt3__218__allocation_guardINS_9allocatorINS_20__shared_ptr_emplaceIN5arrow12NumericArrayINS3_10Date64TypeEEENS1_IS6_EEEEEEEC2B8ue170004IS7_EET_m" (func 185))
  (export "_ZNSt3__220__shared_ptr_emplaceIN5arrow12NumericArrayINS1_10Date64TypeEEENS_9allocatorIS4_EEEC2B8ue170004IJxNS_10shared_ptrINS1_6BufferEEERKSB_xxEEES6_DpOT_" (func 193))

that in my versions generates 80000+ lines.
Then each symbol (the actual string, that might look like __ZN5arrow7struct_ERKNSt3__26vectorINS0_10shared_ptrINS_5FieldEEENS0_9allocatorIS4_EEEE, is repeated 5 more times in the connected JavaScript file. (EDIT: 5 times is for functions, there are also globals in the exports, will need to check better but I would assume a somewhat similar number)

Here I think there are few possible solutions:

  • a proper solution would be for the JS code to avoid the duplication entirely, and just add code on the fly based on the content of the wasm file (that is available, given it's loaded by the JavaScript file itself). This would solve the JS overhead (that is the most significant one). Proper place for this to be done would be in the dlopen Emscripten implementation and seems a bit patchy to do this in post-processing. Potentially a single redefinition of the variable might be needed, but this would cut at least 75% of the overhead (possibly 100%, unsure if the variable definition needs to be there).
  • another solution would be having a way to remap function names (that are long strings of characters) to a shortened version, like _ZNSt3__220__shared_ptr_emplaceIN5arrow12NumericArrayINS1_10Date64TypeEEENS_9allocatorIS4_EEEC2B8ue170004IJxNS_10shared_ptrINS1_6BufferEEERKSB_xxEEES6_DpOT_ gets mapped to f12445, and everywhere there is the first string you put the second one. This would keep 1 reference in the wasm file and 4 in the js files, but those names will be significantly shorter. (to go up to 80K names you need strings of length 5 + 1 character to start and allow easier pattern matching). Average length of names is about 105 for now, so gains will be like 90%+ of the overhead.

Main problem of this remapping is that the original names still needs to be stored somewhere.
Simple solution would be storing the mapping as part of the JavaScript code (this reduces savings from 90%+ to 75% on JS side, and keeps them Wasm side). This is probably the simpler version I would like to start experimenting with.
Somehow more involved on the infrastructure side is keeping the mapping elsewhere (basically a property of duckdb), and have this mapping be used also to produce extensions, so the mapping becomes needed only at build-time (either for duckdb-wasm or for extensions).

@domoritz
Copy link
Collaborator

domoritz commented Jan 9, 2024

As you look at file sizes, please also measure gzip and brotli compressed sizes. We should track that as you make progress.

@carlopi
Copy link
Collaborator

carlopi commented Jan 9, 2024

Actually I think solution 1 (= iterating on available exports) should be doable, for now I implemented only a prototype but it seems that the whole overhead on JS files can be avoided (adding a post-processing step).

Wasm side the exports are still somehow pricey, so maybe both solution can be combined.

I have not explored changing optimization level, also that can be done independenlty.

@Y--
Copy link
Contributor Author

Y-- commented Jan 9, 2024

Thanks for the feedback!

Agree that solution 1 sounds the most promising (and the cleanest one). I actually tried to hack something quickly before the holiday break but couldn't come up with something obvious. I agree that this seems very doable indeed, happy to help on that front.

Solution 2 is basically what closure=1 is supposed to do, and while it works very well on the small examples I tried, I faced some errors with duckdb-wasm (though didn't investigate much time looking what the problem was exactly). In any case I think we should use it once we have exhausted solution 1.

And finally completely agree with @domoritz that we should also track the gzipped and brotli-compressed size. I actually almost provided the brotli size in the PR description but felt that while important, it is not the main point: these symbols are still an overhead for the browser even if bundle is compressed. I'll edit the description with them though :-)

@carlopi, let me know how I can best help you with solution 1.

@carlopi
Copy link
Collaborator

carlopi commented Jan 17, 2024

I found a solution to the size problem, I will open a PR soon (EDIT: it's here #1589), basics are that JavaScript size we don't need to expose any C++ side functions (apart for the few duckdb_web ones and some utilities listed as EXPORTED_FUNCTIONS).

I have not find a polished way to do this, butstripping them via awk from the JS side will be enough for now.

This solutions allows to reduce the size of the uncompressed duckdb-eh.js file to 2.0M, while the wasm files will not be modified (since the overhead on listing the exports will still be there).


The general problem is that Emscripten seems not to have (or at least, not one that I have found) a way to specify that there are two level of interfaces: symbols that needs to be externally available at the C++ side to other modules to be dlopened, and symbols that needs to be exposed to the JavaScript interface (for functions that needs to be there on the JS side).

I hope a solution can be implemented properly in Emscripten.


This PR is still interesting in the blue-print to select explicitly which functions are to be exported (C++ side), that allows to reduce the weight of the wasm side, so it might make sense to have both since they are independent, I will come back to this after merging the JS-side PR.

@Y--
Copy link
Contributor Author

Y-- commented Jan 29, 2024

Closing since #1589 is a much better approach.

@Y-- Y-- closed this Jan 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants