Emscripten support fundamentally broken? #650
Comments
Thanks for pinging!
This is indeed the case, and your analysis of the issue is correct. The strategy that users of bdwgc on Emscripten utilize is to restrict GC to only occur when there are no managed pointers on the stack, e.g. asynchronously via a setTimeout() trigger. Or, if you do need to collect synchronously, the coordination from the Emscripten compiler that you are after is what the --spill-pointers pass of Emscripten's own Binaryen optimizer provides. I have been developing a research garbage collector for WebAssembly/Emscripten here: https://github.com/juj/emgc . Maybe you can find more information there about the challenges of GCing in WebAssembly.
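The "collect only when no managed pointers are on the stack" strategy can be sketched as a deferred-collection flag. This is a minimal illustration, not code from bdwgc or Emscripten: the names deferred_gc_init, deferred_gc_request, deferred_gc_run_pending and stub_collect are hypothetical. With bdwgc, the hook would presumably wrap GC_gcollect(), with automatic collection otherwise kept off (for instance GC_enable()/GC_disable() around the call); that wiring is an assumption and is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical collector hook; with bdwgc this could wrap GC_gcollect(). */
typedef void (*collect_fn)(void);

static collect_fn g_collect;    /* set by deferred_gc_init() */
static int g_collect_pending;   /* 1 while a collection is requested but not yet run */

void deferred_gc_init(collect_fn fn)
{
    assert(fn != NULL);
    g_collect = fn;
    g_collect_pending = 0;
}

/* Safe to call anywhere, even deep inside code that holds managed
   pointers in locals: it only records the request. */
void deferred_gc_request(void)
{
    g_collect_pending = 1;
}

/* Meant to be called from a top-level callback (for example one scheduled
   through setTimeout()), where no frame holding managed pointers is active.
   Returns 1 if a collection ran, 0 otherwise. */
int deferred_gc_run_pending(void)
{
    if (!g_collect_pending || g_collect == NULL)
        return 0;
    g_collect_pending = 0;
    g_collect();
    return 1;
}

/* Stub collector, present only to make the sketch observable. */
int stub_collections;
void stub_collect(void) { ++stub_collections; }
```

Several requests made between two top-level callbacks collapse into a single collection, which matches the idea of collecting asynchronously rather than at the allocation site.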
That's great to know, thanks a lot! Compiling with |
See ivmai/bdwgc#650 for more discussion.
Summary
I recently encountered an issue with bdwgc in emscripten where pointers "on the stack" (the reason for the quotation marks is explained below) are not found during the marking phase and thus memory is released too early. I believe that this is not a simple bug but rather a fundamental issue that cannot be resolved without cooperation from the emscripten C compiler.
A test case
A simple test program which shows the issue is as follows:
If everything works correctly, this should just print the numbers from 0 to 9999. But when compiled with
emcc -O2 test.c libgc.a -o test.html -sASYNCIFY
and run in node or in a web browser, it only prints numbers from 8192 upwards. The reason for the failure is that bdwgc does not find the head pointer during the marking phase and thus frees the entire list too early, reusing the memory for later allocations.
Why I think this issue is unsolvable without changes in the emscripten C compiler
Unlike ordinary assembly languages, webassembly does not have a fixed register file. Instead, there is the concept of a "local". These locals effectively behave as an unlimited number of registers which are automatically saved and restored before/after function calls by the wasm virtual machine. The head pointer in the test case is stored in such a local variable. These are not scanned by bdwgc and thus memory is reclaimed too early. However, to the best of my knowledge, there is no way to access the value of locals for functions further up the call stack. They are not saved on the same stack that emscripten uses and which bdwgc does scan for pointers, but rather in a memory space that is internal to the wasm VM.
Therefore, the issue does not seem to be solvable without instructing the C compiler to always keep pointers on a stack which is accessible to bdwgc. This usually happens when compiling with -O0 (the above test case works correctly in this case), but I have also encountered examples where even with -O0 pointers are not kept on the emscripten stack.
Unless there is some way around the problem that I did not think of, in my opinion it would be useful to warn about this issue in the documentation, if not remove the broken emscripten support altogether.
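The mechanism described above can be modeled in plain C. The following is a minimal sketch under stated assumptions, not bdwgc code: linear_stack stands in for the emscripten stack in linear memory (which a conservative collector can scan), and wasm_local stands in for a wasm local (which it cannot). The names range_references and head_is_found are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Conservative scan: treat every word in [lo, hi) as a potential pointer
   and report whether any of them points into [obj, obj + size). */
int range_references(const uintptr_t *lo, const uintptr_t *hi,
                     const void *obj, size_t size)
{
    uintptr_t start = (uintptr_t)obj;
    assert(lo <= hi);
    for (const uintptr_t *p = lo; p < hi; ++p) {
        if (*p >= start && *p - start < size)
            return 1;
    }
    return 0;
}

enum { STACK_WORDS = 16 };

/* Returns 1 if the scan finds the object, 0 if it would be reclaimed.
   The scan only ever looks at linear_stack, mirroring a collector that
   cannot see wasm locals. */
int head_is_found(int spill_to_linear_stack)
{
    static char heap_object[32];                    /* stands in for a GC-allocated node */
    uintptr_t linear_stack[STACK_WORDS] = {0};      /* scannable memory */
    uintptr_t wasm_local = (uintptr_t)heap_object;  /* pointer held only in a "local" */

    if (spill_to_linear_stack)
        linear_stack[3] = wasm_local;               /* copy the pointer into scannable memory */

    return range_references(linear_stack, linear_stack + STACK_WORDS,
                            heap_object, sizeof heap_object);
}
```

In this model, head_is_found(0) corresponds to the failing optimized build, where the pointer never reaches scannable memory, and head_is_found(1) corresponds to a build in which the pointer is also written to the stack in linear memory.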