Closed
See WebAssembly/binaryen#4334 (comment).
That builds wasm-opt with EH and pthreads, then optimizes a large file. Despite using multiple cores, the wasm version running in Node is over 10x slower.
Node profiler shows this:
 [C++]:
   ticks  total  nonlib   name
  31085   28.2%   28.4%  __lll_lock_wait
  14723   13.4%   13.4%  __pthread_mutex_unlock_usercnt
   5617    5.1%    5.1%  __pthread_cond_wait
   5279    4.8%    4.8%  __pthread_cond_timedwait
These four entries alone add up to about 51.5% of all ticks, so more than half of the time is spent in pthreads helper code for locking. I wonder if it's related to wasm atomics always being sequentially consistent, or something like that? Just a random guess, though.