Commit 6d8b13c

Merge pull request #301 from pavelsavara/wasm-browser-threads
[browser][wasm] threads and JS interop
2 parents 54ddb59 + 4e781dc commit 6d8b13c

2 files changed

+244
-0
lines changed

INDEX.md

Lines changed: 1 addition & 0 deletions
@@ -83,6 +83,7 @@ Use update-index to regenerate it:
8383
| 2022 | [.NET 7 Version Selection Improvements](accepted/2022/version-selection.md) | [Rich Lander](https://github.com/richlander) |
8484
| 2023 | [.NET 8.0 Polyfill](accepted/2023/net8.0-polyfills/net8.0-polyfills.md) | [Immo Landwerth](https://github.com/terrajobst) |
8585
| 2023 | [Experimental APIs](accepted/2023/preview-apis/preview-apis.md) | [Immo Landwerth](https://github.com/terrjobst) |
86+
| 2023 | [Multi-threading on a browser](accepted/2023/wasm-browser-threads.md) | [Pavel Savara](https://github.com/pavelsavara) |
8687
| 2023 | [net8.0-browser TFM for applications running in the browser](accepted/2023/net8.0-browser-tfm.md) | [Javier Calvarro](https://github.com/javiercn) |
8788

8889
## Drafts

accepted/2023/wasm-browser-threads.md

Lines changed: 243 additions & 0 deletions
@@ -0,0 +1,243 @@
1+
# Multi-threading on a browser
2+
3+
**Owner** [Pavel Savara](https://github.com/pavelsavara) |
4+
5+
## Table of contents
6+
- [Goals](#goals)
7+
- [Key ideas](#key-ideas)
8+
- [State April 2024](#state-2024-april)
9+
- [Design details](#design-details)
10+
- [State September 2023](#state-2023-sep)
11+
- [Alternatives](#alternatives-and-details---as-considered-2023-sep)
12+
13+
# Goals
14+
- CPU intensive workloads on dotnet thread pool.
15+
- Allow user to start new managed threads using `new Thread` and join it.
16+
- Add new C# API for creating web workers with JS interop. Allow JS async/promises via external event loop.
17+
- enable blocking `Task.Wait` and `lock()` like APIs from C# user code on all threads
18+
- Current public API throws PNSE for it
19+
- This is a core part of the MT value proposition.
20+
- If people want to use existing MT code-bases, most of the time, the code is full of locks.
21+
- People want to use existing desktop/server multi-threaded code as is.
22+
- allow HTTP and WS C# APIs to be used from any thread despite underlying JS object affinity.
23+
- Blazor `BeginInvokeDotNet`/`EndInvokeDotNetAfterTask` APIs work correctly in multithreaded apps.
24+
- JSImport/JSExport interop to the maximum possible extent.
25+
- don't change/break single threaded build. †
26+
27+
## Lower priority goals
28+
- try to make it debugging friendly
29+
- sync C# to async JS
30+
- dynamic creation of new pthread
31+
- implement crypto via `subtle` browser API
32+
- allow MonoVM to lazily download DLLs from the server, instead of during startup.
33+
- implement synchronous APIs of the HTTP and WS clients. At the moment they throw PNSE.
34+
- sync JS to async JS to sync C#
35+
- allow calls to synchronous JSExport from UI thread (callback)
36+
- don't prevent future marshaling of JS [transferable objects](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Transferable_objects), like streams and canvas.
37+
- offload CPU intensive part of WASM startup to WebWorker, so that the pre-rendered (blazor) UI could stay responsive during Mono VM startup.
38+
39+
## Non-goals
40+
- interact with JS state on `WebWorker` of managed threads other than UI thread or dedicated `JSWebWorker`
41+
42+
<sub><sup>† Note: all the text below discusses MT build only, unless explicit about ST build.</sup></sub>
43+
44+
# Key ideas
45+
46+
Move all managed user code out of UI/DOM thread, so that it becomes consistent with all other threads.
47+
48+
## Context - Problems
49+
**1)** If you have multithreading, any thread might need to block while waiting for any other to release a lock.
50+
- locks are in the user code, in nuget packages, in Mono VM itself
51+
- there are managed and un-managed locks
52+
- in single-threaded build of the runtime, all of this is NOOP. That's why it works on UI thread.
53+
54+
**2)** UI thread in the browser can't synchronously block
55+
- that means "you can't block" the UI thread, not just the usual "you should not block" the UI
56+
- `Atomics.wait()` throws `TypeError` on UI thread
57+
- you can spin-wait, but it's a bad idea.
58+
- Deadlock: when you spin-block, the JS timer loop and any messages are not pumping.
59+
- But code in other threads may be waiting for some such event to resolve.
60+
- all async/await don't work
61+
- all networking doesn't work
62+
- you can't create or join another web worker
63+
- the browser dev tools UI freezes
64+
- It eats your battery
65+
- Browser will kill your tab at random point (Aw, snap).
66+
- It's not deterministic and you can't really test your app to prove it harmless.
67+
- all the other threads/workers could synchronously block
68+
- `Atomics.wait()` works as expected
69+
- if we have a managed thread on the UI thread, any `lock` or Mono GC barrier could cause a spin-wait
70+
- in case of Mono code, we at least know it's short duration
71+
- we should prevent it from blocking in user code
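
The difference between the UI thread and the other threads can be probed directly. A minimal sketch, assuming an environment where `SharedArrayBuffer` is available; `tryBlockingWait` is a hypothetical helper name, not part of any runtime API. On a browser UI thread the call is expected to take the `TypeError` branch, while on a web worker (or in Node.js) it blocks until the timeout.

```javascript
// Hypothetical helper: returns the Atomics.wait result, or "cannot-block"
// when the current thread is not allowed to block (the browser UI thread).
function tryBlockingWait(view, index, expected, timeoutMs) {
  try {
    return Atomics.wait(view, index, expected, timeoutMs);
  } catch (err) {
    if (err instanceof TypeError) return "cannot-block";
    throw err;
  }
}

// Atomics.wait requires an Int32Array (or BigInt64Array) over shared memory.
const shared = new Int32Array(new SharedArrayBuffer(4));

// Nobody changes or notifies the slot, so a blocking-capable thread times out.
const firstResult = tryBlockingWait(shared, 0, 0, 10);

// When the slot no longer holds the expected value, the wait returns immediately.
Atomics.store(shared, 0, 1);
const secondResult = tryBlockingWait(shared, 0, 0, 10);

console.log(firstResult, secondResult);
```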
72+
73+
**3)** JavaScript engine APIs and objects have thread affinity.
74+
- The DOM and a few other browser APIs are only available on the main UI "thread"
75+
- and so, you need to have C# interop with UI, but you can't block there.
76+
- HTTP & WS objects have affinity, but we would like to consume them (via Streams) from any managed thread
77+
- Any `JSObject`, `JSException` and `Promise`->`Task` have thread affinity
78+
- they need to be disposed on correct thread. GC is running on random thread
79+
80+
**4)** State management of JS context `self` of the worker.
81+
- emscripten pre-allocates a pool of web workers to be used as pthreads.
82+
- Because they could only be created asynchronously, but `pthread_create` is synchronous call
83+
- Because they are slow to start
84+
- those pthreads have stateful JS context `self`, which is re-used when mapped to C# thread pool
85+
- when we allow JS interop on a managed thread, we need a way to clean up the JS state
86+
87+
**5)** Blazor's `renderBatch` is using direct memory access
88+
89+
**6)** Dynamic creation of new WebWorker requires async operations on emscripten main thread.
90+
- we could pre-allocate fixed size pthread pool. But one size doesn't fit all and it's expensive to create too large pool.
91+
92+
**7)** There could be a pending HTTP promise (which needs the browser event loop to resolve) and a blocking `.Wait` on the same thread and the same task/chain, which leads to deadlock.
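
Problem **7** can be reproduced in plain JS. A minimal sketch, assuming it runs on a thread that is allowed to block (a web worker, or Node.js); `blockWhilePromisePending` is a hypothetical name. The promise continuation is queued, but it cannot run while the same thread is blocked, so a wait that depends on that continuation would never finish. The sketch uses a timeout so that it terminates.

```javascript
// Hypothetical demo: a promise continuation cannot run while its own thread is blocked.
function blockWhilePromisePending(blockMs) {
  let resolved = false;

  // The continuation is queued and can only run after the current call stack unwinds.
  Promise.resolve().then(() => { resolved = true; });

  // Block this thread on a slot that nobody will change or notify.
  const gate = new Int32Array(new SharedArrayBuffer(4));
  Atomics.wait(gate, 0, 0, blockMs);

  // Still false: the event loop did not get a chance to run the continuation.
  return resolved;
}

const resolvedWhileBlocked = blockWhilePromisePending(20);
console.log("continuation ran while blocked:", resolvedWhileBlocked);
```

If the blocking wait had no timeout and was only released by the continuation, the thread would deadlock.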
93+
94+
# State 2024 April
95+
96+
## What was implemented in Net9 - Deputy thread design
97+
98+
For other possible design options we considered [see below](#alternatives-and-details---as-considered-2023-sep).
99+
100+
- Introduce dedicated web worker called "deputy thread"
101+
- managed `Main()` is dispatched onto deputy thread
102+
- MonoVM startup on deputy thread
103+
- non-GC C functions of mono are still available
104+
- Emscripten startup stays on UI thread
105+
- C functions of emscripten
106+
- download of assets into WASM memory
107+
- UI/DOM thread
108+
- because the UI thread would be mostly idling, it could:
109+
- render UI, keep debugger working
110+
- dynamically create pthreads
111+
- UI thread stays attached to Mono VM for Blazor's reasons (for Net9)
112+
- it keeps `renderBatch` working as is, but it's far from ideal
113+
- there is risk that UI could be suspended by pending GC
114+
- It would be ideal to change Blazor so that it doesn't touch managed objects via naked pointers during render.
115+
- we strive to detach the UI thread from Mono
116+
- I/O thread
117+
- is a helper thread which allows a `Task` to be resolved by the UI thread's `Promise` even when the deputy thread is blocked in `.Wait`
118+
- JS interop from any thread is marshaled to UI thread's JavaScript
119+
- HTTP and WS clients are implemented in JS of UI thread
120+
- There is draft of `JSWebWorker` API
121+
- it allows C# users to create dedicated JS thread
122+
- the `JSImport` calls are dispatched to it if you are on that thread
123+
- or if you pass `JSObject` proxy with affinity to that thread as `JSImport` parameter.
124+
- The API was not made public in Net9 yet
125+
- calling synchronous `JSExports` is not supported on UI thread
126+
- this could be changed by configuration option but it's dangerous.
127+
- calling asynchronous `JSExports` is supported
128+
- calling asynchronous `JSImport` is supported
129+
- calling synchronous `JSImport` is supported without synchronous callback to C#
130+
- Strings are marshaled by value
131+
- as opposed to by reference optimization we have in single-threaded build
132+
- Emscripten VFS and other syscalls
133+
- file system operations are single-threaded and always marshaled to UI thread
134+
- Emscripten pool of pthreads
135+
- browser threads are expensive (compared to normal OS threads)
136+
- creation of `WebWorker` requires UI thread to do it
137+
- there is quite complex and slow setup for `WebWorker` to become pthread and then to attach as Mono thread.
138+
- that's why Emscripten pre-allocates pthreads
139+
- this allows `pthread_create` to be synchronous and faster
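
The pre-allocation idea can be sketched without real workers. This is a simplified model under stated assumptions: `PreallocatedPool`, `acquire` and `release` are hypothetical names, the "workers" are plain objects, and throwing on exhaustion is only this sketch's choice, not a claim about what Emscripten does.

```javascript
// Hypothetical model of a pre-allocated worker pool.
class PreallocatedPool {
  constructor(size, createWorker) {
    // The slow, asynchronous creation cost is paid up front, at startup.
    this.idle = [];
    for (let i = 0; i < size; i++) this.idle.push(createWorker(i));
  }

  // Synchronous, like pthread_create: it can only hand out a worker that already exists.
  acquire() {
    if (this.idle.length === 0) throw new Error("pool exhausted");
    return this.idle.pop();
  }

  // Return a worker to the pool so it can be re-used by a later thread.
  release(worker) {
    this.idle.push(worker);
  }
}

// Stand-in factory; a real one would create a WebWorker.
const pool = new PreallocatedPool(2, (id) => ({ id }));
const firstWorker = pool.acquire();
const secondWorker = pool.acquire();

let exhausted = false;
try {
  pool.acquire();
} catch (err) {
  exhausted = true;
}

pool.release(firstWorker);
const reusedWorker = pool.acquire();
console.log(exhausted, reusedWorker === firstWorker);
```

The re-used worker is the same object as before, which is why its JS state needs explicit cleanup (problem **4**).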
140+
141+
# Design details
142+
143+
## Define terms
144+
- UI thread
145+
- this is the main browser "thread", the one with DOM on it
146+
- it can't block-wait, only spin-wait
147+
- "sidecar" thread - possible design
148+
- is a web worker with emscripten and mono VM started on it
149+
- there is no emscripten on UI thread
150+
- MAUI/BlazorWebView use the same concept for Blazor rendering
151+
- doing this allows blocking waits on all managed threads
152+
- "deputy" thread - possible design
153+
- is a web worker and pthread with C# `Main` entrypoint
154+
- emscripten startup stays on UI thread
155+
- doing this allows blocking waits on all managed threads
156+
- "managed thread"
157+
- is a thread that is an emscripten pthread, is attached to the Mono VM and has GC barriers
158+
- "main managed thread"
159+
- is a thread with C# `Main` entrypoint running on it
160+
- if this is UI thread, it means that one managed thread is special
161+
- see problems **1,2**
162+
- "managed thread pool thread"
163+
- pthread dedicated to serving Mono thread pool
164+
- "comlink"
165+
- in this document it stands for the pattern
166+
- dispatch to another worker via pure JS means
167+
- create JS proxies for types which can't be serialized, like `Function`
168+
- actual [comlink](https://github.com/GoogleChromeLabs/comlink)
169+
- doesn't implement spin-wait
170+
- we already have prototype of the similar functionality
171+
- which can spin-wait
172+
173+
## Proxies - thread affinity
174+
- all proxies of JS objects have thread affinity
175+
- all of them need to be used and disposed on correct thread
176+
- how to dispatch to correct thread is one of the questions here
177+
- all of them are registered to 2 GCs
178+
- `Dispose` needs to be scheduled asynchronously instead of blocking Mono GC
179+
- because the proxy has thread affinity, but the target thread is suspended during GC, so we could not dispatch to it at that time.
180+
- a JS handle needs to be freed only after both sides have unregistered it (at the same time).
181+
- `JSObject`
182+
- have thread ID on them, so we know which thread owns them
183+
- `JSException`
184+
- they are a proxy because stack trace is lazy
185+
- we could eval stack trace eagerly, so they could become "value type"
186+
- but it would be expensive
187+
- `Task`
188+
- continuations need to be dispatched onto correct JS thread
189+
- they can't be passed back to wrong JS thread
190+
- resolving `Task` could be async
191+
- `Func`/`Action`/`JSImport`
192+
- callbacks need to be dispatched onto correct JS thread
193+
- they can't be passed back to wrong JS thread
194+
- calling functions which return `Task` could be aggressively async
195+
- unless the synchronous part of the implementation could throw exception
196+
- which maybe our HTTP/WS could do ?
197+
- could this difference be ignored ?
198+
- `JSExport`/`Function`
199+
- we already are on correct thread in JS, unless this is UI thread
200+
- would anything improve if we tried to be more async ?
201+
- `MonoString`
202+
- we have optimization for interned strings, that we marshal them only once by value. Subsequent calls in both directions are just a pinned pointer.
203+
- in deputy design we could create `MonoString` instance on the UI thread, but it involves GC barrier
204+
205+
## JSWebWorker with JS interop
206+
- is a proposed concept to let the user manage the JS state of the worker explicitly
207+
- because of problem **4**
208+
- is C# thread created and disposed by new API for it
209+
- could block on synchronization primitives
210+
- could do full JSImport/JSExport to its own JS `self` context
211+
- there is `JSSynchronizationContext` installed on it
212+
- so that user code could dispatch back to it, in case that it needs to call `JSObject` proxy (with thread affinity)
213+
- this thread needs to throw on any `.Wait` because of the problem **7**
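
Dispatching back to the owning thread follows the usual synchronization-context pattern: any caller may post work, but only the owner runs it. A minimal single-threaded sketch: `OwnerThreadQueue`, `post` and `pump` are hypothetical names, not the `JSSynchronizationContext` API, and a real implementation would need a thread-safe channel between workers.

```javascript
// Hypothetical model of a synchronization context: callbacks are queued by
// any caller and executed only when the owner thread pumps the queue.
class OwnerThreadQueue {
  constructor() {
    this.pending = [];
  }

  // Called from "any thread": never runs the callback inline.
  post(callback) {
    this.pending.push(callback);
  }

  // Called by the owner thread from its event loop; returns how many callbacks ran.
  pump() {
    const batch = this.pending;
    this.pending = [];
    for (const callback of batch) callback();
    return batch.length;
  }
}

const queue = new OwnerThreadQueue();
const log = [];
queue.post(() => log.push("use JSObject proxy"));
queue.post(() => log.push("dispose JSObject proxy"));

const ranBeforePump = log.length;
const ranInPump = queue.pump();
console.log(ranBeforePump, ranInPump, log);
```

If the owner thread blocks in `.Wait` instead of returning to its event loop, `pump` never runs, which is the deadlock from problem **7**.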
214+
215+
## HTTP and WS clients
216+
- are implemented in terms of `JSObject` and `Promise` proxies
217+
- they have thread affinity, see above
218+
- typically to the `JSWebWorker` of the creator
219+
- but are consumed via their C# Streams from any thread.
220+
- therefore we need to solve the dispatch to the correct thread.
221+
- such dispatch will come with overhead
222+
- especially when called with small buffer in tight loop
223+
- or we could throw PNSE, but it may be difficult for user code to
224+
- know what thread created the client
225+
- have a means to dispatch the call there
226+
- other unknowing users are `XmlUrlResolver`, `XmlDownloadManager`, `X509ResourceClient`, ...
227+
- because we could have blocking wait now, we could also implement synchronous APIs of HTTP/WS
228+
- so that existing user code bases would just work without change
229+
- this would also require separate thread, doing the async job
230+
- we could use I/O thread for it
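
The cost of small buffers in a tight loop can be illustrated by counting hops. A minimal sketch, assuming every stream read costs one dispatch to the owner thread; `copyWithDispatchCount` is a hypothetical name and the payload is a local array, not a real HTTP response.

```javascript
// Hypothetical model: each read of up to bufferSize bytes stands for
// one dispatch to the thread that owns the underlying JS object.
function copyWithDispatchCount(source, bufferSize) {
  let dispatches = 0;
  let bytes = 0;
  for (let offset = 0; offset < source.length; offset += bufferSize) {
    dispatches++;
    bytes += source.subarray(offset, offset + bufferSize).length;
  }
  return { dispatches, bytes };
}

const payload = new Uint8Array(64 * 1024);
const smallBuffer = copyWithDispatchCount(payload, 16);
const largeBuffer = copyWithDispatchCount(payload, 16 * 1024);
console.log(smallBuffer.dispatches, largeBuffer.dispatches);
```

Both runs copy the same number of bytes, but the small buffer pays the dispatch overhead a thousand times more often.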
231+
232+
## Performance
233+
As compared to ST build for dotnet wasm:
234+
- the dispatch between threads (caused by JS object thread affinity) will have negative performance impact on the JS interop
235+
- in case of HTTP/WS clients used via Streams, it could be surprising
236+
- browser performance is lower when working with SharedArrayBuffer
237+
- Mono performance is lower because there are GC safe-points and locks in the VM code
238+
- startup is slower because creation of WebWorker instances is slow
239+
- VFS access is slow because it's dispatched to UI thread
240+
- console output is slow because its POSIX stream is dispatched to UI thread, call per line
241+
242+
# Alternatives and details - as considered 2023 Sep
243+
See https://gist.github.com/pavelsavara/c81ef3a9e4000d67f49ddb0f1b1c2284
