Skip to content

Support running a pthreads-using program in a worker #10036

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 15, 2019
Merged

Conversation

kripken
Copy link
Member

@kripken kripken commented Dec 14, 2019

@juj noticed we were missing tests for this mode. Looks like
the test suite relies on the convenience of PROXY_TO_PTHREAD
and --proxy-to-worker for testing related things, but we didn't
test the case of the entire compiled application being instantiated
in a worker (in that case, an interesting thing is that both the main
application thread, and the pthreads, are all in Web Workers).

The only thing preventing this from working was that we didn't
have a line of code to get _scriptDir when in a worker (we just
had main thread and node.js).

This depends on workers being able to spawn workers, which
appears to be true everywhere but Safari currently.

edit: added last paragraphs

@sbc100
Copy link
Collaborator

sbc100 commented Dec 14, 2019

What about the simpler case of simple starting a single threaded app in a worker? Do we test that already?

@kripken
Copy link
Member Author

kripken commented Dec 14, 2019

@sbc100 looks like we do in at least browser.do_test_worker.

@kripken kripken changed the title Add a test for running a pthreads program in a worker Support running a pthreads-using program in a worker Dec 14, 2019
@@ -100,6 +100,8 @@ var _scriptDir = (typeof document !== 'undefined' && document.currentScript) ? d

if (ENVIRONMENT_IS_NODE) {
_scriptDir = __filename;
} else if (ENVIRONMENT_IS_WORKER) {
_scriptDir = self.location.href;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this proves that nobody has ever used this configuration. Why do you think this configuration is important to support?

I think if we remove this support, then the whole contents of worker.js could be moved to reside in the same file a.js instead of having to have a separate file a.worker.js. That in turn would allow having worker.js content live inside the JS module in MODULARIZE=1, mode, which would allow removing the Module exports hack applied in #9569, while keeping complexity low, which was complained about in that same PR.

I write hack, because the exports on Module object should be restricted to code symbols that flow from JS module to outside to external user JS code, and not for code visibility for setting up threads. Reusing Module for that solidifies its role as a "communication hub of random symbols for all purposes" object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this hasn't been used is a point against it. However, I think it's an important use case, and I think users may have just seen it didn't work and not bother to file a bug, or they knew that nested workers weren't supported in Chrome (in the past when they weren't) so they didn't even try.

Example use cases might be a physics engine or a chess engine, or binaryen's wasm-opt (imagine an online wasm compressor site) or something else that does pure computation, and is called from JS. That JS might happen to run in a worker. I think it would be bad if it stopped working if the user moves it there, as in general we want people to do as much as possible in workers, so it would be an annoying limitation if pthreads was not supported in that environment.

About removing worker.js - maybe I don't see what you mean, but how specifically would disallowing instantiating in a worker make that easier? It seems orthogonal, so I must be missing something.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to distinguish between the main thread (ENVIRONMENT_IS_PTHREAD==False), and pthreads (ENVIRONMENT_IS_PTHREAD==True), we need one bit of information that we can use to differentiate these two. Currently this one bit of information is the fact that main thread loads a.js, and pthread workers load a.worker.js which has a ENVIRONMENT_IS_PTHREAD=True field in it.

If we did not support main()-in-Worker case, we could instead make this one bit of information be ENVIRONMENT_IS_PTHREAD=typeof importScripts == 'function', and both main() thread and pthread workers can then load up a.js and we could move everything we have in a.worker.js into a.js.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Or at least I have not been able to come up with any other way to differentiate..)

I suppose there could be a way to emit a a.worker.js only if one wanted to enable support for such a main()-in-Worker case, and if not, then the file a.worker.js could be avoided.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, now I see what you mean. Good point.

I think we just need a single bit of information there. We could move all of a.worker.js into a.js, getting rid of the Module usages that you don't like (and I agree), and all that is left in a.worker.js would be just "create Module, set Module.isPthread to true, load the main a.js", and that's it. So I think that would get you 99% of what you want there, but yeah, we would still need a (tiny) a.worker.js file.

There might even be a way to get rid of that file entirely: the docs say there is a "name" param that can be set when calling new Worker, which the worker can then access using self.name. Perhaps the name could be "emscripten_pthread", and we use that as an indicator? There seems some risk there, though, and we'd need to check if it works in node.js workers too.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting info about the name parameter, had missed that completely!

But apparently so did Safari developers :/ , filed a WebKit bug at https://bugs.webkit.org/show_bug.cgi?id=205253 .

Node.js has workerData, which could be used for this purpose: https://nodejs.org/api/worker_threads.html#worker_threads_worker_threads

@juj
Copy link
Collaborator

juj commented Dec 15, 2019

I think I'll postpone the attempt to remove a.worker.js, closed PR #10036 - I'll see if I can figure out some other method for that in the future.

@kripken kripken merged commit 8d692d5 into incoming Dec 15, 2019
@delete-merged-branch delete-merged-branch bot deleted the pthinwork branch December 15, 2019 18:44
belraquib pushed a commit to belraquib/emscripten that referenced this pull request Dec 23, 2020
…#10036)

@juj noticed we were missing tests for this mode. Looks like
the test suite relies on the convenience of PROXY_TO_PTHREAD
and --proxy-to-worker for testing related things, but we didn't
test the case of the entire compiled application being instantiated
in a worker (in that case, an interesting thing is that both the main
application thread, and the pthreads, are all in Web Workers).

The only thing preventing this from working was that we didn't
have a line of code to get _scriptDir when in a worker (we just
had main thread and node.js).

This depends on workers being able to spawn workers, which
appears to be true everywhere but Safari currently.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants