Skip to content

Support running a pthreads-using program in a worker #10036

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 15, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions src/shell.js
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,8 @@ var _scriptDir = (typeof document !== 'undefined' && document.currentScript) ? d

if (ENVIRONMENT_IS_NODE) {
_scriptDir = __filename;
} else if (ENVIRONMENT_IS_WORKER) {
_scriptDir = self.location.href;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this proves that nobody has ever used this configuration. Why do you think this configuration is important to support?

I think if we remove this support, then the whole contents of worker.js could be moved to reside in the same file a.js instead of having to have a separate file a.worker.js. That in turn would allow having worker.js content live inside the JS module in MODULARIZE=1, mode, which would allow removing the Module exports hack applied in #9569, while keeping complexity low, which was complained about in that same PR.

I write hack, because the exports on Module object should be restricted to code symbols that flow from JS module to outside to external user JS code, and not for code visibility for setting up threads. Reusing Module for that solidifies its role as a "communication hub of random symbols for all purposes" object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this hasn't been used is a point against it. However, I think it's an important use case, and I think users may have just seen it didn't work and not bother to file a bug, or they knew that nested workers weren't supported in Chrome (in the past when they weren't) so they didn't even try.

Example use cases might be a physics engine or a chess engine, or binaryen's wasm-opt (imagine an online wasm compressor site) or something else that does pure computation, and is called from JS. That JS might happen to run in a worker. I think it would be bad if it stopped working if the user moves it there, as in general we want people to do as much as possible in workers, so it would be an annoying limitation if pthreads was not supported in that environment.

About removing worker.js - maybe I don't see what you mean, but how specifically would disallowing instantiating in a worker make that easier? It seems orthogonal, so I must be missing something.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to distinguish between the main thread (ENVIRONMENT_IS_PTHREAD==False), and pthreads (ENVIRONMENT_IS_PTHREAD==True), we need one bit of information that we can use to differentiate these two. Currently this one bit of information is the fact that main thread loads a.js, and pthread workers load a.worker.js which has a ENVIRONMENT_IS_PTHREAD=True field in it.

If we did not support main()-in-Worker case, we could instead make this one bit of information be ENVIRONMENT_IS_PTHREAD=typeof importScripts == 'function', and both main() thread and pthread workers can then load up a.js and we could move everything we have in a.worker.js into a.js.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Or at least I have not been able to come up with any other way to differentiate..)

I suppose there could be a way to emit a a.worker.js only if one wanted to enable support for such a main()-in-Worker case, and if not, then the file a.worker.js could be avoided.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, now I see what you mean. Good point.

I think we just need a single bit of information there. We could move all of a.worker.js into a.js, getting rid of the Module usages that you don't like (and I agree), and all that is left in a.worker.js would be just "create Module, set Module.isPthread to true, load the main a.js", and that's it. So I think that would get you 99% of what you want there, but yeah, we would still need a (tiny) a.worker.js file.

There might even be a way to get rid of that file entirely: the docs say there is a "name" param that can be set when calling new Worker, which the worker can then access using self.name. Perhaps the name could be "emscripten_pthread", and we use that as an indicator? There seems some risk there, though, and we'd need to check if it works in node.js workers too.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting info about the name parameter, had missed that completely!

But apparently so did Safari developers :/ , filed a WebKit bug at https://bugs.webkit.org/show_bug.cgi?id=205253 .

Node.js has workerData, which could be used for this purpose: https://nodejs.org/api/worker_threads.html#worker_threads_worker_threads

}
#endif
#endif
Expand Down
15 changes: 15 additions & 0 deletions tests/test_browser.py
Original file line number Diff line number Diff line change
Expand Up @@ -4666,6 +4666,21 @@ def test_single_file_worker_js(self):
self.assertExists('test.js')
self.assertNotExists('test.worker.js')

# Tests that pthreads code works as intended in a Worker. That is, a pthreads-using
# program can run either on the main thread (normal tests) or when we start it in
# a Worker in this test (in that case, both the main application thread and the worker threads
# are all inside Web Workers).
@requires_threads
def test_pthreads_started_in_worker(self):
create_test_file('src.cpp', self.with_report_result(open(path_from_root('tests', 'pthread', 'test_pthread_atomics.cpp')).read()))
self.compile_btest(['src.cpp', '-o', 'test.js', '-s', 'TOTAL_MEMORY=64MB', '-s', 'USE_PTHREADS=1', '-s', 'PTHREAD_POOL_SIZE=8'])
create_test_file('test.html', '''
<script>
new Worker('test.js');
</script>
''')
self.run_browser('test.html', None, '/report_result?0')

def test_access_file_after_heap_resize(self):
create_test_file('test.txt', 'hello from file')
create_test_file('page.c', self.with_report_result(open(path_from_root('tests', 'access_file_after_heap_resize.c'), 'r').read()))
Expand Down