Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new SharedWorker() should queue a task to fire an error when parse error occurs #5323

Open
elkurin opened this issue Mar 2, 2020 · 15 comments
Labels
interop Implementations are not interoperable with each other topic: workers

Comments

@elkurin
Copy link
Contributor

elkurin commented Mar 2, 2020

According to the current spec, a shared worker doesn't fire an error event when parse error occurs while it does for fetch failure which fails in the same step as parse error. It would be nice to fire an error event for parse error as well so that developers can catch the error by AbstractWorker.onerror with the description of failure.

Fetch failure is caught here:

Step 12: Obtain script by switching on the value of options's type member: ...
If the algorithm asynchronously completes with null, then:
SubStep1: Queue a task to fire an event named error at worker.

https://html.spec.whatwg.org/multipage/workers.html#run-a-worker

We get null as a result of fetching algorithm when fetch failure occurs but we get the result with |error to rethrow| when parse error occurs and it is treated as non-null. Therefore, I suggest to modify the spec by adding the condition to the error handling step like:
"If the algorithm asynchronously completes with null or script whose error to rethrow is not null, then: queue a task to fire an event named error at worker ..."
(As far as I know, FireFox has already implemented parse error handling as this.)

The similar change has done in worklets spec as well: w3c/css-houdini-drafts#509
The corresponding spec is here: https://drafts.css-houdini.org/worklets/#fetch-and-invoke-a-worklet-script

For more information if you need: https://docs.google.com/document/d/1ot2GurRwSUlvAEB1_XlVPR9-7KT8lAmK1sAb8P11qZY/edit#heading=h.fo73cyj0hll

@annevk
Copy link
Member

annevk commented Mar 2, 2020

I was thinking this was done in part because there can be multiple SharedWorker instances pointing to the same shared worker (which is why it's different from dedicated workers and worklets), but the algorithm doesn't seem to account for that match and in fact is still racy even with the shared worker manager (due to usage of "in parallel" in the end, perhaps that ought to not use that).

@annevk
Copy link
Member

annevk commented Mar 2, 2020

cc @asutherland @perryjiang

@annevk annevk changed the title [SharedWorkers] new SharedWorker() should queue a task to fire an error when parse error occurs new SharedWorker() should queue a task to fire an error when parse error occurs Mar 2, 2020
@asutherland
Copy link

Yes, Firefox will report parse/evaluation errors. (SharedWorkers' handling I believe lives here which has special cases for each of the 3 types of workers). And it seems appropriate for this to be in the spec.

However, it looks like Blink doesn't report the error? I made a quick test case at https://worker-playground.glitch.me/shared-worker-top-level-error.html that does console.log() for Dedicated and Shared Workers on the same file, and Firefox reports both errors, but Chrome does not. (And Safari doesn't support SharedWorkers.)

@annevk
Copy link
Member

annevk commented Mar 3, 2020

I think @elkurin wants to change that.

I think the other thing we should clarify is what happens if you have multiple SharedWorker instances all pointing to the same thing that fails to fetch or fails to parse. The specification looks somewhat broken in how it deals with that.

@asutherland
Copy link

Yes, the spec should be changed. And in the case of multiple SharedWorker instances, it would seem to make sense that every SharedWorker instance associated with the given worker global scope be notified of the error. If there's only one SharedWorker at the time of evaluation, only one hears about it.

I wonder if this should be handled by changing https://html.spec.whatwg.org/#runtime-script-errors-2 to treat shared workers similarly to dedicated workers?

@annevk
Copy link
Member

annevk commented Mar 3, 2020

That would go even further and include errors beyond fetch/parse. Not entirely convinced that's desirable. (Also not sure it was a good idea for dedicated workers.)

@elkurin
Copy link
Contributor Author

elkurin commented Mar 3, 2020

Notifying to all SharedWorker instance sounds reasonable. I thought current spec has already supported in that way but it seems not.

I think we shouldn't change https://html.spec.whatwg.org/#runtime-script-errors-2 similarly to dedicated workers. All SharedWorker instances associated with the worker are guaranteed to be related to error events during fetching (fetch failure and parse error) because all of them attempt to fetch the script. On the other hand, runtime script errors are not necessarily related to the associated workers.
I think firing error events only for fetching might be nicer.

@elkurin
Copy link
Contributor Author

elkurin commented Mar 3, 2020

However, it looks like Blink doesn't report the error?

This is true. Blink doesn't report parse errors nor evaluation errors. Only reports fetching failure errors.

@asutherland
Copy link

https://html.spec.whatwg.org/#runtime-script-errors-2 basically describes bubbling an error event from [WorkerGlobalScope, Worker, Worker's global scope] (unless handled). If the SharedWorker's global doesn't handle the error due to breakage, why wouldn't content want to be able to know that? A lot of recent discussion in the ServiceWorker WG relates to sites being paranoid about ServiceWorker and storage breakage, so providing a mechanism to detect unhandled errors seems useful, if duplicative.

Coincidentally, Firefox also reports runtime errors. I created https://worker-playground.glitch.me/shared-worker-dynamic-error.html which logs both errors from onconnect and onmessage from a handler added in onconnect.

@annevk
Copy link
Member

annevk commented Mar 3, 2020

In part it seems problematic because it's not robust. A shared worker might temporarily not have a document associated with it (although maybe it's event loop has to be frozen while that happens?). And it also doesn't seem particularly efficient to broadcast all errors. But I don't care too strongly.

@asutherland
Copy link

I took a deeper look at https://docs.google.com/document/d/1ot2GurRwSUlvAEB1_XlVPR9-7KT8lAmK1sAb8P11qZY/edit#heading=h.fo73cyj0hll and I'm not sure why it would be desirable to only do the #1 fix (throw on parse errors, step 12 of run a worker) versus #2 also (throw on errors running the script/top-level evaluation, step 13 of run a worker).

Given the dynamic nature of JS and the precedent of ServiceWorker's Run Service Worker algorithm it seems weird to only return parse errors when many typos would equally manifest as runtime errors running the classic/module script.

And I think this leads into Anne's point on robustness. The current spec would seem to dictate that if same-origin window A1 and A2 both create an equivalent SharedWorker with A1's request reaching the shared worker manager first, only A1 would receive any errors. And A1 might not actually remain alive long enough to process those errors.

If we're not changing the runtime behavior to report all errors (which would admittedly have robustness issues), then it seems like we want Run a Worker to be more like Run a Service Worker, specifically persisting its "start status"/"startFailed" state and (hand-wavingly) allowing subsequent requests to join a pending request. This would be the basis for consistently returning errors whenever attempting to attach a new SharedWorker to the underlying worker. This would include both the fetch/parse step 12 and the run at step 23.

A new spec decision is then what to do if there was a "run classic/module script" error that results in "startFailed" being true given that there's no SharedWorker.terminate() method. If the fetch/parse fails, the environment is discarded, but it's not obvious what to do in this case. For ServiceWorkers that are being installed, the install fails, but once they're installed, the ServiceWorker gets to keep running. It doesn't help that the desired behaviors for a deployed site versus a site that is currently being iterated on have different behaviors... maybe Safari had the right idea with unshipping SharedWorker...

Bug xlinks: https://bugzilla.mozilla.org/show_bug.cgi?id=1254125 is Firefox's bug on reporting all errors, and https://bugzilla.mozilla.org/show_bug.cgi?id=1619728 is the related bug that we close the SharedWorker binding down whenever we hear an error (oops!).

@elkurin
Copy link
Contributor Author

elkurin commented Mar 4, 2020

It seems 3 discussions are on-going here.

  1. Parse error should be handled similarly to fetch failure and should be reported to all of the associated shared workers.
    I think we can reach to the consensus on this part. At least, FireFox and Blink both agreed. We can report all of the associated shared workers by looking over owner's list. This is the initial scope of this spec issue.

  2. Runtime error should be handled similarly to dedicated workers.
    I assume @asutherland wants to handle runtime error as same as fetch/parse errors. I partially agree with this. It is true that firing error events only for parse errors while ignoring all other runtime errors. (BTW, are there cases that typo error are not handled as par errors?) On the other hand, for example, the following case would happen:
    Window A1 and A2 both construct the same SharedWorker instance and successfully completed worker creation algorithms. A1 calls the event handler via SharedWorker and get a runtime error while A2 is not calling anything, then both A1 and A2 will catch error events.
    In the case above, A2 is not related to the error itself, so it might be confusing for A2 to catch the error invoked by A1. Therefore, I hesitate to report the exception to all of the shared worker instances associated with SharedWorkerGlobalScope. (Fetch/parse errors only occur during the construction, so it's reasonable to report error events to all of the associated workers since all of them are attempting to construct the shared worker instance.) Note that this is a weak suggestion. I think handling runtime error as same as that of dedicated workers is also reasonable.

  3. How to handle the timing issue: racy algorithms when multiple SharedWorker instances point to the same SharedWorkerGlobalScope
    This issue is not unique to parse error handling issue but very critical.
    We create SharedWorkerGlobalScope inside the worker thread context, while we try to match SharedWorker instace with the existing SharedWorkerGlobalScope outside of the worker thread context, so it is difficult to remove asynchronous algorithms. As @asutherland mentioned, giving states makes the situation better but still the steps cannot be synchronize strictly because there needs cross-context conversation.

As for the last issue, it is not only for error handling but the problem of shared worker construction algorithms. Currently, shared workers with same conditions which should point to the same global scope might be duplicated. This issue is actually out of scope in this spec issue title, so maybe we should rename the title or create another issue and move?

@asutherland
Copy link

asutherland commented Mar 4, 2020

I'm okay with not reporting all runtime errors, I intended to convey a de-scoping to just reporting those resulting from the top-level evaluation should be reported. Once this initial bootstrap is completed, the SharedWorker content code should be able to handle "error" events fired on its global. (It's still appropriate for devtools to relay all such errors, of course.)

For typos I mainly mean mis-spelling an identifier name which results in ReferenceErrors. I made an example page at https://worker-playground.glitch.me/shared-worker-identifier-typo.html where Chrome doesn't report the error but Firefox obviously does (because we report everything).

@elkurin
Copy link
Contributor Author

elkurin commented Mar 11, 2020

Thanks for the discussion.

As I mentioned above, this thread is discussing the several issues at once, so let me focus on the issue 1 in the above comment.
As for the issue 2 (runtime error handling), whether we fire an error event or not, it is sure that runtime error and parse error should be handled differently. We can catch runtime error in the worker context but cannot catch parse error, so parse error is always unhandled. Let's discuss about runtime error divided from parse error elsewhere.
As for the issue 3 (sharedworker is racy), this seems to be a very critical issue and might need drastic changes. This is not only about the error handling. We need to discuss how to remove raciness, so I will open the issue for it.

I think we reached at the consensus that it is natural to handle parse error similarly as fetch failure, so now I would like to move on to creating PR.
If you find something wrong in my understanding or have any thought, please let me know.

domenic pushed a commit that referenced this issue Mar 13, 2020
This aligns the behavior of parse errors in a SharedWorker with fetch
failures. See discussion in #5323, which has gone beyond that to also
discuss other types of errors, but for now we have agreement on parse
errors at least.
@domenic
Copy link
Member

domenic commented Mar 13, 2020

OK, I've merged #5347, and we're almost finished with the tests in web-platform-tests/wpt#22185.

@elkurin, could you either:

  • Rename this issue to reflect the broader scope, and edit the original post to add a quick summary of the issues still under discussion, so that people can understand why it is still open; or
  • Close this issue, and open new issues for (2) and (3), each of which summarize the current discussion so far? (Or, a single issue combining them both, if you think they should be discussed together.)

@domenic domenic added the interop Implementations are not interoperable with each other label Mar 13, 2020
blueboxd pushed a commit to blueboxd/chromium-legacy that referenced this issue Mar 25, 2020
This CL handles parse error events in modules shared workers.
By this change, parse error events invoked by top-level scripts and
statically imported scripts can be handled by AbstractWorker.onerror.

The HTML spec defines script parsing should occur during the fetch step
and invoke a parse error event at that point if needed. This CL obeys
this behavior for module script workers, but not for classic script
workers because of an implementation reason that classic script workers
are supposed to parse the script during the evaluation step. Therefore,
the timing to catch parse error events differs.
In this CL, parse error handling is only implemented for module shared
workers, not for classic.

This is discussed in the html spec issue:
whatwg/html#5323

The wpt to check this feature will be added from external github:
web-platform-tests/wpt#22185


Bug: 1058259
Change-Id: I2397a7de8e2ae732fb0b29aea8d8703dd2a79a05
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2100058
Reviewed-by: Hiroki Nakagawa <nhiroki@chromium.org>
Commit-Queue: Eriko Kurimoto <elkurin@google.com>
Cr-Commit-Position: refs/heads/master@{#753185}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interop Implementations are not interoperable with each other topic: workers
Development

No branches or pull requests

4 participants