Possible race in feature policy in multiprocess implementations #256

bzbarsky · 2018-11-27T16:40:54Z

Consider the following sequence of events, for a page loaded from site A:

Create an iframe.
Load different-origin site B in the iframe. Wait for iframe's onload to fire.
Start load of different-origin site C in the iframe via setting location.href (which can be done cross-origin).
Mutate the allow attribute of the iframe to change the container policy.

Here step 4 happens immediately after step 3, without returning to the event loop. That is, the code looks like this:

iframe.contentWindow.location.href = "http://C.com";
iframe.allow = stuff;

What should happen? Per spec, the load starts, goes async. Then control unwinds, we end up setting the "allow" attribute. The load completion happens on the same event loop as this code (because the HTML spec has a single event loop there), so we're guaranteed that we can't create the new document in the iframe until after the script above runs to completion. At the point when that document creation happens it synchronously reads attributes from the iframe and picks up the modified feature policy.

That's the spec situation. What happens in a naive implementation with out of process iframes? The href set sends a message to the other process to do the load. After that, the load only involves the processes for origins B and C; it doesn't involve the original process for A at all, except sending back a message to run the load event. So as far as I can tell it would be entirely possible for the parent page process to lose the timeslice for the entire duration of the load and not do the allow attribute set until after the new document in the subframe is created.

OK, so what happens when the subframe document is created? Implementing the letter of the feature policy spec involves a sync IPC call of some sort to the A process. If that (1) blocks the document creation in process C and (2) is not serviced in A until the JS runs to completion, then the spec as currently written can be implemented, and the new document will pick up the modified policy. But having such a setup, where we can block a process until some other process returns to at least a microtask checkpoint, if not to the event loop, doesn't seem like a great idea, because we could have two such IPCs in flight at once going in different directions and end up deadlocking, afaict.

A more natural fit for a multiprocess scenario would be to broadcast the allow attribute change to the processes involved with rendering stuff inside the iframe, so they can mirror the changed information. But then we're back to the original problem: the load can win the race against the allow attribute change.

The other option, of course, is to stop synchronously poking at the iframe attributes across process boundaries. Instead, snapshot the container policy when a load starts. This eliminates the conceptual races, but does require either authors setting allow before setting src (or before inserting the iframe in the DOM) or doing the "process the iframe attributes" thing when allow is set (so canceling the existing load, starting a new one, etc) or something.

I'd be curious to know what Chrome actually does here in the out-of-process-iframe case...

@clelland @Baku @annevk @domenic

The text was updated successfully, but these errors were encountered:

annevk · 2018-11-28T09:05:07Z

Is there anything specific to feature policy here? Presumably this is more of a bug in the HTML standard, e.g.:

frame.contentWindow.location.href = "https://other-origin.example/alert-self.name"
frame.name = "blah"

Good times though. 😊

A design that could work here is for the Parent to batch up the changes relevant to Child during the current task and then message them all at once end-of-task. Child could then take all the details into account, though it would require Child to wait another turn to start the actual load (unless you reorder the changes somehow).

bzbarsky · 2018-11-28T16:04:31Z

The browsing context name has a similar problem, I agree.

If we can solve both with the same approach, that makes some sense. Or we could just subtly change how the browsing context name behaves in various ways.

I'm more worried about feature policy than the browsing context name because of its use as a security device.

batch up the changes relevant to Child during the current task

That has its own problems, because this buffer could end up growing very large.... :( But yes, it might work in general.

clelland · 2018-11-28T18:53:04Z

In Chrome, setting the relevant iframe attributes updates the container policy on the element (renderer process) That gets sent asynchronously to the browser process, where it is stored as the "pending frame policy" member on the child frame's node in the frame tree. The browser also sends these to the renderer for site B, where they are also stored (as part of its RemoteFrameOwner member)

(I think this matches your more natural, broadcast proposal.)

There may be a race in that case, as you say, if the network causes the new document to be created before the policy is propagated. I'm not sure if that's possible in practice, though -- I'm not as familiar with the loading code, and it may be that the policy messages will always arrive first (the pipes are asynchronous, but FIFO, so it will matter what order the messages are inserted). I'll do some investigation into what can actually happen in this situation.

The same code is used to update the pending sandbox flags for the iframe as well, so

iframe.contentWindow.location.href = "http://C.com";
iframe.sandbox = "allow-scripts";

is subject to the same race, I believe.

bzbarsky · 2018-11-28T18:58:20Z

Hmm. Yes, sandbox would have the same issue. That's not great. I wonder whether sites depend on the above ordering working for sandbox...

annevk · 2019-04-24T14:11:15Z

I think the ideal behavior here is that the navigate algorithm creates a "collection of state a future document might need" and passes that along, which means that mutations after a location set would not take effect, unless there is some kind of update mechanism defined for them.

This change fixes a race condition where an iframe's sandboxing flag set could be changed in between the start of a navigation and when the response is returned, and the new document created. In that case, it was unclear how the new document could reliably synchronously get the updated flags, or just exactly how late those flags could be changed and still impact the new document. Now, the sandboxing flag set is routed from the beginning of the navigation to the eventual document creation. Ref: whatwg#4783, and also see w3c/webappsec-permissions-policy#256

This change fixes a race condition where an iframe's sandboxing flag set could be changed in between the start of a navigation and when the response is returned, and the new document created. In that case, it was unclear how the new document could reliably synchronously get the updated flags, or just exactly how late those flags could be changed and still impact the new document. Now, the sandboxing flag set is routed from the beginning of the navigation to the eventual document creation. See #4783 and w3c/webappsec-permissions-policy#256 which outline similar problems for feature policy.

clelland · 2019-12-06T21:26:43Z

Now that this has been fixed for sandboxing, we can do the same thing for feature policy. I've created a PR that separates the steps of constructing the container policy from the parent document's state and creating the feature policy for a nested document. With updates to HTML to use those two steps, we should be able to fix this issue.

pabrai added feature question Questions and issues around specific policy-controlled features and removed question labels May 8, 2019

clelland added architecture and removed feature question Questions and issues around specific policy-controlled features labels May 9, 2019

clelland mentioned this issue Jul 17, 2019

Use the newly exposed methods to initialize feature policy. whatwg/html#4772

Merged

dtapuska mentioned this issue Jul 17, 2019

Initializing Feature Policy is racy whatwg/html#4783

Open

clelland mentioned this issue Nov 18, 2019

Snapshot sandbox at start of navigation. whatwg/html#5098

Merged

3 tasks

clelland linked a pull request Dec 6, 2019 that will close this issue

Fix navigation race #358

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Possible race in feature policy in multiprocess implementations #256

Possible race in feature policy in multiprocess implementations #256

bzbarsky commented Nov 27, 2018

annevk commented Nov 28, 2018

bzbarsky commented Nov 28, 2018

clelland commented Nov 28, 2018

bzbarsky commented Nov 28, 2018 •

edited

Loading

annevk commented Apr 24, 2019

clelland commented Dec 6, 2019

Possible race in feature policy in multiprocess implementations #256

Possible race in feature policy in multiprocess implementations #256

Comments

bzbarsky commented Nov 27, 2018

annevk commented Nov 28, 2018

bzbarsky commented Nov 28, 2018

clelland commented Nov 28, 2018

bzbarsky commented Nov 28, 2018 • edited Loading

annevk commented Apr 24, 2019

clelland commented Dec 6, 2019

bzbarsky commented Nov 28, 2018 •

edited

Loading