Open Bug 1440212 Opened 2 years ago Updated 6 months ago

When can we fire "load" events on iframes async from "load" firing on the window inside?

Categories

(Core :: DOM: Core & HTML, enhancement, P2)

enhancement

Tracking

()

ASSIGNED
Fission Milestone Future

People

(Reporter: bzbarsky, Assigned: bzbarsky)

References

(Depends on 1 open bug)

Details

Attachments

(3 files)

Right now, nsGlobalWindowInner::PostHandleEvent does this:

    nsCOMPtr<Element> element = GetOuterWindow()->GetFrameElementInternal();
    nsIDocShell* docShell = GetDocShell();
    if (element && GetParentInternal() &&
        docShell && docShell->ItemType() != nsIDocShellTreeItem::typeChrome) {
....
      WidgetEvent event(aVisitor.mEvent->IsTrusted(), eLoad);
...
      EventDispatcher::Dispatch(element, nullptr, &event, nullptr, &status);

That is, immediately after load event dispatch completes for the window we fire it on the parent <iframe> element.

Per HTML spec, there are two possible cases here:

1)  The initial about:blank case.  This is handled by https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-iframe-element:iframe-load-event-steps which queues a task to fire the load event on the element.

2)  Pageload in an iframe.  This is handled by the steps in https://html.spec.whatwg.org/multipage/parsing.html#stop-parsing which queue two tasks: one to fire a load event on the window (step 7) and one to mark the document as completely loaded (step 12).  The second task, per <https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-iframe-element:completely-loaded-2>, runs https://html.spec.whatwg.org/multipage/iframe-embed-object.html#iframe-load-event-steps which fires the load event on the <iframe> element.

The tasks for firing the two load events are queued to the "DOM manipulation task source", per the slightly hidden bit at the end https://html.spec.whatwg.org/multipage/parsing.html#the-end (which is where the "stops parsing" steps are).

Unfortunately, there are lots of other tasks that also use that task source.  Fortunately, all the ways of communicating cross-origin (e.g. postMessage) do NOT use it.  So reordering these tasks wrt such communication is OK.

In practice, I think that means that to comply with the current spec we need to effectively keep the sync firing if our parent is in the same docgroup (aka unit of similar-origin related browsing contexts, where synchronous communication would allow the pages to detect whether other things ran between the two events or not), but we can fire async if we're in different docgroups.

I haven't done any testing yet to see what other browsers do here.  It's not quite trivial to test because I think in many cases they violate the spec.  For example, Chrome fires the "load" event sync in case 1 above, as this testcase shows:

  <body><script>
    var i = document.createElement("iframe");
    i.onload = function() { console.log(3); }
    console.log(1);
    document.body.appendChild(i);
    console.log(2);
  </script></body>

Chrome logs "1, 3, 2".
Another interesting testcase, assuming that browsers actually fire "load" events on <style> elements per something resembling the current spec (which I know Firefox does not, btw...):

  <body><script>
    var i = document.createElement("iframe");
    i.srcdoc = '<script>onload = function() { parent.childLoaded(); }</' + 'script>';
    i.onload = function() { console.log(5); }

    function childLoaded() {
      var s = document.createElement("style");
      s.onload = function() { console.log(6); }
      console.log(3);
      document.body.appendChild(s);
      console.log(4);
    }

    console.log(1);
    document.body.appendChild(i);
    console.log(2);
  </script></body>

Per spec, this should log "1, 2, 3, 4, 5, 6", because the task to fire the "load" even on the <style> is queued to the "DOM manipulation task source" later than the two tasks to fire "load" events on the window and <iframe> element.  If we changed our code to queue async from PostHandleEvent, we would get "1, 2, 3, 4, 6, 5", I believe.  Right now, at least Firefox, Chrome, and Safari do "1, 2, 3, 4, 5, 6" here.
Anyway, for the purposes we care about here (out-of-process iframes) firing async at docgroup boundaries should be good enough.  Anne, is my analysis above correct in terms of this being spec-compliant?
Flags: needinfo?(annevk)
BroadcastChannel apparently uses the DOM manipulation task source too.  But those only broadcast same-origin, so cannot mess with the different-docgroup case.

And I just filed https://github.com/whatwg/html/issues/3492 on the spec having different behavior for <iframe> and <frame>; those should at least be consistent with each other....
I think so, yes. (The specification currently does assume synchronous access everywhere and doesn't actually allocate one Window object per theoretical process that can get a handle on it and such, but an implementation that does that should not technically be observably different. There might be/probably are mistakes though.)
Flags: needinfo?(annevk)
Depends on: 1440754
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
Comment on attachment 8953713 [details] [diff] [review]
Fire the "load" event on a frame element async from the load event on the window inside if they are in different docgroups

Review of attachment 8953713 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/html/test/file_iframe_sandbox_d_if11.html
@@ +11,5 @@
>    // this should fail the first time, but work the second
>    try {
>      window.parent.ok_wrapper(true, "a document that was loaded, navigated to another document, had 'allow-same-origin' added and then was" +
>        " navigated back should be same-origin with its parent");
>    } 

nit: clean up this trailing whitespace while you're here.

@@ +17,5 @@
> +  }
> +}
> +
> +// We depend on our link click adding a new session history entry for the new
> +// load.  Tht means we need to make sure it runs after our parent is done

typo: That

@@ +22,5 @@
> +// loading.  Otherwise the load can get turned into a replace load, not create a
> +// new session history entry, and then the back() call in the parent will fail.
> +//
> +// Since we are not same-origin with the parent, we wait for it to tell us when
> +// it's done loading.

Thanks for this comment :-)
Attachment #8953713 - Flags: review?(nika) → review+
Priority: -- → P2
Fission Milestone: --- → M3
Component: DOM → DOM: Core & HTML
No longer blocks: oop-frames
Duplicate of this bug: 1451455

There's a r+ patch which didn't land and no activity in this bug for 2 weeks.
:bzbarsky, could you have a look please?

Flags: needinfo?(bzbarsky)

This is blocked on another bug which is not landed, no?

Flags: needinfo?(bzbarsky) → needinfo?(sledru)

Yeah but it is hard for the bot to know that.
Maybe the bot should not do that when there is a "depends on".
Calixte, what do you think?

Flags: needinfo?(sledru) → needinfo?(cdenizet)

Maybe the bot should not do that when there is a "depends on".

That is my point, yes, if the "depends on" is open...

I'll find out a solution for this kind of bugs.

Flags: needinfo?(cdenizet)
Fission Milestone: M3 → M4

Boris, can we land this behind a pref until we get the consensus on the spec issue?

Flags: needinfo?(bzbarsky)

We can. I'd need to check whether you get a green try run without bug 1440754, but if we're just wanting this for testing, and off by default, then it might not matter.

OK, I merged the patch to tip. With the pref turned off, things are fine, so we can land that way: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1de1c9a6587b18e3c356f7327fdfb1f869211e15

With the pref turned on, there are various oranges: https://treeherder.mozilla.org/#/jobs?repo=try&revision=918935e9b236faf8917bd79e7fd9806dc8fb3aa9

It's probably worth debugging those oranges, but I definitely won't have time for that until Monday.

Ah, right. dom/security/test/csp/test_iframe_sandbox.html is failing because it more or less hits bug 1440754, which is why that "depends on" is there. That said, I just filed a clearer spec issue on that: https://github.com/whatwg/html/issues/4730

The toolkit/mozapps/extensions/test/browser/browser_CTP_plugins.js failure is because toolkit/mozapps/extensions/content/extensions.xul has markup sort of like this:

<browser id="html-view-browser" type="content" flex="1" disablehistory="true"/>

which queues up a load event on the <browser> with this new change (and I think per spec that's the right behavior, fwiw). Then before that load event has a chance to fire the code in toolkit/mozapps/extensions/content/extensions.js does:

    htmlBrowser.loadURI("chrome://mozapps/content/extensions/aboutaddons.html", {
      triggeringPrincipal: Services.scriptSecurityManager.getSystemPrincipal(),
    });
    htmlBrowserLoaded = new Promise(
      resolve => htmlBrowser.addEventListener("load", resolve, {once: true})
    ).then(() => htmlBrowser.contentWindow.initialize(htmlViewOpts));

which gets the about:blank load event that's already queued up, thinks it got the "aboutaddons.html" load event and gets an exception because htmlBrowser.contentWindow.initialize is undefined... Of course it would be even worse if it actually had an out-of-process thing there for that <browser type="content">, like it really probably should. That type="content" is why the about:blank is different-origin and hence is affected by these changes. Mark, do you know why the setup here is the way it is?

browser/components/extensions/test/browser/test-oop-extensions/browser_ext_runtime_openOptionsPage.js, browser/base/content/test/webextensions/browser_extension_sideloading.js, browser/components/extensions/test/browser/browser_ext_runtime_openOptionsPage.js are all that same aboutaddons failure.

The html/browsers/the-window-object/named-access-on-the-window-object/navigated-named-objects.window.html web-platform test goes from intermittent to perma-unexpected-pass. That's because if I recall correctly we do some async teardown stuff that the test expects to happen before it gets load events or something. And now we're firing load events a bit later, to the teardown gets a chance to actually happen....

Flags: needinfo?(nkochar)
Flags: needinfo?(mstriemer)
Flags: needinfo?(bzbarsky)

(answering for Mark because he is out and asked me to answer if possible)

I'm not sure why there is type="content" on the #html-view-browser. I removed it (and also from the #shortcuts-view browser in extensions.xul, which has similar logic), and all tests in toolkit/mozapps/extensions/test/browser/ still pass. type="content" might have been cargo-culted from the #discover-browser, which does actually load remote content (but has been obsoleted by bug 1546248 and will be removed soon as part of bug 1558982).

The #html-view-browser and #shortcuts-view browser elements both load chrome:-documents, so removing type="content" from them should be fine.

If switching to type="content" does not solve your problem, then I can also work on bug 1546711, which would remove the dependency on the "load" event from #html-view-browser (by relying on a message / function call from a script in the browser to know that the content has loaded).

In Firefox 70, the #html-view-browser will probably be gone anyway as part of bug 1558982, so you don't need to go out of your way to support this code path.

Flags: needinfo?(mstriemer)

Thank you!

Given comment 18 and 19, this is probably not a good candidate for Fission M4. I'll put this in Future bucket for now.

Fission Milestone: M4 → Future
Flags: needinfo?(nkochar)
You need to log in before you can comment on or make changes to this bug.