Closed Bug 1598674 Opened 2 months ago Closed 2 months ago

Hundreds of timeouts in upgrade-insecure-requests tests

Categories

(Testing :: web-platform-tests, defect)

Version 3
defect
Not set

Tracking

(firefox72 fixed)

RESOLVED FIXED
mozilla72
Tracking Status
firefox72 --- fixed

People

(Reporter: jgraham, Assigned: jgraham)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

Firefox has hundreds of timeouts in the upgrade-insecure-requests tests, which are causing about an hour of unnecessary runtime [1].

After some investigation the problem seems to be that the promise at [2] isn't resolving, meaning we aren't getting a load event from that iframe.

So a simple fix — but one that I'm not sure is valid — is just to get rid of the document.write and instead just do iframe.src = frameUrl. With that Firefox starts passing about the same tests as other browsers. However per the wpt.fyi results other browsers are passing these tests so there remains a compat issue that we need to investigate.

bz: I looked for a related bug around document.write and the load event, but couldn't see anything. Is this ringing any bells?

[1] https://wpt.fyi/results/upgrade-insecure-requests/gen/iframe-blank-inherit.meta?label=experimental&label=master&aligned
[2] https://searchfox.org/mozilla-central/source/testing/web-platform/tests/common/security-features/resources/common.sub.js#1237-1238

Flags: needinfo?(bzbarsky)

Another thing that seems to work is to spin the event loop after the first iframe load event, which seems like a plausible short term fix, and suggests the problem might be something like we're suppressing the second load event if we're in a microtask immediately after the first load event is fired or something.

I have a wpt for that difference, but I'll hold off on filing a bug for in in case I'm told that this actually some known issue :)

That live DOM viewer test basically has the following structure:

  1. Create a Promise p.
  2. Create an iframe element.
  3. Set its load event to resolve p
  4. Add a then handler to p that does document.write() followed by document.close() into the document in the iframe.
  5. Insert the iframe in the document.
  6. Check whether a second load event fires after the document.close.

There is a known behavior difference here between Firefox and the spec on the one hand and Chrome/Safari on the other hand. Specifically, when the iframe is inserted into the document, Chrome/Safari fire the load event before the appendChild call returns, while Firefox, and the spec, fire it from a task asynchronously.

So what happens here in Firefox, and per spec, is that we start at https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-iframe-element:process-the-iframe-attributes and then go to https://html.spec.whatwg.org/multipage/iframe-embed-object.html#process-the-iframe-attributes the second case, and end up executing https://html.spec.whatwg.org/multipage/iframe-embed-object.html#iframe-load-event-steps async, from a task. In those steps, we fire the load event, which has a listener that resolves a promise. Since there is no script on the stack before the listener's code is entered (because we are running async!), that promise's reaction microtasks are run right after the event listener returns, still under event dispatch. That runs the function that does the write/close() bits. Note that this is all happening under step 4 of "iframe load event steps", so the "iframe load in progress" flag that got set in step 3 is true.

Now when that flag is true at a point when an open() (explicit or implicit) happens, then https://html.spec.whatwg.org/multipage/dynamic-markup-insertion.html#document-open-steps step 13 sets the "mute iframe load" and then https://html.spec.whatwg.org/multipage/iframe-embed-object.html#iframe-load-event-steps step 2 suppresses the load event from the document.close(). So there is no second load event.

What happens in Chrome and Safari is that the iframe load event fires sync, while the script that called appendChild is on the stack. There is therefore no microtask checkpoint during that even firing, and in particular the promise that is resolved in the load event listener does not have its reactions run until the script that called appendChild completes, at which point the "iframe load in progress" flag is not set anymore. So in those browsers the open happens without that flag set, does not set "mute iframe load", and you get a second load event.

See also discussion in https://github.com/whatwg/html/issues/4965

The simple test fix here is to resolve the promise explicitly async from the load event, so we don't end up dealing with this "iframe load in progress" stuff.

Flags: needinfo?(bzbarsky)

Ah, I wondered if it followed from the different load event timing stuff, but I didn't know about the "is there a script on the stack" part of muting. Thanks!

Just to be clear, the "is there a script on the stack" just affects when Promise reactions (and hence the function passed to then) run, not muting per se... what matters for muting is whether open happens while the load event is firing.

Blink and WebKit dispatch the load event of the iframe synchronously,
whereas Gecko and (currently) the spec assume it's async. This causes
a hang in some tests using this helper library because in Gecko a
subsequent load ends up running in the event dispatch of the initial
load event, and so the load event is suppressed and the tests are
unable to complete. In other browsers the event is not suppressed and
so the tests run.

Avoid this by ensuring that the event loop always spins after the
iframe load.

Assignee: nobody → james
Pushed by james@hoppipolla.co.uk:
https://hg.mozilla.org/integration/autoland/rev/2b978939b653
Don't assume sync dispatch of iframe load event in tests, r=bzbarsky
https://hg.mozilla.org/integration/autoland/rev/206137f27f19
Update metadata, a=testonly
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/20418 for changes under testing/web-platform/tests
Can't merge web-platform-tests PR due to failing upstream checks:
Github PR https://github.com/web-platform-tests/wpt/pull/20418
* Community-TC (pull_request) (https://community-tc.services.mozilla.com/tasks/groups/UeXqYsx2T7av3RZ1i1flnw)
Flags: needinfo?(james)
Status: NEW → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla72
Upstream PR merged by jgraham
Flags: needinfo?(james)
You need to log in before you can comment on or make changes to this bug.