Closed Bug 1651297 Opened 4 years ago Closed 4 years ago

Stop using "about:newtab" in web-platform-tests

Categories

(Testing :: web-platform-tests, defect, P1)

defect

Tracking

(firefox81 fixed)

RESOLVED FIXED
81 Branch
Tracking Status
firefox81 --- fixed

People

(Reporter: whimboo, Assigned: whimboo)

References

Details

Attachments

(5 files)

Currently for newly opened tabs about:newtab is being loaded by the wptrunner. To check that run any kind of wpt test like:

mach wpt /html/cross-origin-opener-policy/blob-popup.https.html

Once it finished open a new tab in the harness window by pressing Cmd/Ctrl+T.

Note that this can have implications for tests opening new tabs on their own like the above one. As such make sure to not use about:newtab.

This can be done by setting browser.newtabpage.enabled to false.

Pushed by hskupin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ccf92483269b
[web-platform-tests] Use "about:blank" instead of "about:newtab" for newly opened tabs. r=jgraham

Backed out changeset ccf92483269b (bug 1651297) for many fonts related failures

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&group_state=expanded&tochange=254d403d51e2e523ae17c7daa3e24ce51e9987db&searchStr=windows%2C7%2Cdebug%2Cweb%2Cplatform%2Ctests%2Ctest-windows7-32%2Fdebug-web-platform-tests-reftest-e10s-3%2Cw%28wr3%29&fromchange=c7b6b13b88f4f6f703394ed3bf1981620818d723&selectedTaskRun=T94l5LTdQmeOFz92u84QbQ.0

Backout link: https://hg.mozilla.org/integration/autoland/rev/339829507e50bf130c8076977a73b7cab4715cfc

Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=308974465&repo=autoland&lineNumber=7178

[task 2020-07-08T12:52:04.815Z] 12:52:04     INFO - TEST-START | /css/css-text/shaping/reference/shaping-001-ref.html
[task 2020-07-08T12:52:04.830Z] 12:52:04     INFO - PID 2572 | 1594212724817	Marionette	INFO	Testing http://web-platform.test:8000/css/css-text/shaping/reference/shaping-001-ref.html != http://web-platform.test:8000/css/css-text/shaping/reference/shaping-001-sanity-ref.html
[task 2020-07-08T12:52:04.830Z] 12:52:04     INFO - PID 2012 | Could not create glyph run analysis.
[task 2020-07-08T12:52:04.831Z] 12:52:04     INFO - PID 2012 | /builds/worker/checkouts/gecko/gfx/skia/skia/src/ports/SkScalerContext_win_dw.cpp(519) : error 0x80070005: Access is denied.
[task 2020-07-08T12:52:04.831Z] 12:52:04     INFO - PID 2012 | Requested bounding box could not be determined.
[task 2020-07-08T12:52:04.831Z] 12:52:04     INFO - PID 2012 | /builds/worker/checkouts/gecko/gfx/skia/skia/src/ports/SkScalerContext_win_dw.cpp(742) : error 0x80070005: Access is denied.
[task 2020-07-08T12:52:04.866Z] 12:52:04     INFO - PID 2012 | 1594212724856	Marionette	WARN	[46] http://web-platform.test:8000/css/css-fonts/font-features-across-space-2.html overflows viewport (width: 783, height: 606)
[task 2020-07-08T12:52:04.869Z] 12:52:04     INFO - PID 2012 | 1594212724861	Marionette	INFO	No differences allowed
[task 2020-07-08T12:52:04.950Z] 12:52:04     INFO - TEST-UNEXPECTED-FAIL | /css/css-fonts/font-features-across-space-2.html | Testing http://web-platform.test:8000/css/css-fonts/font-features-across-space-2.html == http://web-platform.test:8000/css/css-fonts/font-features-across-space-1-ref.html
[task 2020-07-08T12:52:04.951Z] 12:52:04     INFO - Found 219 pixels different, maximum difference per channel 255
[task 2020-07-08T12:52:04.951Z] 12:52:04     INFO - REFTEST   IMAGE 1 (TEST): 
[task 2020-07-08T12:52:04.951Z] 12:52:04     INFO - REFTEST   IMAGE 2 (REFERENCE): 
[task 2020-07-08T12:52:04.951Z] 12:52:04     INFO - TEST-INFO took 173ms
Flags: needinfo?(hskupin)

Affected here are random tests under /css/css-fonts/ for windows 7 debug only. Not sure how the change from about:newtab to about:blank for new tabs could have caused that.

Jonathan, do you have an idea what this could be?

Flags: needinfo?(hskupin) → needinfo?(jfkthame)

This is actually caused by:

[task 2020-07-08T13:59:14.388Z] 13:59:14     INFO - PID 2204 | Could not get gdi compatible glyph metrics.
[task 2020-07-08T13:59:14.388Z] 13:59:14     INFO - PID 2204 | /builds/worker/checkouts/gecko/gfx/skia/skia/src/ports/SkScalerContext_win_dw.cpp(435) : error 0x80070005: Access is denied

I spoke with James on Matrix and we might consider to just mark these tests as intermittent fail for Windows 7 debug, where the underlying problem might never be get fixed.

Yeah, as discussed on matrix this is basically the issue described in bug 1392106. It's a failure happening within the Windows graphics stack. Usual response is to mark badly-affected tests as random or skip them on Win7, as we don't have any real solution.

Flags: needinfo?(jfkthame)

I updated the meta data and here one more problematic test:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ba090a8446669db6fb20e46f6b4c1be289e2f0d6&selectedTaskRun=NUe-ikKJREGcP1RZZIFk9g.0

The test fails even without this Skia failure and the expected string PASS is not shown. Only P is visible in this image. I haven't seen it failing on other platforms. Shall we leave it as a new intermittent failure after this patch landed, or also mark it flaky?

Flags: needinfo?(jfkthame)

Given that the "P" in the reference rendering comes from the default Times font, not the testcase's webfont, that one looks like it's an example of bug 1396260 -- although it's "resolved", that was never really fixed, just swept (mostly) under the rug.

I guess if it happens more than once, I'd just mark it as (yet another) test that's flaky on win7 and move on....

Flags: needinfo?(jfkthame)

As noticed we are running multiple Firefox processes in parallel. This clearly can cause resource constraints. I pushed a patch to limit web-platform-tests to a single process for Windows 7 debug jobs.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=03d2cf63e5e5d030b8b02df89bfcf65cd085c7d0

The try build shows a couple of failures under /css/WOFF2. It's not only because some characters are missing due to the error 0x80070005: Access is denied failure, but in the cases that I checked the word FAIL instead of PASS appears. So it's different from the former failure, and running with a single process seem to have helped.

I pushed another try build with marionette trace logging enabled. Maybe there is something obvious why these tests are failing:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=23fa382ac9b60e43d6dc145b3d53492cc2579ab0

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #11)

The try build shows a couple of failures under /css/WOFF2. It's not only because some characters are missing due to the error 0x80070005: Access is denied failure, but in the cases that I checked the word FAIL instead of PASS appears. So it's different from the former failure, and running with a single process seem to have helped.

Those tests show FAIL instead of PASS if loading/instantiating the test's webfont fails ("FAIL" then comes from a fallback font). If this is happening intermittently (on win7 only), it looks like another manifestation of the graphics system giving up under the strain... unfortunately it doesn't come with a characteristic error in the log like the missing-glyph failure, but it's still a similar situation. Basically still bug 1396260.

In case of failing tests we have the following output:

Testing http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011.xht == http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011-ref.xht
screenshot http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011-ref.xht remainingCount: 1 cache: false cacheKey: 800x600
reusing canvas from canvas pool
Starting load of http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011-ref.xht
[44] Waiting for page load of http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011-ref.xht
[44] Waiting for event loop to spin
[44] Waiting for rendering
screenshot http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011.xht remainingCount: 1 cache: false cacheKey: 800x600
reusing canvas from canvas pool
Starting load of http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011.xht
[44] Waiting for page load of http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011.xht
[44] Waiting for event loop to spin
[44] Waiting for rendering
[44] reftestWait: true

And here a working case:

Testing http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011.xht == http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011-ref.xht
screenshot http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011-ref.xht remainingCount: 1 cache: false cacheKey: 800x600
reusing canvas from canvas pool
Starting load of http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011-ref.xht
[44] Waiting for page load of http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011-ref.xht
[44] Waiting for event loop to spin
[44] Waiting for rendering
[44] reftestWait: true
screenshot http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011.xht remainingCount: 1 cache: false cacheKey: 800x600
reusing canvas from canvas pool
Starting load of http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011.xht
[44] Waiting for page load of http://web-platform.test:8000/css/WOFF2/metadatadisplay-schema-extension-011.xht
[44] Waiting for event loop to spin
[44] Waiting for rendering
[44] reftestWait: true

As you can see there is no reftestWait: true line logged for the reference image, which means we might screenshot too early and as such the reference image is not correct, and a following comparison results in a false-positive.

The condition in marionette relies on windowUtils.isMozAfterPaintPending, and if that's false we call into requestAnimationFrame() and resolve the promise with the next frame.

James, given that you wrote that code, do you have an idea what's going on? If not I could clearly investigate that via try server builds but it might take a bit of time.

Flags: needinfo?(james)

The logging here is a little misleading; reftest-wait isn't actually involved. The code that produces the logging is:

      if (windowUtils.isMozAfterPaintPending) {
        logger.debug(`reftestWait: ${windowUtils.isMozAfterPaintPending}`);
        win.addEventListener("MozAfterPaint", maybeResolve, { once: true });
      } else {
        // resolve at the start of the next frame in case of leftover paints
        win.requestAnimationFrame(() => {
          win.requestAnimationFrame(resolve);
        });
      }

So in the failing case we don't think there's a pending paint event and proceed with the rendering. So I think there's a race condition but it at least isn't obviously a bug in the test harness.

Flags: needinfo?(james)

Ok, when moving out the logging line I sometimes see:

[task 2020-07-15T11:19:42.220Z] 11:19:42     INFO - PID 5356 | 1594811982216	Marionette	DEBUG	[44] Waiting for rendering
[task 2020-07-15T11:19:42.223Z] 11:19:42     INFO - PID 5356 | 1594811982219	Marionette	DEBUG	[44] reftestWait: true
[task 2020-07-15T11:19:42.226Z] 11:19:42     INFO - PID 5356 | 1594811982223	Marionette	DEBUG	[44] reftestWait: true
[task 2020-07-15T11:19:42.226Z] 11:19:42     INFO - PID 5356 | 1594811982224	Marionette	DEBUG	[44] reftestWait: true
[task 2020-07-15T11:19:42.246Z] 11:19:42     INFO - PID 5356 | 1594811982239	Marionette	DEBUG	[44] reftestWait: false

So it can take multiple iterations before windowUtils.isMozAfterPaintPending is actually false. And in the case when we hit false immediately maybe the call to windowUtils.updateLayerTree() causes a delayed update of windowUtils.isMozAfterPaintPending? When I switch from requestAnimationFrame() to a simple setTimeout() everything passes:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1d3c78b44edf262c97bea54219a43295f10da310

I pushed one more try build that reverts the last change, and adds some more logging:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b642736031debcd74e7f097948f295b2922cde5f

Right, we obviously can't actually add a 100ms timeout here; that would badly regress the performance of these tests.

Per Emilio on matrix it sounds like there's possibly some additional background processing on the fonts that doesn't block the load event, and apparently doesn't cause us to believe there's a pending paint? I don't know if await document.fonts.ready would account for that? Otherwise I don't know enough about the layout engine to really know what to do in cases like this; it would be helpful to get input from :mattwoodrow in case there's something we ought to be waiting on but aren't, or some other error in the logic here.

Flags: needinfo?(matt.woodrow)

Looking into the font code, it looks like we don't call UnblockOnload until after we've finished sanitizing on the background thread.

I had a look through the reftest and and marionette code, and can't find any reason why this would happen.

One difference is that it looks like reftest-content.js waits for isMozAfterPending/MozAfterPaint first, and then runs updateLayerTree, whereas listener.js is doing that in the reverse order.

I can't see how that could cause this failure, but might be worth trying.

Flags: needinfo?(matt.woodrow)

This doesn't seem to work, or I still did something wrong with the order of calls:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4afeea53a9f425fa6f577101f6c5d70d8c87ddcf

  1. await win.document.fonts.ready
  2. Flush rendering
  3. Awaiting MozAfterPaint event
  4. Calling updateLayerTree() if remote
  5. Calling requestAnimationFrame() twice (delay by ~34ms)

Matt, I would appreciate if you could have another look.

Flags: needinfo?(matt.woodrow)

If there are some MOZ_LOG's available let me know. I'm happy to let it run for further diagnostics.

Henrik - if you do more try runs here, you might like to include this patch, which adds NS_WARNING messages for a few places in the font-loading code where we currently just silently fail. The point being that if these show up in the log, then we'd know that nothing you do to tweak paint timing etc is going to help, because the font is just failing to load at all.

I pushed a new (non-artifact) try build including Jonathan's patch:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=721bcd0f5b7ee13d95c663245190d791837db814

Matt doesn't have further ideas. As such we agreed to mark failing tests as intermittent. As such I will also revert the current reftest changes to what we had before, but leave the 1proc setting in-tact for wptrunner.

Here the try job that I will take as base for marking tests as intermittent fail:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d06562263fc821a28347861554c867b6d2e76daa

Flags: needinfo?(matt.woodrow)

The severity field is not set for this bug.
:jgraham, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(james)
Severity: -- → S3
Flags: needinfo?(james)

Here a hopefully final try push with all the affected tests as marked intermittent failing:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=64b881bc13af6067bff94dd6fa4f3c60885f29f6

James, if all looks fine please push while I'm absent. Thanks.

Flags: needinfo?(james)
Pushed by james@hoppipolla.co.uk:
https://hg.mozilla.org/integration/autoland/rev/a3be6037d7e2
[web-platform] Run tests for Windows7 debug with a single Firefox process. r=jgraham
https://hg.mozilla.org/integration/autoland/rev/b580324b0c9d
[web-platform-tests] Use "about:blank" instead of "about:newtab" for newly opened tabs. r=jgraham
https://hg.mozilla.org/integration/autoland/rev/16e6e653f651
[web-platform-tests] Mark description and extension metadatadisplay-schema tests as intermittent fail on windows7 debug. r=jfkthame
https://hg.mozilla.org/integration/autoland/rev/18049c38b415
[marionette] Improve debug logging for paintComplete in reftests. r=marionette-reviewers,jgraham,maja_zf
Flags: needinfo?(james)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: