Closed Bug 1678089 Opened 4 years ago Closed 3 years ago

Perma [tier2][fission] layout/base/tests/test_bug851485.html | Scrolling after load works. | Reloading keeps scroll position - got 4904, expected 500

Categories

(Core :: Layout, defect, P5)

defect

Tracking

()

RESOLVED FIXED
93 Branch
Fission Milestone M8
Tracking Status
firefox-esr78 --- disabled
firefox-esr91 --- disabled
firefox83 --- disabled
firefox84 --- disabled
firefox85 --- disabled
firefox86 --- disabled
firefox87 --- disabled
firefox88 --- disabled
firefox91 --- disabled
firefox92 --- disabled
firefox93 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: hiro)

References

(Regression)

Details

(Keywords: intermittent-failure, regression, Whiteboard: [stockwell disabled] fission-soft-blocker)

Attachments

(3 files, 2 obsolete files)

Filed by: apavel [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=322203872&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/EprbhutiS8S5JCFc4qJ-fg/runs/0/artifacts/public/logs/live_backing.log


[task 2020-11-18T18:09:37.967Z] 18:09:37     INFO - TEST-START | layout/base/tests/test_bug851485.html
[task 2020-11-18T18:09:38.682Z] 18:09:38     INFO - TEST-INFO | started process screentopng
[task 2020-11-18T18:09:39.041Z] 18:09:39     INFO - TEST-INFO | screentopng: exit 0
[task 2020-11-18T18:09:39.042Z] 18:09:39     INFO - Buffered messages logged at 18:09:38
[task 2020-11-18T18:09:39.042Z] 18:09:39     INFO - TEST-PASS | layout/base/tests/test_bug851485.html | Reloading keeps scroll position 
[task 2020-11-18T18:09:39.042Z] 18:09:39     INFO - Buffered messages finished
[task 2020-11-18T18:09:39.043Z] 18:09:39     INFO - TEST-UNEXPECTED-FAIL | layout/base/tests/test_bug851485.html | Scrolling after load works. 
[task 2020-11-18T18:09:39.043Z] 18:09:39     INFO -     SimpleTest.ok@SimpleTest/SimpleTest.js:417:16
[task 2020-11-18T18:09:39.043Z] 18:09:39     INFO -     verifyScroll@layout/base/tests/test_bug851485.html:19:5
[task 2020-11-18T18:09:39.043Z] 18:09:39     INFO -     verifyAfterLoad/observer<@layout/base/tests/test_bug851485.html:39:21
[task 2020-11-18T18:09:39.044Z] 18:09:39     INFO - GECKO(13352) | MEMORY STAT | vsize 2690MB | residentFast 152MB | heapAllocated 21MB
[task 2020-11-18T18:09:39.045Z] 18:09:39     INFO - TEST-OK | layout/base/tests/test_bug851485.html | took 747ms
[task 2020-11-18T18:09:39.173Z] 18:09:39     INFO - Not taking screenshot here: see the one that was previously logged
[task 2020-11-18T18:09:39.174Z] 18:09:39     INFO - TEST-UNEXPECTED-FAIL | layout/base/tests/test_bug851485.html | Reloading keeps scroll position - got 4904, expected 500
[task 2020-11-18T18:09:39.175Z] 18:09:39     INFO -     SimpleTest.is@SimpleTest/SimpleTest.js:500:14
[task 2020-11-18T18:09:39.176Z] 18:09:39     INFO -     verifyAfterLoad@layout/base/tests/test_bug851485.html:34:7
[task 2020-11-18T18:09:39.176Z] 18:09:39     INFO -     onload@layout/base/tests/test_bug851485.html:1:1
[task 2020-11-18T18:09:39.198Z] 18:09:39     INFO - TEST-PASS | layout/base/tests/test_bug851485.html | Scrolling after load works. 
[task 2020-11-18T18:09:39.199Z] 18:09:39    ERROR - TEST-UNEXPECTED-FAIL | SimpleTest | this test already called finish!
[task 2020-11-18T18:09:39.202Z] 18:09:39     INFO - TEST-UNEXPECTED-ERROR | layout/base/tests/test_bug851485.html | called finish() multiple times
[task 2020-11-18T18:09:39.202Z] 18:09:39     INFO - TEST-INFO ```
Has Regression Range: --- → yes

Set release status flags based on info from the regressing bug 1677689

Should bug 1677689 be backed out, if it didn't actually fix the test on fission? Or the test skipped again?

There are 31 total failures in the last 7 days on

  • linux1804-64-shippable opt
  • windows10-64-shippable opt
  • windows10-64-shippable-qr opt

Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=329926926&repo=mozilla-central&lineNumber=7169

[task 2021-02-14T12:02:53.464Z] 12:02:53 INFO - TEST-START | layout/base/tests/test_bug851485.html
[task 2021-02-14T12:02:54.017Z] 12:02:54 INFO - TEST-INFO | started process screenshot
[task 2021-02-14T12:02:54.082Z] 12:02:54 INFO - TEST-INFO | screenshot: exit 0
[task 2021-02-14T12:02:54.084Z] 12:02:54 INFO - Buffered messages logged at 12:02:53
[task 2021-02-14T12:02:54.084Z] 12:02:54 INFO - TEST-PASS | layout/base/tests/test_bug851485.html | We set the location's hash to 'anchor'
[task 2021-02-14T12:02:54.084Z] 12:02:54 INFO - TEST-PASS | layout/base/tests/test_bug851485.html | Reloading keeps scroll position
[task 2021-02-14T12:02:54.084Z] 12:02:54 INFO - Buffered messages finished
[task 2021-02-14T12:02:54.084Z] 12:02:54 INFO - TEST-UNEXPECTED-FAIL | layout/base/tests/test_bug851485.html | Scrolling after load works.
[task 2021-02-14T12:02:54.084Z] 12:02:54 INFO - SimpleTest.ok@SimpleTest/SimpleTest.js:417:16
[task 2021-02-14T12:02:54.084Z] 12:02:54 INFO - runTest@layout/base/tests/test_bug851485.html:58:5
[task 2021-02-14T12:02:54.084Z] 12:02:54 INFO - GECKO(4352) | MEMORY STAT | vsize 2104099MB | vsizeMaxContiguous 69085050MB | residentFast 99MB | heapAllocated 9MB
[task 2021-02-14T12:02:54.084Z] 12:02:54 INFO - TEST-OK | layout/base/tests/test_bug851485.html | took 585ms

Peter can you take a look?

Flags: needinfo?(peterv)
Flags: needinfo?(peterv)

There are 31 total failures in the last 7 days on:

  • windows10-64-shippable-qr
  • windows10-64-shippable
    -linux1804-64-shippable-qr
    -linux1804-64-shippable

Recent failure log

"INFO - TEST-START | layout/base/tests/test_bug851485.html
[task 2021-02-28T23:34:17.506Z] 23:34:17 INFO - TEST-INFO | started process screenshot
[task 2021-02-28T23:34:17.576Z] 23:34:17 INFO - TEST-INFO | screenshot: exit 0
[task 2021-02-28T23:34:17.576Z] 23:34:17 INFO - Buffered messages logged at 23:34:17
[task 2021-02-28T23:34:17.576Z] 23:34:17 INFO - TEST-PASS | layout/base/tests/test_bug851485.html | We set the location's hash to 'anchor'
[task 2021-02-28T23:34:17.576Z] 23:34:17 INFO - TEST-PASS | layout/base/tests/test_bug851485.html | Reloading keeps scroll position
[task 2021-02-28T23:34:17.576Z] 23:34:17 INFO - Buffered messages finished
[task 2021-02-28T23:34:17.576Z] 23:34:17 INFO - TEST-UNEXPECTED-FAIL | layout/base/tests/test_bug851485.html | Scrolling after load works.
[task 2021-02-28T23:34:17.576Z] 23:34:17 INFO - SimpleTest.ok@SimpleTest/SimpleTest.js:417:16
[task 2021-02-28T23:34:17.576Z] 23:34:17 INFO - runTest@layout/base/tests/test_bug851485.html:58:5
[task 2021-02-28T23:34:17.577Z] 23:34:17 INFO - GECKO(4336) | MEMORY STAT | vsize 2103921MB | vsizeMaxContiguous 65007037MB | residentFast 98MB | heapAllocated 10MB
[task 2021-02-28T23:34:17.577Z] 23:34:17 INFO - TEST-OK | layout/base/tests/test_bug851485.html | took 583ms"

Peter can you please take a look?

Whiteboard: [stockwell needswork:owner]
Assignee: nobody → abutkovits
Status: NEW → ASSIGNED

Atila, you don't need to land your patch to disable test_bug851485.html for Fission. ahal just disabled this same test for Fission in bug 1694833:

https://hg.mozilla.org/integration/autoland/rev/4b99cb914aad#l1.9

 [test_bug851485.html]
 skip-if =
    toolkit == 'android'  # Bug 1355821
    fission && (os == 'win' || os == 'mac')  # Bug 1694833 - initial triage

But let's keep this bug open for the Fission bug fix. I am tentatively reassigning this bug to Peter because this bug was filed as a regression from his test_bug851485.html changes in bug 1677689.

Assignee: abutkovits → peterv
Blocks: 1694833
Fission Milestone: --- → ?
Flags: needinfo?(abutkovits)
See Also: → 1698195

Chris, thank you for the update.

Flags: needinfo?(abutkovits)
Attachment #9207247 - Attachment is obsolete: true

This is probably a test bug. Tracking for Fission M8 Release.

Fission Milestone: ? → M8

Update:

There have been 34 failures within the last 7 days:

  • 19 failures on Linux 18.04 x64 WebRender Shippable opt
  • 15 failures on Linux 18.04 x64 WebRender opt

Recent failure: https://treeherder.mozilla.org/logviewer?job_id=335092468&repo=autoland&lineNumber=11216

[task 2021-03-31T22:44:27.468Z] 22:44:27 INFO - TEST-PASS | layout/base/tests/test_bug851485.html | Reloading keeps scroll position
[task 2021-03-31T22:44:27.468Z] 22:44:27 INFO - Buffered messages finished
[task 2021-03-31T22:44:27.469Z] 22:44:27 INFO - TEST-UNEXPECTED-FAIL | layout/base/tests/test_bug851485.html | Scrolling after load works.
[task 2021-03-31T22:44:27.470Z] 22:44:27 INFO - SimpleTest.ok@SimpleTest/SimpleTest.js:417:16
[task 2021-03-31T22:44:27.470Z] 22:44:27 INFO - runTest@layout/base/tests/test_bug851485.html:58:5
[task 2021-03-31T22:44:27.470Z] 22:44:27 INFO - GECKO(1972) | MEMORY STAT | vsize 2675MB | residentFast 156MB | heapAllocated 18MB
[task 2021-03-31T22:44:27.470Z] 22:44:27 INFO - TEST-OK | layout/base/tests/test_bug851485.html | took 678ms

Whiteboard: [stockwell unknown] → [stockwell needswork:owner]

Hi Peter, are there any updates here?
There are 33 total failures in the last 7 days on

  • linux1804-64-qr opt
  • linux1804-64-shippable opt
  • linux1804-64-shippable-qr opt

Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=338776520&repo=autoland&lineNumber=7682

[task 2021-05-05T04:30:39.210Z] 04:30:39 INFO - TEST-START | layout/base/tests/test_bug851485.html
[task 2021-05-05T04:30:39.820Z] 04:30:39 INFO - TEST-INFO | started process screentopng
[task 2021-05-05T04:30:40.142Z] 04:30:40 INFO - TEST-INFO | screentopng: exit 0
[task 2021-05-05T04:30:40.142Z] 04:30:40 INFO - Buffered messages logged at 04:30:39
[task 2021-05-05T04:30:40.142Z] 04:30:40 INFO - TEST-PASS | layout/base/tests/test_bug851485.html | We set the location's hash to 'anchor'
[task 2021-05-05T04:30:40.143Z] 04:30:40 INFO - TEST-PASS | layout/base/tests/test_bug851485.html | Reloading keeps scroll position
[task 2021-05-05T04:30:40.143Z] 04:30:40 INFO - Buffered messages finished
[task 2021-05-05T04:30:40.144Z] 04:30:40 INFO - TEST-UNEXPECTED-FAIL | layout/base/tests/test_bug851485.html | Scrolling after load works.
[task 2021-05-05T04:30:40.144Z] 04:30:40 INFO - SimpleTest.ok@SimpleTest/SimpleTest.js:417:16
[task 2021-05-05T04:30:40.145Z] 04:30:40 INFO - runTest@layout/base/tests/test_bug851485.html:58:5
[task 2021-05-05T04:30:40.145Z] 04:30:40 INFO - GECKO(12403) | MEMORY STAT | vsize 2691MB | residentFast 144MB | heapAllocated 18MB
[task 2021-05-05T04:30:40.146Z] 04:30:40 INFO - TEST-OK | layout/base/tests/test_bug851485.html | took 632ms

Flags: needinfo?(peterv)
Flags: needinfo?(peterv)
Pushed by abutkovits@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fd8143f262ec
disable test_bug851485.html on Linux_fission for frequent failures. r=intermittent-reviewers,ahal DONTBUILD
Keywords: leave-open
Whiteboard: [stockwell needswork:owner] → [stockwell disabled]

peterv, just want to be sure this is still on your radar; it seems like we think this is just a test bug, but having said that, it's a test that was intended/expected to work with fission, per your changes in bug 1677689.

We've got this flagged as an M8 Fission blocker (per comment 25) at the moment, which means this is coming up in Fission nag/burndown charts -- so I want to be sure it's doesn't slip through the cracks. (Having said that: worst-case, since we're pretty confident about it being a test bug, I could also imagine this being morphed into "real-soon" post-fission-release followup work, if needed.)

Notably, this test does use sendMouseEvent just before the test failure; and that API is known to be potentially-broken with fission, per bug 1528935.

If that API is failing here (i.e. if we failed to send the click to the link-that-is-expected-to-trigger-scrolling), that would easily explain this test failure.

(having said that: I think the sendMouseEvent call here is for a same-origin iframe. So it's presumably not crossing any process boundaries. I don't know enough about our event-handling code before/after fission to know whether that changes things or not; but it seems like it might mean that this usage should be supported and unrelated to bug 1528935...)

Pushed by dholbert@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3051f9d50710
diagnostic: Add some logging to mochitest test_bug851485.html, to provide more info when it fails. (no review)

On a whim, I wrote a diagnostic patch (just landed) to log some more info when the test fails, and I tried running the test locally with this patch applied and with fission enabled last week, on a computer that I wasn't actively using.

I encountered this bug's failure only once, and unfortunately rr had a hiccup during that recording and didn't successfully capture the issue. I ran the test for several compute-days after that (with rr) and didn't hit the failure again, so I wasn't able to diagnose further.

Hopefully we can reproduce it on infra with this diagnostic patch (maybe in a try run, since it's disabled on trunk right now), and gain a bit more info; or maybe someone can capture it in rr and gain a bit more information about what's going on.

FWIW, I tried and failed to reproduce the issue on linux debug and opt with --verify.

Still happens on Windows and Mac.

The try run from comment 47 didn't yet have my diagnostic patch from comment 43 (because it seems autoland hasn't merged to central in a little while).

I did a Try run based off of a recent autoland commit, to get some test failures with the additional logging. Here's the try run:
https://treeherder.mozilla.org/jobs?repo=try&revision=b09326a3937762ccf60970248db42c9a85add324
And here's a failure in this test on Mac (the only one I see so far):
https://treeherder.mozilla.org/logviewer?job_id=343525491&repo=try&lineNumber=6684

The logging says:

INFO - TEST-PASS | layout/base/tests/test_bug851485.html | We set the location's hash to 'anchor' 
INFO - TEST-PASS | layout/base/tests/test_bug851485.html | Reloading keeps scroll position 
INFO - Sending click
INFO - IntersectionObserver: entering callback
INFO - IntersectionObserver: some entry isIntersecting; disconnecting
INFO - Promise resolved (for IntersectionObserver). win.scrollY is 0
INFO - Buffered messages finished
INFO - TEST-UNEXPECTED-FAIL | layout/base/tests/test_bug851485.html | Scrolling after load works. 

So we have one more piece of information here, which is win.scrollY is 0 at the time of the test-failure. We're expecting to have scrolled (by at least 3000px) but we're not scrolling at all, it seems.

Also, one side note: given the use of IntersectionObserver in the test, it's conceivable that there's a connection between this bug and bug 1671539 (another apparently-fission-specific IntersectionObserver-related test failure).

See Also: → 1671539

This bug is a soft blocker for Fission M8. We'd like to fix it before our M8 Release experiment, but we won't delay the experiment waiting for it.

Whiteboard: [stockwell disabled] → [stockwell disabled] fission-soft-blocker

I managed to catch this in rr on Linux (by running the test continuously on 4 machines overnight). I'll see if I can get to the bottom of what's going on...

Pernosco trace: https://pernos.co/debug/-IllENi-iynvkylXutktsQ/index.html

I'm partway through stepping through it. My high-level takeaway right now is: the test seems to be expecting the document to stay the same (i.e. no navigation) from the point where the IntersectionObserver is constructed. But in fact the document does navigate (i.e. we get OnPageHide/OnPageShow, and a new Document object, with an initially-null root frame). So, we get a win.scrollY of 0 (per comment 49) because we haven't yet constructed frames in the new document at the point where it gets queried, which means we take this return CSSIntPoint(0, 0); statement:
https://searchfox.org/mozilla-central/rev/8f08c21f093be1c1c42438697f8bca67af94fc77/dom/base/nsGlobalWindowOuter.cpp#4108-4110

I haven't yet quite figured out what all this means, but at this point I'm suspecting/hoping it's a bug in the test's expectations, rather than an actual fission bug.

(In reply to Daniel Holbert [:dholbert] from comment #54)

Pernosco trace: https://pernos.co/debug/-IllENi-iynvkylXutktsQ/index.html

I'm partway through stepping through it. My high-level takeaway right now is: the test seems to be expecting the document to stay the same (i.e. no navigation) from the point where the IntersectionObserver is constructed. But in fact the document does navigate (i.e. we get OnPageHide/OnPageShow, and a new Document object, with an initially-null root frame).

This comment made me think that this load event is not the one for the reload. Triggering the runTest function after the load event for the iframe's window received fixed this failure.

Stealing.

Assignee: peterv → hikezoe.birchill
Flags: needinfo?(peterv)

Though the test funtion is called twice since the parent document reloads the
iframe document, but the test will not run for the second call since it's
prevented by testRunning flag in test_bug842853.html.

(In reply to Hiroyuki Ikezoe (:hiro) from comment #55)

(In reply to Daniel Holbert [:dholbert] from comment #54)

Pernosco trace: https://pernos.co/debug/-IllENi-iynvkylXutktsQ/index.html

I'm partway through stepping through it. My high-level takeaway right now is: the test seems to be expecting the document to stay the same (i.e. no navigation) from the point where the IntersectionObserver is constructed. But in fact the document does navigate (i.e. we get OnPageHide/OnPageShow, and a new Document object, with an initially-null root frame).

This comment made me think that this load event is not the one for the reload.

Yup, I had the same realization! I was going to post a patch with a slightly different approach (adjusting the test file to ensure we get a load event before we call reload on the frame and wait for a second one), but this makes sense too.

(I was imagining that maybe the test had a reason for firing runTest() before load -- the fact that it has a slowly-loading .sjs CSS file seems like maybe it is intentionally delaying load to happen later... But I can't find any mention of that or reason why the scrolling bug might involves fiddly load-timing on the original bug page that added this file_bug842853.html helper. So I think this is fine and probably the original test could have used the load event like you're changing it to do here.)

Thanks for the patch!

Pushed by hikezoe.birchill@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8397357bef1f
Trigger the test function after the child document received the load event. r=dholbert
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → 93 Branch
Attachment #9228297 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: