Closed Bug 1678565 Opened 4 years ago Closed 4 years ago

tp5n bild.de test page never finishes loading with Fission enabled

Categories

(Core :: General, defect)

Tracking

RESOLVED DUPLICATE of bug 1503499
Tracking Status
firefox-esr78 --- unaffected
firefox83 --- unaffected
firefox84 --- unaffected
firefox85 --- affected

People

(Reporter: jdescottes, Unassigned)

References

(Regression)

Details

(Keywords: regression)

It seems that the changes made in Bug 1634065 prevent one of our test pages from ever finishing its reload. We wait for a load event that is never fired, and the tab's favicon stays stuck on the "loading" animation.

STRs:

  • run
    ./mach talos-test --activeTests damp --subtests complicated.netmonitor --cycles 1 --tppagecycles 1 --enable-fission

ER: Test should finish
AR: Test times out waiting for the page to reload

Investigation pointing to Bug 1634065

Some DevTools talos perf tests (aka DAMP) use a fake "bild.de" page provided by tp5n/tp5n-fis.

The DAMP suite used to permafail on Fission, but we fixed that in Bug 1677587 by skipping two tests that still need fixes for Fission.

Here is a try push with --no-artifact, on top of the mc changeset for Bug 1677587: https://treeherder.mozilla.org/jobs?repo=try&revision=aba4358c9d7c3b8b399c0b73ce6d9b79c7281464
On this try push, DAMP is green on both the Fission and non-Fission platforms.

Bug 1634065 landed right after Bug 1677587 (see pushlog). And here is a try push, right on top of the last changeset for Bug 1677587: https://treeherder.mozilla.org/jobs?repo=try&revision=51b75528c14644b7c63f5734c3bf5668f3acda5d
On this try push, DAMP times out on the Fission platform. The suite times out in the netmonitor.complicated test, when we try to reload the bild.de page at
https://searchfox.org/mozilla-central/rev/1d34bd022de0b55c81d9db6026f69bda1d4a86d2/testing/talos/talos/tests/devtools/addon/content/tests/netmonitor/complicated.js#39

To me this means that something from Bug 1634065 broke the test.

Investigation on DAMP reload logic

To summarize how we handle the reload in DAMP: we call gBrowser.selectedBrowser.reload() and then wait for a "load" event on the page (see on searchfox).
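For readers who want the reload-then-wait pattern spelled out, here is a minimal sketch. It is not the actual DAMP code: `waitForLoad` and `reloadAndWait` are hypothetical names, the `browser` argument is assumed to be any EventTarget-like object with a `reload()` method, and the timeout is an addition of mine so that a missing "load" event shows up as a rejection instead of a hang.

```javascript
// Hypothetical helper (not part of DAMP): resolves with the "load" event,
// or rejects if no "load" event arrives within timeoutMs.
function waitForLoad(target, timeoutMs) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      target.removeEventListener("load", onLoad);
      reject(new Error(`no "load" event after ${timeoutMs}ms`));
    }, timeoutMs);

    function onLoad(event) {
      clearTimeout(timer);
      resolve(event);
    }

    target.addEventListener("load", onLoad, { once: true });
  });
}

// Hypothetical wrapper: `browser` stands in for gBrowser.selectedBrowser.
// The listener is registered before reload() is called so that the event
// cannot fire before we start listening.
function reloadAndWait(browser, timeoutMs = 5000) {
  const loaded = waitForLoad(browser, timeoutMs);
  browser.reload();
  return loaded;
}
```

With a page that behaves like the one in this bug, the returned promise would reject after the timeout rather than resolve, which matches the "waiting for the page to reload" timeout seen in the test.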

This is currently done using a frame script and the message manager, so I initially thought this could be fixed by migrating to JSWindowActors, which I did in Bug 1678379. But it doesn't help: we still never get a load or pageshow event for the top-level document of the page.

And if we look at the tab for bild.de during the test, it is stuck with the "loading" favicon animation. So I think something else is broken, and it is not really related to how DAMP monitors the load of the page.

To reproduce this you can run this test with the following command:

./mach talos-test --activeTests damp --subtests complicated.netmonitor --cycles 1 --tppagecycles 1 --enable-fission

Note that I am adding a workaround for this in Bug 1574417, after which we will no longer wait for the load event in this test. If Bug 1574417 has already landed, make sure to use an earlier changeset to reproduce.

I am moving this to Core, since it was the product for Bug 1634065.

Product: DevTools → Core

Set release status flags based on info from the regressing bug 1634065

Neha, can you set a priority here (and a better component if/when we know it), please?

Flags: needinfo?(nkochar)

After a few more tests, this infinite loading seems to happen only when DevTools is open.

I started a few retriggers on my try pushes. I wonder if this is simply related to what :ochameau describes in https://bugzilla.mozilla.org/show_bug.cgi?id=1503499#c14 . Maybe I just got a green try on https://treeherder.mozilla.org/jobs?repo=try&revision=aba4358c9d7c3b8b399c0b73ce6d9b79c7281464 by chance. I will update when I get results here.

I can confirm this is related to DevTools attaching too many threads. I guess Bug 1634065 made the race more visible for some reason, but the root cause of the issue is still in DevTools. Sorry about the noise.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → DUPLICATE
Flags: needinfo?(nkochar)
Has Regression Range: --- → yes