[32-bit Linux, maybe focus related] Permaorange or intermittent on Linux32 debug unaccelerated layout/reftests/bugs/613433-3.html,613433-2.html,613433-1.html | load failed: timed out waiting for reftest-wait to be removed

RESOLVED FIXED in Firefox 54

Status


defect
P1
normal
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: intermittent-bug-filer, Assigned: bugs)

Tracking

({intermittent-failure, regression})

Trunk
mozilla54
Points:
---
Bug Flags:
in-testsuite -

Firefox Tracking Flags

(firefox52 wontfix, firefox53 wontfix, firefox54 fixed)

Details

Attachments

(1 attachment)

Blocks: 1012752
Keywords: regression
Summary: Intermittent layout/reftests/bugs/613433-3.html | load failed: timed out waiting for reftest-wait to be removed → Permaorange on Linux32 debug unaccelerated layout/reftests/bugs/613433-3.html | load failed: timed out waiting for reftest-wait to be removed
Some real goodness going on here: something, perhaps the landing of bug 1000957, moved the permaorange from -3 to -2.
Summary: Permaorange on Linux32 debug unaccelerated layout/reftests/bugs/613433-3.html | load failed: timed out waiting for reftest-wait to be removed → Permaorange on Linux32 debug unaccelerated layout/reftests/bugs/613433-3.html,613433-2.html | load failed: timed out waiting for reftest-wait to be removed
Maybe worth seeing if only the test (order) changes in https://hg.mozilla.org/integration/mozilla-inbound/rev/b1dbce81bf3b6124577fb46414f811ec5f45f4e0 were enough to trigger this?
Flags: needinfo?(mstange)
Sigh.

This bug is permaorange when either -2 or -3 is the first test to run in Ru3, not following -1.

When -1 runs in Ru3, bug 1289014 is intermittent.

What started this was not Markus, nor a layout-touching push in July; it was the switch from running Linux32 debug reftest-no-accel in 2 chunks to running it in 6 chunks, which made it possible for the 613433 tests to run in a browser that hadn't previously run whichever earlier test they depend on running after.
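To illustrate how the chunk count alone can change which test runs first in a fresh browser, here is a minimal sketch. It assumes contiguous, equal-sized chunks and uses a made-up 12-entry test list; `chunkTests`, `firstTests`, and the placeholder file names are hypothetical and are not taken from the reftest harness.

```javascript
// Hypothetical helper: split an ordered test list into `total` contiguous chunks.
// The real harness may chunk differently; this only illustrates the effect.
function chunkTests(tests, total) {
  const size = Math.ceil(tests.length / total);
  const chunks = [];
  for (let i = 0; i < tests.length; i += size) {
    chunks.push(tests.slice(i, i + size));
  }
  return chunks;
}

// Made-up manifest order; only the 613433 names come from this bug.
const tests = [
  "a.html", "b.html", "c.html", "d.html", "e.html", "f.html", "g.html",
  "613433-1.html", "613433-2.html", "613433-3.html", "h.html", "i.html",
];

// The first test of each chunk runs in a freshly started browser.
const firstTests = (total) => chunkTests(tests, total).map((chunk) => chunk[0]);

// firstTests(2) -> ["a.html", "g.html"]
// firstTests(6) -> ["a.html", "c.html", "e.html", "g.html", "613433-2.html", "h.html"]
```

With 2 chunks, no 613433 test starts a chunk in this made-up list; with 6 chunks, 613433-2.html becomes the first test in its browser and no longer follows -1.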
Summary: Permaorange on Linux32 debug unaccelerated layout/reftests/bugs/613433-3.html,613433-2.html | load failed: timed out waiting for reftest-wait to be removed → [32-bit Linux, maybe focus related] Permaorange on Linux32 debug unaccelerated layout/reftests/bugs/613433-3.html,613433-2.html | load failed: timed out waiting for reftest-wait to be removed
I turned on focus logging on Try to see what happens on Ru3.
https://treeherder.mozilla.org/logviewer.html#?job_id=26688681&repo=try#L2016-L2064

Since "613433-2.html" is the first test run in this chunk, the test file is somehow loaded *before* the window "chrome://reftest/content/reftest.xul" has loaded, so moving focus to the contenteditable fails. In the subsequent successful tests, the XUL window should already be loaded before the test files.

Delaying the focus switch with a 1000 ms setTimeout, so that it happens after the XUL window has loaded, lets the focus switch succeed, but this might not be a robust fix.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d7b0ed18b771
Perhaps the reftest harness should be waiting longer before it starts running tests, rather than initiating everything in OnRefTestLoad?
I agree with comment 18. We need to have a focus listener somewhere. We probably don't need the full-blown SimpleTest.waitForFocus solution, though.
Flags: needinfo?(mstange)
A two second delay before starting the first test seems to fix it:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4fe52514712b
Perhaps we should take that ^ as a wallpaper for now?
Flags: needinfo?(dbaron)
No longer blocks: 1012752
Shouldn't it be simple enough to just poll for focus, and avoid having to add a timeout that might not be quite reliable?
Flags: needinfo?(dbaron)
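A rough sketch of the polling idea suggested above, as opposed to a single fixed delay. `pollForFocus`, its parameters, and the 50 ms interval are assumptions made for illustration; this is not the harness code, and the scheduler is injectable only so the sketch can be exercised without a browser.

```javascript
// Hypothetical sketch: re-check a focus predicate until it is true, then start.
// `hasFocus` and `onFocused` are supplied by the caller; `schedule` defaults to
// setTimeout so the wait adapts to however long focus actually takes to arrive.
function pollForFocus(hasFocus, onFocused, schedule = setTimeout, intervalMs = 50) {
  if (hasFocus()) {
    onFocused();
    return;
  }
  // Not focused yet: check again after a short interval.
  schedule(() => pollForFocus(hasFocus, onFocused, schedule, intervalMs), intervalMs);
}
```

In a page context the predicate could be something like `() => document.hasFocus()`; whether that is the right check for the reftest window is not established in this bug.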
Summary: [32-bit Linux, maybe focus related] Permaorange on Linux32 debug unaccelerated layout/reftests/bugs/613433-3.html,613433-2.html | load failed: timed out waiting for reftest-wait to be removed → [32-bit Linux, maybe focus related] Permaorange or intermittent on Linux32 debug unaccelerated layout/reftests/bugs/613433-3.html,613433-2.html,613433-1.html | load failed: timed out waiting for reftest-wait to be removed
Duplicate of this bug: 1289014
Our number one single-test failure, so I'll let you all decide what sort of hack or perfect fix you want to give it, with an accompanying patch to start running the tests on Linux32 again.
Keywords: leave-open
Whiteboard: [test disabled]
Pushed by philringnalda@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/785f1dbb4900
Disable 613433-1.html,613433-2.html,613433-3.html on Linux32 for needing focus which they don't get when they are the first test to run in a chunk
Keywords: leave-open
Seems that this bug was making progress and then stalled out after the tests got disabled? Is anybody owning the harness fix and the re-enabling of these tests?
Flags: needinfo?(bugs)
Version: unspecified → Trunk
(In reply to David Baron :dbaron: ⌚️UTC-8 from comment #29)
> Shouldn't it be simple enough to just poll for focus, and avoid having to
> add a timeout that might not be quite reliable?

I got a green Try run by adding a focus() handler:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5425671e37bd00e8dfa2053e717dac512a081815

Mats: can you help land this one, if it looks good to you? Thx!
Flags: needinfo?(bugs) → needinfo?(mats)
Comment on attachment 8829647 [details] [diff] [review]
jet's patch to wait for focus before starting tests

Looks good to me, fwiw.  One potential issue might be that 'gBrowser'
already has focus so our listener won't be called.  Probably worth
checking that by doing a Try run on all platforms.
Flags: needinfo?(mats)
Attachment #8829647 - Flags: review?(dbaron)
Attachment #8829647 - Flags: feedback+
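The concern raised in the review comment above (a target that already has focus never fires the listener) can be sketched as follows. This is not the attached patch; `startAfterFocus`, `isFocused`, and `startTests` are hypothetical names, and the sketch only assumes the standard `EventTarget` interface.

```javascript
// Hypothetical sketch: start tests on focus, but handle the already-focused case.
function startAfterFocus(target, isFocused, startTests) {
  if (isFocused()) {
    // Already focused: no further "focus" event will arrive, so start right away.
    startTests();
    return;
  }
  // Otherwise wait for the first "focus" event only; { once: true } removes
  // the listener after it fires so later focus changes do not restart tests.
  target.addEventListener("focus", () => startTests(), { once: true });
}
```

Without the early check, a run in which the target is focused before the listener is attached would wait indefinitely, which would look like a startup hang.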
(In reply to Mats Palmgren (:mats) from comment #38)
> Probably worth checking that by doing a Try run on all platforms.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=035862cbbe8c78f27f6390705b9a79906baee41c
Comment on attachment 8829647 [details] [diff] [review]
jet's patch to wait for focus before starting tests

OK, I'd suggest as a commit message:

Bug 1292460 - Focus the reftest browser before starting tests, except when filtering out needs-focus tests.
Attachment #8829647 - Flags: review?(dbaron) → review+
Mats will take this one over the finish line. Thanks, All!
Assignee: nobody → mats
Pushed by mpalmgren@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d0d4bfd4c073
Focus the reftest browser before starting tests, except when filtering out needs-focus tests.  r=dbaron
Assignee: mats → bugs
Flags: in-testsuite-
https://hg.mozilla.org/mozilla-central/rev/d0d4bfd4c073
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Whiteboard: [test disabled]
For some reason, this appears to have stuck on trunk just fine, but both Aurora and Beta started getting frequent startup hangs (presumably unable to get focus in the right place) after I uplifted it there.
https://treeherder.mozilla.org/logviewer.html#?job_id=73014965&repo=mozilla-beta

Anyway, I've backed it out from Beta and will do so from Aurora as well.
https://hg.mozilla.org/releases/mozilla-beta/rev/cfe1b0427178
Curiously enough, on the trunk instead of lots of Linux reftest/crashtest startup hangs showing the "not the default browser" dialog, we're getting just a smattering of Win8 reftest startup hangs showing the Start screen.
Hmm, that seems odd given that the testing profile appears to disable that check:
https://dxr.mozilla.org/mozilla-central/rev/71224049c0b52ab190564d3ea0eab089a159a4cf/testing/profiles/prefs_general.js#24
Maybe there's an actual bug there - either that check isn't waiting for prefs to be read,
or the prefs are not read properly in some cases, or the pref was renamed or something.
Given how soon after the merge day this landed, I'm a bit worried that 54 will start hitting these failures too when it goes to Aurora.