Intermittent reftest/tests/gfx/tests/crashtests/783041-2.html | application timed out after 330 seconds with no output

RESOLVED FIXED in Firefox 56

Status

()

Core
Graphics
P1
normal
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: aryx, Assigned: milan)

Tracking

({intermittent-failure})

unspecified
mozilla56
intermittent-failure
Points:
---

Firefox Tracking Flags

(firefox56 fixed)

Details

(Whiteboard: [gfx-noted][stockwell fixed:other])

Attachments

(1 attachment)

https://treeherder.mozilla.org/logviewer.html#?job_id=115332190&repo=autoland

[task 2017-07-18T17:52:07.162181Z] 17:52:07    ERROR - REFTEST ERROR | file:///home/worker/workspace/build/tests/reftest/tests/gfx/tests/crashtests/783041-2.html | application timed out after 330 seconds with no output
Interesting:

[task 2017-07-18T17:46:37.167591Z] 17:46:37     INFO - --DOMWINDOW == 45 (0x7fd9d1ab7800) [pid = 1013] [serial = 2500] [outer = (nil)] [url = file:///home/worker/workspace/build/tests/reftest/tests/gfx/tests/crashtests/783041-2.html]
[task 2017-07-18T17:46:37.168232Z] 17:46:37     INFO - --DOMWINDOW == 44 (0x7fd9995e9000) [pid = 1013] [serial = 2501] [outer = (nil)] [url = file:///home/worker/workspace/build/tests/reftest/tests/gfx/tests/crashtests/783041-2.html?10]
[task 2017-07-18T17:46:37.168406Z] 17:46:37     INFO - --DOMWINDOW == 43 (0x7fd9ce460800) [pid = 1013] [serial = 2502] [outer = (nil)] [url = file:///home/worker/workspace/build/tests/reftest/tests/gfx/tests/crashtests/783041-2.html?9]
[task 2017-07-18T17:46:37.169482Z] 17:46:37     INFO - --DOMWINDOW == 42 (0x7fd999a5f000) [pid = 1013] [serial = 2503] [outer = (nil)] [url = file:///home/worker/workspace/build/tests/reftest/tests/gfx/tests/crashtests/783041-2.html?8]
[task 2017-07-18T17:46:37.170122Z] 17:46:37     INFO - --DOMWINDOW == 41 (0x7fd9ce361800) [pid = 1013] [serial = 2504] [outer = (nil)] [url = file:///home/worker/workspace/build/tests/reftest/tests/gfx/tests/crashtests/783041-2.html?7]
[task 2017-07-18T17:46:37.170770Z] 17:46:37     INFO - --DOMWINDOW == 40 (0x7fd9ce454000) [pid = 1013] [serial = 2505] [outer = (nil)] [url = file:///home/worker/workspace/build/tests/reftest/tests/gfx/tests/crashtests/783041-2.html?6]
[task 2017-07-18T17:46:37.170905Z] 17:46:37     INFO - --DOMWINDOW == 39 (0x7fd9d46ea800) [pid = 1013] [serial = 2506] [outer = (nil)] [url = file:///home/worker/workspace/build/tests/reftest/tests/gfx/tests/crashtests/783041-2.html?5]
[task 2017-07-18T17:52:07.162181Z] 17:52:07    ERROR - REFTEST ERROR | file:///home/worker/workspace/build/tests/reftest/tests/gfx/tests/crashtests/783041-2.html | application timed out after 330 seconds with no output

So, we reload (by setting a new search value) 6 times, each time taking less than a second (you'd hope so), and then we get stuck for five and a half minutes.  This is a different timeout than the 20-minute one from before, though it would probably hit the same one. Do we have other runs with failures?  Do they always show up after 6 reloads (i.e., the last valid URL has ?5 at the end)?
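One way to answer the "always after 6 reloads?" question across several logs is to pull the reload counter out of the --DOMWINDOW lines. The following is only a sketch: reloadSuffixes and sampleLines are hypothetical names, not part of the reftest harness or Treeherder, and the sample lines are copied from the log excerpt above with the task timestamp prefix dropped.

```typescript
// Hypothetical helper (not harness code): collect the numeric "?N" suffixes
// that appear on --DOMWINDOW log lines for one test file.
function reloadSuffixes(logLines: string[], testFile: string): number[] {
  const suffixes: number[] = [];
  const urlPattern = /\[url = ([^\]]+)\]/;
  for (let i = 0; i < logLines.length; i++) {
    const line = logLines[i];
    if (line.indexOf("--DOMWINDOW") === -1) continue;
    const match = urlPattern.exec(line);
    if (match === null) continue;
    const parts = match[1].split("?");
    const path = parts[0];
    // Skip windows that belong to other test documents.
    if (path.slice(-testFile.length) !== testFile) continue;
    // The initial load has no query string; reloads carry a numeric one.
    if (parts.length === 2 && /^\d+$/.test(parts[1])) {
      suffixes.push(parseInt(parts[1], 10));
    }
  }
  return suffixes;
}

const sampleLines: string[] = [
  "INFO - --DOMWINDOW == 45 (0x7fd9d1ab7800) [pid = 1013] [serial = 2500] [outer = (nil)] [url = file:///home/worker/workspace/build/tests/reftest/tests/gfx/tests/crashtests/783041-2.html]",
  "INFO - --DOMWINDOW == 44 (0x7fd9995e9000) [pid = 1013] [serial = 2501] [outer = (nil)] [url = file:///home/worker/workspace/build/tests/reftest/tests/gfx/tests/crashtests/783041-2.html?10]",
  "INFO - --DOMWINDOW == 43 (0x7fd9ce460800) [pid = 1013] [serial = 2502] [outer = (nil)] [url = file:///home/worker/workspace/build/tests/reftest/tests/gfx/tests/crashtests/783041-2.html?9]",
];

const seenSuffixes = reloadSuffixes(sampleLines, "783041-2.html");
console.log("reloads seen:", seenSuffixes.length, "last suffix:", seenSuffixes[seenSuffixes.length - 1]);
```

Running this over each failing log and comparing the last suffix would show whether the hang always follows the ?5 reload.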
Flags: needinfo?(aryx.bugmail)
Assignee: nobody → milan
Priority: -- → P1
Whiteboard: [gfx-noted]
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1381933 has 4 failures now (and one mistagged one). 3 of them show the issue after 6 reloads.
https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=115310624&lineNumber=20031 doesn't have "783041-2.html?"
Flags: needinfo?(aryx.bugmail)
Based on conversation in bug 1381283 comment 7, this whole thing started with bug 1362903.  Samael, can you take a look?  It stands to reason that resetting gCurrentURL would mess with a test that does a page reload :)
Flags: needinfo?(sawang)
Flags: needinfo?(jmaher)
I do not see how resetting that variable is causing issues. I recommend backing it out until we determine what is going on here.  Possibly this is an issue with crashtests only?
Flags: needinfo?(jmaher)
I'm looking into this (and bug 1381839).
Does anyone know how I could enable MOZ_REFTEST_VERBOSE on the try server to get a debug log? --setenv MOZ_REFTEST_VERBOSE=1 doesn't work for me...
I found that if the reload happens during RecvClear(), the next load event may come from the previous test document rather than from the blank URL. This worked in the past because in that case there would be another TEST-PASS for the previous test document, which would trigger another RecvClear().

As a workaround, we can retry loading the blank page if gClearingForAssertionCheck is true but the load event comes from another document. I'm making more try runs to verify it.

That, however, in conjunction with bug 1362903, implies that these reload test cases have been implemented incorrectly. I think these test cases should use "reftest-wait" so that they are not considered finished before all reloads are done. Otherwise, reftest / crashtest has no way to know that it should expect multiple reloads for a given test document.
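
The workaround described above can be sketched roughly as follows. This is an illustration of the idea only, not the landed patch: apart from gClearingForAssertionCheck, every name here (BLANK_URL, HarnessState, LoadAction, onLoadEvent, blankLoadRetries) is hypothetical, and "about:blank" is an assumed stand-in for whatever blank page the harness actually loads.

```typescript
// Assumption: stand-in for the harness's blank page URL.
const BLANK_URL = "about:blank";

// Hypothetical slice of harness state; only gClearingForAssertionCheck
// is a name taken from the discussion.
interface HarnessState {
  gClearingForAssertionCheck: boolean;
  blankLoadRetries: number;
}

type LoadAction = "handle-test-load" | "proceed-to-next-test" | "retry-blank-load";

// Decide what to do when a load event arrives.
function onLoadEvent(state: HarnessState, loadedUrl: string): LoadAction {
  if (!state.gClearingForAssertionCheck) {
    // Not clearing: this is an ordinary test document load.
    return "handle-test-load";
  }
  if (loadedUrl === BLANK_URL) {
    // The blank page arrived as expected; clearing is done.
    state.gClearingForAssertionCheck = false;
    return "proceed-to-next-test";
  }
  // We are clearing, but the event came from another document (for example
  // a reload of the previous test). Ask for the blank page again instead of
  // waiting for a load event that will never come.
  state.blankLoadRetries += 1;
  return "retry-blank-load";
}

const exampleState: HarnessState = { gClearingForAssertionCheck: true, blankLoadRetries: 0 };
// Hypothetical URL of a test document that reloaded during clearing.
console.log(onLoadEvent(exampleState, "file:///tests/783041-2.html?5"));
console.log(onLoadEvent(exampleState, BLANK_URL));
```

Modeling the decision as a pure function of the state and the loaded URL keeps the sketch easy to check; the real harness would issue the blank page load where this returns "retry-blank-load".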
Flags: needinfo?(sawang)
Comment on attachment 8888234 [details]
Bug 1381933 - Retry loading blank page if gClearingForAssertionCheck but load event comes from another URL.

https://reviewboard.mozilla.org/r/159184/#review164606

This looks like a good approach.  Do you want to modify some tests to use reftest-wait?  Please test on try with crashtest/reftest and retrigger a few times on all platforms.
Attachment #8888234 - Flags: review?(jmaher) → review+
23 failures in 190 pushes (0.121 failures/push) were associated with this bug yesterday.   

Repository breakdown:
* autoland: 13
* mozilla-inbound: 6
* try: 4

Platform breakdown:
* linux64-stylo-sequential: 7
* linux64-stylo: 7
* linux64-qr: 4
* linux64: 2
* linux32: 2
* windows7-32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1381933&startday=2017-07-20&endday=2017-07-20&tree=all
:freesamael, can you try pushing again? I would like to have this bug resolved on Monday, even if we back out and work on the proper fix offline.
Whiteboard: [gfx-noted] → [gfx-noted][stockwell needswork]
15 failures in 144 pushes (0.104 failures/push) were associated with this bug yesterday.   

Repository breakdown:
* mozilla-inbound: 7
* autoland: 7
* try: 1

Platform breakdown:
* linux64-stylo-sequential: 4
* linux32: 4
* linux64-stylo: 2
* linux64-qr: 2
* windows8-64: 1
* windows7-32: 1
* osx-10-10: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1381933&startday=2017-07-21&endday=2017-07-21&tree=all
(In reply to Samael Wang [:freesamael] from comment #11)
> Has reftest been so bad lately?
On Windows, especially in the VMs: Yes.
61 failures in 822 pushes (0.074 failures/push) were associated with this bug in the last 7 days. 

This is the #20 most frequent failure this week.  

** This failure happened more than 30 times this week! Resolving this bug is a high priority. **

** Try to resolve this bug as soon as possible. If unresolved for 2 weeks, the affected test(s) may be disabled. ** 

Repository breakdown:
* autoland: 34
* mozilla-inbound: 21
* try: 5
* mozilla-central: 1

Platform breakdown:
* linux64-stylo: 20
* linux64-stylo-sequential: 17
* linux64-qr: 9
* linux32: 6
* osx-10-10: 3
* linux64: 3
* windows7-32: 2
* windows8-64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1381933&startday=2017-07-17&endday=2017-07-23&tree=all
(In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on intermittent or backout) from comment #14)
> (In reply to Samael Wang [:freesamael] from comment #11)
> > Has reftest been so bad lately?
> On Windows, especially in the VMs: Yes.

Made a try push again; I'm not seeing an obvious difference with / without the patch on Windows. I believe it's not introducing a regression. Let's try to land this.
Keywords: checkin-needed

Comment 17

a year ago
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b756c4d0b7ff
Retry loading blank page if gClearingForAssertionCheck but load event comes from another URL. r=jmaher
Keywords: checkin-needed

Comment 18

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/b756c4d0b7ff
Status: NEW → RESOLVED
Last Resolved: a year ago
status-firefox56: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
22 failures in 152 pushes (0.145 failures/push) were associated with this bug yesterday.   

Repository breakdown:
* mozilla-inbound: 13
* try: 4
* autoland: 4
* pine: 1

Platform breakdown:
* linux64-stylo: 9
* linux64-stylo-sequential: 5
* linux64-qr: 2
* linux64: 2
* linux32: 2
* windows8-64: 1
* windows7-32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1381933&startday=2017-07-24&endday=2017-07-24&tree=all
Duplicate of this bug: 1381839
Whiteboard: [gfx-noted][stockwell needswork] → [gfx-noted][stockwell fixed:other]
23 failures in 1008 pushes (0.023 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* mozilla-inbound: 14
* try: 4
* autoland: 4
* pine: 1

Platform breakdown:
* linux64-stylo: 9
* linux64-stylo-sequential: 6
* windows7-32: 2
* linux64-qr: 2
* linux64: 2
* windows8-64: 1
* linux32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1381933&startday=2017-07-24&endday=2017-07-30&tree=all