Closed Bug 1360446 Opened 7 years ago Closed 7 years ago

Intermittent test_click.py TestLegacyClick.test_click | TimeoutException: Timeout loading page after 300000ms

Categories

(Testing :: Marionette Client and Harness, defect, P1)

Version 3
defect

Tracking

(firefox55 disabled, firefox56 fixed)

RESOLVED FIXED
mozilla56
Tracking Status
firefox55 --- disabled
firefox56 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: whimboo)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell disabled])

Attachments

(3 files)

Filed by: hskupin [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=95059748&repo=autoland

https://archive.mozilla.org/pub/firefox/tinderbox-builds/autoland-win64-debug/1493349925/autoland_win8_64-debug_test-marionette-e10s-bm110-tests1-windows-build512.txt.gz

We timeout for a get request:

1493354027007	Marionette	TRACE	449 -> [0,2,"get",{"url":"data:text/html;charset=utf-8,%0A%20%20%20%20%20%20%20%20%20%20%20%20%3Cbutton%3Eclick%20me%3C/button%3E%0A%20%20%20%20%20%20%20%20%20%20%20%20%3Cscript%3E%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20window.clicks%20%3D%200%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20let%20button%20%3D%20document.querySelector%28%22button%22%29%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20button.addEventListener%28%22click%22%2C%20%28%29%20%3D%3E%20window.clicks%2B%2B%29%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%3C/script%3E%0A%20%20%20%20%20%20%20%20"}]
1493354027074	Marionette	DEBUG	Received DOM event "pagehide" for "about:robots"
1493354027075	Marionette	DEBUG	Received DOM event "unload" for "about:robots"
[1464] WARNING: '!compMgr', file c:/builds/moz2_slave/autoland-w64-d-000000000000000/build/src/xpcom/components/nsComponentManagerUtils.cpp, line 63
[Child 1464] WARNING: '!compMgr', file c:/builds/moz2_slave/autoland-w64-d-000000000000000/build/src/xpcom/components/nsComponentManagerUtils.cpp, line 63
1493354027135	Marionette	DEBUG	Received observer notification "outer-window-destroyed" for "30"
++DOCSHELL 000000D67621C800 == 1 [pid = 1464] [id = {401630ee-9cf0-44ec-922f-214e9f79f610}]
++DOMWINDOW == 1 (000000D67621D000) [pid = 1464] [serial = 1] [outer = 0000000000000000]
++DOMWINDOW == 2 (000000D67CA0A000) [pid = 1464] [serial = 2] [outer = 000000D67621D000]
1493354027595	Marionette	DEBUG	loaded listener.js

Given that we haven't seen this before it might be related to bug 1335778.
Please note the additional received observer notification, which was caused by an upcoming remoteness change. So it should clearly be related to bug 1335778 then because the observer code has been added there.
All the failures happened on April 28th only, the day of landing the click+page load patch. I will have another look at this bug if there are more occurrences in the future.
With bug 1223277 we will have some improvements regarding window/tab closing. Lets see if it makes some differences.
Depends on: 1223277
The timeout here seems to be in `get` and specifically when loading the data URL:

1497006383667	Marionette	TRACE	430 -> [0,2,"get",{"url":"data:text/html;charset=utf-8,%0A%20%20%20%20%20%20%20%20%20%20%20%20%3Cbutton%3Eclick%20me%3C/button%3E%0A%20%20%20%20%20%20%20%20%20%20%20%20%3Cscript%3E%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20window.clicks%20%3D%200%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20let%20button%20%3D%20document.querySelector%28%22button%22%29%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20button.addEventListener%28%22click%22%2C%20%28%29%20%3D%3E%20window.clicks%2B%2B%29%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%3C/script%3E%0A%20%20%20%20%20%20%20%20"}]
1497006383670	Marionette	DEBUG	Received DOM event "beforeunload" for "about:robots"
[Child 1968] WARNING: '!mMainThread', file /home/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 330
[Child 1968] WARNING: '!mMainThread', file /home/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 330
--DOCSHELL 0x11b584800 == 2 [pid = 1968] [id = {231305ae-316a-b14f-a386-09959cd627be}]
--DOCSHELL 0x11b58b800 == 1 [pid = 1968] [id = {fe6ea7e4-cec4-ff42-8396-77651b46ecb3}]
--DOCSHELL 0x1140f1000 == 0 [pid = 1968] [id = {3f96298a-3810-de40-8282-c035b564a282}]
1497006383777	Marionette	DEBUG	Received DOM event "pagehide" for "about:robots"
1497006383777	Marionette	DEBUG	Received DOM event "unload" for "about:robots"
++DOCSHELL 0x11bd66000 == 1 [pid = 1969] [id = {637d8ea9-07ed-ca4c-ac47-3c8f416bc7d7}]
++DOMWINDOW == 1 (0x11bd66800) [pid = 1969] [serial = 1] [outer = 0x0]
1497006383807	Marionette	DEBUG	Received observer notification "outer-window-destroyed" for "23"
++DOMWINDOW == 2 (0x12586c800) [pid = 1969] [serial = 2] [outer = 0x11bd66800]
1497006383997	Marionette	DEBUG	loaded listener.js
[..]
1497006684772	Marionette	TRACE	430 <- [1,2,{"error":"timeout","message":"Timeout loading page after 300000ms","stacktrace":"WebDriverError@chrome://marionette/content/error.js:222:5\nTimeoutError@chrome://marionette/content/error.js:472:5\nnotify@chrome://marionette/content/listener.js:296:17\n"},null]

It means right after triggering the page load the framescript gets reloaded. Then no single load event is taking place.

I assume this is related to bug 1368434.
Depends on: 1368434
this intermittent failure has 50 instances in the last week almost all on debug.

:whimboo, I see you have a possible root cause, but that bug has no progress in almost 3 weeks- can you take a look at this bug in the next week and see if we can get a fix or consider disabling the test?
Flags: needinfo?(hskupin)
Whiteboard: [stockwell needswork]
Given the failure rate and higher priority items to work on this week than fixing this test, lets disable for now:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7544bf58474a00e373647b04e96ffbce0aca9bc8
Attachment #8879943 - Flags: review?(gbrown)
Attachment #8879943 - Flags: review?(gbrown) → review+
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/fcc60b7fa3ef
Intermittent test_click.py TestLegacyClick.test_click. temporarily disable on debug+e10s. r=gbrown
Whiteboard: [stockwell needswork] → [stockwell disabled]
We need to run this test on e10s with Marionette.  whimboo, do you have
capacity to look into it?
Priority: -- → P1
This is not happening on OS X so I cannot reproduce it. But please see the dependency list. I think it is related to bug 1368434, and I can reproduce such an issue via the failure on bug 1353599. This means we need bug 1368434 fixed.
Flags: needinfo?(hskupin)
Joel, your skip patch was for debug and e10s only. But it looks like that opt builds are failing too very often now. Maybe we have to skip more broader.
Flags: needinfo?(gbrown)
Attached patch skip on all e10sSplinter Review
I totally agree. How about this?
Flags: needinfo?(gbrown)
Attachment #8885282 - Flags: review?(hskupin)
Comment on attachment 8885282 [details] [diff] [review]
skip on all e10s

Looks good. Thanks.
Attachment #8885282 - Flags: review?(hskupin) → review+
Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c22ccebf80cd
Broaden skip of test_click.py TestLegacyClick.test_click for continued intermittent failures; r=whimboo
(In reply to Geoff Brown [:gbrown] from comment #25)
> Created attachment 8885282 [details] [diff] [review]
> skip on all e10s
> 
> I totally agree. How about this?

Geoff, I actually didn't really notice that we skip ALL click tests while this is a failure only for a single test inside this file. We should really revert that and only skip the failing test. Can you work on that? Sorry, that this slipped through but it is major now. Thanks.
Flags: needinfo?(gbrown)
Comment on attachment 8887230 [details] [diff] [review]
skip only TestLegacyClick.test_click, only on e10s

r=me with a green try push. Thanks a lot!
Attachment #8887230 - Flags: review?(hskupin) → review+
Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e3dcb77ea22f
Enable marionette test_click.py, but disable the troublesome TestLegacyClick.test_click part, on e10s; r=whimboo
Fixed by my patch on bug 1368434.
Assignee: nobody → hskupin
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
Product: Testing → Remote Protocol
Moving bug to Testing::Marionette Client and Harness component per bug 1815831.
Component: Marionette → Marionette Client and Harness
Product: Remote Protocol → Testing
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: