Closed
Bug 1054793
Opened 10 years ago
Closed 10 years ago
Intermittent test_shared_all.py TestSharedUnits.test_units | NoSuchElementException: NoSuchElementException: Unable to locate element: complete
Categories
(Hello (Loop) :: Client, defect, P1)
Tracking
(firefox32 unaffected, firefox33 fixed, firefox34 fixed, firefox-esr24 unaffected, firefox-esr31 unaffected)
Tracking | Status | |
---|---|---|
firefox32 | --- | unaffected |
firefox33 | --- | fixed |
firefox34 | --- | fixed |
firefox-esr24 | --- | unaffected |
firefox-esr31 | --- | unaffected |
People
(Reporter: RyanVM, Assigned: jib)
References
Details
(Keywords: intermittent-failure)
Attachments
(4 files, 1 obsolete file)
497.56 KB,
image/png
|
Details | |
495.10 KB,
image/png
|
Details | |
1.33 KB,
patch
|
Details | Diff | Splinter Review | |
1.49 KB,
patch
|
dmosedale
:
review+
|
Details | Diff | Splinter Review |
https://tbpl.mozilla.org/php/getParsedLog.php?id=46132787&tree=Fx-Team Ubuntu VM 12.04 fx-team debug test marionette on 2014-08-17 13:46:54 PDT for push 37ac55a26014 slave: tst-linux32-spot-517 14:00:50 INFO - TEST-START | test_shared_all.py TestSharedUnits.test_units 14:01:04 ERROR - TEST-UNEXPECTED-ERROR | test_shared_all.py TestSharedUnits.test_units | NoSuchElementException: NoSuchElementException: Unable to locate element: complete 14:01:04 INFO - Traceback (most recent call last): 14:01:04 INFO - File "/builds/slave/test/build/tests/marionette/marionette/marionette_test.py", line 171, in run 14:01:04 INFO - testMethod() 14:01:04 INFO - File "/builds/slave/test/build/tests/marionette/tests/browser/components/loop/test/shared/test_shared_all.py", line 16, in test_units 14:01:04 INFO - self.check_page("index.html") 14:01:04 INFO - File "/builds/slave/test/build/tests/marionette/tests/browser/components/loop/test/shared/frontend_tester.py", line 89, in check_page 14:01:04 INFO - self.marionette.find_element("id", 'complete') 14:01:04 INFO - File "/builds/slave/test/build/tests/marionette/marionette/marionette.py", line 1274, in find_element 14:01:04 INFO - response = self._send_message('findElement', 'value', **kwargs) 14:01:04 INFO - File "/builds/slave/test/build/tests/marionette/marionette/decorators.py", line 35, in _ 14:01:04 INFO - return func(*args, **kwargs) 14:01:04 INFO - File "/builds/slave/test/build/tests/marionette/marionette/marionette.py", line 634, in _send_message 14:01:04 INFO - self._handle_error(response) 14:01:04 INFO - File "/builds/slave/test/build/tests/marionette/marionette/marionette.py", line 668, in _handle_error 14:01:04 INFO - raise errors.NoSuchElementException(message=message, status=status, stacktrace=stacktrace) 14:01:04 INFO - TEST-INFO expected PASS | took 13406ms
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 7•10 years ago
|
||
There was substantial discussion and debugging about this yesterday in #ateam, starting at http://logs.glob.uno/?c=mozilla%23ateam&s=18+Aug+2014&e=18+Aug+2014#c843455 And pretty much any pieces of the chat log that I'm involved up to and including 22:02. The try server runs that chmanster mentioned pushing did not yield any assertions. A reasonable next step might be to either push that to mozilla-central since it might only assert in the failing case. Even better would be to do something like I describe starting around 21:54.
Comment 8•10 years ago
|
||
Randell, Jan-Ivar -- I'd like to find a regression window for this bug and then see if we can eliminate this intermittent.
Target Milestone: --- → mozilla34
Updated•10 years ago
|
Priority: -- → P1
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 12•10 years ago
|
||
Jib -- Can you look at this? First we want to find regression window and then find the root cause. We suspect that Standard8's checkin of the initial web sockets work (bug 1022594) on Friday caused this intermittent, but we don't know. Standard8 is on PTO until Monday and dmose is unavailable for questions until tomorrow (Thursday). However, please feel free to ask for support in #media.
Assignee: nobody → jib
Assignee | ||
Comment 13•10 years ago
|
||
(In reply to Maire Reavy [:mreavy] (Plz needinfo me) from comment #12) Sorry for leaving this hanging. The IRC chat in comment 7 suggests fixing the logging harness to emit the AssertionError message that would tell us which assertion failed on future runs on tbpl (though I see no more TBPL Robot messages here since Tuesday) - I don't have access rights to push to inbound as suggested nor do I have a good grasp on the marionette harness, so I may not be the best person to pursue that avenue. However, it reproduces well locally for me (but again without assertion message): > ★ ~/moz/mozilla-central $ ./mach marionette-test browser/components/loop/manifest.ini > ... > 0:11.45 LOG: MainThread INFO starting httpd > 0:11.45 LOG: MainThread INFO running webserver on http://127.0.0.1:56581/ > 0:11.45 SUITE_START: MainThread 3 > 0:11.46 TEST_START: MainThread test_shared_all.py TestSharedUnits.test_units > 0:27.58 TEST_END: MainThread ERROR, expected PASS > Traceback (most recent call last): > File "/Users/Jan/moz/mozilla-central/testing/marionette/client/marionette/marionette_test.py", line 171, in run > testMethod() > File "/Users/Jan/moz/mozilla-central/browser/components/loop/test/shared/test_shared_all.py", line 16, in test_units > self.check_page("index.html") > File "/Users/Jan/moz/mozilla-central/browser/components/loop/test/shared/frontend_tester.py", line 89, in check_page > self.marionette.find_element("id", 'complete') > File "/Users/Jan/moz/mozilla-central/testing/marionette/client/marionette/marionette.py", line 1278, in find_element > response = self._send_message('findElement', 'value', **kwargs) > File "/Users/Jan/moz/mozilla-central/testing/marionette/client/marionette/decorators.py", line 35, in _ > return func(*args, **kwargs) > File "/Users/Jan/moz/mozilla-central/testing/marionette/client/marionette/marionette.py", line 638, in _send_message > self._handle_error(response) > File "/Users/Jan/moz/mozilla-central/testing/marionette/client/marionette/marionette.py", line 672, in _handle_error > raise errors.NoSuchElementException(message=message, status=status, stacktrace=stacktrace) > NoSuchElementException: NoSuchElementException: Unable to locate element: complete But I'm not sure, is it possible to run marionette tests on mozregression builds?
Flags: needinfo?(dmose)
Comment 14•10 years ago
|
||
As I mentioned in the linked discussion on Monday, I haven't reproduced a case where the test harness omits relevant logging. The log pasted above and the one associated with the failure in automation look to me like an assertion originating from python that logs a relevant call stack and message. My hypothesis wasn't proved by the push to try, and relies on the idea that somewhere in the test run the current working directory is intermittently changed to something unexpected, so this may be a long shot.
Comment 15•10 years ago
|
||
(In reply to Maire Reavy [:mreavy] (Plz needinfo me) from comment #12) > We suspect that Standard8's checkin of the > initial web sockets work (bug 1022594) on Friday caused this intermittent, > but we don't know. Ok, useful data point... starting to try and link some dots... (In reply to Jan-Ivar Bruaroey [:jib] from comment #13) > However, it reproduces well locally for me (but again without assertion > message): Ok, I think the assertion message is a separate bug/issue. > > NoSuchElementException: NoSuchElementException: Unable to locate element: complete This could be the key. With this message, there's two possible issues that I know of: 1) The page is taking too long to process, and hits the 10 second timeout. 2) There's been a fatal error on the page. Jib, as you can reproduce, can you try a couple of things: 1) Set the timeout in this line to something like 60 seconds (or even longer maybe): http://hg.mozilla.org/mozilla-central/annotate/daa84204a11a/browser/components/loop/test/shared/frontend_tester.py#l66 See if that removes the issue. 2) Grab a screenshot of the failing case. - In http://hg.mozilla.org/mozilla-central/annotate/daa84204a11a/browser/components/loop/test/shared/frontend_tester.py#l89 - Wrap the find_element('complete') in a try..except..raise with dumping a screenshot, similar to: try: self.marionette.find_element("id", 'complete') except: print "data:image/png;base64,%s" % self.marionette.screenshot() raise > But I'm not sure, is it possible to run marionette tests on mozregression > builds? Lets see what the above give us first. To explain the dots I'm starting to link: If "complete" is missing, the likelyhood is that something has gone wrong in the page or timed out. Bug 1022594 added some new code, that was working fine locally, but did introduce ~150ms delays into the test occasionally. test_shared_all.py is the first test to run in the sequence of tests for the Loop folder (though I don't know how Marionette runs wrt browser loads/restarts on tbpl). If the browser is slow starting up and was previously on the edge, then a 150ms delay or two could push us over the 10 seconds that we wait for "complete" to be displayed.
Comment 16•10 years ago
|
||
Meant to needinfo Jib on that last comment.
Flags: needinfo?(dmose) → needinfo?(jib)
Assignee | ||
Comment 17•10 years ago
|
||
(In reply to Mark Banner (:standard8) from comment #15) > 1) The page is taking too long to process, and hits the 10 second timeout. > 2) There's been a fatal error on the page. > > Jib, as you can reproduce, can you try a couple of things: > > 1) Set the timeout in this line to something like 60 seconds (or even longer > maybe): > http://hg.mozilla.org/mozilla-central/annotate/daa84204a11a/browser/ > components/loop/test/shared/frontend_tester.py#l66 > > See if that removes the issue. That was totally it! On my machine the first test takes around 17s, and this timed out at around 12s. I need a faster machine apparently. :-) Is 60 the number we want?
Flags: needinfo?(jib) → needinfo?(standard8)
Assignee | ||
Comment 18•10 years ago
|
||
Attachment #8478299 -
Flags: review?(standard8)
Comment 19•10 years ago
|
||
Comment on attachment 8478299 [details] [diff] [review] increase marionette find_element timeout for intermittent Review of attachment 8478299 [details] [diff] [review]: ----------------------------------------------------------------- Yeah, I think 60 is reasonable here as it is a timeout value.
Attachment #8478299 -
Flags: review?(standard8) → review+
Updated•10 years ago
|
Flags: needinfo?(standard8)
Assignee | ||
Comment 20•10 years ago
|
||
Assignee | ||
Comment 21•10 years ago
|
||
Assignee | ||
Comment 22•10 years ago
|
||
Updated commit msg.
Attachment #8478299 -
Attachment is obsolete: true
Assignee | ||
Updated•10 years ago
|
Keywords: checkin-needed
Reporter | ||
Comment 23•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/78a46df3e3bd
Keywords: checkin-needed
Reporter | ||
Comment 24•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/78a46df3e3bd
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment 25•10 years ago
|
||
Thanks for the fix, jib! One not is that the comment above the statement you changed has become wrong. A good fix for that might be to not put the constant in the comment at all. :-)
Assignee | ||
Comment 26•10 years ago
|
||
(In reply to Dan Mosedale (:dmose - away until 8/21) from comment #25) > Thanks for the fix, jib! One not is that the comment above the statement > you changed has become wrong. A good fix for that might be to not put the > constant in the comment at all. :-) Ugh, sorry missed that.
Attachment #8478695 -
Flags: review?(dmose)
Comment 27•10 years ago
|
||
Comment on attachment 8478695 [details] [diff] [review] comment fix Review of attachment 8478695 [details] [diff] [review]: ----------------------------------------------------------------- Thanks! r=dmose
Attachment #8478695 -
Flags: review?(dmose) → review+
Assignee | ||
Updated•10 years ago
|
Comment 28•10 years ago
|
||
https://hg.mozilla.org/integration/fx-team/rev/d3b46bc636ad
Keywords: checkin-needed
Reporter | ||
Comment 29•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/d3b46bc636ad
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 30•10 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/7d5c706a4f84
status-firefox32:
--- → unaffected
status-firefox33:
--- → fixed
status-firefox34:
--- → fixed
status-firefox-esr24:
--- → unaffected
Reporter | ||
Updated•10 years ago
|
status-firefox-esr31:
--- → unaffected
Comment 31•10 years ago
|
||
Untracking for QA. Please needinfo me to request specific testing.
Whiteboard: [qa-]
Comment hidden (Legacy TBPL/Treeherder Robot) |
Blocks: 1000240
I'm not entirely sure, but I think bug 1000240 has encountered this bug again, preventing it to land; see: https://bugzilla.mozilla.org/show_bug.cgi?id=1000240#c40 https://tbpl.mozilla.org/php/getParsedLog.php?id=49008339&tree=Fx-Team https://tbpl.mozilla.org/php/getParsedLog.php?id=49009152&tree=Fx-Team As this might be something different and me not being an expert with Marionette, NI :dmose and :Standard8 here.
Flags: needinfo?(standard8)
Flags: needinfo?(dmose)
Comment 34•10 years ago
|
||
(In reply to Nicolas Perriault (:NiKo`) — needinfo me if you need my attention from comment #33) > I'm not entirely sure, but I think bug 1000240 has encountered this bug > again, preventing it to land; see: > > https://bugzilla.mozilla.org/show_bug.cgi?id=1000240#c40 > https://tbpl.mozilla.org/php/getParsedLog.php?id=49008339&tree=Fx-Team > https://tbpl.mozilla.org/php/getParsedLog.php?id=49009152&tree=Fx-Team > > As this might be something different and me not being an expert with > Marionette, NI :dmose and :Standard8 here. Unlikely, for two reasons: 16:46:07 INFO - TEST-START ... 16:46:16 ERROR - TEST-UNEXPECTED-ERROR That's ~ 9 seconds, whereas we've now got a 60 second timeout. Also, there's: 16:46:16 ERROR - PROCESS-CRASH | test_standalone_all.py TestDesktopUnits.test_units | application crashed [None] Which is slightly more critical. I'll follow-up in bug 1000240.
Flags: needinfo?(standard8)
Flags: needinfo?(dmose)
Updated•10 years ago
|
Iteration: --- → 34.3
You need to log in
before you can comment on or make changes to this bug.
Description
•