Closed
Bug 1190791
Opened 10 years ago
Closed 10 years ago
Again failures in various tests in self.marionette.start_session() : IOError: Connection to Marionette server is lost
Categories
(Firefox OS Graveyard :: Gaia::UI Tests, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: martijn.martijn, Unassigned)
References
Details
(Keywords: regression)
Attachments
(1 file)
I think I see similar failures again like we saw in bug 1172343 :(
http://jenkins1.qa.scl3.mozilla.com/view/Bitbar/job/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.1.bitbar/207/HTML_Report/
Traceback (most recent call last):
File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.1.bitbar/workspace/.env/lib/python2.7/site-packages/marionette_client-0.16-py2.7.egg/marionette/marionette_test.py", line 277, in run
self.setUp()
File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.1.bitbar/workspace/tests/python/gaia-ui-tests/gaiatest/tests/functional/system/test_privileged_app_video_capture_prompt.py", line 13, in setUp
GaiaTestCase.setUp(self)
File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.1.bitbar/workspace/tests/python/gaia-ui-tests/gaiatest/gaia_test.py", line 862, in setUp
self.device.start_b2g()
File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.1.bitbar/workspace/tests/python/gaia-ui-tests/gaiatest/gaia_test.py", line 663, in start_b2g
self.marionette.start_session()
File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.1.bitbar/workspace/.env/lib/python2.7/site-packages/marionette_driver-0.9-py2.7.egg/marionette_driver/marionette.py", line 1015, in start_session
self.session = self._send_message('newSession', 'value', capabilities=desired_capabilities, sessionId=session_id)
File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.1.bitbar/workspace/.env/lib/python2.7/site-packages/marionette_driver-0.9-py2.7.egg/marionette_driver/decorators.py", line 36, in _
return func(*args, **kwargs)
File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.1.bitbar/workspace/.env/lib/python2.7/site-packages/marionette_driver-0.9-py2.7.egg/marionette_driver/marionette.py", line 691, in _send_message
response = self.client.send(message)
File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.1.bitbar/workspace/.env/lib/python2.7/site-packages/marionette_transport-0.5-py2.7.egg/marionette_transport/transport.py", line 101, in send
self.connect()
File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.1.bitbar/workspace/.env/lib/python2.7/site-packages/marionette_transport-0.5-py2.7.egg/marionette_transport/transport.py", line 89, in connect
hello = self.receive()
File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.1.bitbar/workspace/.env/lib/python2.7/site-packages/marionette_transport-0.5-py2.7.egg/marionette_transport/transport.py", line 73, in receive
raise IOError(self.connection_lost_msg)
IOError: Connection to Marionette server is lost. Check gecko.log (desktop firefox) or logcat (b2g) for errors.
I also saw it in smoke 2 of bitbar and I guess this also happens in other places.
Comment 1•10 years ago (Reporter)
Oliver, would you perhaps be willing to find out when this regressed again (I think you can also look at bitbar to see when it regressed)?
Flags: needinfo?(onelson)
Keywords: regressionwindow-wanted
Comment 2•10 years ago (Reporter)
Again, with the patch in bug 1172343, comment 28, and running that test, I get this failure in 2 out of 3 repeats.
Comment 3•10 years ago
On mozilla-central Jenkins jobs, I almost never see this occur on smoke runs. I'm curious if it's because the tests don't run enough for this to occur. I see this most commonly in the non-smoke runs, and from what I can discern it appears it started reproing again on August 1st:
* http://jenkins1.qa.scl3.mozilla.com/job/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.1/382/
* http://jenkins1.qa.scl3.mozilla.com/job/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.1.bitbar/204/
This error really hurts automation testing because the tests that fail with it take 15 minutes before they close out. The failing test always appears to be the last one reported in the HTML report, which I assume means it was the last one run by the Marionette test client. Is it possible the test is taking too long and the client is losing its port from adb?
Could we modify the timeout on tests so they never take more than 5 minutes? It would at least reduce some of the time overhead created by this failure.
Flags: needinfo?(onelson) → needinfo?(martijn.martijn)
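As a rough illustration of the cap suggested in comment 3, a time limit around a slow step might look like the sketch below. It is written for Python 3, and `run_with_timeout` and `StepTimeout` are hypothetical names, not part of marionette_client or gaiatest. The worker thread is not killed on timeout, so this only bounds how long the caller waits; it does not address the underlying connection loss.

```python
import threading


class StepTimeout(Exception):
    """Raised when the wrapped callable does not finish in time."""


def run_with_timeout(func, timeout_seconds):
    """Run func() in a worker thread and wait at most timeout_seconds.

    Hypothetical helper for illustration only. On timeout the worker is
    abandoned (it is a daemon thread), not terminated.
    """
    result = {}

    def target():
        try:
            result["value"] = func()
        except Exception as error:  # hand the error back to the caller
            result["error"] = error

    worker = threading.Thread(target=target)
    worker.daemon = True
    worker.start()
    worker.join(timeout_seconds)

    if worker.is_alive():
        raise StepTimeout("no result after %s seconds" % timeout_seconds)
    if "error" in result:
        raise result["error"]
    return result["value"]
```

With a 5 minute budget, a call would look like `run_with_timeout(some_step, 300)`, where `some_step` stands for whatever callable is being bounded.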
Comment 4•10 years ago (Reporter)
(In reply to Oliver Nelson [:oliverthor] from comment #3)
> Could we modify the timeout on tests so they never take more than 5 minutes?
> It would at least reduce some of the time overhead created by this failure.
I don't understand what you mean. We should never get these kinds of failures in the first place. They are very disruptive.
Flags: needinfo?(martijn.martijn)
Comment 5•10 years ago
Comment 6•10 years ago (Reporter)
With the test in the pull request, I can reproduce it once the 2nd run in there starts. The first test takes something like 272413 ms.
When this issue occurs, I see only this message appearing, repeatedly:
V/WLAN_PSA( 215): NL MSG, len[048], NL type[0x11] WNI type[0x5050] len[028
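To check whether one tag really dominates the log while the failure is happening, a captured logcat dump can be tallied per tag. The sketch below assumes Python 3 and logcat's "brief" line format (`priority/tag(pid): message`), which is the shape of the line quoted above; `count_tags` and `BRIEF_LINE` are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Matches logcat "brief" lines such as "V/WLAN_PSA(  215): message".
# Assumption: tags contain no whitespace, which holds for the line above.
BRIEF_LINE = re.compile(r"^([VDIWEF])/(\S+?)\s*\(\s*(\d+)\):\s?(.*)$")


def count_tags(lines):
    """Return a Counter of how many lines each logcat tag produced.

    Lines that do not look like brief-format logcat output are skipped.
    """
    counts = Counter()
    for line in lines:
        match = BRIEF_LINE.match(line)
        if match:
            counts[match.group(2)] += 1
    return counts
```

Calling `count_tags(dump.splitlines()).most_common(5)` on a saved dump lists the noisiest tags first.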
Comment 7•10 years ago (Reporter)
It's passing on:
Build ID 20150731150205
Gaia Revision 2ca27bbdd84526c6a3b198d9cf10f2caff1dadde
Gaia Date 2015-07-31 08:23:31
Gecko Revision https://hg.mozilla.org/mozilla-central/rev/afa67b6957bb
Gecko Version 42.0a1
Device Name flame
Firmware(Release) 4.4.2
Firmware(Incremental) eng.cltbld.20150727.063909
Firmware Date Mon Jul 27 06:39:20 EDT 2015
Bootloader L1TC000118D0
It fails on:
Build ID 20150801030207
Gaia Revision 2ca27bbdd84526c6a3b198d9cf10f2caff1dadde
Gaia Date 2015-07-31 08:23:31
Gecko Revision https://hg.mozilla.org/mozilla-central/rev/aeb85029c3b3
Gecko Version 42.0a1
Device Name flame
Firmware(Release) 4.4.2
Firmware(Incremental) eng.cltbld.20150727.063909
Firmware Date Mon Jul 27 06:39:20 EDT 2015
Bootloader L1TC000118D0
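A quick way to see which fields differ between the two builds is to compare them key by key. The sketch below (Python 3) copies a subset of the values listed above into two dictionaries; `changed_fields` is a hypothetical helper name.

```python
# Values copied from the passing and failing builds listed above.
PASSING = {
    "Build ID": "20150731150205",
    "Gaia Revision": "2ca27bbdd84526c6a3b198d9cf10f2caff1dadde",
    "Gecko Revision": "https://hg.mozilla.org/mozilla-central/rev/afa67b6957bb",
    "Gecko Version": "42.0a1",
    "Firmware(Incremental)": "eng.cltbld.20150727.063909",
}
FAILING = {
    "Build ID": "20150801030207",
    "Gaia Revision": "2ca27bbdd84526c6a3b198d9cf10f2caff1dadde",
    "Gecko Revision": "https://hg.mozilla.org/mozilla-central/rev/aeb85029c3b3",
    "Gecko Version": "42.0a1",
    "Firmware(Incremental)": "eng.cltbld.20150727.063909",
}


def changed_fields(old, new):
    """Return the sorted names of fields whose values differ."""
    keys = set(old) | set(new)
    return sorted(key for key in keys if old.get(key) != new.get(key))


print(changed_fields(PASSING, FAILING))
```

Only the build ID and the Gecko revision differ; the Gaia revision and firmware are identical in both builds.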
Comment 8•10 years ago (Reporter)
Gecko changelog between those builds:
https://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2015-07-31+11%3A10%3A16&enddate=2015-08-01+05%3A10%3A00
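For other regression windows, the same kind of pushlog link can be built from a start and end timestamp. This is a small Python 3 sketch; `pushlog_url` is a hypothetical helper, and the only thing assumed about the server is the `startdate`/`enddate` query format visible in the URL above.

```python
from urllib.parse import urlencode

PUSHLOG = "https://hg.mozilla.org/mozilla-central/pushloghtml"


def pushlog_url(start, end):
    """Build a pushlog URL for the window between two timestamps.

    Timestamps are plain strings such as "2015-07-31 11:10:16";
    urlencode turns spaces into "+" and ":" into "%3A".
    """
    query = urlencode([("startdate", start), ("enddate", end)])
    return PUSHLOG + "?" + query


print(pushlog_url("2015-07-31 11:10:16", "2015-08-01 05:10:00"))
```

With the two timestamps shown, this reproduces the link above.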
Comment 9•10 years ago (Reporter)
There was no change in Gaia between those builds, so there is only a Gecko changelog to look for.
It looks like this was caused by bug 1180596.
We had similar issues before, for which we filed bug 1172343. In that bug, the regression range corresponded to the Presentation WebAPI and the fix range to disabling it.
It also makes me wonder whether bug 1171827 might be back again.
Gary, can you take a look at this?
Comment 10•10 years ago (Reporter)
Fabrice just disabled device discovery again in bug 1196884, so this should be fixed tomorrow.
Comment 11•10 years ago (Reporter)
Johan, do you think it would be useful to have the test that I've attached to this bug checked in as a regression test?
Flags: needinfo?(xeonchen) → needinfo?(jlorenzo)
Comment 12•10 years ago
Yes, this test is very valuable to us. I'd put it in the sanity suite. What do you guys think?
Flags: needinfo?(martijn.martijn)
Flags: needinfo?(jlorenzo)
Flags: needinfo?(jdorlus)
Comment 13•10 years ago (Reporter)
The sanity suite probably makes the most sense. It doesn't really belong in the unit test suite, but I don't know which functional area it should belong in.
Flags: needinfo?(martijn.martijn)
Comment 14•10 years ago (Reporter)
I just filed bug 1198264 to keep track of adding this to the test suite.
Comment 15•10 years ago
Yes, I agree that it should go in the sanity suite.
Flags: needinfo?(jdorlus)
Comment 16•10 years ago (Reporter)
I can still reproduce this in the latest Flame build, using the automated test from bug 1198264. Reopening.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 17•10 years ago (Reporter)
This is still very likely caused by bug 1180596.
Flags: needinfo?(xeonchen)
Comment 18•10 years ago (Reporter)
Hmm, perhaps this is a different issue. It is much easier to reproduce, in all kinds of testing, in the latest Flame build.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Flags: needinfo?(xeonchen)
Resolution: --- → FIXED
Comment 19•10 years ago (Reporter)
(In reply to Martijn Wargers [:mwargers] (QA) from comment #18)
> Hmm, perhaps this is a different issue, this is much easier to reproduce in
> all kinds of testing in the latest Flame build.
I filed bug 1198950 for this.
Comment 20•10 years ago (Reporter)
However, the automated test in bug 1198264 is still causing this when I test it. At this point, I'll hold off on further testing until bug 1198950 is fixed.
Depends on: 1198950