Closed
Bug 955626
Opened 11 years ago
Closed 10 years ago
Try to wait for b2g startup without execute_script
Categories
(Firefox OS Graveyard :: Gaia::UI Tests, defect, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: zcampbell, Assigned: zcampbell)
References
Details
Attachments
(1 file)
I'd like to revisit this as a task. My rationale, and the case for P1, is three-fold:

1) Those who don't work on the suite log or interpret the failures as framework intermittents instead of Gaia bugs, which reflects badly on what is otherwise a reliable test suite. The failures are not related to the reliability of the test suite.

2) Intermittents on Travis and TBPL cost devs and sheriffs productivity and confidence, as they have to re-run the suite. In some cases the exception will be an early warning sign of broken Gaia functionality, but in that scenario a test case will pick up the broken functionality during the test run anyway, without compromising stability.

3) The gaiatest package is shared widely, including externally, and `start_b2g` failing on unrelated problems can block a lot of test coverage.

Since the AppWindowManager was finalised we might be able to reliably wait for a DOM property (for example the 'active' class on the loaded iframe) instead of using execute_script. It may also be worth revisiting Bebe's try/except loop idea, which was considered not ideal at the time; the importance of this issue has changed, so it might be worth it now.
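For illustration only, a try/except polling loop of the kind mentioned above could look roughly like the sketch below. This is not the patch attached to this bug: `wait_until`, `StartupTimeout`, `FakeDevice` and `find_active_frame` are hypothetical names, and the fake device stands in for a real lookup of the iframe carrying the 'active' class.

```python
import time


class StartupTimeout(Exception):
    """Raised when the condition is not met before the deadline."""


def wait_until(condition, timeout=30.0, interval=0.5, ignored=(Exception,)):
    """Poll `condition` until it returns a truthy value or `timeout` elapses.

    Exceptions listed in `ignored` are treated as "not ready yet" and retried.
    """
    deadline = time.time() + timeout
    last_error = None
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored as error:
            last_error = error
        if time.time() >= deadline:
            raise StartupTimeout(
                "condition not met within %ss (last error: %r)" % (timeout, last_error))
        time.sleep(interval)


class FakeDevice(object):
    """Hypothetical stand-in for a device that needs a few polls to start up."""

    def __init__(self, polls_until_ready):
        self.polls = 0
        self.polls_until_ready = polls_until_ready

    def find_active_frame(self):
        # A real implementation would look up the loaded iframe here.
        self.polls += 1
        if self.polls < self.polls_until_ready:
            raise LookupError("no iframe with the 'active' class yet")
        return "iframe.active"


if __name__ == "__main__":
    device = FakeDevice(polls_until_ready=3)
    print(wait_until(device.find_active_frame, timeout=5.0, interval=0.1))
```

The point of the design is that a lookup failure during startup counts as "not ready yet" and is retried, so only a genuine timeout surfaces as an error.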
Updated•10 years ago (Assignee)
Assignee: nobody → zcampbell
Comment 1•10 years ago (Assignee)
Attachment #8355218 - Flags: review?(dave.hunt)
Attachment #8355218 - Flags: review?(bob.silverberg)
Comment 2•10 years ago (Assignee)
Also device test run here: http://qa-selenium.mv.mozilla.com:8080/job/b2g.hamachi.mozilla-central.ui.adhoc/64/console
Comment 3•10 years ago (Assignee)
I'm surprised by the adhoc test results; it's hard to see how they relate to this patch. The failures seem to be a keyboard problem, unless startup is now so much faster that the lazy-loaded keyboard is not yet initialized. I've retriggered the run, but I'll put this aside for a bit and work on some intermittents.
Comment 5•10 years ago
Comment on attachment 8355218
github pr

I don't think it's worth reviewing at this stage due to the failures. Please re-request review once these are addressed.
Attachment #8355218 - Flags: review?(dave.hunt)
Attachment #8355218 - Flags: review?(bob.silverberg)
Comment 6•10 years ago (Assignee)
(In reply to Dave Hunt (:davehunt) from comment #5)
> I don't think it's worth reviewing at this stage due to the failures. Please
> re-request review once these are addressed.

Yeah sorry about that, was planning to debug it locally, but the future happened sooner than I anticipated..
Comment 7•10 years ago (Assignee)
Seems fine locally now; I'll rebase and rebuild the adhoc. Fear it might be our old friend the update toaster!
Comment 8•10 years ago
I took a look at this as well and I also saw a failure locally trying to connect to cell data on one of the tests I ran:

```
test_browser_cell_data (test_browser_cell_data.TestBrowserCellData) ... ERROR
======================================================================
ERROR: None
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/bsilverberg/.virtualenvs/gaia/lib/python2.7/site-packages/marionette_client-0.7.2-py2.7.egg/marionette/marionette_test.py", line 127, in run
    self.setUp()
  File "/Users/bsilverberg/gitRepos/gaia/tests/python/gaia-ui-tests/gaiatest/tests/functional/browser/test_browser_cell_data.py", line 17, in setUp
    self.data_layer.connect_to_cell_data()
  File "/Users/bsilverberg/gitRepos/gaia/tests/python/gaia-ui-tests/gaiatest/gaia_test.py", line 298, in connect_to_cell_data
    result = self.marionette.execute_async_script("return GaiaDataLayer.connectToCellData()", special_powers=True)
  File "/Users/bsilverberg/.virtualenvs/gaia/lib/python2.7/site-packages/marionette_client-0.7.2-py2.7.egg/marionette/marionette.py", line 1080, in execute_async_script
    filename=os.path.basename(frame[0]))
  File "/Users/bsilverberg/.virtualenvs/gaia/lib/python2.7/site-packages/marionette_client-0.7.2-py2.7.egg/marionette/marionette.py", line 584, in _send_message
    self._handle_error(response)
  File "/Users/bsilverberg/.virtualenvs/gaia/lib/python2.7/site-packages/marionette_client-0.7.2-py2.7.egg/marionette/marionette.py", line 633, in _handle_error
    raise ScriptTimeoutException(message=message, status=status, stacktrace=stacktrace)
TEST-UNEXPECTED-FAIL | test_browser_cell_data.py test_browser_cell_data.TestBrowserCellData.test_browser_cell_data | ScriptTimeoutException: timed out
----------------------------------------------------------------------
Ran 1 test in 46.419s

FAILED (errors=1)
```

I notice that the last adhoc run [1] also seemed to have a few errors connecting to cell data, e.g., test_cost_control_data_alert_mobile, test_sms_send, test_enable_cell_data_via_settings_app. I wonder if this is related to this patch? Maybe the condition we are waiting for isn't waiting long enough for the OS to be in a state where we can attempt to connect to cell data?

[1] http://qa-selenium.mv.mozilla.com:8080/job/b2g.hamachi.mozilla-central.ui.adhoc/69/consoleFull
Comment 9•10 years ago (Assignee)
It looks suspiciously like a pattern doesn't it? Will debug locally a bit more.
Comment 10•10 years ago (Assignee)
Bob, I wasn't able to replicate this locally. After talking to Hsin-yi (RIL owner) about enabling ril.data, the only thing I can imagine going wrong is a race between us pushing the APN settings from the JSON file and the RIL setting them up itself, or that, in cases where a carrier uses shared bandwidth on another carrier's network, we were pushing completely wrong settings. I've removed the APN settings from CI and I'll let the RIL do the work itself. Failing that, in the connect_to_cell_data method we can wait for mozMobileConnection.data.state=='registered', which will tell us that it is ready to go. I'll kick off another adhoc run of this soon.
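As a rough sketch of the fallback described above (waiting for the data state to become 'registered' before connecting), something along these lines might work. `wait_for_data_state` and `get_state` are hypothetical names, not existing gaiatest APIs. In a real test `get_state` would presumably wrap whatever call reads mozMobileConnection.data.state from the device; here it is replaced with canned values so the example runs on its own.

```python
import time


def wait_for_data_state(get_state, expected="registered", timeout=30.0, interval=0.5):
    """Poll get_state() until it returns `expected`.

    Returns True once the expected state is seen and False if `timeout`
    elapses first, so the caller can decide whether to fail or carry on.
    """
    deadline = time.time() + timeout
    while True:
        if get_state() == expected:
            return True
        if time.time() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Canned states standing in for successive reads from the device.
    states = iter(["searching", "searching", "registered"])
    if wait_for_data_state(lambda: next(states), timeout=5.0, interval=0.1):
        print("data connection registered, safe to connect")
    else:
        print("data connection not registered before timeout")
```

Returning a boolean instead of raising keeps the decision with connect_to_cell_data, which could then fail with a clearer message than a bare script timeout.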
Comment 11•10 years ago (Assignee)
I have been doing some research on this, including debugging as per the above comment and debugging the 'ondataerror' event, which is supposed to trigger when the RIL connection fails. For the former I found no problem: the APN settings are always set correctly before we start the connection. For the latter I found that it doesn't trigger before the timeout, which leads me to believe that sometimes it just takes a long time to connect. I'm going to debug a bit more along those lines.
Comment 12•10 years ago (Assignee)
Trying another adhoc run with cell data timeout increased. http://qa-selenium.mv.mozilla.com:8080/job/b2g.hamachi.mozilla-central.ui.adhoc/80/
Comment 13•10 years ago (Assignee)
I'm still unsure why this is affecting the cell data connection, but I will keep debugging.
Comment 14•10 years ago (Assignee)
https://github.com/mozilla-b2g/gaia/commit/5f4b2af6143dc64fd0d3b08893289351bb007f87
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED