Closed Bug 923821 Opened 8 years ago Closed 8 years ago

Repair and re-enable test_killall.py TestKillAll.test_kill_all

Categories

(Firefox OS Graveyard :: Gaia::UI Tests, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

(firefox28 fixed, b2g-v1.2 fixed)

RESOLVED FIXED
Tracking Status
firefox28 --- fixed
b2g-v1.2 --- fixed

People

(Reporter: philor, Assigned: bitgeeky)

Details

(Keywords: intermittent-failure, Whiteboard: [gaia-ui-test][mentor=davehunt][lang=py])

Attachments

(1 file, 3 obsolete files)

https://tbpl.mozilla.org/php/getParsedLog.php?id=28762037&tree=Fx-Team
b2g_ubuntu64_vm fx-team opt test gaia-ui-test on 2013-10-04 23:51:44 PDT for push b5d24ef1eb37
slave: tst-linux64-ec2-416

23:57:24     INFO -  ======================================================================
23:57:24     INFO -  ERROR: test_kill_all (test_killall.TestKillAll)
23:57:24     INFO -  ----------------------------------------------------------------------
23:57:24     INFO -  Traceback (most recent call last):
23:57:24     INFO -    File "/builds/slave/test/build/venv/local/lib/python2.7/site-packages/marionette/marionette_test.py", line 132, in run
23:57:24     INFO -      testMethod()
23:57:24     INFO -    File "/builds/slave/test/gaia/tests/python/gaia-ui-tests/gaiatest/tests/unit/test_killall.py", line 12, in test_kill_all
23:57:24     INFO -      self.apps.launch(app)
23:57:24     INFO -    File "/builds/slave/test/build/venv/local/lib/python2.7/site-packages/gaiatest/gaia_test.py", line 72, in launch
23:57:24     INFO -      result = self.marionette.execute_async_script("GaiaApps.launchWithName('%s')" % name, script_timeout=launch_timeout)
23:57:24     INFO -    File "/builds/slave/test/build/venv/local/lib/python2.7/site-packages/marionette/marionette.py", line 1068, in execute_async_script
23:57:24     INFO -      filename=os.path.basename(frame[0]))
23:57:24     INFO -    File "/builds/slave/test/build/venv/local/lib/python2.7/site-packages/marionette/marionette.py", line 572, in _send_message
23:57:24     INFO -      self._handle_error(response)
23:57:24     INFO -    File "/builds/slave/test/build/venv/local/lib/python2.7/site-packages/marionette/marionette.py", line 621, in _handle_error
23:57:24    ERROR -      raise ScriptTimeoutException(message=message, status=status, stacktrace=stacktrace)
23:57:24    ERROR -  TEST-UNEXPECTED-FAIL | test_killall.py TestKillAll.test_kill_all | ScriptTimeoutException: timed out
23:57:24     INFO -  ----------------------------------------------------------------------
23:57:24     INFO -  Ran 3 tests in 93.229s
23:57:24  WARNING -  FAILED (errors=1)
This is failing regluarly on Travis.

Switching it to a P1 to either fix or just disable this test.





----------------------------------------------------------------------

Traceback (most recent call last):

File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.6.1-py2.7.egg/marionette/marionette_test.py", line 143, in run

testMethod()

File "/home/travis/build/mozilla-b2g/gaia/tests/python/gaia-ui-tests/gaiatest/tests/unit/test_killall.py", line 12, in test_kill_all

self.apps.launch(app)

File "/home/travis/build/mozilla-b2g/gaia/tests/python/gaia-ui-tests/gaiatest/gaia_test.py", line 72, in launch

result = self.marionette.execute_async_script("GaiaApps.launchWithName('%s')" % name, script_timeout=launch_timeout)

File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.6.1-py2.7.egg/marionette/marionette.py", line 1073, in execute_async_script

filename=os.path.basename(frame[0]))

File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.6.1-py2.7.egg/marionette/marionette.py", line 577, in _send_message

self._handle_error(response)

File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.6.1-py2.7.egg/marionette/marionette.py", line 626, in _handle_error

raise ScriptTimeoutException(message=message, status=status, stacktrace=stacktrace)

TEST-UNEXPECTED-FAIL | test_killall.py test_killall.TestKillAll.test_kill_all | ScriptTimeoutException: timed out

----------------------------------------------------------------------
Priority: -- → P1
Dave, can you replicate this locally?

The Gaia team have noticed the intermittency on Travis and would like to resolve it.

Maybe just needs a quick sleep in the launch/loop.
Flags: needinfo?(dave.hunt)
(In reply to Zac C (:zac) from comment #2)
> Dave, can you replicate this locally?

I can continue to try to reproduce this, but so far I've been unsuccessful. The test before asserts that no apps (other than the homescreen) are running, and the test times out waiting for the Calendar app to launch. The Calendar app is successfully launched in the previous test. The TBPL screenshot confirms that the Calender app is not running (showing the homescreen).

> The Gaia team have noticed the intermittency on Travis and would like to
> resolve it.

Do you have links to some failing Travis-CI builds that I can look at to see if it's the same traceback? It's possible that this is somehow Linux specific, although that would seem unlikely.

> Maybe just needs a quick sleep in the launch/loop.

It's possible that we're attempting to launch the app before the restart is completed and the homescreen is displayed. Bug 924912 was raised to address this, and a workaround was added that sleeps for 5 seconds, however this is not applied for desktop testing. We could try moving that sleep to see if it resolves this issue (at a cost of adding five seconds to every test), or we could add a wait for the homescreen to be displayed.
Flags: needinfo?(dave.hunt)
Often the runs are retriggered to get it green but if I see one I'll send it through as fast as I can!
Pull requests are often less re-triggered.
Zac found this one for me: https://travis-ci.org/mozilla-b2g/gaia/jobs/13201948

I'm running these tests 100 times locally to see if I hit this error. If I don't, I'll try running on Ubuntu, which is closer to what Travis is running.

If anyone sees this locally, please let me know.
After ~70 iterations this eventually crashed out with "OSError: [Errno 35] Resource temporarily unavailable". I didn't see the launch timeout in the tests that did run. The next thing to try is to attempt to replicate on Linux.
What is the duration of a single test run compared to the equivalent on Travis? 
Seeing as we think it is a latency/speed thing this might be a good clue to whether Travis is particularly powerful (and fast) compared to our local devices.
Running on my machine compares well to both Travis and TBPL in terms of test times. I'll try to replicate this on Ubuntu today.
I've been able to replicate this on Ubuntu. In 50 runs I hit this or bug 932494 four times. I'll see if I can work out what's going on and test a fix.
Okay, so it seems when this happens the app loads but we never get the apploadend event for it. This appears to be only when launching an app immediately after another, and if I just put a 0.1 second sleep after the initial launch it works 100% of the time for me.

I'd rather not add a sleep, but I'm not sure what else I would wait for. I suspect we don't see this on device due to the extra latency, and as it is we probably only see in in less that 10% of the desktop tests. I'll submit a pull request with a sleep and request feedback.
Speaking to lightsofapollo, we may want to rewrite the app launch code to be more robust. I found I still did get errors with a 0.1 second sleep, which just proves this is not the best approach.
Can we disable test_kill while you are working on this solution?
(In reply to Zac C (:zac) from comment #13)
> Can we disable test_kill while you are working on this solution?

It's not just a single test that's affected. As far as I can tell it's wherever we launch an app immediately after another. This would be test_kill_multiple in test_kill.py and test_kill_all and test_kill_all_twice in test_killall.py

Rather than disabling them, I'd be okay with a temporary sleep in these tests alone, to keep the coverage.
I've no objection to adding the sleeps so let's go with that until a more definitive fix can be done.
This is now a task to repair these tests using the fix Dave described in comment #11.

Also re-enable the tests.
Summary: Intermittent test_killall.py TestKillAll.test_kill_all | ScriptTimeoutException: timed out → Repair and re-enable test_killall.py TestKillAll.test_kill_all
Whiteboard: [gaia-ui-test][mentor=davehunt][lang=py]
Assignee: nobody → mozpankaj1994
Status: NEW → ASSIGNED
Attached patch 923821.patch (obsolete) — Splinter Review
Added time.sleep(1) after every launch , and in case of launch and kill ,added after launch and kill.
Attachment #826885 - Flags: feedback?(dave.hunt)
Comment on attachment 826885 [details] [diff] [review]
923821.patch

Review of attachment 826885 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks Pankaj. Just a couple of comments. Please fix these up and request review from :zac. Also, please reenable these tests in tests/python/gaia-ui-tests/gaiatest/tests/unit/manifest.ini (remove the "disabled" line) in your next patch.

::: tests/python/gaia-ui-tests/gaiatest/tests/unit/test_kill.py
@@ +2,4 @@
>  # License, v. 2.0. If a copy of the MPL was not distributed with this
>  # file, You can obtain one at http://mozilla.org/MPL/2.0/.
>  
> +import time

Please add a blank line between these two imports, in accordance with PEP8.

@@ +9,5 @@
>  
>      def test_kill(self):
>          app = self.apps.launch('Clock')
>          self.apps.kill(app)
> +	time.sleep(1)

You can remove this one, we only need the sleep when we launch multiple apps immediately after one another.

@@ +22,5 @@
>  
>          for app in running_apps:
>              self.apps.launch(app.name)
>              self.apps.kill(app)
> +	    time.sleep(1)

You can remove this sleep too.

::: tests/python/gaia-ui-tests/gaiatest/tests/unit/test_killall.py
@@ +2,4 @@
>  # License, v. 2.0. If a copy of the MPL was not distributed with this
>  # file, You can obtain one at http://mozilla.org/MPL/2.0/.
>  
> +import time

Please add a blank line between these two imports.
Attachment #826885 - Flags: feedback?(dave.hunt) → feedback+
Attached patch 923821v2.patch (obsolete) — Splinter Review
Attachment #826885 - Attachment is obsolete: true
Attachment #827342 - Flags: review?
Attachment #827342 - Flags: review? → review?(zcampbell)
Comment on attachment 827342 [details] [diff] [review]
923821v2.patch

This looks perfect!
I've r+ it but now we need to submit it as a Git pull request so that we can test it against our Travis CI instances.
Attachment #827342 - Flags: review?(zcampbell) → review+
Comment on attachment 827342 [details] [diff] [review]
923821v2.patch

Review of attachment 827342 [details] [diff] [review]:
-----------------------------------------------------------------

Zac points out there are issues with the indentation. Did you try running these tests locally? Python should complain about mixing tabs and spaces. Please just use spaces, and you may want to change your text editor to use spaces by default when working with Python files.

Also, as Zac says, we prefer github pull requests for Gaia patches, as they are automatically run on Travis-CI and give us a pass/fail which helps when we come to review and merge.
Attachment #827342 - Flags: review-
Comment on attachment 827951 [details]
Pointer to Github pull request: https://github.com/mozilla-b2g/gaia/pull/13390

Zac pointed me towards your already submitted pull request. We need to be sure to keep the bug in Bugzilla up to date too.
Attachment #827951 - Flags: review-
Attachment #827342 - Attachment is obsolete: true
Attached file Link to Pull Request
Indentation errors due to editor fixed , Please go through the pull request 

https://github.com/mozilla-b2g/gaia/pull/13390
Attachment #828045 - Flags: review?(dave.hunt)
Comment on attachment 828045 [details] [review]
Link to Pull Request

Missed to update the review flag.
Attachment #828045 - Flags: review?(dave.hunt) → review+
Attachment #827951 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.