Marionette can die during setUp() if previous test had failed during tearDown()

RESOLVED FIXED in Firefox 24, Firefox OS v1.1hd

Status

RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: jgriffin, Assigned: jgriffin)

Tracking

unspecified
mozilla24
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox22 wontfix, firefox23 wontfix, firefox24 fixed, b2g18 fixed, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 wontfix, b2g-v1.1hd fixed)

Details

Attachments

(2 attachments, 2 obsolete attachments)

(Assignee)

Description

5 years ago
The Gaia UI tests recently failed with this stack trace:

======================================================================
ERROR: test_call_log_all_calls (test_call_log_all_calls.TestCallLogAllCalls)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/jenkins/workspace/b2g.inari.gaia.v1.0.1.ui.ril/gaiatest/tests/dialer/test_call_log_all_calls.py", line 49, in tearDown
    self.data_layer.kill_active_call()
  File "/var/jenkins/workspace/b2g.inari.gaia.v1.0.1.ui.ril/gaiatest/gaia_test.py", line 293, in kill_active_call
    "if(telephony.active) telephony.active.hangUp();")
  File "/var/jenkins/workspace/b2g.inari.gaia.v1.0.1.ui.ril/.env/local/lib/python2.7/site-packages/marionette_client-0.5.27-py2.7.egg/marionette/marionette.py", line 605, in execute_script
    scriptTimeout=script_timeout)
  File "/var/jenkins/workspace/b2g.inari.gaia.v1.0.1.ui.ril/.env/local/lib/python2.7/site-packages/marionette_client-0.5.27-py2.7.egg/marionette/marionette.py", line 344, in _send_message
    self._handle_error(response)
  File "/var/jenkins/workspace/b2g.inari.gaia.v1.0.1.ui.ril/.env/local/lib/python2.7/site-packages/marionette_client-0.5.27-py2.7.egg/marionette/marionette.py", line 405, in _handle_error
    raise MarionetteException(message=message, status=status, stacktrace=stacktrace)
TEST-UNEXPECTED-FAIL | test_call_log_all_calls.py TestCallLogAllCalls.test_call_log_all_calls | MarionetteException: Error sending message to frame (NS_ERROR_NOT_INITIALIZED) [object Object]; frame has closed.

Every test after that then failed with this one:

Traceback (most recent call last):
  File "/var/jenkins/workspace/b2g.inari.gaia.v1.0.1.ui.ril/gaiatest/tests/ftu/test_ftu_skip_tour.py", line 80, in setUp
    GaiaTestCase.setUp(self)
  File "/var/jenkins/workspace/b2g.inari.gaia.v1.0.1.ui.ril/gaiatest/gaia_test.py", line 408, in setUp
    MarionetteTestCase.setUp(self)
  File "/var/jenkins/workspace/b2g.inari.gaia.v1.0.1.ui.ril/.env/local/lib/python2.7/site-packages/marionette_client-0.5.27-py2.7.egg/marionette/marionette_test.py", line 130, in setUp
    self.marionette.test_name = self.test_name
  File "/var/jenkins/workspace/b2g.inari.gaia.v1.0.1.ui.ril/.env/local/lib/python2.7/site-packages/marionette_client-0.5.27-py2.7.egg/marionette/marionette.py", line 453, in test_name
    if self._send_message('setTestName', 'ok', value=test_name):
  File "/var/jenkins/workspace/b2g.inari.gaia.v1.0.1.ui.ril/.env/local/lib/python2.7/site-packages/marionette_client-0.5.27-py2.7.egg/marionette/marionette.py", line 344, in _send_message
    self._handle_error(response)
  File "/var/jenkins/workspace/b2g.inari.gaia.v1.0.1.ui.ril/.env/local/lib/python2.7/site-packages/marionette_client-0.5.27-py2.7.egg/marionette/marionette.py", line 405, in _handle_error
    raise MarionetteException(message=message, status=status, stacktrace=stacktrace)
TEST-UNEXPECTED-FAIL | test_ftu_skip_tour.py TestFtu.test_ftu_skip_tour | MarionetteException: Error sending message to frame (NS_ERROR_NOT_INITIALIZED) [object Object]; frame has closed.

The nature of the error prevented the harness from restarting B2G between each test.

What happened was that the first failure caused the test to die during tearDown because the current frame disappeared, the fact that it died during tearDown prevented us from calling delete_session().

When the next test was run, we already had a session, so we didn't create a new one, and that session was attached to a non-existent frame, so setUp() of that test also failed, and so on for all the remaining tests.

We may need to implement a method that gets called after tearDown, whether that gets successfully completed or not, that will delete our current session.
In case it matters, the run was: http://qa-selenium.mv.mozilla.com:8080/view/B2G.Inari/job/b2g.inari.gaia.v1.0.1.ui.ril/133/console (unfortunately, its logcat appears truncated - been meaning to fix that in the jobs; will try now!)

Comment 2

5 years ago
I've started to work on it yesterday evening.

For my point of view, the change should be on "run" function where setUp and tearDown is called (from unittest). I've made a little test: 
i've created a function call "cleanTest" in CommonTestCase in marionette_test.py. This function should be call after tearDown for cleaning marionette session.
For the run part, i've overridden run function from unittest in runtests.py.

So, i'll give you a first submission asap, but i didn't reproduce the bug (i 've made a little mock for debugging and testing)

Comment 3

5 years ago
Created attachment 760155 [details] [diff] [review]
First submission : add cleanTest function and override run function

As I said in my last comment, i have added a function which delete marionette session if still exist.
I have override the unittest.TestCase.run function.

The last big problem for me here, is that i didn't reproduced the bug yet. I've tried on different b2g builds (my own build and nightly). I'll try to focus on a way to produce this bug, to write one (or several) unit tests to validate the change.

mbu.
(Assignee)

Comment 4

5 years ago
It's quite hard to reproduce this, since the original cause is very sporadic, but I'll take a look at the patch and see if it looks like it should fix this.
(Assignee)

Updated

5 years ago
Attachment #760155 - Flags: review?(jgriffin)
(Assignee)

Comment 5

5 years ago
Comment on attachment 760155 [details] [diff] [review]
First submission : add cleanTest function and override run function

Review of attachment 760155 [details] [diff] [review]:
-----------------------------------------------------------------

I think in general this is a good approach.  However, we need to be compatible with Python 2.6 as well as 2.7, which is painful, as unittest changed quite a bit from 2.6 to 2.7.  Have you tried to run this through 2.6?  If not, we can run it through tryserver, which will tell us if there are problems.  I'm pretty sure we'll have to make some 2.6 compatibility adjustments.

Also, since we're duplicating a chunk of code in cleanTest, I think we could make a private method that does the work, then call it from both tearDown and cleanTest.
Attachment #760155 - Flags: review?(jgriffin) → review-

Comment 6

5 years ago
Ok, i did not try on Python 2.6. I will try to find a server with python 2.6 and fix the compatibility.

For the second part, i'll made the change quickly :)

Comment 7

5 years ago
Ok sorry for the delay, i wasn't at home for long time.

I 've tested it on a debian squeeze (with python 2.6) and you'r right :) it doesn't work. I will made the change to fix the compatibility, but just for information, there is as package :
python-unittest2 - backport of the enhanced unittest testing framework in Python 2.7

For the second point, i have created a private function.. i'll give you the patch with the fix of compatibility python2.6/2.7 :)

Comment 8

5 years ago
Created attachment 764380 [details] [diff] [review]
Second submission - python2.6 compatible

Try to be python 2.6 compliant.

I add a comment on line 72. Instead of calling directly cleanTest function, we could handle the unittest function doCleanups().
Attachment #760155 - Attachment is obsolete: true
Attachment #764380 - Flags: review?(jgriffin)
(Assignee)

Comment 9

5 years ago
Comment on attachment 764380 [details] [diff] [review]
Second submission - python2.6 compatible

Review of attachment 764380 [details] [diff] [review]:
-----------------------------------------------------------------

Cool, looks good.  I'll run this on tryserver to make sure I didn't miss anything.

::: testing/marionette/client/marionette/marionette_test.py
@@ +148,5 @@
>      def tearDown(self):
>          self.duration = time.time() - self.start_time
> +        self._deleteSession()
> +
> +    def cleanTest(self):

Let's set self.duration in cleanTest if it wasn't set in tearDown (because it died before the base was called) and if self.start_time is defined.  This will prevent self.duration from being 0 in some cases.
Attachment #764380 - Flags: review?(jgriffin) → review-
(Assignee)

Updated

5 years ago
Assignee: nobody → mat.bultel

Comment 11

5 years ago
Created attachment 764635 [details] [diff] [review]
move duration into _deleteSession function
Attachment #764380 - Attachment is obsolete: true
Attachment #764635 - Flags: review?(jgriffin)

Comment 12

5 years ago
ok, i have made the change quickly this morning :)
(Assignee)

Comment 13

5 years ago
Comment on attachment 764635 [details] [diff] [review]
move duration into _deleteSession function

Review of attachment 764635 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, thanks!  The try run was also green.
Attachment #764635 - Flags: review?(jgriffin) → review+
(Assignee)

Updated

5 years ago
Keywords: checkin-needed
https://hg.mozilla.org/projects/birch/rev/f5a90edfa8f6
Flags: in-testsuite+
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/f5a90edfa8f6
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla24
Ryan, would you mind landing this on mozilla-b2g18_v1_0_1 and mozilla-b2g18, or will they get auto-uplifted to those anyway?  Thanks!
Flags: needinfo?(ryanvm)
(Assignee)

Comment 17

5 years ago
This doesn't specifically need to be uplifted.  I'll release a new version to pypi, which is what's used by Jenkins.  There isn't any code internal to gecko in here which would need to be uplifted.
Flags: needinfo?(ryanvm)
(Assignee)

Comment 18

5 years ago
This is causing some failures in Jenkins:

Traceback (most recent call last):
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/.env/bin/gaiatest", line 9, in <module>
    load_entry_point('gaiatest==0.12', 'console_scripts', 'gaiatest')()
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/gaiatest/runtests.py", line 70, in main
    cli(runner_class=GaiaTestRunner, parser_class=GaiaTestOptions)
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.5.31-py2.7.egg/marionette/runtests.py", line 733, in cli
    runner = startTestRunner(runner_class, options, tests)
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.5.31-py2.7.egg/marionette/runtests.py", line 725, in startTestRunner
    runner.run_tests(tests)
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.5.31-py2.7.egg/marionette/runtests.py", line 386, in run_tests
    self.run_test(test)
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.5.31-py2.7.egg/marionette/runtests.py", line 468, in run_test
    self.run_test(i["path"])
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.5.31-py2.7.egg/marionette/runtests.py", line 489, in run_test
    results = runner.run(suite)
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.5.31-py2.7.egg/marionette/runtests.py", line 122, in run
    test(result)
  File "/usr/lib/python2.7/unittest/suite.py", line 70, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/suite.py", line 108, in run
    test(result)
  File "/usr/lib/python2.7/unittest/case.py", line 396, in __call__
    return self.run(*args, **kwds)
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.5.31-py2.7.egg/marionette/marionette_test.py", line 73, in run
    self.cleanTest()
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.5.31-py2.7.egg/marionette/marionette_test.py", line 152, in cleanTest
    self._deleteSession()
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.5.31-py2.7.egg/marionette/marionette_test.py", line 158, in _deleteSession
    self.loglines = self.marionette.get_logs()
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.5.31-py2.7.egg/marionette/marionette.py", line 654, in get_logs
    return self._send_message('getLogs', 'value')
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.5.31-py2.7.egg/marionette/marionette.py", line 335, in _send_message
    response = self.client.send(message)
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.5.31-py2.7.egg/marionette/client.py", line 95, in send
    response = self.receive()
  File "/var/jenkins/workspace/b2g.unagi.comril.gaia.v1-train.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.5.31-py2.7.egg/marionette/client.py", line 58, in receive
    status=ErrorCodes.INVALID_RESPONSE)
marionette.errors.InvalidResponseException: Could not successfully complete transport of message to Gecko, socket closed?

I'm investigating.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 19

5 years ago
Created attachment 766036 [details] [diff] [review]
Only call _deleteSession during cleanTest,

I don't really understand this stack trace.  The fact that we're trying to get loglines in cleanTest implies that tearDown wasn't called, or failed.  The only case it wouldn't be called is if setUp() failed, but in either case we should see an error for that and we don't.

I wasn't able to reproduce this locally regardless of what I tried.  This patch may help, or at least may make the failure more clear.

I will run through try.
Attachment #766036 - Flags: review?(mdas)
(Assignee)

Updated

5 years ago
Assignee: mat.bultel → jgriffin
Comment on attachment 766036 [details] [diff] [review]
Only call _deleteSession during cleanTest,

Review of attachment 766036 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm, but I'm curious, what does self.duration do? I see it being assigned but not used in marionette_client or unittest
Attachment #766036 - Flags: review?(mdas) → review+
ah, mbu pointed it out in runtests.py.
(Assignee)

Comment 23

5 years ago
I included a marionette_client version bump in this too.

https://hg.mozilla.org/integration/mozilla-inbound/rev/fcb261b154c7
https://hg.mozilla.org/mozilla-central/rev/fcb261b154c7
Status: REOPENED → RESOLVED
Last Resolved: 5 years ago5 years ago
Resolution: --- → FIXED
(Assignee)

Comment 25

5 years ago
https://hg.mozilla.org/releases/mozilla-b2g18/rev/f0a140c8af19
status-b2g18: --- → fixed
status-b2g18-v1.0.0: --- → wontfix
status-b2g18-v1.0.1: --- → wontfix
status-firefox24: --- → fixed
status-firefox25: --- → fixed
https://hg.mozilla.org/releases/mozilla-b2g18_v1_1_0_hd/rev/f0a140c8af19
https://hg.mozilla.org/releases/mozilla-b2g18_v1_1_0_hd/rev/26288eda134e
status-b2g-v1.1hd: --- → fixed
status-firefox22: --- → wontfix
status-firefox23: --- → wontfix
status-firefox25: fixed → ---
You need to log in before you can comment on or make changes to this bug.