[Tracking bug] Marionette socket.timeout errors

RESOLVED FIXED

Status

Testing
Marionette
5 years ago
2 years ago

People

(Reporter: jgriffin, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

5 years ago
This is a tracking bug that will attempt to make sense out of all the socket.timeout errors we're seeing.
(Reporter)

Comment 1

5 years ago
bug 855458:  this looks like a transport problem; socket.timeout occurred during finish() even though the actor had sent the correct data to the client.  A little later in the log we also see:

12:11:48  WARNING -  E/GeckoConsole(  237): [JavaScript Error: "this._actorPool is null" {file: "chrome://global/content/devtools/dbg-server.js" line: 582}]

https://tbpl.mozilla.org/php/getParsedLog.php?id=21169440&full=1&branch=mozilla-inbound#error1
Depends on: 855458
(Reporter)

Comment 2

5 years ago
bug 845291:  this is caused by a B2G crash; hopefully crash detection will help us with that soon.

https://tbpl.mozilla.org/php/getParsedLog.php?id=20894770&full=1&branch=mozilla-inbound
(Reporter)

Comment 3

5 years ago
bug 839842:  another case in which it looks like the actor is behaving correctly, but the client isn't receiving some data.

https://tbpl.mozilla.org/php/getParsedLog.php?id=20120216&full=1&branch=mozilla-inbound
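
For illustration only, here is a minimal sketch of how a client ends up with socket.timeout when part of a response never arrives. It assumes a length-prefixed framing of the form `<length>:<payload>`; the names receive_message and demo are hypothetical and are not taken from the Marionette client.

```python
import socket


def receive_message(sock, timeout=1.0):
    """Read one '<length>:<payload>' message, or raise socket.timeout."""
    sock.settimeout(timeout)

    # Read the length prefix one byte at a time, up to the ':' separator.
    header = b""
    while not header.endswith(b":"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        header += chunk

    # Keep reading until the announced number of payload bytes has arrived.
    remaining = int(header[:-1])
    payload = b""
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        payload += chunk
        remaining -= len(chunk)
    return payload


def demo():
    """Return (complete_payload, timed_out) for a full and a partial send."""
    server, client = socket.socketpair()
    try:
        # A complete message is received normally.
        server.sendall(b'11:{"ok":true}')
        complete = receive_message(client)

        # Only part of the announced payload is sent, so the client
        # blocks in recv() until its timeout expires.
        server.sendall(b'11:{"ok"')
        try:
            receive_message(client, timeout=0.2)
            timed_out = False
        except socket.timeout:
            timed_out = True
        return complete, timed_out
    finally:
        server.close()
        client.close()
```

From the client's side, a sender that wrote the right data and a sender that wrote nothing look the same if the bytes are lost in transit: recv() simply blocks until the timeout fires.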
(Reporter)

Comment 4

5 years ago
bug 830622:  a case on desktop Firefox in which we may not be waiting long enough after the browser starts before running the test.

https://tbpl.mozilla.org/php/getParsedLog.php?id=21411807&full=1&branch=mozilla-aurora
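
One possible way to guard against starting too early is to poll the server port before running the test. The sketch below is an assumption about how such a wait could look, not the harness's actual code; wait_for_port and its parameters are hypothetical names.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to (host, port) succeeds.

    Returns False if no connection could be made before `timeout`
    seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open and immediately close a connection as a readiness probe.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Note that an accepted connection only shows that something is listening; it does not prove the browser is ready to handle commands, so a real fix may need a stronger readiness check.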
(Reporter)

Comment 5

5 years ago
bug 852709:  another case in which it looks like the actor is sending the correct data, but shortly after we see:

E/GeckoConsole(  247): [JavaScript Error: "this._actorPool is null" {file: "chrome://global/content/devtools/dbg-server.js" line: 589}]

https://tbpl.mozilla.org/php/getParsedLog.php?id=20834959&full=1&branch=mozilla-inbound
(Reporter)

Comment 6

5 years ago
bug 843262: another case in which it seems the actor is doing the right thing.  Here, though, we have an actorpool error showing up before the test failure:

14:17:39  WARNING -  E/GeckoConsole(  277): [JavaScript Error: "this._actorPool is null" {file: "chrome://global/content/devtools/dbg-server.js" line: 589}]

https://tbpl.mozilla.org/php/getParsedLog.php?id=20262625&full=1&branch=mozilla-central
Depends on: 845291
(Reporter)

Updated

5 years ago
Depends on: 839842, 830622, 852709, 843262
(Reporter)

Comment 7

5 years ago
bug 842543: another apparent transport problem near an actorpool error.

https://tbpl.mozilla.org/php/getParsedLog.php?id=19868582&full=1&branch=mozilla-central
Depends on: 842543
(Reporter)

Comment 9

5 years ago
(In reply to Jonathan Griffin (:jgriffin) from comment #8)
> bug 842294: same transport problem + actorPool error:
> 
> https://tbpl.mozilla.org/php/getParsedLog.php?id=19840908&full=1&branch=mozilla-central

That should have been bug 842297.
(Reporter)

Comment 10

5 years ago
bug 835540:  similar to 855458 in that it looks like a transport error during finish(), but in this case there is no actorPool error in the log.

https://tbpl.mozilla.org/php/getParsedLog.php?id=20184732&full=1&branch=mozilla-b2g18_v1_0_1
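
Since the cases in this bug differ mainly in whether an actorPool error appears and whether it precedes the failure, a small script could sort logs into those groups. This is a rough sketch under assumptions: it treats a line containing both TEST-UNEXPECTED-FAIL and socket.timeout as the failure, and classify_log is a hypothetical helper.

```python
ACTOR_POOL_ERROR = '"this._actorPool is null"'
FAILURE_MARKER = "TEST-UNEXPECTED-FAIL"  # assumed failure marker


def classify_log(lines):
    """Classify a log by the order of the actorPool error and the timeout."""
    error_at = None
    failure_at = None
    for index, line in enumerate(lines):
        # Remember only the first occurrence of each event.
        if error_at is None and ACTOR_POOL_ERROR in line:
            error_at = index
        if (failure_at is None and FAILURE_MARKER in line
                and "socket.timeout" in line):
            failure_at = index

    if failure_at is None:
        return "no socket.timeout failure"
    if error_at is None:
        return "timeout without actorPool error"
    if error_at < failure_at:
        return "actorPool error before timeout"
    return "actorPool error after timeout"
```

Passing the lines of a parsed log (for example, `classify_log(open(path).read().splitlines())`) returns one of the four labels above.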
(Reporter)

Updated

5 years ago
Depends on: 835540
(Reporter)

Comment 13

5 years ago
bug 823076: transport problem + actorPool error:

09:55:02  WARNING -  E/GeckoConsole(  276): [JavaScript Error: "this._actorPool is null" {file: "chrome://global/content/devtools/dbg-server.js" line: 589}]

https://tbpl.mozilla.org/php/getParsedLog.php?id=20108931&full=1&branch=mozilla-inbound
Depends on: 823076
(Reporter)

Comment 14

5 years ago
bug 842274: this is unlike any of the others.  We get a socket.timeout during newSession(); no actorPool error in the logs.  Possibly a (now-fixed) sync issue?

https://tbpl.mozilla.org/php/getParsedLog.php?id=19837398&full=1&branch=mozilla-central
(Reporter)

Updated

5 years ago
Depends on: 842274
(Reporter)

Updated

5 years ago
Depends on: 844416
(Reporter)

Comment 21

5 years ago
Zac pointed out a socket.timeout error that occurred recently during a Gaia-UI-test on v1-train (with mozilla-b2g18 gecko):

http://qa-selenium.mv.mozilla.com:8080/job/b2g.unagi.gaia.v1-train.ui.xfail/238/testReport/junit/%28root%29/TestDialerAirplaneMode/test_dialer_airplane_mode/

Traceback (most recent call last):
  File "/var/jenkins/workspace/b2g.unagi.gaia.v1-train.ui.xfail/gaia-ui-tests/gaiatest/tests/dialer/test_dialer_airplane_mode.py", line 27, in test_dialer_airplane_mode
    phone.keypad.call_number(test_phone_number)
  File "/var/jenkins/workspace/b2g.unagi.gaia.v1-train.ui.xfail/gaia-ui-tests/gaiatest/apps/phone/regions/keypad.py", line 40, in call_number
    return self.tap_call_button()
  File "/var/jenkins/workspace/b2g.unagi.gaia.v1-train.ui.xfail/gaia-ui-tests/gaiatest/apps/phone/regions/keypad.py", line 45, in tap_call_button
    return CallScreen(self.marionette)
  File "/var/jenkins/workspace/b2g.unagi.gaia.v1-train.ui.xfail/gaia-ui-tests/gaiatest/apps/phone/regions/call_screen.py", line 25, in __init__
    self.marionette.switch_to_frame(call_screen)
  File "/var/jenkins/workspace/b2g.unagi.gaia.v1-train.ui.xfail/.env/local/lib/python2.7/site-packages/marionette_client-0.5.23-py2.7.egg/marionette/marionette.py", line 475, in switch_to_frame
    response = self._send_message('switchToFrame', 'ok', element=frame.id, focus=focus)
  File "/var/jenkins/workspace/b2g.unagi.gaia.v1-train.ui.xfail/.env/local/lib/python2.7/site-packages/marionette_client-0.5.23-py2.7.egg/marionette/marionette.py", line 300, in _send_message
    raise TimeoutException(message='socket.timeout', status=ErrorCodes.TIMEOUT, stacktrace=None)
TimeoutException: socket.timeout

This does not look similar to any of the cases I've already listed above.
I am seeing a similar issue with the gaia-ui add_contact endurance test (Inari with b2g18_v1_0_1):

Traceback (most recent call last):
  File "/home/rwood/gaia-ui-tests/gaiatest/tests/endurance/test_endurance_add_contact.py", line 22, in test_endurance_add_contact
    self.drive(test=self.add_contact, app='contacts')
  File "/home/rwood/gaia-ui-tests/gaiatest/gaia_test.py", line 701, in drive
    self.test_method()
  File "/home/rwood/gaia-ui-tests/gaiatest/tests/endurance/test_endurance_add_contact.py", line 29, in add_contact
    new_contact_form = self.contacts.tap_new_contact()
  File "/home/rwood/gaia-ui-tests/gaiatest/apps/contacts/app.py", line 37, in tap_new_contact
    self.marionette.tap(self.marionette.find_element(*self._new_contact_button_locator))
  File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.5.27-py2.7.egg/marionette/marionette_touch.py", line 31, in tap
    self.execute_script("%s.tap(arguments[0], null, null, null, null, arguments[1]);" % self.library_name, [element, send_all])
  File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.5.27-py2.7.egg/marionette/marionette.py", line 605, in execute_script
    scriptTimeout=script_timeout)
  File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.5.27-py2.7.egg/marionette/marionette.py", line 334, in _send_message
    raise TimeoutException(message='socket.timeout', status=ErrorCodes.TIMEOUT, stacktrace=None)
TEST-UNEXPECTED-FAIL | test_endurance_add_contact.py TestEnduranceAddContact.test_endurance_add_contact | TimeoutException: socket.timeout
Depends on: 830149
Depends on: 832662
Depends on: 837828
Depends on: 837912
Depends on: 838353
(Reporter)

Comment 23

5 years ago
See also bug 898074
Depends on: 967252

Updated

4 years ago
Depends on: 967021
Bug 871475 hits a perma-orange in which some test page loads get stuck in the epoll_wait syscall.  The main functionality in that pull request does not actually take effect during the process.  In fact, we can reproduce the same error without that pull request by appending "-net nic -net nic -net user" as qemu tail args when running "./run-emulator.sh".  Adb shell still works at that point, and the test process continues after the timeout.  This happens on all emulator versions: ics, jb, and kk.  I have cherry-picked some smc91c11x fixes from the upstream kernel, but to no avail.  I don't know whether that's related to this bug.
All dependent bugs are closed
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED