Intermittent Taskcluster mochitest/reftest "No tests run or test summary not found" after socket.error: [Errno 111] Connection refused

RESOLVED WORKSFORME

Status

Taskcluster
General
2 years ago
a year ago

People

(Reporter: RyanVM, Unassigned)

Tracking

({intermittent-failure})

Details

(Reporter)

Description

2 years ago
https://treeherder.mozilla.org/logviewer.html#?job_id=28201214&repo=mozilla-inbound#L1469

06:13:19     INFO -  REFTEST INFO | Checking for orphan ssltunnel processes...
06:13:19     INFO -  REFTEST INFO | Checking for orphan xpcshell processes...
06:13:23     INFO -  REFTEST INFO | Running with e10s: True
06:13:23     INFO -  ### XPCOM_MEM_BLOAT_LOG defined -- logging bloat/leaks to /tmp/tmpb7Ud1a.mozrunner/runreftest_leaks.log
06:13:32     INFO -  ++DOCSHELL 0x7f87d6d99000 == 1 [pid = 1150] [id = 1]
06:13:32     INFO -  ++DOMWINDOW == 1 (0x7f87d6d99800) [pid = 1150] [serial = 1] [outer = (nil)]
06:13:32     INFO -  ++DOMWINDOW == 2 (0x7f87d6d9a800) [pid = 1150] [serial = 2] [outer = 0x7f87d6d99800]
06:14:23     INFO -  Traceback (most recent call last):
06:14:23     INFO -    File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 725, in <module>
06:14:23     INFO -      sys.exit(run())
06:14:23     INFO -    File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 721, in run
06:14:23     INFO -      return reftest.runTests(options.tests, options)
06:14:23     INFO -    File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 420, in runTests
06:14:23     INFO -      return self.runSerialTests(manifests, options, cmdargs)
06:14:23     INFO -    File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 675, in runSerialTests
06:14:23     INFO -      debuggerInfo=debuggerInfo)
06:14:23     INFO -    File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 616, in runApp
06:14:23     INFO -      marionette.start_session(timeout=options.marionette_port_timeout)
06:14:23     INFO -    File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/marionette.py", line 1172, in start_session
06:14:23     INFO -      self.protocol, _ = self.client.connect()
06:14:23     INFO -    File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/transport.py", line 226, in connect
06:14:23     INFO -      self.sock.connect((self.addr, self.port))
06:14:23     INFO -    File "/usr/lib/python2.7/socket.py", line 224, in meth
06:14:23     INFO -      return getattr(self._sock,name)(*args)
06:14:23     INFO -  socket.error: [Errno 111] Connection refused
06:14:23    ERROR - Return code: 1
Joel - can you or someone on your team help triage this?
Flags: needinfo?(jmaher)
If this happened only once, I wouldn't consider it reproducible or actionable. Is there something else you needed from me on this?
Flags: needinfo?(jmaher)
(Reporter)

Comment 3

2 years ago
It's happened more than once. It also affects other suites.
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1274255
Thanks Ryan, I didn't realize that. Without more information in the bug, I cannot make good decisions.

This is a Marionette issue, so I will pull in AutomatedTester, as he is the main Marionette guy.
Flags: needinfo?(dburns)
(Reporter)

Comment 5

2 years ago
Pretty sure there have been others starred as infra, but I'll let the paid sheriffs weigh in on that when they have time to investigate.
There doesn't appear to be any output from the browser for Marionette that I can see. Marionette is being used to bootstrap reftests here; can anyone suggest where I can see the runner for this test suite?
Flags: needinfo?(dburns)
We are getting a socket error here:
marionette.start_session(timeout=options.marionette_port_timeout)

This is seen in both reftest and mochitest, so it isn't specific to one harness. Here is where it is called for reftest:
https://dxr.mozilla.org/mozilla-central/source/layout/tools/reftest/runreftest.py#616

I assume this is an intermittent problem with Marionette connecting to its own port.
In

> marionette.start_session(timeout=options.marionette_port_timeout)

We try connecting to the browser for 60 seconds by default (the effective value could be higher under mozharness, since the call uses options.marionette_port_timeout) and give up if we can't connect. Since I can't see any Marionette output, either something is preventing Marionette from starting up, or the Marionette initialisation logs are being written somewhere that is not visible to Treeherder.
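
To illustrate the retry-until-timeout behaviour described above, here is a minimal sketch of polling a TCP port until something accepts the connection. It is not the marionette_driver implementation: the function name wait_for_port and the poll_interval parameter are hypothetical, and it is written for Python 3 rather than the Python 2.7 seen in the log. A refused connection (Errno 111 on Linux) simply means nothing is listening on the port yet.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, poll_interval=0.5):
    """Retry a TCP connect until it succeeds or `timeout` seconds elapse.

    Hypothetical helper for illustration only. Returns True if a
    connection was accepted, False if the deadline passed first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Raises ConnectionRefusedError (a subclass of OSError, errno 111
            # on Linux) when no process is listening on the port.
            sock = socket.create_connection((host, port), timeout=poll_interval)
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval)
        else:
            sock.close()
            return True
```

A call such as wait_for_port("127.0.0.1", 2828, timeout=60) would return False if the browser never opened the port within 60 seconds; the port number here is an assumption for the example. Returning False rather than raising makes it easier for a caller to log extra diagnostics before giving up.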
(Reporter)

Updated

2 years ago
Summary: Intermittent Taskcluster reftest "No tests run or test summary not found" after socket.error: [Errno 111] Connection refused → Intermittent Taskcluster mochitest/reftest "No tests run or test summary not found" after socket.error: [Errno 111] Connection refused

Comment 9

a year ago
5 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 3
* mozilla-inbound: 1
* fx-team: 1

Platform breakdown:
* windows7-32: 2
* linux64: 2
* linux32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1274255&startday=2016-07-25&endday=2016-07-31&tree=all
8 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 5
* fx-team: 2
* autoland: 1

Platform breakdown:
* linux64: 5
* windows7-32: 2
* android-4-3-armv7-api15: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1274255&startday=2016-08-01&endday=2016-08-07&tree=all
I don't know where the Marionette initialization logs are. If they don't go to stdout, it should be pretty easy to capture them as an artifact. Even though this hasn't happened recently, it would be good to capture that logging in case it happens again.

Can you point me in the right direction to get that logging?
Flags: needinfo?(dburns)
I keep thinking I have answered this and haven't :/

The Marionette harness writes these to gecko.log, which is just the browser's stdout/stderr piped to a file.
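
As a rough sketch of what "stdout/stderr piped to a file" amounts to, the snippet below launches a process and sends both streams to a single log file. It is not the harness's actual code: the function name run_with_gecko_log and the default path are assumptions made for illustration.

```python
import subprocess
import sys


def run_with_gecko_log(cmd, log_path="gecko.log"):
    """Run `cmd` and write both its stdout and stderr to `log_path`.

    Hypothetical helper for illustration only. Returns the process
    exit code.
    """
    with open(log_path, "wb") as log:
        # stderr=subprocess.STDOUT merges stderr into the same file handle.
        return subprocess.call(cmd, stdout=log, stderr=subprocess.STDOUT)
```

For example, run_with_gecko_log([sys.executable, "-c", "print('hello')"]) leaves the child's output in gecko.log in the current directory. A file written this way only shows up alongside the job if something uploads it as an artifact.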
Flags: needinfo?(dburns)
OK, gecko.log is not uploaded, and in fact I don't see it mentioned in the logs. So while we seem to have fixed this particular issue, I filed bug 1307114 to get the logging.
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → WORKSFORME