https://treeherder.mozilla.org/logviewer.html#?job_id=28201214&repo=mozilla-inbound#L1469
06:13:19 INFO - REFTEST INFO | Checking for orphan ssltunnel processes...
06:13:19 INFO - REFTEST INFO | Checking for orphan xpcshell processes...
06:13:23 INFO - REFTEST INFO | Running with e10s: True
06:13:23 INFO - ### XPCOM_MEM_BLOAT_LOG defined -- logging bloat/leaks to /tmp/tmpb7Ud1a.mozrunner/runreftest_leaks.log
06:13:32 INFO - ++DOCSHELL 0x7f87d6d99000 == 1 [pid = 1150] [id = 1]
06:13:32 INFO - ++DOMWINDOW == 1 (0x7f87d6d99800) [pid = 1150] [serial = 1] [outer = (nil)]
06:13:32 INFO - ++DOMWINDOW == 2 (0x7f87d6d9a800) [pid = 1150] [serial = 2] [outer = 0x7f87d6d99800]
06:14:23 INFO - Traceback (most recent call last):
06:14:23 INFO -   File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 725, in <module>
06:14:23 INFO -     sys.exit(run())
06:14:23 INFO -   File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 721, in run
06:14:23 INFO -     return reftest.runTests(options.tests, options)
06:14:23 INFO -   File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 420, in runTests
06:14:23 INFO -     return self.runSerialTests(manifests, options, cmdargs)
06:14:23 INFO -   File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 675, in runSerialTests
06:14:23 INFO -     debuggerInfo=debuggerInfo)
06:14:23 INFO -   File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 616, in runApp
06:14:23 INFO -     marionette.start_session(timeout=options.marionette_port_timeout)
06:14:23 INFO -   File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/marionette.py", line 1172, in start_session
06:14:23 INFO -     self.protocol, _ = self.client.connect()
06:14:23 INFO -   File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/transport.py", line 226, in connect
06:14:23 INFO -     self.sock.connect((self.addr, self.port))
06:14:23 INFO -   File "/usr/lib/python2.7/socket.py", line 224, in meth
06:14:23 INFO -     return getattr(self._sock,name)(*args)
06:14:23 INFO - socket.error: [Errno 111] Connection refused
06:14:23 ERROR - Return code: 1
Joel - can you or someone on your team help triage this?
If this happened only once, I wouldn't consider it reproducible or actionable. Is there something else you needed from me on this?
It's happened more than once. It also affects other suites. https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1274255
Thanks Ryan, I didn't realize that. Without more information in the bug, I cannot make good decisions. This is a marionette issue, so I will pull in AutomatedTester, as he is the main marionette guy.
Pretty sure there have been others starred as infra, but I'll let the paid sheriffs weigh in on that when they have time to investigate.
There doesn't appear to be any output from the browser for Marionette that I can see. Marionette is being used to bootstrap reftests here. Can anyone suggest where I can see the runner for this test suite?
We are getting a socket error here: marionette.start_session(timeout=options.marionette_port_timeout). This is seen in both reftest and mochitest, so it isn't specific to one harness. Here is where it is called for reftest: https://dxr.mozilla.org/mozilla-central/source/layout/tools/reftest/runreftest.py#616 I assume this is an intermittent problem with marionette connecting to its own port.
In marionette.start_session(timeout=options.marionette_port_timeout) we try connecting to the browser for 60 seconds (the default; under mozharness it could be higher, since it uses options.marionette_port_timeout), and if we can't connect, we give up. Since I can't see any Marionette output, either something is preventing Marionette from starting up, or the Marionette initialisation logs are being written somewhere that isn't visible to treeherder.
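For anyone following along, the "keep trying to connect until the timeout, then give up" behaviour described above can be sketched roughly as below. This is only an illustration written for Python 3, not the actual marionette_driver code: the helper name wait_for_port and the poll_interval parameter are made up for this example, and the 60 second default is taken from the comment above.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, poll_interval=0.5):
    """Return True once a TCP connection to (host, port) succeeds.

    Hypothetical helper: keeps retrying until `timeout` seconds have
    passed, then returns False instead of raising.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            sock = socket.create_connection((host, port), timeout=poll_interval)
        except OSError:
            # Covers "Connection refused" (errno 111 on Linux) and connect
            # timeouts: nothing is listening on the port yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval)
        else:
            sock.close()
            return True
```

A False return here corresponds to the failure in the traceback: nothing ever started listening on the port, which fits the theory that Marionette did not start in the browser at all.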
5 automation job failures were associated with this bug in the last 7 days.
Repository breakdown:
* autoland: 3
* mozilla-inbound: 1
* fx-team: 1
Platform breakdown:
* windows7-32: 2
* linux64: 2
* linux32: 1
For more details, see: https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1274255&startday=2016-07-25&endday=2016-07-31&tree=all
8 automation job failures were associated with this bug in the last 7 days.
Repository breakdown:
* mozilla-inbound: 5
* fx-team: 2
* autoland: 1
Platform breakdown:
* linux64: 5
* windows7-32: 2
* android-4-3-armv7-api15: 1
For more details, see: https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1274255&startday=2016-08-01&endday=2016-08-07&tree=all
I don't know where the marionette initialization logs are. If they don't go to stdout, it should be pretty easy to capture them as an artifact. Even though this hasn't happened recently, it would be good to capture that logging in case it does happen again. Can you point me in the right direction to get that logging?
I keep thinking I have answered this and haven't :/ The marionette harness writes these to gecko.log, which is just the browser's stdout/stderr piped to a file.
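To make "stdout/stderr piped to a file" concrete, here is a minimal sketch of that kind of redirection in Python 3. It is not the harness's real implementation: run_with_log is a hypothetical helper, and only the gecko.log file name comes from the comment above.

```python
import subprocess
import sys


def run_with_log(cmd, log_path="gecko.log"):
    """Run cmd with stdout and stderr both written to log_path.

    Hypothetical helper; returns the process exit code.
    """
    with open(log_path, "wb") as log:
        # stderr=subprocess.STDOUT merges stderr into the same file as stdout.
        return subprocess.call(cmd, stdout=log, stderr=subprocess.STDOUT)


if __name__ == "__main__":
    # Example: run a tiny child process and capture everything it prints.
    run_with_log([sys.executable, "-c", "print('hello from the child')"])
```

With a setup like this, nothing the process prints reaches the job's own console log, so the file would have to be uploaded as an artifact to be visible from treeherder.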
OK, there is no upload of gecko.log, and in fact I don't see it mentioned in the logs. So while we seem to have fixed this particular issue, I filed bug 1307114 to get the logging.