Intermittent IOError: Process killed because the connection to Marionette server is lost. Check gecko.log for errors (Reason: Timed out waiting for connection on localhost:2828!) [Firefox UI Update tests]

RESOLVED FIXED in mozilla55

Status

Testing
Marionette

People

(Reporter: Treeherder Bug Filer, Assigned: whimboo)

Tracking

(Blocks: 1 bug, {intermittent-failure, regression})

Version 3

Firefox Tracking Flags

(Not tracked)

Details

(Assignee)

Comment 1

a year ago
So we start the Marionette server, but the client then fails to connect within the next 120s (the startup timeout):

07:56:17     INFO -  1495205777486	Marionette	INFO	Listening on port 2828
07:56:45     INFO -  *** UTM:SVC TimerManager:notify - notified timerID: browser-cleanup-thumbnails
07:57:16     INFO -  *** UTM:SVC TimerManager:registerTimer - id: telemetry_modules_ping
07:58:20    ERROR - Failure during execution of the update test.
07:58:20    ERROR - Traceback (most recent call last):
[..]
07:58:20    ERROR - IOError: Process killed because the connection to Marionette server is lost. Check gecko.log for errors (Reason: Timed out waiting for connection on localhost:2828!)

So either something blocks the connection or Firefox is too slow to accept it, and I cannot believe the latter. This mainly happens for our Firefox UI update tests.
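
The failure mode above (server reports it is listening, but no connection is made before the startup timeout) can be sketched as a simple polling loop. This is only an illustration, not the Marionette client's actual code: `wait_for_port` and its parameters are hypothetical names, and the 120s default just mirrors the startup timeout mentioned above.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=0.5):
    """Poll until a TCP connection to (host, port) succeeds.

    Hypothetical helper for illustration only. Returns True as soon as a
    connection can be opened, or False once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Open and immediately close a connection; success means
            # something is accepting connections on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: wait a bit and try again.
            time.sleep(interval)
    return False
```

A caller would treat a `False` result from `wait_for_port("localhost", 2828)` as the "Timed out waiting for connection" case and then inspect gecko.log.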
Component: Firefox UI Tests → Marionette
QA Contact: hskupin
(Assignee)

Comment 2

a year ago
So this started recently. The updates for the nightly build on May 17th were fine:

https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=6e3ca5b38f7173b214b10de49e58cb01890bf39d&filter-searchStr=fxup+update+windows&filter-tier=1&filter-tier=2&filter-tier=3

Which means that we have this regression range:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=6e3ca5b38f71&tochange=baf05f61bc14

I can see that the patch from Geoff regarding the startup timeout landed in this range via bug 1364228. I wonder if it could have caused this.
Summary: Intermittent IOError: Process killed because the connection to Marionette server is lost. Check gecko.log for errors (Reason: Timed out waiting for connection on localhost:2828!) → Intermittent IOError: Process killed because the connection to Marionette server is lost. Check gecko.log for errors (Reason: Timed out waiting for connection on localhost:2828!) [Firefox UI Update tests]

Comment 3

I wouldn't expect my change to affect non-Linux tests, nor non-mochitest suites, but it does seem an odd coincidence.

Comment 4

a year ago
8 failures in 777 pushes (0.01 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* mozilla-central: 8

Platform breakdown:
* windows7-64: 5
* windows8-64: 3

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1366312&startday=2017-05-15&endday=2017-05-21&tree=all
(Assignee)

Comment 5

a year ago
This could have been caused by the fallout we had when the patch on bug 1298803 landed. It left Firefox instances behind, which were blocking port 2828.

I will check again tomorrow, now that all machines have been cleaned up.
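
For checking a machine by hand, one way to see whether a leftover process still holds the port is to try binding to it. This is a rough sketch under assumptions: `port_is_free` is a hypothetical helper, not part of any Mozilla harness, and it only tells you that something holds the port, not which process it is.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if (host, port) can be bound, False if it is taken.

    Hypothetical helper for illustration only. The socket is closed again
    right away, so a True result does not reserve the port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": another process,
            # e.g. a stale Firefox instance, is bound to the port.
            return False
    return True
```

If `port_is_free(2828)` returns `False` before a test run has started any browser, a stale instance is a likely suspect.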
Blocks: 1298803
Flags: needinfo?(hskupin)
Keywords: regression
(Assignee)

Comment 6

a year ago
And that was indeed the case. No more hangs visible.
Assignee: nobody → hskupin
Status: NEW → RESOLVED
Last Resolved: a year ago
Flags: needinfo?(hskupin)
Resolution: --- → FIXED
Target Milestone: --- → mozilla55