Socket timeout is lost in TcpTransport.connect()

RESOLVED FIXED in Firefox 58

Status

defect
P1
critical
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: whimboo, Assigned: whimboo)

Tracking

(Blocks 1 bug, {regression})

Version 3
mozilla59
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox57 wontfix, firefox58 fixed, firefox59 fixed)

Details

(Whiteboard: [17/11])

Attachments

(3 attachments)

The changes in bug 1413852 caused a major regression in Marionette clients socket handling code. Hereby the specified socket timeout gets lost completely, and always `None` is used. As result a long running command never gets aborted, and Marionette hangs forever.

Btw this bug is most likely the reason for all the 1000s timeout problems for test jobs with Marionette involved.
Priority: -- → P1
Whiteboard: [17/11]
Attachment #8929383 - Flags: review?(dburns)
Attachment #8929375 - Flags: review?(dburns)
Attachment #8929376 - Flags: review?(dburns)
Comment on attachment 8929375 [details]
Bug 1418227 - Indicate that sock instance is private.

https://reviewboard.mozilla.org/r/200700/#review205956
Attachment #8929375 - Flags: review?(dburns) → review+
Comment on attachment 8929383 [details]
Bug 1418227 - TcpTransport.sock should not be used by consumers.

https://reviewboard.mozilla.org/r/200710/#review205958
Attachment #8929383 - Flags: review?(dburns) → review+
Comment on attachment 8929376 [details]
Bug 1418227 - Socket timeout is lost in TcpTransport.connect().

https://reviewboard.mozilla.org/r/200702/#review205962
Attachment #8929376 - Flags: review?(dburns) → review+
Since you've put it as a dependency, in the case of bug 1412035 there was an I/O error happening in the test harness code that was silently ignored and would hang the harness so I'm not sure if it's also related to this. The root cause for that bug should have been fixed in bug 1416028.
This bug only happens if there is no response being sent from Firefox via the server socket to the Marionette client. It means that this can only happen in case of hangs in Firefox, or for a broken message manager in GeckoDriver and our framescript.

Maybe adding the other bug as dependency was wrong, and I will remove it. Thanks.
No longer blocks: 1412035
Pushed by hskupin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f75b0e5beca2
TcpTransport.sock should not be used by consumers. r=automatedtester
https://hg.mozilla.org/integration/autoland/rev/fad877b10735
Indicate that sock instance is private. r=automatedtester
https://hg.mozilla.org/integration/autoland/rev/3be73de9db10
Socket timeout is lost in TcpTransport.connect(). r=automatedtester
Please uplift those patches to mozilla-beta given that it fixes a major regression. Thanks.
Keywords: checkin-needed
Not sure why regression got removed from this bug, and why the uplift request doesn't work with checkin-needed (even Aryx told me lately to request it that way). I will from now on use the [checkin-needed-beta] whiteboard entry again.
Keywords: regression
Whiteboard: [17/11] → [17/11][checkin-needed-beta]
Duplicate of this bug: 1420644
You need to log in before you can comment on or make changes to this bug.