Closed Bug 1276559 Opened 8 years ago Closed 7 years ago

Changing wifi networks triggers offline mode which closes peer connections prematurely

Categories

(Core :: WebRTC, defect, P1)

49 Branch
defect

RESOLVED DUPLICATE of bug 1318180

People

(Reporter: voltrevo, Unassigned)

Details

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.63 Safari/537.36

Steps to reproduce:

1. Go to this jsbin: http://output.jsbin.com/wasoni
2. In the dev console, type `start();` and press enter
3. After `Received data channel message: ping` is printed in the console, change wifi networks


Actual results:

The peer connections quickly emit a `signalingstatechange` event indicating that `signalingState` is now `closed`. This is logged in the console as `peerConnections[0] signalingstatechange: closed`.


Expected results:

The peer connection should stay recoverable, i.e. it should not enter the `closed` state, since `closed` is terminal. This should allow a successful ICE restart, after which ping messages should resume flowing on the data channel. This is what happens in Chrome stable (51) and canary (53).
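For reference, the ICE restart I have in mind is roughly the following sketch. It assumes both peer connections live in the same page (as in the jsbin), so descriptions are handed over directly with no signaling channel. The name `restartIce` and its parameters are illustrative and not taken from the jsbin; only `createOffer({ iceRestart: true })` and the usual offer/answer calls are standard API.

```javascript
// Sketch only: restartIce and its parameter names are illustrative.
// Both arguments are expected to be RTCPeerConnection-like objects.
async function restartIce(offerer, answerer) {
  // `closed` is terminal, so a restart is impossible once either side is closed.
  if (offerer.signalingState === 'closed' || answerer.signalingState === 'closed') {
    throw new Error('cannot restart ICE on a closed peer connection');
  }

  // iceRestart: true requests new ICE credentials, so candidates are
  // gathered again on whatever network is now available.
  const offer = await offerer.createOffer({ iceRestart: true });
  await offerer.setLocalDescription(offer);
  await answerer.setRemoteDescription(offer);

  const answer = await answerer.createAnswer();
  await answerer.setLocalDescription(answer);
  await offerer.setRemoteDescription(answer);
}
```

If the connection has already gone to `closed`, as in the actual results above, this rejects immediately, which is why the premature close is unrecoverable.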

---

I also noticed that this error is sometimes emitted:
`InvalidStateError: RTCPeerConnection is gone (did you enter Offline mode?)`
so I tried disabling offline mode by setting `network.manage-offline-status` to `false` in about:config. This prevents the immediate transition to `closed` when switching networks, but messages still do not resume flowing over the original data channel after an ICE restart.
I've also confirmed that the ICE restart itself succeeds: creating a new data channel afterwards works, and messages flow over that new channel: http://output.jsbin.com/kigoxa.

Strangely, this data channel can spend a long time (10+ seconds) in the `connecting` state, even though both peer connections are in the same page. I see this delay in Chrome as well. This is why this new jsbin times out after 15 seconds whereas the one in the ticket is set to 5 seconds.
(In the previous comment, I was using network.manage-offline-status=false)
Component: Untriaged → WebRTC
Product: Firefox → Core
I'm able to reproduce this in 46 and 49, albeit intermittently. That is, it often works, but other times I see it entering the `closed` state as described. It seems timing-dependent.

I've also confirmed that Chrome Canary works fine (whereas Chrome release gets into a renegotiation loop, not sure what that's about).

I think we should fix this, especially since the spec has recently clarified that connections never close themselves.
Status: UNCONFIRMED → NEW
Rank: 25
Ever confirmed: true
Priority: -- → P2
This is blocking a really important feature for us, we would love to get this fixed.
Flags: needinfo?(mreavy)
jib - might this be caused by the network status events in PeerConnection.js getting triggered? They're supposed to be for "Work Offline", but maybe they're firing in this case.
Rank: 25 → 15
Flags: needinfo?(mreavy) → needinfo?(jib)
Priority: P2 → P1
Yes, if you mean these: http://mxr.mozilla.org/mozilla-central/source/dom/media/PeerConnection.js?rev=b9ac4527fbbb&mark=172-172#164

It seems very plausible that they're firing given the error seen here.

The spec doesn't speak to this, so we should perhaps reassess what value this code brings (it was added way back).

Alternatively, maybe we should at least add, say, a 10-30 second delay, and recheck network before we do anything?
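To make the delay idea concrete, here is a rough sketch of "wait, then recheck before closing". Everything in it is hypothetical: `createOfflineGuard`, `isOffline`, `closeAll`, and the timer options are illustrative names, not what PeerConnection.js does today, and the 30 second default is just the upper end of the range suggested above.

```javascript
// Sketch only: createOfflineGuard and all option names are hypothetical.
// isOffline() reports the current network status; closeAll() closes the
// peer connections. Timers are injectable so the logic can be tested.
function createOfflineGuard({
  isOffline,
  closeAll,
  delayMs = 30000,
  setTimer = setTimeout,
  clearTimer = clearTimeout,
}) {
  let pending = null;

  return {
    // Call when an "offline" status event fires.
    onOffline() {
      if (pending !== null) return; // already waiting to recheck
      pending = setTimer(() => {
        pending = null;
        // Recheck: only close if the network is still gone after the delay.
        if (isOffline()) closeAll();
      }, delayMs);
    },
    // Call when an "online" status event fires: cancel any pending close.
    onOnline() {
      if (pending !== null) {
        clearTimer(pending);
        pending = null;
      }
    },
  };
}
```

With this shape, a brief offline blip while switching wifi networks would never reach `closeAll()`, while a real "Work Offline" that lasts longer than the delay still would.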
Flags: needinfo?(jib)
Flags: needinfo?(rjesup)
Sure - the idea was to avoid overhead/power drain/etc when it was impossible to communicate anyways.  Since this is triggered by things other than user "Work Offline", I'm probably ok with either a delay (even a minute or more), or simply not doing it.  SCTP data connections are actually able to handle moderately long disconnects and re-establish.
Flags: needinfo?(rjesup)
I'm pretty sure that this problem got fixed by the landing of bug 1318180.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE