If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

PC sends ICE candidates after getting closed

RESOLVED FIXED in mozilla35

Status

()

Core
WebRTC
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: drno, Assigned: bwc)

Tracking

Trunk
mozilla35
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

3 years ago
It appears from this test run:
https://tbpl.mozilla.org/php/getParsedLog.php?id=48864601&full=1&branch=mozilla-central#error0

That even though a PeerConnection has been closed it still calls the onicecandidate callback if it finds candidates after/while the PC got/gets closed. We should probably check the state of the PC before delivering these events (in case the PC has not been GC yet).
(Reporter)

Updated

3 years ago
See Also: → bug 1072945
(In reply to Nils Ohlmeier [:drno] from comment #0)
> It appears from this test run:
> https://tbpl.mozilla.org/php/getParsedLog.
> php?id=48864601&full=1&branch=mozilla-central#error0

Since the pattern is pc1.onicecandidate = cand => pc2.addIceCandidate(cand), that try only proves that we're erroring when pc2 is closed. I'm not disputing that it's because pc1 is closed as well, just that it doesn't prove that since it's intermittent. If you've verified the cause then nevermind.

> That even though a PeerConnection has been closed it still calls the
> onicecandidate callback if it finds candidates after/while the PC got/gets
> closed. We should probably check the state of the PC before delivering these
> events (in case the PC has not been GC yet).

Byron, if you've verified that this is indeed a bug, this is a one-line fix in PeerConnection.js, or do you think this should be fixed in the c++ code?
(Assignee)

Comment 2

3 years ago
There is definitely a bug here; we do an extra dispatch in c++ land to make sure the candidate event doesn't outrun the OnSetLocalDescriptionSuccess callback, but that dispatch is based on a weak ptr to the observer and nothing else. If the PC closes in the meantime, the event still fires if the observer is still there. It is probably easiest to check in PeerConnection.js.
(Reporter)

Comment 3

3 years ago
(In reply to Jan-Ivar Bruaroey [:jib] from comment #1)
> (In reply to Nils Ohlmeier [:drno] from comment #0)
> > It appears from this test run:
> > https://tbpl.mozilla.org/php/getParsedLog.
> > php?id=48864601&full=1&branch=mozilla-central#error0
> 
> Since the pattern is pc1.onicecandidate = cand => pc2.addIceCandidate(cand),
> that try only proves that we're erroring when pc2 is closed. I'm not
> disputing that it's because pc1 is closed as well, just that it doesn't
> prove that since it's intermittent.

Sorry, but I disagree. Look at this these logging lines from the try run above:

06:40:26     INFO -  485 INFO Closing peer connections
06:40:26     INFO -  486 INFO Closing pcLocal
06:40:26     INFO -  487 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_setLocalOfferInHaveRemoteOffer.html | iceConnectionState should not be undefined
06:40:26     INFO -  488 INFO PeerConnectionWrapper (pcLocal): oniceconnectionstatechange fired, new state is: closed
06:40:26     INFO -  489 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_setLocalOfferInHaveRemoteOffer.html | PeerConnectionWrapper (pcLocal): legal ICE state transition from new to closed
06:40:26     INFO -  490 INFO PeerConnectionWrapper (pcLocal): Closed connection.
06:40:26     INFO -  491 INFO Closing pcRemote
06:40:26     INFO -  492 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_setLocalOfferInHaveRemoteOffer.html | iceConnectionState should not be undefined
06:40:26     INFO -  493 INFO PeerConnectionWrapper (pcRemote): oniceconnectionstatechange fired, new state is: closed
06:40:26     INFO -  494 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_setLocalOfferInHaveRemoteOffer.html | PeerConnectionWrapper (pcRemote): legal ICE state transition from new to closed
06:40:26     INFO -  495 INFO PeerConnectionWrapper (pcRemote): Closed connection.
06:40:26     INFO -  496 INFO pcLocal: received iceCandidateEvent
06:40:26     INFO -  497 INFO pcLocal: iceCandidate = {"candidate":"candidate:0 1 UDP 2122252543 10.134.57.29 44766 typ host","sdpMid":"","sdpMLineIndex":0}

The last two lines are from the onicecandidate() handler on pc1 side.
As the timing between closing pc1 and ICE gathering finishing is always going to be a race, there is no way to prove it with a bullet prove test, but will always only show up as an intermittent.
 
> > That even though a PeerConnection has been closed it still calls the
> > onicecandidate callback if it finds candidates after/while the PC got/gets
> > closed. We should probably check the state of the PC before delivering these
> > events (in case the PC has not been GC yet).
> 
> Byron, if you've verified that this is indeed a bug, this is a one-line fix
> in PeerConnection.js, or do you think this should be fixed in the c++ code?

FULL ACK, that fixing this in PeerConnection.js make the most sense.

It actually brings up the question if we should have a generic way of checking the state of the PC in PeerConnection.js before delivering any callbacks. My assumption here is that in general we should not deliver any callbacks any more after close() has been called, although I doubt that this is clearly defined in the spec.
(Assignee)

Comment 4

3 years ago
Created attachment 8496313 [details] [diff] [review]
Check whether PC is closed before delivering events

This should do the trick.
(Assignee)

Comment 5

3 years ago
https://tbpl.mozilla.org/?tree=Try&rev=e2ed63540dc8
(Reporter)

Comment 6

3 years ago
Comment on attachment 8496313 [details] [diff] [review]
Check whether PC is closed before delivering events

Review of attachment 8496313 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Attachment #8496313 - Flags: review+
Comment on attachment 8496313 [details] [diff] [review]
Check whether PC is closed before delivering events

Review of attachment 8496313 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/PeerConnection.js
@@ +502,5 @@
>    dispatchEvent: function(event) {
> +    // PC can close while events are firing if there is an async dispatch
> +    // in c++ land
> +    if (!this._closed) {
> +      this.__DOM_IMPL__.dispatchEvent(event);

This blocks all peerConnection events, which I think is fine from looking at the list of events we support. Note that PeerConnectionIdp.jsm also uses this function which also seems ok but may be less obvious.

I got nervous, but callbacks are not affected by this (they go through the callCB function) which is good as they should probably NOT be blocked by close. Instead I think our plan for callbacks is to flush the queue on close and call all outstanding error callbacks, which I believe also matches how promises work. And then there's Bug 1056433, so a closed pc isn't a dead pc, but it's close (no pun intended).
(Assignee)

Comment 8

3 years ago
I can ask mt about PeerConnectionIdp and see next week.
(Assignee)

Comment 9

3 years ago
^
Flags: needinfo?(martin.thomson)
The IdP stuff shouldn't fire outside of calls to setRemoteDescription and createOffer/createAnswer.  Blocking events on a closed PC works. 

Don't you have to ensure that the state transition to closed fires still?
Flags: needinfo?(martin.thomson)
(Assignee)

Comment 11

3 years ago
The signaling state changes seem to be routed through callCB, strangely enough. I guess that's ok?
http://dxr.mozilla.org/mozilla-central/source/dom/media/PeerConnection.js#1289

Looks wrong to me.  That's not how you invoke an event handler.  Maybe it works, but it's not quite right.  Do we really need to rely on the C++ code completing its work before we signal that we've closed?  After all, we'll be responding to requests for state with "closed" before that happens.
(Assignee)

Comment 13

3 years ago
I agree that the stack unwind that callCB does is going to lead to some strange corner cases here. I suppose we could switch the order up in c++ land so the callback happens before the state is set, but I could see that as violating the principle of least surprise. Alternately, we could just add a flag. Opinions?
(Assignee)

Comment 14

3 years ago
Ok, let's fix the event handler stuff in another ticket, since it is a bit more involved.
(Assignee)

Comment 15

3 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/42b100b009e4
https://hg.mozilla.org/mozilla-central/rev/42b100b009e4
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
You need to log in before you can comment on or make changes to this bug.