Don't buffer ICE candidates in mochitests

ASSIGNED
Assigned to

Status

()

defect
P5
normal
Rank:
41
ASSIGNED
4 years ago
2 years ago

People

(Reporter: ekr, Assigned: ekr)

Tracking

(Depends on 1 bug)

38 Branch
x86
macOS
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [tests])

Attachments

(2 attachments)

We used to emit ICE candidates out of order. We don't any more (hopefully) so the mochitests should stop buffering.
Assignee: nobody → ekr
Status: NEW → ASSIGNED
Per discussion in bug 1087551, this just removes the queuing in the
mochitests (which we all agree we should do).

Try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a59bf74aa945
Attachment #8594154 - Flags: review?(martin.thomson)
Comment on attachment 8594154 [details] [diff] [review]
Remove ICE queuing in mochitests

Review of attachment 8594154 [details] [diff] [review]:
-----------------------------------------------------------------

I see no reason to keep the hack in place.
Attachment #8594154 - Flags: review?(martin.thomson) → review+
I've just done the analysis here and I think we're OK.

We have code running on main that sets this all up, namely the call to setLocalDescription.  That's all synchronous, but when it completes, there is a dispatch back to main to unroll the stack and call the .then() functions registered on its promise.

If the STS thread is fast, it could attempt a dispatch to main before the main thread does that dispatch.  That would cause onicecandidate to fire before setLocalDescription.then().  That's not a correctness issue from our perspective (our state will have advanced to the right state properly), but it could be a problem for API users, who have to deal with out of order invocations of their code.

Here's the simplest possible screw-up case:

pc.onicecandidate = e => signal(e.candidate);
pc.createOffer().then(offer => {
  pc.setLocalDescription(offer).then(_ => signal(offer));
});

Here, a race condition could have the potential to screw this code up.  If the onicecandidate handler runs before the .then on setLocalDescription runs, then the candidate arrives before the offer.  Naively setting the signaling information as it arrives (which is supposed to be safe if your signaling guarantees order), fails.  The answerer will then reject any candidates that arrive early.

However, the dispatch from the STS thread only triggers a second dispatch to main (through dispatchEvent), that causes the two events to be reordered correctly.

We cannot afford to add any additional dispatches here, or if we do, we need to be very careful.  We should add a note to the end of the implementation of setLocalDescription() in PeerConnection.js that references this bug.
Also we need a note in SetLocalDescription in PeerConnectionImpl.cpp or JsepSessionImpl.cpp: that code has to start gathering and invoke the callback synchronously (it can start gathering after the callback is called safely without breaking this code).  If it permits asynchronous calls between the two, things break.
Keywords: checkin-needed
Martin, since you did the analysis, I suggest you write the patch with this comment.
Flags: needinfo?(martin.thomson)
Ack, let's use this bug number though.
Flags: needinfo?(martin.thomson)
Keywords: leave-open
Attachment #8594921 - Flags: review?(ekr) → review+
Keywords: leave-open
Every single M-e10s(3) run in that Try push has the same failure.
1893 INFO TEST-UNEXPECTED-FAIL | dom/media/tests/mochitest/test_peerConnection_basicH264Video.html | PeerConnectionWrapper (pcLocal): legal ICE state transition from new to failed - expected PASS
Keywords: checkin-needed
Thanks for catching that. Sorry I missed that.
Depends on: 1157042
As discussed in bug 1087551 I don't think that right now it is feasible to remove the queuing, as the SDP and the ICE candidates take different paths through the test framework (with different amount of dispatches).

If we implement a common signaling interface between the PeerConnection wrappers as suggested in bug 1020436, then we might be able to removing the queuing.
looks like these didn't land due to test failures.  what would you like to do here?
backlog: --- → webRTC+
Rank: 41
Flags: needinfo?(martin.thomson)
Priority: -- → P4
Whiteboard: [tests]
Nils know this better than I.  Does Bug 1087551 cover this well enough?
Flags: needinfo?(martin.thomson) → needinfo?(drno)
The problem remains that the SDP and the ICE candidates travel on different path (with different amount of dispatches in between) from one side to the other. I think we can only turn off the storing of ICE candidates if they both travel the same route.
I think we should open a bug for the same route and make this bug depending on it.
Flags: needinfo?(drno)
Mass change P4->P5 to align with new Mozilla triage process.
Priority: P4 → P5
You need to log in before you can comment on or make changes to this bug.