Closed Bug 1628139 Opened 1 year ago Closed 1 year ago

onnegotiationneeded still races with onmessage under stress

Categories

(Core :: WebRTC: Signaling, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla77
Tracking Status
firefox77 --- fixed

People

(Reporter: jib, Assigned: jib)

References

Details

Attachments

(2 files, 1 obsolete file)

Thanks to new perfect negotiation WPT tests in bug 1621399, and instrumentation of the operations chain, we're observing that ONN may still race with onmessage under stress. Comments inline.

Instrumentation legend:

++ push operation onto chain
-- fifo shift completed operation off chain
-> execute operation
[] chain contents

Here's a log of one side A (the impolite peer) of one of the tests (doesn't matter which one):

★ ~/moz/mozilla-central $ grep JIBA ~/m | sed 's/^ [0-9]:[0-9][0-9].[0-9][0-9] pid:[0-9][0-9][0-9][0-9][0-9] //'
JIBA impolite JS: addTransceiver #1 stable
JIBA impolite PC: ++ 1 [updateNegotiationNeeded] -> updateNegotiationNeeded
JIBA impolite JS: ONN CALLED stable pc.getTransceivers() is [null]
JIBA impolite PC: ++ 2 [updateNegotiationNeeded,setLocalDescription]
JIBA impolite PC: -- 1 [setLocalDescription] -> setLocalDescription
JIBA impolite PC: -- 0 []
JIBA impolite JS: ONN's SLD SUCCEEDED have-local-offer contains 1 mids: ["a=mid:0"], pc.getTransceivers() is ["0"]
JIBA impolite PC: ++ 1 [updateNegotiationNeeded] -> updateNegotiationNeeded
JIBA impolite PC: -- 0 []
JIBA impolite JS: ONMESSAGE CALLED BUT IGNORING offer have-local-offer pc.getTransceivers() is ["0"]
JIBA impolite PC: ++ 1 [addIceCandidate] -> addIceCandidate
JIBA impolite PC: -- 0 []
JIBA impolite PC: ++ 1 [addIceCandidate] -> addIceCandidate
JIBA impolite PC: -- 0 []
JIBA impolite PC: ++ 1 [addIceCandidate] -> addIceCandidate
JIBA impolite PC: -- 0 []
JIBA impolite PC: ++ 1 [addIceCandidate] -> addIceCandidate
JIBA impolite PC: -- 0 []
JIBA impolite PC: ++ 1 [addIceCandidate] -> addIceCandidate
JIBA impolite PC: -- 0 []
JIBA impolite JS: addTransceiver #2 have-local-offer  tc2.mid = null is present
JIBA impolite PC: ++ 1 [updateNegotiationNeeded] -> updateNegotiationNeeded
JIBA impolite PC: -- 0 []
JIBA impolite JS: ONMESSAGE CALLED AND PROCEEDING answer have-local-offer pc.getTransceivers() is ["0",null]
JIBA impolite PC: ++ 1 [setRemoteDescription] -> setRemoteDescription
JIBA impolite PC: ++ 2 [setRemoteDescription,addIceCandidate]
JIBA impolite PC: ++ 3 [setRemoteDescription,addIceCandidate,addIceCandidate]
JIBA impolite PC: ++ 4 [setRemoteDescription,addIceCandidate,addIceCandidate,addIceCandidate]
JIBA impolite PC: -- 3 [addIceCandidate,addIceCandidate,addIceCandidate] -> addIceCandidate
JIBA impolite JS: ONMESSAGE's SRD SUCCEEDED answer stable pc.getTransceivers() is ["0",null]
JIBA impolite JS: Done negotiating addTransceiver #1 stable mid 0
JIBA impolite PC: ++ 4 [addIceCandidate,addIceCandidate,addIceCandidate,updateNegotiationNeeded]
JIBA impolite PC: -- 3 [addIceCandidate,addIceCandidate,updateNegotiationNeeded] -> addIceCandidate
JIBA impolite JS: ONMESSAGE CALLED AND PROCEEDING offer stable pc.getTransceivers() is ["0",null]
JIBA impolite PC: ++ 4 [addIceCandidate,addIceCandidate,updateNegotiationNeeded,setRemoteDescription]

On the last two lines above, we see a normal signaling offer come in over the signaling channel (ONMESSAGE) in stable state. Our perfect negotiation code in the WPT checks ONN isn't in progress (makingOffer) and since it isn't, immediately calls setRemoteDescription like it's supposed to. Our premise in the spec was that these two things would shield us from racing with ONN, since any subsequent ONN would chain behind SRD which would take us out of stable, and not fire. Hence no race.

Unfortunately, in this case we can see that an updateNegotiationNeeded (what fires ONN) was already on the chain, behind a bunch of ice candidates, but ahead of the SRD we just pushed. This means ONN will happen before SRD, even though JS is now in the middle of onmessage, awaiting SRD. ONN and ONMESSAGE race below.

JIBA impolite PC: -- 3 [addIceCandidate,updateNegotiationNeeded,setRemoteDescription] -> addIceCandidate
JIBA impolite PC: -- 2 [updateNegotiationNeeded,setRemoteDescription] -> updateNegotiationNeeded
JIBA impolite JS: ONN CALLED stable pc.getTransceivers() is ["0",null]
JIBA impolite PC: ++ 3 [updateNegotiationNeeded,setRemoteDescription,setLocalDescription]

On the last line above, we see ONN's SLD(implicit offer) is added to the chain right after ONMESSAGE's SRD(offer). Spoiler: this SLD() without arguments will instead be interpreted as an SLD(answer) below.

JIBA impolite PC: -- 2 [setRemoteDescription,setLocalDescription] -> setRemoteDescription
JIBA impolite PC: -- 1 [setLocalDescription] -> setLocalDescription
JIBA impolite JS: ONMESSAGE's SRD SUCCEEDED offer have-remote-offer pc.getTransceivers() is ["0",null,"1","2"]
JIBA impolite PC: ++ 2 [setLocalDescription,setLocalDescription]

Above, ONMESSAGE's SRD(offer) finishes, so it chains SLD(implicit answer).

JIBA impolite JS: addTransceiver #3 have-remote-offer offering tc2.mid = null is present
JIBA impolite PC: ++ 3 [setLocalDescription,setLocalDescription,updateNegotiationNeeded]
JIBA impolite PC: ++ 4 [setLocalDescription,setLocalDescription,updateNegotiationNeeded,updateNegotiationNeeded]
JIBA impolite PC: -- 3 [setLocalDescription,updateNegotiationNeeded,updateNegotiationNeeded] -> setLocalDescription
JIBA impolite JS: ONN UNEXPECTED! expected "Have-local-offer", got stable

On the very last line, ONN's SLD(implicit offer) succeeded, and our WPT test now asserts that we have the local offer we just set, which we don't since our SLD() without arguments got inserted after ONMESSAGE's SRD(offer) and was interpreted as SLD(answer). We are now in stable, which surprises JS.

I think we need to provide better invariants here for JS code, otherwise this is not POLA.

I'll file an issue on the spec. I think the solution is to only fire ONN when the chain is empty.

Assignee: nobody → jib
Attachment #9139276 - Attachment is obsolete: true
Attachment #9139188 - Attachment description: Bug 1628139 - queue negotiationneeded chain is/becomes empty. → Bug 1628139 - Queue negotiationneeded when chain is/becomes empty.
Priority: -- → P2
Pushed by jbruaroey@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5718ea76403a
Tighten negotiationneeded test to disallow double-firing and racing w/onmessage. r=bwc
https://hg.mozilla.org/integration/autoland/rev/9bbd3fd1bba8
Queue negotiationneeded when chain is/becomes empty. r=bwc
Failed to create upstream wpt PR due to merge conflicts. This requires fixup from a wpt sync admin.
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla77
No longer blocks: 1621399
Depends on: 1621399
Flags: needinfo?(james)
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/23110 for changes under testing/web-platform/tests
Flags: needinfo?(james) → needinfo?(jib)
Upstream PR merged by jgraham

Looks like everything got merged. Thanks James!

Flags: needinfo?(jib)
You need to log in before you can comment on or make changes to this bug.