Closed Bug 1622384 Opened 4 years ago Closed 4 years ago

assert_equals: 42 transceivers on side B expected 42 but got 41

Categories

(Core :: WebRTC: Signaling, defect, P2)

68 Branch
Unspecified
All
defect

Tracking

()

VERIFIED FIXED
mozilla77
Tracking Status
firefox-esr68 --- wontfix
firefox75 --- wontfix
firefox76 --- wontfix
firefox77 --- verified

People

(Reporter: jib, Assigned: bwc)

References

(Regression)

Details

(Keywords: regression)

Attachments

(5 files, 1 obsolete file)

The above turned out to be a pilot error in the test (yay). The remaining intermittent is [Perfect negotiation stress glare with roles reversed]:

assert_equals: 42 transceivers on side B expected 42 but got 41

Hopefully this also turns out to be a test problem, but that remains to be seen.

Summary: InvalidAccessError: Offer and answer have different number of m-lines (2 vs 1) → assert_equals: 42 transceivers on side B expected 42 but got 41
Depends on: 1621399
Assignee: nobody → jib
Priority: -- → P2
Attached image PerfectWptErrors6.png
Attachment #9133276 - Attachment is obsolete: true

After some instrumentation, I was able to boil down STRs:

  1. Open https://jsfiddle.net/jib1/v68ak09f/

Expected results:

Settled pc1 stable ["1","0","3","4","5"] 5
Settled pc2 stable ["0","5","1","3","4"] 5

Actual results:

Settled pc1 stable ["1","0","3","4","5"] 5
Settled pc2 stable ["0","5","1","2","3","4"] 6

I've given both sides ample time to settle this, but they don't, which seems like a JSEP bug.

Flags: needinfo?(docfaraday)
Blocks: 1621399
No longer depends on: 1621399

pc1 and pc2 have not had the same number of negotiations in this test case. pc1 completed the offer/answer started in p3, but pc2 never did; that offer/answer exchange was abandoned on pc2 entirely when we reached the "await pc2.setRemoteDescription(pc1.localDescription);" in the loop below. I do not think JSEP can deal with this case; JSEP requires that when an offer/answer exchange completes on one end, it must also be completed on the other.

Why does history matter?

With SDP dump in https://jsfiddle.net/jib1/v68ak09f/150/ we see pc1 producing an answer containing a transceiver it doesn't have ("2"). Surely that's a bug?


Settled pc1 stable ["1","0","3","4","5"] 5
Settled pc2 stable ["0","5","1","2","3","4"] 6
pc2's offer = a=mid:0,a=mid:1,a=mid:2,a=mid:3,a=mid:4,a=mid:5
pc1's answer = a=mid:0,a=mid:1,a=mid:2,a=mid:3,a=mid:4,a=mid:5
Settled pc1 stable ["1","0","3","4","5"] 5
Settled pc2 stable ["0","5","1","2","3","4"] 6

Note I updated the fiddle to 150

(In reply to Jan-Ivar Bruaroey [:jib] (needinfo? me) from comment #5)

Why does history matter?

Lots of reasons. For example, the rules around m-section reuse require a negotiation to first disable the m-section, then a subsequent negotiation can reuse it. If the two sides aren't carrying out the same negotiations, then the whole thing breaks.

With SDP dump in https://jsfiddle.net/jib1/v68ak09f/150/ we see pc1 producing an answer containing a transceiver it doesn't have ("2"). Surely that's a bug?


Settled pc1 stable ["1","0","3","4","5"] 5
Settled pc2 stable ["0","5","1","2","3","4"] 6
pc2's offer = a=mid:0,a=mid:1,a=mid:2,a=mid:3,a=mid:4,a=mid:5
pc1's answer = a=mid:0,a=mid:1,a=mid:2,a=mid:3,a=mid:4,a=mid:5
Settled pc1 stable ["1","0","3","4","5"] 5
Settled pc2 stable ["0","5","1","2","3","4"] 6

I'm not sure what is happening there. Let me look into it. I suspect that we're letting a failure slide that we should not be.

Here's a minimal test case:

https://jsfiddle.net/L3a5wydz/

It looks like implicit rollback is destroying the wrong transceiver on the second sRD(offer)? Looking closer.

Ok, so doing the rollbacks explicitly in between the sRD calls seems to work properly: https://jsfiddle.net/L3a5wydz/1/

It seems that repeated sRD(offer) is not doing the implicit rollbacks correctly for some reason.

Actually, that's not true; just because you emit an offer with a mid in it, doesn't mean that mid is going to show up in a transceiver yet.

I see what's happening. This loop ends up not doing its job because we're doing a rollback and reapplication in one go, which leaves us with the same number of transceivers that we started with, but one of the pre-existing transceivers was marked for removal:

https://searchfox.org/mozilla-central/rev/a707541ff423ade0d81cef6488e6ecfa09273886/media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp#1417-1459

I think we can fix this by moving the implicit rollback for repeated sRD(offer) to PeerConnection.jsm, like we do for the implicit rollback for sRD(offer) when in have-local-offer.

Flags: needinfo?(docfaraday)

Great! Btw if you're wondering how this crazy series of methods happened, see bug 1628139. I still think we should fix this though.

JSEP transceivers were previously held in a vector, in the order that they were
created. However, these could be removed, which meant that the indexing was not
stable. Under most circumstances this did not matter, however there was a
wrinkle with implicit rollback in repeated sRD. Re-applying a remote offer that
had created a transceiver would destroy that transceiver, and create a new one
to replace it. However, JS was not informed, because to PeerConnectionImpl it
looked like nothing had changed. Now, transceivers are indexed in a stable way,
which allows this bug to be fixed, and makes things a little less confusing.

Depends on D70399

Depends on D70400

Keywords: regression
Regressed by: 1072388
Has Regression Range: --- → yes
Version: unspecified → 68 Branch

Lost sight of this because I never took the bug!

Assignee: jib → docfaraday

I'm going to do another try push for all our tests, since it has been a while, and related stuff has landed in the meantime.

Pushed by bcampen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/eba5d05e59b2
Test-case for bug. r=jib
https://hg.mozilla.org/integration/autoland/rev/aad4412b43e2
Implement stable indexing for JSEP transceivers. r=mjf
https://hg.mozilla.org/integration/autoland/rev/13101fc544bc
Some wpt cleanup while I'm in here. r=jib
https://hg.mozilla.org/integration/autoland/rev/d339ce87c2ef
Fix this test to not expect onn, and add a slightly different test that should expect onn at the very end. r=jib

I wonder why that did not fail on try...

Flags: needinfo?(docfaraday)
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/23352 for changes under testing/web-platform/tests

Yeah, try must be running different static analysis. Maybe some new static analysis has landed on autoland, but has not yet been merged to m-c?

Yep, try isn't running the base-toolchains stuff.

Upstream PR was closed without merging
Pushed by bcampen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/120851356e60
Test-case for bug. r=jib
https://hg.mozilla.org/integration/autoland/rev/cb3ed3e1a72c
Implement stable indexing for JSEP transceivers. r=mjf
https://hg.mozilla.org/integration/autoland/rev/6c019a630750
Some wpt cleanup while I'm in here. r=jib
https://hg.mozilla.org/integration/autoland/rev/b78e1a7c12d3
Fix this test to not expect onn, and add a slightly different test that should expect onn at the very end. r=jib
Upstream web-platform-tests status checks passed, PR will merge once commit reaches central.
Upstream PR merged by moz-wptsync-bot

This issue is Verified as Fixed in our latest Nightly build 78.0a1 (2020-05-05) but since it's so technical I'm reluctant to change the flags for it, Byron can you please take a look at this issue and confirm that these are the correct Expected Results for this issue :

Using this test case: https://jsfiddle.net/jib1/v68ak09f/

Settled pc1 stable ["1","0","3","2","4"] 5
Settled pc2 stable ["0",null,"1","2","3","4"] 6
Settled pc1 stable ["1","0","3","2","4","5"] 6
Settled pc2 stable ["0","5","1","2","3","4"] 6

Settled pc1 stable ["1","0","3","2","4","5"] 6
Settled pc2 stable ["0","5","1","2","3","4"] 6
Settled pc1 stable ["1","0","3","2","4","5"] 6
Settled pc2 stable ["0","5","1","2","3","4"] 6

Flags: needinfo?(docfaraday)

That looks fine to me. jib?

Flags: needinfo?(docfaraday) → needinfo?(jib)

LGTM and matches Chrome now. 👍

Flags: needinfo?(jib)

Thank you, I will update the flags for this issue.

Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: