assert_equals: 42 transceivers on side B expected 42 but got 41
Categories
(Core :: WebRTC: Signaling, defect, P2)
Tracking
()
People
(Reporter: jib, Assigned: bwc)
References
(Regression)
Details
(Keywords: regression)
Attachments
(5 files, 1 obsolete file)
Comment hidden (obsolete) |
Reporter | ||
Comment 1•4 years ago
|
||
The above turned out to be a pilot error in the test (yay). The remaining intermittent is [Perfect negotiation stress glare with roles reversed]:
assert_equals: 42 transceivers on side B expected 42 but got 41
Hopefully this also turns out to be a test problem, but that remains to be seen.
Reporter | ||
Updated•4 years ago
|
Reporter | ||
Comment 2•4 years ago
|
||
Reporter | ||
Comment 3•4 years ago
•
|
||
After some instrumentation, I was able to boil down STRs:
Expected results:
Settled pc1 stable ["1","0","3","4","5"] 5
Settled pc2 stable ["0","5","1","3","4"] 5
Actual results:
Settled pc1 stable ["1","0","3","4","5"] 5
Settled pc2 stable ["0","5","1","2","3","4"] 6
I've given both sides ample time to settle this, but they don't, which seems like a JSEP bug.
Reporter | ||
Updated•4 years ago
|
Assignee | ||
Comment 4•4 years ago
|
||
pc1 and pc2 have not had the same number of negotiations in this test case. pc1 completed the offer/answer started in p3, but pc2 never did; that offer/answer exchange was abandoned on pc2 entirely when we reached the "await pc2.setRemoteDescription(pc1.localDescription);" in the loop below. I do not think JSEP can deal with this case; JSEP requires that when an offer/answer exchange completes on one end, it must also be completed on the other.
Reporter | ||
Comment 5•4 years ago
•
|
||
Why does history matter?
With SDP dump in https://jsfiddle.net/jib1/v68ak09f/150/ we see pc1
producing an answer containing a transceiver it doesn't have ("2"
). Surely that's a bug?
Settled pc1 stable ["1","0","3","4","5"] 5
Settled pc2 stable ["0","5","1","2","3","4"] 6
pc2's offer = a=mid:0,a=mid:1,a=mid:2,a=mid:3,a=mid:4,a=mid:5
pc1's answer = a=mid:0,a=mid:1,a=mid:2,a=mid:3,a=mid:4,a=mid:5
Settled pc1 stable ["1","0","3","4","5"] 5
Settled pc2 stable ["0","5","1","2","3","4"] 6
Reporter | ||
Comment 6•4 years ago
|
||
Note I updated the fiddle to 150
Assignee | ||
Comment 7•4 years ago
|
||
(In reply to Jan-Ivar Bruaroey [:jib] (needinfo? me) from comment #5)
Why does history matter?
Lots of reasons. For example, the rules around m-section reuse require a negotiation to first disable the m-section, then a subsequent negotiation can reuse it. If the two sides aren't carrying out the same negotiations, then the whole thing breaks.
With SDP dump in https://jsfiddle.net/jib1/v68ak09f/150/ we see
pc1
producing an answer containing a transceiver it doesn't have ("2"
). Surely that's a bug?Settled pc1 stable ["1","0","3","4","5"] 5 Settled pc2 stable ["0","5","1","2","3","4"] 6 pc2's offer = a=mid:0,a=mid:1,a=mid:2,a=mid:3,a=mid:4,a=mid:5 pc1's answer = a=mid:0,a=mid:1,a=mid:2,a=mid:3,a=mid:4,a=mid:5 Settled pc1 stable ["1","0","3","4","5"] 5 Settled pc2 stable ["0","5","1","2","3","4"] 6
I'm not sure what is happening there. Let me look into it. I suspect that we're letting a failure slide that we should not be.
Assignee | ||
Comment 8•4 years ago
|
||
Here's a minimal test case:
https://jsfiddle.net/L3a5wydz/
It looks like implicit rollback is destroying the wrong transceiver on the second sRD(offer)? Looking closer.
Assignee | ||
Comment 9•4 years ago
|
||
Ok, so doing the rollbacks explicitly in between the sRD calls seems to work properly: https://jsfiddle.net/L3a5wydz/1/
It seems that repeated sRD(offer) is not doing the implicit rollbacks correctly for some reason.
Assignee | ||
Comment 10•4 years ago
|
||
Actually, that's not true; just because you emit an offer with a mid in it, doesn't mean that mid is going to show up in a transceiver yet.
Assignee | ||
Comment 11•4 years ago
|
||
I see what's happening. This loop ends up not doing its job because we're doing a rollback and reapplication in one go, which leaves us with the same number of transceivers that we started with, but one of the pre-existing transceivers was marked for removal:
I think we can fix this by moving the implicit rollback for repeated sRD(offer) to PeerConnection.jsm, like we do for the implicit rollback for sRD(offer) when in have-local-offer.
Reporter | ||
Comment 12•4 years ago
|
||
Great! Btw if you're wondering how this crazy series of methods happened, see bug 1628139. I still think we should fix this though.
Assignee | ||
Comment 13•4 years ago
|
||
Assignee | ||
Comment 14•4 years ago
|
||
JSEP transceivers were previously held in a vector, in the order that they were
created. However, these could be removed, which meant that the indexing was not
stable. Under most circumstances this did not matter, however there was a
wrinkle with implicit rollback in repeated sRD. Re-applying a remote offer that
had created a transceiver would destroy that transceiver, and create a new one
to replace it. However, JS was not informed, because to PeerConnectionImpl it
looked like nothing had changed. Now, transceivers are indexed in a stable way,
which allows this bug to be fixed, and makes things a little less confusing.
Depends on D70399
Assignee | ||
Comment 15•4 years ago
|
||
Depends on D70400
Assignee | ||
Comment 16•4 years ago
|
||
Depends on D70449
Assignee | ||
Comment 17•4 years ago
•
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3e75b20766f8018d10861639e7860568adc37077 Gtest here: https://hg.mozilla.org/try/rev/9126a582591adcc66561733b474aedd33205fcc5
Assignee | ||
Comment 18•4 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=513f05876adab34cc370d163a9c8c6ac295d3638
Reporter | ||
Comment 20•4 years ago
|
||
See bug 1630029 for regression range.
Reporter | ||
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Assignee | ||
Comment 21•4 years ago
|
||
Lost sight of this because I never took the bug!
Assignee | ||
Comment 22•4 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ce6abb6383d2baa746d0d50ae886a24fda19116c
Assignee | ||
Comment 23•4 years ago
•
|
||
I'm going to do another try push for all our tests, since it has been a while, and related stuff has landed in the meantime.
Assignee | ||
Comment 24•4 years ago
|
||
Comment 25•4 years ago
|
||
Pushed by bcampen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/eba5d05e59b2 Test-case for bug. r=jib https://hg.mozilla.org/integration/autoland/rev/aad4412b43e2 Implement stable indexing for JSEP transceivers. r=mjf https://hg.mozilla.org/integration/autoland/rev/13101fc544bc Some wpt cleanup while I'm in here. r=jib https://hg.mozilla.org/integration/autoland/rev/d339ce87c2ef Fix this test to not expect onn, and add a slightly different test that should expect onn at the very end. r=jib
Comment 26•4 years ago
|
||
Backed out 4 changesets (bug 1622384) for causing build bustages on JsepSession.h
Backout revision https://hg.mozilla.org/integration/autoland/rev/255fae34d0ccfa57b5bf2c01c4b1e94e177f9959
Failure log https://treeherder.mozilla.org/logviewer.html#?job_id=300287496&repo=autoland
Byron can you please take a look?
Assignee | ||
Comment 27•4 years ago
|
||
I wonder why that did not fail on try...
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/23352 for changes under testing/web-platform/tests
Assignee | ||
Comment 29•4 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=696f71cb3f053aa30adf1283ea351452b135d85b
Assignee | ||
Comment 30•4 years ago
|
||
Yeah, try must be running different static analysis. Maybe some new static analysis has landed on autoland, but has not yet been merged to m-c?
Assignee | ||
Comment 31•4 years ago
|
||
Yep, try isn't running the base-toolchains stuff.
Upstream PR was closed without merging
Assignee | ||
Comment 33•4 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=cb9750ccaebc9123a827313a80983c5a92a31320
Comment 34•4 years ago
|
||
Pushed by bcampen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/120851356e60 Test-case for bug. r=jib https://hg.mozilla.org/integration/autoland/rev/cb3ed3e1a72c Implement stable indexing for JSEP transceivers. r=mjf https://hg.mozilla.org/integration/autoland/rev/6c019a630750 Some wpt cleanup while I'm in here. r=jib https://hg.mozilla.org/integration/autoland/rev/b78e1a7c12d3 Fix this test to not expect onn, and add a slightly different test that should expect onn at the very end. r=jib
Upstream web-platform-tests status checks passed, PR will merge once commit reaches central.
Comment 36•4 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/120851356e60
https://hg.mozilla.org/mozilla-central/rev/cb3ed3e1a72c
https://hg.mozilla.org/mozilla-central/rev/6c019a630750
https://hg.mozilla.org/mozilla-central/rev/b78e1a7c12d3
Upstream PR merged by moz-wptsync-bot
Comment 38•4 years ago
|
||
This issue is Verified as Fixed in our latest Nightly build 78.0a1 (2020-05-05) but since it's so technical I'm reluctant to change the flags for it, Byron can you please take a look at this issue and confirm that these are the correct Expected Results for this issue :
Using this test case: https://jsfiddle.net/jib1/v68ak09f/
Settled pc1 stable ["1","0","3","2","4"] 5
Settled pc2 stable ["0",null,"1","2","3","4"] 6
Settled pc1 stable ["1","0","3","2","4","5"] 6
Settled pc2 stable ["0","5","1","2","3","4"] 6
Settled pc1 stable ["1","0","3","2","4","5"] 6
Settled pc2 stable ["0","5","1","2","3","4"] 6
Settled pc1 stable ["1","0","3","2","4","5"] 6
Settled pc2 stable ["0","5","1","2","3","4"] 6
Assignee | ||
Comment 39•4 years ago
|
||
That looks fine to me. jib?
Comment 41•4 years ago
|
||
Thank you, I will update the flags for this issue.