Closed
Bug 1041832
Opened 10 years ago
Closed 10 years ago
Webrtc mochitests should support trickle candidates
Categories
(Core :: WebRTC: Networking, defect)
Core
WebRTC: Networking
Tracking
()
RESOLVED
FIXED
mozilla34
People
(Reporter: bwc, Assigned: drno)
References
Details
Attachments
(1 file, 14 obsolete files)
43.47 KB,
patch
|
drno
:
review+
|
Details | Diff | Splinter Review |
Since bug 991037 entails a switch to full trickle logic, we'll need to teach the mochitests to pass along trickle ICE candidates, since they will no longer be in the SDP. This may also be useful in testing stuff other than host candidates.
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → drno
Assignee | ||
Comment 1•10 years ago
|
||
First version which works for local mochitest (note: WIP).
Assignee | ||
Comment 2•10 years ago
|
||
This properly handles now end of trickle ICE. Still WIP
Attachment #8465768 -
Attachment is obsolete: true
Assignee | ||
Comment 3•10 years ago
|
||
WIP. Still missing receiving candidates in steeplechase setup.
Attachment #8465817 -
Attachment is obsolete: true
Assignee | ||
Comment 4•10 years ago
|
||
WIP: only flush candidates on setRemoteDesc. Turn on data channel tests.
Attachment #8465861 -
Attachment is obsolete: true
Assignee | ||
Comment 5•10 years ago
|
||
Byron: With your patch from bug 991037 I have seen: - data channel test just work - end of trickle ICE coming in before all other candidates - data channel test hang Is it expected to not see any candidates from the answerer side?
Flags: needinfo?(docfaraday)
Reporter | ||
Comment 6•10 years ago
|
||
Not really. I'll probably be looking into 991037 today, so hopefully I'll get to the bottom of it.
Flags: needinfo?(docfaraday)
Reporter | ||
Comment 7•10 years ago
|
||
(In reply to Nils Ohlmeier [:drno] from comment #5) > Byron: With your patch from bug 991037 I have seen: > - data channel test just work > - end of trickle ICE coming in before all other candidates > - data channel test hang > > Is it expected to not see any candidates from the answerer side? So, are you seeing this happen with ICE succeeding? Because if ICE is sufficiently fast, it can beat trickle. Is this roughly what you're seeing? 0. Answerer has already gathered candidates. 1. Offerer starts gathering candidates. 2. Offerer creates the offer and calls setLocal, then sends the offer along. 3. Offerer starts trickling ICE candidates. 4. Answerer gets the offer, calls setLocal, and creates an answer. 5. Answerer starts sending ICE checks. 6. Answerer sends answer. 7. Answerer starts sending trickle candidates. 8. ICE checks beat the answerer's trickle candidates across, offerer creates peer-reflexive candidates, and ICE concludes.
Flags: needinfo?(drno)
Assignee | ||
Comment 8•10 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #7) > So, are you seeing this happen with ICE succeeding? Because if ICE is > sufficiently fast, it can beat trickle. Is this roughly what you're seeing? > > 0. Answerer has already gathered candidates. > 1. Offerer starts gathering candidates. > 2. Offerer creates the offer and calls setLocal, then sends the offer along. > 3. Offerer starts trickling ICE candidates. > 4. Answerer gets the offer, calls setLocal, and creates an answer. > 5. Answerer starts sending ICE checks. > 6. Answerer sends answer. > 7. Answerer starts sending trickle candidates. > 8. ICE checks beat the answerer's trickle candidates across, offerer creates > peer-reflexive candidates, and ICE concludes. I did not see 7. happening at all. The answer in 6. is empty. And ICE succeeds on both ends without any candidate from the answerer side ever being offered. I assume ICE connectivity beat the candidate gathering.
Flags: needinfo?(drno)
Reporter | ||
Comment 9•10 years ago
|
||
Strange. At any rate, I know what is causing the premature end-of-candidates signal, and I think I know how to fix it by deleting code. More tomorrow, hopefully, and maybe that will shed some light on this.
Reporter | ||
Comment 10•10 years ago
|
||
Ok, figured out the DataChannel and premature end-of-candidates problem. Bug 1024028 is biting us now.
Depends on: 1024028
Assignee | ||
Comment 11•10 years ago
|
||
Updated patch which watches for trickle end, verifies if candidates have been received, and uses a copy of the initial offer and answer.
Attachment #8466563 -
Attachment is obsolete: true
Assignee | ||
Comment 12•10 years ago
|
||
This actually works now with data channel test as well.
Attachment #8469604 -
Attachment is obsolete: true
Assignee | ||
Comment 13•10 years ago
|
||
Works with non-trickle ICE now.
Attachment #8470115 -
Attachment is obsolete: true
Reporter | ||
Comment 14•10 years ago
|
||
Be aware; this test case now detects bug 991037 (early end-of-candidates signal), so it will fail intermittently.
Assignee | ||
Comment 15•10 years ago
|
||
Added timeouts for ICE gathering.
Attachment #8470245 -
Attachment is obsolete: true
Reporter | ||
Comment 16•10 years ago
|
||
Heads-up; there's a couple of tests in media/tests/identity that still use variables that this patch renames.
Flags: needinfo?(drno)
Comment 17•10 years ago
|
||
Comment on attachment 8474853 [details] [diff] [review] bug_1041832_add_trickle_ice_support.patch Review of attachment 8474853 [details] [diff] [review]: ----------------------------------------------------------------- Missed a couple: 28640 INFO TEST-UNEXPECTED-FAIL | /tests/dom/media/tests/identity/test_setIdentityProvider.html | uncaught exception - TypeError: test.pcLocal._last_offer is undefined at http://mochi.test:8888/tests/dom/media/tests/identity/test_setIdentityProvider.html:86 TEST-INFO | expected PASS 28641 INFO TEST-UNEXPECTED-FAIL | /tests/dom/media/tests/identity/test_setIdentityProviderWithErrors.html | uncaught exception - TypeError: test.pcLocal._last_offer is undefined at http://mochi.test:8888/tests/dom/media/tests/identity/test_setIdentityProviderWithErrors.html:73 TEST-INFO | expected PASS ::: dom/media/tests/mochitest/pc.js @@ +1975,5 @@ > + * A PeerConnectionTest object to which the ice candidates gets > + * forwarded. > + */ > + setupIceCandidateHandler : function > + PCW_setupIceCandidateHandler(test) { Line wrap not needed, especially if you don't follow house style and you remove the second function name (which isn't needed). @@ +1986,5 @@ > + info(self.label + ": got iceCandidateEvent: " + anEvent); > + if (!anEvent.candidate) { > + info(self.label + ": received end of trickle ICE event"); > + self.endOfTrickleIce = true; > + if (typeof self.endOfTrickleIceCallback == 'function') { === ; check quote style
Comment 18•10 years ago
|
||
Should fix the only real problem related to this that I saw running bug 991037
Assignee | ||
Comment 19•10 years ago
|
||
Try run: https://tbpl.mozilla.org/?tree=Try&rev=2c6e82ec8aa6
Attachment #8474853 -
Attachment is obsolete: true
Attachment #8477627 -
Attachment is obsolete: true
Assignee | ||
Comment 20•10 years ago
|
||
Removed checks for gathering state.
Attachment #8477973 -
Attachment is obsolete: true
Assignee | ||
Comment 21•10 years ago
|
||
Try run for the latest patch with trickle ICE: https://tbpl.mozilla.org/?tree=Try&rev=ed4e54261d6f Try run against current m-c (w/o trickle ICE): https://tbpl.mozilla.org/?tree=Try&rev=cc547383078d
Reporter | ||
Comment 22•10 years ago
|
||
Ok, the main problem here is that there is a totally unreasonable delay between createAnswer/setLocal on the answerer side, while we send the answer to the offerer immediately. The offerer receives and sets the answer seconds before the answerer sets its own description. Pre-full-trickle, the offerer got the candidates in the answer, and was able to start ICE regardless of whether the answerer had set its own SDP yet, but in the full trickle case the answerer can't send any candidates until its own SDP is set, so the start of ICE is significantly out of sync.
Reporter | ||
Comment 23•10 years ago
|
||
Wait, scratch that, the reason for the delay between createAnswer/setLocal is because this is a datachannel test, and we're waiting for the datachannels in between these two operations. We probably shouldn't be doing that.
Reporter | ||
Comment 24•10 years ago
|
||
I'm kicking off a try run with full logging for b2g emulator M7. That should shed some more light on where these delays are coming from.
Assignee | ||
Comment 25•10 years ago
|
||
Keep in mind that test_dataChannel_bug1013809.html just tests the old test behavior, where we set the answer on the offerer side first, before setting on the answerer side. I think that is a pretty unrealistic scenario which we will hardly see in the field. If this is the only remaining problem I suggest we disabled that test, as it is know to cause strange timing effect.
Reporter | ||
Comment 26•10 years ago
|
||
Kill it with fire.
Reporter | ||
Comment 27•10 years ago
|
||
(Ok, maybe that wasn't totally explicit. We should remove the test entirely.)
Reporter | ||
Comment 28•10 years ago
|
||
So, I see that practically all of these tests are disabled on b2g debug, due to horrible lag. Maybe the fact that we're doing a little more work handling trickle candidates has pushed opt over the edge too?
Reporter | ||
Comment 29•10 years ago
|
||
Comment on attachment 8478684 [details] [diff] [review] bug_1041832_add_trickle_ice_support.patch Review of attachment 8478684 [details] [diff] [review]: ----------------------------------------------------------------- Drive-by review comments. Not giving the thumbs up/down yet, just trying to parallelize. ::: dom/media/tests/identity/test_setIdentityProvider.html @@ +83,5 @@ > [ > "OFFERS_AND_ANSWERS_INCLUDE_IDENTITY", > function(test) { > + ok(test.originalOffer.sdp.contains("a=identity"), "a=identity is in the offer SDP"); > + ok(test.originalAnswer.sdp.contains("a=identity"), "a=identity is in the answer SDP"); The other tests seem to replace with _latest_offer/_latest_answer here. ::: dom/media/tests/identity/test_setIdentityProviderWithErrors.html @@ +70,5 @@ > [ > 'ONLY_REMOTE_SDP_INCLUDES_IDENTITY_ASSERTION', > function(test) { > + ok(!test.originalOffer.sdp.contains('a=identity'), 'a=identity not contained in the offer SDP'); > + ok(test.originalAnswer.sdp.contains('a=identity'), 'a=identity is contained in the answer SDP'); Same issue here. ::: dom/media/tests/mochitest/mochitest.ini @@ +67,5 @@ > # no screenshare on b2g/android > # frequent timeouts/crashes on e10s (bug 1048455) > skip-if = buildapp == 'b2g' || toolkit == 'android' || e10s > +#[test_peerConnection_basicH264Video.html] > +#skip-if = buildapp == 'b2g' || os == 'android' # bug 1043403 Is this related? ::: dom/media/tests/mochitest/pc.js @@ +677,3 @@ > peer.createAnswer(function (answer) { > + // make a copy so this does not get updated with ICE candidates > + self.originalAnswer = new mozRTCSessionDescription(JSON.parse(JSON.stringify(answer))); I love javascript... @@ +1830,5 @@ > + var self = this; > + > + self._remote_ice_candidates.push(candidate); > + if (self.remoteDescriptionSet) { > + //info("adding candidate " + JSON.stringify(candidate)); Either remove or uncomment. @@ +2002,5 @@ > + self._local_ice_candidates.push(anEvent.candidate); > + test.iceCandidateHandler(self.label, anEvent.candidate); > + if (typeof self.onIceCandidateCallback === 'function') { > + self.onIceCandidateCallback(anEvent.candidate); > + } Don't see onIceCandidateCallback being set anywhere. ::: dom/media/tests/mochitest/templates.js @@ +693,5 @@ > ], > [ > 'PC_REMOTE_SANE_LOCAL_SDP', > function (test) { > + test.pcRemote.verifySdp(test.pcRemote.localDescription, "answer", test.pcRemote.constraints, test.pcRemote.offerOptions, function (trickle) { Line's getting really long here. @@ +702,5 @@ > ], > [ > 'PC_LOCAL_SANE_REMOTE_SDP', > function (test) { > + test.pcLocal.verifySdp(test.pcLocal.remoteDescription, "answer", test.pcLocal.constraints, test.pcLocal.offerOptions, function (trickle) Line's getting really long here. @@ +722,5 @@ > myTest.next(); > }; > function onIceConnectedFailed () { > dumpSdp(myTest); > + is(test.pcLocal._pc.iceGatheringState, GATH_COMPLETE, "ICE gathering state has reached 'complete'"); Hmm. I'm not totally sure this something we'll do right now, and not sure what putting it right above ok(false, accomplishes. @@ +765,5 @@ > myTest.next(); > }; > function onIceConnectedFailed () { > dumpSdp(myTest); > + is(test.pcRemote._pc.iceGatheringState, GATH_COMPLETE, "ICE gathering state has reached 'complete'"); Similar thing here.
Assignee | ||
Comment 30•10 years ago
|
||
Updated according to Byron's feedback. Disabled test_dataChannel_bug1013809.html on B2G. Try run with trickle ICE patches: https://tbpl.mozilla.org/?tree=Try&rev=401be974550f
Attachment #8478684 -
Attachment is obsolete: true
Assignee | ||
Comment 31•10 years ago
|
||
As the previous try run was showing lots of oranges from the IPC fall out, here is another run with IPC fixes included: https://tbpl.mozilla.org/?tree=Try&rev=0316bdc7627f
Assignee | ||
Comment 32•10 years ago
|
||
The previous patch had still issues with the SDP in the data channel tests. Try run: https://tbpl.mozilla.org/?tree=Try&rev=c486298eb413
Attachment #8479450 -
Attachment is obsolete: true
Reporter | ||
Comment 33•10 years ago
|
||
So I ran a bunch of retriggers of b2g M7, and it looks like our orange rate is in the 20-30% range. https://tbpl.mozilla.org/?tree=Try&rev=41d9bca2547b
Reporter | ||
Comment 34•10 years ago
|
||
I am trying to pin down the source of these horrible delays, and the TimerThread is definitely firing timers very late (I added some checking and a MOZ_CRASH to TimerThread::Run in a try push here: https://tbpl.mozilla.org/?tree=Try&rev=8e989f171fc5). I tried checking for clock drift, but found no evidence of it. I now have a try run that should log when the monitor wait in TimerThread takes more than a second too long, we'll see what happens.
Reporter | ||
Comment 35•10 years ago
|
||
So something interesting here. dom/tests/mochitest/bugs/test_bug369306.html seems to be perma-orange with these patches, but it is a completely unrelated test...
Reporter | ||
Comment 36•10 years ago
|
||
That test is failing because the test window has chrome privs for some reason, which allows it to bypass the dom.disable_window_flip pref. Interestingly enough, the test doesn't always fail, probably because the "focus" and "blur" callbacks aren't arriving in time.
Reporter | ||
Comment 37•10 years ago
|
||
I am going to debug this test without the patch set from bug 991037. More in a few hours.
Reporter | ||
Comment 38•10 years ago
|
||
Ok, that test is completely broken, see the comments on bug 963103. I think we should disable it.
Assignee | ||
Comment 39•10 years ago
|
||
Moved B2G test disabling out of the patch set.
Attachment #8479628 -
Attachment is obsolete: true
Attachment #8480812 -
Flags: review?(docfaraday)
Reporter | ||
Comment 40•10 years ago
|
||
Comment on attachment 8480812 [details] [diff] [review] bug_1041832_add_trickle_ice_support.patch Review of attachment 8480812 [details] [diff] [review]: ----------------------------------------------------------------- r+ if you just make sure that the PC_REMOTE_CHECK_MEDIA_TRACKS and PC_LOCAL_CHECK_MEDIA_TRACKS steps are consistent between the non-datachannel and datachannel cases, since that will make it easier to start cutting down on the duplicate code later.
Attachment #8480812 -
Flags: review?(docfaraday) → review+
Assignee | ||
Comment 41•10 years ago
|
||
Addressed Byron's concern regarding the TRACKS verification. Carrying forward r+=bwc
Attachment #8480812 -
Attachment is obsolete: true
Attachment #8480831 -
Flags: review+
Assignee | ||
Comment 42•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/029654998ef3
Comment 43•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/029654998ef3
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Updated•10 years ago
|
Flags: qe-verify-
You need to log in
before you can comment on or make changes to this bug.
Description
•