Closed Bug 1013809 Opened 6 years ago Closed 5 years ago

Mochitests call setLocalDesc and setRemoteDesc in an odd order

Categories

(Core :: WebRTC, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla32

People

(Reporter: drno, Assigned: drno)

References

Details

Attachments

(4 files, 6 obsolete files)

The sequence of function in all of our current mochitests looks like this:
1. pc1.createOffer()
2. pc1.setLocalDescription()
3. pc2.setRemoteDescription()
4. pc2.createAnswer()
5. pc1.setRemoteDescription()
6. pc2.setLocalDescription()

To me the order of steps 5 and 6 appears odd, and most likely not how browsers in a real world scenario would call these functions.
We should switch the order of steps 5. and 6. to make our test cases behave more like regular usage scenarios.
Assignee: nobody → drno
Call setLocalDescription with the SDP answer before sending it to the far end and call setRemoteDescription there.
Attachment 8426081 [details] [diff] only fixes the test_peerConnection_* tests, but not the dataChannel tests. Fixing them requires a little bit more effort.
Yes, please.  Though it points out that there are a number of possible paths to test, so maybe not fixing one of the tests would be good to verify the sequence works
(In reply to Randell Jesup [:jesup] from comment #3)
> Yes, please.  Though it points out that there are a number of possible paths
> to test, so maybe not fixing one of the tests would be good to verify the
> sequence works

Absolutely. We should keep or create a new test which keeps testing the old behavior. It is a valid, but unusual sequence.
Lets try to land this is in multiple chunks. This fixes the order for the peerConnection tests and creates one new test which keeps testing the old order.

Try run: https://tbpl.mozilla.org/?tree=Try&rev=5c361c5c806f
Attachment #8426081 - Attachment is obsolete: true
Attachment #8427389 - Flags: review?(rjesup)
Attachment #8427389 - Flags: review?(jib)
Attachment #8427389 - Flags: review?(rjesup) → review+
Comment on attachment 8427389 [details] [diff] [review]
swap_setLocalDesc_setRemoteDesc_steps.patch

Review of attachment 8427389 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm.

::: dom/media/tests/mochitest/test_peerConnection_bug1013809.html
@@ +3,5 @@
> +<head>
> +  <link rel="stylesheet" type="text/css" href="/tests/SimpleTest/test.css"/>
> +  <script type="application/javascript" src="/tests/SimpleTest/SimpleTest.js"></script>
> +  <script type="application/javascript" src="head.js"></script>
> +  <script type="application/javascript" src="mediaStreamPlayback.js"></script>

Are all of these needed in every test file? I see no uses of LocalMediaStreamPlayback here.
Attachment #8427389 - Flags: review?(jib) → review+
(In reply to Jan-Ivar Bruaroey [:jib] from comment #6)
> ::: dom/media/tests/mochitest/test_peerConnection_bug1013809.html
> @@ +3,5 @@
> > +<head>
> > +  <link rel="stylesheet" type="text/css" href="/tests/SimpleTest/test.css"/>
> > +  <script type="application/javascript" src="/tests/SimpleTest/SimpleTest.js"></script>
> > +  <script type="application/javascript" src="head.js"></script>
> > +  <script type="application/javascript" src="mediaStreamPlayback.js"></script>
> 
> Are all of these needed in every test file? I see no uses of
> LocalMediaStreamPlayback here.

Good point. I'm going to upload a new patch withouth the mediaStreamPlayback.js.
But in general we should probably review if we can find a nicer chain of inclusions, to reduce the amount of inclusions needed in each test.
Carrying forward r+=:jesup,:jib.
Attachment #8427389 - Attachment is obsolete: true
Depends on: 1016498
Attached patch split_up_data_channels.patch (obsolete) — Splinter Review
This patch splits the old setLocalDescription behavior for test_dataChannel_* into multiple pieces and swaps the order of setRemoteDescription and setLocalDescription.
It also adds another test case which maintains the old behavior for data channels tests.

Try run: https://tbpl.mozilla.org/?tree=Try&rev=4a1983276712
Attachment #8430424 - Flags: review?(rjesup)
Attachment #8430424 - Flags: review?(jib)
Blocks: 1017321
Comment on attachment 8430424 [details] [diff] [review]
split_up_data_channels.patch

Review of attachment 8430424 [details] [diff] [review]:
-----------------------------------------------------------------

r- because the PC_LOCAL_ADD_DATA_CHANNEL_CALLBACK and PC_REMOTE_ADD_DATA_CHANNEL_CALLBACK tests seem insufficient to test that onopen notifications work. Not a regression, but seems we should fix them or remove them.

::: dom/media/tests/mochitest/pc.js
@@ -937,5 @@
>       */
>      value : function DCT_setLocalDescription(peer, desc, state, onSuccess) {
> -      // If the peer has a remote offer we are in the final call, and have
> -      // to wait for the datachannel connection to be open. It will also set
> -      // the local description internally.

Please remove mention of datachannel from the comment on DCT_setLocalDescription itself as well.

@@ +957,2 @@
>  
> +      function dataChannelConnected() {

We get a channel passed in here. Should we validate it?

@@ +962,5 @@
>        }
>  
> +      dcConnectionTimeout = setTimeout(function () {
> +        ok(false, peer + " timed out while waiting for dataChannel[0] to connect");
> +        onSuccess();

Wont calling onSuccess here trip up subsequent tests? Seems like waitForInitialDataChannel should take an onFailure argument to use here, then the caller can decide whether to proceed or not.

Also, why not move ok(true|false) calls out of this function and let the caller do those, i.e. making this a utility function rather than a test?

@@ +968,2 @@
>  
> +      // TODO: we should have generic code handling these two cases the same way

I don't see how we could do that, given that the second case requires an extra indirection (I'm peeking at registerDataChannelOpenEvents). It seems inherently asymmetrical.

@@ +971,5 @@
> +        peer.dataChannels[0].onopen = dataChannelConnected;
> +      }
> +      else {
> +        peer.registerDataChannelOpenEvents(function () {
> +          dataChannelConnected();

Why not peer.registerDataChannelOpenEvents(dataChannelConnected); ?

(I spot a similar optimization possibility in registerDataChannelOpenEvents fwiw)

@@ -993,5 @@
> -
> -      PeerConnectionTest.prototype.setLocalDescription.call(this, targetPeer, desc,
> -        state,
> -        function () {
> -          self.connected = true;

There's one more reference to self.connected in this file that should be removed.

::: dom/media/tests/mochitest/templates.js
@@ +386,5 @@
> +  [
> +    'PC_REMOTE_ADD_DATA_CHANNEL_CALLBACK',
> +    function (test) {
> +      test.pcRemote.registerDataChannelOpenEvents(function () {
> +        ok(true, test.pcRemote + " dataChannel[0] connected")

PC_LOCAL_ADD_DATA_CHANNEL_CALLBACK and PC_REMOTE_ADD_DATA_CHANNEL_CALLBACK seem insufficient to test that onopen notifications work.

They register onopen handlers and proceed, but there's no timeout, so if a callback fails to fire we get one or two fewer SUCCEEDED than otherwise, which isn't detected. (I know you have timeouts in subsequent tests, but if things succeed quickly then notifications aren't verified by them).

Why not use waitForInitialDataChannel here? You've just added a timeout to it, so with an additional onFailure argument (as I suggest elsewhere), it seems a fit: You wouldn't actually have to wait, instead, add two callbacks (one with ok(true), the other with ok(false), and immediately proceed to test.next().

@@ +420,4 @@
>      function (test) {
> +      if (test.pcLocal.dataChannels[0].readyState === "open") {
> +        ok(true, test.pcLocal + "dataChannel[0] is in 'open' state already");
> +        test.next();

why not fold this test into waitForInitialDataChannel, basically have it succeed synchronously if readyState is already open? That should simplify this codeflow and make the waitForInitialDataChannel more robust.
Attachment #8430424 - Flags: review?(jib) → review-
Duplicate of this bug: 1016498
(In reply to Carsten Book [:Tomcat] from comment #10)
> https://hg.mozilla.org/mozilla-central/rev/6c53f6dbb7d0

Backed out the first landing in this bug for causing bug 1016498.

remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/8b6bb2a4b8bb
(In reply to Ed Morley [:edmorley UTC+0] from comment #14)
> (In reply to Carsten Book [:Tomcat] from comment #10)
> > https://hg.mozilla.org/mozilla-central/rev/6c53f6dbb7d0
> 
> Backed out the first landing in this bug for causing bug 1016498.
> 
> remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/8b6bb2a4b8bb

Sorry for not replying earlier on bug 1016498, but I think you killed the wrong beast here. The patch to template.js brings things more in the order how we would expect it out in the wild.
And the new test case test_peerConnection_bug1013809.html preserves the old ("broken"/unusual) behavior. From looking at bug 1016498 it looks like the new test case is causing lots of problems. But I think that should not prevent the patch to template.js to stick.

Would it help if I provide a patch to template.js here and move the new test out into a separate Bugzilla ticket, which we can then independently track?
Commenting inline on things I have trouble with...

(In reply to Jan-Ivar Bruaroey [:jib] from comment #12)
> @@ +962,5 @@
> >        }
> >  
> > +      dcConnectionTimeout = setTimeout(function () {
> > +        ok(false, peer + " timed out while waiting for dataChannel[0] to connect");
> > +        onSuccess();
> 
> Wont calling onSuccess here trip up subsequent tests? Seems like
> waitForInitialDataChannel should take an onFailure argument to use here,
> then the caller can decide whether to proceed or not.

We have tons of places were this a problem. Maybe it is time to introduce a global
killall function which tears down everything in case an error was encountered
(note: SimpleTest.finish() is not sufficient for that). But we should probably
do that in a seperate ticket.

> @@ +968,2 @@
> >  
> > +      // TODO: we should have generic code handling these two cases the same way
> 
> I don't see how we could do that, given that the second case requires an
> extra indirection (I'm peeking at registerDataChannelOpenEvents). It seems
> inherently asymmetrical.

I added some new code which should take of this now.

> @@ +971,5 @@
> > +        peer.dataChannels[0].onopen = dataChannelConnected;
> > +      }
> > +      else {
> > +        peer.registerDataChannelOpenEvents(function () {
> > +          dataChannelConnected();
> 
> Why not peer.registerDataChannelOpenEvents(dataChannelConnected); ?
>
> (I spot a similar optimization possibility in registerDataChannelOpenEvents
> fwiw)

I don't see that. in registerDataChannelOpenEvents it actually passes arguments.
How do you pass the arguments without calling the function if you don't wrap it
into a function?

> ::: dom/media/tests/mochitest/templates.js
> @@ +386,5 @@
> > +  [
> > +    'PC_REMOTE_ADD_DATA_CHANNEL_CALLBACK',
> > +    function (test) {
> > +      test.pcRemote.registerDataChannelOpenEvents(function () {
> > +        ok(true, test.pcRemote + " dataChannel[0] connected")
> 
> PC_LOCAL_ADD_DATA_CHANNEL_CALLBACK and PC_REMOTE_ADD_DATA_CHANNEL_CALLBACK
> seem insufficient to test that onopen notifications work.
> 
> They register onopen handlers and proceed, but there's no timeout, so if a
> callback fails to fire we get one or two fewer SUCCEEDED than otherwise,
> which isn't detected. (I know you have timeouts in subsequent tests, but if
> things succeed quickly then notifications aren't verified by them).
> 
> Why not use waitForInitialDataChannel here? You've just added a timeout to
> it, so with an additional onFailure argument (as I suggest elsewhere), it
> seems a fit: You wouldn't actually have to wait, instead, add two callbacks
> (one with ok(true), the other with ok(false), and immediately proceed to
> test.next().
> 

First of all the question is if I want to test the callbacks here at all. The
old code did not test them, but merly depend on them to fire. Which IMHO is
really bad, because if they don't fire the test framework just times out with no
explanation at all.
Secondly your suggestion would only work if subsequent steps in the chain would
not depend on the success of this. Further down the tests actually try to pass
data through the channel. If this point is reached without the channel being
open, but timeout has not fired yet, how should the test behave?
And what happens if the test finishes before the timeout kicked in?

On the other hand my test runs have shown (not surprisingly) that I need to
setup the callbacks before calling set*Descriptions, because otherwise the
default onopen handler marks it as a test failure (which is perfectly fine).

So I don't really see a way how to prevent that I need two steps here one for
setting up the callbacks, and the second one which waits for success. My current
implementation was just a lazy way of avoiding to carry state information, for
the price of not veryfying that the callbacks actually ever happened.

I see two alternatives forward:
1) In our general data channel tests we don't care if callbacks are firing, but
instead we write a separate test which only verifies that.
2) We verify proper callbacks in all data channel tests, but then need to add
some state information to our tests.
Attachment #8430424 - Flags: review?(rjesup)
(In reply to Nils Ohlmeier [:drno] from comment #15)
> Would it help if I provide a patch to template.js here and move the new test
> out into a separate Bugzilla ticket, which we can then independently track?

Happy to defer to you on this; looks like you're making good progress with this bug either way - so hopefully we should see fewer of these failures regardless :-)
Attachment #8430424 - Attachment is obsolete: true
Attachment #8427874 - Attachment is obsolete: true
Try run for the latest and greatest: https://tbpl.mozilla.org/?tree=Try&rev=96a298863ae9
Attachment #8432332 - Flags: review?(jib)
Comment on attachment 8432336 [details] [diff] [review]
bug_1013809_part_2_test_for_old_logic.patch

As suggested by Ed in another ticket I disabled the new test testing the old behavior on B2G for now as it seems to cause lots of problems.
Attachment #8432336 - Flags: review?(jib)
Attachment #8432338 - Flags: review?(jib)
Attachment #8432339 - Flags: review?(jib)
Attachment #8432332 - Flags: review?(jib) → review+
Attachment #8432336 - Flags: review?(jib) → review+
Comment on attachment 8432338 [details] [diff] [review]
bug_1013809_part_3_swap_setDesc_for_dataChannels.patch

Review of attachment 8432338 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with nits and onFailure callbacks in PC_XXX_SETUP_DATA_CHANNEL_CALLBACK

::: dom/media/tests/mochitest/pc.js
@@ +952,4 @@
>  
> +      function dataChannelConnected(channel) {
> +        clearTimeout(dcConnectionTimeout);
> +        is(channel.readyState, "open", peer + " dataChannel[0] is in state: 'open'");

dataChannels[0] plural

@@ +965,3 @@
>  
> +      //peer.registerDataChannelOpenEvents(dataChannelConnected);
> +      // TODO: we should have generic code handling these two cases the same way

Short of a new function that buries the if{}else{} I'm not sure what further optimization can be had here, looks pretty descent already if you ask me, but if you have something in mind that's great.

@@ +967,5 @@
> +      // TODO: we should have generic code handling these two cases the same way
> +      if (peer == this.pcLocal) {
> +        peer.dataChannels[0].onopen = dataChannelConnected;
> +      }
> +      else {

} else {

::: dom/media/tests/mochitest/templates.js
@@ +378,5 @@
> +    'PC_LOCAL_SETUP_DATA_CHANNEL_CALLBACK',
> +    function (test) {
> +      test.waitForInitialDataChannel(test.pcLocal, function () {
> +        ok(true, test.pcLocal + " dataChannel[0] switched to 'open'");
> +      }, function () { });

If datachannels fail early we want to catch that here just as much as later. I suggest reusing the same onFailure callback here as is used further down.

@@ +387,5 @@
> +    'PC_REMOTE_SETUP_DATA_CHANNEL_CALLBACK',
> +    function (test) {
> +      test.waitForInitialDataChannel(test.pcRemote, function () {
> +        ok(true, test.pcRemote + " dataChannel[0] switched to 'open'");
> +      }, function () { });

Same for remote
Attachment #8432338 - Flags: review?(jib) → review+
Comment on attachment 8432339 [details] [diff] [review]
bug_1013809_part_4_test_for_old_logic_data_channels.patch

Review of attachment 8432339 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/tests/mochitest/mochitest.ini
@@ +19,5 @@
>  skip-if = os == 'mac' || toolkit=='gonk' # b2g(Bug 960442, video support for WebRTC is disabled on b2g) b2g-debug(Bug 960442, video support for WebRTC is disabled on b2g)
>  [test_dataChannel_basicDataOnly.html]
>  [test_dataChannel_basicVideo.html]
> +skip-if = toolkit=='gonk'
> +[test_dataChannel_bug1013809.html]

Maybe repeat # b2g emulator seems to bee too slow (Bug 1016498 and 1008080)
Attachment #8432339 - Flags: review?(jib) → review+
Addressed NITs from jib.

Carrying forward r+=jib
Attachment #8432338 - Attachment is obsolete: true
Addressed NIT's from jib.

Carrying forward r+=jib
Attachment #8432339 - Attachment is obsolete: true
The summary for one of the patches that landed cited the wrong bug number (1013890 instead of 1013809).

https://hg.mozilla.org/mozilla-central/rev/98b14328c39a
(In reply to Jared Wein [:jaws] (please needinfo? me) from comment #30)
> The summary for one of the patches that landed cited the wrong bug number
> (1013890 instead of 1013809).
> 
> https://hg.mozilla.org/mozilla-central/rev/98b14328c39a

Sorry about that. Are there any tools for checking for such typo's before uploading/committing?
Not that I know of, although you can use the bzexport tool to create the blank patches for you and work from there. See https://blog.mozilla.org/sfink/2012/04/13/bzexport-changes-released/ for more details.
Blocks: 1016498
No longer depends on: 1016498
Duplicate of this bug: 1008080
You need to log in before you can comment on or make changes to this bug.