Closed Bug 840344 Opened 11 years ago Closed 11 years ago

Assertion failure: !other->mOtherDirection with multiple createAnswer() calls

Categories

(Core :: WebRTC: Signaling, defect, P2)

x86
macOS
defect

Tracking

()

VERIFIED FIXED
mozilla21
Tracking Status
firefox18 --- disabled
firefox19 --- disabled
firefox20 - disabled
firefox21 + fixed
firefox-esr17 --- disabled

People

(Reporter: ekr, Assigned: abr)

References

Details

(Keywords: crash, sec-high, testcase, Whiteboard: [WebRTC][blocking-webrtc+][adv-main21-])

Attachments

(3 files, 2 obsolete files)

Attached file Testcase
The attached test case will cause the assert above to fire.

The problem appears to be that we create the media objects twice, once for each createAnswer. That's bad and eventually causes this crash:


bt
#0  mozilla::WebrtcAudioConduit::Init (this=0x1230cc480, other=0x1212dbf70) at AudioConduit.cpp:129
#1  0x00000001046172f0 in mozilla::AudioSessionConduit::Create (aOther=0x1212dbf70) at AudioConduit.cpp:36
#2  0x0000000104630bae in vcmRxStartICE_m (mcap_id=0, group_id=0, stream_id=13, level=1, pc_stream_id=0, pc_track_id=1, call_handle=65538, peerconnection=0x11b819984 "fafaa84ea1902493", num_payloads=3, payloads=0x11cdc61a0, fingerprint_alg=0x1067d3750 "sha-256", fingerprint=0x1067d375a "DF:69:78:20:8D:2E:08:CE:49:82:A3:11:79:1D:BF:B5:49:49:2D:32:82:2F:0D:88:84:A7:C6:63:23:63:A9:0F", attrs=0x1148ed868) at VcmSIPCCBinding.cpp:1317
#3  0x0000000104635be5 in mozilla::runnable_args_nm_13_ret<int (*)(unsigned short, unsigned short, unsigned short, int, int, int, unsigned int, char const*, int, vcm_payload_info_t const*, char const*, char const*, vcm_attrs_t_*), unsigned short, unsigned short, unsigned short, int, int, int, unsigned int, char const*, int, vcm_payload_info_t const*, char const*, char const*, vcm_attrs_t_*, int>::Run (this=0x134f42ee0) at runnable_utils_generated.h:1291
#4  0x00000001039620d2 in nsThreadSyncDispatch::Run (this=0x134f8d5a0) at /Users/ekr/dev/mozilla-inbound/xpcom/threads/nsThread.cpp:774
#5  0x00000001039617c4 in nsThread::ProcessNextEvent (this=0x10bd035c0, mayWait=false, result=0x7fff5fbfcfd3) at /Users/ekr/dev/mozilla-inbound/xpcom/threads/nsThread.cpp:627
#6  0x00000001038c9ad2 in NS_ProcessPendingEvents_P (thread=0x10bd035c0, timeout=20) at nsThreadUtils.cpp:188
#7  0x000000010312579f in nsBaseAppShell::NativeEventCallback (this=0x10bd2f460) at /Users/ekr/dev/mozilla-inbound/widget/xpwidgets/nsBaseAppShell.cpp:97
#8  0x00000001030b645c in nsAppShell::ProcessGeckoEvents (aInfo=0x10bd2f460) at /Users/ekr/dev/mozilla-inbound/widget/cocoa/nsAppShell.mm:387
#9  0x00007fff90c664f1 in __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ ()
#10 0x00007fff90c65d5d in __CFRunLoopDoSources0 ()
#11 0x00007fff90c8cb49 in __CFRunLoopRun ()
#12 0x00007fff90c8c486 in CFRunLoopRunSpecific ()
#13 0x00007fff955202bf in RunCurrentEventLoopInMode ()
#14 0x00007fff955274bf in ReceiveNextEventCommon ()
#15 0x00007fff955273fa in BlockUntilNextEventMatchingListInMode ()
#16 0x00007fff8acf0779 in _DPSNextEvent ()
#17 0x00007fff8acf007d in -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] ()
#18 0x00000001030b4cc7 in -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] (self=0x10bd32860, _cmd=0x7fff8b5c8458, mask=18446744073709551615, expiration=0x422d63c37f00000d, mode=0x7fff7ae3aae0, flag=1 '\001') at /Users/ekr/dev/mozilla-inbound/widget/cocoa/nsAppShell.mm:164
#19 0x00007fff8acec9b9 in -[NSApplication run] ()
#20 0x00000001030b6edc in nsAppShell::Run (this=0x10bd2f460) at /Users/ekr/dev/mozilla-inbound/widget/cocoa/nsAppShell.mm:741
#21 0x0000000102d1c027 in nsAppStartup::Run (this=0x10bd2f3d0) at /Users/ekr/dev/mozilla-inbound/toolkit/components/startup/nsAppStartup.cpp:288
#22 0x000000010118fa08 in XREMain::XRE_mainRun (this=0x7fff5fbfece8) at /Users/ekr/dev/mozilla-inbound/toolkit/xre/nsAppRunner.cpp:3832
#23 0x000000010119018b in XREMain::XRE_main (this=0x7fff5fbfece8, argc=4, argv=0x7fff5fbff588, aAppData=0x7fff5fbfef18) at /Users/ekr/dev/mozilla-inbound/toolkit/xre/nsAppRunner.cpp:3899
#24 0x00000001011905cf in XRE_main (argc=4, argv=0x7fff5fbff588, aAppData=0x7fff5fbfef18, aFlags=0) at /Users/ekr/dev/mozilla-inbound/toolkit/xre/nsAppRunner.cpp:4102
#25 0x0000000100002219 in do_main (argc=4, argv=0x7fff5fbff588, xreDirectory=0x10b9005c0) at /Users/ekr/dev/mozilla-inbound/browser/app/nsBrowserApp.cpp:185
#26 0x000000010000184c in main (argc=4, argv=0x7fff5fbff588) at /Users/ekr/dev/mozilla-inbound/browser/app/nsBrowserApp.cpp:377
(gdb) 

Marking this bug security sensitive since it seems to be associated with a crash in the SRTP code in non-debug mode.
Flags: in-testsuite?
Keywords: crash, testcase
Severity: normal → critical
Whiteboard: [WebRTC][blocking-webrtc+]
Assignee: nobody → ekr
Priority: -- → P2
Adam,

This looks to be a signaling issue. Can you take a look.
Assignee: ekr → adam
Blocks: 836379
Blocks: 796463
There is (sort of) code to handle this kind of thing, but it's not plumbed all the way through -- at least, not correctly. Figuring out what needs to change (and how it needs to change) looked like it would quickly develop into a research project. I've opened Bug 840728 to deal with the longer-term issue of changing the local description for an existing PC once it's already been created.

I'm going to fix this bug (and Bug 834038 by happenstance) by simply checking inside fsmdef_ev_createanswer and fsmdef_ev_createoffer to see whether the local SDP is already defined; and, if so, return that SDP without taking any further actions.

This change will prevent crashing and other downright untoward behavior, but still leaves us without the ability to renegotiate sessions or do things like createAnswer/adjust constraints/createAnswer. Luckily, the use cases for these kinds of things are still comparatively exotic.
This isn't ready to go, as I still need to add a mochi test, but it should allow progress to be made on Bug 836379.
Keywords: sec-high
Attachment #713267 - Attachment is obsolete: true
Comment on attachment 713500 [details] [diff] [review]
Prevent multiple creations of local SDP

Now with 100% more test.
Attachment #713500 - Flags: review?(ekr)
Comment on attachment 713500 [details] [diff] [review]
Prevent multiple creations of local SDP

Adding myself as a reviewer for the mochitest. I've got feedback coming on this.
Attachment #713500 - Flags: review?(jsmith)
Comment on attachment 713500 [details] [diff] [review]
Prevent multiple creations of local SDP

Review of attachment 713500 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm

::: media/webrtc/signaling/src/sipcc/core/gsm/fsmdef.c
@@ +2865,5 @@
>      char                *ufrag = NULL;
>      char                *ice_pwd = NULL;
>      short               vcm_res;
>      session_data_t      *sess_data_p = NULL;
> +    char                *local_sdp = 0;

= NULL here and below.

@@ +2883,5 @@
>      }
> +
> +    // For now, if the local SDP has been set, we don't allow it to be set
> +    // again. This will change when we allow renegotiation of ongoing
> +    // sessions. See bug 840728.

Suggest C-style comments.
Attachment #713500 - Flags: review?(ekr) → review+
Comment on attachment 713500 [details] [diff] [review]
Prevent multiple creations of local SDP

Review of attachment 713500 [details] [diff] [review]:
-----------------------------------------------------------------

Mainly giving an review- for the watchdog timer. Mochitest supports a global timeout already, so no need to include that. Other items are improvement areas for consideration - not necessarily blocking landing.

::: dom/media/tests/mochitest/test_peerConnection_bug840344.html
@@ +14,5 @@
> +  </meta>
> +</head>
> +<body>
> +<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=840344">
> +Bug 840344</a>

Maybe keep this on the same line as above?

@@ +18,5 @@
> +Bug 840344</a>
> +<p id="display"></p>
> +<pre id="test">
> +<script class="testbody" type="application/javascript">
> +runTest(function () {

Nit - the style for these tests has had "runTest" be indented once and have a blank line before and after the script tags.

@@ +22,5 @@
> +runTest(function () {
> +  var pc = new mozRTCPeerConnection();
> +  var answerCount = 0;
> +  var setLocalCount = 0;
> +  var offer = { sdp: "v=0\r\n"+

Lots of complexity here. Maybe add a comment explaining what's going on here?

@@ +58,5 @@
> +
> +  // If we've hosed a callback, we don't want to just hang.
> +  // Ten seconds should be more than enough for this case to run.
> +  info("Setting watchdog timer");
> +  var watchdog = setTimeout(function() {

Mochitest technically has a global timeout that will fail eventually. So a general timeout mechanism isn't totally necessary.

Although you could do what I did with my timeouts with gum tests to do a "wait for condition" mechanism for an event. But there's debate on that one as there's controversy on ever deciding to use setTimeout in a mochitest, I'll leave that to you to decide which mechanism to go with.

@@ +65,5 @@
> +  },10000);
> +
> +
> +  // Fifth: Once we have two successful rounds through here, declare success
> +  var setLocalSuccess = function() {

Hmm...I think we need a deeper check here if it's possible. For example, what happens if setLocalCount ends up somehow becoming greater than 2? We should fail in this case.

@@ +68,5 @@
> +  // Fifth: Once we have two successful rounds through here, declare success
> +  var setLocalSuccess = function() {
> +    setLocalCount++;
> +    info("Set local description #" + answerCount);
> +    if (setLocalCount == 2) {

Nit - use triple equals

@@ +78,5 @@
> +  }
> +
> +  // Fourth: Count the answers and push them into the local description
> +  var answerSuccess = function(answer) {
> +    answerCount++;

Why not do any checking here of this answerCount value for validity?

@@ +84,5 @@
> +    is(answer.type,'answer',"Answer is of type 'answer'");
> +    ok(answer.sdp.length > 10, "Answer has length " + answer.sdp.length);
> +    info("Step 4: Set local description");
> +    pc.setLocalDescription(answer,
> +                           function(x) { setLocalSuccess() },

Why not call setLocalSuccess and have setLocalSuccess take the parameter?

@@ +90,5 @@
> +  };
> +
> +  // Third: Attempt to create an answer. Twice.
> +  var setRemoteSuccess = function() {
> +    info("Step 3: Create answer (twice)");

Nit - add a second info call to be explicit which createAnswer call is being made

@@ +97,5 @@
> +  }
> +
> +  // Second: set the remote description
> +  var gumSuccess = function(x) {
> +    info("Step 2: Set Remote Description");

Nit - I'd split this into three info calls - one mentioning when the stream is being added, another for creating the session description, and last for making the callback.

This sounds like logging a lot, but it will help with debugging if this test ever fails.

@@ +103,5 @@
> +    var osd = new mozRTCSessionDescription(offer);
> +    pc.setRemoteDescription(osd, setRemoteSuccess, unexpectedCallbackAndFinish);
> +  };
> +
> +  info("Step 1: Get User Media");

Nit - I'd mention you are doing an audio/video call here.

@@ +105,5 @@
> +  };
> +
> +  info("Step 1: Get User Media");
> +  // First: Kick off the chain of events by asking for a mic and camera
> +  getUserMedia({audio:true, video:true},

Nit - reading this file requires a lot of jumping up and down to understand the path. Can you order the function calls in such a way that it's:

function firstcallback() {
   secondcallback();
}

function secondcallback() {
  thirdcallback();
}

function thirdcallback() {

}

firstcallback()
Attachment #713500 - Flags: review?(ekr)
Attachment #713500 - Flags: review-
Attachment #713500 - Flags: review+
Attachment #713500 - Flags: review?(jsmith)
I have no idea what bugzilla just did. Sorry for the conflict, ekr.
Comment on attachment 713500 [details] [diff] [review]
Prevent multiple creations of local SDP

Review of attachment 713500 [details] [diff] [review]:
-----------------------------------------------------------------

I'm modifying the patch as suggested, with one exception (noted below)

::: dom/media/tests/mochitest/test_peerConnection_bug840344.html
@@ +14,5 @@
> +  </meta>
> +</head>
> +<body>
> +<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=840344">
> +Bug 840344</a>

I'm trying to keep line lengths less than 80, for the reasons listed here :https://developer.mozilla.org/en-US/docs/Developer_Guide/Coding_Style#Line_Length

I recognize that mochi tests aren't subject to C++ style guidelines, but find 80-character lines easier to work with.

@@ +68,5 @@
> +  // Fifth: Once we have two successful rounds through here, declare success
> +  var setLocalSuccess = function() {
> +    setLocalCount++;
> +    info("Set local description #" + answerCount);
> +    if (setLocalCount == 2) {

I'm confused on this point. While Crockford advises to "always" use === in preference to == ("Javascript: The Good Parts", First Edition, p.109), the Mozilla guidelines (https://developer.mozilla.org/en-US/docs/Developer_Guide/Coding_Style#Operators) pretty clearly specify that == is preferred over ===. Perhaps a discussion in a wider forum is in order.

In any case, I'm inclined to side with you and Crockford over the code guidelines as they are currently written, so I'll update the patch accordingly.
Attachment #713500 - Attachment is obsolete: true
Attachment #713500 - Flags: review?(ekr)
Comment on attachment 713562 [details] [diff] [review]
Prevent multiple creations of local SDP

r=ekr; Asking for re-review from jsmith for mochi test changes.


>[Security approval request comment]
>How easily could an exploit be constructed based on the patch?

The mochi test, if run on an unpatched browser, will trigger a browser crash. It is not clear whether an exploit is possible based on this crash, as the exact cause of the crash in non-debug builds is still unknown (see Bug 836379).

> Do comments in the patch, the check-in comment, or tests included in the 
> patch paint a bulls-eye on the security problem?

The mochitest does not highlight the nature of the problem being addressed, although a causal read of the code it exercises could lead one to believe that "something goes wrong" if createAnswer is called twice.

> Which older supported branches are affected by this flaw?
18 and 19, but it's behind a pref that is off by default.

> If not all supported branches, which bug introduced the flaw?

This was introduced with the initial creation of this code as part of the WebRTC feature implementation.

> Do you have backports for the affected branches? If not, how different,
> hard to create, and risky will they be?

The patch should be easy to apply to previous branches.

> How likely is this patch to cause regressions; how much testing does it need?

The new code shouldn't be triggered except in exceptional behavior that would have previously caused browser crashes. Regressions are unlikely.
Attachment #713562 - Flags: sec-approval?
Attachment #713562 - Flags: review?(jsmith)
Comment on attachment 713562 [details] [diff] [review]
Prevent multiple creations of local SDP

Review of attachment 713562 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Attachment #713562 - Flags: review?(jsmith) → review+
Attachment #713562 - Flags: sec-approval? → sec-approval+
Comment on attachment 713628 [details] [diff] [review]
Disable renegotiation signaling_unittests

These three unit tests were using the (limited) renegotiation support, and fail with the other patch applied.
Attachment #713628 - Flags: review?(ekr)
And there's no way I'm landing my first patch on m-i without a try push: https://tbpl.mozilla.org/?tree=Try&rev=99ccd1059a52
Comment on attachment 713628 [details] [diff] [review]
Disable renegotiation signaling_unittests

Review of attachment 713628 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Attachment #713628 - Flags: review?(ekr) → review+
Okay, that try push seems to have burned the browserchrome tests on all the OS X and Linux builds. That seems like more than a coincidence. I'm holding off landing this until I understand what happened there.
This is likely someone else's bug - what rev are you based on?  

See something landed yesterday and then reverted in:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a20345bfe817
which did exactly what you seem to be seeing.

I don't think those oranges have anything to do with your push.
(In reply to Randell Jesup [:jesup] from comment #20)
> This is likely someone else's bug - what rev are you based on?  
> 
> See something landed yesterday and then reverted in:
> https://hg.mozilla.org/integration/mozilla-inbound/rev/a20345bfe817
> which did exactly what you seem to be seeing.
> 
> I don't think those oranges have anything to do with your push.

Good call.

Looking at the logs, it looks like that reversion landed after the base I used for my try (957ede94acc3). I'm re-trying the bc tests on Linux and OS X:

https://tbpl.mozilla.org/?tree=Try&rev=d7a102222c94
Attachment #713562 - Flags: checkin+
Attachment #713628 - Flags: checkin+
Well, this just blew up on inbound. See followup bug 841496.
Depends on: 841496
https://hg.mozilla.org/mozilla-central/rev/04a987a5327c
https://hg.mozilla.org/mozilla-central/rev/ce75b2d18c4c
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: in-testsuite? → in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
Verified automatically by the mochitest on check in.
Status: RESOLVED → VERIFIED
Whiteboard: [WebRTC][blocking-webrtc+] → [WebRTC][blocking-webrtc+][adv-main21-]
Group: core-security
You need to log in before you can comment on or make changes to this bug.