mozRTCPeerConnection crash with GC/CC

RESOLVED FIXED in Firefox 27

Status

()

Core
WebRTC
--
critical
RESOLVED FIXED
5 years ago
3 years ago

People

(Reporter: Jesse Ruderman, Assigned: jib)

Tracking

(7 keywords)

Trunk
mozilla27
assertion, crash, csectype-uaf, regression, sec-critical, testcase, verifyme
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox25 unaffected, firefox26 unaffected, firefox27+ fixed, firefox-esr17 unaffected, firefox-esr24 unaffected, b2g18 unaffected, b2g-v1.2 unaffected)

Details

(Whiteboard: [webrtc][fix for testcase 2 only], crash signature)

Attachments

(6 attachments, 4 obsolete attachments)

(Reporter)

Description

5 years ago
1. Create a new profile with:
     user_pref("media.peerconnection.enabled", true);
2. Install https://www.squarefree.com/extensions/domFuzzLite3.xpi
3. Load the testcase
(Reporter)

Comment 1

5 years ago
Created attachment 818821 [details]
testcase 1

This testcase causes a null deref.
(Reporter)

Comment 2

5 years ago
Created attachment 818823 [details]
stack for testcase 1 (gdb)
(Reporter)

Comment 3

5 years ago
Created attachment 818824 [details]
testcase 2

This testcase causes a use-after-free.
(Reporter)

Comment 4

5 years ago
Created attachment 818825 [details]
stacks for testcase 2 (ASan)
(Reporter)

Comment 5

5 years ago
Created attachment 818826 [details]
stack for testcase 2 (mdsw)

Hit MOZ_CRASH(PeerConnectionObserver not thread-safe) at ./PeerConnectionObserverBinding.cpp:2031
Keywords: csec-uaf, sec-critical
(In reply to Jesse Ruderman from comment #1)
> Created attachment 818821 [details]
> testcase 1
> 
> This testcase causes a null deref.

I've tried this repeatedly on an inbound debug build on Linux with no problems.  Will try on mac
(In reply to Jesse Ruderman from comment #2)
> Created attachment 818823 [details]
> stack for testcase 1 (gdb)

Note this null deref is while handling IceStateChange_m()
(In reply to Jesse Ruderman from comment #4)
> Created attachment 818825 [details]
> stacks for testcase 2 (ASan)

This is crashing in mozilla::runnable_args_m_1<nsRefPtr<sipcc::PeerConnectionImpl>, tag_nsresult (sipcc::PeerConnectionImpl::*)(mozilla::dom::PCImplIceState), mozilla::dom::PCImplIceState>::Run()

The object it's trying to reference was allocated in PCObserverBinding

Appears related to ICE again.  Both of these may be timing-related.
(In reply to Randell Jesup [:jesup] from comment #8)
> (In reply to Jesse Ruderman from comment #4)
> > Created attachment 818825 [details]
> > stacks for testcase 2 (ASan)
> 
> This is crashing in
> mozilla::runnable_args_m_1<nsRefPtr<sipcc::PeerConnectionImpl>, tag_nsresult
> (sipcc::PeerConnectionImpl::*)(mozilla::dom::PCImplIceState),
> mozilla::dom::PCImplIceState>::Run()

And this is also calling PeerConnectionImpl::IceStateChange_m()  (per  "stack for testcase 2 (mdsw)"), so these seem to be the same failure.

Updated

5 years ago
Whiteboard: [webrtc]
I have now been able to repro this on an inbound debug build (fresh pull) on Linux, with the just-landed bug 902003 patch (and it crashed dereferencing using mPCObserver.MayGet() (PeerConnectionImpl.cpp:1634), which was added as part of the webidl conversion, so I believe the invariant given for using a raw pointer here in PeerConnectionImpl.h:505 is wrong or incomplete).

Jesse: what was the pull you were using?  I suspect it didn't include bug 902003 (landed on inbound at 5:13pm PDT), but bug 902003 may make this problem easier to find - or I just got lucky this time.

Before the WebIDL conversion monday, mPCObserver was an nsWeakPtr.  It's now a WeakReminder wrapping a raw pointer

If this is indeed due to the WebIDL conversion that landed Monday, then this is regression in 27.  :-)

Jib, please take point on investigating this.
Assignee: nobody → jib
Depends on: 902003
Flags: needinfo?(jruderman)
I believe I know what the problem is and I'm working on a fix which I hope to have ready later today. I hope we can avoid backing stuff out as the changes are significant, other patches have landed that rely on it, and I know of several patch-queues that would be in rebase limbo.

Relying on PeerconnectionImpl.close() to clear the raw pointer to the PCObserver was my mistake, as it gets called explicitly only when the page is torn down, and implicitly by ~PeerConnectionImpl.

When content JS drops all references to the mozPeerConnection, nothing holds the PCObserver alive, so if you cycle-collect then the observer goes away before PeerConnectionImpl, because our runnables probably keep the latter alive longer. These runnables then hit the observer before ~PeerConnectionImpl has been called, and we  get null-refs in the PeerConnectionObserverBindings.

I'm working on a fix that will restore behavior similar to the weak-pointers we used to have here. I think I figured out a way to use them with concrete objects. That should bring us back to safety.
status-b2g18: --- → unaffected
status-firefox25: --- → unaffected
status-firefox26: --- → unaffected
status-firefox27: --- → affected
status-firefox-esr17: --- → unaffected
status-firefox-esr24: --- → unaffected
tracking-firefox27: --- → ?
Keywords: regression
Created attachment 819070 [details] [diff] [review]
nsWeakPtr on guard object detects PeerConnectionObserver going away

Added a new WeakConcretePtr class which uses an nsWeakPtr to guard a raw pointer, and uses that to detect observer being gone.

This weak-ref part now behaves much more like things were before bug 917328, which seems safer - i.e. Compare to https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=917328&attachment=816677

I was able to reproduce, and this seems to fix it, but please re-test! Thanks.
Attachment #819070 - Flags: review?(rjesup)
Hmm, except now signaling_unittest crashes :-( Probably because I comment out the meat of WeakConcretePtr for the unittests, making it a regular pointer. Will have to find a way to include it.

Comment 14

5 years ago
Comment on attachment 819070 [details] [diff] [review]
nsWeakPtr on guard object detects PeerConnectionObserver going away

Review of attachment 819070 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.h
@@ +488,1 @@
>    // We hold a raw pointer to PeerConnectionObserver (no WeakRefs to concretes!)

This comment block needs to be updated.
Created attachment 819169 [details] [diff] [review]
nsWeakPtr on guard object detects PeerConnectionObserver going away (2)

The signaling_unittest crashes appeared to be unrelated, as they happen on my system even before this patch. - Regardless, I had already gone ahead and updated the patch to use WeakPtrs in unittest too, for less difference.

Also fixed some nits and renamed the class from abr's suggestion.

Try from first version looks green - https://tbpl.mozilla.org/?tree=Try&rev=3c54ba413bda
Attachment #819070 - Attachment is obsolete: true
Attachment #819070 - Flags: review?(rjesup)
Attachment #819169 - Flags: review?(rjesup)
Attachment #819169 - Flags: review?(adam)
Created attachment 819206 [details] [diff] [review]
nsWeakPtr on guard object detects PeerConnectionObserver going away (3)

- Added a backpointer in the pc._guard object to prevent pc from going away.
- Updated comment.
Attachment #819169 - Attachment is obsolete: true
Attachment #819169 - Flags: review?(rjesup)
Attachment #819169 - Flags: review?(adam)
Attachment #819206 - Flags: review?(rjesup)
Attachment #819206 - Flags: review?(adam)
I meant pco._guard and pco.

Updated

5 years ago
Attachment #819206 - Flags: review?(rjesup) → review+

Comment 18

5 years ago
Comment on attachment 819206 [details] [diff] [review]
nsWeakPtr on guard object detects PeerConnectionObserver going away (3)

Review of attachment 819206 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me.
Attachment #819206 - Flags: review?(adam) → review+
Try looks good - https://tbpl.mozilla.org/?tree=Try&rev=e1ffa8dc134b
Keywords: checkin-needed

Comment 21

5 years ago
I backed this out because it calls a GetWeakReferent function which doesn't exist anywhere:

https://tbpl.mozilla.org/php/getParsedLog.php?id=29366148&tree=Mozilla-Inbound
https://hg.mozilla.org/integration/mozilla-inbound/rev/77f70432fdcd
Created attachment 819390 [details] [diff] [review]
nsWeakPtr on guard object detects PeerConnectionObserver going away (4)

(In reply to :Ehsan Akhgari (needinfo? me!) from comment #21)
> I backed this out because it calls a GetWeakReferent function which doesn't
> exist anywhere:

Thanks, GetWeakReferent is a new webidl attribute in the patch itself and it compiles locally and worked on try in Comment 19, but I didn't alter the contract uuid of the PeerConnectionObserver interface. Maybe that's the problem? I'm trying that here.

If this doesn't work, it might need a clobber due to bug 928195 again.
Attachment #819206 - Attachment is obsolete: true
Attachment #819390 - Flags: review+
Created attachment 819391 [details] [diff] [review]
nsWeakPtr on guard object detects PeerConnectionObserver going away (4). r=abr, jesup

Same patch, just making clear in description that it has been reviewed.
Carrying forward r+ from abr and jesup.
Attachment #819390 - Attachment is obsolete: true
Attachment #819391 - Flags: review+
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #21)
> I backed this out because it calls a GetWeakReferent function which doesn't
> exist anywhere:
> 
> https://tbpl.mozilla.org/php/getParsedLog.php?id=29366148&tree=Mozilla-
> Inbound
> https://hg.mozilla.org/integration/mozilla-inbound/rev/77f70432fdcd
(Sorry, I had leftover crud in the comment field)
Relanded with CLOBBER in https://hg.mozilla.org/integration/mozilla-inbound/rev/cc9ce7c8ab99

build system messed up webidl dependencies again (previous push and two following were green, then one went red and ehsan backed it out).
If the CID needs to up updated when the webidl changes, then we should revise and reland. 

(I think I'd flagged on a review - perhaps bug 902003 - that you should check if CIDs need to be revised if the webidl changes - I don't think so, but perhaps that's wrong.)
https://hg.mozilla.org/mozilla-central/rev/dc2b71e57211
Status: NEW → RESOLVED
Last Resolved: 5 years ago
status-firefox27: affected → fixed
Keywords: checkin-needed
Resolution: --- → FIXED

Updated

5 years ago
tracking-firefox27: ? → +

Updated

5 years ago
Keywords: verifyme
jesse,

I tried to run testcase 2 under ASan and I get:

JavaScript error: https://bug928221.bugzilla.mozilla.org/attachment.cgi?id=818821&t=xAWnbBNhMv, line 20: fuzzPriv is not defined

Can you advise how to run?
You have to install the domFuzzLite addon.  See step 2 of comment 0.

Comment 32

5 years ago
Just so we're clear here, this bug report appears to deal with two different bugs that can be evoked using somewhat similar tests. But the stacks are very different and don't seem to bear any relation to each other.

The patch that we landed seems to be working for "testcase 2." But there's a different flaw here, and it's evoked by "testcase 1."

Given the rather substantial differences in the stacks, I'm pretty sure these should have been filed as two different bug numbers.
I filed bug 928221 for testcase 1.
Status: REOPENED → RESOLVED
Last Resolved: 5 years ago5 years ago
Resolution: --- → FIXED
Whiteboard: [webrtc] → [webrtc][fix for testcase 2 only]
You mean Bug 933447
(Reporter)

Updated

5 years ago
Flags: needinfo?(jruderman)
:jib says this is a regression from bug 882145 as is bug 933447, which means Firefox 26 should be affected as well. The symptoms in this bug is more troubling than those Jesse saw in testcase1, and the patch is more revealing. We probably want to get this in on Beta as well. And if we take this one maybe we should take bug 933447 as well.
Blocks: 882145
status-firefox26: unaffected → affected
tracking-firefox26: --- → ?
Even though this affects Firefox 26 on Desktop/Android, PeerConnection is disabled on b2g1.2 so we don't have to worry about that long-lived branch.
status-b2g-v1.2: --- → disabled
(In reply to Daniel Veditz [:dveditz] from comment #35)
> :jib says this is a regression from bug 882145 as is bug 933447, which means
> Firefox 26 should be affected as well.

My mistake. Regressions are from Bug 917328 which landed 10-14. I confused the bugs.

Everything is fine here. Sorry about the false alarm.
Blocks: 917328
No longer blocks: 882145
status-b2g-v1.2: disabled → unaffected
status-firefox26: affected → unaffected
tracking-firefox26: ? → ---
No longer depends on: 902003
I can't reproduce any crash with either test case with builds of FF27 from 2013-10-27 and before.

Jesse, if you can verify that your tests no longer crash, I'd appreciate it. Thanks.
(Reporter)

Comment 39

5 years ago
I noticed an odd assertion on mozilla-central and filed bug 944854.
Group: core-security
You need to log in before you can comment on or make changes to this bug.