Closed
Bug 928221
Opened 11 years ago
Closed 11 years ago
mozRTCPeerConnection crash with GC/CC
Categories
(Core :: WebRTC, defect)
Core
WebRTC
Tracking
()
RESOLVED
FIXED
mozilla27
Tracking | Status | |
---|---|---|
firefox25 | --- | unaffected |
firefox26 | --- | unaffected |
firefox27 | + | fixed |
firefox-esr17 | --- | unaffected |
firefox-esr24 | --- | unaffected |
b2g18 | --- | unaffected |
b2g-v1.2 | --- | unaffected |
People
(Reporter: jruderman, Assigned: jib)
References
Details
(7 keywords, Whiteboard: [webrtc][fix for testcase 2 only])
Crash Data
Attachments
(6 files, 4 obsolete files)
1. Create a new profile with: user_pref("media.peerconnection.enabled", true); 2. Install https://www.squarefree.com/extensions/domFuzzLite3.xpi 3. Load the testcase
Reporter | ||
Comment 1•11 years ago
|
||
This testcase causes a null deref.
Reporter | ||
Comment 2•11 years ago
|
||
Reporter | ||
Comment 3•11 years ago
|
||
This testcase causes a use-after-free.
Reporter | ||
Comment 4•11 years ago
|
||
Reporter | ||
Comment 5•11 years ago
|
||
Hit MOZ_CRASH(PeerConnectionObserver not thread-safe) at ./PeerConnectionObserverBinding.cpp:2031
Updated•11 years ago
|
Keywords: csec-uaf,
sec-critical
Comment 6•11 years ago
|
||
(In reply to Jesse Ruderman from comment #1) > Created attachment 818821 [details] > testcase 1 > > This testcase causes a null deref. I've tried this repeatedly on an inbound debug build on Linux with no problems. Will try on mac
Comment 7•11 years ago
|
||
(In reply to Jesse Ruderman from comment #2) > Created attachment 818823 [details] > stack for testcase 1 (gdb) Note this null deref is while handling IceStateChange_m()
Comment 8•11 years ago
|
||
(In reply to Jesse Ruderman from comment #4) > Created attachment 818825 [details] > stacks for testcase 2 (ASan) This is crashing in mozilla::runnable_args_m_1<nsRefPtr<sipcc::PeerConnectionImpl>, tag_nsresult (sipcc::PeerConnectionImpl::*)(mozilla::dom::PCImplIceState), mozilla::dom::PCImplIceState>::Run() The object it's trying to reference was allocated in PCObserverBinding Appears related to ICE again. Both of these may be timing-related.
Comment 9•11 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #8) > (In reply to Jesse Ruderman from comment #4) > > Created attachment 818825 [details] > > stacks for testcase 2 (ASan) > > This is crashing in > mozilla::runnable_args_m_1<nsRefPtr<sipcc::PeerConnectionImpl>, tag_nsresult > (sipcc::PeerConnectionImpl::*)(mozilla::dom::PCImplIceState), > mozilla::dom::PCImplIceState>::Run() And this is also calling PeerConnectionImpl::IceStateChange_m() (per "stack for testcase 2 (mdsw)"), so these seem to be the same failure.
Updated•11 years ago
|
Whiteboard: [webrtc]
Comment 10•11 years ago
|
||
I have now been able to repro this on an inbound debug build (fresh pull) on Linux, with the just-landed bug 902003 patch (and it crashed dereferencing using mPCObserver.MayGet() (PeerConnectionImpl.cpp:1634), which was added as part of the webidl conversion, so I believe the invariant given for using a raw pointer here in PeerConnectionImpl.h:505 is wrong or incomplete). Jesse: what was the pull you were using? I suspect it didn't include bug 902003 (landed on inbound at 5:13pm PDT), but bug 902003 may make this problem easier to find - or I just got lucky this time. Before the WebIDL conversion monday, mPCObserver was an nsWeakPtr. It's now a WeakReminder wrapping a raw pointer If this is indeed due to the WebIDL conversion that landed Monday, then this is regression in 27. :-) Jib, please take point on investigating this.
Assignee | ||
Comment 11•11 years ago
|
||
I believe I know what the problem is and I'm working on a fix which I hope to have ready later today. I hope we can avoid backing stuff out as the changes are significant, other patches have landed that rely on it, and I know of several patch-queues that would be in rebase limbo. Relying on PeerconnectionImpl.close() to clear the raw pointer to the PCObserver was my mistake, as it gets called explicitly only when the page is torn down, and implicitly by ~PeerConnectionImpl. When content JS drops all references to the mozPeerConnection, nothing holds the PCObserver alive, so if you cycle-collect then the observer goes away before PeerConnectionImpl, because our runnables probably keep the latter alive longer. These runnables then hit the observer before ~PeerConnectionImpl has been called, and we get null-refs in the PeerConnectionObserverBindings. I'm working on a fix that will restore behavior similar to the weak-pointers we used to have here. I think I figured out a way to use them with concrete objects. That should bring us back to safety.
Updated•11 years ago
|
status-b2g18:
--- → unaffected
status-firefox25:
--- → unaffected
status-firefox26:
--- → unaffected
status-firefox27:
--- → affected
status-firefox-esr17:
--- → unaffected
status-firefox-esr24:
--- → unaffected
tracking-firefox27:
--- → ?
Keywords: regression
Assignee | ||
Comment 12•11 years ago
|
||
Added a new WeakConcretePtr class which uses an nsWeakPtr to guard a raw pointer, and uses that to detect observer being gone. This weak-ref part now behaves much more like things were before bug 917328, which seems safer - i.e. Compare to https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=917328&attachment=816677 I was able to reproduce, and this seems to fix it, but please re-test! Thanks.
Attachment #819070 -
Flags: review?(rjesup)
Assignee | ||
Comment 13•11 years ago
|
||
Hmm, except now signaling_unittest crashes :-( Probably because I comment out the meat of WeakConcretePtr for the unittests, making it a regular pointer. Will have to find a way to include it.
Comment 14•11 years ago
|
||
Comment on attachment 819070 [details] [diff] [review] nsWeakPtr on guard object detects PeerConnectionObserver going away Review of attachment 819070 [details] [diff] [review]: ----------------------------------------------------------------- ::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.h @@ +488,1 @@ > // We hold a raw pointer to PeerConnectionObserver (no WeakRefs to concretes!) This comment block needs to be updated.
Assignee | ||
Comment 15•11 years ago
|
||
The signaling_unittest crashes appeared to be unrelated, as they happen on my system even before this patch. - Regardless, I had already gone ahead and updated the patch to use WeakPtrs in unittest too, for less difference. Also fixed some nits and renamed the class from abr's suggestion. Try from first version looks green - https://tbpl.mozilla.org/?tree=Try&rev=3c54ba413bda
Attachment #819070 -
Attachment is obsolete: true
Attachment #819070 -
Flags: review?(rjesup)
Attachment #819169 -
Flags: review?(rjesup)
Attachment #819169 -
Flags: review?(adam)
Assignee | ||
Comment 16•11 years ago
|
||
- Added a backpointer in the pc._guard object to prevent pc from going away. - Updated comment.
Attachment #819169 -
Attachment is obsolete: true
Attachment #819169 -
Flags: review?(rjesup)
Attachment #819169 -
Flags: review?(adam)
Attachment #819206 -
Flags: review?(rjesup)
Attachment #819206 -
Flags: review?(adam)
Assignee | ||
Comment 17•11 years ago
|
||
I meant pco._guard and pco.
Updated•11 years ago
|
Attachment #819206 -
Flags: review?(rjesup) → review+
Comment 18•11 years ago
|
||
Comment on attachment 819206 [details] [diff] [review] nsWeakPtr on guard object detects PeerConnectionObserver going away (3) Review of attachment 819206 [details] [diff] [review]: ----------------------------------------------------------------- Looks good to me.
Attachment #819206 -
Flags: review?(adam) → review+
Assignee | ||
Comment 19•11 years ago
|
||
Try looks good - https://tbpl.mozilla.org/?tree=Try&rev=e1ffa8dc134b
Keywords: checkin-needed
Comment 20•11 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/dc2b71e57211
Target Milestone: --- → mozilla27
Comment 21•11 years ago
|
||
I backed this out because it calls a GetWeakReferent function which doesn't exist anywhere: https://tbpl.mozilla.org/php/getParsedLog.php?id=29366148&tree=Mozilla-Inbound https://hg.mozilla.org/integration/mozilla-inbound/rev/77f70432fdcd
Assignee | ||
Comment 22•11 years ago
|
||
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #21) > I backed this out because it calls a GetWeakReferent function which doesn't > exist anywhere: Thanks, GetWeakReferent is a new webidl attribute in the patch itself and it compiles locally and worked on try in Comment 19, but I didn't alter the contract uuid of the PeerConnectionObserver interface. Maybe that's the problem? I'm trying that here. If this doesn't work, it might need a clobber due to bug 928195 again.
Attachment #819206 -
Attachment is obsolete: true
Attachment #819390 -
Flags: review+
Assignee | ||
Comment 23•11 years ago
|
||
Same patch, just making clear in description that it has been reviewed. Carrying forward r+ from abr and jesup.
Attachment #819390 -
Attachment is obsolete: true
Attachment #819391 -
Flags: review+
Assignee | ||
Comment 24•11 years ago
|
||
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #21) > I backed this out because it calls a GetWeakReferent function which doesn't > exist anywhere: > > https://tbpl.mozilla.org/php/getParsedLog.php?id=29366148&tree=Mozilla- > Inbound > https://hg.mozilla.org/integration/mozilla-inbound/rev/77f70432fdcd
Assignee | ||
Comment 25•11 years ago
|
||
(Sorry, I had leftover crud in the comment field)
Comment 26•11 years ago
|
||
Relanded with CLOBBER in https://hg.mozilla.org/integration/mozilla-inbound/rev/cc9ce7c8ab99 build system messed up webidl dependencies again (previous push and two following were green, then one went red and ehsan backed it out).
Comment 27•11 years ago
|
||
If the CID needs to up updated when the webidl changes, then we should revise and reland. (I think I'd flagged on a review - perhaps bug 902003 - that you should check if CIDs need to be revised if the webidl changes - I don't think so, but perhaps that's wrong.)
Comment 28•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/dc2b71e57211
Updated•11 years ago
|
Comment 29•11 years ago
|
||
Nightly build 20131024030204 crashes in testcase 1 see more crash reports here: https://crash-stats.mozilla.com/report/list?product=Firefox&signature=mozilla%3A%3Adom%3A%3ACallbackObject%3A%3ACallSetup%3A%3ACallSetup%28JS%3A%3AHandle%3CJSObject*%3E%2C+mozilla%3A%3AErrorResult%26%2C+mozilla%3A%3Adom%3A%3ACallbackObject%3A%3AExceptionHandling%2C+JSCompartment*%29
Status: RESOLVED → REOPENED
OS: Mac OS X → All
Hardware: x86_64 → All
Resolution: FIXED → ---
Comment 30•11 years ago
|
||
jesse, I tried to run testcase 2 under ASan and I get: JavaScript error: https://bug928221.bugzilla.mozilla.org/attachment.cgi?id=818821&t=xAWnbBNhMv, line 20: fuzzPriv is not defined Can you advise how to run?
Comment 31•11 years ago
|
||
You have to install the domFuzzLite addon. See step 2 of comment 0.
Comment 32•11 years ago
|
||
Just so we're clear here, this bug report appears to deal with two different bugs that can be evoked using somewhat similar tests. But the stacks are very different and don't seem to bear any relation to each other. The patch that we landed seems to be working for "testcase 2." But there's a different flaw here, and it's evoked by "testcase 1." Given the rather substantial differences in the stacks, I'm pretty sure these should have been filed as two different bug numbers.
Comment 33•11 years ago
|
||
I filed bug 928221 for testcase 1.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Whiteboard: [webrtc] → [webrtc][fix for testcase 2 only]
Assignee | ||
Comment 34•11 years ago
|
||
You mean Bug 933447
Reporter | ||
Updated•11 years ago
|
Flags: needinfo?(jruderman)
Comment 35•11 years ago
|
||
:jib says this is a regression from bug 882145 as is bug 933447, which means Firefox 26 should be affected as well. The symptoms in this bug is more troubling than those Jesse saw in testcase1, and the patch is more revealing. We probably want to get this in on Beta as well. And if we take this one maybe we should take bug 933447 as well.
Comment 36•11 years ago
|
||
Even though this affects Firefox 26 on Desktop/Android, PeerConnection is disabled on b2g1.2 so we don't have to worry about that long-lived branch.
status-b2g-v1.2:
--- → disabled
Assignee | ||
Comment 37•11 years ago
|
||
(In reply to Daniel Veditz [:dveditz] from comment #35) > :jib says this is a regression from bug 882145 as is bug 933447, which means > Firefox 26 should be affected as well. My mistake. Regressions are from Bug 917328 which landed 10-14. I confused the bugs. Everything is fine here. Sorry about the false alarm.
Updated•11 years ago
|
tracking-firefox26:
? → ---
No longer depends on: 902003
Comment 38•11 years ago
|
||
I can't reproduce any crash with either test case with builds of FF27 from 2013-10-27 and before. Jesse, if you can verify that your tests no longer crash, I'd appreciate it. Thanks.
Reporter | ||
Comment 39•11 years ago
|
||
I noticed an odd assertion on mozilla-central and filed bug 944854.
Updated•9 years ago
|
Group: core-security
You need to log in
before you can comment on or make changes to this bug.
Description
•