Closed Bug 801227 Opened 12 years ago Closed 12 years ago

WebRTC crash [@mozilla::MediaManager::GetUserMedia]

Categories

(Core :: WebRTC, defect, P2)

x86_64
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla19
Tracking Status
firefox17 --- unaffected
firefox18 --- disabled
firefox19 + fixed
firefox-esr10 --- unaffected
firefox-esr17 --- unaffected

People

(Reporter: posidron, Assigned: jesup)

References

Details

(4 keywords, Whiteboard: [asan][WebRTC][blocking-webrtc+][qa-][adv-main19-])

Crash Data

Attachments

(4 files, 4 obsolete files)

Attached file testcase (obsolete) —
This testcase simulates an abrupt abort during the setup of a valid WebRTC session. The result is a use after free.

alloc
mozilla::MediaManager::GetUserMedia mozalloc.h:200

free
mozilla::MediaManager::OnNavigation MediaManager.cpp:827

reuse
mozilla::GetUserMediaStreamRunnable::Run MediaManager.cpp:257
Attached file callstack
Adding anant, author of this code
Whiteboard: [asan] → [asan][fuzzblocker]
Try run is ongoing at https://tbpl.mozilla.org/?tree=Try&rev=caf3a0111ea0

Once we have an ASAN build from that run, we can verify that it fixes this issue.
Assignee: nobody → anant
This is btw a regression. My debug build from Oct 12th didn't crash while a more recent one does.
Status: NEW → ASSIGNED
Regression window wanted isn't warranted here - we've already got a test case that reproduces the issue consistently and a fix has already put out in that try build to see if it fixes the issue.
Attached patch crashtest v1 (obsolete) — Splinter Review
This is the reduced crashtest for the attached testcase. Please do not check-in before the actual code lands. It's also based on my patch on bug 791278 to not cause broken hunks.
Attachment #672321 - Flags: review?(rjesup)
Attached patch crashtest v1 (obsolete) — Splinter Review
Attachment #672321 - Attachment is obsolete: true
Attachment #672321 - Flags: review?(rjesup)
Attachment #672323 - Flags: review?(rjesup)
Attachment #672323 - Flags: review?(rjesup) → review+
(In reply to Jason Smith [:jsmith] from comment #5)
> Regression window wanted isn't warranted here - we've already got a test
> case that reproduces the issue consistently and a fix has already put out in
> that try build to see if it fixes the issue.

This is not a justification that we are not interested in the regressing bug. It's always helpful to know which changeset has caused the regression. Only that way we can explicitly test the code path.
Priority: -- → P2
Whiteboard: [asan][fuzzblocker] → [asan][fuzzblocker][WebRTC][blocking-webrtc+]
Attached patch Patch v1 (obsolete) — Splinter Review
Is it possible to fetch a ASAN build from that try run to see if it fixes the issue? I'm uploading the patch in any case, so you can also run a local build.
Attachment #672556 - Flags: review?(rjesup)
(In reply to Henrik Skupin (:whimboo) from comment #8)
> (In reply to Jason Smith [:jsmith] from comment #5)
> > Regression window wanted isn't warranted here - we've already got a test
> > case that reproduces the issue consistently and a fix has already put out in
> > that try build to see if it fixes the issue.
> 
> This is not a justification that we are not interested in the regressing
> bug. It's always helpful to know which changeset has caused the regression.
> Only that way we can explicitly test the code path.

No. Usually we use these keywords as an informational element to get enough information to solve the problem. We can't look at every single bug (as there's too many against our existing resources), so we can't be perfect with having every little detail, but if there's enough information to solve the problem, then that's acceptable. Abusing the flag though is going to do nothing but cause confusion between where the regression window is actually warranted (e.g. cases where root cause is not clear and foggy) vs. others where there's enough clarity.
Does this affect FF 18?
Yes, it does even with a bit different stack: 

bp-75c780f4-4225-4964-afb7-2fcd22121019

0 	XUL 	mozilla::MediaPipelineReceiveAudio::Init 	nsDOMMediaStream.h:37
1 	XUL 	mozilla::MediaPipelineReceiveAudio::MediaPipelineReceiveAudio 	MediaPipeline.h:265
2 	XUL 	vcmRxStartICE 	MediaPipeline.h:266
3 	XUL 	lsm_rx_start 	lsm.c:1042
4 	XUL 	lsm_update_media 	lsm.c:3905
5 	XUL 	cc_call_state 	lsm.c:3951
6 	XUL 	fsmdef_ev_setlocaldesc 	fsmdef.c:3160
7 	XUL 	sm_process_event 	sm.c:83
8 	XUL 	fim_process_event 	fim.c:671
9 	XUL 	gsm_process_msg 	gsm.c:167
10 	XUL 	GSMTask 	gsm.c:359
11 	libsystem_c.dylib 	libsystem_c.dylib@0x4e8be 	
12 	libsystem_c.dylib 	libsystem_c.dylib@0x51b74 	
13 	XUL 	XUL@0x14c9aff
Crash Signature: [@ mozilla::MediaManager::GetUserMedia] [@ mozilla::MediaPipelineReceiveAudio::Init]
Attachment #672556 - Flags: review?(rjesup) → review+
David, PeerConnection functionality is available in FF 18 but is behind a pref, so it will affect only a small set of developers who have the pref turned on. We would still like to uplift this to Aurora though.
Is the patch ready to land? If yes, we should also include the crashtest.
Let's refresh this patch (other changes have landed), reverify the problem exists, and ask for sec-approval (since apparently the bug existed, even if in a slightly different form, in 18)
Anant are you able to refresh per comment 16?
Flags: needinfo?(anant)
Attachment #672556 - Attachment is obsolete: true
Attachment #671022 - Attachment is obsolete: true
Comment on attachment 680452 [details] [diff] [review]
Patch v2 (unbitrotted)

Since I had to significantly un-bitrot it, r? to anant
Attachment #680452 - Flags: review?(anant)
We need someone to check this with an ASAN build, and also verify that the warnings we get mean it's a valid test or not: 
************************************************************
* Call to xpconnect wrapped JSObject produced this error:  *
[Exception... "'[JavaScript Error: "this._dompc._pc is null" {file: "file:///home/jesup/src/mozilla/inbound/obj-x86_64-unknown-linux-gnu-debug/dist/bin/components/PeerConnection.js" line: 612}]' when calling method: [IPeerConnectionObserver::onStateChange]"  nsresult: "0x80570021 (NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS)"  location: "native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0"  data: yes]
************************************************************
Flags: needinfo?(anant)
Attachment #680452 - Flags: review?(anant) → review+
Henrik, can you verify if this covers your issues?  I'll push a try build later
Henrik, can you verify if this covers your issues?  I'll push a try build later
The warnings are probably expected, as once the inner window is destroyed pc_ is set to null.
Looks like the try build fixes the problem. I can't get my crashtest to crash Firefox anymore.
henrik: just to verify, did you use the updated testcase?  There was an API change to createAnswer() that caused the previous test to no longer do what it had been doing.

In any case, I think it's good to commit this given your results
No, I haven't used this updated testcase. Could you transform it into a crashtest? It would have been more obvious for me if you had replaced my created crashtest.
The crashtest is not affected by the API change; the testcase was.   So your test was valid.
The updated testcase still crashes Minefield after a couple of seconds:

Report: bp-77daa719-5d51-48af-84e8-e40402121112

Could be something different.
Comment on attachment 680452 [details] [diff] [review]
Patch v2 (unbitrotted)

[Security approval request comment]
How easily can the security issue be deduced from the patch?
  Not especially easily.  That there's exposure of the listeners list is visible, however, and they may deduce that's what to look at.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
  Yes, as it clearly shows the use-after-free

Which older supported branches are affected by this flaw?
  I believe 18, 17, 16, esr17, though the code has been changed a moderate amount surrounding this.  Note again that GetUserMedia is preffed off in all these releases.  I suggest it be turned off at config level for esr17 as a precaution and since it doesn't add anything to an ESR release (if it has not been turned off already).

If not all supported branches, which bug introduced the flaw?
   Bug 752353 (initial DOM checkin, not built be default) and bug 691234 (enable gUM DOM on Desktop).

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
   No, we don't have backports, but they should be fairly easy to produce with some changes to the GetActiveWindows portion of the patch (either use more of the pre-bitrot version, or port more of the updated code to encapsulate the ActiveWindows code behind methods).

How likely is this patch to cause regressions; how much testing does it need?
   Unlikely to cause regressions, as it simply moves checks of window liveness to when it's used instead of when the runnable is created.  Testing would be basic functionality.
Attachment #680452 - Flags: sec-approval?
Comment on attachment 680452 [details] [diff] [review]
Patch v2 (unbitrotted)

sec-approval+
Attachment #680452 - Flags: sec-approval? → sec-approval+
(In reply to Henrik Skupin (:whimboo) from comment #31)
> The updated testcase still crashes Minefield after a couple of seconds:
> 
> Report: bp-77daa719-5d51-48af-84e8-e40402121112
> 
> Could be something different.

The stack appears to be useless, however, the crash is likely a security-safe null-ptr deref (0x000000d), so it's not *likely* to be the same bug.  Any chance of reproducing with real symbols?  (gdb?)  Given the crash appearing to be a null-ptr deref, I think we're still safe to land on m-c.

I cannot get the updated testcase to crash for me on Linux (debug), loaded from a local file.  What build did you test from the try run, exactly?
https://hg.mozilla.org/integration/mozilla-inbound/rev/5d28022bc22a
Assignee: anant → rjesup
Target Milestone: --- → mozilla19
https://hg.mozilla.org/mozilla-central/rev/5d28022bc22a
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Whiteboard: [asan][fuzzblocker][WebRTC][blocking-webrtc+] → [asan][fuzzblocker][WebRTC][blocking-webrtc+][qa-]
Crash test is here, but an hg link showing it checked in. Do we need to check it in or is there no link here?
Flags: in-testsuite?
I will do that once I have verified that this patch fixed the problem and the crashtest is valid.
Comment on attachment 672323 [details] [diff] [review]
crashtest v1

This crashtest will not work because it's spinning an infinite loop. I most likely have to implement it inside an iframe so we only reload it once. I will have an updated version tomorrow.
Attachment #672323 - Attachment is obsolete: true
So the iframe method doesn't work because we end-up in another crash. See bug 812122. I will try to use localStorage for it, so we reload the page itself and not a sub page which seems to be the issue of the new bug.
Attached patch crashtest v2Splinter Review
Updated crashtest which makes use of the localStorage. I have pushed it to try:
https://tbpl.mozilla.org/?tree=Try&rev=7234c950ce7d
Attachment #681920 - Flags: review?(rjesup)
Comment on attachment 681920 [details] [diff] [review]
crashtest v2

Also verified it crashes without the fix, and doesn't with the fix (linux64 debug)
Attachment #681920 - Flags: review?(rjesup) → review+
Whiteboard: [asan][fuzzblocker][WebRTC][blocking-webrtc+][qa-] → [asan][WebRTC][blocking-webrtc+][qa-]
Whiteboard: [asan][WebRTC][blocking-webrtc+][qa-] → [asan][WebRTC][blocking-webrtc+][qa-][adv-main19-]
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: