Closed Bug 1022945 Opened 5 years ago Closed 5 years ago

Shutdown crash with many mozGetUserMedia calls

Categories

(Core :: Audio/Video, defect, critical)

x86_64
Linux
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla34
Tracking Status
firefox31 --- wontfix
firefox32 --- verified
firefox33 + verified
firefox34 + verified
firefox-esr24 --- unaffected
firefox-esr31 32+ verified
b2g-v1.3 --- unaffected
b2g-v1.3T --- unaffected
b2g-v1.4 --- unaffected
b2g-v2.0 --- fixed
b2g-v2.1 --- fixed

People

(Reporter: jesup, Assigned: karlt)

References

(Blocks 1 open bug)

Details

(4 keywords, Whiteboard: [adv-main32+][adv-esr31.1+])

Attachments

(10 files)

+++ This bug was initially created as a clone of Bug #1002340 +++

1. Load the testcase in an ASan build of Firefox
2. Quit Firefox

On Nightly we (once again) have crashes in the MediaStreamGraph Resampler code.  I can repro on opt builds only; a noopt build never hits it.

I'll attach a stack.

Strongly suspect that the Graph is going away while we're still pushing audio into it.
Attached file asan
Attached file gdb2.out
More data from a catch; I'll try to symbolize the logs.  It's interesting that the mGraph seems reasonable (at Fire(), where I had access to the primary structures, I dumped: 
*this
*mTimerCallbackWhileFiring.mRawPtr
*mTimerCallbackWhileFiring.mRawPtr->mSource
*mTimerCallbackWhileFiring.mRawPtr->mSource->mGraph
which you'll see in this bug.

The MSG appears to be in 
  mLifecycleState = mozilla::MediaStreamGraphImpl::LIFECYCLE_WAITING_FOR_THREAD_SHUTDOWN, 
  mWaitState = mozilla::MediaStreamGraphImpl::WAITSTATE_RUNNING, 

Interestingly, the SourceMediaStream shows mDestroyed=true, and mFinished=false, and mMainThreadDestroyed=true

Roc, any thoughts on how this can happen?  The MediaEngineDefault calling AppendToTrack() is run off a Timer, and in the Stop() function Cancel()'s the timer (which per the dumps appears to not have happened yet), calls EndTrack() and Finish().  It's pretty clear Stop() hasn't run yet, though there may be an event queued to the MediaManager thread that will do so.
Flags: needinfo?(roc)
Looks like the main thread shut down the graph OK but the MediaEngineDefault is still running and doing stuff to the destroyed stream? Seems to me SourceMediaStream::AppendToTrack should have more mDestroyed checks. Though it's not clear to me why ASAN would catch an error inside GraphImpl(); the SourceMediaStream itself is still a valid object, is it not?

In any case, I think we shouldn't call GraphImpl() if mDestroyed is true. So we probably should hoist the mDestroyed check up from below.

BTW rr might be helpful here.
Flags: needinfo?(roc)
Assigning to Randell during Critsmash.

(When you clone bugs, please remove my advisory whiteboard tags...)
Assignee: nobody → rjesup
Whiteboard: [adv-main30+]
Any update on this, Randell?
Flags: needinfo?(rjesup)
Nothing so far.  Way busy on other items I'm afraid
Flags: needinfo?(rjesup)
Group: media-core-security
Maybe roc knows someone who has cycles to try comment 3?
Flags: needinfo?(roc)
Maybe Karl. At least trying the band-aid fix should be easy.
Flags: needinfo?(roc) → needinfo?(karlt)
Oh, sorry.  Somehow missed this was a security bug, having assumed needing to quit at the right time would mean it is not sec-critical.
> #2  0x00007fffe9e4b55c in GraphImpl (this=<optimized out>)
>    at /home/jesup/src/mozilla/asan-inbound/content/media/MediaStreamGraphImpl.h:395

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #3)
> Though it's not clear to me why ASAN would catch an error inside GraphImpl()

Yes, I don't know what this GraphImpl method in the global namespace is, but MediaStreamGraphImpl.h sounds more like MediaStreamGraphImpl::AudioSampleRate() as in attachment 8413570 [details].
(Line 395 corresponded with AudioSampleRate() in some revisions, but I don't know what revision generated the attached asan output.)
including ResampleAudioToGraphSampleRate() which assumes a graph.
Attachment #8462315 - Flags: review?(roc)
Assignee: rjesup → karlt
Status: NEW → ASSIGNED
The stream order is only important if there are connections, in which case
MediaInputPort::Disconnect() calls SetStreamOrderDirty().

MediaStreamGraphImpl::RemoveStream() also calls SetStreamOrderDirty(), which
ensures that mFirstCycleBreaker is updated when necessary.
Attachment #8462318 - Flags: review?(roc)
This shouldn't be necessary, if clients do the right things, but provides some
safety, in case they don't.
Attachment #8462320 - Flags: review?(roc)
and use mGraph to test whether SourceMediaStreams are destroyed.

This also makes the testcase fail (in the same way) after several seconds,
even without the quit, so it may be possible to checkin a worthwhile automated
test.
Attachment #8462324 - Flags: review?(roc)
Comment on attachment 8462318 [details] [diff] [review]
remove SetStreamOrderDirty call from ProcessedMediaStream::DestroyImpl

Review of attachment 8462318 [details] [diff] [review]:
-----------------------------------------------------------------

Add a comment here explaining why we don't need to call SetStreamOrderDirty.
Attachment #8462318 - Flags: review?(roc) → review+
Comment on attachment 8462321 [details] [diff] [review]
only AppendToTrack before SourceMediaStreams are destroyed

[Security approval request comment]
Please consider this a sec approval request for all code patches for this bug.

How easily could an exploit be constructed based on the patch?

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

It's clear from the patch what the problem is, but it is not clear how to exploit it.

The bug has been reported as an issue at shutdown, but AFAIU it could happen after garbage collection at any time if races run the right way.

Which older supported branches are affected by this flaw?
31 and subsequent.

If not all supported branches, which bug introduced the flaw?
bug 818822

Do you have backports for the affected branches? Yes.

How likely is this patch to cause regressions; how much testing does it need?

Possible regressions might be a null pointer deref from the nulled pointer in part 3, or unexpected source stream position accounting during shutdown due to the change in return value from AppendToTrack().  Multiple automated tests trigger this code, so the former is reasonably well tested, but the shutdown path is not.
Attachment #8462321 - Flags: sec-approval?
Attached patch mochitestSplinter Review
Based on Jesse's testcase.
For checkin after fixed on all branches.
Flags: in-testsuite?
sec-approval+ for patch. We won't want the test checked in until after we ship the fix. We should get Aurora and Beta nominations too once it is green on trunk.
Attachment #8462321 - Flags: sec-approval? → sec-approval+
Comment on attachment 8462321 [details] [diff] [review]
only AppendToTrack before SourceMediaStreams are destroyed

Approval Request Comment
Please consider this an approval request for all code patches for this bug.
[Feature/regressing bug #]: bug 818822
[User impact if declined]: sec-critical
[Describe test coverage new/current, TBPL]:
test not landed for security reasons.
[Risks and why]:
Possible regressions might be a null pointer deref from the nulled pointer in part 3, or unexpected source stream position accounting during shutdown due to the change in return value from AppendToTrack().  Multiple automated tests trigger this code, so the former is reasonably well tested, but the shutdown path is not.
[String/UUID change made/needed]: none.
Attachment #8462321 - Flags: approval-mozilla-beta?
Attachment #8462321 - Flags: approval-mozilla-aurora?
Attachment #8462321 - Flags: approval-mozilla-beta?
Attachment #8462321 - Flags: approval-mozilla-beta+
Attachment #8462321 - Flags: approval-mozilla-aurora?
Attachment #8462321 - Flags: approval-mozilla-aurora+
Please request esr31 approval on the 31 rollup patch.
Comment on attachment 8463226 [details] [diff] [review]
rollup for 31 branch

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: crit
User impact if declined: sec-crit
Fix Landed on Version: 32
Risk to taking this patch (and alternatives if risky): 
Possible regressions might be a null pointer deref from the nulled pointer in part 3, or unexpected source stream position accounting during shutdown due to the change in return value from AppendToTrack().  Multiple automated tests trigger this code, so the former is reasonably well tested, but the shutdown path is not.
String or UUID changes made by this patch: none.
Attachment #8463226 - Flags: approval-mozilla-esr31?
Flags: needinfo?(karlt)
Attachment #8463226 - Flags: approval-mozilla-esr31? → approval-mozilla-esr31+
Reproduced the original issue several times using the poc from bug #1002340 comment #0 using the following build:
- http://inbound-archive.pub.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux64-asan/1402064377/

fx 34.0a1: [PASSED]
- http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux64-asan/1409075491/

fx 33.0a2: [PASSED]
- http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-aurora-linux64-asan/1409066484/

fx 32.0: [PASSED]
- http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-beta-linux64-asan/1409002734/

fx esr31: [PASSED]
- Built from: http://hg.mozilla.org/releases/mozilla-esr31/rev/d736074cabbf

Configure arguments:

--enable-address-sanitizer --disable-jemalloc --disable-crashreporter --disable-elf-hack --enable-debug-symbols --disable-install-strip --enable-optimize --enable-debug

I went through all of the above test cases using the builds listed above:

- opened the poc from bug #1002340 comment #0 in 10 tabs/windows and waited several minutes
- opened the poc from bug #1002340 comment #0 in 10 private browser tabs/windows and waited several minutes
- opened the poc from bug #1002340 comment #0 in 10 e10s tabs/windows and waited several minutes (m-c only)
- opened the poc from bug #1002340 comment #0 in 2-3 tabs and quickly closed the browser
Whiteboard: [adv-main32+][adv-esr31.1+]
Group: media-core-security
Group: core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.