TrackBuffer::ResetDecode() reads mDecoders array on decode task queue without holding mutex

RESOLVED FIXED in Firefox 37

Status

()

defect
P1
normal
RESOLVED FIXED
5 years ago
3 years ago

People

(Reporter: karlt, Assigned: karlt)

Tracking

(Blocks 1 bug, {csectype-uaf, sec-high})

36 Branch
mozilla39
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite -

Firefox Tracking Flags

(firefox36 disabled, firefox37+ fixed, firefox38+ fixed, firefox39+ fixed, firefox-esr31 unaffected, b2g-v1.4 unaffected, b2g-v2.0 unaffected, b2g-v2.0M unaffected, b2g-v2.1 unaffected, b2g-v2.1S unaffected, b2g-v2.2 fixed, b2g-master fixed)

Details

Attachments

(4 attachments, 1 obsolete attachment)

TrackBuffer::ResetDecode() reads mDecoders on decode task queue without holding mutex.

The length of mDecoders is increased on the main thread
(in TrackBuffer::NewDecoder()) so reading on another thread may lead to use after
free.
I think ResetDecode() could use mInitializedDecoders instead of mDecoders, but perhaps that still needs to be accessed under the mutex.
mInitializedDecoders indeed looks to be accessed from multiple threads, so we have to hold the monitor either way.
Assignee: nobody → matt.woodrow
Attachment #8552137 - Flags: review?(karlt)
Does "We must not hold the state machine monitor while we call into the reader" apply only to the Seek() call or also ResetDecode()?"

https://hg.mozilla.org/mozilla-central/rev/5519fd827576#l7.1312
Flags: needinfo?(cpearce)
I think it's a bad idea to hold the monitor while we call into the reader, as the reader could, or could call into system APIs which could, block on some other thread which requires that monitor.

I think we still do call into the reader with the monitor held in a few cases, ideally we should eliminate them and only call into the reader on the decode task queue in order to guarantee its thread safety.
Flags: needinfo?(cpearce)
Some notes on mInitializedDecoders:

Its length is increased with decoder monitor held in
TrackBuffer::RegisterDecoder() on initialization task queue.

Its length is decreased with decoder monitor held in RemoveDecoder() from main
thread and in TrackBuffer shutdown.

I expect it would be possible to only modify mInitializedDecoders on the
decode task queue.  That's not necessarily the best way to solve this bug though.
A similar bug exists in the caller of TrackBuffer::ResetDecode():
MediaSourceReader::AttemptSeek() iterates over mTrackBuffers without holding
the decoder mutex.

mTrackBuffers length is increased with decoder mutex held in AddTrackBuffer(),
called from TrackBuffer constructor, from SourceBuffer constructor, which is
main thread.

MediaSourceDecoder::RemoveTrackBuffer() is not referenced, so mTrackBuffers
length is not reduced until MediaSourceReader shutdown. 

It would be nice if the length of MediaSourceReader::mTrackBuffers would only
be changed on the decode thread, to which a new TrackBuffer reference would be
sent.  However, MediaSourceReader::PrepareInitialization() at least would need
similar message passing; SetCDMProxy(), too I assume.
I think bug 1097252 comes into play with GetBuffered().
Comment on attachment 8552137 [details] [diff] [review]
lock-reset-decode

Sounds like we should first convince cpearce that this is the right thing if we want to take this approach.

However, I suspect it is not actually necessary to ResetDecode() all readers here.  I expect calling ResetDecode on the readers on which Seek() is called would be sufficient.  Does that seem practical?

I wonder why RequestVideoData() and OnVideoNotDecoded() don't call
ResetDecode() before Seek()?
Attachment #8552137 - Flags: review?(karlt)
(In reply to Karl Tomlinson (:karlt) from comment #7)
> I wonder why RequestVideoData() and OnVideoNotDecoded() don't call
> ResetDecode() before Seek()?


For documentation purposes; the MediaDecoderStateMachine didn't always call ResetDecode() before Seek(). All Readers's used to be required to begin their Seek() implementation with a ResetDecode() call. This was duplicated code, duplicated in every reader. But some people (including myself IIRC) implementing new Readers' Seek() methods forgot to call ResetDecode(), so we pushed the ResetDecode() up into the MDSM, so that it was not required for Reader::Seek() implementations to remember to call ResetDecode(), and so that we didn't have duplicated calls to ResetDecode() everywhere.
I'll look at changing the ResetDecode() calls in line with comments 7 and 8.
Assignee: matt.woodrow → karlt
Tracking all MSE P1 bugs for Firefox 37.
The approach in comment 10 was to ResetDecode() on a subreader when it transitioned from current to not-current.  However, it seems that when ResetDecode() is not called on a subreader before its first seek, we end up with it in a blocking read.
I don't know why this is, but I'll upload a patch for the more conventional approach of calling ResetDecode() before Seek(), which includes when the subreader transistions from not-current to current.

#0  pthread_cond_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f14a9472a58 in PR_Wait (mon=0x7f14621a4450, timeout=4294967295)
    at /mnt/ssd1/karl/moz/dev/nsprpub/pr/src/pthreads/ptsynch.c:691
#2  0x00007f14a0807f1a in mozilla::ReentrantMonitor::Wait (
    this=0x7f14622dc148, aInterval=4294967295)
    at /mnt/ssd1/karl/moz/dev/xpcom/glue/BlockingResourceBase.cpp:470
#3  0x00007f14a2b0f5fb in mozilla::SourceBufferResource::ReadInternal (this=
    0x7f14622dc120, 
    aBuffer=0x7f14754bdc70 "\245\245\245\245\245\245\245\245Ph\003u\024\177", 
    aCount=8, aBytes=0x7f14386fa8f0, aMayBlock=true)
    at /mnt/ssd1/karl/moz/dev/dom/media/mediasource/SourceBufferResource.cpp:71
#4  0x00007f14a2b0f8fa in mozilla::SourceBufferResource::ReadAtInternal (
    this=0x7f14622dc120, aOffset=5240, 
    aBuffer=0x7f14754bdc70 "\245\245\245\245\245\245\245\245Ph\003u\024\177", 
    aCount=8, aBytes=0x7f14386fa8f0, aMayBlock=true)
    at /mnt/ssd1/karl/moz/dev/dom/media/mediasource/SourceBufferResource.cpp:114
#5  0x00007f14a2b0f85c in mozilla::SourceBufferResource::ReadAt (
    this=0x7f14622dc120, aOffset=5240, 
    aBuffer=0x7f14754bdc70 "\245\245\245\245\245\245\245\245Ph\003u\024\177", 
    aCount=8, aBytes=0x7f14386fa8f0)
    at /mnt/ssd1/karl/moz/dev/dom/media/mediasource/SourceBufferResource.cpp:101
#6  0x00007f14a2be24d0 in mozilla::MP4Stream::BlockingReadIntoCache (
    this=0x7f145a62c040, aOffset=5240, aCount=8, aToUnlock=0x7f1460daa3e8)
    at /mnt/ssd1/karl/moz/dev/dom/media/fmp4/MP4Stream.cpp:42
#7  0x00007f14a2bdf9a9 in mozilla::InvokeAndRetry<mp4_demuxer::MP4Demuxer, mp4_demuxer::MP4Sample*> (aThisVal=0x7f1463103f30, aMethod=
    (mp4_demuxer::MP4Sample *(mp4_demuxer::MP4Demuxer::*)(mp4_demuxer::MP4Demuxer * const)) 0x7f14a06c2130 <mp4_demuxer::MP4Demuxer::DemuxAudioSample()>, 
    aStream=0x7f145a62c040, aMonitor=0x7f1460daa3e8)
    at /mnt/ssd1/karl/moz/dev/dom/media/fmp4/MP4Reader.cpp:129
#8  0x00007f14a2bdcf1b in mozilla::MP4Reader::PopSampleLocked (
    this=0x7f1460daa000, aTrack=mp4_demuxer::kAudio)
    at /mnt/ssd1/karl/moz/dev/dom/media/fmp4/MP4Reader.cpp:779
#9  0x00007f14a2bdce65 in mozilla::MP4Reader::PopSample (this=0x7f1460daa000, 
    aTrack=mp4_demuxer::kAudio)
    at /mnt/ssd1/karl/moz/dev/dom/media/fmp4/MP4Reader.cpp:770
#10 0x00007f14a2bdcadc in mozilla::MP4Reader::Update (this=0x7f1460daa000, 
    aTrack=mp4_demuxer::kAudio)
    at /mnt/ssd1/karl/moz/dev/dom/media/fmp4/MP4Reader.cpp:713
#11 0x00007f14a2be0b33 in nsRunnableMethodImpl<void (mozilla::MP4Reader::*)(mp4_demuxer::TrackType), mp4_demuxer::TrackType, true>::Run (this=0x7f145a62db00)
    at ../../../dist/include/nsThreadUtils.h:361
#12 0x00007f14a2a6f072 in mozilla::MediaTaskQueue::Runner::Run (
    this=0x7f145a952d60)
    at /mnt/ssd1/karl/moz/dev/dom/media/MediaTaskQueue.cpp:230
#13 0x00007f14a07d61e1 in nsThreadPool::Run (this=0x7f1461df88a0)
    at /mnt/ssd1/karl/moz/dev/xpcom/threads/nsThreadPool.cpp:225
#14 0x00007f14a07d3737 in nsThread::ProcessNextEvent (this=0x7f1463104980, 
    aMayWait=false, aResult=0x7f14386fad4f)
    at /mnt/ssd1/karl/moz/dev/xpcom/threads/nsThread.cpp:855
#15 0x00007f14a081794c in NS_ProcessNextEvent (aThread=0x7f1463104980, 
    aMayWait=false) at /mnt/ssd1/karl/moz/dev/xpcom/glue/nsThreadUtils.cpp:265
#16 0x00007f14a0cdbcd2 in mozilla::ipc::MessagePumpForNonMainThreads::Run (
    this=0x7f145a7508c0, aDelegate=0x7f145a70e1c0)
    at /mnt/ssd1/karl/moz/dev/ipc/glue/MessagePump.cpp:339
#17 0x00007f14a0c861e1 in MessageLoop::RunInternal (this=0x7f145a70e1c0)
    at /mnt/ssd1/karl/moz/dev/ipc/chromium/src/base/message_loop.cc:233
#18 0x00007f14a0c86176 in MessageLoop::RunHandler (this=0x7f145a70e1c0)
    at /mnt/ssd1/karl/moz/dev/ipc/chromium/src/base/message_loop.cc:226
#19 0x00007f14a0c86107 in MessageLoop::Run (this=0x7f145a70e1c0)
    at /mnt/ssd1/karl/moz/dev/ipc/chromium/src/base/message_loop.cc:200
#20 0x00007f14a07d2467 in nsThread::ThreadFunc (aArg=0x7f1463104980)
    at /mnt/ssd1/karl/moz/dev/xpcom/threads/nsThread.cpp:356
and this was already called before Seek().
Attachment #8566238 - Flags: review?(matt.woodrow)
Attachment #8566236 - Flags: review?(matt.woodrow) → review+
Attachment #8566237 - Flags: review?(matt.woodrow) → review+
Attachment #8566238 - Flags: review?(matt.woodrow) → review+
Comment on attachment 8566237 [details] [diff] [review]
ResetDecode() on subreaders when switching to current or seeking

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
Constructing an exploit depends on timing of two different threads.
Generation of the necessary events could be controlled by content, but timing is difficult to control.  Perhaps sufficient events could be generated that sometimes the timing could produce an exploit.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
No.  Significant understanding of the code would be required to identify the problem and the mechanisms available to generate an exploit.

Which older supported branches are affected by this flaw?
The feature is disabled in 36 and previous.  The intention is to enable from 37, and the feature is enabled on Nightly and Aurora.

If not all supported branches, which bug introduced the flaw?
Don't know, but I don't think it was recent.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
The intention is to backport all changes to this code to Beta 37, and so code will be the same on all affected branches.

How likely is this patch to cause regressions; how much testing does it need?
There is some risk.  The related code is being tested thoroughly as a new feature.
Attachment #8566237 - Flags: sec-approval?
Comment on attachment 8566237 [details] [diff] [review]
ResetDecode() on subreaders when switching to current or seeking

sec-approval+ for trunk. 

I suppose we should take this on Aurora and Beta as well?
Attachment #8566237 - Flags: sec-approval? → sec-approval+
(In reply to Al Billings [:abillings] from comment #18)
> I suppose we should take this on Aurora and Beta as well?

Yes, but I have some merge conflicts to resolve.

Landed on m-i but backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/75d7c2a46610
Flags: in-testsuite-
Something made it's way to m-c:
https://hg.mozilla.org/mozilla-central/rev/0063a9d0d70b

Unsure where that leaves this bug.
Flags: needinfo?(karlt)
0063a9d0d70b was one of the changesets backed out.
Perhaps it would be helpful if next time I explicitly list all the changesets backed out.

Things have changed a bit, so I expect I'll make a somewhat different patch tomorrow.
Flags: needinfo?(karlt)
Attachment #8566237 - Attachment is obsolete: true
Attachment #8570880 - Flags: review?(matt.woodrow) → review+
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Can we get Aurora and Beta patches nominated for this?
Flags: needinfo?(karlt)
Comment on attachment 8566238 [details] [diff] [review]
remove ResetDecode() call from MediaSourceReader::AttemptSeek()

Please consider this a request for all patches here.
The security bug is fixed by this patch, but uplifting the other patches keeps things consistent for any future changes to this code.

Approval Request Comment
[Feature/regressing bug #]: MSE (old bug I think)
[User impact if declined]: sec-high
[Describe test coverage new/current, TreeHerder]:
There are some mochitests and web-platform tests.
The related code is being tested thoroughly as a new feature.
Testing this particular bug is hard because it depends on timing.  We don't have a proof of concept for a crashtest.
[Risks and why]: there is risk due to the multi-threaded model with mixed sync and async communication being very difficult to analyse thoroughly.
[String/UUID change made/needed]: none.
Flags: needinfo?(karlt)
Attachment #8566238 - Flags: approval-mozilla-beta?
Attachment #8566238 - Flags: approval-mozilla-aurora?
Attachment #8566238 - Flags: approval-mozilla-beta?
Attachment #8566238 - Flags: approval-mozilla-beta+
Attachment #8566238 - Flags: approval-mozilla-aurora?
Attachment #8566238 - Flags: approval-mozilla-aurora+
Group: media-core-security
Group: core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.