Closed Bug 1453127 (CVE-2018-5156) Opened 3 years ago Closed 3 years ago

SEGV in mozilla::MediaEncoder::AudioTrackListener::NotifyRealtimeTrackData

Categories

(Core :: WebRTC: Audio/Video, defect)

50 Branch
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla62
Tracking Status
firefox-esr52 61+ verified
firefox-esr60 61+ verified
firefox59 --- wontfix
firefox60 + wontfix
firefox61 + verified
firefox62 + verified

People

(Reporter: nils, Assigned: pehrsons)

References

Details

(Keywords: csectype-race, sec-high, Whiteboard: [adv-main61+][adv-esr52.9+][adv-esr60.1+][post-critsmash-triage])

Attachments

(15 files, 4 obsolete files)

28.92 KB, audio/ogg
Details
693 bytes, video/webm
Details
693 bytes, video/webm
Details
11.20 KB, text/plain
Details
619 bytes, text/html
Details
1.53 KB, patch
pehrsons
: review+
abillings
: sec-approval+
Details | Diff | Splinter Review
1.05 KB, patch
Details | Diff | Splinter Review
8.38 KB, text/plain
Details
4.43 KB, patch
pehrsons
: review+
abillings
: sec-approval+
Details | Diff | Splinter Review
21.34 KB, patch
pehrsons
: review+
Details | Diff | Splinter Review
2.62 KB, patch
jya
: review+
Details | Diff | Splinter Review
25.12 KB, patch
jya
: review+
Details | Diff | Splinter Review
4.36 KB, patch
pehrsons
: review+
Details | Diff | Splinter Review
21.04 KB, patch
pehrsons
: review+
Details | Diff | Splinter Review
2.62 KB, patch
pehrsons
: review+
Details | Diff | Splinter Review
The following testcase crashes the latest ASAN build of Firefox 61.0a1 (SourceStamp=30d72755b1749953d438199456f1524ce84ab5e5). It requires the three attached media files in the same directory and might require a few attempts to trigger. It works best for me when loaded from a file:// location, however I also managed to reproduce from a HTTP server.

crash.html
<script>
function spin () {
    var x=new XMLHttpRequest();
    x.open("POST","https://mozilla.org",false);
    try{x.send("X");}catch(e){}
}

function start() {	
	o208=document.createElementNS('http://www.w3.org/1999/xhtml','video');
	o208.autoplay=true;
	o208.setAttribute('src','1.ogg');
	o208.onplaying=fun0;
	spin();
	o379=o208.mozCaptureStreamUntilEnded();
	o208.setAttribute('src','2.webm');
	window.setTimeout("location.reload()",400);
}
function fun0() {
	try{o573=new MediaRecorder(o379);}catch(e){}
	o208.setAttribute('src','3.webm');
	try{o573.start();}catch(e){}
}
</script>
<body onload="start()"></body>

ASAN output:
AddressSanitizer:DEADLYSIGNAL
=================================================================
==13484==ERROR: AddressSanitizer: SEGV on unknown address 0x00009fff8000 (pc 0x7fd41288058a bp 0x7fd3f3fd8ff0 sp 0x7fd3f3fd8fa0 T304)
==13484==The signal is caused by a READ memory access.
    #0 0x7fd412880589 in Length /builds/worker/workspace/build/src/obj-firefox/dist/include/nsTArray.h:369:43
    #1 0x7fd412880589 in AppendElements<const void *, nsTArrayInfallibleAllocator, nsTArrayInfallibleAllocator> /builds/worker/workspace/build/src/obj-firefox/dist/include/nsTArray.h:1637
    #2 0x7fd412880589 in AutoTArray /builds/worker/workspace/build/src/obj-firefox/dist/include/nsTArray.h:2457
    #3 0x7fd412880589 in AudioChunk /builds/worker/workspace/build/src/dom/media/AudioSegment.h:161
    #4 0x7fd412880589 in Construct<const mozilla::AudioChunk &> /builds/worker/workspace/build/src/obj-firefox/dist/include/nsTArray.h:539
    #5 0x7fd412880589 in mozilla::AudioChunk* nsTArray_Impl<mozilla::AudioChunk, nsTArrayInfallibleAllocator>::AppendElement<mozilla::AudioChunk const&, nsTArrayInfallibleAllocator>(mozilla::AudioChunk const&) /builds/worker/workspace/build/src/obj-firefox/dist/include/nsTArray.h:2291
    #6 0x7fd41287febb in mozilla::MediaSegmentBase<mozilla::AudioSegment, mozilla::AudioChunk>::AppendSliceInternal(mozilla::MediaSegmentBase<mozilla::AudioSegment, mozilla::AudioChunk> const&, long, long) /builds/worker/workspace/build/src/obj-firefox/dist/include/MediaSegment.h:512:19
    #7 0x7fd41746dd94 in AppendSlice /builds/worker/workspace/build/src/obj-firefox/dist/include/MediaSegment.h:277:5
    #8 0x7fd41746dd94 in mozilla::MediaEncoder::AudioTrackListener::NotifyRealtimeTrackData(mozilla::MediaStreamGraph*, long, mozilla::MediaSegment const&) /builds/worker/workspace/build/src/dom/media/encoder/MediaEncoder.cpp:150
    #9 0x7fd41746cc86 in mozilla::MediaEncoder::AudioTrackListener::NotifyQueuedChanges(mozilla::MediaStreamGraph*, long, mozilla::MediaSegment const&) /builds/worker/workspace/build/src/dom/media/encoder/MediaEncoder.cpp:123:7
    #10 0x7fd417074766 in mozilla::TrackUnionStream::CopyTrackData(mozilla::StreamTracks::Track*, unsigned int, long, long, bool*) /builds/worker/workspace/build/src/dom/media/TrackUnionStream.cpp:355:22
    #11 0x7fd417072425 in mozilla::TrackUnionStream::ProcessInput(long, long, unsigned int) /builds/worker/workspace/build/src/dom/media/TrackUnionStream.cpp:122:11
    #12 0x7fd417367045 in mozilla::MediaStreamGraphImpl::Process() /builds/worker/workspace/build/src/dom/media/MediaStreamGraph.cpp:1271:15
    #13 0x7fd417368014 in mozilla::MediaStreamGraphImpl::OneIteration(long) /builds/worker/workspace/build/src/dom/media/MediaStreamGraph.cpp:1357:3
    #14 0x7fd4170fa844 in mozilla::ThreadedDriver::RunThread() /builds/worker/workspace/build/src/dom/media/GraphDriver.cpp:319:40
    #15 0x7fd417124552 in mozilla::MediaStreamGraphInitThreadRunnable::Run() /builds/worker/workspace/build/src/dom/media/GraphDriver.cpp:194:14
    #16 0x7fd410cb3f18 in nsThread::ProcessNextEvent(bool, bool*) /builds/worker/workspace/build/src/xpcom/threads/nsThread.cpp:1096:14
    #17 0x7fd410cd0350 in NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/workspace/build/src/xpcom/threads/nsThreadUtils.cpp:519:10
    #18 0x7fd411ba544b in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /builds/worker/workspace/build/src/ipc/glue/MessagePump.cpp:334:20
    #19 0x7fd411af4229 in RunInternal /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:326:10
    #20 0x7fd411af4229 in RunHandler /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:319
    #21 0x7fd411af4229 in MessageLoop::Run() /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:299
    #22 0x7fd410cae368 in nsThread::ThreadFunc(void*) /builds/worker/workspace/build/src/xpcom/threads/nsThread.cpp:425:11
    #23 0x7fd42e1d647e in _pt_root /builds/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:201:5
    #24 0x7fd431b636b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)
    #25 0x7fd430bec3dc in clone (/lib/x86_64-linux-gnu/libc.so.6+0x1073dc)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /builds/worker/workspace/build/src/obj-firefox/dist/include/nsTArray.h:369:43 in Length
Thread T304 (MediaStreamGrph) created by T303 (CubebOp~tion #1) here:
    #0 0x4ae80d in __interceptor_pthread_create /builds/worker/workspace/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/asan_interceptors.cc:204:3
    #1 0x7fd42e1d31cf in _PR_CreateThread /builds/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:433:14
    #2 0x7fd42e1d2dbe in PR_CreateThread /builds/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:518:12
    #3 0x7fd410cb02e3 in nsThread::Init(nsTSubstring<char> const&) /builds/worker/workspace/build/src/xpcom/threads/nsThread.cpp:612:8
    #4 0x7fd410cb9eca in nsThreadManager::NewNamedThread(nsTSubstring<char> const&, unsigned int, nsIThread**) /builds/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp:471:22
    #5 0x7fd410cca454 in NS_NewNamedThread(nsTSubstring<char> const&, nsIThread**, nsIRunnable*, unsigned int) /builds/worker/workspace/build/src/xpcom/threads/nsThreadUtils.cpp:143:45
    #6 0x7fd4170f9a25 in NS_NewNamedThread<16> /builds/worker/workspace/build/src/obj-firefox/dist/include/nsThreadUtils.h:72:10
    #7 0x7fd4170f9a25 in mozilla::ThreadedDriver::Start() /builds/worker/workspace/build/src/dom/media/GraphDriver.cpp:210
    #8 0x7fd4170fc950 in mozilla::AudioCallbackDriver::Init() /builds/worker/workspace/build/src/dom/media/GraphDriver.cpp:625:5
    #9 0x7fd4170fbd34 in mozilla::AsyncCubebTask::Run() /builds/worker/workspace/build/src/dom/media/GraphDriver.cpp:495:21
    #10 0x7fd410ccb05c in nsThreadPool::Run() /builds/worker/workspace/build/src/xpcom/threads/nsThreadPool.cpp:228:14
    #11 0x7fd410ccb75c in non-virtual thunk to nsThreadPool::Run() /builds/worker/workspace/build/src/xpcom/threads/nsThreadPool.cpp
    #12 0x7fd410cb3f18 in nsThread::ProcessNextEvent(bool, bool*) /builds/worker/workspace/build/src/xpcom/threads/nsThread.cpp:1096:14
    #13 0x7fd410cd0350 in NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/workspace/build/src/xpcom/threads/nsThreadUtils.cpp:519:10
    #14 0x7fd411ba55cc in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /builds/worker/workspace/build/src/ipc/glue/MessagePump.cpp:364:5
    #15 0x7fd411af4229 in RunInternal /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:326:10
    #16 0x7fd411af4229 in RunHandler /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:319
    #17 0x7fd411af4229 in MessageLoop::Run() /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:299
    #18 0x7fd410cae368 in nsThread::ThreadFunc(void*) /builds/worker/workspace/build/src/xpcom/threads/nsThread.cpp:425:11
    #19 0x7fd42e1d647e in _pt_root /builds/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:201:5
    #20 0x7fd431b636b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)

Thread T303 (CubebOp~tion #1) created by T0 (file:// Content) here:
    #0 0x4ae80d in __interceptor_pthread_create /builds/worker/workspace/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/asan_interceptors.cc:204:3
    #1 0x7fd42e1d31cf in _PR_CreateThread /builds/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:433:14
    #2 0x7fd42e1d2dbe in PR_CreateThread /builds/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:518:12
    #3 0x7fd410cb02e3 in nsThread::Init(nsTSubstring<char> const&) /builds/worker/workspace/build/src/xpcom/threads/nsThread.cpp:612:8
    #4 0x7fd410cb9eca in nsThreadManager::NewNamedThread(nsTSubstring<char> const&, unsigned int, nsIThread**) /builds/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp:471:22
    #5 0x7fd410cc9dbf in NS_NewNamedThread /builds/worker/workspace/build/src/xpcom/threads/nsThreadUtils.cpp:143:45
    #6 0x7fd410cc9dbf in nsThreadPool::PutEvent(already_AddRefed<nsIRunnable>, unsigned int) /builds/worker/workspace/build/src/xpcom/threads/nsThreadPool.cpp:109
    #7 0x7fd410ccb916 in nsThreadPool::Dispatch(already_AddRefed<nsIRunnable>, unsigned int) /builds/worker/workspace/build/src/xpcom/threads/nsThreadPool.cpp:277:5
    #8 0x7fd4170ff953 in Dispatch /builds/worker/workspace/build/src/obj-firefox/dist/include/nsIEventTarget.h:37:14
    #9 0x7fd4170ff953 in Dispatch /builds/worker/workspace/build/src/dom/media/GraphDriver.h:583
    #10 0x7fd4170ff953 in mozilla::AudioCallbackDriver::Start() /builds/worker/workspace/build/src/dom/media/GraphDriver.cpp:748
    #11 0x7fd417369bba in mozilla::MediaStreamGraphImpl::RunInStableState(bool) /builds/worker/workspace/build/src/dom/media/MediaStreamGraph.cpp:1693:17
    #12 0x7fd41738e48f in mozilla::(anonymous namespace)::MediaStreamGraphStableStateRunnable::Run() /builds/worker/workspace/build/src/dom/media/MediaStreamGraph.cpp:1548:15
    #13 0x7fd410b0a6c0 in mozilla::CycleCollectedJSContext::ProcessStableStateQueue() /builds/worker/workspace/build/src/xpcom/base/CycleCollectedJSContext.cpp:312:12
    #14 0x7fd410b0c855 in mozilla::CycleCollectedJSContext::AfterProcessTask(unsigned int) /builds/worker/workspace/build/src/xpcom/base/CycleCollectedJSContext.cpp:377:3
    #15 0x7fd412569d5d in XPCJSContext::AfterProcessTask(unsigned int) /builds/worker/workspace/build/src/js/xpconnect/src/XPCJSContext.cpp:1252:30
    #16 0x7fd410cb49e3 in nsThread::ProcessNextEvent(bool, bool*) /builds/worker/workspace/build/src/xpcom/threads/nsThread.cpp:1133:24
    #17 0x7fd410cd0350 in NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/workspace/build/src/xpcom/threads/nsThreadUtils.cpp:519:10
    #18 0x7fd411ba445a in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) /builds/worker/workspace/build/src/ipc/glue/MessagePump.cpp:97:21
    #19 0x7fd411af4229 in RunInternal /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:326:10
    #20 0x7fd411af4229 in RunHandler /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:319
    #21 0x7fd411af4229 in MessageLoop::Run() /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:299
    #22 0x7fd418815eba in nsBaseAppShell::Run() /builds/worker/workspace/build/src/widget/nsBaseAppShell.cpp:157:27
    #23 0x7fd41cad7edb in XRE_RunAppShell() /builds/worker/workspace/build/src/toolkit/xre/nsEmbedFunctions.cpp:893:22
    #24 0x7fd411af4229 in RunInternal /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:326:10
    #25 0x7fd411af4229 in RunHandler /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:319
    #26 0x7fd411af4229 in MessageLoop::Run() /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:299
    #27 0x7fd41cad78a2 in XRE_InitChildProcess(int, char**, XREChildData const*) /builds/worker/workspace/build/src/toolkit/xre/nsEmbedFunctions.cpp:719:34
    #28 0x4f50dc in content_process_main /builds/worker/workspace/build/src/browser/app/../../ipc/contentproc/plugin-container.cpp:50:30
    #29 0x4f50dc in main /builds/worker/workspace/build/src/browser/app/nsBrowserApp.cpp:280
    #30 0x7fd430b0582f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

==13484==ABORTING
Attached audio 1.ogg
Attached video 2.webm
Attached video 3.webm
Attached file ASAN output
Attached file crash.html
Drno: who gets this one?

Going to guess sec-high to start because our nsTArrays usually contain things with pointers that can be abused.
Flags: needinfo?(drno)
Andreas looks like you touched that code involved most recently. Can you have a look?
Or dispatch to Bryce or someone else if you don't have time.
Flags: needinfo?(apehrson)
Bryce, do you have time to look at this? I'm about to be off for a week and would rather have this looked at sooner.
Flags: needinfo?(apehrson) → needinfo?(bvandyk)
Will do.
Assignee: nobody → bvandyk
Flags: needinfo?(bvandyk)
Notes thus far:

Seems difficult to repro on Windows. I got it to trigger once after 10+ minutes, and since our logging is pretty poor on Windows it wasn't a particularly useful exercise.

Initially I could only get the following trace on Linux (20+ crashes and only this), which also takes place in non-asan builds since it's an assert:

==12409==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x00000056d5d8 bp 0x7fff7714cab0 sp 0x7fff7714c940 T0)
==12409==The signal is caused by a WRITE memory access.
==12409==Hint: address points to the zero page.
    #0 0x56d5d7 in MOZ_CrashPrintf /builds/worker/workspace/build/src/mfbt/Assertions.cpp:63:3
    #1 0x7f6bb545319b in InvalidArrayIndex_CRASH(unsigned long, unsigned long) /builds/worker/workspace/build/src/xpcom/ds/nsTArray.cpp:26:3
    #2 0x7f6bbb9c972c in ElementAt /builds/worker/workspace/build/src/obj-firefox/dist/include/nsTArray.h:1041:7
    #3 0x7f6bbb9c972c in operator* /builds/worker/workspace/build/src/obj-firefox/dist/include/mozilla/ArrayIterator.h:86
    #4 0x7f6bbb9c972c in mozilla::dom::MediaStreamTrack::Destroy() /builds/worker/workspace/build/src/dom/media/MediaStreamTrack.cpp:169
    #5 0x7f6bbb9ca405 in mozilla::dom::MediaStreamTrack::cycleCollection::Unlink(void*) /builds/worker/workspace/build/src/dom/media/MediaStreamTrack.cpp:181:8
    #6 0x7f6bb53cf9c4 in nsCycleCollector::CollectWhite() /builds/worker/workspace/build/src/xpcom/base/nsCycleCollector.cpp:3401:26
    #7 0x7f6bb53d2661 in nsCycleCollector::Collect(ccType, js::SliceBudget&, nsICycleCollectorListener*, bool) /builds/worker/workspace/build/src/xpcom/base/nsCycleCollector.cpp:3769:24
    #8 0x7f6bb53d6885 in nsCycleCollector_collectSlice(js::SliceBudget&, bool) /builds/worker/workspace/build/src/xpcom/base/nsCycleCollector.cpp:4330:21
    #9 0x7f6bb87563b2 in nsJSContext::RunCycleCollectorSlice(mozilla::TimeStamp) /builds/worker/workspace/build/src/dom/base/nsJSEnvironment.cpp:1544:3
    #10 0x7f6bb8756e73 in ICCRunnerFired(mozilla::TimeStamp) /builds/worker/workspace/build/src/dom/base/nsJSEnvironment.cpp:1603:3
    #11 0x7f6bb54f83ed in operator() /builds/worker/workspace/build/src/clang/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.4/../../../../include/c++/4.9.4/functional:2440:14
    #12 0x7f6bb54f83ed in mozilla::IdleTaskRunner::Run() /builds/worker/workspace/build/src/xpcom/threads/IdleTaskRunner.cpp:62
    #13 0x7f6bb553d3f9 in nsThread::ProcessNextEvent(bool, bool*) /builds/worker/workspace/build/src/xpcom/threads/nsThread.cpp:1096:14
    #14 0x7f6bb5558e30 in NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/workspace/build/src/xpcom/threads/nsThreadUtils.cpp:519:10
    #15 0x7f6bb64224aa in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) /builds/worker/workspace/build/src/ipc/glue/MessagePump.cpp:97:21
    #16 0x7f6bb6376fe9 in RunInternal /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:326:10
    #17 0x7f6bb6376fe9 in RunHandler /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:319
    #18 0x7f6bb6376fe9 in MessageLoop::Run() /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:299
    #19 0x7f6bbce3f80a in nsBaseAppShell::Run() /builds/worker/workspace/build/src/widget/nsBaseAppShell.cpp:157:27
    #20 0x7f6bc10e62bb in XRE_RunAppShell() /builds/worker/workspace/build/src/toolkit/xre/nsEmbedFunctions.cpp:893:22
    #21 0x7f6bb6376fe9 in RunInternal /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:326:10
    #22 0x7f6bb6376fe9 in RunHandler /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:319
    #23 0x7f6bb6376fe9 in MessageLoop::Run() /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:299
    #24 0x7f6bc10e5c80 in XRE_InitChildProcess(int, char**, XREChildData const*) /builds/worker/workspace/build/src/toolkit/xre/nsEmbedFunctions.cpp:719:34
    #25 0x4f1875 in content_process_main /builds/worker/workspace/build/src/browser/app/../../ipc/contentproc/plugin-container.cpp:50:30
    #26 0x4f1875 in main /builds/worker/workspace/build/src/browser/app/nsBrowserApp.cpp:280
    #27 0x7f6bd502082f in __libc_start_main /build/glibc-Cl5G7W/glibc-2.23/csu/../csu/libc-start.c:291
    #28 0x420f48 in _start (/home/b/firefoxAsanOpt/firefox+0x420f48)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /builds/worker/workspace/build/src/mfbt/Assertions.cpp:63:3 in MOZ_CrashPrintf
==12409==ABORTING

I am able to repro the above with a stripped down version of the original script:

<script>
function start() {	
	o208=document.createElement('video');
	o208.autoplay=true;
	o208.setAttribute('src','1.ogg');
	o208.onplaying=fun0;
	o379=o208.mozCaptureStreamUntilEnded();
	window.setTimeout("location.reload()",200);
}
function fun0() {
	try{o573=new MediaRecorder(o379);}catch(e){}
	o208.setAttribute('src','3.webm');
	try{o573.start();}catch(e){}
}
</script>
<body onload="start()"></body>

I've seen the original stack trace once while messing with the original script and adjusting the reload() up to a couple of seconds.

I suspect they're both related in terms of us racing on interactions with the Streams/Tracks involved.
Group: core-security → media-core-security
Attached patch Bug1453127-1.patch (obsolete) — Splinter Review
There's (at least) an issue with our code to clear our listeners for MediaStreamTracks. We currently use range based for loops -- we have some extra safety from our custom iterators, but I believe it's still a no no to do removal on an array with its own iterator.

So for the issue I was commonly seeing, I believe that's a product of our iterators performing bad access, but hitting code to make sure such misuse is 'safe'.

Patch attached to address this issue.

My suspicion for the stack in the original failure is that if our clean up goes bad then we have bad state down the road. It looks like we have a bad chunk being appended, which I imagine could happen if we have mangled state. I've run for 10ish hours against an ASAN build with the patch applied and have not hit the issue. I would appreciate if anyone else could verify the same.

I've had no luck repro issues with RR before or after the patch, even with several hours of runs.

After this patch I've seen other issues with IPC sometimes getting unhappy after several hour soaks, but I suspect this issue already existed and has not been introduced by any of my changes.

Going to stare at this a little more, but appreciate feedback on patch and the above.
Attachment #8969387 - Flags: review?(apehrson)
Comment on attachment 8969387 [details] [diff] [review]
Bug1453127-1.patch

Review of attachment 8969387 [details] [diff] [review]:
-----------------------------------------------------------------

Good find! This is a fix of something nasty indeed, just needs some simplification for r+.

It's not clear to me whether this is linked to the reporter's issue though. Were you able to reproduce his stack before the patch?

::: dom/media/MediaStreamTrack.cpp
@@ +176,5 @@
> +      // If we couldn't remove the listener we need to increase our index so
> +      // we don't keep trying to remove the same one. If we successfully
> +      // removed then the element at i should be a different listener.
> +      i++;
> +    }

You're right that if this code is triggered it would cause badness. Did you confirm this code was triggered when we crashed in a stack like the reporter's?

I think a cleaner way to do this is to make a copy of the array and iterate over that, like at https://searchfox.org/mozilla-central/rev/59a9a86553e9bfd9277202748ff791fd9bc0713b/dom/media/MediaStreamGraph.cpp#2046-2049

@@ +564,5 @@
>    GetOwnedStream()->AddTrackListener(aListener, mTrackID);
>    mTrackListeners.AppendElement(aListener);
>  }
>  
> +bool

Let's not do this only to satisfy the looping on Destroy().

If removing was fallible (i.e., we try to remove but it failed and the listener remains) then sure, returning the status would be useful.

But here it's infallible and idempotent so there's no need really. And we cannot put a requirement on it that all listeners must be destroyed by the time of Destroy() because cycle collection doesn't guarantee the order in which it unlinks objects (inherently because there's a cycle).
Attachment #8969387 - Flags: review?(apehrson) → review-
The code you're fixing is stretching back to 50, [1], so marking it as such.


[1] https://hg.mozilla.org/mozilla-central/rev/49295086e189276fdab180f5e907538d929c6f7f#l5.30
Version: 61 Branch → 50 Branch
[Tracking Requested - why for this release]: UAF sec-high

I agree with Andreas that this type of "loop with optional deletion" is tricky to reason about and view. Since we're in Destroy() anyways, we can use the copy/clear trick (though it's higher overhead - addrefs and and allocation), or you can simply iterate the list backwards instead of forwards; then if Remove*Listener() fails, it has no effect on the rest of the iterations.

And if you're reverse-iterating, you can still use range-based loops: (looking at ReverseIterator.h and nsTextframe.cpp:7303)
   for (blah& entry : Reversed(array)) {
nsDocument.cpp uses it a lot too.
Flags: needinfo?(bvandyk)
Rework of patch.

The original stack is very rare in my testing. After 100+ of the stacks mentioned in comment 11 I've seen the original once. All such runs were against central ASAN builds.

I've tried running against custom builds that do not remove the listeners at all, to see if listeners living too long is somehow getting us, and have not seen the issue.

I've run some long lived soaks after the patch: several hours each, and 15+ hour total. I haven't seen the issue during these runs.

However, it's also not clear to me what exactly is causing the original failure. As noted above I suspect it's a threading issue, and we somehow get a bad chunk. :pehrsons, :jesup, do you have any ideas what would cause the original trace?

Nils, did you see this other signature during your testing?
Attachment #8969387 - Attachment is obsolete: true
Flags: needinfo?(bvandyk) → needinfo?(nils)
Attachment #8970133 - Flags: review?(apehrson)
Attachment #8970133 - Flags: review?(apehrson) → review+
Comment on attachment 8970133 [details] [diff] [review]
Bug1453127-1.patch

Requesting sec approval. There may be more to be done here, but let's get this part sorted.

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
- Not easily. Before the fix, using the test case form this bug (which is not representative of typical usage of HTMLMediaElement/MediaRecorder), we mostly appear to be hitting asserts intended to make mis-use of this type safe in our arrays. If this fix is what led to the original stack then it's not an easy connection to make.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
- No. The commit message alludes to an issue of iterator invalidation, we could make it more opaque if that is a concern.

Which older supported branches are affected by this flaw?
- The flaw appears to have existed since 50. So all current branches are affected.

If not all supported branches, which bug introduced the flaw?
- All branches are affected. For the sake of completeness, it appears Bug 1266646 introduced the being addressed by the patch changes.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
- I haven't tested backporting this patch, but given its size I think it should be straight forward to apply to other branches.

How likely is this patch to cause regressions; how much testing does it need?
- Given the small size of this patch I do not think it likely to cause regressions or require further testing.
Attachment #8970133 - Flags: sec-approval?
Bryce, I did see the crash mentioned in comment 11. However the crash from the original report happened far more often and pretty reliably for me.
Flags: needinfo?(nils)
Comment on attachment 8970133 [details] [diff] [review]
Bug1453127-1.patch

sec-approval+ for trunk. I've spoken to Release Management and we want a Beta and ESR52 patch nominated as well.
Attachment #8970133 - Flags: sec-approval? → sec-approval+
https://hg.mozilla.org/integration/mozilla-inbound/rev/6b3aaee40f7507e240da08d6e073cff3c53971f4
Bug 1453127 - Do not use iterators in MediaStreamTrack when removing listeners. r=pehrsons
Comment on attachment 8970133 [details] [diff] [review]
Bug1453127-1.patch

Requesting uplift approval now. However, we should verify this fixes the issue in nightly as it's not clear if this part of the code the root cause of the bad read or if more needs to be done. I don't know how that impacts uplift timing, but wanted to ensure clarity.

Beta:
Approval Request Comment
[Feature/Bug causing the regression]:
- Bug 1266646
[User impact if declined]:
- Possible exposure of user memory.
[Is this code covered by automated tests?]:
- Not for this specific case.
[Has the fix been verified in Nightly?]:
- I am not able to repro the issue locally with the patch, verification of changes in nightly are pending.
[Needs manual test from QE? If yes, steps to reproduce]: 
- Yes if possible. If QE could verify the issue occurring in ASAN builds before patch and not after, that would be ideal, as the issue is extremely rare in my testing. STR can be found in comment 0.
[List of other uplifts needed for the feature/fix]:
- Only this patch is needed.
[Is the change risky?]:
- No.
[Why is the change risky/not risky?]:
- The change is small and is not complex.
[String changes made/needed]:
- None.

ESR:
[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
- Bug is sec high.
User impact if declined:
- Possible exposure of user memory.
Fix Landed on Version:
- 61, requesting uplift to 60.
Risk to taking this patch (and alternatives if risky): 
- Low, patch touches a small area of code and is not complex.
String or UUID changes made by this patch:
- None

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Attachment #8970133 - Flags: approval-mozilla-esr52?
Attachment #8970133 - Flags: approval-mozilla-beta?
I put in a PI request (https://mana.mozilla.org/wiki/display/PI/PI+Request) to QE asking for someone to verify this in nightly tonight/tomorrow (once it lands in m-c). 

The goal is to uplift this to beta/esr52 before the build/ merge on Thursday morning.
https://hg.mozilla.org/mozilla-central/rev/6b3aaee40f75
Group: media-core-security → core-security-release
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
I reproduced the crash with the mentioned error on Linux 16.04 x64 with an ASAN build from 2018/04/23.

Tried the same steps, as in comment 0, on Linux 16.04 x64 with ASAN build from today, 2018/04/25: gecko.v2.mozilla-central.pushdate.2018.04.25.20180425114412.firefox and did not get any crash.
I could not reproduce the error with a normal Nightly build 61.0a1, without the fix(Build ID 	20180423100754) or with the fix (Build ID 20180425100122). Tested on Linux 16.04 x64
Comment on attachment 8970133 [details] [diff] [review]
Bug1453127-1.patch

Crash fix, sec-high, verified in nightly. Let's uplift for beta 16 + esr52.
Attachment #8970133 - Flags: approval-mozilla-esr52?
Attachment #8970133 - Flags: approval-mozilla-esr52+
Attachment #8970133 - Flags: approval-mozilla-beta?
Attachment #8970133 - Flags: approval-mozilla-beta+
Verified the bug using the same scenario from comment 0 on Ubuntu 16.04 x64, Window 10 x64, Mac 10.12 with Beta 60.0b16, esr52 and esr60() builds.

Please note that for esr60 I used the build from taskcluster: https://tools.taskcluster.net/index/gecko.v2.mozilla-esr60.latest.firefox

On Ubuntu 16.04, the bug was verified also on ASAN builds for Beta 60.0b16, esr52 and esr60.
Whiteboard: [adv-main60+][adv-esr52.8+]
Verified this on the latest esr52 build: 52.7.4, Build ID 20180427222832. We will verify on official esr60 when a build is available.
Alias: CVE-2018-5156
(In reply to Nils from comment #18)
> Bryce, I did see the crash mentioned in comment 11. However the crash from
> the original report happened far more often and pretty reliably for me.

Probably should have asked much earlier, but have you seen the problem since we've landed this patch?
Flags: needinfo?(nils)
Dan, the last time the fuzzer hit this crash was 05/01/2018 @ 9:46am (UTC).

And the original testcase still reproduces for me on the latest ASAN nightly build (SourceStamp=b1628ac71fcc15797baec6083650bfcde650f190) with the same stack trace as the original testcase.
Flags: needinfo?(nils)
I can also reproduce this on m-c:
BuildID=20180507140941
SourceStamp=93443d36d4bd53dba004f7b73430879f96daa681

It does take some time there seems to be a race. It took about 5 minutes with 3 instances of the test open. Setting the timeout in the test to 800 seemed(?) to help me.
Status: VERIFIED → REOPENED
Flags: needinfo?(bvandyk)
Resolution: FIXED → ---
Continuing to look at this.

:pehrsons, if you have some time to take a look I'd appreciate your thoughts as to what you think may be going on here.
Flags: needinfo?(bvandyk) → needinfo?(apehrson)
Sounds like we're targeting 61 and associated ESR releases at this point.
Managed to finally catch this in a debugger by running 4 concurrent firefoxes. Thanks for the tip re running several at once :tsmith.

That bad access on 0x00009fff8000 (which seems a consistent address in these failures) is coming from the code generated by asan to verify the value of mHdr for our nsTArray. This is from the mapping from the normal address into asan's shadow addressing. This address is calculated from what appears to be the orignal mHdr via:
`mov    %rdi,%rax` (where rdi now contains mHdr's ptr)
`shr    $0x3,%rax` (shadow memory mapping step)
`mov    0x7fff8000(%rax),%al` (should map from shadow memory, but address is bad)

So our mHdr is bad here, with a value of 0x100000001. I believe mHdr should always be sane, either pointing to the empty header or some other header, so something is going wrong here.

In mozilla::MediaSegmentBase<mozilla::AudioSegment, mozilla::AudioChunk>::AppendSliceInternal[0] it looks like our chunk is garbage (full of insane values, has some poisoned values). aSource looks a bit weird.

Up to the call site of the append in MediaEncoder[1] may be our issue. Logging aMedia shows:

>(const mozilla::MediaSegment &) @0x618000077080: {_vptr$MediaSegment = 0x7f432681a550 <vtable for mozilla::VideoSegment+16>, Duration = 512, mType = mozilla::MediaSegment::VIDEO, mLastPrincipalHandle = {mPtr = { mRawPtr = 0x0}}}

So we've got a video segment, but it's being cast to an audio, which I imagine could explain the consistently bogus value for our mHdr.

Based on that we're in an AudioListener something else is going wrong with registration of listeners that we're receiving video. However, let's make this listener safe to address the issue at hand.

[0]: https://searchfox.org/mozilla-central/rev/5ff2d7683078c96e4b11b8a13674daded935aa44/dom/media/MediaSegment.h#512
[1]: https://searchfox.org/mozilla-central/rev/5ff2d7683078c96e4b11b8a13674daded935aa44/dom/media/encoder/MediaEncoder.cpp#150
Attached patch Bug1453127-2.patch (obsolete) — Splinter Review
Patch to avoid casting video data to audio in MediaEncoder's direct listener which should (hopefully) prevent us getting bogus pointers further down the stack.

It appears we're arriving at the direct callback via the non direct calling the direct path, which makes sense for video data, and is useful to know for investigating at a higher level why this is happening. Since all data to be processed is fed through the updated, direct function, this should avoid both direct and non-direct listeners that should somehow become incorrectly connected.
Attachment #8974468 - Flags: review?(apehrson)
Flags: sec-bounty?
Comment on attachment 8974468 [details] [diff] [review]
Bug1453127-2.patch

Review of attachment 8974468 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/encoder/MediaEncoder.cpp
@@ +144,5 @@
>        return;
>      }
>  
> +    if (aMedia.GetType() != MediaSegment::AUDIO) {
> +      MOZ_ASSERT_UNREACHABLE("AudioTrackListener should not be called with non-audio data!");

I'd rather see us fix the root problem than this. Something must be going seriously wrong somewhere if we're getting video in the audio listener.

Either the root cause is in MediaRecorder code, then this is OK; or it's in central code, then things could go wrong in any of the sinks. We need to figure out which.

If we must land something, make it a release assert so we at least get some diagnostics on this in the wild while we're figuring it out?
Attachment #8974468 - Flags: review?(apehrson) → review-
I won't be able to look at this for at least another week. Did anyone have any luck with rr so far?
I still think we should land the above. I think having a guard on the cast is a good idea in general, even if we identify the root cause and this gets us a straight forward fix now.

What kind of diagnostics would we hope to gather with a different assert?
(In reply to Bryce Van Dyk (:bryce) from comment #40)
> I still think we should land the above. I think having a guard on the cast
> is a good idea in general, even if we identify the root cause and this gets
> us a straight forward fix now.

It's a coding error to get the wrong media type there, and we handle coding errors with asserts and not runtime checks. (If your state is unexpectedly bad you'll get into much larger problems trying to handle all such cases).

A regular MOZ_ASSERT is certainly justified, and I can stretch to a release one to understand better how often this is happening in the wild since we're obviously hitting this path in some cases.


> What kind of diagnostics would we hope to gather with a different assert?

It lets us know how commonly we hit this in the wild.
Use a release assert instead.

I have some reservations about using a release assert, as we know there's an issue here that can crash release builds. But I suspect this will be rare.
Attachment #8974468 - Attachment is obsolete: true
Attachment #8975005 - Flags: review?(apehrson)
Looking at the trigger code we're using HTMLMediaElement.mozCaptureStream() with a media element that is changing sources. Timing between starting capture and changing source seems like a key element here.

Media element capture is the only MediaStream source that doesn't explicitly create its MediaStreamTracks on main thread before creating them, instead they get created after notification from the MediaStreamGraph, [1]. Note that this has one exception, if capture is started after tracks are known. Then we use the TrackIDs from metadata, [2].

In the trigger code we start with an ogg file, and after a little while (sync!) we start capturing and switch the source to a webm file, which I assume contains a video track.

I'm going to guess that the TrackID of the only audio track in the ogg file comes through as `1`.

With the graph running off main thread, and messages only dispatched at main thread stable state, we won't have dispatched the messages to connect the MediaDecoder's source stream to the output stream at the point of the source switch.

I deem it likely that the webm comes through with the video track's TrackID being `1` and the audio track's TrackID being `2`. If something in the chain of graph-side MediaStreams doesn't pick upp the audio track with id `1` from the first ogg file, the video track from the webm will get TrackID `1` instead.

If the message to disconnect the track from the ogg file gets dispatched and picked up by the same MSG iteration as the one that picks up the initial connect message, I'm not sure a node later in the graph chain for this track will pick it up, leaving `1` available for the video track that gets connected shortly after.

It seems unlikely because metadata can impossibly be known when mozCaptureStreamUntilEnded() happens, because it's set on main thread [3] and we didn't context switch.

It doesn't seem far-fetched that something close to this happens though. Note that media element capture (the prefixed methods mozCaptureStream() and mozCaptureStreamUntilEnded()) is the only source that could be vulnerable though. Is this reproducible with MediaStream:4,MediaStreamTrack:4 logs?



[1] https://searchfox.org/mozilla-central/rev/3f17a234769d25fca5144ebb8abc8e1cb3c56c16/dom/media/DOMMediaStream.cpp#153-198
[2] https://searchfox.org/mozilla-central/rev/3f17a234769d25fca5144ebb8abc8e1cb3c56c16/dom/html/HTMLMediaElement.cpp#3533,3544
[3] https://searchfox.org/mozilla-central/rev/3f17a234769d25fca5144ebb8abc8e1cb3c56c16/dom/html/HTMLMediaElement.cpp#5342
Comment on attachment 8975005 [details] [diff] [review]
Bug1453127-2.patch

Review of attachment 8975005 [details] [diff] [review]:
-----------------------------------------------------------------

I also have reservations, and wouldn't land this unless we can really not figure out the root of this.

If my thinking in comment 43 is heading in the right direction, every MediaStreamTrack sink would be affected. This includes at least HTMLMediaElement, MediaRecorder (this), WebAudio and RTCPeerConnection. I would rather see that explored further before giving something like this away publicly. With that in mind I'm clearing the r? for now. I can reconsider if we cannot get further, but then we should add asserts across all sinks, including DOMMediaStream, MediaStreamTrack and MediaStream instances where applicable.
Attachment #8975005 - Flags: review?(apehrson)
Duplicate of this bug: 1460867
Thanks to Bryce's relentless reproing with logs we have confirmation on what is causing the crash. I have minimized said log to exclude irrelevant entries thus making it (a lot) shorter and easier to read. It's attached.

---

An overview of the problem:
- HTMLMediaElement::MozCaptureStream() pre-creates an audio track (ID 1) on main thread because it's playing the ogg file and has metadata for it.

- A MediaRecorder is started for a MediaStream containing the track above with ID 1. It attaches an audio listener to the track with ID 1.

- Content for this audio track never reaches the graph (it must have ended before, or there could be bugs in MediaDecoder/DecodedStream), and so the graph cannot mark ID 1 as taken or signal that the track with ID 1 has ended.

- The webm loads and video gets forwarded to the graph as a track with ID 2.

- Track with ID 2 is ended by the decoder and this is propagated through the graph and signaled to main thread.

- The ogg loads again, but its audio track somehow does not make it to the graph. This must be a separate bug, but it's not critical and part of a prefixed feature. We'll fix it proper when we unprefix this feature instead.

- The webm loads again and video gets forwarded to the graph as a track with ID 2.

- When ID 2 reaches the MediaStream's input stream there's a TrackID collision, because ID 2 was already taken by the same video track in an earlier iteration of the trigger script.

- The source track with ID 2 gets reassigned to ID 1 in the MediaStream's input stream. This is a supported feature. In the log this looks like:
> 51 2018-05-17 16:15:34.908744 UTC - [74744:MediaStreamGrph]: D/TrackUnionStream TrackUnionStream 0x7f92c078e940 added track 1 for input stream 0x7f92c6370310 track 2, start ticks 31872

- The MediaRecorder is still listening to an audio track with ID 1 and is now getting video data from the new track. Tries to cast its MediaSegment to an AudioSegment. Crash.

---

At a higher level this happens because the way HTMLMediaElement::MozCaptureStream() allocates tracks is a mix of doing it on main thread before they exist on the graph thread (the new way), and doing it after-the-fact when graph thread signals that a track was created (the old way).

The new way is nice because it allows tracks to exist as soon as MozCaptureStream() returns, if metadata is known.

Clearly the mix of new and old here gets problematic if a pre-created track never reaches the graph so that main-thread state and graph-thread state are out of sync.


We could fix this by either not pre-creating tracks anymore, or by fixing MediaDecoder/DecodedStream such that all tracks are guaranteed to reach the graph.

The former is simple, but an observable behavior change. The latter could be tricky as it's inherently prone to races.
Flags: needinfo?(apehrson)
Flags: sec-bounty? → sec-bounty+
Whiteboard: [adv-main60+][adv-esr52.8+]
I'm taking this.
Assignee: bvandyk → apehrson
This takes care of the case where the track doesn't get fed to the MediaStreamGraph at all, and thus it ending is not signaled to the corresponding MediaStreamTrack.

This is not required for the sec fix, but it's good form to not leave live tracks around forever. Note that in a case where this is needed, the track would still be live across all seeks until the src changes.

We could do this on Seek() too, but the logic to end tracks on seek lives in MediaDecoder and below, so that would be a larger logic disconnect than this, and is also not to spec.
Attachment #8981361 - Flags: review?(jyavenard)
This ensures TrackID uniqueness in all layers of the DOMMediaStream exposed through mozCaptureStream(), also across multiple MediaDecoders for the same HTMLMediaElement and the same captured DOMMediaStream.

This should completely avoid the sec bug where there's confusion over a certain track's (id'd by TrackID) track type.

With this patch, TrackIDs are allocated as soon as the captured DOMMediaStream's underlying input stream is connected on main thread to the SourceMediaStream where DecodedStream appends data, instead of as earlier, at MediaSink/DecodedStream start time.

This allows DecodedStream/OutputStreamManager's OutputStreamData state to stay in sync with OutputMediaStream state in HTMLMediaElement.
Attachment #8981363 - Flags: review?(jyavenard)
Bryce, could you try to repro the sec issue with these two patches? Thanks!
Flags: needinfo?(bvandyk)
Comment on attachment 8981361 [details] [diff] [review]
Make sure decoder-captured tracks end when changing src

Review of attachment 8981361 [details] [diff] [review]:
-----------------------------------------------------------------

Could this change the timeline order into which tracks are freed ? say if RemoveMediaTracks() is called after?

what are the consequences of that?

::: dom/html/HTMLMediaElement.cpp
@@ +7725,5 @@
> +  for (OutputMediaStream& ms : mOutputStreams) {
> +    if (!ms.mCapturingDecoder) {
> +      continue;
> +    }
> +    for (RefPtr<MediaStreamTrack>& t: ms.mPreCreatedTracks) {

t : ms.mPre
Attachment #8981361 - Flags: review?(jyavenard) → review+
(In reply to Jean-Yves Avenard [:jya] from comment #51)
> Comment on attachment 8981361 [details] [diff] [review]
> Make sure decoder-captured tracks end when changing src
> 
> Review of attachment 8981361 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Could this change the timeline order into which tracks are freed ? say if
> RemoveMediaTracks() is called after?

After what?

This shouldn't change the order as the tracks in `ms.mPreCreatedTracks` are still kept alive by `ms.mStream` which owns them. 
`ms.mStream`, and thus the tracks, is released when `ms` is removed in PlaybackEnded().
RemoveMediaTracks() happens before PlaybackEnded() per [1] (changing state to ended triggers RemoveMediaTracks()).

[1] https://searchfox.org/mozilla-central/rev/5a744713370ec47969595e369fd5125f123e6d24/dom/media/MediaDecoder.cpp#765,767


> what are the consequences of that?
> 
> ::: dom/html/HTMLMediaElement.cpp
> @@ +7725,5 @@
> > +  for (OutputMediaStream& ms : mOutputStreams) {
> > +    if (!ms.mCapturingDecoder) {
> > +      continue;
> > +    }
> > +    for (RefPtr<MediaStreamTrack>& t: ms.mPreCreatedTracks) {
> 
> t : ms.mPre
Fixing the nit. Carries forward r=jya.
Attachment #8981381 - Flags: review+
Attachment #8981361 - Attachment is obsolete: true
Attachment #8981363 - Attachment description: Ensure TrackID uniqueness for captured MediaDecoder → Part 2: Ensure TrackID uniqueness for captured MediaDecoder
Comment on attachment 8981363 [details] [diff] [review]
Ensure TrackID uniqueness for captured MediaDecoder

Review of attachment 8981363 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/mediasink/DecodedStream.cpp
@@ +202,1 @@
>      mStream->AddAudioTrack(aInit.mInfo.mAudio.mTrackId,

use audioTrack

@@ +208,1 @@
>      mStream->AddTrack(aInit.mInfo.mVideo.mTrackId, 0, new VideoSegment());

use videoTrack
Attachment #8981363 - Attachment description: Part 2: Ensure TrackID uniqueness for captured MediaDecoder → Ensure TrackID uniqueness for captured MediaDecoder
Attachment #8981363 - Flags: review?(jyavenard) → review+
In my testing I am immediately hitting one of the new asserts:

> MOZ_ASSERT(mDecoder->NextAvailableTrackIDFor(capturedInputStream) >= out.mNextAvailableTrackID);

With some extra logging I'm seeing NextAvailableTrackId=1, out.mNextAvailableTrackID=2.
Flags: needinfo?(bvandyk)
Thanks Bryce.

(In reply to Bryce Van Dyk (:bryce) from comment #55)
> In my testing I am immediately hitting one of the new asserts:
> 
> > MOZ_ASSERT(mDecoder->NextAvailableTrackIDFor(capturedInputStream) >= out.mNextAvailableTrackID);
> 
> With some extra logging I'm seeing NextAvailableTrackId=1,
> out.mNextAvailableTrackID=2.

Ah yes, the TrackID allocations in DecodedStream/OutputStreamData happen on MediaSink::Start too. We'll just take the max of the DecodedStream TrackID and the HTMLMediaElement TrackID and we should be good.

I also ran across an additional issue where MediaDecoder::Shutdown() (main thread) loses a race against DecodedStream::Start() (MDST thread) where the latter adds tracks to an output stream after HTMLMediaElement has read the supposedly final NextAvailableTrackID value.
I am fixing this by clearing all output streams in OutputStreamManager in MediaDecoderStateMachine::BeginShutdown() (main thread), which is sync to when, and happens after, HTMLMediaElement reads the final NextAvailableTrackID value for each output stream.
Updated per comments. Carries forward r=jya.
Attachment #8981363 - Attachment is obsolete: true
Attachment #8981833 - Flags: review+
This fixes the case where DecodedStream would start and add tracks to the graph after shutdown had been initiated on main thread.

With these 3 patches, everything is smooth for me. Though I haven't been able to repro the original case on a local ASAN/opt build despite various efforts with `stress` and rr chaos mode.

Bryce, could make another attempt to repro with the updates?
Flags: needinfo?(bvandyk)
Attachment #8981835 - Flags: review?(jyavenard)
Attachment #8981835 - Flags: review?(jyavenard) → review+
Current patches look good in my testing. Issue from comment 55 is resolved and after 4+ hours of 4 Firefox instances looping the test case I have not seen the bug reproduce (as tested via assert on the datatype in the listener before the cast).

I'll keep the tests running for a few more hours before I need to reclaim the VM they're running in. Will report back if anything changes, but looks good on my end.
Flags: needinfo?(bvandyk)
Component: Audio/Video → WebRTC: Audio/Video
Comment on attachment 8981381 [details] [diff] [review]
[beta61, esr60] Part 1: Make sure decoder-captured tracks end when changing src

This sec-approval request applies to all three of my patches.

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
One can easily deduce that there's an issue with TrackID collisions. Going from there to figuring out how to trigger that is trickier but possible. Going from there to causing a UAF is even trickier and something the patches do not reveal. Then putting something deliberate in the memory accessed through the UAF (which is a bad cast) would be very hard.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
They mention TrackIDs and uniqueness but as mentioned above there's a fair amount of indirection from there to an exploit.

Which older supported branches are affected by this flaw?
All the way back to esr52. I think I traced it back to 36 (bug 1097224) but I'm not sure how TrackID collisions were handled then.

If not all supported branches, which bug introduced the flaw?
n/a

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
Everything but esr52 grafted clean. Esr52 needed some (not much) massaging. There are no fundamental differences between m-c and esr52 in this code so low risk.

How likely is this patch to cause regressions; how much testing does it need?
Not very likely. It's fairly large but it's mostly a plumbing exercise. The biggest behavioral change is that some tracks can appear live longer than expected (yet never see any actual data). This is the tradeoff we get instead of the sec issue.
Attachment #8981381 - Flags: sec-approval?
Comment on attachment 8981381 [details] [diff] [review]
[beta61, esr60] Part 1: Make sure decoder-captured tracks end when changing src

Sec-approval+ for trunk.

Assuming we need patches for Beta, ESR60, and ESR52 (which I assume), please nominate such as well.
Attachment #8981381 - Flags: sec-approval? → sec-approval+
Andreas, is this ready to land?
Flags: needinfo?(apehrson)
It is!
Flags: needinfo?(apehrson)
Keywords: checkin-needed
Note that the checkin-needed is for my patches marked "Part 1", "Part 2" and "Part 3".
Comment on attachment 8981381 [details] [diff] [review]
[beta61, esr60] Part 1: Make sure decoder-captured tracks end when changing src

This approval request applies to my patches marked "Part 1", "Part 2" and "Part 3".

Approval Request Comment
[Feature/Bug causing the regression]: Unclear/latent
[User impact if declined]: Possible UAF
[Is this code covered by automated tests?]: No
[Has the fix been verified in Nightly?]: No
[Needs manual test from QE? If yes, steps to reproduce]: If they're able to reproduce the original issue, yes. See comment 0 for STR. Bryce and the reporter had some success and might be able to help.
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: Low risk
[Why is the change risky/not risky?]: See details in comment 60
[String changes made/needed]: None
Attachment #8981381 - Flags: approval-mozilla-esr60?
Attachment #8981381 - Flags: approval-mozilla-beta?
Fixes squashed and patched to work on esr 52.

Re-requesting review because I had to create a method on HTMLMediaElement to override marking tracks ended, and protect against null OutputMediaStream::mStream members (from CC unlinking). These extra fixes are unique to 52 and are not needed on 60 or higher.
Attachment #8983375 - Flags: review?(jyavenard)
Attachment #8983375 - Flags: review?(jyavenard) → review+
Comment on attachment 8983375 [details] [diff] [review]
[esr52, squashed] Harden TrackID handling

See comment 65 for the approval request details. There is somewhat larger risk as we're patching older code.

To exemplify this I did find and fix a few small cases when testing locally per comment 66.
Attachment #8983375 - Flags: approval-mozilla-esr52?
Needs rebasing.
Flags: needinfo?(apehrson)
Keywords: checkin-needed
Seems like someone ran clang-format on some of these files in the meantime. I'll put up new patches for m-c and the existing ones can be used for beta and esr60.
Flags: needinfo?(apehrson)
Attachment #8981381 - Attachment description: Part 1: Make sure decoder-captured tracks end when changing src → [beta61, esr60] Part 1: Make sure decoder-captured tracks end when changing src
Attachment #8981833 - Attachment description: Part 2: Ensure TrackID uniqueness for captured MediaDecoder → [beta61, esr60] Part 2: Ensure TrackID uniqueness for captured MediaDecoder
Attachment #8981835 - Attachment description: Part 3: Clear output streams on shutdown → [beta61, esr60] Part 3: Clear output streams on shutdown
Rebased on latest m-c. Carries forward r=jya.
Attachment #8983440 - Flags: review+
Rebased on latest m-c. Carries forward r=jya.
Attachment #8983441 - Flags: review+
Rebased on latest m-c. Carries forward r=jya.
Attachment #8983442 - Flags: review+
Comment on attachment 8981381 [details] [diff] [review]
[beta61, esr60] Part 1: Make sure decoder-captured tracks end when changing src

Fixes a sec-high, approved for 61.0b12, ESR 60.1 and ESR 52.9.
Attachment #8981381 - Flags: approval-mozilla-esr60?
Attachment #8981381 - Flags: approval-mozilla-esr60+
Attachment #8981381 - Flags: approval-mozilla-beta?
Attachment #8981381 - Flags: approval-mozilla-beta+
Attachment #8983375 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
Verified as fixed on latest asan build, downloaded from taskcluster https://tools.taskcluster.net/index/gecko.v2.mozilla-central.latest.firefox/linux64-asan-opt on Linux 18.04.

Also verified with the Nightly build 62.0a1, build id 20180608100105 on Linux 18.06, Mac 10.13 and Windows 10 x64.

Please note that I manage to reproduce the issue on an older Nightly build from 05.07.2018 using 5 concurrent instances of the test open.
Whiteboard: [adv-main61+][adv-esr52.9+][adv-esr60.1+]
Whiteboard: [adv-main61+][adv-esr52.9+][adv-esr60.1+] → [adv-main61+][adv-esr52.9+][adv-esr60.1+][post-critsmash-triage]
Verified on Beta 61.0b14 on Ubuntu 18.06, Mac 10.13 and Windows 10. On Ubuntu we verified the fix also on asan build, downloaded from taskcluster https://tools.taskcluster.net/index/gecko.v2.mozilla-beta.latest.firefox/linux64-asan-opt
Verified on ESR 52.9.0 build ID 20180621064021 and ESR 60.1.0 build ID 20180621121604 on Ubuntu 16.04, Mac 10.13 and Windows 10.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.