Closed Bug 1108455 (CVE-2014-8641) Opened 9 years ago Closed 9 years ago

Execution of arbitrary addresses in relation to WebRTC MediaStreamgraph.

Categories

(Core :: WebRTC, defect)

34 Branch
x86_64
All
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla37
Tracking Status
firefox34 --- wontfix
firefox35 + fixed
firefox36 + fixed
firefox37 + fixed
firefox-esr31 35+ fixed
b2g-v1.4 --- wontfix
b2g-v2.0 --- wontfix
b2g-v2.0M --- wontfix
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed

People

(Reporter: mitchwharper, Assigned: padenot)

Details

(Keywords: sec-critical, valgrind, Whiteboard: [adv-main35+][adv-esr31.4+][b2g-adv-main2.2+])

Attachments

(3 files, 4 obsolete files)

User Agent: Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36

Steps to reproduce:

I've been examining the crash reports for WebAudio and WebRTC, and recently came across this class of crashes: https://crash-stats.mozilla.com/signature/?signature=mozilla%3A%3AMediaStreamGraphImpl%3A%3AOneIteration(__int64%2C+__int64%2C+__int64%2C+__int64)&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&page=1 

All occurred after the release of FF34. I have been attempting to reproduce the crash and have strong reason to believe the execution is occurring within the block at http://hg.mozilla.org/releases/mozilla-release/annotate/8274648ad79f/content/media/MediaStreamGraph.cpp#l1372 after examining the state of registers in that block in comparison to the crash reports.


Actual results:

While I have been working on this, the crash at https://crash-stats.mozilla.com/report/index/4e244741-db33-4400-bc4a-bbe442141208 occurred, and examining that it seems as someone is attempting to execute arbitrary code, and has succeeded in doing so. As they submitted a crash report, I would imagine they will be reporting this as well, but just in case I thought it would be prudent to notify you.
Examining the bug at https://crash-stats.mozilla.com/report/index/2ebf1312-cc30-4522-a6c0-c5bbf2141208, there seems to be a different user with a different locale attempting to work on this as well.
Attached file Valgrind output (obsolete) —
As this is most likely a use after free, I switched over to Linux and ran a built-for-valgrind build of Firefox 35 retrieved from the main FTP server. After visiting https://apprtc.appspot.com/ then https://www.webrtc-experiment.com/RTCMultiConnection/MultiRTC/ then https://www.webrtc-experiment.com/RTCMultiConnection/AppRTC-Look.html, while running Valgrind I found the attached errors. The run was a preliminary 10-minute test run, I didn't keep track of which sites cause which errors.

Although this was run on Ubuntu 14.04, it seems there's a read after free in relation to the way tracks are handled (one in the context of the OneIteration method) and could possibly be leading to the two classes of bugs. I'm attempting to create a valgrind build in Windows to use with Wine for a more careful analysis.
Component: Untriaged → WebRTC
Flags: needinfo?(rjesup)
Keywords: sec-critical
Product: Firefox → Core
Severity: normal → critical
Flags: needinfo?(rjesup)
Paul -- Can you look at this ASAP and let us know what you find?
Assignee: nobody → padenot
Flags: needinfo?(padenot)
The crash reports linked from comment 0 and comment 1 are both 
  https://www.facebook.com/videocall/incall/?<accountstuff>

Doesn't look like someone trying to develop an exploit. But as a popular app causing a dangerous-looking crash it might give someone a leg up on where to look.
Looks like webaudio or webrtc can trigger it (i.e. MSG)
Another thing worth mentioning that I sent in the email report--the set of crashes at https://crash-stats.mozilla.com/signature/?signature=mozilla%3A%3AAudioMixer%3A%3AFinishMixing()&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&page=1#reports is possibly related. The crashes occur in the same context, and they mostly satisfy that register ecx ends in 0x058, edx ends in 0x008, and efl is set to 0x00010202.
The valgrind trace sheds light on a possible cause:
- On the main thread, some message is enqueued in the MSG message queue that tells to disable a track.
- On the MSG thread, the message is enqueued, and ran: the track is disabled. This mutates an array. No locks are taken.
- At the same time, on the IncomingVideoStreamThread, a video frame is pushed to the SourceMediaStream. To do so, it needs to check whether the track is disabled, and reads bytes that have been freed.

The fix is straightforward, we should lock whenever mutating the track disabling array.

I'm still looking at the other issues.
Attached patch r= — — Splinter Review
The code in fact tried to do the right thing, but a missing `virtual` made it
call the wrong method. Also added another virtual wrapper to assert locking
instead of just having a comment.
Attachment #8533643 - Flags: review?(rjesup)
Attachment #8533643 - Flags: review?(rjesup) → review+
Comment on attachment 8533643 [details] [diff] [review]
r=

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
This is read-after-free. This can cause a crash, or a track to be disabled where it should not have been.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
It's rather straightforward to understand the class of issue this fixes (concurrent free/read without mutex).

Which older supported branches are affected by this flaw?
34, 35, 36, 37 (all of them)

If not all supported branches, which bug introduced the flaw?
Not clear, probably since we landed the track disabling

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
Yes (not risky)

How likely is this patch to cause regressions; how much testing does it need?
This is an easy fix, adding locking around a data structure access. A try run should be enough.
Flags: needinfo?(padenot)
Attachment #8533643 - Flags: sec-approval?
Fyi, Bug 1109041 looks to be this bug, in case we want a public bug to land on inconspicuously.
Attached patch cubeb-shutdown-fix — — Splinter Review
I kind of want to try this to check if it fixes globally some issues we had on WASAPI shutdown.
Attachment #8533726 - Flags: review?(kinetik)
Comment on attachment 8533726 [details] [diff] [review]
cubeb-shutdown-fix

Review of attachment 8533726 [details] [diff] [review]:
-----------------------------------------------------------------

I don't think deleting libnestegg will help... ;-)

r+ for the libcubeb changes.
Attachment #8533726 - Flags: review?(kinetik) → review+
Attached file Targeted valgrind output (obsolete) —
I've had some trouble building with valgrind enabled on Windows, so I instead downloaded Firefox 34.0.5 source from the main FTP server and built with the suggestions at https://developer.mozilla.org/en-US/docs/Mozilla/Testing/Valgrind.

This output contains some traces with errors that are closer to the failing method.

Valgrind command: `G_SLICE=always-malloc valgrind --tool=memcheck --vex-iropt-register-updates=allregs-at-mem-access --smc-check=all-non-file ./firefox`

Steps taken:
1. Start the browser
2. Open a new tab (note the use of unitialized memory, this may be a separate bug, and I couldn't find a similar bug in Bugzilla but my Bugzilla-fu isn't great)
3. Visit https://www.webrtc-experiment.com/RTCMultiConnection/MultiRTC/ in two separate tabs
4. Input the same room ID for both instances
5. Enable video and audio on the second tab, and allow access
6. Share my microphone and camera
7. Switch to other tab
8. Enable video and audio on first tab
9. Share camera and microphone
10. Preview camera from second user (this is where the first jump on uninitialized memory occured)
11. Preview microphone from second user
12. Switch tabs
13. Preview camera and mic from first user
14. Exit browser
Whiteboard: [checkin on 12/16]
Comment on attachment 8533643 [details] [diff] [review]
r=

This has sec-approval+ to land on trunk on Dec 16 or later. Please create and nominate Aurora and Beta patches as well (and ESR31 if it is affected).
Attachment #8533643 - Flags: sec-approval? → sec-approval+
Attached file Other Valgrind Output (obsolete) —
Here are two other traces taken from logs generated during random browsing that aren't represented in the previous two.
==3123== Conditional jump or move depends on uninitialised value(s)
==3123==    at 0x91BF9BD: mozilla::AudioCallbackDriver::DataCallback(float*, long) (GraphDriver.cpp:865)

This is fixed on beta/35 and later (Bug 1064966).  It's not a security issue; it just affects the initial duration and how long it takes for that initial value to get filtered down.  If very large, that could cause functionality problems.  It should have been nominated for uplift to 34, but wasn't.
Flags: sec-bounty?
Attached file Valgrind with Track Origins (obsolete) —
Took the same steps as the previous targeted run, and turned on track-origins.

It seems like there might be multiple memory safety issues in play, should I open up separate bugs?
It seems as though the original issue I reported is on track to being solved, and the other hits in valgrind aren't necessarily related to this specific bug.

I've opened bug 1109540, bug 1109543, bug 1109544, bug 1109545, bug 1109546, bug 1109547, bug 1109549, bug 1109550, bug 1109551, bug 1109552, bug 1109554, and bug 1109556, and I believe these include all the unreported hits.
[Tracking Requested - why for this release]:
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Keywords: valgrind
Most of the mismatched malloc/free stuff is bogus, due I believe
to differential inlining of new-calls-malloc vs delete-calls-free.
I routinely run with --show-mismatched-frees=no so as to avoid
the noise.

Some but not all of the other reports have been fixed on the trunk.
I routinely run {mochi,crash}-tests on Valgrind and pick up at
least some of this stuff.  I will try to figure out which are dups.
(In reply to mitchwharper from comment #13)
> Steps taken: [...]

I was unable to reproduce any badness using the trunk, except for
the already-known bug 1078275.

This is with

  G_SLICE=always-malloc valgrind --smc-check=all-non-file 
  --vex-iropt-register-updates=allregs-at-mem-access --read-inline-info=yes
  --fair-sched=yes --trace-children=yes --fair-sched=yes --partial-loads-ok=yes

building with gcc-4.7.2 on Fedora 17 (64-bit)

and mozconfig as follows

. $topsrcdir/browser/config/mozconfig
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/ff-O2-linux64
ac_add_options --enable-tests

ac_add_options --enable-optimize="-g -O2"
ac_add_options --enable-debug-symbols

ac_add_options --enable-valgrind

ac_add_options --enable-profiling
ac_add_options --enable-elf-hack

ac_add_options --disable-crashreporter

ac_add_options --disable-jemalloc

mk_add_options MOZ_MAKE_FLAGS="-j4"
For reference, my mozconfig on the release build was:

mk_add_options MOZ_MAKE_FLAGS="-j4"
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/ff-opt
ac_add_options --disable-tests
ac_add_options --enable-optimize="-g -O -freorder-blocks"
ac_add_options --disable-jemalloc
ac_add_options --enable-valgrind
ac_add_options --enable-debug-symbols

I just finished a build from the trunk about 20 minutes ago, and I haven't seen any of the same issues so far.
I just compiled my 34.0.5 source with the new -Og optimization flag (which is somewhere between -O0 and -O1) and actually found more complaints. 

For reference, I'm on a Toshiba P755-5265 (http://www.toshiba.com/us/computers/laptops/satellite/P750/P755-S5265) with 6GB ram and an i5 processor, and the run didn't feel any slower than -O1. I also built checked out and built valgrind from trunk.

This one in particular caught my eye, if the lower functions are being inlined this may be the cause of this current bug:

==16375== Conditional jump or move depends on uninitialised value(s)
==16375==    at 0x9204604: mozilla::MediaStreamGraphImpl::GraphTimeToStreamTime(mozilla::MediaStream*, long) (MediaStreamGraph.cpp:281)
==16375==    by 0x9205ECA: mozilla::MediaStreamGraphImpl::PlayVideo(mozilla::MediaStream*) (MediaStreamGraph.cpp:1093)
==16375==    by 0x920AF45: mozilla::MediaStreamGraphImpl::Process(long, long) (MediaStreamGraph.cpp:1354)
==16375==    by 0x920B811: mozilla::MediaStreamGraphImpl::OneIteration(long, long, long, long) (MediaStreamGraph.cpp:1415)
==16375==    by 0x91E7F08: mozilla::AudioCallbackDriver::DataCallback(float*, long) (GraphDriver.cpp:917)
==16375==    by 0x91E8041: mozilla::AudioCallbackDriver::DataCallback_s(cubeb_stream*, void*, void*, long) (GraphDriver.cpp:758)
==16375==    by 0x9C1E365: stream_request_callback (cubeb_pulse.c:195)
==16375==    by 0x4092AA76: ??? (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40D875FC: pa_pdispatch_run (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so)
==16375==    by 0x4090F011: ??? (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40D8B44B: ??? (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so)
==16375==    by 0x40922AEB: pa_mainloop_dispatch (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
^ Built with GCC 4.8.2 on Ubuntu 14.10
After looking at the above complaint, I tried to reproduce this in VS2013, but I realized the line at http://mxr.mozilla.org/mozilla-release/source/content/media/MediaStreamGraph.cpp#281 had the potential to propogate. I never checked it after running, but here's the full series of events:

==16375== Conditional jump or move depends on uninitialised value(s)
==16375==    at 0x9204604: mozilla::MediaStreamGraphImpl::GraphTimeToStreamTime(mozilla::MediaStream*, long) (MediaStreamGraph.cpp:281)
==16375==    by 0x9205ECA: mozilla::MediaStreamGraphImpl::PlayVideo(mozilla::MediaStream*) (MediaStreamGraph.cpp:1093)
==16375==    by 0x920AF45: mozilla::MediaStreamGraphImpl::Process(long, long) (MediaStreamGraph.cpp:1354)
==16375==    by 0x920B811: mozilla::MediaStreamGraphImpl::OneIteration(long, long, long, long) (MediaStreamGraph.cpp:1415)
==16375==    by 0x91E7F08: mozilla::AudioCallbackDriver::DataCallback(float*, long) (GraphDriver.cpp:917)
==16375==    by 0x91E8041: mozilla::AudioCallbackDriver::DataCallback_s(cubeb_stream*, void*, void*, long) (GraphDriver.cpp:758)
==16375==    by 0x9C1E365: stream_request_callback (cubeb_pulse.c:195)
==16375==    by 0x4092AA76: ??? (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40D875FC: pa_pdispatch_run (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so)
==16375==    by 0x4090F011: ??? (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40D8B44B: ??? (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so)
==16375==    by 0x40922AEB: pa_mainloop_dispatch (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375== 
==16375== Conditional jump or move depends on uninitialised value(s)
==16375==    at 0x9204704: mozilla::MediaStreamGraphImpl::GraphTimeToStreamTime(mozilla::MediaStream*, long) (MediaStreamGraph.cpp:286)
==16375==    by 0x9205ECA: mozilla::MediaStreamGraphImpl::PlayVideo(mozilla::MediaStream*) (MediaStreamGraph.cpp:1093)
==16375==    by 0x920AF45: mozilla::MediaStreamGraphImpl::Process(long, long) (MediaStreamGraph.cpp:1354)
==16375==    by 0x920B811: mozilla::MediaStreamGraphImpl::OneIteration(long, long, long, long) (MediaStreamGraph.cpp:1415)
==16375==    by 0x91E7F08: mozilla::AudioCallbackDriver::DataCallback(float*, long) (GraphDriver.cpp:917)
==16375==    by 0x91E8041: mozilla::AudioCallbackDriver::DataCallback_s(cubeb_stream*, void*, void*, long) (GraphDriver.cpp:758)
==16375==    by 0x9C1E365: stream_request_callback (cubeb_pulse.c:195)
==16375==    by 0x4092AA76: ??? (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40D875FC: pa_pdispatch_run (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so)
==16375==    by 0x4090F011: ??? (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40D8B44B: ??? (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so)
==16375==    by 0x40922AEB: pa_mainloop_dispatch (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375== 
==16375== Use of uninitialised value of size 8
==16375==    at 0x92046F1: mozilla::MediaStreamGraphImpl::GraphTimeToStreamTime(mozilla::MediaStream*, long) (MediaStreamGraph.cpp:289)
==16375==    by 0x9205ECA: mozilla::MediaStreamGraphImpl::PlayVideo(mozilla::MediaStream*) (MediaStreamGraph.cpp:1093)
==16375==    by 0x920AF45: mozilla::MediaStreamGraphImpl::Process(long, long) (MediaStreamGraph.cpp:1354)
==16375==    by 0x920B811: mozilla::MediaStreamGraphImpl::OneIteration(long, long, long, long) (MediaStreamGraph.cpp:1415)
==16375==    by 0x91E7F08: mozilla::AudioCallbackDriver::DataCallback(float*, long) (GraphDriver.cpp:917)
==16375==    by 0x91E8041: mozilla::AudioCallbackDriver::DataCallback_s(cubeb_stream*, void*, void*, long) (GraphDriver.cpp:758)
==16375==    by 0x9C1E365: stream_request_callback (cubeb_pulse.c:195)
==16375==    by 0x4092AA76: ??? (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40D875FC: pa_pdispatch_run (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so)
==16375==    by 0x4090F011: ??? (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40D8B44B: ??? (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so)
==16375==    by 0x40922AEB: pa_mainloop_dispatch (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375== 
==16375== Use of uninitialised value of size 8
==16375==    at 0x9204727: mozilla::MediaStreamGraphImpl::GraphTimeToStreamTime(mozilla::MediaStream*, long) (MediaStreamGraph.cpp:293)
==16375==    by 0x9205ECA: mozilla::MediaStreamGraphImpl::PlayVideo(mozilla::MediaStream*) (MediaStreamGraph.cpp:1093)
==16375==    by 0x920AF45: mozilla::MediaStreamGraphImpl::Process(long, long) (MediaStreamGraph.cpp:1354)
==16375==    by 0x920B811: mozilla::MediaStreamGraphImpl::OneIteration(long, long, long, long) (MediaStreamGraph.cpp:1415)
==16375==    by 0x91E7F08: mozilla::AudioCallbackDriver::DataCallback(float*, long) (GraphDriver.cpp:917)
==16375==    by 0x91E8041: mozilla::AudioCallbackDriver::DataCallback_s(cubeb_stream*, void*, void*, long) (GraphDriver.cpp:758)
==16375==    by 0x9C1E365: stream_request_callback (cubeb_pulse.c:195)
==16375==    by 0x4092AA76: ??? (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40D875FC: pa_pdispatch_run (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so)
==16375==    by 0x4090F011: ??? (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40D8B44B: ??? (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so)
==16375==    by 0x40922AEB: pa_mainloop_dispatch (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375== 
==16375== Conditional jump or move depends on uninitialised value(s)
==16375==    at 0x9205F33: mozilla::MediaStreamGraphImpl::PlayVideo(mozilla::MediaStream*) (MediaSegment.h:335)
==16375==    by 0x920AF45: mozilla::MediaStreamGraphImpl::Process(long, long) (MediaStreamGraph.cpp:1354)
==16375==    by 0x920B811: mozilla::MediaStreamGraphImpl::OneIteration(long, long, long, long) (MediaStreamGraph.cpp:1415)
==16375==    by 0x91E7F08: mozilla::AudioCallbackDriver::DataCallback(float*, long) (GraphDriver.cpp:917)
==16375==    by 0x91E8041: mozilla::AudioCallbackDriver::DataCallback_s(cubeb_stream*, void*, void*, long) (GraphDriver.cpp:758)
==16375==    by 0x9C1E365: stream_request_callback (cubeb_pulse.c:195)
==16375==    by 0x4092AA76: ??? (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40D875FC: pa_pdispatch_run (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so)
==16375==    by 0x4090F011: ??? (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40D8B44B: ??? (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so)
==16375==    by 0x40922AEB: pa_mainloop_dispatch (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40922ED9: pa_mainloop_iterate (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375== 
==16375== Conditional jump or move depends on uninitialised value(s)
==16375==    at 0x9205F51: mozilla::MediaStreamGraphImpl::PlayVideo(mozilla::MediaStream*) (MediaSegment.h:342)
==16375==    by 0x920AF45: mozilla::MediaStreamGraphImpl::Process(long, long) (MediaStreamGraph.cpp:1354)
==16375==    by 0x920B811: mozilla::MediaStreamGraphImpl::OneIteration(long, long, long, long) (MediaStreamGraph.cpp:1415)
==16375==    by 0x91E7F08: mozilla::AudioCallbackDriver::DataCallback(float*, long) (GraphDriver.cpp:917)
==16375==    by 0x91E8041: mozilla::AudioCallbackDriver::DataCallback_s(cubeb_stream*, void*, void*, long) (GraphDriver.cpp:758)
==16375==    by 0x9C1E365: stream_request_callback (cubeb_pulse.c:195)
==16375==    by 0x4092AA76: ??? (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40D875FC: pa_pdispatch_run (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so)
==16375==    by 0x4090F011: ??? (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40D8B44B: ??? (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so)
==16375==    by 0x40922AEB: pa_mainloop_dispatch (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
==16375==    by 0x40922ED9: pa_mainloop_iterate (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2)
I've tried re-running this with --track-origins=yes set, but it's just prohibitively slow.

The line where the first uninitialized dependency was found was

81   if (aTime <= IterationEnd()) {

I don't think it can be aTime, it's used quite a bit in GraphTimeToStreamTime's caller before being passed in (and maybe it would have complained about its use in the call anyway?)

IterationEnd is defined on line 358. Then it's either in the call to IterationEnd or CurrentDriver.

358 GraphTime
359 MediaStreamGraphImpl::IterationEnd()
360 {
361   return CurrentDriver()->IterationEnd();
362 }

For the GraphDriver, IterationEnd is defined at http://mxr.mozilla.org/mozilla-release/source/content/media/GraphDriver.h#137

137   GraphTime IterationEnd() {
138     return mIterationEnd;
139   }

Which is always initialized at http://mxr.mozilla.org/mozilla-release/source/content/media/GraphDriver.cpp#51

49 GraphDriver::GraphDriver(MediaStreamGraphImpl* aGraphImpl)
50   : mIterationStart(0),
51     mIterationEnd(0),
52     mStateComputedTime(0),
53     mNextStateComputedTime(0),
54     mGraphImpl(aGraphImpl),
55     mWaitState(WAITSTATE_RUNNING),
56     mCurrentTimeStamp(TimeStamp::Now()),
57     mPreviousDriver(nullptr),
58     mNextDriver(nullptr)
59 { }

That would imply the call to CurrentDriver is the culprit.

So before I ever submitted this or ran Valgrind, it seemed to me the only possible culprit, given the context I thought this was occurring, in was the CurrentDriver call (I can't remember the assertions I could make, but I was ready to submit this bug claiming that as the culprit). Especially as the comment (http://mxr.mozilla.org/mozilla-release/source/content/media/MediaStreamGraphImpl.h#432) above it claims:

431   /**
432    * Not safe to call off the MediaStreamGraph thread unless monitor is held!
433    */
434   GraphDriver* CurrentDriver() {
435     AssertOnGraphThreadOrNotRunning();
436     return mDriver;
437   }


I ended up writing this off though, as I didn't understand the distinction between the MediaSteamGraph and the MediaStreamGraphImpl. When looking at the call stack for the first complaint, it seems to be within the context of a callback, and the thread this all occurred in was reported by Valgrind as

==16375== 
==16375== Thread 51 threaded-ml:

It's also worth noting DEBUG wasn't set in my build, so the assertion may not have been tripped.

What's the definition of the MediaStreamGraph thread?
Flags: needinfo?(padenot)
It's also probably worth noting the class of crashes with reason EXCEPTION_ACCESS_VIOLATION_EXEC is now a top crasher for Firefox 34.
OS: Windows 8 → All
After thinking about this some more, the CurrentDriver I would assume is always initialized, if this was some kind of thread safety issue with the driver the error would be "Invalid Use" instead of "depends on unitialized." I gave it one more try with track-origins set, and was able to reproduce the trace. I was too hasty in dismissing aTime.

The first related complaint is actually higher up the call stack, I just missed it last time because a few other complaints split it off.

The first hit is

==3550== Conditional jump or move depends on uninitialised value(s)
==3550==    at 0x91BF9BD: mozilla::AudioCallbackDriver::DataCallback(float*, long) (GraphDriver.cpp:865)
...
==3550==  Uninitialised value was created by a heap allocation
==3550==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3550==    by 0x40334E6: moz_xmalloc (mozalloc.cpp:52)
==3550==    by 0x91DD80E: mozilla::MediaStreamGraphImpl::MediaStreamGraphImpl(bool, int, unsigned char, mozilla::dom::AudioChannel) (mozalloc.h:201)

The line where the conditional jump occurs (http://mxr.mozilla.org/mozilla-release/source/content/media/GraphDriver.cpp#865)

863   // For now, simply average the duration with the previous
864   // duration so there is some damping against sudden changes.
865   if (!mIterationDurationMS) {
866     mIterationDurationMS = durationMS;
867   } else {
868     mIterationDurationMS = (mIterationDurationMS*3) + durationMS;
869     mIterationDurationMS /= 4;
870   } 

And mIterationDurationMS must be used later on to calculate aTime.
Attachment #8533202 - Attachment is obsolete: true
Attachment #8533915 - Attachment is obsolete: true
Attachment #8534006 - Attachment is obsolete: true
Attachment #8534153 - Attachment is obsolete: true
Flags: needinfo?(padenot)
Looks like the above is just a continuation of bug 1064966, and wouldn't be the cause of this bug as it was fixed in 35, but crashes occur in 35.
Looking through the crash reports some more, this one appears to be the smoking gun:

https://crash-stats.mozilla.com/report/index/69abfb83-1c3f-496d-927c-14d612141212

Version: 37.0a1
Crash Location: http://hg.mozilla.org/mozilla-central/annotate/035a951fc24a/dom/media/MediaStreamGraph.cpp#l1360
Crash Reason: EXCEPTION_ACCESS_VIOLATION_EXEC


1361 if (CurrentDriver()->AsAudioCallbackDriver() && ticksPlayed) {
1360   mMixer.FinishMixing();
1361 }
A different crash inside the Process method with the program counter at 0x0 (on Android):

https://crash-stats.mozilla.com/report/index/804aa652-ff7f-4e38-8ba9-e4a632141213

Crash Location (http://hg.mozilla.org/releases/mozilla-release/annotate/d453f75a37fb/content/media/MediaStreamGraph.cpp#l1331):

1329 // Since an AudioNodeStream is present, go ahead and
1330 // produce audio block by block for all the rest of the streams.
1331 ProduceDataForStreamsBlockByBlock(i, n->SampleRate(), aFrom, aTo);

Actually, it seems like there are multiple places throughout the OneIteration method where this is occurring:

https://crash-stats.mozilla.com/search/?signature=~MediaStreamGraphImpl&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature
Comment on attachment 8533726 [details] [diff] [review]
cubeb-shutdown-fix

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
It's a race with a thread not controlled by the application, exploiting this would be rather hard.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
One would have to have a clear understanding of how an audio stack works

Which older supported branches are affected by this flaw?
release to nightly
If not all supported branches, which bug introduced the flaw?
--

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
yes, this code has not moved a lot

How likely is this patch to cause regressions; how much testing does it need?
This is rather straightforward. Worst case it cuts the audio by a tiny buffer at the end of a stream.
Attachment #8533726 - Flags: sec-approval?
Attachment #8533726 - Flags: sec-approval? → sec-approval+
Whiteboard: [checkin on 12/16]
https://hg.mozilla.org/mozilla-central/rev/d77fe45d2ab5
https://hg.mozilla.org/mozilla-central/rev/c233ca053f7c
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla37
Paul, could you fill the uplift request for aurora & beta? Merci!
Flags: needinfo?(padenot)
Comment on attachment 8533726 [details] [diff] [review]
cubeb-shutdown-fix

Approval Request Comment
[Feature/regressing bug #]: initial wasapi backed landing + MSG refactoring
[User impact if declined]: crash
[Describe test coverage new/current, TBPL]: landed for some time
[Risks and why]: low risk, the fix is simple
[String/UUID change made/needed]: none
Flags: needinfo?(padenot)
Attachment #8533726 - Flags: approval-mozilla-beta?
Attachment #8533726 - Flags: approval-mozilla-aurora?
Comment on attachment 8533643 [details] [diff] [review]
r=

Approval Request Comment
[Feature/regressing bug #]: track disabling landing
[User impact if declined]: memory access cross thread without locking, crash or silence instead of sound/video
[Describe test coverage new/current, TBPL]: tests exist, assertion added
[Risks and why]: low risk, this is correcting an obvious issue in a straightforward way. Also this adds an assertion to make sure it's doing the right thing
[String/UUID change made/needed]: none
Attachment #8533643 - Flags: approval-mozilla-beta?
Attachment #8533643 - Flags: approval-mozilla-aurora?
Attachment #8533726 - Flags: approval-mozilla-beta?
Attachment #8533726 - Flags: approval-mozilla-beta+
Attachment #8533726 - Flags: approval-mozilla-aurora?
Attachment #8533726 - Flags: approval-mozilla-aurora+
Attachment #8533643 - Flags: approval-mozilla-beta?
Attachment #8533643 - Flags: approval-mozilla-beta+
Attachment #8533643 - Flags: approval-mozilla-aurora?
Attachment #8533643 - Flags: approval-mozilla-aurora+
By the way, is ESR31 affected?
(In reply to Sylvestre Ledru [:sylvestre] from comment #42)
> By the way, is ESR31 affected?

The first patch (read-after-free) affects all branches and ESR31.  It has low risk other than possible crash (very unlikely).  I think the second may only affect 34 on, since I think it's part of the callback-driven mode that landed in bug 848954.
Flags: needinfo?(padenot)
Yes, that's right.
Flags: needinfo?(padenot)
Attachment #8533643 - Flags: approval-mozilla-esr31+
FYI, I marked the bug as esr31 not affected because the scary bug here is 34+.
https://hg.mozilla.org/releases/mozilla-aurora/rev/29265b95af97
https://hg.mozilla.org/releases/mozilla-aurora/rev/542c432f2c3b

https://hg.mozilla.org/releases/mozilla-beta/rev/bbd372f6967b
https://hg.mozilla.org/releases/mozilla-beta/rev/a0a3a83d6fe1

https://hg.mozilla.org/releases/mozilla-b2g34_v2_1/rev/7b3033487fb0
https://hg.mozilla.org/releases/mozilla-b2g34_v2_1/rev/46dd9727efbe

https://hg.mozilla.org/releases/mozilla-esr31/rev/d458adbbdb70

Calling this wontfix for b2g30/b2g32 based on the comments that only 34+ are affected by the main sec bug here. And marking esr31 fixed for tracking purposes (recognizing that this bug has reduced severity there).

BTW, could we have at least put a Part 1 and Part 2 in the commit messages so we wouldn't land two different patches from the same bug with identical commit messages?
For some reason, Win7/Win8 opt builds are hitting failures in test_playback_rate.html on b2g34. Debug builds and all other platforms are fine. Aurora and Beta are both fine across the board. I've talked to Randell and Paul about it and we all agree that it's not worth a backout on a B2G release branch for a Windows-only failure, but we're also not exactly sure the best path forward is yet either (disabling the test vs. trying to figure out what fixed it on 35+ for backporting). So we're going to let it ride for now until that's decided.
NEEDINFO-ing myself so I investigate a bit tomorrow to try to find the best way forward here.
Flags: needinfo?(padenot)
Flags: sec-bounty? → sec-bounty+
(In reply to Paul Adenot (:padenot) from comment #48)
> NEEDINFO-ing myself so I investigate a bit tomorrow to try to find the best
> way forward here.

Any news here? :)
Whiteboard: [adv-main35+][adv-esr31.4+]
Alias: CVE-2014-8641
This is probably caused by a patch that is on trunk an not on b2g34, or vice-versa. I'm going to try to find it tomorrow.
I think we miss 230920:67b4334af9b1 on b2g34. Ryan, do you think pushing to try with this patch would work? I can't remember if pushing old trees to try is supposed to work on not.
Flags: needinfo?(padenot) → needinfo?(ryanvm)
Yeah, it should work fine as long as it's a platform/test that normally runs there.
Flags: needinfo?(ryanvm)
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #53)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=e2bd180e8112

Still failing it appears.
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #47)

Given the complete lack of urgency demonstrated here for the last month, I'm going to go ahead and disable the test on Windows tomorrow. If and when someone ever wants to investigate, we can always re-enable then.
Whiteboard: [adv-main35+][adv-esr31.4+] → [adv-main35+][adv-esr31.4+][b2g-adv-main2.2+]
Group: core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.