Closed
Bug 1408276
Opened 7 years ago
Closed 7 years ago
races with LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP and NotifyOutputData()
Categories
(Core :: Audio/Video: MediaStreamGraph, defect, P1)
Core
Audio/Video: MediaStreamGraph
Tracking
()
RESOLVED
FIXED
mozilla59
People
(Reporter: karlt, Assigned: karlt)
References
Details
(Keywords: crash, csectype-wildptr, sec-high, Whiteboard: [adv-main58+][adv-esr52.6+][post-critsmash-triage])
Attachments
(3 files, 4 obsolete files)
2.52 KB,
patch
|
padenot
:
review+
|
Details | Diff | Splinter Review |
8.60 KB,
patch
|
padenot
:
review+
pehrsons
:
feedback+
gchang
:
approval-mozilla-beta+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
8.78 KB,
patch
|
gchang
:
approval-mozilla-esr52+
|
Details | Diff | Splinter Review |
1) When MediaStreamGraphImpl::UpdateMainThreadState() determines that the graph is empty, it sets mLifecycleState = LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP and then releases the graph monitor. Consider that GraphDriver is an AudioCallbackDriver. Consider that a new message is subsequently received: A2) If MediaStreamGraphImpl::RunInStableState() calls Revive() on the GraphDriver. Typically there is a NextDriver() because there are no audio outputs when the graph is empty. In this case: Aa3) Revive() calls ThreadedDriver::Start(), which triggers MediaStreamGraphInitThreadRunnable(), which schedules an async Stop() of the AudioCallbackDriver and calls RunThread() Aa4) AudioCallbackDriver::DataCallback() has not yet completed and calls NotifyOutputData(). This accesses mAudioInputs which can race if new mAudioInputs are added to the graph from RunThread(). Aa5) AudioCallbackDriver::Stop() calls stop on the cubeb stream, and DataCallback() returns to indicate that the stream drain. What if there is not a NextDriver()? The only case I can currently see where UpdateStreamOrder() would not SetNextDriver() on an empty graph would be if !AudioCallbackDriver::IsStarted(). https://treeherder.mozilla.org/#/jobs?repo=try&revision=d77478b6f51a810f54535929b5ad5699578bd3d9 adds assertions that this case is not happening, and the tests do not detect. In this case: Ab3) Revive() triggers AudioCallbackDriver::Start() which calls destroy() on the old stream and starts a new stream. stop() has not been called on the old stream. Ab4) AudioCallbackDriver::mBuffer is modified during DataCallback() for the old stream. Presumably this races with mBuffer access from DataCallback() for the new stream. NotifyOutputData() has a similar race. Ab5) DataCallback() returns to indicate that the old cubeb stream drain. Consider that no new message is received. C2) MediaStreamGraphImpl::RunInStableState() dispatches MediaStreamGraphShutDownRunnable to main thread. C3) MediaStreamGraphShutDownRunnable::Run() calls Destroy() on the graph which releases its self reference. C4) AudioCallbackDriver::DataCallback() calls NotifyOutputData() on the graph which no longer exists.
Assignee | ||
Comment 1•7 years ago
|
||
This race would be hard to engineer.
Updated•7 years ago
|
Group: core-security → media-core-security
Comment 2•7 years ago
|
||
Amazing analysis, thanks, I wonder how you found this. Also, I'm currently rewriting a bunch of things around mAudioInputs, currently thinking about the optimal lifetime stories for all this. We'll see how it goes, but thanks for the detailed scenario, I'll run it against my new designs.
Assignee | ||
Comment 3•7 years ago
|
||
Thanks, Paul. Sorry, this path should be fine due to the NS_DISPATCH_SYNC in AudioCallbackDriver::Shutdown() before C3. (In reply to Karl Tomlinson (:karlt) from comment #0) > Consider that no new message is received. > > C2) MediaStreamGraphImpl::RunInStableState() dispatches > MediaStreamGraphShutDownRunnable to main thread. > > C3) MediaStreamGraphShutDownRunnable::Run() calls Destroy() on the graph > which > releases its self reference. > > C4) AudioCallbackDriver::DataCallback() calls NotifyOutputData() on the graph > which no longer exists. I'm planning to add AsyncCubebOperation::SHUTDOWN for Ab3, which should also resolve Ab4. https://treeherder.mozilla.org/#/jobs?repo=try&revision=6d715927e365367cf0dbb26350ed4088f27aab2f The Revive() path is an edge case that is hard to test. For Aa4, I'm planning to move the transition to LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP to the end of DataCallback(), and something similar for ThreadedDriver(). Perhaps that's unnecessary if moving NotifyOutputData() is part of your plans. Even if it's not, then moving LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP may still help with maintaining the concept of a single graph thread at a time.
Assignee | ||
Comment 4•7 years ago
|
||
The races with pointers in mAudioInputs are still technically sec-critical and mitigating circumstances reduce this to only sec-high, even if very difficult to reproduce in this case, and so sec-approval will be required. The patches are tested on try (without referencing this bug): https://treeherder.mozilla.org/#/jobs?repo=try&revision=7e62b7431c4b51ab96491180b861c813906a8867 I don't know the consequences of destroying a cubeb stream before stop. This try run, with no stops, didn't find any obvious problems, assuming "Assertion failure: [GFX1]: BlitImage(SurfaceTextureImage)" is not related. https://treeherder.mozilla.org/#/jobs?repo=try&revision=8dba813a6f6c44442968e66c25bace2551b65dca Should we backport the missing stop?
Assignee | ||
Comment 5•7 years ago
|
||
Attachment #8922609 -
Flags: review?(padenot)
Assignee | ||
Comment 6•7 years ago
|
||
Attachment #8922610 -
Flags: review?(padenot)
Assignee | ||
Updated•7 years ago
|
Attachment #8922609 -
Attachment description: remove unused GraphDriver::Destroy() → part 2: stop cubeb stream before it gets destroyed when starting new stream on Revive()
Assignee | ||
Comment 7•7 years ago
|
||
Comment on attachment 8922610 [details] [diff] [review] part 3: assert that an old stream is stopped before a new stream is created Moving the implicit cubeb_stream_destroy() from Init() to Stop() makes the assertion possible.
Attachment #8922610 -
Attachment description: stop cubeb stream before it gets destroyed when starting new stream on Revive() → part 3: assert that an old stream is stopped before a new stream is created
Assignee | ||
Comment 8•7 years ago
|
||
Attachment #8922613 -
Flags: review?(padenot)
Assignee | ||
Comment 9•7 years ago
|
||
Assignee | ||
Updated•7 years ago
|
Attachment #8922614 -
Flags: review?(padenot)
Assignee | ||
Comment 10•7 years ago
|
||
The NotifyOutput race was introduced for 46 in https://hg.mozilla.org/mozilla-central/rev/2d18cbea1800c23f06d78fc7c9c64503d2ab1ce3 The stop was also missing at that point.
status-firefox56:
--- → affected
status-firefox57:
--- → affected
status-firefox-esr52:
--- → affected
Comment 11•7 years ago
|
||
Comment on attachment 8922613 [details] [diff] [review] part 4: move to LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP at end of iteration Review of attachment 8922613 [details] [diff] [review]: ----------------------------------------------------------------- A little comment. r+ however. ::: dom/media/GraphDriver.cpp @@ +332,5 @@ > + LOG(LogLevel::Debug, > + ("MediaStreamGraph %p waiting for main thread cleanup", this)); > + mGraphImpl->mLifecycleState = > + MediaStreamGraphImpl::LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP; > + mGraphImpl->EnsureStableStateEventPosted(); It feels weird to touch mLifecycleState and EnsureStableStateEventPosted from a driver. I'd push this into the MediaStreamGraph, at the end of OneIteration.
Attachment #8922613 -
Flags: review?(padenot) → review+
Updated•7 years ago
|
Attachment #8922614 -
Flags: review?(padenot) → review+
Comment 12•7 years ago
|
||
Comment on attachment 8922609 [details] [diff] [review] part 2: stop cubeb stream before it gets destroyed when starting new stream on Revive() Review of attachment 8922609 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/GraphDriver.cpp @@ +797,5 @@ > ("Starting audio threads for MediaStreamGraph %p from a new thread.", > mGraphImpl)); > + // "`cubeb_stream_stop` MUST be called before destroying a stream." > + // The AsyncCubebTasks are performed sequentially and so shutdown > + // completes before init. This is perfectly true. However, in the backends, code, we see: - We join the thread on windows: https://github.com/kinetiknz/cubeb/blob/master/src/cubeb_wasapi.cpp#L1904 (including a little dance in a weird case) - On OSX: https://github.com/kinetiknz/cubeb/blob/master/src/cubeb_audiounit.cpp#L2617. More over, we have the guarantee that AudioOutputUnitStop is synchronous. If called outside the callback, the callback will not be called after it has returned. - On Linux/Pulse: https://github.com/kinetiknz/cubeb/blob/master/src/cubeb_pulse.c#L957 the callback is actually removed Have you observed this being not true? Maybe I'm missing something.
Attachment #8922609 -
Flags: review?(padenot) → review+
Updated•7 years ago
|
Attachment #8922610 -
Flags: review?(padenot) → review+
Assignee | ||
Comment 13•7 years ago
|
||
(In reply to Paul Adenot (:padenot) from comment #12) > This is perfectly true. However, in the backends, code, we see: > - We join the thread on windows: > https://github.com/kinetiknz/cubeb/blob/master/src/cubeb_wasapi.cpp#L1904 > (including a little dance in a weird case) > - On OSX: > https://github.com/kinetiknz/cubeb/blob/master/src/cubeb_audiounit.cpp#L2617. > More over, we have the guarantee that AudioOutputUnitStop is synchronous. If > called outside the callback, the callback will not be called after it has > returned. > - On Linux/Pulse: > https://github.com/kinetiknz/cubeb/blob/master/src/cubeb_pulse.c#L957 the > callback is actually removed > > Have you observed this being not true? Maybe I'm missing something. The issues reported here are just from analyzing the code because I wanted to understand it so that I could move some things around for bug 1228226. The requirement was added to the documentation in https://github.com/kinetiknz/cubeb/commit/dcc90f40c89cd304a85ed687b59b77d0542dbd2c#diff-a2acb4d7d9909e5853f52f32b203283dR430 How do I find out what comments that is addressing? Based on the requirement for stream_stop in cubeb documentation, I wondered whether the stop call may be required to ensure that the callback has returned before stream_destroy started deleting structures, or a new stream is started. If stop is not required, then we don't need to backport this patch. For the pulse backend, completion of the callback seems to depend on pulse holding the mainloop lock during the entire duration of the write callback, and on cubeb_stream_destroy() taking the lock before it does anything. I can verify the latter, and I assume the former is also true, as "Events that are dispatched from the event loop thread are executed with this lock held." https://freedesktop.org/software/pulseaudio/doxygen/thread-mainloop_8h.html#a2fd002f510ed24cabacb63a26bf4c977 Your reasoning for OSX and WASAPI is also convincing. I haven't found appropriate documentation for OpenSL AUDIOPLAYER, but if Destroy() is synchronous, then that should be good too. That would mean that the only purpose of this patch is to comply with the cubeb documentation.
Assignee | ||
Comment 14•7 years ago
|
||
(In reply to Paul Adenot (:padenot) from comment #11) > ::: dom/media/GraphDriver.cpp > @@ +332,5 @@ > > + LOG(LogLevel::Debug, > > + ("MediaStreamGraph %p waiting for main thread cleanup", this)); > > + mGraphImpl->mLifecycleState = > > + MediaStreamGraphImpl::LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP; > > + mGraphImpl->EnsureStableStateEventPosted(); > > It feels weird to touch mLifecycleState and EnsureStableStateEventPosted > from a driver. I'd push this into the MediaStreamGraph, at the end of > OneIteration. Prior to this patch, because UpdateMainThreadState() returns immediately after this, it is already at the end of OneIteration(). The problem is that doing this is handing over the graph to another thread, but DataCallback() uses objects assuming that no other thread is running the graph. Perhaps NotifyOutputData() could be called from OneIteration(), but moving mBuffer.BufferFilled() is more awkward. http://searchfox.org/mozilla-central/rev/1ebd2eff44617df3b82eea7d2f3ca1b60cc591a0/dom/media/GraphDriver.cpp#1002 I was considering moving UpdateMainThreadState() out of OneIteration() and calling UpdateMainThreadState() from the drivers. That would keep all main thread messages in one place. What do you think?
Attachment #8923285 -
Flags: review?(padenot)
Updated•7 years ago
|
Rank: 7
Priority: -- → P1
Assignee | ||
Comment 15•7 years ago
|
||
(In reply to Karl Tomlinson (:karlt) from comment #13) > That would mean that the only purpose of this patch is to comply with the > cubeb documentation. Except that clearing mAudioStream after stop also makes the test added at https://hg.mozilla.org/integration/mozilla-inbound/rev/1e4bb3a203a46f97ba0b5bb681bb1fb15e4474e4#l1.16 behave correctly in the case of errors during start from Revive(), and so there is value in uplifting the patches related to stop. Due to that need for null mAudioStream during cubeb_stream_init(), it would be better to move the assert in Init() to before the cubeb_stream_init() call. The assert could even be at the beginning of the method.
Assignee | ||
Comment 16•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=192cd8cfccaccc25c930ac70b76bb2e217ad04ff
Comment 17•7 years ago
|
||
(In reply to Karl Tomlinson (:karlt) from comment #14) > Created attachment 8923285 [details] [diff] [review] > part 4 v2: move to LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP at end of > iteration > > (In reply to Paul Adenot (:padenot) from comment #11) > > ::: dom/media/GraphDriver.cpp > > @@ +332,5 @@ > > > + LOG(LogLevel::Debug, > > > + ("MediaStreamGraph %p waiting for main thread cleanup", this)); > > > + mGraphImpl->mLifecycleState = > > > + MediaStreamGraphImpl::LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP; > > > + mGraphImpl->EnsureStableStateEventPosted(); > > > > It feels weird to touch mLifecycleState and EnsureStableStateEventPosted > > from a driver. I'd push this into the MediaStreamGraph, at the end of > > OneIteration. > > Prior to this patch, because UpdateMainThreadState() returns immediately > after > this, it is already at the end of OneIteration(). > > The problem is that doing this is handing over the graph to another thread, > but DataCallback() uses objects assuming that no other thread is running the > graph. Perhaps NotifyOutputData() could be called from OneIteration(), but > moving mBuffer.BufferFilled() is more awkward. > > http://searchfox.org/mozilla-central/rev/ > 1ebd2eff44617df3b82eea7d2f3ca1b60cc591a0/dom/media/GraphDriver.cpp#1002 > > I was considering moving UpdateMainThreadState() out of OneIteration() and > calling UpdateMainThreadState() from the drivers. That would keep all main > thread messages in one place. > > What do you think? I like this one better, yes. One of the job of a GraphDriver is to "drive" the graph (hence the name). `UpdateMainThreadState` decides whether or not to continue rendering, so it's conceptually valid. Touching mLifecycleState and EnsureStateStateEventPosted from the driver felt like an encapsulation breakage.
Updated•7 years ago
|
Attachment #8923285 -
Flags: review?(padenot) → review+
Assignee | ||
Comment 18•7 years ago
|
||
Comment on attachment 8922609 [details] [diff] [review] part 2: stop cubeb stream before it gets destroyed when starting new stream on Revive() Having noticed https://bugzilla.mozilla.org/show_bug.cgi?id=1382366#c19 I realize that stopping the stream while mAudioStream is set and there is no scheduled next driver would add another path for similar issues, and so I'll drop this patch from this bug. Thanks for pointing out that a missing stop is not a security issue. I'll post the patch elsewhere when this bug and bug 1382366 are sorted out.
Attachment #8922609 -
Attachment is obsolete: true
Assignee | ||
Comment 19•7 years ago
|
||
Comment on attachment 8922610 [details] [diff] [review] part 3: assert that an old stream is stopped before a new stream is created This patch depended on attachment 8922609 [details] [diff] [review] and so I'll also drop this patch. As failure to stop is not currently a security issue, the asserts may be pointless. (In reply to Karl Tomlinson (:karlt) from comment #15) > Except that clearing mAudioStream after stop also makes the test added at > https://hg.mozilla.org/integration/mozilla-inbound/rev/ > 1e4bb3a203a46f97ba0b5bb681bb1fb15e4474e4#l1.16 > behave correctly in the case of errors during start from Revive(), > and so there is value in uplifting the patches related to stop. I'll fix this differently together with fixing bug 1382366.
Attachment #8922610 -
Attachment is obsolete: true
Assignee | ||
Comment 20•7 years ago
|
||
For bug 1382366, I want to set state on a graph driver member after finding out that the graph is finishing but before handing over to the main thread for clean up. The approach in v2 precludes that. I'm sorry about the churn here. This v3 is conceptually similar to v1 but adds a graph method specifically for this purpose, and so there are separate functions for UpdateMainThreadState() and SignalMainThreadCleanup(). Like v2 however, the existing locking is left unchanged. There is also a documentation fix for mLifecycleState to indicate the real threads involved. https://treeherder.mozilla.org/#/jobs?repo=try&revision=b3d9e9ec6b167b079cb48e7f0bf705e90eea688f&selectedJob=142117514 The same c1 leaks occurred on the base m-c revision: https://treeherder.mozilla.org/logviewer.html#?job_id=142033133&repo=mozilla-central
Attachment #8922613 -
Attachment is obsolete: true
Attachment #8923285 -
Attachment is obsolete: true
Attachment #8925420 -
Flags: review?(padenot)
Comment 21•7 years ago
|
||
Comment on attachment 8925420 [details] [diff] [review] part 4 v3: move to LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP at end of iteration Review of attachment 8925420 [details] [diff] [review]: ----------------------------------------------------------------- lgtm, though I don't really have much opinion on where responsibilities should lie
Attachment #8925420 -
Flags: feedback+
Updated•7 years ago
|
Attachment #8925420 -
Flags: review?(padenot) → review+
Assignee | ||
Comment 22•7 years ago
|
||
Assignee | ||
Comment 23•7 years ago
|
||
Comment on attachment 8925420 [details] [diff] [review] part 4 v3: move to LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP at end of iteration [Security approval request comment] > How easily could an exploit be constructed based on the patch? A exploit would require the race-winning threads to run much faster than another thread. At this point, the thread that would need to be slow should not be influenced by content (because this situation occurs when the graph is empty), and so it is very difficult to rig the race. An attacker could trigger many of these races, and trigger many threads so as to increase the chance that some are starved, and so I guess have a chance of triggering this situation. > Do comments in the patch, the check-in comment, or tests included in the > patch paint a bulls-eye on the security problem? Comments in the patch describe how the code works and what is changed. They describe a change in order of execution, from which it is not too difficult to deduce a UaF. > Which older supported branches are affected by this flaw? Versions since 46. > If not all supported branches, which bug introduced the flaw? All supported. > Do you have backports for the affected branches? > If not, how different, hard to create, and risky will they be? Backport for 52 is ready. Others would not be difficult. Code has not changed much. > How likely is this patch to cause regressions; how much testing does it need? There is a moderate risk here, limited to Web Audio and WebRTC. There are many tests, but the interaction of threads means that it is not possible to explicitly test all possible paths. Some paths occur sometimes, according to timing, and so it would be reassuring to have these tests running on m-c for a few days before uplift.
Attachment #8925420 -
Flags: sec-approval?
Comment 24•7 years ago
|
||
sec-approval+ for checkin on 11/28 or immediately after. Do not check this in before then as that is two weeks into the 58 cycle. This won't make the 57 release as it is too late for that.
Updated•7 years ago
|
Attachment #8925420 -
Flags: sec-approval? → sec-approval+
Comment 25•7 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/37ce6e1e2115 https://hg.mozilla.org/mozilla-central/rev/1b27196751a8
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox59:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Comment 26•7 years ago
|
||
Please request Beta/ESR52 approval on the patches where applicable.
Flags: needinfo?(karlt)
Whiteboard: [checkin on 11/28]
Updated•7 years ago
|
Group: media-core-security → core-security-release
Updated•7 years ago
|
Assignee | ||
Comment 27•7 years ago
|
||
Comment on attachment 8925420 [details] [diff] [review] part 4 v3: move to LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP at end of iteration Approval Request Comment Only this patch need land on beta 58. The patch attached to the bug should be used. The changeset on m-c required conflict resolution with 59 changes. [Feature/Bug causing the regression]: bug 1221587 [User impact if declined]: sec-high [Is this code covered by automated tests?]: There are many tests, but the interaction of threads means that it is not possible to explicitly test all possible paths. Some paths occur sometimes, according to timing. [Has the fix been verified in Nightly?]: No. We don't have a reproducing testcase. [Needs manual test from QE? If yes, steps to reproduce]: No. [List of other uplifts needed for the feature/fix]: None. [Is the change risky?]: There is a moderate risk here, limited to Web Audio and WebRTC. [Why is the change risky/not risky?]: Some paths occur sometimes, according to timing. [String changes made/needed]: None.
Flags: needinfo?(karlt)
Assignee | ||
Updated•7 years ago
|
Attachment #8925420 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 28•7 years ago
|
||
Comment on attachment 8926259 [details] [diff] [review] esr52 patch (part 4 only) [Approval Request Comment] Only this patch need land on ESR52. If this is not a sec:{high,crit} bug, please state case for ESR consideration: User impact if declined: sec-high Fix Landed on Version: 59 Risk to taking this patch (and alternatives if risky): Moderate String or UUID changes made by this patch: None.
Attachment #8926259 -
Flags: approval-mozilla-esr52?
Comment 29•7 years ago
|
||
Comment on attachment 8925420 [details] [diff] [review] part 4 v3: move to LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP at end of iteration Fix a sec-high. Beta58+ & ESR52+.
Attachment #8925420 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Updated•7 years ago
|
Attachment #8926259 -
Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
Comment 30•7 years ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/cda8fcf76af1
Comment 31•6 years ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-esr52/rev/084c427ccf99
Updated•6 years ago
|
Whiteboard: [adv-main58+][adv-esr52.6+]
Updated•6 years ago
|
Flags: qe-verify-
Whiteboard: [adv-main58+][adv-esr52.6+] → [adv-main58+][adv-esr52.6+][post-critsmash-triage]
Updated•6 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•