Closed
Bug 1382366
Opened 8 years ago
Closed 8 years ago
Crash in mozilla::SystemClockDriver::WaitForNextIteration | mozilla::MediaStreamGraphImpl::UpdateMainThreadState
Categories
(Core :: Audio/Video: MediaStreamGraph, defect, P2)
Tracking
()
People
(Reporter: jesup, Assigned: karlt)
References
Details
(Keywords: crash, csectype-uaf, sec-high, Whiteboard: [adv-main58+][adv-esr52.6+][post-critsmash-triage])
Crash Data
Attachments
(3 files)
4.45 KB,
patch
|
padenot
:
review+
pehrsons
:
feedback+
|
Details | Diff | Splinter Review |
8.99 KB,
patch
|
padenot
:
review+
pehrsons
:
feedback+
gchang
:
approval-mozilla-beta+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
8.65 KB,
patch
|
gchang
:
approval-mozilla-esr52+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-90598c3d-fa6e-4d42-8e1f-bbf6f0170719.
=============================================================
Goes back to at least 54. Clear UAF
Reporter | ||
Comment 1•8 years ago
|
||
Ugh, apparently setting "Security-Sensitive" and "Media security bug" means you have to be in both to see it (or be on the CC)
Comment 2•8 years ago
|
||
Appears to be go even back to 45 ESR https://crash-stats.mozilla.com/report/index/e85b96c6-721e-4fbd-9002-c20302170214
Updated•8 years ago
|
status-firefox54:
--- → wontfix
status-firefox55:
--- → affected
status-firefox56:
--- → affected
status-firefox-esr52:
--- → affected
Reporter | ||
Updated•8 years ago
|
Flags: needinfo?(padenot)
Updated•8 years ago
|
Rank: 15
Priority: -- → P1
Updated•8 years ago
|
Group: core-security
Comment 3•8 years ago
|
||
I think some of these crashes are a variation on 1384805, but not all of them.
Flags: needinfo?(padenot)
Reporter | ||
Comment 4•8 years ago
|
||
So in WaitForNextIteration, we assert the mGraphImpl owns the monitor, so mGraphImpl can't be null, and it seems unlikely it would be freed while the monitor is still held (perhaps we can add traps for that case, if it isn't already indirectly verified in the process of freeing a graph - and likely it is).
That would mean that the only possible source of a UAF here is that the SystemClockDriver itself was freed. (i.e. this->mGraphImpl == ...e5e5). This could be verified if in 56b3 mNeedAnotherIteration is at offset 0x68d - 0x5e5 (see https://crash-stats.mozilla.com/report/index/b6e17226-4148-4cfb-94fb-459260170818).
And... wow. There are 17(!) MediaStreamGrph threads in that report! And a lot of them are in WaitForNextIteration()
Thoughts Paul?
Flags: needinfo?(padenot)
Comment 5•8 years ago
|
||
Yes, see my analysis in bug 1384805. There are multiple different crashes here, but what you describe is fixed by my patch.
Flags: needinfo?(padenot)
Comment 6•8 years ago
|
||
Hrm I got the bug number wrong _twice_. I meant bug 1388243.
Reporter | ||
Comment 7•8 years ago
|
||
Paul - any idea why we'd have 17 MSG threads?? 17 active docs with webaudio??
Perhaps this is was caused by the apparent issue I saw with an MSG not shutting down?
Reporter | ||
Updated•8 years ago
|
Flags: needinfo?(padenot)
Reporter | ||
Comment 8•8 years ago
|
||
And a thread named as an MSG thread has to be a ClockDriver thread, not a callback
Comment 9•8 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #7)
> Paul - any idea why we'd have 17 MSG threads?? 17 active docs with
> webaudio??
>
> Perhaps this is was caused by the apparent issue I saw with an MSG not
> shutting down?
See comment 6, this explains why there is a lot of threads, and why they are leaking.
Flags: needinfo?(padenot)
Reporter | ||
Comment 10•8 years ago
|
||
Paul - I still see crashes in builds done after your patch for bug 1388243 landed in beta:
https://crash-stats.mozilla.com/signature/?build_id=%3E%3D20170825000000&signature=mozilla%3A%3ASystemClockDriver%3A%3AWaitForNextIteration%20%7C%20mozilla%3A%3AMediaStreamGraphImpl%3A%3AUpdateMainThreadState&date=%3E%3D2017-08-31T20%3A54%3A00.000Z&date=%3C2017-09-07T20%3A54%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_sort=-date&page=1#reports
Flags: needinfo?(padenot)
Comment 11•8 years ago
|
||
Yes, this signature is a catch all really, it's where most of the time is spent when running a SystemClockDriver. There are multiple issues there, my patch fixed only one of them, hence the "some" in comment 3.
Flags: needinfo?(padenot)
Comment 12•8 years ago
|
||
One of those is definitely bug 1360334. See [1], thread 59.
[1] https://crash-stats.mozilla.com/report/index/6005b02d-5708-4481-b501-753610170906
Reporter | ||
Comment 13•8 years ago
|
||
Mass change P1->P2 to align with new Mozilla triage process.
Priority: P1 → P2
Reporter | ||
Comment 15•8 years ago
|
||
Still causing UAF crashes in 57b8: https://crash-stats.mozilla.com/report/index/a73e1b70-6fc8-4936-88aa-97c800171017
We need someone to own investigating this bug
Comment 16•8 years ago
|
||
I'm not seeing any crashes with this signature in 58.0a1 though I did see them in 57.0a1 (and lots still in 57 betas). Is there something that might have fixed this in 58 that we could backport to 57 and maybe ESR 52.x? Not any of the bugs already mentioned here because those look like they were fixed in 56/57.
Assigning to :jib (triage owner) to find a real owner since this is a sec-high/crit bug and shouldn't be unassigned.
Assignee: nobody → jib
Flags: needinfo?(jib)
Reporter | ||
Comment 17•8 years ago
|
||
karl might have thoughts, or it might be related to some items he's working on
Assignee | ||
Comment 19•8 years ago
|
||
An hypothesis similar to "the large" from
https://bugzilla.mozilla.org/show_bug.cgi?id=1360334#c26:
1. AudioCallbackDriver::Shutdown() schedules AudioCallbackDriver::Stop() and
spins an event loop on the main thread waiting for Stop() to return.
2. cubeb_stream_stop() is called on the AsyncCubebTask thread.
3. wasapi_stream_stop() fails to join the render thread in 5 seconds and so
calls AudioCallbackDriver::StateCallback(CUBEB_STATE_ERROR).
4. mAudioStream is still set, and even if NextDriver() is set then it is not
scheduled. StateCallback() starts a SystemClockDriver.
5. The SystemClockDriver waits in WaitForNextIteration().
mNeedAnotherIteration is only cleared in the SystemClockDriver, and so this
wait will have a timeout.
6. AudioCallbackDriver::Shutdown() returns.
7. The graph releases the SystemClockDriver and destroys itself.
8. SystemClockDriver wakes, but no longer exists. UaF.
Assignee: jib → karlt
Status: NEW → ASSIGNED
Flags: needinfo?(karlt)
Assignee | ||
Comment 20•8 years ago
|
||
I suspect this is not yet fixed, but less frequent since the fix for bug 1360334.
I looked through changes since FIREFOX_BETA_57_BASE in MediaStreamGraph.cpp, GraphDriver.cpp, and media/libcubeb, but didn't see anything that I expect to have fixed this.
Bug 1388243 was also similar, and so that would have also reduced the frequency.
Assignee | ||
Comment 21•8 years ago
|
||
I considered borrowing mWaitState to fix this bug, but I think it is clearer to use a different and specific variable.
Attachment #8925423 -
Flags: review?(padenot)
Assignee | ||
Comment 22•8 years ago
|
||
This approach uses a single variable for the test in StateCallback() to cover
situations from comment 19 and
https://bugzilla.mozilla.org/show_bug.cgi?id=1408276#c15
as well as replacing the existing tests put in place for
https://bugzilla.mozilla.org/show_bug.cgi?id=1388243#c4 and
"the large" of https://bugzilla.mozilla.org/show_bug.cgi?id=1360334#c26
The change of approach results in removal of the recovery from failed
Dispatch() that appears to have been incidentally added with the fix for bug
1408276. Dispatch() should not fail until late in shutdown and the graph
should be shut down earlier in shutdown. If shutdown has already gone wrong,
there is not a whole lot to gain from keeping the graph running.
Failure to create a new thread is likely any time before shutdown, but there
is no recovery from this, before or after this patch. I'd prefer not to complicate this bug with that issue.
Attachment #8925424 -
Flags: review?(padenot)
Comment 23•8 years ago
|
||
Comment on attachment 8925423 [details] [diff] [review]
part 1: move declaration of mWaitState to SystemClockDriver
Review of attachment 8925423 [details] [diff] [review]:
-----------------------------------------------------------------
lgtm
Attachment #8925423 -
Flags: feedback+
Comment 24•8 years ago
|
||
Comment on attachment 8925424 [details] [diff] [review]
part 2: disable AudioCallback -> SystemClockDriver fallback before disowning graph
Review of attachment 8925424 [details] [diff] [review]:
-----------------------------------------------------------------
I traced through comment 19 and it looks very real. The fix looks good too.
What strikes me is that if GraphDriver operations were based on something like MozPromise (so resolve/reject handling at the callsite) we'd get rid of a whole lot of indirection and that would make reasoning about them simpler.
Thanks karl!
Attachment #8925424 -
Flags: feedback+
Updated•8 years ago
|
Attachment #8925423 -
Flags: review?(padenot) → review+
Updated•8 years ago
|
Attachment #8925424 -
Flags: review?(padenot) → review+
Assignee | ||
Comment 25•8 years ago
|
||
The most likely path suspected for this in comment 19 did not exist until 55
https://hg.mozilla.org/mozilla-central/rev/c98ff2135b46d70fc3ce3635703a019d42d24539#l6.198
but there are other perhaps less expected modes of error from
wasapi_stream_render_loop(). e.g.
https://hg.mozilla.org/releases/mozilla-esr52/file/e4fec62e5347/media/libcubeb/src/cubeb_wasapi.cpp#l918
We still have a crash report for version 52.4, which has the fix for bug 1360334
https://crash-stats.mozilla.com/report/index/52283d50-fe8f-42fa-ab85-28eca0171023
Assignee | ||
Updated•8 years ago
|
Flags: needinfo?(jib)
Assignee | ||
Comment 26•8 years ago
|
||
Assignee | ||
Comment 27•8 years ago
|
||
Comment on attachment 8925424 [details] [diff] [review]
part 2: disable AudioCallback -> SystemClockDriver fallback before disowning graph
[Security approval request comment]
> How easily could an exploit be constructed based on the patch?
Triggering the UaF requires either an error from system libraries or a 5
second timeout.
I guess an attacker could consume available RAM in order to make five second
timeouts more likely. I don't know any way that content could trigger this
kind of error.
> Do comments in the patch, the check-in comment, or tests included in the
> patch paint a bulls-eye on the security problem?
Comments in the patch describe how the code works and what is changed.
It is not too difficult to deduce the possibility of a UaF.
> Which older supported branches are affected by this flaw?
Part of the bug existed in 52.
> If not all supported branches, which bug introduced the flaw?
All supported.
> Do you have backports for the affected branches?
> If not, how different, hard to create, and risky will they be?
Backport for 52 is ready. Others would not be difficult.
Code has not changed much.
> How likely is this patch to cause regressions; how much testing does it need?
There is a moderate risk here, limited to Web Audio and WebRTC.
There are many tests, but the interaction of threads means that it is not
possible to explicitly test all possible paths. Some paths occur sometimes,
according to timing, and so it would be reassuring to have these tests running
on m-c for a few days before uplift.
Attachment #8925424 -
Flags: sec-approval?
Comment 28•8 years ago
|
||
sec-approval for checkin on 11/28 or immediately after, two weeks into the new cycle. Do not check this into trunk before then.
status-firefox57:
--- → wontfix
status-firefox58:
--- → affected
tracking-firefox58:
--- → +
tracking-firefox-esr52:
--- → ?
Whiteboard: [checkin on 11/28]
Updated•8 years ago
|
Attachment #8925424 -
Flags: sec-approval? → sec-approval+
Assignee | ||
Comment 29•8 years ago
|
||
https://crash-stats.mozilla.com/report/index/1b022b05-0f25-404f-8bdf-c472e0171117 indicates that 58 is still affected.
![]() |
||
Comment 30•8 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox59:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
![]() |
||
Comment 31•8 years ago
|
||
Comment 32•8 years ago
|
||
Please request Beta/ESR52 approval on the patches where applicable.
Flags: needinfo?(karlt)
Whiteboard: [checkin on 11/28]
Updated•8 years ago
|
Group: media-core-security → core-security-release
Updated•8 years ago
|
Assignee | ||
Comment 33•8 years ago
|
||
Comment on attachment 8925424 [details] [diff] [review]
part 2: disable AudioCallback -> SystemClockDriver fallback before disowning graph
Approval Request Comment
Only this patch need land on beta 58.
[Feature/Bug causing the regression]:
Don't know, but there are crashes on 52.4. See comment 25.
[User impact if declined]:
sec-high
[Is this code covered by automated tests?]:
There are many tests, but the interaction of threads means that it is not
possible to explicitly test all possible paths. Some paths occur sometimes,
according to timing.
[Has the fix been verified in Nightly?]:
No. We don't have a reproducing testcase, and crashes are not frequent enough
to infer anything from lack of crash reports.
[Needs manual test from QE? If yes, steps to reproduce]:
No.
[List of other uplifts needed for the feature/fix]:
Bug 1408276 part 4.
[Is the change risky?]:
There is a moderate risk here, limited to Web Audio and WebRTC.
[Why is the change risky/not risky?]:
Some paths occur sometimes, according to timing.
[String changes made/needed]:
None.
Flags: needinfo?(karlt)
Attachment #8925424 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 34•8 years ago
|
||
Comment on attachment 8926260 [details] [diff] [review]
esr52 patch (part 2 only)
[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined:
Fix Landed on Version:
Risk to taking this patch (and alternatives if risky):
String or UUID changes made by this patch:
See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
[Approval Request Comment]
Only this patch need land on ESR52.
Bug 1408276 esr52 patch needs to land before this.
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined:
sec-high
Fix Landed on Version:
59
Risk to taking this patch (and alternatives if risky):
Moderate
String or UUID changes made by this patch:
None.
Attachment #8926260 -
Flags: approval-mozilla-esr52?
Comment 35•8 years ago
|
||
Comment on attachment 8925424 [details] [diff] [review]
part 2: disable AudioCallback -> SystemClockDriver fallback before disowning graph
Fix a sec-high. Beta58+ & ESR52+.
Attachment #8925424 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Updated•8 years ago
|
Attachment #8926260 -
Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
Comment 36•8 years ago
|
||
uplift |
Comment 37•8 years ago
|
||
uplift |
Updated•8 years ago
|
Whiteboard: [adv-main58+][adv-esr52.6+]
Updated•8 years ago
|
Flags: qe-verify-
Whiteboard: [adv-main58+][adv-esr52.6+] → [adv-main58+][adv-esr52.6+][post-critsmash-triage]
Updated•7 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•