Closed
Bug 1335450
Opened 8 years ago
Closed 8 years ago
Crash in RefPtr<T>::RefPtr<T> | nsTArray_Impl<T>::AppendElement<T> | TakeFrameRequestCallbacksFrom
Categories
(Core :: Web Painting, defect)
Tracking
()
People
(Reporter: jesup, Assigned: bzbarsky)
References
Details
(4 keywords, Whiteboard: [post-critsmash-triage][adv-main52+][adv-esr45.8+])
Crash Data
Attachments
(1 file)
1.55 KB,
patch
|
mrbkap
:
review+
abillings
:
approval-mozilla-aurora+
abillings
:
approval-mozilla-beta+
gchang
:
approval-mozilla-release-
gchang
:
approval-mozilla-esr45+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-1dcb9d26-8bbf-4c76-a141-91fcd2170131.
=============================================================
UAF in code called from TakeFrameRequestCallbacksFrom() (which was added or last changed in Bug 1145439). Called from the RefreshDriver tick.
High frequency crash; 300 in the last week.
Almost all crashes are in 50 or newer. Spiked in mid-December to current levels. There are some older crashes, but at very low frequencies:
https://crash-stats.mozilla.com/signature/?product=Firefox&address=~e5e5&signature=RefPtr%3CT%3E%3A%3ARefPtr%3CT%3E%20%7C%20nsTArray_Impl%3CT%3E%3A%3AAppendElement%3CT%3E%20%7C%20TakeFrameRequestCallbacksFrom&date=%3E%3D2016-10-31T16%3A11%3A53.000Z&date=%3C2017-01-31T16%3A11%3A53.000Z#graphs
Updated•8 years ago
|
Group: core-security → layout-core-security
Updated•8 years ago
|
Component: Layout → Layout: Web Painting
Comment 1•8 years ago
|
||
We're looping through mThrottledFrameRequestCallbackDocs, which is an array of raw nsIDocument pointers and it looks like we crash trying to access the document pointer. It seems like there is nothing preventing documents in mThrottledFrameRequestCallbackDocs from dying. Or am I missing something obvious that guarantees they are live?
Comment 2•8 years ago
|
||
Looking at the initial changeset that added the raw nsIDocument* pointer array (called mAnimationFrameListenerDocs at the time): https://hg.mozilla.org/mozilla-central/rev/c05c386324c0 it looks like the idea was that documents call RevokeAnimationFrameListeners before they go away, from nsDocument::DeleteShell(). Maybe that's not happening the way it should.
Comment 3•8 years ago
|
||
Tracking for 51+ since this is rated sec-critical.
Markus, do you think there is anything actionable here? Are you still investigating?
status-firefox51:
--- → affected
status-firefox52:
--- → affected
status-firefox53:
--- → affected
status-firefox54:
--- → ?
tracking-firefox51:
--- → +
tracking-firefox52:
--- → +
tracking-firefox53:
--- → +
Flags: needinfo?(mstange)
Comment 4•8 years ago
|
||
I'm not investigating at the moment. Timothy, do you have thoughts on comment 2?
Flags: needinfo?(mstange) → needinfo?(tnikkel)
Comment 5•8 years ago
|
||
Comment 2 is right, comment 1 missed that.
The places in nsDocument where frame callbacks are revoked are in several different places and have different conditions. Someone more familiar with nsDocument could take a look to see if they see any obvious mistakes there.
Flags: needinfo?(tnikkel)
![]() |
Assignee | |
Comment 7•8 years ago
|
||
Comment 2 is right. Documents with no presshell should not be in this list. Documents that had a presshell should not be getting destroyed without the presshell being destroyed; for one thing the presshell holds a strong ref to the document.
In general, the idea is that a document is registered with the refresh driver only when all the following are true:
1) mPresShell != nullptr
2) IsEventHandlingEnabled()
3) !AnimationsPaused()
4) !mFrameRequestCallbacks.IsEmpty()
Let's look through places that might affect those conditions:
* mPresShell being non-null. This can only change in
nsDocument::DeleteShell and nsDocument::CreateShell.
* DeleteShell calls RevokeAnimationFrameNotifications() if
IsEventHandlingEnabled() && !AnimationsPaused() (and recall that our
invariant is that if either condition were false then the document is
not registered with the refresh driver to start with).
RevokeAnimationFrameNotifications only unregisters if !mFrameRequestCallbacks.IsEmpty().
* CreateShell calls MaybeRescheduleAnimationFrameNotifications() which checks mPresShell and
!mFrameRequestCallbacks.IsEmpty() and IsEventHandlingEnabled() but NOT !AnimationsPaused().
* IsEventHandlingEnabled being true. This can change due to mEventsSuppressed changing or
mScriptGlobalObject changing.
* mEventsSuppressed can become nonzero in nsDocument::SuppressEventHandling, which
calls RevokeAnimationFrameNotifications if we're transitioning to either
!IsEventHandlingEnabled() or AnimationsPaused().
* mEventsSuppressed can become zero in DecreaseEventSuppression, which calls
MaybeRescheduleAnimationFrameNotifications().
* mScriptGlobalObject can change in nsDocument::SetScriptGlobalObject which calls
MaybeRescheduleAnimationFrameNotifications() when going non-null and calls
RevokeAnimationFrameNotifications if
mPresShell && !EventHandlingSuppressed() && !AnimationsPaused()
* AnimationsPaused() being false. This can change in SuppressEventHandling() or
ResumeAnimations(). We already looked at the former; the latter calls
MaybeRescheduleAnimationFrameNotifications().
* mFrameRequestCallbacks.IsEmpty() being false. This can change in
ScheduleFrameRequestCallback(), which calls ScheduleFrameRequestCallbacks
if mPresShell && IsEventHandlingEnabled() && !AnimationsPaused() and
transitioning to a nonzero count. It can also change in
nsIDocument::CancelFrameRequestCallback( which will remove if
mPresShell && IsEventHandlingEnabled() (but does NOT check AnimationsPaused()).
So at first glance there are two bugs:
1) nsIDocument::CancelFrameRequestCallback should not call RevokeFrameRequestCallbacks if AnimationsPaused(), because there is no need: they're already revoked.
2) MaybeRescheduleAnimationFrameNotifications should check AnimationsPaused().
But #2 is presumably what's causing problems here. If we pause animations and _then_ create a presshell, then we will incorrectly register ourselves with the refresh driver. If we then proceed to delete the presshell and kill the document, it won't get unregistered, and we can get these crashes.
I knew this all sounded familiar; see bug 1191942...
Flags: needinfo?(bzbarsky)
![]() |
Assignee | |
Comment 8•8 years ago
|
||
So the question is: when is AnimationsPaused() true? It used to be the case that we'd do:
mSuspendedDoc->SuppressEventHandling(nsIDocument::eAnimationsOnly);
in nsGlobalWindow::EnterModalState. See bug 956704. But bug 1316330 recently changed this to do:
topDoc->SuppressEventHandling(nsIDocument::eEvents);
Doesn't this effectively regress bug 956704?
Anyway, bug 1316330 is on 53, as of 4 days ago. With that landed, AnimationsPaused() is never true, and all that animation-pausing code is dead code and can go away.
So here's a question: did landing bug 1316330 on Aurora make this crash disappear in 53?
Flags: needinfo?(rjesup)
Flags: needinfo?(mrbkap)
Flags: needinfo?(jjong)
Flags: needinfo?(bugs)
![]() |
Assignee | |
Comment 9•8 years ago
|
||
I believe this should fix things; we probably want to backport this to various branches
Attachment #8837196 -
Flags: review?(mrbkap)
![]() |
Assignee | |
Updated•8 years ago
|
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
![]() |
Assignee | |
Comment 10•8 years ago
|
||
Comment on attachment 8837196 [details] [diff] [review]
Fix our condition for rescheduling frame request callbacks
[Security approval request comment]
How easily could an exploit be constructed based on the patch? Not that easily. I tried to do this in the past, in bug 1191942, and couldn't even get a reliable crash. It's probably doable, but you have to get the timing and GC behavior and so forth _just_ right somehow.
Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? It's pretty clear from the patch that this has to do with frame request callbacks and the AnimationsPaused() condition. Code searching will quickly lead one to see that the latter is only true in modal state. And looking at ScheduleFrameRequestCallbacks will probably quickly show that a weak pointer is held to the document. But apart from that, nothing in the patch really points out the issues.
Which older supported branches are affected by this flaw? All of them.
If not all supported branches, which bug introduced the flaw? Bug 956704.
Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? I expect this patch to cleanly apply everywhere; it certainly applies cleanly to ESR45.
How likely is this patch to cause regressions; how much testing does it need? I _think_ regressions are fairly unlikely, but this code is finicky. More testing is likely better.
Attachment #8837196 -
Flags: sec-approval?
![]() |
Assignee | |
Comment 11•8 years ago
|
||
Comment on attachment 8837196 [details] [diff] [review]
Fix our condition for rescheduling frame request callbacks
[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: I expect this is sec:crit.
User impact if declined: Crashes
Fix Landed on Version: Will be 54.
Risk to taking this patch (and alternatives if risky): Moderate, but I don't
think there are any safer alternatives here, apart from having
likely-exploitable crashes.
String or UUID changes made by this patch: None.
See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Approval Request Comment
[Feature/Bug causing the regression]: Bug 956704
[User impact if declined]: Crashes
[Is this code covered by automated tests?]: Not very well, no.
[Has the fix been verified in Nightly?]: No.
[Needs manual test from QE? If yes, steps to reproduce]: It's unclear what the
steps to reproduce are....
[List of other uplifts needed for the feature/fix]: No.
[Is the change risky?]: Moderately.
[Why is the change risky/not risky?]: It's touching pretty finicky code. I
_think_ my analysis in this bug, and my boolean tests, are both correct,
but I wouldn't bet my life on it.
[String changes made/needed]: None.
Attachment #8837196 -
Flags: approval-mozilla-release?
Attachment #8837196 -
Flags: approval-mozilla-esr52?
Attachment #8837196 -
Flags: approval-mozilla-esr45?
Attachment #8837196 -
Flags: approval-mozilla-beta?
Attachment #8837196 -
Flags: approval-mozilla-aurora?
Updated•8 years ago
|
status-firefox-esr45:
--- → affected
status-firefox-esr52:
--- → affected
tracking-firefox54:
--- → ?
tracking-firefox-esr45:
--- → ?
tracking-firefox-esr52:
--- → ?
Comment 12•8 years ago
|
||
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #8)
> So here's a question: did landing bug 1316330 on Aurora make this crash
> disappear in 53?
There weren't many crashes on aurora 53, but it looks like the last one was from a build from Feb. 11:
https://crash-stats.mozilla.com/signature/?version=53.0a2&signature=RefPtr%3CT%3E%3A%3ARefPtr%3CT%3E%20%7C%20nsTArray_Impl%3CT%3E%3A%3AAppendElement%3CT%3E%20%7C%20TakeFrameRequestCallbacksFrom&date=%3E%3D2017-01-31T20%3A47%3A27.000Z&date=%3C2017-02-14T20%3A47%3A27.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=-date&page=1#reports
The patch from bug 1316330 landed on Aurora on Feb. 10th, which means it should be in the Feb. 11th build. So maybe that patch didn't fix the issue completely.
![]() |
Assignee | |
Comment 13•8 years ago
|
||
That suggests the patch I attached won't completely fix it either...
Reporter | ||
Comment 14•8 years ago
|
||
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #8)
> Anyway, bug 1316330 is on 53, as of 4 days ago. With that landed,
> AnimationsPaused() is never true, and all that animation-pausing code is
> dead code and can go away.
>
> So here's a question: did landing bug 1316330 on Aurora make this crash
> disappear in 53?
https://crash-stats.mozilla.com/report/index/ea83ab5a-d305-4abf-91e2-1ab192170212 says no, it didn't go away. (BuildID 20170211xxxx, 3 days ago; 2.5 days before your comment). You may want to verify if that fix is in that build or not.
(I think this is also what liz was saying)
Flags: needinfo?(rjesup)
Comment 16•8 years ago
|
||
Comment on attachment 8837196 [details] [diff] [review]
Fix our condition for rescheduling frame request callbacks
Review of attachment 8837196 [details] [diff] [review]:
-----------------------------------------------------------------
Given comment 8, I don't understand how this patch fixes this bug.
Attachment #8837196 -
Flags: review?(mrbkap)
Updated•8 years ago
|
Flags: needinfo?(mrbkap)
Comment 17•8 years ago
|
||
Should this go in even if this doesn't completely fix the bug? The last few comments leave me confused.
Flags: needinfo?(bzbarsky)
![]() |
Assignee | |
Comment 18•8 years ago
|
||
So here's my take on this:
1) The logic in nsDocument is clearly buggy per comment 7.
2) On m-c, that bugginess may be irrelevant because AnimationsPaused() is always false, afaict.
3) The thing that makes AnimationsPaused() always be false is not on various
branches (52 and earlier), so we absolutely need to fix the logic on those
branches to have any hope of not hitting this crash there.
4) Fixing the logic may not be enough to fix all the crashes,
given comment 12 and comment 14.
I'm happy to spin off a separate bug for "fix the logic and backport everywhere" if people would prefer that.
Flags: needinfo?(bzbarsky)
![]() |
Assignee | |
Comment 19•8 years ago
|
||
Comment on attachment 8837196 [details] [diff] [review]
Fix our condition for rescheduling frame request callbacks
See comment 18.
Attachment #8837196 -
Flags: review?(mrbkap)
Updated•8 years ago
|
Attachment #8837196 -
Flags: approval-mozilla-esr52?
Updated•8 years ago
|
Comment 20•8 years ago
|
||
Comment on attachment 8837196 [details] [diff] [review]
Fix our condition for rescheduling frame request callbacks
sec-approval+.
I'll give aurora, and beta approval as well.
Attachment #8837196 -
Flags: approval-mozilla-beta?
Attachment #8837196 -
Flags: approval-mozilla-beta+
Attachment #8837196 -
Flags: approval-mozilla-aurora?
Attachment #8837196 -
Flags: approval-mozilla-aurora+
Updated•8 years ago
|
Comment 21•8 years ago
|
||
Comment on attachment 8837196 [details] [diff] [review]
Fix our condition for rescheduling frame request callbacks
Review of attachment 8837196 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks. We should definitely file a followup at least for trunk to figure out if we can remove AnimationsPaused.
Attachment #8837196 -
Flags: review?(mrbkap) → review+
Comment 22•8 years ago
|
||
Updated•8 years ago
|
Attachment #8837196 -
Flags: sec-approval? → sec-approval+
Comment 23•8 years ago
|
||
uplift |
Comment 24•8 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Comment 25•8 years ago
|
||
uplift |
![]() |
Assignee | |
Comment 26•8 years ago
|
||
> We should definitely file a followup at least for trunk to figure out if we can remove AnimationsPaused.
Jessica filed bug 1340086.
Comment 27•8 years ago
|
||
Comment on attachment 8837196 [details] [diff] [review]
Fix our condition for rescheduling frame request callbacks
Fix a sec-critical. ESR45+.
Attachment #8837196 -
Flags: approval-mozilla-esr45? → approval-mozilla-esr45+
Comment 28•8 years ago
|
||
uplift |
Updated•8 years ago
|
Group: layout-core-security → core-security-release
Comment 29•8 years ago
|
||
Clearing ni since this has been resolved and bug 1340086 has landed too.
Flags: needinfo?(jjong)
Comment 30•8 years ago
|
||
Comment on attachment 8837196 [details] [diff] [review]
Fix our condition for rescheduling frame request callbacks
We don't have then plan to have dot release for 51. Rel51-.
Attachment #8837196 -
Flags: approval-mozilla-release? → approval-mozilla-release-
Updated•8 years ago
|
Flags: qe-verify-
Whiteboard: [post-critsmash-triage]
Updated•8 years ago
|
tracking-firefox-esr52:
? → ---
Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main52+][adv-esr45.8+]
![]() |
Assignee | |
Comment 31•8 years ago
|
||
Bug 1342823 is tracking the fact that this is still happening on trunk. I have no idea why so far. :(
Updated•7 years ago
|
Flags: needinfo?(bugs)
Updated•7 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•