crash in nsCOMPtr<T>::nsCOMPtr<T>(nsIVariant*) | nsTArray_Impl<T>::AppendElement<T>(nsIDocument*&)

VERIFIED FIXED in Firefox 42

Status

()

defect
--
critical
VERIFIED FIXED
4 years ago
3 years ago

People

(Reporter: dmajor, Assigned: bzbarsky)

Tracking

(4 keywords)

42 Branch
mozilla43
x86
Windows NT
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox39 wontfix, firefox40 wontfix, firefox41+ wontfix, firefox42+ verified, firefox43+ verified, firefox-esr3842+ verified)

Details

(Whiteboard: [adv-main42+][adv-esr38.4+], crash signature)

Attachments

(4 attachments)

Reporter

Description

4 years ago
This bug was filed from the Socorro interface and is 
report bp-a2ad264a-47f7-41fc-8c09-e6db22150803.
=============================================================

The signature changes a lot thanks to COMDAT folding, but as far as I can tell this stack with the refresh driver is seen on all current trains.

A good number of these are at address 0x5a5a5a5e, and another chunk are at 0x4 -- either way it's an AddRef through a busted vtable. At first glance it looks like one of these FrameRequestCallbacks has been freed.

Frame 	Module 	Signature 	Source
0 	xul.dll 	nsCOMPtr<nsIVariant>::nsCOMPtr<nsIVariant>(nsIVariant*) 	xpcom/glue/nsCOMPtr.h
1 	xul.dll 	nsTArray_Impl<DocumentFrameCallbacks, nsTArrayInfallibleAllocator>::AppendElement<nsIDocument*&>(nsIDocument*&) 	xpcom/glue/nsTArray.h
2 	xul.dll 	TakeFrameRequestCallbacksFrom 	layout/base/nsRefreshDriver.cpp
3 	xul.dll 	nsRefreshDriver::RunFrameRequestCallbacks(__int64, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp
4 	xul.dll 	nsRefreshDriver::Tick(__int64, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp
5 	xul.dll 	mozilla::RefreshDriverTimer::TickDriver(nsRefreshDriver*, __int64, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp
6 	xul.dll 	mozilla::InactiveRefreshDriverTimer::TickOne() 	layout/base/nsRefreshDriver.cpp
7 	xul.dll 	mozilla::InactiveRefreshDriverTimer::TimerTickOne(nsITimer*, void*) 	layout/base/nsRefreshDriver.cpp
Reporter

Comment 1

4 years ago
Seth, is this an area that you'd be able to help with?
Flags: needinfo?(seth)
Is 39 really affected? bug 1145439 landed for FF 40.
Keywords: sec-critical
ok, a bit different stack, but probably the same issue.
(In reply to David Major [:dmajor] from comment #3)
> https://crash-stats.mozilla.com/search/?version=39.
> 0&signature=%3DnsCOMPtr%3CT%3E%3A%3AnsCOMPtr%3CT%3E%28nsIDOMWindow*%29+|+nsTA
> rray_Impl%3CT%3E%3A%3AAppendElement%3CT%3E%28nsIDocument*%26%29&_facets=signa
> ture&_facets=build_id&_facets=version&_facets=release_channel&_columns=date&_
> columns=signature&_columns=product&_columns=version&_columns=build_id&_column
> s=platform#crash-reports
> 
> Example crash: bp-fd8e3241-bce8-4510-aab6-f488c2150731

So this suggests bug 1145439 is not involved - it just changed the stack, but this was a preexisting issue.

I'm probably not the most familiar person with this aspect of the code. Checking blame, it looks to me like bz is probably the best person to talk to about memory management of FrameRequestCallbacks.

I'm happy to help if others can't, though. I just doubt I'm the person who can fix this the fastest.
Flags: needinfo?(seth)
Tracking this for 41+ since it's sec-critical.
bz, do you want to take this on, or can you suggest another owner?
Flags: needinfo?(bzbarsky)

Updated

4 years ago
Group: core-security → gfx-core-security
Hmm.  So is the 0x5a5a5a5e in this case the aDocument pointer we're passing to AppendElement?

I would sort of understand 0x5a5a5a5a -- that's jemalloc freed memory.

Anyway, so if we do in fact have a freed-document pointer here, then presumably the document didn't call RevokeFrameRequestCallbacks for some reason.

RevokeFrameRequestCallbacks is called by RevokeAnimationFrameNotifications and CancelFrameRequestCallback.

RevokeAnimationFrameNotifications is called in the following cases:

a) nsDocument::DeleteShell but only if IsEventHandlingEnabled().  IsEventHandlingEnabled() == !EventHandlingSuppressed() && mScriptGlobalObject.

b) nsDocument::SetScriptGlobalObject when mScriptGlobalObject && !aScriptGlobalObject && mShell && !EventHandlingSuppressed().

c) nsDocument::SuppressEventHandling if mEventsSuppressed == 0 && mAnimationsPaused == 0 && aIncrease != 0 && mPresShell && mScriptGlobalObject

Oh, and EventHandlingSuppressed() returns mEventsSuppressed.

OK, so I think I see a possible hole here.  Consider the following sequence of steps:

1)  We get a SuppressEventHandling call that's eAnimationsOnly.  mAnimationsPaused becomes nonzero.

2)  The page calls requestAnimationFrame().  In nsIDocument::ScheduleFrameRequestCallback we see that IsEventHandlingEnabled(), so add ourselves to the refresh driver.

3)  We get a SuppressEventHandling call that's eEvents.  Since mAnimationsPaused is nonzero we do NOT call RevokeAnimationFrameNotifications.

4)  The presshell is destroyed.  DeleteShell sees that EventHandlingSuppressed() is false, so doesn't call RevokeAnimationFrameNotifications.

5)  The document is torn down, either before event handling is unsuppressed or after.  SetScriptGlobalObject sees mPresShell is null, does not call RevokeAnimationFrameNotifications.

Net result is we end up with a dangling pointer.

Need to figure out when these eAnimationsOnly suppressions happen...
Flags: needinfo?(bzbarsky)
OK, so I haven't figured out a crashing testcase.  :(  But I do have a speculative fix, assuming the analysis in comment 8 is correct.
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
I guess that testcase asserts without the patch, but not with the patch.  So at least it's testing _something_.
Comment on attachment 8657325 [details] [diff] [review]
Make sure to not schedule requestAnimationFrame callbacks if animations are paused

Review of attachment 8657325 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsDocument.cpp
@@ +4637,5 @@
>      // We're detaching from the window.  We need to grab a pointer to
>      // our layout history state now.
>      mLayoutHistoryState = GetLayoutHistoryState();
>  
> +    if (mPresShell && !EventHandlingSuppressed() && !AnimationsPaused()) {

I don't understand how this change and the above change are relevant. Are you really fixing a separate bug here?

@@ +10327,5 @@
>    DebugOnly<FrameRequest*> request =
>      mFrameRequestCallbacks.AppendElement(FrameRequest(aCallback, newHandle));
>    NS_ASSERTION(request, "This is supposed to be infallible!");
> +  if (!alreadyRegistered && mPresShell && IsEventHandlingEnabled() &&
> +      !AnimationsPaused()) {

This is what really fixes the bug, right?
Attachment #8657325 - Flags: review?(roc)
> I don't understand how this change and the above change are relevant.

They're just keeping the existing behavior invariant of "don't call this function if it won't do anything anyway".  I could take these changes out, and it would just be slightly inefficient and a bit confusing as to why we're checking these other conditions but not this one.

> This is what really fixes the bug, right?

Yes, assuming I diagnosed it correctly.
Flags: needinfo?(roc)
Boris, is this something we want to uplift to Beta41? We gtb 41 RC on Monday so if safe this could be uplifted before that.
Flags: needinfo?(bzbarsky)
> Boris, is this something we want to uplift to Beta41?

Possibly, yes.

This needs all the sec-approval dance, though.  I didn't realize this had gotten reviewed.  :(
Flags: needinfo?(bzbarsky)
Comment on attachment 8657325 [details] [diff] [review]
Make sure to not schedule requestAnimationFrame callbacks if animations are paused

[Security approval request comment]
How easily could an exploit be constructed based on the patch? Not very easily (as in, I didn't manage to do it).

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?  Not exactly, though it's clear from the code that it's something about rAF and animation pausing.

Which older supported branches are affected by this flaw? All of them, I think; this date back to Firefox 30.

If not all supported branches, which bug introduced the flaw? Bug 956704.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?  I expect this patch to pretty much apply on all the branches.

How likely is this patch to cause regressions; how much testing does it need?  I think this is not very likely at all to cause regressions.  What we don't know is whether it fixes the bug...
Flags: needinfo?(roc)
Attachment #8657325 - Flags: sec-approval?
Comment on attachment 8657325 [details] [diff] [review]
Make sure to not schedule requestAnimationFrame callbacks if animations are paused

sec-approval=dveditz. Let's not rush this into the Firefox 41 release-candidate, OK to land on nightly for now and if it fixes the problem we can uplift to 42 (aurora or beta depending on when) and ESR 38.4  (but request separate branch approvals for those when you're ready)
Attachment #8657325 - Flags: sec-approval? → sec-approval+
https://hg.mozilla.org/mozilla-central/rev/ed8a6af1508a
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Too late to fix in 41.
Group: gfx-core-security → core-security-release
Daniel, Boris, do you think we should take this patch for 42 now? Thanks
Flags: needinfo?(dveditz)
Flags: needinfo?(bzbarsky)
Yes, please do -- the more testing the better.
Flags: needinfo?(dveditz)
ok, thanks Daniel

Bz, could you fill the uplift request if you agree? Thanks
Comment on attachment 8657325 [details] [diff] [review]
Make sure to not schedule requestAnimationFrame callbacks if animations are paused

Approval Request Comment
[Feature/regressing bug #]: Bug 956704
[User impact if declined]: Crashes in some cases
[Describe test coverage new/current, TreeHerder]: Nothing for this issue
   specifically; probably some general testing of this functionality.
[Risks and why]: I think this is pretty low risk.
[String/UUID change made/needed]: None.
Flags: needinfo?(bzbarsky)
Attachment #8657325 - Flags: approval-mozilla-beta?
Comment on attachment 8657325 [details] [diff] [review]
Make sure to not schedule requestAnimationFrame callbacks if animations are paused

thanks, should be in 42 beta 3.
Attachment #8657325 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: qe-verify+
Comment on attachment 8657325 [details] [diff] [review]
Make sure to not schedule requestAnimationFrame callbacks if animations are paused

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: Crashes, possibly exploitable
Fix Landed on Version: 43 and backported to 42
Risk to taking this patch (and alternatives if risky): I think this is fairly low
   risk.
String or UUID changes made by this patch: None.

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Attachment #8657325 - Flags: approval-mozilla-esr38?
Comment on attachment 8657325 [details] [diff] [review]
Make sure to not schedule requestAnimationFrame callbacks if animations are paused

Approved as it's a sec-crit and fixed in 42, 43 already.
Attachment #8657325 - Flags: approval-mozilla-esr38? → approval-mozilla-esr38+
Not able to reproduce this crash with any of the given testcases; tried with several builds (Nightly from 2015-08-06, 40 beta, 41.0.2 build 2), under Windows 10 32-bit, Windows 7 64-bit and 32-bit. Is there any other way to reproduce the initial issue in order to confirm the fix with latest builds??
Flags: needinfo?(bzbarsky)
> Is there any other way to reproduce the initial issue

We don't know a reliable way to reproduce this.  See comment 9 and comment 14.
Flags: needinfo?(bzbarsky)
Via Socorro → no new crash reports present for both signatures in the last month:
- [@ nsCOMPtr<T>::nsCOMPtr<T>(nsIVariant*) | nsTArray_Impl<T>::AppendElement<T>(nsIDocument*&) ]
https://crash-stats.mozilla.com/report/list?range_unit=days&range_value=28&signature=nsCOMPtr%3CT%3E%3A%3AnsCOMPtr%3CT%3E%28nsIVariant%2A%29+%7C+nsTArray_Impl%3CT%3E%3A%3AAppendElement%3CT%3E%28nsIDocument%2A%26%29#tab-reports

- [@ nsCOMPtr<T>::nsCOMPtr<T>(nsIDOMWindow*) | nsTArray_Impl<T>::AppendElement<T>(nsIDocument*&) ]
https://crash-stats.mozilla.com/report/list?range_unit=days&range_value=28&signature=nsCOMPtr%3CT%3E%3A%3AnsCOMPtr%3CT%3E%28nsIDOMWindow%2A%29+%7C+nsTArray_Impl%3CT%3E%3A%3AAppendElement%3CT%3E%28nsIDocument%2A%26%29#tab-reports

Although we didn't manage to reproduce the crash, based on the above results, I'll call this verified.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Whiteboard: [adv-main42+][adv-esr38.4+]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.