Closed Bug 1430813 Opened 6 years ago Closed 6 years ago

Crash in nsStyleContext::PresContext (under PuppetWidget::DispatchEvent)

Categories

(Core :: DOM: Events, defect)

Unspecified
Windows 10
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla61
Tracking Status
firefox-esr52 --- unaffected
firefox57 --- wontfix
firefox58 --- wontfix
firefox59 + wontfix
firefox60 + fixed
firefox61 + fixed

People

(Reporter: marcia, Assigned: kats)

References

Details

(Keywords: crash, regression, topcrash)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is
report bp-02057a20-389a-469a-a8d7-552540180116.
=============================================================

Seen while looking at Windows nightly crash data: http://bit.ly/2rbqzsN. This crash is present on both 57 and 58, but is more visible on nightly. All crashes have crash reason: EXCEPTION_ACCESS_VIOLATION_READ

Some URLs:

*https://www.callofwar.com/play.php?L=4&bust=1&&uid=8580949&gameID=2211437&source=browser-desktop 
*https://www.amazon.co.jp/Samsung-USB3-1-TYPE-C-540MB-MU-PA500B/dp/B073GZBT36/ 
*http://allrecipes.com/recipe/54165/balsamic-bruschetta/ 
*http://www.marksandspencer.com/pure-cotton-floral-print-peplum-shell-top/p/p60130034?image=SD_01_T41_4552_F4_X_EC_90&color=NAVYMIX&prevPage=plp  

Top 10 frames of crashing thread:

0 xul.dll nsStyleContext::PresContext layout/style/nsStyleContextInlines.h:153
1 xul.dll WeakFrame::Init layout/generic/nsFrame.cpp:501
2 xul.dll mozilla::EventStateManager::PreHandleEvent dom/events/EventStateManager.cpp:535
3 xul.dll mozilla::PresShell::HandleEventInternal layout/base/PresShell.cpp:7752
4 xul.dll mozilla::PresShell::HandleEvent layout/base/PresShell.cpp:7401
5 xul.dll nsViewManager::DispatchEvent view/nsViewManager.cpp:812
6 xul.dll nsView::HandleEvent view/nsView.cpp:1139
7 xul.dll mozilla::widget::PuppetWidget::DispatchEvent widget/PuppetWidget.cpp:409
8 xul.dll mozilla::layers::APZCCallbackHelper::DispatchWidgetEvent gfx/layers/apz/util/APZCCallbackHelper.cpp:499
9 xul.dll mozilla::dom::TabChild::RecvRealTouchEvent dom/ipc/TabChild.cpp:1904

=============================================================
Summary: Crash in nsStyleContext::PresContext → Crash in nsStyleContext::PresContext (under PuppetWidget::DispatchEvent)
Small spike on nightly builds from the 21st and 22nd - over 300 crashes between the two builds.
Emilio, does this stack show you anything insightful?
Flags: needinfo?(emilio)
Note that we're spinning the event loop here with a sync XHR.
That's in the particular crash mentioned in comment 0, for which I don't have any insight unfortunately.

Most of the new crashes come from DisplayItemData::Destroy. In that case, that one looks a lot like a crash that was fixed by backout yesterday...
Flags: needinfo?(emilio)
Yeah, the nightly spike definitely looks fixed by backout. I'm just wondering if there's anything useful we get do for Fx59 at this point.
this content crash signature is spiking up now after 59 has gone into release.
Currently #8 content process topcrash.
Keywords: topcrash
On Release (59.0.1) that is. Not on Beta.
We're still seeing the crash on beta 60 so it may be worth trying to fix for 60. The crash volume wasn't high for 59 beta but it got pretty bad on release.
Looks like this is gone in 61 (or the signature changed?), but beta (60) is still crashy.
the signature appears to have changed to [@ WeakFrame::Init] in 61.0a1 build 20180321040527 and later.
Crash Signature: [@ nsStyleContext::PresContext] → [@ nsStyleContext::PresContext] [@ WeakFrame::Init ]
I don't know if this will be of help;
why is Firefox crashing when I open emails in Inbox by Google?
https://support.mozilla.org/en-US/questions/1212546

User comment;
https://support.mozilla.org/en-US/questions/1212546#answer-1101712
Update: I figured out how to reproduce the issue - simply open any email and immediately scroll down. Then the browser crashes. Clicking to open an email, pausing 2 or 3 seconds, and *then* scrolling down seems to be OK.
Moving to the APZ component based on the stack (this comes from APZCallbackHelper). Though maybe it's more DOM: Events...
Component: CSS Parsing and Computation → Panning and Zooming
:kats this top crash on windows seems to be APZ related, can you take a look?
Flags: needinfo?(bugmail)
All input events dispatched to the content process go through APZCCallbackHelper, that doesn't look like the problem here. As far as I can tell it looks like EventStateManager has a pre-existing mCurrentTarget (which is a WeakFrame) and trying to get the PresContext from that frame crashes. So the question is why would trying to get the pres context crash, that doesn't even make sense, it should just return null. Considering it's wrapped in a WeakFrame the frame shouldn't be getting destroyed without the WeakFrame's internal pointer getting cleared.

Emilio/bholley, do you understand the actual mechanics of the crash? The leaf frame in the stack is MOZ_STYLO_FORWARD [1] and I don't know what does or why it might crash.

[1] https://hg.mozilla.org/releases/mozilla-release/annotate/239e434d6d2b/layout/style/nsStyleContextInlines.h#l153
Component: Panning and Zooming → DOM: Events
Flags: needinfo?(bugmail)
Well, it's only crashing on style because nsIFrame::PresContext goes through ComputedStyle::PresContext. The nsIFrame pointer being garbage could explain such a crash.
So the frame pointer being garbage means that WeakFrame isn't working as intended or something is violating assumptions made by it. I looked around the code and I see that the presshell clears out the WeakFrame instances when it is destroyed at [1] but the EventStateManager has this fishy-looking code at [2]. I would expect this to somehow clear out mCurrentTarget but it doesn't. Could it be this leaves a dangling WeakFrame when the underlying nsIFrame is destroyed? Mats, you seem to have added WeakFrame relatively recently, do you have any thoughts here?

[2] https://searchfox.org/mozilla-central/rev/a30a60eadcff99d2a043241955d215dbeccad876/dom/events/EventStateManager.cpp#3851-3853
Flags: needinfo?(mats)
I don't see anything fishy about [2] from safety point of view.
The mCurrentTarget member is a WeakFrame so it will be automatically
nulled out when the underlying frame is destroyed, so clearing it
here would be redundant.  I'm guessing that what [2] is really
doing is make sure mCurrentTargetContent is up-to-date so that if
a new frame is created for that content the event processing could
continue using that frame instead, or something like that.

> So the frame pointer being garbage means that WeakFrame isn't
> working as intended or something is violating assumptions made by it.

I think you misunderstood what emilio said.  The crash stack at
bp-02057a20-389a-469a-a8d7-552540180116 indicates that aFrame given
to PreHandleEvent is already destroyed.  One level up aFrame came
from PresShell::mCurrentEventFrame which is a raw pointer.  But it's
automatically nulled out in PresShell::NotifyDestroyingFrame
whenever that frame is destroyed, except that we skip that exercise
if the whole shell is being destroyed (mIgnoreFrameDestruction).
It can't be deleted though, since one level up again, there is
a kungFuDeathGrip:
https://hg.mozilla.org/mozilla-central/annotate/b7d66e4e60ef/layout/base/PresShell.cpp#l7392
That doesn't prevent PresShell::Destroy() from destroying 
the frame tree though, but it clears mCurrentEventFrame too:
https://hg.mozilla.org/mozilla-central/annotate/b7d66e4e60ef/layout/base/PresShell.cpp#l1312

So the issue must be earlier than that.  The use of 'frame' here
looks suspicious to me:
https://hg.mozilla.org/mozilla-central/annotate/b7d66e4e60ef/layout/base/PresShell.cpp#l7400
I'd guess it's already destroyed there.

This code looks quite broken for example:
https://hg.mozilla.org/mozilla-central/annotate/b7d66e4e60ef/layout/base/PresShell.cpp#l7357
If 'frame' != 'targetFrame', or if the "class" isn't eMouseEventClass,
then it's possible that DispatchPointerFromMouseOrTouch might have
destroyed 'frame' without updating its value.
Flags: needinfo?(mats)
(In reply to Mats Palmgren (:mats) from comment #19)
> I don't see anything fishy about [2] from safety point of view.
> The mCurrentTarget member is a WeakFrame so it will be automatically
> nulled out when the underlying frame is destroyed, so clearing it
> here would be redundant.

True, I forgot that the WeakFrame is registered with the presshell no matter where the WeakFrame lives.

> > So the frame pointer being garbage means that WeakFrame isn't
> > working as intended or something is violating assumptions made by it.
> 
> I think you misunderstood what emilio said.  The crash stack at
> bp-02057a20-389a-469a-a8d7-552540180116 indicates that aFrame given
> to PreHandleEvent is already destroyed.

Ah yes, you're right, I misread the code. For some reason I thought it was the old mFrame in the WeakFrame that was being accessed rather than the new aFrame.

> So the issue must be earlier than that.  The use of 'frame' here
> looks suspicious to me:
> https://hg.mozilla.org/mozilla-central/annotate/b7d66e4e60ef/layout/base/
> PresShell.cpp#l7400
> I'd guess it's already destroyed there.
> 
> This code looks quite broken for example:
> https://hg.mozilla.org/mozilla-central/annotate/b7d66e4e60ef/layout/base/
> PresShell.cpp#l7357
> If 'frame' != 'targetFrame', or if the "class" isn't eMouseEventClass,
> then it's possible that DispatchPointerFromMouseOrTouch might have
> destroyed 'frame' without updating its value.

Indeed, nice catch! If the event is a touch event (which it is in this case, based on the stack), then `targetFrame` gets set to `aFrame`, which might not be the same as `frame`. DispatchPointerFromMouseOrTouch could then destroy `frame` while leaving `targetFrame` intact, which would bypass the block at [1] and leave `frame` pointing to garbage. Using it later would then crash.

It seems like similar blocks of code earlier in this function use an AutoWeakFrame to hold on to `frame` when doing operations similar to DispatchPointerFromMouseOrTouch, and then check for liveness afterwards and bail out if necessary. Seems like we should probably do something similar here.

[1] https://searchfox.org/mozilla-central/rev/a30a60eadcff99d2a043241955d215dbeccad876/layout/base/PresShell.cpp#7215
PresShell::HandleEvent is a horrible mess.
Here we might reassign 'frame' to the root frame of some shell ancestor:
https://searchfox.org/mozilla-central/rev/a30a60eadcff99d2a043241955d215dbeccad876/layout/base/PresShell.cpp#6911
but the rest of HandleEvent quite happily continues to use 'this'
to access members etc.

The IME block at the end for example makes the assumption that
'frame' belongs to 'this':
https://searchfox.org/mozilla-central/rev/a30a60eadcff99d2a043241955d215dbeccad876/layout/base/PresShell.cpp#7268
It even assigns mCurrentEventFrame = frame which is clearly
a recipe for disaster (exploitable UAFs).
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #20)
> It seems like similar blocks of code earlier in this function use an
> AutoWeakFrame to hold on to `frame` when doing operations similar to
> DispatchPointerFromMouseOrTouch, and then check for liveness afterwards and
> bail out if necessary. Seems like we should probably do something similar
> here.

I'll write a patch to do this.

(In reply to Mats Palmgren (:mats) from comment #21)
> PresShell::HandleEvent is a horrible mess.

Yeah, cleaning this up is something that should be done by somebody who understands it (i.e. not me) and is outside the scope of this bug.
Assignee: nobody → bugmail
(In reply to Mats Palmgren (:mats) from comment #21)
> PresShell::HandleEvent is a horrible mess.
It has been a mess always, and it is still a mess after some major cleanups happened recently.
Comment on attachment 8969693 [details]
Bug 1430813 - Prevent scenario where we keep a dead frame pointer on the EventStateManager.

https://reviewboard.mozilla.org/r/238488/#review244546

I guess we need this on beta too.

::: layout/base/PresShell.cpp:7237
(Diff revision 1)
>          shell = GetShellForEventTarget(frame, targetContent);
>          if (!shell) {
>            return NS_OK;
>          }
> +      } else if (!weakFrame.IsAlive()) {
> +        NS_WARNING("This code is a disaster!!");

Useless comment.
Attachment #8969693 - Flags: review?(bugs) → review+
Btw, one thing which made the code uglier recently is that Chrome and Edge don't follow the pointer event spec and they dispatch click event on an unexpected element and we had to add a similar bizarre code path.
(Google and MS folks would like to get rid of the behavior they have, eventually)
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/dcb911de9709
Prevent scenario where we keep a dead frame pointer on the EventStateManager. r=smaug
https://hg.mozilla.org/mozilla-central/rev/dcb911de9709
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Comment on attachment 8969693 [details]
Bug 1430813 - Prevent scenario where we keep a dead frame pointer on the EventStateManager.

Approval Request Comment
[Feature/Bug causing the regression]: probably since pointer events, code was added in bug 1420589
[User impact if declined]: crashes during touch events, under some circumstances, if the web content has a listener that makes the target frame get destroyed
[Is this code covered by automated tests?]: no
[Has the fix been verified in Nightly?]: no
[Needs manual test from QE? If yes, steps to reproduce]: don't have STR
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: not particularly
[Why is the change risky/not risky?]: patch is pretty well contained. it shouldn't make things worse, although there might be other faults in the code that it doesn't fix.
[String changes made/needed]: none
Attachment #8969693 - Flags: approval-mozilla-beta?
Comment on attachment 8969693 [details]
Bug 1430813 - Prevent scenario where we keep a dead frame pointer on the EventStateManager.

This is a high volume crash in 59, so it's great if we can improve this in Thursday's beta 16 build, though it may be hard to tell if we've fixed it before 60 release.
Attachment #8969693 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Blocks: 1466208
You need to log in before you can comment on or make changes to this bug.