Last Comment Bug 679933 - Crash [@ nsLayoutUtils::GetFirstContinuationOrSpecialSibling] with input and mask
: Crash [@ nsLayoutUtils::GetFirstContinuationOrSpecialSibling] with input and ...
Status: RESOLVED FIXED
[sg:dos][mitigated by frame-poisoning]
: crash, regression, testcase
Product: Core
Classification: Components
Component: SVG (show other bugs)
: Trunk
: All All
: -- critical (vote)
: mozilla9
Assigned To: Daniel Holbert [:dholbert]
:
: Jet Villegas (:jet)
Mentors:
Depends on: 682649 683702
Blocks: PoisonFrameCrash 620144
  Show dependency treegraph
 
Reported: 2011-08-17 16:40 PDT by Martijn Wargers [:mwargers] (not working for Mozilla)
Modified: 2011-08-31 14:31 PDT (History)
8 users (show)
dholbert: in‑testsuite+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
affected
affected
affected
unaffected
unaffected


Attachments
testcase (237 bytes, text/html)
2011-08-17 16:40 PDT, Martijn Wargers [:mwargers] (not working for Mozilla)
no flags Details
fix v1 (1.09 KB, patch)
2011-08-18 13:34 PDT, Daniel Holbert [:dholbert]
no flags Details | Diff | Splinter Review
patch v2 (1.40 KB, patch)
2011-08-19 17:48 PDT, Daniel Holbert [:dholbert]
no flags Details | Diff | Splinter Review
fix v3 (2.64 KB, patch)
2011-08-24 18:09 PDT, Daniel Holbert [:dholbert]
roc: review+
Details | Diff | Splinter Review

Description Martijn Wargers [:mwargers] (not working for Mozilla) 2011-08-17 16:40:04 PDT
Created attachment 553956 [details]
testcase

See testcase, which crashes current trunk build after 100ms. I guess this is a regression from bug 450340, when looking at the stack.

https://crash-stats.mozilla.com/report/index/bp-38334de0-69ac-4f21-9568-397262110817
0 	xul.dll 	nsLayoutUtils::GetFirstContinuationOrSpecialSibling 	
1 	xul.dll 	nsIFrame::InvalidateInternal 	layout/generic/nsFrame.cpp:4179
2 	xul.dll 	nsBlockFrame::InvalidateInternal 	layout/generic/nsBlockFrame.cpp:549
3 	xul.dll 	nsIFrame::InvalidateInternalAfterResize 	layout/generic/nsFrame.cpp:4169
4 	xul.dll 	nsIFrame::InvalidateInternal 	layout/generic/nsFrame.cpp:4184
5 	xul.dll 	nsIFrame::InvalidateWithFlags 	layout/generic/nsFrame.cpp:4108
6 	xul.dll 	nsIFrame::Invalidate 	layout/generic/nsIFrame.h:2031
7 	xul.dll 	nsIFrame::InvalidateOverflowRect 	layout/generic/nsFrame.cpp:4277
8 	xul.dll 	InvalidateAllContinuations 	layout/svg/base/src/nsSVGEffects.cpp:270
9 	xul.dll 	nsSVGPaintingProperty::DoUpdate 	
10 	xul.dll 	nsFrame::DestroyFrom 	layout/generic/nsFrame.cpp:421
11 	xul.dll 	nsContainerFrame::DestroyFrom 	layout/generic/nsContainerFrame.cpp:290
12 	xul.dll 	nsTextControlFrame::DestroyFrom 	layout/forms/nsTextControlFrame.cpp:210
etc..
Comment 1 Daniel Holbert [:dholbert] 2011-08-17 16:51:17 PDT
I get this regression range (using Linux):
20101220030345 fc50c521bf48
20101221030401 fb629ae54510

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=fc50c521bf48&tochange=fb629ae54510

That's long after bug 450340 landed, so assuming that bug is innocent for now.
Comment 2 Daniel Holbert [:dholbert] 2011-08-17 17:04:23 PDT
Crash is at the last line of this chunk of code:
> nsIFrame*
> nsLayoutUtils::GetFirstContinuationOrSpecialSibling(nsIFrame *aFrame)
> {
>   nsIFrame *result = aFrame->GetFirstContinuation();
>   if (result->GetStateBits() & NS_FRAME_IS_SPECIAL) {
>     while (PR_TRUE) {
>       nsIFrame *f = static_cast<nsIFrame*>
>         (result->Properties().Get(nsIFrame::IBSplitSpecialPrevSibling()));

There, |result| appears to be a deleted (or uninitialized) frame:
=======
(gdb) p *result
$5 = {
  <nsQueryFrame> = {
    _vptr.nsQueryFrame = 0x7ffffffff0dea7ff
  }, 
  members of nsIFrame: 
  static kFrameIID = nsQueryFrame::nsIFrame_id, 
  mRect = {
    <mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>> = {
      x = -253843457, 
      y = 2147483647, 
      width = -253843457, 
      height = 2147483647
    }, <No data fields>}, 
  mContent = 0x7ffffffff0dea7ff, 
  mStyleContext = 0x7ffffffff0dea7ff, 
  mParent = 0x7ffffffff0dea7ff, 
  mNextSibling = 0x7ffffffff0dea7ff, 
  mPrevSibling = 0x7ffffffff0dea7ff, 
  mState = 9223372036600932351, 
  mOverflow = {
    mType = 4041123839, 
    mVisualDeltas = {
      mLeft = 255 '\377', 
      mTop = 167 '\247', 
      mRight = 222 '\336', 
      mBottom = 240 '\360'
    }
  }
}
=======

At first glance, I don't think this is exploitable.
(When we crash, we're calling the non-virtual function nsIFrame::Properties(), which calls the non-virtual function PresContext(), which calls the non-virtual function GetStyleContext() (which just returns a member variable nsIFrame::mStyleContext), on which we call the non-virtual function GetRuleNode(), which tries to looks up "mRuleNode" on our bogus style-context pointer, and we basically crash due to dereferencing our bogus style-context pointer.)

Tentatively flagging as security-sensitive anyway, to be on the safe side (& in case a variant of this might be able to do crash in a more-evil way).
Comment 3 Daniel Holbert [:dholbert] 2011-08-17 17:07:50 PDT
(In reply to Daniel Holbert [:dholbert] from comment #2)
>     _vptr.nsQueryFrame = 0x7ffffffff0dea7ff
[...]
>   mContent = 0x7ffffffff0dea7ff, 
>   mStyleContext = 0x7ffffffff0dea7ff, 
>   mParent = 0x7ffffffff0dea7ff, 
>   mNextSibling = 0x7ffffffff0dea7ff, 
>   mPrevSibling = 0x7ffffffff0dea7ff, 

I believe these are frame-poisoned values, so hopefully this isn't too scary.
Comment 4 Daniel Holbert [:dholbert] 2011-08-17 17:24:01 PDT
Based on the use of masks in the testcase, this changeset stands out in the pushlog as likely to be related:
>	0ca1b65bb907	Robert Longson — Bug 620144 - clip paths and masks that cannot be resolved should be ignored r=jwatt,a=roc

(In fact, the testcase uses a mask that can't be resolved -- "#a")
Comment 5 Robert Longson 2011-08-18 03:07:58 PDT
Doesn't crash on Windows 7 with my Nightly. Are you all testing on Macs?
Comment 6 Daniel Holbert [:dholbert] 2011-08-18 03:12:36 PDT
I was testing using 64-bit Linux.

I also verified that backing out Bug 620144 fixes this, and that it stays fixed if I reapply all of that patch except for the tweak to the nsSVGRenderingObserver::GetReferencedFrame impl.  (but once I reapply that chunk, it re-breaks)

Will investigate further tomorrow.  Could be an uninitialized memory thing -- some sort of unintended reliance on that method's *aOK-setting-behavior.
Comment 7 Robert Longson 2011-08-18 03:24:35 PDT
(In reply to Daniel Holbert [:dholbert] from comment #6)
> I also verified that backing out Bug 620144 fixes this, and that it stays
> fixed if I reapply all of that patch except for the tweak to the
> nsSVGRenderingObserver::GetReferencedFrame impl.  (but once I reapply that
> chunk, it re-breaks)

That's the only non-comment and non-test change in that bug!

If we have a frame and the frame is the right type we return it with and without the patch.

If we have a frame and the frame is the wrong type we return nsnull and set aOK to false again with/without the patch

The only time we're different is when we don't have a frame. Here we still return nsnull but we don't set aOK to false.

We check in nsSVGUtils::PaintFrameWithEffects and nsSVGIntegrationUtils::PaintFramesWithEffects that mask is not null before we try to use it but we do carry on and try to paint stuff (without the mask) whereas before we would have bailed.
Comment 8 Robert Longson 2011-08-18 03:43:58 PDT
Actually I did get it to crash, it just doen't crash when you step through the code in the debugger.
Comment 9 Brandon Sterne (:bsterne) 2011-08-18 10:50:11 PDT
I'm not asserting anything about the exploitability of this bug, but wanted to fix the whiteboard so our queries will find it.
Comment 10 Daniel Holbert [:dholbert] 2011-08-18 13:05:16 PDT
So, what basically happens is this:
 (i)   We've got a nsBlockFrame whose "IBSplitSpecialPrevSibling" property points to its previous sibling (a nsInlineFrame)
 (ii)  While servicing the testcase's dynamic change, we call DestroyFrom() for the outermost container block.
 (iii) That calls "DestroyFrom" on the container's first child -- the nsInlineFrame.
 (iv)  Then it calls "DestroyFrom" on the container's second child -- the nsBlockFrame.
 (v)   As part of destroying our nsBlockFrame, we end up calling nsSVGIntegrationUtils::GetInvalidAreaForChangedSource on it, while invalidating observers of one of its children.
 (vi)  That makes us try to look up the block frame's first continuation.
 (vii) So, we grab its nsIFrame::IBSplitSpecialPrevSibling() property, which points to a frame that's been *destroyed* (up above in 'iii'). As soon as we use this frame, we crash.
Comment 11 Daniel Holbert [:dholbert] 2011-08-18 13:06:26 PDT
(Sorry, not sure how I changed Target Milestone & Branch. Reverting those changes)
Comment 12 Robert Longson 2011-08-18 13:14:57 PDT
Should we just clear out the child properties in step ii) i.e. before step iii.
Comment 13 Daniel Veditz [:dveditz] 2011-08-18 13:16:20 PDT
This looks like a poisoned-frame crash so not exploitable. If it's not a topcrash we don't need to track it for any particular release (but a fix would be nice, especially if we know the cause).
Comment 14 Daniel Holbert [:dholbert] 2011-08-18 13:34:41 PDT
Created attachment 554193 [details] [diff] [review]
fix v1

I'm not sure this is the best solution, but it does fix the crash.

The idea is to jump in between Comment 10's (iii) and (iv), and clear the reference to the frame that we've just destroyed in (iii).

Robert's suggestion (clear the property table entirely) is simpler, but I'm guessing we might need some frame properties while frame-Destroying in some cases...?  (not sure)
Comment 15 Robert O'Callahan (:roc) (email my personal email if necessary) 2011-08-18 16:47:05 PDT
I think it would make more sense, when we destroy a FRAME_IS_SPECIAL frame in step iii, to remove the IBSplitSpecialPrevSibling property from its next-continuation at that point. Get rid of the dangling pointer as quickly as possible.
Comment 16 Daniel Holbert [:dholbert] 2011-08-19 17:48:18 PDT
Created attachment 554592 [details] [diff] [review]
patch v2

This patch removes the dangling pointer to the nsInlineFrame just before we destroy it, along the lines of what roc suggested.

However, when I mentioned this to dbaron, he said it was odd that these IB siblings are all in the same linebox at this point (at least I think they are) -- so I want to investigate a bit more before requesting review.
Comment 17 Boris Zbarsky [:bz] (still a bit busy) 2011-08-23 12:07:03 PDT
Patch 2 looks fine, except it would be really nice to only do the Get() for first-continuations of special frames.... but at that point I suspect child is always a first-continuation because the earlier ones are dead.

And I looked at the frame tree here, and it looks fine: the inlines and blocks are in different line boxes.
Comment 18 Robert O'Callahan (:roc) (email my personal email if necessary) 2011-08-23 18:10:29 PDT
Wouldn't it make more sense to put this destruction logic in nsFrame::DestroyFrom?
Comment 19 Daniel Holbert [:dholbert] 2011-08-23 18:14:57 PDT
(In reply to Boris Zbarsky (:bz) from comment #17)
> And I looked at the frame tree here, and it looks fine: the inlines and
> blocks are in different line boxes.

Ok, thanks (& sorry for the false alarm on that).  I've been busy wrestling with a tegra board on bug 652050 and hadn't double-checked it yet.

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #18)
> Wouldn't it make more sense to put this destruction logic in
> nsFrame::DestroyFrom?

Yes, I think you're right. I'll make that change.
Comment 20 Daniel Holbert [:dholbert] 2011-08-24 18:09:10 PDT
Created attachment 555610 [details] [diff] [review]
fix v3

This patch moves the destruction logic to nsFrame::DestroyFrom, as roc suggests.  Also added a crashtest.
Comment 21 Daniel Holbert [:dholbert] 2011-08-25 01:34:11 PDT
http://hg.mozilla.org/integration/mozilla-inbound/rev/c6e432ffd5e2
Comment 22 Ed Morley [:emorley] 2011-08-25 18:45:54 PDT
Merged to m-c:
http://hg.mozilla.org/mozilla-central/rev/c6e432ffd5e2

Not closing due to status-firefox6/7/8: affected flags, just in case this was going to be backported. Please close if appropriate.
Comment 23 Daniel Holbert [:dholbert] 2011-08-25 20:24:22 PDT
Thanks for the merge!  Closing (We close bugs when they land on Trunk, regardless of fixed-on-branches-status -- and then we use flags to track branch landings.)

I'm not going to lobby for landing this on branches, since it's a safe crash and not likely to be hit by actual web content.

Note You need to log in before you can comment on or make changes to this bug.