Closed Bug 765409 Opened 12 years ago Closed 12 years ago

Crash [@ nsLineLayout::ReflowFrame(nsIFrame *,unsigned int &,nsHTMLReflowMetrics *,bool &) ] | ASSERTION: unexpected flow: 'mFrames.ContainsFrame(nextInFlow)' | ASSERTION: StealFrame: can't find aChild: 'removed' | ASSERTION: StealFrame failure: 'NS_SUCCE

Categories

(Core :: Layout, defect)

14 Branch
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla18
Tracking Status
firefox14 - wontfix
firefox15 - wontfix
firefox16 - wontfix
firefox17 + verified
firefox18 + verified
firefox-esr10 --- unaffected
firefox-esr17 --- verified

People

(Reporter: bc, Assigned: MatsPalmgren_bugz)

References

()

Details

(6 keywords, Whiteboard: [adv-track-main17+])

Crash Data

Attachments

(5 files, 3 obsolete files)

Attached file crash report
1. http://www2.emersonprocess.com/en-US/brands/EIM-Actuators/Products/Electric/Pages/M2CP_Electric_Actuator.aspx

###!!! ASSERTION: unexpected flow: 'mFrames.ContainsFrame(nextInFlow)', file c:/work/mozilla/builds/nightly/mozilla/layout/generic/nsInlineFrame.cpp, line 500

###!!! ASSERTION: StealFrame: can't find aChild: 'removed', file c:/work/mozilla/builds/nightly/mozilla/layout/generic/nsContainerFrame.cpp, line 1210

###!!! ASSERTION: StealFrame failure: 'NS_SUCCEEDED(rv)', file c:/work/mozilla/builds/nightly/mozilla/layout/generic/nsContainerFrame.cpp, line 1339

Crash @ nsLineLayout::ReflowFrame(nsIFrame *,unsigned int &,nsHTMLReflowMetrics *,bool &)

This appears to be Beta/14, Aurora/15, and Nightly/16 though I am getting inconsistent results sometimes on different platforms.
Crash Signature: nsLineLayout::ReflowFrame(nsIFrame *,unsigned int &,nsHTMLReflowMetrics *,bool &) nsLineLayout::ReflowFrame) → [@ nsLineLayout::ReflowFrame(nsIFrame *,unsigned int &,nsHTMLReflowMetrics *,bool &) ] [@ nsLineLayout::ReflowFrame ]
It might be a regression from bug 537624.
Version: Trunk → 14 Branch
Attached file frame dump
I think the assumption here
http://hg.mozilla.org/mozilla-central/annotate/9602a9e99045/layout/generic/nsInlineFrame.cpp#l490
that all our children's NIFs must be in our child list and should
be reparented is wrong.
Assignee: nobody → matspal
Attached patch fix (obsolete) — Splinter Review
Only reparent a child's next-in-flows that has our prevInFlow as parent (we stole those from its OverflowList).

https://tbpl.mozilla.org/?usebuildbot=1&tree=Try&rev=09c6f9338c13
The code here still looks a bit fragile.  If we leave the loop that
reparent and reflow child frames early we leave a tail with the
wrong parent.  It looks like we have something in Reflow to fix that
(assuming they got pushed to our OverflowList):
http://hg.mozilla.org/mozilla-central/annotate/9602a9e99045/layout/generic/nsInlineFrame.cpp#l364
but it only does a simple SetParent, not the ReparentFloatsForInlineChild
and ReparentStyleContext that we do in the loop.

I think it would be more robust to continue reparenting all the child
frames, but not reflow any after 'done' is set true of course.
It's more complicated than that since ReflowInlineFrame also does
reparenting, and may have pushed the tail to our own OverflowList:
http://hg.mozilla.org/mozilla-central/annotate/9602a9e99045/layout/generic/nsInlineFrame.cpp#l670
Keywords: testcase-wanted
The reduction was fairly straightforward. The less obvious bits:

* A slow-loading image became a dynamic change to an inline-block element's size.

* <div style="display: inline"> became <span style="unicode-bidi: -moz-isolate">

Suddenly the fuzz testcase in bug 760957 doesn't seem so weird!
Group: core-security
Blocks: 765711
Hi Mats, it looks like you aren't quite happy with your current fix?
Comment on attachment 633853 [details] [diff] [review]
fix

Right.
Attachment #633853 - Attachment is obsolete: true
(In reply to David Bolter [:davidb] from comment #11)
> What should we do here?

We're still targeting this for FF15 if a low risk fix is found. What are the next steps here?
Attached patch fix v2 (obsolete) — Splinter Review
Comments roughly per hunk in the order they appear:

Return earlier in case of IsFrameTreeTooDeep (to avoid leaving frames
on the principal list that needs reparenting)

If we're not lazily reparenting, then reparent any frames we still
from the prev-in-flow overflow list full - i.e. SetParent + 
ReparentFloatsForInlineChild + ReparentStyleContext (if :first-line).

Reparent the next-in-flows of a child we're lazily parenting, until
we find a sibling with our next-in-flow as the parent (this is the
correct condition to test for unlike the last patch).

When lazily reparenting, if we stop reflowing children for some
reason then continue reparenting the remaining children on the
principal list...

... thus no need to that in ReflowInlineFrame.

In PullOneFrame, we only need one frame so don't pull in all frames
from the nif's overflow list since then we would need to reparent
all of them. (thus leaving the nif's principal list empty when we
return - I don't see a problem with that).


Green on Try:
https://tbpl.mozilla.org/?usebuildbot=1&tree=Try&rev=c6defdd9ee36
Attachment #654369 - Flags: review?(bzbarsky)
Attached patch fix v2, wdiff (obsolete) — Splinter Review
... well, the hunks I commented is the ones in this diff actually.
(In reply to Alex Keybl [:akeybl] from comment #12)
> We're still targeting this for FF15 if a low risk fix is found. What are the
> next steps here?

I think this patch is too high risk for 15.
Keywords: highrisk
Comment on attachment 654369 [details] [diff] [review]
fix v2

r=me
Attachment #654369 - Flags: review?(bzbarsky) → review+
Oops, I triggered this assertion while testing this again...

It seems our stop condition - that the child's nif's parent is
our nif, does not hold because it can be further away in the
nif chain.

In this case, we have an empty nif (lime 0x7fffddb49ae0) in
between us (blue 0x7fffd70793f0) and our child's nif (red).
Attachment #654369 - Attachment is obsolete: true
Attachment #654370 - Attachment is obsolete: true
So to avoid doing O(n*m) frame traversals we could put all
the (m) nifs in a temporary hashtable, but that seems a bit
expensive for the gain of lazy parenting.

Perhaps we should just not do lazy parenting in case there
are more than one nif?

Suggestions?
Find our first non-empty next-in-flow and use that for all n frames?
Attached patch fix, v3Splinter Review
Thanks roc, yes that would work, but I think I've found better way.
I collected some statistics on the frame tree when
"child->GetNextInFlow()" is non-null:

Running reftests+crashtests results in 10 hits (~0.5%), in all of them
the nif equals GetNextSibling().

Loading the Alexa top 100 sites results in zero hits (even when
resizing the window, reloading etc).

For the crash URL, the hit rate is ~3%, in 75% of those the nif
is the next-sibling.  The remaining 25% are either in the inline's nif,
or in an empty nif's nif (and in these cases GetNextSibling() is null).

So, in 100% of the tests, the nif is either the next sibling (and
we should reparent it) or the next sibling is null (and we shouldn't).
I think it's rather unlikely that we will have any other cases.

====

This patch is the same as before but with the following inter-diff:
140a141
> +        nsIFrame* nextSibling = child->GetNextSibling();
142,145c143,149
< +        if (child && child->GetParent() == GetNextContinuation() &&
< +            GetNextContinuation()) {
< +          MOZ_ASSERT(!mFrames.ContainsFrame(child));
< +          child = nullptr;
---
> +        if (NS_UNLIKELY(child)) {
> +          while (child != nextSibling && nextSibling) {
> +            nextSibling = nextSibling->GetNextSibling();
> +          }
> +          if (!nextSibling) {
> +            child = nullptr;
> +          }

So instead of testing the child's parent I'm simply iterating the
siblings to see of it can be found (for the test data above we will
never enter the loop).  This is better because it's exactly what
we want to do - if it's in mFrames it should be reparented.

https://tbpl.mozilla.org/?tree=Try&rev=7ced33ae2dd1
Attachment #659230 - Flags: review?(bzbarsky)
Comment on attachment 659230 [details] [diff] [review]
fix, v3

r=me
Attachment #659230 - Flags: review?(bzbarsky) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/e0c765e8be0f
Flags: in-testsuite?
Target Milestone: --- → mozilla18
https://hg.mozilla.org/mozilla-central/rev/e0c765e8be0f
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
This is sec-critical and tracking-firefox17+, are we planning on landing this on beta?
Crash Signature: [@ nsLineLayout::ReflowFrame(nsIFrame *,unsigned int &,nsHTMLReflowMetrics *,bool &) ] [@ nsLineLayout::ReflowFrame ] → [@ nsLineLayout::ReflowFrame(nsIFrame*, unsigned int&, nsHTMLReflowMetrics*, bool&) ] [@ nsLineLayout::ReflowFrame ]
Comment on attachment 659230 [details] [diff] [review]
fix, v3

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 
User impact if declined: sec-critical crash
Testing completed (on m-c, etc.): on trunk/aurora since 2012-09-07
Risk to taking this patch (and alternatives if risky): medium risk since the code is somewhat fragile.  The final patch (v3) is much better than the earlier iterations though (which comment 15 refers to as high risk).
String or UUID changes made by this patch: none
Attachment #659230 - Flags: approval-mozilla-beta?
No longer blocks: 760952
Please set sec-approval? on this patch as it fits the criteria, being sec-critical and looking for uplift.
Comment on attachment 659230 [details] [diff] [review]
fix, v3

Sorry, my bad - this doesn't need sec-approval given that it's already on trunk & Aurora now.
Attachment #659230 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Whiteboard: [adv-track-main17+]
Keywords: verifyme
Confirmed crash on build 2012-6-15 nightly
Verified fixed on build 2012-11-13, 17.0b6
Win7 and Mac OS 10.8
Thanks Matt, can you also verify in the latest 17esr nightly? Aurora 18.0a2 too if you have time.
Verified fixed on build 2012-11-14, 17.0 ESR
Verified fixed on build 2012-11-14, 18.0a2 Aurora
Depends on: 837007
Thanks Matt.

Changing status-firefox16:affected to wontfix since we won't be issuing patches for that version.
Status: RESOLVED → VERIFIED
Keywords: verifyme
Group: core-security
Crash test:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4c559cbeba13
Flags: in-testsuite? → in-testsuite+
Blocks: 5588
Depends on: 1382213
No longer depends on: 1382213
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: