Closed Bug 762902 Opened 12 years ago Closed 11 years ago

crash in mozilla::layout::AutoMaybeDisableFontInflation::AutoMaybeDisableFontInflation

Categories

(Core :: Layout, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla22

People

(Reporter: martijn.martijn, Assigned: jwir3)

References

(Blocks 1 open bug)

Details

(Keywords: crash, testcase)

Crash Data

Attachments

(5 files, 1 obsolete file)

Attached file testcase
See testcase, which crashes current trunk build on load.
The top stack seems to indicate this has perhaps something to do with font inflation?

This bug was filed from the Socorro interface and is 
report bp-d022568a-08c3-4c03-b35d-1341b2120608 .
============================================================= 
0 	xul.dll 	mozilla::layout::AutoMaybeDisableFontInflation::AutoMaybeDisableFontInflation 	layout/base/nsLayoutUtils.h:1682
1 	xul.dll 	ChildShrinkWrapWidth 	layout/tables/nsTableOuterFrame.cpp:503
2 	xul.dll 	nsTableOuterFrame::ComputeAutoSize 	layout/tables/nsTableOuterFrame.cpp:544
3 	xul.dll 	nsFrame::ComputeSize 	layout/generic/nsFrame.cpp:3782
4 	xul.dll 	nsHTMLReflowState::InitConstraints 	layout/generic/nsHTMLReflowState.cpp:1897
5 	xul.dll 	nsHTMLReflowState::Init 	layout/generic/nsHTMLReflowState.cpp:250
6 	xul.dll 	nsHTMLReflowState::nsHTMLReflowState 	layout/generic/nsHTMLReflowState.cpp:139
7 	xul.dll 	nsContainerFrame::ReflowOverflowContainerChildren 	layout/generic/nsContainerFrame.cpp:1089
8 	xul.dll 	nsLayoutUtils::CalculateContentBottom 	layout/base/nsLayoutUtils.cpp:3314
9 		@0x4af 	
10 		@0xaa6aa5f
Still crashes in current trunk build.
Blocks: 724978
I'm getting this when running the testcase for bug https://bugzilla.mozilla.org/show_bug.cgi?id=836895 after applying the trunk patch for bug 724978.
It looks like this is happening because InnerTableFrame() is null. There seem to be other situations where InnerTableFrame() is assumed to be non-null...

I suppose I could simply add a null check as the first condition of the if statement in the constructor for AutoMaybeDisableFontInflation, but this doesn't explain why the InnerTableFrame() is getting into a situation where it's null.

dbaron, is this an case which you would expect to happen? (i.e. Can an nsOuterTableFrame have an inner table frame that's null, or is this an exceptional circumstance anyway?)
Flags: needinfo?(dbaron)
It looks to me like InnerTableFrame is somewhat poorly named. It returns mFrames.FirstChild(), which (iiuc) could be null at any time. So, it should be called GetInnerTableFrame().

Aside from that, there doesn't appear to be any special handling in nsOuterTableFrame that ensures it's first child will always exist. I could be wrong about this, though, since I've only been looking within the nsOuterTableFrame class itself...
I don't think we should ever have an outer table frame that has a null inner table frame, except while it is being constructed or destroyed; I'd expect code in nsCSSFrameConstructor would ensure this, but I haven't checked.
Flags: needinfo?(dbaron)
Assignee: nobody → sjohnson
What looks like is happening is that an outer table frame is split across two continuations, like the following:


                     +------------------+
                     |                  |
                     |    block frame   |
                     |                  |
                     +------------------+
                              +
                              |
                              |
                              |
                 +------------+------------+
                 |                         |
                 |                      +--+>------------+
           +-----v---------+            |                |
           |               |            |  outer 2       |
           | outer 1       |            +--------+-------+
           +-----+---------+                     |
                 |                               |
                 |                               |
                 |                               |
                 |                               |
                 |                           +----------------+
         +------------------+                |                |
         |                  |                |   inner 2      |
         |   inner 1        |                |                |
         +------------------+                +----------------+

Where outer2 is a continuation of outer 1, and inner 2 is a continuation of inner 1. But, then we decide, while reflowing inner 1, that inner 2 can be merged into inner1. So, we call StealFrame() on inner2, which makes inner2 go away from its parent, causing the InnerTableFrame() call to return null (temporarily).

The problem is that there appears to be quite a bit of code in between when inner2 is stolen and outer 2 is stolen into outer 1 that depends on InnerTableFrame() returning non-null. 

I suppose we could add a special StealFrame() method to nsTableFrame that does the cleanup right away (right now it's using nsContainerFrame's generic StealFrame() method), but I'm not sure how to do this the right way...
Attached patch b762902 (obsolete) — Splinter Review
After looking at this more carefully, I determined what is happening is the following:

1) While trying to find an appropriate balance height, the table is being split into two continuations (one for each of the columns). This leads to a situation where we have one continuation for the first outer table cell (shown in comment 6), along with one of each of the inner table cells.
2) Eventually, we get to a situation where the column height is just large enough to fit the text that is a child of the column set, and thus returns a reflow status of COMPLETE. 3) At the end of the ReflowChild() call, then, since the child frame is COMPLETE, we remove its continuation, which happens to be the inner table frame #2. It's expected that this frame will also be COMPLETE, and as the stack is unwound, we'll eventually get back to nsBlockReflowContext and remove the outer table frame #2. 
4) Unfortunately, we now have a small left over available height in the first column, but not enough to even place one line of text for the absolutely positioned child. During the ReflowAbsoluteFrames(), it's determined that a continuation for outer #1 is actually needed. Since we already have a next in flow (outer #2), we don't create the continuation. Instead, the reflow status is merged into outer #1's reflow status (TRUNCATED, since mIsTopOfPage is false), and we return up to nsBlockReflowContext, which determines that we can't remove the frame outer #2. But, outer #2 is in a bad state, because it's only child has been removed already. 

My solution to this is as follows:
1) Split nsContainerFrame::ReflowChild into two separate functions, ReflowChildAndKeepContinuations(), which does the reflow as before, but doesn't remove the child continuations if the reflow is complete, and RemoveContinuationsIfComplete(), which does the continuation removal. ReflowChild() becomes an unconditional call of both of these, then. 
2) In the outer table frame, if we have absolutely positioned children, then we reflow, but keep the continuations around until the absolute frames are reflowed as well. After that, we perform the cleanup. 

This fixes the crash, as well as the crash from bug 836895. I also made a comment to indicate that this is a special case of reflow for tables, and other clients should use ReflowChild() instead.
Attachment #721311 - Flags: review?(dbaron)
Try run:
https://tbpl.mozilla.org/?tree=Try&rev=a6db5cdb0d96
OS: Windows NT → All
Hardware: x86 → All
Comment on attachment 721311 [details] [diff] [review]
b762902

Re-requesting review, based on dbaron thinking that perhaps you might be better suited to review this, mats.
Attachment #721311 - Flags: review?(dbaron) → review?(matspal)
Comment on attachment 721311 [details] [diff] [review]
b762902

>layout/generic/crashtests/crashtests.list
>+load 762902.html

You forgot to "hg add" this file?

>layout/generic/nsContainerFrame.h
>+  nsresult ReflowChildAndKeepContinuations(nsIFrame*  aKidFrame,
>+  void RemoveContinuationsIfComplete(nsIFrame* aKidFrame,

I think this is unnecessarily invasive.  Allow me to suggest a simpler
alternative solution:
  * Define a new ReflowChild flag, NS_FRAME_NO_DELETE_NEXT_IN_FLOW.
  * In ReflowChild, skip the DeleteNextInFlowChild if it's set.
  * Pass that flag to ReflowChild of the inner table frame.

I don't think we ever need to DeleteNextInFlowChild an inner table
frame because the only time we want it removed is when an outer table
frame does a fully complete reflow and destroys its next-in-flow,
in which case the next-in-flow inner table frame will be destroyed too.

(FYI, you probably want to update your tree to get bug 850964 to avoid
a conflict on adding that flag (if you agree on that solution)).

>layout/tables/nsTableOuterFrame.cpp

The changes fixes the crash but the frame tree is still not quite right.
The ExcessOverflowContainersList list should have been picked up by the
next-in-flow.  I think we might as well fix that bit while we're here.
Something like this:

>   UpdateReflowMetrics(captionSide, aDesiredSize, innerMargin, captionMargin);
>+  if (GetPrevInFlow()) {
>+    ReflowOverflowContainerChildren(aPresContext, aOuterRS,
>+                                    aDesiredSize.mOverflowAreas, 0,
>+                                    aStatus);
>+  }
>   FinishReflowWithAbsoluteFrames(aPresContext, aDesiredSize, aOuterRS, aStatus);
Attachment #721311 - Flags: review?(matspal) → review-
I think this is what the frame tree should look like.
There's still a layout error with these changes: where's the "t" ?
But we can spawn that off as a separate bug.
(In reply to Mats Palmgren [:mats] from comment #12)
> You forgot to "hg add" this file?
Ah, yes. I added it with the try-server run, but not the patch I posted, apparently.

> I think this is unnecessarily invasive.  Allow me to suggest a simpler
> alternative solution:
>   * Define a new ReflowChild flag, NS_FRAME_NO_DELETE_NEXT_IN_FLOW.
>   * In ReflowChild, skip the DeleteNextInFlowChild if it's set.
>   * Pass that flag to ReflowChild of the inner table frame.
> I don't think we ever need to DeleteNextInFlowChild an inner table
> frame because the only time we want it removed is when an outer table
> frame does a fully complete reflow and destroys its next-in-flow,
> in which case the next-in-flow inner table frame will be destroyed too.

Yes, this seems reasonable, but I think you meant "outer table frame" in the last step, right? (Since we're having issues with the next-in-flow child being deleted - i.e. the inner table frame - we'd have to pass it as flags to the outer table frame). So, my thinking is something along these lines:
- Add the flag to nsContainerFrame.h
- Only delete next-in-flow children if we don't have NO_DELETE_NEXT_IN_FLOW set.
- Pass the flag as part of ReflowChild() on the outer table frame.

> (FYI, you probably want to update your tree to get bug 850964 to avoid
> a conflict on adding that flag (if you agree on that solution)).

Thanks for that heads-up. :)

> The changes fixes the crash but the frame tree is still not quite right.
> The ExcessOverflowContainersList list should have been picked up by the
> next-in-flow.  I think we might as well fix that bit while we're here.
> Something like this:
> 
> >   UpdateReflowMetrics(captionSide, aDesiredSize, innerMargin, captionMargin);
> >+  if (GetPrevInFlow()) {
> >+    ReflowOverflowContainerChildren(aPresContext, aOuterRS,
> >+                                    aDesiredSize.mOverflowAreas, 0,
> >+                                    aStatus);
> >+  }
> >   FinishReflowWithAbsoluteFrames(aPresContext, aDesiredSize, aOuterRS, aStatus);

I'll look at this, as well.
> - Pass the flag as part of ReflowChild() on the outer table frame.

I'm sure we mean the same thing :-)

Just to be clear, I mean nsTableOuterFrame::OuterDoReflowChild should pass
the flag to ReflowChild when aChildFrame == InnerTableFrame().
(In reply to Mats Palmgren [:mats] from comment #16)
> Just to be clear, I mean nsTableOuterFrame::OuterDoReflowChild should pass
> the flag to ReflowChild when aChildFrame == InnerTableFrame().

Yes, that is what I meant, too. ;) Just wanted to be sure you and I weren't talking about different things, I was looking at it incorrectly. :)
Attached patch b762902 (v2)Splinter Review
Addressed review comments in the previous patch. I'll split the layout issue into another bug.
Attachment #721311 - Attachment is obsolete: true
Attachment #724931 - Flags: review?(matspal)
Blocks: 851121
(In reply to Mats Palmgren [:mats] from comment #14)
> There's still a layout error with these changes: where's the "t" ?
> But we can spawn that off as a separate bug.

Added bug 851121 to track this.
Comment on attachment 724931 [details] [diff] [review]
b762902 (v2)

This looks great.  r=mats

A minor nit:

+// Only applies to ReflowChild; if true, don't delete the next-in-flow, even
+// if it's empty.

s/even if it's empty/even if the reflow is fully complete/
is probably more accurate.
Attachment #724931 - Flags: review?(matspal) → review+
Inbound:
https://hg.mozilla.org/integration/mozilla-inbound/rev/cc590755a7dc
Target Milestone: --- → mozilla22
https://hg.mozilla.org/mozilla-central/rev/cc590755a7dc
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
fwiw, crash automation hit this testcase and reproduced on beta/21 but not aurora/22 or nightly./23
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: