Closed Bug 393956 Opened 17 years ago Closed 17 years ago

Crash [@ nsIFrame::HasView] [@ nsIFrame::GetNextInFlow] with -moz-column and height

Categories

(Core :: Layout, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED

People

(Reporter: jruderman, Assigned: fantasai.bugs)

References

Details

(Keywords: assertion, crash, testcase, Whiteboard: [sg:critical?])

Crash Data

Attachments

(7 files, 2 obsolete files)

The testcases are based on reftests/bugs/379349-1b.xhtml (from bug 379349).  The only addition is the (dynamic) use of the CSS "content" property.
Flags: blocking1.9?
###!!! ASSERTION: same old and new parent frame: 'aOldParentFrame != aNewParentFrame', file /Users/jruderman/trunk/mozilla/layout/generic/nsHTMLContainerFrame.cpp, line 423

###!!! ASSERTION: overflow containers out of order or bad parent: '!(aOverflowCont->GetStateBits() & NS_FRAME_IS_OVERFLOW_CONTAINER)', file /Users/jruderman/trunk/mozilla/layout/generic/nsContainerFrame.cpp, line 1331

###!!! ASSERTION: existing framelist: 'rv != NS_PROPTABLE_PROP_OVERWRITTEN', file /Users/jruderman/trunk/mozilla/layout/generic/nsContainerFrame.cpp, line 1141

Crash [@ nsIFrame::HasView] dereferencing 0xddddde01
###!!! ASSERTION: bad prev-in-flow: 'prevInFlow', file /Users/jruderman/trunk/mozilla/layout/generic/nsContainerFrame.cpp, line 1002

Crash [@ nsIFrame::GetNextInFlow] dereferencing 0x00000000
Whiteboard: [sg:critical?]
Depends on: 394237
OS: Mac OS X → All
Might this be fixed by Boris's recent changes to making sure counter changes are within BeginUpdate/EndUpdate?  (Are they even all landed yet?)
I crash on the testcases here with the tree that has those changes in it, so they don't seem to help.
The issue is that, inside  of nsOverflowContinuationTracker::Insert, the call to  mParent->SetPropTableFrames() at 1384 is indirectly causing us to destroy the "aOverflowCont" frame.  After we pass that line, aOverflowFrame has all its pointers set to 0xdddddddd.  

And then the next time we touch aOverflowFrame, we crash.

Source code link:
  http://mxr.mozilla.org/seamonkey/source/layout/generic/nsContainerFrame.cpp#1384
(In reply to comment #6)
> And then the next time we touch aOverflowFrame, we crash.

s/aOverflowFrame/aOverflowCont/
Here's the backtrace for when we destroy aOverflowCont, which has address 0x8b42868.

Notice in particular level 15 (inside of nsOverflowContinuationTracker::Insert) and level 8 (the "Destroy" call for aOverflowCont)
Flags: blocking1.9? → blocking1.9+
Attachment #283621 - Attachment description: reduced test (1 col) → reduced test, 1-col. (crashes when loaded)
Oops -- I left a dangling "}" at the end of the style section for these reduced tests.  

Doesn't affect how we handle them, but removing it anyway for correctness.
Attachment #283621 - Attachment is obsolete: true
Assignee: nobody → dholbert
Ok, I'm pretty sure the big-picture problem is this:

Consider a 2-column page, with overflowing content inside the columns.  Its frame tree will have a columnSetFrame with three blockFrames in it -- one blockFrame for each of the two columns, and a third "dummy" blockFrame to be the nextContinuation for the second column. (to store its overflow content)

Now, the third "dummy" blockFrame is *supposed to* purge the second column's blockFrame's ExcessOverflowContainers-list, but it never actually does, because this dummy blockFrame never get reflowed. (At least, it doesn't get reflowed when we need it to be for the purposes of this bug.) (See code snippet below)  


We do depend on it to be reflowed, because that's when it'd call nsContainerFrame::ReflowOverflowContainerChildren to purge its prevInFlow's ExcessOverflowContainers-list. (i.e. to purge the list on the second  column's blockFrame)


This situation causes problems when we move to the next iteration of the "while(1)" reflow loop in nsColumnSetFrame::Reflow, when we repeat the reflow using different metrics.  In the next iteration of this reflow, if we need to set the ExcessOverflowContainers-list for the second blockFrame, we'll find that there's already a list there.  This is unexpected behavior, and we end up freeing the existing frame list, which frames that are still in use.  (*bad*)


Here's some relevant code from nsColumnSetFrame.cpp, showing where we skip over reflowing the "dummy" column. The enclosing "while" loop  basically loops across all of the columns(using the variable "child"), reflowing each one.  When we break out of the loop, child is the second column's blockFrame and kidNextInFlow is its continuation.

 444   while (child) {
   ...
 517       ReflowChild(child, PresContext(), kidDesiredSize, kidReflowState
   ...
 546     nsIFrame* kidNextInFlow = child->GetNextInFlow();
 547 
 548     if (NS_FRAME_IS_FULLY_COMPLETE(aStatus) && !NS_FRAME_IS_TRUNCATED(aStatus)) {
   ...
 551     } else {
 552       ++columnCount;
  ...
 585       if (columnCount >= aConfig.mBalanceColCount) {
 586         // No more columns allowed here. Stop.
 587         aStatus |= NS_FRAME_REFLOW_NEXTINFLOW;
 588         kidNextInFlow->AddStateBits(NS_FRAME_IS_DIRTY);
 589         
 590         // Move any of our leftover columns to our overflow list. Our
 591         // next-in-flow will eventually pick them up.
 592         nsIFrame* continuationColumns = child->GetNextSibling();
 593         if (continuationColumns) {
 594           SetOverflowFrames(PresContext(), continuationColumns);
 595           child->SetNextSibling(nsnull);
 596         }
 597         break;
 598       }
 599     }

More on the 
Links:
 Context of code snippet
  http://mxr.mozilla.org/seamonkey/source/layout/generic/nsColumnSetFrame.cpp#517
 Documentation on excess overflow containers list:
  http://mxr.mozilla.org/seamonkey/source/layout/generic/nsContainerFrame.h#204
Status: NEW → ASSIGNED
Sorry, the area around the links at the end of my last post got garbled a bit when I copied stuff into the text box.

Meant to say this:
 Context of code snippet:
    http://mxr.mozilla.org/seamonkey/source/layout/generic/nsColumnSetFrame.cpp#517
  Documentation on excess overflow containers list:
    http://mxr.mozilla.org/seamonkey/source/layout/generic/nsContainerFrame.h#204
So, at first glance, I see two potential ways to resolve this:

1. Reflow the "dummy" frame somewhere before the break at nsColumnSetFrame.cpp:597
   (But this might have unintended consequences / do a lot of unnecessary work?)

2. Call ReflowOverflowContainerChildren on the dummy frame that break statement, to *just* purge the prevInFlow's excess overflow list, but not actually do a full reflow.
   (I'm not sure if it's totally legal to do this outside of a Reflow() call on the block itself though)
In general, we should not crash just because some next-in-flow didn't get reflowed. So something like option 2 sounds better.
I've got a patch for bug 398322 that deals with a similar problem. Basically, the nsOverflowContinuationTracker constructor doesn't expect "interrupted" reflow. It seems to fix this problem, too.
Yup, I think that patch looks like just the thing!

It should exactly fix the issue mentioned in comment 6 about nsContainerFrame.cpp:1384, in nsOverflowContinuationTracker::Insert.  The patch ensures that the only way we could reach that line (and impose a new list as mParent's ExcessOverflowContainers-list) would be if mParent didn't have an ExcessOverflowContainers-list in the first place.  Because if it did have a list, then mOverflowContList would be set, and we'd skip that clause in Insert.  That fixes the bad situation in this bug where we overwrite an existing list and cause valid frames to be destroyed.
Fix checked in, nsContainerFrame.cpp rev 1.291, please verify
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
"Testcase A" on this bug still crashes for me.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 398322
Assignee: dholbert → fantasai.bugs
Status: REOPENED → NEW
Hardware: PC → All
(In reply to comment #20)
> "Testcase A" on this bug still crashes for me.

This remaining crash is happening at a different place -- within nsContainerFrame::DeleteNextInFlowChild. (see attached stacktrace)

Soon before we crash, we fail this precondition-assertion:

 1033   NS_PRECONDITION(prevInFlow, "bad prev-in-flow");

Then we crash at this line:

 1064   NS_POSTCONDITION(!prevInFlow->GetNextInFlow(), "non null next-in-flow");

because we're dereferencing prevInFlow, and it's null.
The problem's in DoRemoveFrame: it needs to call DeleteNextInFlow on the next-in-flow and destroy aDeletedFrame separately. I've got a fix for this crash. I want to study the surrounding code a bit more before I submit this, though.
Status: NEW → ASSIGNED
Attached patch patchSplinter Review
Attachment #284329 - Flags: superreview?(roc)
Attachment #284329 - Flags: review?(roc)
Attachment #284329 - Flags: superreview?(roc)
Attachment #284329 - Flags: superreview+
Attachment #284329 - Flags: review?(roc)
Attachment #284329 - Flags: review+
fantasai checked the patch in a few hours ago (2007-10-10 15:09).  Yay!
Status: ASSIGNED → RESOLVED
Closed: 17 years ago17 years ago
Resolution: --- → FIXED
verified fixed using  Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a9pre) Gecko/2007101104 Minefield/3.0a9pre. I verified using all the testcases in the bug.
Status: RESOLVED → VERIFIED
Flags: in-testsuite?
Branch does not seem to be affected (no assertions/crashes with either testcase).
Group: security
Flags: wanted1.8.1.x-
s/either testcase/all four non-obsolete testcases
Four crashtests checked in :)
Flags: in-testsuite? → in-testsuite+
Crash Signature: [@ nsIFrame::HasView] [@ nsIFrame::GetNextInFlow]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: