Closed Bug 502017 Opened 15 years ago Closed 15 years ago

Crash [@ nsHTMLReflowState::GetHypotheticalBoxContainer] with -moz-column, overflow, clear, margin, fixed pos, and float

Categories

(Core :: Layout, defect)

1.9.1 Branch
x86
All
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
blocking1.9.1 --- .2+
status1.9.1 --- .2-fixed

People

(Reporter: dholbert, Assigned: dholbert)

References

Details

(4 keywords, Whiteboard: [sg:critical?][fixed by bug 67752 on m-c][1.9.1.2+])

Crash Data

Attachments

(5 files)

I'm filing this bug for a remaining 1.9.1 crash, with a modified version of bug 467493's testcase. (just adding an empty floated div)  See in particular bug 467493 comment 25.

The attached "testcase 1" crashes in both optimized and debug 1.9.1 builds.  (This testcase is the same as "testcase 3" on bug 467493)

In my debug build, I get these assertions before the crash:
###!!! ASSERTION: Placeholder relationship should have been torn down; see comments in nsPlaceholderFrame.h: '!shell->FrameManager()->GetPlaceholderFrameFor(mOutOfFlowFrame)', file /mozilla/layout/generic/nsPlaceholderFrame.cpp, line 132
###!!! ASSERTION: no placeholder frame: 'nsnull != placeholderFrame', file /mozilla/layout/generic/nsHTMLReflowState.cpp, line 1122

I also get many copies of this warning:
WARNING: bad value: file /mozilla/layout/generic/nsFloatManager.cpp, line 149

Crash report (hasn't loaded yet, but should soon):
http://crash-stats.mozilla.com/report/pending/f7955dc3-a1ff-4e8a-a542-bdc8f2090702
testcase 1 crashes on load in this 1.9.1 build:
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1) Gecko/20090624 Firefox/3.5

and it's WFM in this mozilla-central build:
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2a1pre) Gecko/20090701
Minefield/3.6a1pre

Presumably, this used to crash on mozilla-central, even after the float-manager landing fixed bug 467493's other testcases.  A fix range would be great.
Flags: blocking1.9.1.1?
Whiteboard: [sg:critical?][needs m-c fix range]
Attached file backtrace at crash —
Here's a backtrace of the crash on testcase 1.
Attached file frametree at crash —
Here's a dump of the frametree at crash-time. (This is from the same crash occurrence as was used to generate the backtrace in attachment 386559 [details])
Attachment #386559 - Attachment description: crash backtrace → backtrace at crash
Attachment #386557 - Attachment description: testcase 1 → testcase 1 (crashes on loading in 1.9.1 & 1.9.0)
(In reply to comment #1)
> Presumably, this used to crash on mozilla-central, even after the float-manager
> landing fixed bug 467493's other testcases.  A fix range would be great.

Ok, here's the m-c fix range:
CRASHES ON:
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2a1pre) Gecko/20090505 Minefield/3.6a1pre
WFM on:
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2a1pre) Gecko/20090506 Minefield/3.6a1pre
Whiteboard: [sg:critical?][needs m-c fix range] → [sg:critical?]
Pushlog for fix range:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=2a82ba52ed30&tochange=7518078cb842

bug 67752 looks like the only likely fix-candidate in that range.
I've confirmed that testcase 1 crashes on Mac 1.9.1, too.

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.1pre) Gecko/20090630 Shiretoko/3.5.1pre
Crash report:
http://crash-stats.mozilla.com/report/index/1c2f4392-268e-4d99-a443-0a0922090702
OS: Linux → All
(In reply to comment #5)
> bug 67752 looks like the only likely fix-candidate in that range.

Confirmed that bug 67752 fixed this on mozilla-central.  Changeset 47e935bd7f3b hits the assertions & the crash from comment 0, whereas changeset a08d1947ec5a does not (though it still has lots of "WARNING: bad value" messages).

It doesn't look like bug 67752 is targeted for 1.9.1 (or 1.9.0), though, so it probably won't be able to fix this on the branches...
Depends on: ireflow
Whiteboard: [sg:critical?] → [sg:critical?][fixed by bug 67752 on m-c]
It's really weird that bug 67752 changed behavior here.  Are we actually interrupting on the testcase?  Or was it the scrolling changes or some such?

Or maybe it was the changes in nsBlockFrame::ReflowDirtyLines around pulling stuff?  We could port that sort of thing to branch, I think.
(In reply to comment #8)
> Or maybe it was the changes in nsBlockFrame::ReflowDirtyLines around pulling
> stuff?

That sounds most likely.
I think that we should try to backport those.  Do you want to give it a shot or should I?
In my build at changeset a08d1947ec5a (that's bug 67752's patch), if I revert bug 67752's changes to nsBlockFrame.* and recompile, the assertions & the crash return.  So it's definitely one of the changes to nsBlockFrame.cpp that fixed this (probably the pulling stuff, as bz suggested).

I'll try to backport that -- looks pretty straightforward.
This is just the pulling chunk from bug 67752's nsBlockFrame changes.  It applies cleanly on 1.9.1 and 1.9.0, and it fixes this crash on both branches.  It also fixes the crash on bug 467493's testcases on 1.9.0 branch.
Flags: wanted1.9.1.x+
Flags: wanted1.9.0.x+
Comment on attachment 386646 [details] [diff] [review]
Backported fix: just the "pull" chunk from bug 67752's patch [checked in to 1.9.0: c26]

So, here's a summary what's going on in this backported patch, starting with the final chunk:

>@@ -2076,24 +2079,28 @@ nsBlockFrame::ReflowDirtyLines(nsBlockRe
>       if (!bifLineIter.Next() ||                
>           !bifLineIter.GetLine()->IsDirty()) {
>-        if (IS_TRUE_OVERFLOW_CONTAINER(aState.mNextInFlow))
>-          NS_FRAME_SET_OVERFLOW_INCOMPLETE(aState.mReflowStatus);
>-        else
>-          NS_FRAME_SET_INCOMPLETE(aState.mReflowStatus);
>         skipPull=PR_TRUE;
>       }
>     }
>   }
>+
>+  if (skipPull && aState.mNextInFlow) {
>+    NS_ASSERTION(heightConstrained, "Height should be constrained here\n");
>+    if (IS_TRUE_OVERFLOW_CONTAINER(aState.mNextInFlow))
>+      NS_FRAME_SET_OVERFLOW_INCOMPLETE(aState.mReflowStatus);
>+    else
>+      NS_FRAME_SET_INCOMPLETE(aState.mReflowStatus);
>+  }

This chunk makes sure that, if we've got a next-in-flow and we're going to skip pulling from it, we'll set an INCOMPLETE bit on our reflow status.  Otherwise, our next-in-flow could end up being destroyed while we've still potentially got kids with mNextContinuation pointers into it, and that's bad news.

Previously, we were only setting the INCOMPLETE bit inside of a clause that required a whole bunch of conditions to be satisfied. (In the testcase attached here, the |aState.mReflowState.mFlags.mNextInFlowUntouched| condition was the one that failed and made us skip  setting the INCOMPLETE bit.)

>-  // We can skip trying to pull up the next line if there is no next
>-  // in flow or we were told not to or we know it will be futile, i.e.,
>+  // We can skip trying to pull up the next line if our height is constrained
>+  // (so we can report being incomplete) and there is no next in flow or we
>+  // were told not to or we know it will be futile, i.e.,
[snip]
>-  PRBool skipPull = willReflowAgain;
>-  if (aState.mNextInFlow &&
>+  PRBool heightConstrained =
>+    aState.mReflowState.availableHeight != NS_UNCONSTRAINEDSIZE;
>+  PRBool skipPull = willReflowAgain && heightConstrained;

This chunk makes sure we only set |skipPull| if our height is constrained.  Otherwise (if we have an unconstrained height), we may not be safe to return an INCOMPLETE reflow status (which would end up getting set in the above-described chunk of the patch, if |skipPull| were true).


>-  if (aState.mNextInFlow &&
>+  if (!skipPull && heightConstrained && aState.mNextInFlow &&
>       (aState.mReflowState.mFlags.mNextInFlowUntouched &&

This chunk adds two conditions to the "if" check here -- !skipPull && heightConstrained.

Now that the INCOMPLETE-setting has been moved out of this "if" clause, this clause just does one thing -- it turns on |skipPull| in certain situations.  It doesn't do anything else.  So, there's no point in entering it if |skipPull| is already turned on. (hence the first added condition)  We also don't want to enter it if our height is unconstrained, because then we definitely don't want to turn on skipPull (because we can't return an incomplete reflow status, as described above).  (hence the second added condition)
Attachment #386646 - Flags: review?(roc)
Attached patch crashtest (as patch) — — Splinter Review
Here's testcase 1 as a crashtest, for landing on all branches after this is fixed on all branches.  I've just re-confirmed that this crashes pre-patch, and doesn't crash post-patch, using an up-to-date 1.9.1 checkout.

(I generated this crashtest patch using a 1.9.1 build -- it may need minor bitrot-fixing in "crashtest.list" on 1.9.0 and on mozilla-central)
Flags: in-testsuite?
Attachment #386646 - Flags: approval1.9.1.1?
Attachment #386646 - Flags: approval1.9.0.13?
Comment on attachment 386646 [details] [diff] [review]
Backported fix: just the "pull" chunk from bug 67752's patch [checked in to 1.9.0: c26]

Requesting approval to land this backport on the 1.9.1 & 1.9.0 stable branches.

Fixes a security-sensitive crash, has baked on mozilla-central, and is a fairly minimal change. (See comment 13 for an overview of what it does.)

This also will fix bug 467493 (a related crash) on the 1.9.0 branch, as mentioned in comment 12. (That bug is WFM on 1.9.1, because the float manager works around it.)
Flags: blocking1.9.0.13?
Flags: blocking1.9.0.13? → blocking1.9.0.13+
Assignee: nobody → dholbert
Comment on attachment 386646 [details] [diff] [review]
Backported fix: just the "pull" chunk from bug 67752's patch [checked in to 1.9.0: c26]

Approved for 1.9.0.13, a=dveditz for release-drivers
Attachment #386646 - Flags: approval1.9.0.13? → approval1.9.0.13+
Attachment 386646 [details] [diff] checked in to 1.9.0 branch:
Checking in layout/generic/nsBlockFrame.cpp;
/cvsroot/mozilla/layout/generic/nsBlockFrame.cpp,v  <--  nsBlockFrame.cpp
new revision: 3.963; previous revision: 3.962
done
Keywords: fixed1.9.0.13
Attachment #386646 - Attachment description: Backported fix: just the "pull" chunk from bug 67752's patch → Backported fix: just the "pull" chunk from bug 67752's patch [checked in 1.9.0: comment 17]
Attachment #386646 - Attachment description: Backported fix: just the "pull" chunk from bug 67752's patch [checked in 1.9.0: comment 17] → Backported fix: just the "pull" chunk from bug 67752's patch [checked in to 1.9.0: c17]
Comment on attachment 386646 [details] [diff] [review]
Backported fix: just the "pull" chunk from bug 67752's patch [checked in to 1.9.0: c26]

1.9.0.13 will be in sync with 1.9.1.2, so we'll take this patch then.
Attachment #386646 - Flags: approval1.9.1.1? → approval1.9.1.2?
I'm leaving this as wanted1.9.1.x, but since we don't yet have our new blocking1.9.1 flag, I can't make it block 1.9.1.2 yet. It does though.
Flags: blocking1.9.1.1?
Whiteboard: [sg:critical?][fixed by bug 67752 on m-c] → [sg:critical?][fixed by bug 67752 on m-c][1.9.1.2+]
Backed out of 1.9.0 -- this appears to have somehow caused a test failure in test_bug343416.xul, on Mac.  (It went orange for two consecutive cycles, with a test-failure that doesn't appear to be known-sporadic.)
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.0/1247255968.1247259491.3895.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.0/1247260655.1247264163.22911.gz

Backout:
Checking in layout/generic/nsBlockFrame.cpp;
/cvsroot/mozilla/layout/generic/nsBlockFrame.cpp,v  <--  nsBlockFrame.cpp
new revision: 3.964; previous revision: 3.963
done

Log from failing test:
{
*** 669 INFO Running chrome://mochikit/content/chrome/widget/test/test_bug343416.xul...
*** 670 INFO PASS | nsIIdleService should exist and be implemented on all tier 1 platforms.
*** 671 ERROR FAIL | Getting the idle Time should not fail in normal circumstances on any tier 1 platform. |  | chrome://mochikit/content/chrome/widget/test/test_bug343416.xul
*** 672 INFO PASS | The nsIIdleService should allow us to add an observer.
*** 673 INFO PASS | The nsIIdleService should allow us to add the same observer again.
*** 674 INFO PASS | The nsIIdleService should allow us to remove the observer just once.
*** 675 ERROR FAIL | Getting the idle time should not fail in normalcircumstances on any tier 1 platform. |  | chrome://mochikit/content/chrome/widget/test/test_bug343416.xul
*** 676 ERROR FAIL | The idle time should have increased by roughly the amount of time it took for the timeout to fire. |  | chrome://mochikit/content/chrome/widget/test/test_bug343416.xul
*** 677 INFO PASS | We set a timeout and it should have fired by now.
*** 678 ERROR FAIL | We added a listener and it should have been called by now. |  | chrome://mochikit/content/chrome/widget/test/test_bug343416.xul
*** 679 INFO PASS | We added a listener and we should be able to remove it.
}
Keywords: fixed1.9.0.13
I seriously doubt this patch should have affected that test...
Me too, but with two consecutive orange cycles, best diagnostic is to back out.

Having said that, the failing test *stayed orange* in the post-backout cycle:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.0/1247266920.1247270435.27327.gz

So this checkin couldn't be responsible.  I just pinged #build to see if someone can reboot (or otherwise fix) the still-orange tinderbox.
Rebooted bm-xserve21 to see if that resolves the Fx3.0 unit issue.
The reboot seems to have fixed the issue -- the cycle before the reboot was orange with the nsIIdleService failure, and the four cycles after the reboot have all been green.
Re-landed patch on 1.9.0.x (CVS Trunk):
Checking in layout/generic/nsBlockFrame.cpp;
/cvsroot/mozilla/layout/generic/nsBlockFrame.cpp,v  <--  nsBlockFrame.cpp
new revision: 3.965; previous revision: 3.964
done
Keywords: fixed1.9.0.13
Attachment #386646 - Attachment description: Backported fix: just the "pull" chunk from bug 67752's patch [checked in to 1.9.0: c17] → Backported fix: just the "pull" chunk from bug 67752's patch [checked in to 1.9.0: c26]
blocking1.9.1: --- → .2+
Attachment #386646 - Flags: approval1.9.1.2? → approval1.9.1.2+
Comment on attachment 386646 [details] [diff] [review]
Backported fix: just the "pull" chunk from bug 67752's patch [checked in to 1.9.0: c26]

a=beltzner, please land on mozilla-1.9.1
dholbert: Can you please land this on 1.9.1 asap?
Was just about to.  Landed.
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/63785ea38770

status1.9.1 --> .2fixed
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Verified fixed in 1.9.0 using my debug build (Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.13pre) Gecko/2009072711 GranParadiso/3.0.13pre). Verified crash and assertion in comment 0 from build in early July. Both are fixed now.
Verified using test case in comment #0. 3.5 crashed on load. 3.5.2 did not. Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2

(3.5 and 3.5.2 behaved the same on XP, no crash).
Keywords: verified1.9.1
Group: core-security
Crash Signature: [@ nsHTMLReflowState::GetHypotheticalBoxContainer]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: