Last Comment Bug 465651 - Crash [@ nsFrameList::InsertFrame] with -moz-column and lots of floats
: Crash [@ nsFrameList::InsertFrame] with -moz-column and lots of floats
Status: RESOLVED FIXED
[sg:critical?][worked around by 19144...
: assertion, crash, testcase, verified1.9.0.14
Product: Core
Classification: Components
Component: Layout (show other bugs)
: Trunk
: x86 All
: -- critical (vote)
: ---
Assigned To: Daniel Holbert [:dholbert]
:
Mentors:
Depends on: 191448 503961
Blocks: randomstyles
  Show dependency treegraph
 
Reported: 2008-11-18 19:13 PST by Jesse Ruderman
Modified: 2011-06-13 10:01 PDT (History)
12 users (show)
dveditz: blocking1.9.0.14+
dveditz: wanted1.9.0.x+
jruderman: in‑testsuite+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-
wontfix


Attachments
testcase (969 bytes, text/html)
2008-11-20 14:09 PST, Jesse Ruderman
no flags Details
attempt at band-aid fix (1.17 KB, patch)
2009-06-13 19:54 PDT, Daniel Holbert [:dholbert]
no flags Details | Diff | Splinter Review
patch for 1.9.0.x [checked in: c36] (2.33 KB, patch)
2009-06-23 16:39 PDT, Daniel Holbert [:dholbert]
dveditz: approval1.9.0.14+
Details | Diff | Splinter Review
series of frametrees for broken (unpatched) code vs working (band-aid#2) code (23.11 KB, application/zip)
2009-06-24 17:31 PDT, Daniel Holbert [:dholbert]
no flags Details
patch for mozilla-central & 1.9.1 (2.33 KB, patch)
2009-06-24 17:41 PDT, Daniel Holbert [:dholbert]
roc: review+
roc: superreview+
Details | Diff | Splinter Review

Description Jesse Ruderman 2008-11-18 19:13:25 PST
###!!! ASSERTION: prev sibling has different parent: 'aNewFrame->GetParent() == aPrevSibling->GetParent()', file /Users/jruderman/central/layout/generic/nsFrameList.cpp, line 186
/Users/jruderman/central/obj-ff/dist/MinefieldDebug.app/Contents/MacOS/run-mozilla.sh: line 131:  1938 Segmentation fault      "$prog" ${1+"$@"}

Crash [@ nsFrameList::InsertFrame]
Comment 1 Mats Palmgren (vacation) 2008-11-20 13:01:43 PST
Jesse, you forgot to attach the testcase?
Comment 2 Jesse Ruderman 2008-11-20 14:09:52 PST
Created attachment 349280 [details]
testcase
Comment 3 Bernd 2009-01-29 10:43:15 PST
this asserts now at 

###!!! ASSERTION: We placed a float where there was no room!: 'psd->mX - mTrimmableWidth <= psd->mRightEdge', file d:/moz_src/src/layout/generic/nsLineLayout.cpp, line 345

The assert was first added as 
NS_ASSERTION(psd->mX <= psd->mRightEdge,"We placed a float where there was no room!"); 

with the patch from bug 50630 (patches without a bug# in the comment are fun, but you can run but you can not hide ;-) ) 

The assert was later in bug 441259 changed to 
NS_ASSERTION(psd->mX - mTrimmableWidth <= psd->mRightEdge, "We placed a float where there was no room!");

psd->mX         is 60 
psd->mRightEdge is 0
mTrimmableWidth is 0

I have 60 twips
Comment 4 Bernd 2009-01-29 10:49:54 PST
the assert is covered by bug 439204
Comment 5 Bernd 2009-01-29 10:53:34 PST
the testcase is wfm with respect to the crash
Comment 6 Jesse Ruderman 2009-01-31 17:14:23 PST
Same here.  I'll check this testcase in as a crashtest and mark it as asserting 96 times.
Comment 7 Bob Clary [:bc:] 2009-04-28 16:17:44 PDT
1.9.0 !exploitable report:
UNKNOWN: Data from Faulting Address may be used as a return value starting at gklayout!nsIFrame::GetParent
Comment 8 Daniel Veditz [:dveditz] 2009-05-04 19:37:49 PDT
On 1.9.0 this testcase still gets the assertion from comment 0 and crashes in debug builds on windows and mac dereferencing a deleted address. Didn't see a crash in an equivalent nightly (release) build.

Since this is now a public crash test and nicely flagged as a potential problem by !exploitable we now need to fix it in 1.9.0.x sooner than later.

qawanted: a "fix window" of when this went away on the trunk so we can identify a potential fix.

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x00000000ddddddf9
Crashed Thread:  0

Thread 0 Crashed:
   nsIFrame::GetParent() const + 9 (nsIFrame.h:685)
   nsFrameList::InsertFrame(nsIFrame*, nsIFrame*, nsIFrame*) + 277 (nsFrameList.cpp:186)
   nsOverflowContinuationTracker::Insert(nsIFrame*, unsigned int&) + 1002 (nsContainerFrame.cpp:1467)
   nsBlockFrame::ReflowBlockFrame(nsBlockReflowState&, nsLineList_iterator, int*) + 4955 (nsBlockFrame.cpp:3140)
   nsBlockFrame::ReflowLine(nsBlockReflowState&, nsLineList_iterator, int*) + 292 (nsBlockFrame.cpp:2253)
   nsBlockFrame::ReflowDirtyLines(nsBlockReflowState&) + 1988 (nsBlockFrame.cpp:1888)
   nsBlockFrame::Reflow(nsPresContext*, nsHTMLReflowMetrics&, nsHTMLReflowState const&, unsigned int&) + 802 (nsBlockFrame.cpp:955)
   nsContainerFrame::ReflowChild(nsIFrame*, nsPresContext*, nsHTMLReflowMetrics&, nsHTMLReflowState const&, int, int, unsigned int, unsigned int&, nsOverflowContinuationTracker*) + 368 (nsContainerFrame.cpp:775)
   nsColumnSetFrame::ReflowChildren(nsHTMLReflowMetrics&, nsHTMLReflowState const&, unsigned int&, nsColumnSetFrame::ReflowConfig const&, int, nsCollapsingMargin*, nsColumnSetFrame::ColumnBalanceData&) + 1321 (nsColumnSetFrame.cpp:571)
   nsColumnSetFrame::Reflow(nsPresContext*, nsHTMLReflowMetrics&, nsHTMLReflowState const&, unsigned int&) + 1018 (nsColumnSetFrame.cpp:893)
   nsBlockReflowContext::ReflowBlock(nsRect const&, int, nsCollapsingMargin&, int, int, nsLineBox*, nsHTMLReflowState&, unsigned int&, nsBlockReflowState&) + 477 (nsBlockReflowContext.cpp:311)
   nsBlockFrame::ReflowBlockFrame(nsBlockReflowState&, nsLineList_iterator, int*) + 2131 (nsBlockFrame.cpp:2981)
   nsBlockFrame::ReflowLine(nsBlockReflowState&, nsLineList_iterator, int*) + 292 (nsBlockFrame.cpp:2253)
   nsBlockFrame::ReflowDirtyLines(nsBlockReflowState&) + 1988 (nsBlockFrame.cpp:1888)
   nsBlockFrame::Reflow(nsPresContext*, nsHTMLReflowMetrics&, nsHTMLReflowState const&, unsigned int&) + 802 (nsBlockFrame.cpp:955)
   nsBlockReflowContext::ReflowBlock(nsRect const&, int, nsCollapsingMargin&, int, int, nsLineBox*, nsHTMLReflowState&, unsigned int&, nsBlockReflowState&) + 477 (nsBlockReflowContext.cpp:311)
   nsBlockFrame::ReflowBlockFrame(nsBlockReflowState&, nsLineList_iterator, int*) + 2131 (nsBlockFrame.cpp:2981)
   nsBlockFrame::ReflowLine(nsBlockReflowState&, nsLineList_iterator, int*) + 292 (nsBlockFrame.cpp:2253)
   nsBlockFrame::ReflowDirtyLines(nsBlockReflowState&) + 1988 (nsBlockFrame.cpp:1888)
   nsBlockFrame::Reflow(nsPresContext*, nsHTMLReflowMetrics&, nsHTMLReflowState const&, unsigned int&) + 802 (nsBlockFrame.cpp:955)
   nsContainerFrame::ReflowChild(nsIFrame*, nsPresContext*, nsHTMLReflowMetrics&, nsHTMLReflowState const&, int, int, unsigned int, unsigned int&, nsOverflowContinuationTracker*) + 368 (nsContainerFrame.cpp:775)
   CanvasFrame::Reflow(nsPresContext*, nsHTMLReflowMetrics&, nsHTMLReflowState const&, unsigned int&) + 504 (nsHTMLFrame.cpp:589)
   nsContainerFrame::ReflowChild(nsIFrame*, nsPresContext*, nsHTMLReflowMetrics&, nsHTMLReflowState const&, int, int, unsigned int, unsigned int&, nsOverflowContinuationTracker*) + 368 (nsContainerFrame.cpp:775)
   nsHTMLScrollFrame::ReflowScrolledFrame(ScrollReflowState*, int, int, nsHTMLReflowMetrics*, int) + 877 (nsGfxScrollFrame.cpp:499)
   nsHTMLScrollFrame::ReflowContents(ScrollReflowState*, nsHTMLReflowMetrics const&) + 125 (nsGfxScrollFrame.cpp:593)
   nsHTMLScrollFrame::Reflow(nsPresContext*, nsHTMLReflowMetrics&, nsHTMLReflowState const&, unsigned int&) + 651 (nsGfxScrollFrame.cpp:794)
   nsContainerFrame::ReflowChild(nsIFrame*, nsPresContext*, nsHTMLReflowMetrics&, nsHTMLReflowState const&, int, int, unsigned int, unsigned int&, nsOverflowContinuationTracker*) + 368 (nsContainerFrame.cpp:775)
   ViewportFrame::Reflow(nsPresContext*, nsHTMLReflowMetrics&, nsHTMLReflowState const&, unsigned int&) + 533 (nsViewportFrame.cpp:285)
   PresShell::DoReflow(nsIFrame*) + 1006 (nsPresShell.cpp:6228)
   PresShell::ProcessReflowCommands(int) + 304 (nsPresShell.cpp:6315)
   PresShell::DoFlushPendingNotifications(mozFlushType, int) + 469 (nsPresShell.cpp:4537)
   PresShell::ReflowEvent::Run() + 203 (nsPresShell.cpp:6114)
   nsThread::ProcessNextEvent(int, int*) + 676 (nsThread.cpp:511)
   NS_ProcessPendingEvents_P(nsIThread*, unsigned int) + 146 (nsThreadUtils.cpp:180)
   nsBaseAppShell::NativeEventCallback() + 181 (nsBaseAppShell.cpp:122)
   nsAppShell::ProcessGeckoEvents(void*) + 518 (nsAppShell.mm:310)
Comment 9 Daniel Veditz [:dveditz] 2009-05-04 19:43:23 PDT
I don't crash in a 1.9.1 debug build anymore, so probably one of the "fixed1.9.1" bugs took care of this. Any likely dupes based on the stack or the testcase?
Comment 10 Samuel Sidler (old account; do not CC) 2009-05-05 13:44:05 PDT
Can we get a fix range for this bug? We probably do want to take it for code freeze in 1.9.0.11 (tomorrow) given that it's been checked in publicly. (Depending on the fix, of course.)
Comment 11 Daniel Holbert [:dholbert] 2009-05-05 17:21:24 PDT
I've done some targeted debug builds, and so far I've narrowed this to becoming fixed in mozilla-central between Jan 3rd and Jan 5th 2009.  (changeset 87d32ea05ca2 is broken, changeset f918c5e6ba7f is working.)

I'm guessing this is due to the SpaceManager --> FloatManager change (Bug 191448) on Jan 4th.  I'm doing a build before & after that landed to verify...
Comment 12 Daniel Holbert [:dholbert] 2009-05-05 17:57:00 PDT
Yeah -- changeset 9ffef99c9aa7 crashes, and changeset aa32889429db doesn't crash (but still asserts).  So, this was fixed by the Float Manager landing.

Is that something we'd consider porting to 1.9.0.x?
Comment 13 Samuel Sidler (old account; do not CC) 2009-05-06 15:16:03 PDT
Bumping to 1.9.0.12, because we need to decide if we want to take the Float Manager changes on 1.9.0. From dholbert over IRC (because Dan asked me to):

21:37 <dholbert> yeah... I think that's a bigg-ish change :-/
21:38 <dholbert> though in theory, it should have basically zero functional effects, aside from changing behavior in cases of integer overflow (according to dbaron)
21:39 <dholbert> It's a simplification of an existing structure, with the intention of keeping behavior exactly the same, and just adding a few safety checks & overflow checks
21:39 <dholbert> so, as biggish patches go, it might be on the landable side.  I dunno though

Any chance this is easily fixable another way?
Comment 14 Daniel Veditz [:dveditz] 2009-05-13 15:35:29 PDT
Sorry to make Daniel the stuckee, but here you go. We need a 1.9.0 backport of bug 191448 plus the three regression fixes from it. We've got about 4 weeks to 1.9.0.12 code-freeze so if you need to find another assignee please do so soon.
Comment 15 Daniel Veditz [:dveditz] 2009-05-13 15:36:48 PDT
Sam says to put the combined backport in bug 191448
Comment 16 David Baron :dbaron: ⌚️UTC+8 (review requests must explain patch) 2009-05-13 15:41:19 PDT
I'm not sure how great an idea that is; that bug may depend on other changes in the 1.9.1 cycle, obviously or non-obviously.
Comment 17 Daniel Holbert [:dholbert] 2009-05-14 15:07:25 PDT
(In reply to comment #15)
> Sam says to put the combined backport in bug 191448

Done -- I attached a patch-bundle in bug 191448 comment 37. (and a single combined patch in the comment after that)

I've confirmed that with that combined patch, 1.9.0.x compiles successfully & no longer crashes on this bug's testcase (in my debug build). (I confirmed that the testcase crashes pre-patching, too.)

(In reply to comment #16)
> I'm not sure how great an idea that is; that bug may depend on other changes in
> the 1.9.1 cycle, obviously or non-obviously.

From backporting, the only _obvious_ dependencies were on three code-cleanup bugs: Bug 432891, Bug 409331, and Bug 458969. (as mentioned in bug 191448 comment 37).

However, the non-obvious dependencies dbaron mentions could be scary... I'm only moderately familiar with the code in this patch, so I'd suggest trusting dbaron's intuitions from comment 16 over mine from comment 13. :)
Comment 18 Daniel Holbert [:dholbert] 2009-06-11 14:18:52 PDT
I'll look into the exact cause of the crash here, in the hopes that a single bounds-check somewhere may work as a band-aid...
Comment 19 Daniel Holbert [:dholbert] 2009-06-13 19:54:17 PDT
Created attachment 383138 [details] [diff] [review]
attempt at band-aid fix

I haven't looked at our float or continuation-model stuff for a little while, and there's a lot of that involved here, so this is a bit over my head at the moment, but basically here's what makes us crash:
 - We hit this chunk of code in nsBlockReflowContext::ReflowBlock [1]
  345       nsIFrame* kidNextInFlow = mFrame->GetNextInFlow();
  346       if (nsnull != kidNextInFlow) {
  [...]
  353         aState.mOverflowTracker.Finish(mFrame);
  354         static_cast<nsHTMLContainerFrame*>(kidNextInFlow->GetParent())
  355           ->DeleteNextInFlowChild(mPresContext, kidNextInFlow);

When we get to this point in the code on this bug's testcase, the following pointers all refer to the same nsBlockFrame:
     (a) mFrame->mNextContinuation->mNextContinuation
     (b) kidNextInFlow->mNextContinuation
     (c) aState.mOverflowTracker.mPrevOverflowCont
(Note: (a) and (b) are trivially equal, because kidNextInFlow is defined to be mFrame->mNextContinuation)

Now, inside of the "DeleteNextInFlowChild" call quoted above at line 355, we delete the block frame that's pointed to by the three pointers above.  However, we never end up clearing pointer (c), in the overflow tracker.  And so, later on, we end up making a virtual method call on that pointer, and we crash.  Specifically, this happens in this part of nsBlockFrame.cpp [2]:
  3139             aState.mOverflowTracker.Insert(nextFrame, frameReflowStatus);

That "Insert" call includes a call to nsFrameList::InsertFrame, passing mPrevOverflowCont (the evil stale pointer) as the aPrevSibling argument.  InsertFrame makes a method call on this stale pointer (the call being "aPrevSibling->GetNextSibling()"), and we crash. [3]

I'm attaching an attempted band-aid fix that works around the crash by clearing mPrevOverflowCont inside of the "Finish" method (which is called just before DeleteNextInFlowChild in the first quoted section above).  It only clears this pointer if we find that it's in the passed-in frame's next-in-flow list (which is about to be deleted).  I'm not sure this is correct, but it's based off of existing code in "Finish" that seems to achieve the same end for the mSentry pointer.  It also seems to avoid the crash, while preventing 

I'm hoping that someone who's more comfortable with the continuation model (fantasai, roc, bz/dbaron?) can verify if this looks correct or not -- there may be more or less that we need to do in the band-aided chunk, in order to correctly clean things up there.  (If this patch is correct, we may also want a version of it on mozilla-central, too, though we'd probably want a testcase that can trigger this on mozilla-central first.)

Links to code referenced above:
[1] http://mxr.mozilla.org/firefox/source/layout/generic/nsBlockReflowContext.cpp#345
[2] http://mxr.mozilla.org/firefox/source/layout/generic/nsBlockFrame.cpp#3139
[3] http://mxr.mozilla.org/firefox/source/layout/generic/nsFrameList.cpp#185
Comment 20 Daniel Holbert [:dholbert] 2009-06-13 20:08:40 PDT
By the way, I don't think dbaron's FloatManager patch directly fixed this bug in mozilla-central -- I think it just may have just made us avoid the bad execution path on this particular testcase, due to some differences in available space computations.

e.g. I initially looked at where the execution paths started to diverge in pre-patch vs post-patch code, and it looks like we have a particular call to nsBlockReflowState::GetAvailableSpace that leaves mAvailSpaceRect at width=0 in old code but at width=-120 in new code.  (And in the new code, this negative width later makes us trigger the assertion mentioned in comment 3, IIRC).  I think there are a few other places  like this where the FloatManager computes sizes/positions differently, and I'm guessing that this affects whether various floats fit in certain bands.  And this, in turn, affects which floats that we need to make continuations for, and whether we end up going down the execution path that leads to the crash.
Comment 21 Daniel Holbert [:dholbert] 2009-06-13 23:25:51 PDT
cc'ing fantasai, in case she has any thoughts on comment 19
Comment 22 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2009-06-16 15:17:08 PDT
> InsertFrame makes a method call on this stale pointer (the call being
> "aPrevSibling->GetNextSibling()"), and we crash. [3]

This is not a virtual method call, just a read from the frame object.
Comment 23 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2009-06-16 15:24:11 PDT
It's hard for me to tell if this patch is really an improvement. Maybe it's just as easy to delete a frame that's in the overflow tracker's list before mPrevOverflowCont, in which case this patch won't help.

This seems like the sort of thing that Zack's frame poisoning work should help with.
Comment 24 Daniel Holbert [:dholbert] 2009-06-23 16:39:13 PDT
Created attachment 384770 [details] [diff] [review]
patch for 1.9.0.x [checked in: c36]

FWIW, this patch (suggested by fantasai) fixes the crash, though we're not entirely sure why yet.

Intuition behind the patch:
 - The idea here is to avoid having two nsOverflowContinuationTracker objects monitoring the same frame-subtree at the same time. ReflowOverflowContainerChildren creates a temporary tracker, which only lasts for the life of that method -- but currently, this method is called just after the nsBlockReflowState (and *its* tracker) has been created.  So we have two trackers at the same time.
 - This patch moves the ReflowOverflowContainerChildren call up to before where we initialize the nsBlockReflowState, so that its temporary tracker is both created & destroyed **before** we make another tracker for the reflow state.
Comment 25 fantasai 2009-06-23 17:32:31 PDT
Just looked through nsBlockFrame::Reflow again. It uses the block reflow state (which contains the main OverflowContinuationTracker) to do a bunch of stuff after the current ReflowOverflowContainerChildren call; there's definitely a potential for interleaving calls to the two trackers, which they're not designed to handle. I think either shifting ReflowOverflowContainerChildren up to before the creation of the main tracker, or shifting it down to right before ComputeFinalSize should be the right fix here.
Comment 26 Daniel Veditz [:dveditz] 2009-06-24 15:52:07 PDT
This is hugely hopeful and it is wonderful to see such a small patch after looking at what the space manager rewrite would involve, but it's a little too uncertain/untested to cram into the end of 1.9.0.12 (already slipped the code-freeze a week). Looking forward to this in 1.9.0.13, will this be reviewed and tested by time the reopens in a couple of weeks?
Comment 27 Daniel Holbert [:dholbert] 2009-06-24 15:59:05 PDT
(In reply to comment #26)
> Looking forward to this in 1.9.0.13, will this be reviewed
> and tested by time the reopens in a couple of weeks?

Definitely. I'm doing some final analysis on what's going on here (will post that within the hour), and then I should have a patch ready for review.
Comment 28 Daniel Holbert [:dholbert] 2009-06-24 16:08:18 PDT
(In reply to comment #25)
> I think either shifting ReflowOverflowContainerChildren up
> to before the creation of the main tracker, or shifting it down to right before
> ComputeFinalSize should be the right fix here.

I think the former is what needs to happen (as in band-aid #2), but the latter (shifting ReflowOverflowContainerChildren down) doesn't work.

From testing various options, it looks like the key thing that's needed is for ReflowOverflowContainerChildren to move *above the ReflowDirtyLines call*. It can even be called immediately before ReflowDirtyLines -- that still prevents the crash.  But if it's called immediately after ReflowDirtyLines (as in existing code), we crash.

More details to come.
Comment 29 Daniel Holbert [:dholbert] 2009-06-24 17:31:23 PDT
Created attachment 385019 [details]
series of frametrees for broken (unpatched) code vs working (band-aid#2) code

I'm attaching a zipped series of frametrees I generated at various points during execution of nsBlockFrame::Reflow for the second column.  These frametrees are from the first time that Reflow is called for that column.  (I've also tweaked my build slightly to hard-code some starting values for knownInfeasibleHeight and knownFeasibleHeight -- 1979 and 2041, respectively -- to arrive at the crashy conditions in fewer reflows.)

So, when we start this reflow of the second column, its previous sibling (the first column) contains...
    (A) an "OverflowOutOfFlow-List" **inside of a child block**
    (B) an Overflow-lines list
    (C) an ExcessOverflowContainers list
    (D) an OverflowOutOfFlow list

At a high level, I think we need to drain *all* of those from the previous column before the ReflowDirtyLines call, or else we crash.

These methods achieve this draining:
 * ReflowOverflowContainers pulls (A) and (C) from the previous column (forming the second column's OverflowContainers-list)
 * DrainOverflowLines pulls (B) and (D) from the previous column (forming the second column's normal contents & Float-List).

Now, in existing code, we don't call ReflowOverflowContainers until *after* ReflowDirtyLines, and at that point, I think (A) and (C) don't fit in our second column.  So, ReflowOverflowContainers ends up needing to make an ExcessOverflowContainers list right away. (and the continuation frame inside that list is the frame that ends up getting deleted later on and killing us!!)

However, if we apply the band-aid and call ReflowOverflowContainers *before* ReflowDirtyLines, then we pull (A) and (C) into the second column *before* calling ReflowDirtyLines.  When pulled this early, they don't require an ExcessOverflowContainers list.  Later, when we call ReflowDirtyLines, we find that not everything fits, and so we create an Overflow-Lines list and an OverflowOutOfFlow-list.  (These lists aren't created in unpatched code).

At a high level, it seems healthier to get pull the overflowed content from the previous column *before* reflowing our dirty lines, because it gives us a better shot at positioning everything correctly.  So, bottom line, I think band-aid #2 is the right fix.
Comment 30 Daniel Holbert [:dholbert] 2009-06-24 17:41:49 PDT
Created attachment 385024 [details] [diff] [review]
patch for mozilla-central & 1.9.1

Here's a version of the patch for mozilla-central and 1.9.1, because I imagine we'll want this on those branches before the 1.9.0.x branch.

I'll see if I can come up with a mozilla-central-crashing testcase, too.
Comment 31 Daniel Holbert [:dholbert] 2009-06-24 17:46:17 PDT
Comment on attachment 385024 [details] [diff] [review]
patch for mozilla-central & 1.9.1

Note that the only difference between the 1.9.0.x patch and the mozilla-central/1.9.1 patch is the "space manager" vs "float manager" terminology in the contextual code and comments.
Comment 32 Daniel Holbert [:dholbert] 2009-06-25 11:18:40 PDT
(In reply to comment #30)
> I'll see if I can come up with a mozilla-central-crashing testcase, too.

I briefly looked into making a mozilla-central-crashing testcase, but I don't think it's easily doable from this bug's existing testcase.  It looks like the SpaceManager-->FloatManager patch changed our reflow behavior pretty significantly here, so that now *none* of the Overflow lists from comment 29 (labeled A/B/C/D) are present on the first column when we reflow the second column.

So, I think this may have to go without a testcase on mozilla-central, unless someone comes up with one.  (Technically, this bug already *does* have a testcase already checked in on mozilla-central -- the original testcase was landed in comment 6.  It's just that this testcase is worked around by the Float Manager as already described here.)
Comment 33 Daniel Holbert [:dholbert] 2009-06-29 00:19:34 PDT
Landed attachment 385024 [details] [diff] [review] on mozilla-central:
http://hg.mozilla.org/mozilla-central/rev/f2bb2b76e93c

I'm switching the "branch" field back to "trunk" and resolving as fixed, since we've now landed code on mozilla-central to fix this type of bug there.

I think this patch should land on 1.9.1 when possible (for Firefox 3.5.1?) and then on the 3.0.x branch.
Comment 34 Daniel Holbert [:dholbert] 2009-06-29 00:46:15 PDT
Comment on attachment 385024 [details] [diff] [review]
patch for mozilla-central & 1.9.1

Requesting approval to land this on 1.9.1 branch, when that reopens (for 1.9.1.1 / Firefox 3.5.1?, I assume).

This is a patch that's needed on 3.0.x to fix a crasher. This particular crash hasn't been demonstrated to affect 3.5 since bug 191448 landed, because that bug's patch works around it for this particular testcase (and we haven't yet been able to construct a non-worked-around testcase).  But this patch a fairly minimal change that makes more sense safety-wise than our prior behavior (see comment 24 & comment 25).  It'd be nice to have this on 3.5 in the spirit of "don't land code on old branches until it's also landed on newer branches".
Comment 35 Daniel Veditz [:dveditz] 2009-07-10 11:29:37 PDT
Comment on attachment 384770 [details] [diff] [review]
patch for 1.9.0.x [checked in: c36]

Approved for 1.9.0.13, a=dveditz for release-drivers
Comment 36 Daniel Holbert [:dholbert] 2009-07-10 12:01:03 PDT
attachment 384770 [details] [diff] [review] landed on CVS trunk:
Checking in layout/generic/nsBlockFrame.cpp;
/cvsroot/mozilla/layout/generic/nsBlockFrame.cpp,v  <--  nsBlockFrame.cpp
new revision: 3.962; previous revision: 3.961
done

We still need to fix this on 1.9.1. (but I imagine that may not happen for a dot-release or two, because I think I heard that 1.9.1.1 is pretty crowded with requests)

The crashtest also needs landing on both 1.9.0 & 1.9.1.  I'm not doing that until it's fixed on both branches -- I know the crashtest is already public from being checked into m-c, but it's probably best not to make it "more" public.
Comment 37 Al Billings [:abillings] 2009-07-27 13:45:19 PDT
Verified this is fixed in 1.9.0 with my debug build (Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.13pre) Gecko/2009072711 GranParadiso/3.0.13pre). Debug build from before checkin still gets assertion in comment 0 and crash but both are gone in current build.
Comment 38 Daniel Veditz [:dveditz] 2009-08-07 11:28:04 PDT
Comment on attachment 385024 [details] [diff] [review]
patch for mozilla-central & 1.9.1

Approved for 1.9.1.3, a=dveditz for release-drivers
Comment 39 Samuel Sidler (old account; do not CC) 2009-08-12 18:09:01 PDT
dholbert: Are you going to be able to land this on 1.9.1 soon?
Comment 40 Samuel Sidler (old account; do not CC) 2009-08-12 18:25:16 PDT
Comment on attachment 385024 [details] [diff] [review]
patch for mozilla-central & 1.9.1

Talked with dholbert. He's still investigating the regression from this on m-c. I'll come back around for 1.9.1.4 and approve this.
Comment 41 Daniel Holbert [:dholbert] 2009-08-12 18:29:31 PDT
Yup -- the regression is bug 503961, two new assertion failures that Jesse found.

There shouldn't be any pressing need to land this bug's patch quickly on 1.9.1, since this bug is worked around by the space manager anyway, so postponing to 1.9.1.4 should be fine.
Comment 42 Daniel Veditz [:dveditz] 2009-08-24 16:22:42 PDT
Comment on attachment 385024 [details] [diff] [review]
patch for mozilla-central & 1.9.1

If we land this on the 1.9.1 branch will we cause the regression bug 503961 ?
Comment 43 Daniel Holbert [:dholbert] 2009-08-24 16:46:54 PDT
Yes.
Comment 44 Daniel Veditz [:dveditz] 2009-08-28 11:45:33 PDT
Since bug 191448 is an effective mitigation in 1.9.1 and later this doesn't "block" the release, but it would still be nice to correct the underlying problem.

Note You need to log in before you can comment on or make changes to this bug.