Last Comment Bug 588237 - Crash [@ nsLayoutUtils::GetFloatFromPlaceholder] with -moz-column, float
: Crash [@ nsLayoutUtils::GetFloatFromPlaceholder] with -moz-column, float
Status: RESOLVED FIXED
[fuzzblocker:layout-crashes]
: assertion, crash, regression, testcase
Product: Core
Classification: Components
Component: Layout (show other bugs)
: Trunk
: All All
-- critical with 1 vote (vote)
: mozilla21
Assigned To: David Baron :dbaron: ⌚️UTC-8
:
: Jet Villegas (:jet)
Mentors:
Depends on:
Blocks: randomclasses 563584 621424
  Show dependency treegraph
 
Reported: 2010-08-17 17:17 PDT by Jesse Ruderman
Modified: 2013-01-12 12:47 PST (History)
22 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-


Attachments
testcase (crashes Firefox when loaded) (237 bytes, text/html)
2010-08-17 17:17 PDT, Jesse Ruderman
no flags Details
stack traces (20.64 KB, text/plain)
2010-08-17 17:20 PDT, Jesse Ruderman
no flags Details
Stuff I'm ignoring due to this and related bugs (2.72 KB, text/plain)
2012-11-26 12:31 PST, Jesse Ruderman
no flags Details
, patch 1: Store the original parent of first-in-flow pushed floats. (4.47 KB, patch)
2013-01-09 15:38 PST, David Baron :dbaron: ⌚️UTC-8
no flags Details | Diff | Splinter Review
, patch 2: Fix comment that is now incorrect due to work in bug 563584 or followups. (1.83 KB, patch)
2013-01-09 15:38 PST, David Baron :dbaron: ⌚️UTC-8
roc: review+
Details | Diff | Splinter Review
, patch 3: Remove the pushed floats list when it is empty. (1.06 KB, patch)
2013-01-09 15:38 PST, David Baron :dbaron: ⌚️UTC-8
roc: review+
Details | Diff | Splinter Review
, patch 4: Pull pushed floats back from the next-in-flow at the start of reflow. (3.04 KB, patch)
2013-01-09 15:38 PST, David Baron :dbaron: ⌚️UTC-8
no flags Details | Diff | Splinter Review
, patch 5: remove the optimization of skipping FlowAndPlaceFloat for some pushed floats, simply because the optimization looks fishy to me. (1.72 KB, patch)
2013-01-09 15:38 PST, David Baron :dbaron: ⌚️UTC-8
roc: review+
Details | Diff | Splinter Review
patch 4: Pull pushed floats back from the next-in-flow at the start of reflow. (4.26 KB, patch)
2013-01-10 17:22 PST, David Baron :dbaron: ⌚️UTC-8
roc: review+
Details | Diff | Splinter Review

Description User image Jesse Ruderman 2010-08-17 17:17:50 PDT
Created attachment 466860 [details]
testcase (crashes Firefox when loaded)

###!!! ASSERTION: Placeholder relationship should have been torn down already; this might mean we have a stray placeholder in the tree.: '!placeholder || nsLayoutUtils::IsProperAncestorFrame(aDestructRoot, placeholder)', file layout/generic/nsFrame.cpp, line 442

###!!! ASSERTION: Null out-of-flow for placeholder?: 'outOfFlow', file nsPlaceholderFrame.h, line 200

Null deref crash [@ nsLayoutUtils::GetFloatFromPlaceholder]
Comment 1 User image Jesse Ruderman 2010-08-17 17:20:41 PDT
Created attachment 466861 [details]
stack traces
Comment 2 User image Jesse Ruderman 2010-08-19 12:01:58 PDT
dbaron is on a -moz-column crash-killing rampage.
Comment 3 User image David Baron :dbaron: ⌚️UTC-8 2010-08-20 04:44:52 PDT
I was really just looking at regressions from bug 563584; I think this is probably another one, actually.

The problem (at the point of the second assertion and the crash) is that
we've destroyed a float without destroying its placeholder.

This happens, as pointed out by the first assertion, because we call
DeleteNextInFlowChild on a block that still has a float list.  Its float
list contains pushed floats from a number of continuations back.

The previous block (the third continuation) reported FULLY_COMPLETE
reflow status, despite that there's still an additional pushed float in
its next continuation (the fourth continuation).

Calling nsFrame::DumpFrameTree in nsContainerFrame::ReflowChild right
before the call to DeleteNextInFlowChild was somewhat enlightening.
Comment 4 User image David Baron :dbaron: ⌚️UTC-8 2010-08-28 11:32:04 PDT
Hmmm.  I bet I know what's happening.  If we reflow the second outer columnset twice, the second time we reflow it, we won't have reflowed the placeholders for the floats inside it again, and thus we won't know that we again have to push them.

I'm not sure what to do about this.
Comment 5 User image David Baron :dbaron: ⌚️UTC-8 2010-09-28 13:04:01 PDT
... and I needed to debug this again to remind myself why this isn't happening:  it's because the placeholders are actually in the previous continuation, and the floats for those placeholders are in the next continuation.  So when we reflow the frame twice, it can report being complete since the first time it was reflowed, it pushed the placeholders to its next continuation, but it doesn't gather them again when it's reflowed a second time.
Comment 6 User image David Baron :dbaron: ⌚️UTC-8 2010-10-02 09:14:45 PDT
Maybe, to fix it, we can pull all the pushed floats from the start of the next-in-flow's float list?  But when we did that we'd want to immediately try reflowing those that were pushed from a previous block, but not those that were pushed from this block.
Comment 7 User image David Baron :dbaron: ⌚️UTC-8 2010-10-20 17:24:53 PDT
I think we should just store the original parent on floats that get pushed, in a frame property.  Then at the start of reflow we can pull back any floats at the start of the next-in-flow's float list that were pushed, fix up the pushed float bit on them, and reflow any that were pushed from a prior continuation.
Comment 8 User image David Baron :dbaron: ⌚️UTC-8 2010-12-10 14:28:27 PST
Actually, I think that's still not right.

I think I've decided I'm not going to worry about making sure the layout here is correct; instead I'll just make it not crash by ensuring that we don't report the frame as complete in these cases.
Comment 9 User image David Baron :dbaron: ⌚️UTC-8 2010-12-10 14:29:52 PST
(It's still not right because I still don't know how to determine when to reflow those pushed floats relative to continuations of floats.)
Comment 10 User image David Baron :dbaron: ⌚️UTC-8 2010-12-10 14:46:23 PST
Actually, maybe I was thinking about it wrong -- I think I was thinking about floats pushed from |this|, when I really only need to think about floats pushed from this's previous continuations, and they should always get reflowed at the end of the pushed floats.

Also, I think the floats in question would still be in *our* pushed floats list, which might make this even easier.
Comment 11 User image Brandon Sterne (:bsterne) 2011-10-17 16:33:40 PDT
Boy, from reading the comments, this bug sounds pretty complicated.  I'm told it blocks a good number of our fuzzing efforts from progressing.  David, it seems like you've put a decent amount of thought into this bug.  Any chance I can persuade you to take another look at it and maybe attempt a patch?
Comment 12 User image David Baron :dbaron: ⌚️UTC-8 2011-10-17 19:48:15 PDT
The patches:

pushed-float-store-original-parent
fix-float-comment
remove-empty-pushed-floats
pull-pushed-floats-back

in my patch queue at https://hg.mozilla.org/users/dbaron_mozilla.com/patches/ are work-in-progress for this bug; I don't remember right now what's involved in finishing that work.
Comment 13 User image Jesse Ruderman 2012-11-26 12:28:25 PST
This bug forces my DOM fuzzer to ignore crashes in large swaths of layout code, along with many key layout assertions.  I'm pretty sure I've missed security holes due to this bug being unfixed, even if this bug is somehow not a security hole on its own.

If we can't get our CSS column code under control quickly, we should disable it.
Comment 14 User image Jesse Ruderman 2012-11-26 12:31:25 PST
Created attachment 685290 [details]
Stuff I'm ignoring due to this and related bugs
Comment 15 User image David Bolter [:davidb] 2013-01-08 12:22:39 PST
(In reply to David Baron [:dbaron] from comment #12)
> The patches:
> 
> pushed-float-store-original-parent
> fix-float-comment
> remove-empty-pushed-floats
> pull-pushed-floats-back
> 
> in my patch queue at
> https://hg.mozilla.org/users/dbaron_mozilla.com/patches/ are
> work-in-progress for this bug; I don't remember right now what's involved in
> finishing that work.

Do we know who is picking this up?
Comment 16 User image David Baron :dbaron: ⌚️UTC-8 2013-01-09 15:35:47 PST
So the testcase in this bug no longer crashes for me.  Jesse, do you have a testcase that does, that you think is this bug?

In any case, I looked through my patches, tried to find things wrong with them by inspection, and fixed a few things.  In particular, I just wrote the insertionPrevSibling bit of patch 4 (which is the main patch) and all of patch 5.  I'll post them for review, since I don't have any better ideas.
Comment 17 User image David Baron :dbaron: ⌚️UTC-8 2013-01-09 15:38:15 PST
Created attachment 700105 [details] [diff] [review]
, patch 1:  Store the original parent of first-in-flow pushed floats.
Comment 18 User image David Baron :dbaron: ⌚️UTC-8 2013-01-09 15:38:26 PST
Created attachment 700106 [details] [diff] [review]
, patch 2:  Fix comment that is now incorrect due to work in bug 563584 or followups.
Comment 19 User image David Baron :dbaron: ⌚️UTC-8 2013-01-09 15:38:37 PST
Created attachment 700107 [details] [diff] [review]
, patch 3:  Remove the pushed floats list when it is empty.
Comment 20 User image David Baron :dbaron: ⌚️UTC-8 2013-01-09 15:38:48 PST
Created attachment 700108 [details] [diff] [review]
, patch 4:  Pull pushed floats back from the next-in-flow at the start of reflow.
Comment 21 User image David Baron :dbaron: ⌚️UTC-8 2013-01-09 15:38:58 PST
Created attachment 700109 [details] [diff] [review]
, patch 5:  remove the optimization of skipping FlowAndPlaceFloat for some pushed floats, simply because the optimization looks fishy to me.
Comment 22 User image Robert O'Callahan (:roc) (email my personal email if necessary) 2013-01-10 16:21:22 PST
Comment on attachment 700105 [details] [diff] [review]
, patch 1:  Store the original parent of first-in-flow pushed floats.

Review of attachment 700105 [details] [diff] [review]:
-----------------------------------------------------------------

It's not clear to me why you can't just find this value via the float's placeholder.

::: layout/generic/nsBlockFrame.h
@@ +131,5 @@
>  
>    friend nsIFrame* NS_NewBlockFrame(nsIPresShell* aPresShell, nsStyleContext* aContext, uint32_t aFlags);
>  
> +  // This is a pointer (weak) from a pushed float to its original parent.
> +  NS_DECLARE_FRAME_PROPERTY(PushedFloatOriginalParent, nullptr);

Extend this comment to specify the lifetime of the property.
Comment 23 User image Daniel Holbert [:dholbert] 2013-01-10 16:23:15 PST
> ::: layout/generic/nsBlockFrame.h
> @@ +131,5 @@
> >  
> >    friend nsIFrame* NS_NewBlockFrame(nsIPresShell* aPresShell, nsStyleContext* aContext, uint32_t aFlags);
> >  
> > +  // This is a pointer (weak) from a pushed float to its original parent.
> > +  NS_DECLARE_FRAME_PROPERTY(PushedFloatOriginalParent, nullptr);
> 
> Extend this comment to specify the lifetime of the property.

Also:  NS_DECLARE_FRAME_PROPERTY() doesn't need a semicolon at the end.
Comment 24 User image Robert O'Callahan (:roc) (email my personal email if necessary) 2013-01-10 16:25:48 PST
Comment on attachment 700108 [details] [diff] [review]
, patch 4:  Pull pushed floats back from the next-in-flow at the start of reflow.

Review of attachment 700108 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/generic/nsBlockFrame.cpp
@@ +4527,5 @@
> +  nsFrameList *ourPushedFloats = GetPushedFloats();
> +  if (ourPushedFloats) {
> +    // When we pull back floats, we want to put them with the pushed
> +    // floats, which must live at the start of our float list, but we
> +    // want them at the end of those pushed floats.

Is this true? Couldn't some of them belong at the end of our floats list? E.g. a float that just failed to fit at the end of this block?
Comment 25 User image David Baron :dbaron: ⌚️UTC-8 2013-01-10 16:40:28 PST
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #22)
> It's not clear to me why you can't just find this value via the float's
> placeholder.

I guess I could, but it might require walking up through some number of inlines.  Maybe that's still better, though?
Comment 26 User image David Baron :dbaron: ⌚️UTC-8 2013-01-10 16:48:17 PST
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #24)
> Comment on attachment 700108 [details] [diff] [review]
> , patch 4:  Pull pushed floats back from the next-in-flow at the start of
> reflow.
> > +  nsFrameList *ourPushedFloats = GetPushedFloats();
> > +  if (ourPushedFloats) {
> > +    // When we pull back floats, we want to put them with the pushed
> > +    // floats, which must live at the start of our float list, but we
> > +    // want them at the end of those pushed floats.
> 
> Is this true? Couldn't some of them belong at the end of our floats list?
> E.g. a float that just failed to fit at the end of this block?

I believe the code maintains the invariant that pushed floats (which were pushed from a previous block) go earlier in the floats list than any floats whose placeholders are in this block.  This invariant should be maintained by (in the order they happen during reflow):
 * nsBlockFrame::DrainPushedFloats
 * nsBlockFrame::ReflowPushedFloats
 * (nsBlockReflowState/nsLineLayout)::AddFloat, which is what's called when we reflow a float through a placeholder, and which always appends to the end of the float list if the float isn't in the current block's float list

I'm not particularly confident that the ordering *within* the pushed floats is maintained correctly, but I think any errors there should be layout bugs rather than dangling pointers.
Comment 27 User image Robert O'Callahan (:roc) (email my personal email if necessary) 2013-01-10 17:04:31 PST
(In reply to David Baron [:dbaron] from comment #25)
> I guess I could, but it might require walking up through some number of
> inlines.  Maybe that's still better, though?

I think it would be, yes.

There have been a number of bugs where we exit reflow with frames still in some overflow list. Normally these bugs are relatively innocuous, but with this patch they would become more dangerous because we could delete the block and leave a dangling pointer, plus PushFloatPastBreak would miss at least one update to the property.

It just seems a lot safer to not cache frame pointers where they could live long, until we have a good performance reason to do so.

(In reply to David Baron [:dbaron] from comment #26)
> I believe the code maintains the invariant that pushed floats (which were
> pushed from a previous block) go earlier in the floats list than any floats
> whose placeholders are in this block.

I agree, but if GetPushedFloats contains a float whose placeholder is the last frame in this block, we need to put that float at the end of our floats list, do we not?
Comment 28 User image David Baron :dbaron: ⌚️UTC-8 2013-01-10 17:09:28 PST
Comment on attachment 700105 [details] [diff] [review]
, patch 1:  Store the original parent of first-in-flow pushed floats.

The new version of patch 4 no longer depends on patch 1, per roc's suggestion above.

(I'm getting gradually less confident that the patches fix things, though, since I no longer have a testcase that shows the bug.)
Comment 29 User image David Baron :dbaron: ⌚️UTC-8 2013-01-10 17:11:31 PST
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #27)
> I agree, but if GetPushedFloats contains a float whose placeholder is the
> last frame in this block, we need to put that float at the end of our floats
> list, do we not?

The somewhat tricky thing about this code is that it's only pulling back the floats that are pushed from prior blocks, and *not* the floats pushed from this block.

(There'd be a new patch 4 up by now if the bugzilla API weren't timing out...)
Comment 30 User image David Baron :dbaron: ⌚️UTC-8 2013-01-10 17:22:03 PST
Created attachment 700784 [details] [diff] [review]
patch 4: Pull pushed floats back from the next-in-flow at the start of reflow.
Comment 31 User image Robert O'Callahan (:roc) (email my personal email if necessary) 2013-01-10 17:31:34 PST
Comment on attachment 700784 [details] [diff] [review]
patch 4: Pull pushed floats back from the next-in-flow at the start of reflow.

Review of attachment 700784 [details] [diff] [review]:
-----------------------------------------------------------------

You should probably add that information to the comment in this patch.
Comment 32 User image David Baron :dbaron: ⌚️UTC-8 2013-01-10 18:02:26 PST
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #31)
> Comment on attachment 700784 [details] [diff] [review]
> patch 4: Pull pushed floats back from the next-in-flow at the start of
> reflow.
> 
> Review of attachment 700784 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> You should probably add that information to the comment in this patch.

I'm assuming "that information" is comment 26.  (comment 29 is already in the comment in the patch. :-)

I need to figure out a good place to put it... I'm thinking a comment above DrainPushedFloats's implementation, with pointers to that comment from elsewhere.
Comment 33 User image Robert O'Callahan (:roc) (email my personal email if necessary) 2013-01-10 18:44:30 PST
(In reply to David Baron [:dbaron] from comment #32)
> I'm assuming "that information" is comment 26.  (comment 29 is already in
> the comment in the patch. :-)

I don't think it's clear then. At least, I don't think it's clear from this:
+  // If we're getting reflowed multiple times without our
+  // next-continuation being reflowed, we might need to pull back floats
+  // that we just put in the list to be pushed to our next-in-flow.
+  // But we can't pull any next-in-flows of floats on our own float list.
Comment 34 User image David Baron :dbaron: ⌚️UTC-8 2013-01-10 21:19:51 PST
Just landed:
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/33ff2a97b021
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/0542ba6920b3
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/5c29097e7bbd
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/9e1f9e488479
though I managed not to see comment 33 until writing this.

Still, I'm not sure what would make it clearer than what's already there:

+        nsIFrame *floatOriginalParent = presContext->PresShell()->
+          FrameConstructor()->GetFloatContainingBlock(placeholder);
+        if (floatOriginalParent != this) {
+          // This is a first continuation that was pushed from one of our
+          // previous continuations.  Take it out of the pushed floats
Comment 35 User image Robert O'Callahan (:roc) (email my personal email if necessary) 2013-01-10 21:26:28 PST
Yes, it's clear in the code, but the comment I cited makes it sound like you're going to pull back all floats.
Comment 36 User image David Baron :dbaron: ⌚️UTC-8 2013-01-10 22:21:39 PST
Backed out for crashtest orange:
https://hg.mozilla.org/integration/mozilla-inbound/rev/fcbaced55036
Comment 37 User image Jesse Ruderman 2013-01-11 02:51:20 PST
(In reply to David Baron [:dbaron] from comment #16)
> So the testcase in this bug no longer crashes for me.  Jesse, do you have a
> testcase that does, that you think is this bug?

I don't have another testcase with this crash signature handy, but the testcase in bug 621424 seems to be fixed by this patch.
Comment 38 User image David Baron :dbaron: ⌚️UTC-8 2013-01-11 10:16:40 PST
A valgrind run of the crashtest directory that was failing (layout/generic/crashtests/) shows that the underlying problem causing the orange seems to be in 570160.html (which was not obvious from the log or from running individual tests or from running the directory without valgrind).  More later, once I look into it.
Comment 39 User image David Baron :dbaron: ⌚️UTC-8 2013-01-11 10:19:36 PST
(It probably would have helped if I'd noticed the class="reftest-print" yesterday evening.)
Comment 40 User image David Baron :dbaron: ⌚️UTC-8 2013-01-11 12:48:06 PST
Amusingly enough, the crash was due to the tiny patch 3, not to patch 4.  In particular, nsBlockFrame::SplitFloat calls StealFrame, which steals a next-in-flow from this's pushed floats list, and then deletes that list.  But nsBlockReflowState has a cached pointer to that list which is used in the aState.AppendPushedFloat() call at the end of SplitFloat.

I'm inclined to just drop patch 3; it was just cleanup of something I thought I could clean up, but actually can't.
Comment 41 User image David Baron :dbaron: ⌚️UTC-8 2013-01-11 12:55:40 PST
(In reply to David Baron [:dbaron] from comment #40)
> I'm inclined to just drop patch 3; it was just cleanup of something I
> thought I could clean up, but actually can't.

And, really, it will get deleted soon enough when the next-in-flow calls its DrainPushedFloats method at the start of reflow.

(Though what if the block ends up being complete and doesn't have a next-in-flow?  Do we incorrectly mark the block incomplete and create a next-in-flow we shouldn't, or do we leave the extra empty frame list lying around until we destroy the block?)
Comment 43 User image David Baron :dbaron: ⌚️UTC-8 2013-01-11 14:53:16 PST
(And while addressing roc's issue with the comments, I also added a few more FIXMEs that I noticed to the code.)

Note You need to log in before you can comment on or make changes to this bug.