Closed Bug 588237 Opened 14 years ago Closed 11 years ago

Crash [@ nsLayoutUtils::GetFloatFromPlaceholder] with -moz-column, float

Categories

(Core :: Layout, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla21
Tracking Status
blocking2.0 --- -

People

(Reporter: jruderman, Assigned: dbaron)

References

Details

(4 keywords, Whiteboard: [fuzzblocker:layout-crashes])

Crash Data

Attachments

(7 files, 2 obsolete files)

237 bytes, text/html
Details
20.64 KB, text/plain
Details
2.72 KB, text/plain
Details
1.83 KB, patch
roc
: review+
Details | Diff | Splinter Review
1.06 KB, patch
roc
: review+
Details | Diff | Splinter Review
1.72 KB, patch
roc
: review+
Details | Diff | Splinter Review
4.26 KB, patch
roc
: review+
Details | Diff | Splinter Review
###!!! ASSERTION: Placeholder relationship should have been torn down already; this might mean we have a stray placeholder in the tree.: '!placeholder || nsLayoutUtils::IsProperAncestorFrame(aDestructRoot, placeholder)', file layout/generic/nsFrame.cpp, line 442

###!!! ASSERTION: Null out-of-flow for placeholder?: 'outOfFlow', file nsPlaceholderFrame.h, line 200

Null deref crash [@ nsLayoutUtils::GetFloatFromPlaceholder]
Attached file stack traces
dbaron is on a -moz-column crash-killing rampage.
I was really just looking at regressions from bug 563584; I think this is probably another one, actually.

The problem (at the point of the second assertion and the crash) is that
we've destroyed a float without destroying its placeholder.

This happens, as pointed out by the first assertion, because we call
DeleteNextInFlowChild on a block that still has a float list.  Its float
list contains pushed floats from a number of continuations back.

The previous block (the third continuation) reported FULLY_COMPLETE
reflow status, despite that there's still an additional pushed float in
its next continuation (the fourth continuation).

Calling nsFrame::DumpFrameTree in nsContainerFrame::ReflowChild right
before the call to DeleteNextInFlowChild was somewhat enlightening.
Blocks: 563584
Hmmm.  I bet I know what's happening.  If we reflow the second outer columnset twice, the second time we reflow it, we won't have reflowed the placeholders for the floats inside it again, and thus we won't know that we again have to push them.

I'm not sure what to do about this.
blocking2.0: --- → ?
blocking2.0: ? → betaN+
Assignee: nobody → dbaron
Keywords: regression
OS: Mac OS X → All
Hardware: x86 → All
... and I needed to debug this again to remind myself why this isn't happening:  it's because the placeholders are actually in the previous continuation, and the floats for those placeholders are in the next continuation.  So when we reflow the frame twice, it can report being complete since the first time it was reflowed, it pushed the placeholders to its next continuation, but it doesn't gather them again when it's reflowed a second time.
Maybe, to fix it, we can pull all the pushed floats from the start of the next-in-flow's float list?  But when we did that we'd want to immediately try reflowing those that were pushed from a previous block, but not those that were pushed from this block.
I think we should just store the original parent on floats that get pushed, in a frame property.  Then at the start of reflow we can pull back any floats at the start of the next-in-flow's float list that were pushed, fix up the pushed float bit on them, and reflow any that were pushed from a prior continuation.
Actually, I think that's still not right.

I think I've decided I'm not going to worry about making sure the layout here is correct; instead I'll just make it not crash by ensuring that we don't report the frame as complete in these cases.
(It's still not right because I still don't know how to determine when to reflow those pushed floats relative to continuations of floats.)
Actually, maybe I was thinking about it wrong -- I think I was thinking about floats pushed from |this|, when I really only need to think about floats pushed from this's previous continuations, and they should always get reflowed at the end of the pushed floats.

Also, I think the floats in question would still be in *our* pushed floats list, which might make this even easier.
Crash Signature: [@ nsLayoutUtils::GetFloatFromPlaceholder]
Boy, from reading the comments, this bug sounds pretty complicated.  I'm told it blocks a good number of our fuzzing efforts from progressing.  David, it seems like you've put a decent amount of thought into this bug.  Any chance I can persuade you to take another look at it and maybe attempt a patch?
The patches:

pushed-float-store-original-parent
fix-float-comment
remove-empty-pushed-floats
pull-pushed-floats-back

in my patch queue at https://hg.mozilla.org/users/dbaron_mozilla.com/patches/ are work-in-progress for this bug; I don't remember right now what's involved in finishing that work.
Whiteboard: [fuzzblocker:layout-crashes]
This bug forces my DOM fuzzer to ignore crashes in large swaths of layout code, along with many key layout assertions.  I'm pretty sure I've missed security holes due to this bug being unfixed, even if this bug is somehow not a security hole on its own.

If we can't get our CSS column code under control quickly, we should disable it.
(In reply to David Baron [:dbaron] from comment #12)
> The patches:
> 
> pushed-float-store-original-parent
> fix-float-comment
> remove-empty-pushed-floats
> pull-pushed-floats-back
> 
> in my patch queue at
> https://hg.mozilla.org/users/dbaron_mozilla.com/patches/ are
> work-in-progress for this bug; I don't remember right now what's involved in
> finishing that work.

Do we know who is picking this up?
So the testcase in this bug no longer crashes for me.  Jesse, do you have a testcase that does, that you think is this bug?

In any case, I looked through my patches, tried to find things wrong with them by inspection, and fixed a few things.  In particular, I just wrote the insertionPrevSibling bit of patch 4 (which is the main patch) and all of patch 5.  I'll post them for review, since I don't have any better ideas.
Flags: needinfo?(jruderman)
Comment on attachment 700105 [details] [diff] [review]
, patch 1:  Store the original parent of first-in-flow pushed floats.

Review of attachment 700105 [details] [diff] [review]:
-----------------------------------------------------------------

It's not clear to me why you can't just find this value via the float's placeholder.

::: layout/generic/nsBlockFrame.h
@@ +131,5 @@
>  
>    friend nsIFrame* NS_NewBlockFrame(nsIPresShell* aPresShell, nsStyleContext* aContext, uint32_t aFlags);
>  
> +  // This is a pointer (weak) from a pushed float to its original parent.
> +  NS_DECLARE_FRAME_PROPERTY(PushedFloatOriginalParent, nullptr);

Extend this comment to specify the lifetime of the property.
> ::: layout/generic/nsBlockFrame.h
> @@ +131,5 @@
> >  
> >    friend nsIFrame* NS_NewBlockFrame(nsIPresShell* aPresShell, nsStyleContext* aContext, uint32_t aFlags);
> >  
> > +  // This is a pointer (weak) from a pushed float to its original parent.
> > +  NS_DECLARE_FRAME_PROPERTY(PushedFloatOriginalParent, nullptr);
> 
> Extend this comment to specify the lifetime of the property.

Also:  NS_DECLARE_FRAME_PROPERTY() doesn't need a semicolon at the end.
Comment on attachment 700108 [details] [diff] [review]
, patch 4:  Pull pushed floats back from the next-in-flow at the start of reflow.

Review of attachment 700108 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/generic/nsBlockFrame.cpp
@@ +4527,5 @@
> +  nsFrameList *ourPushedFloats = GetPushedFloats();
> +  if (ourPushedFloats) {
> +    // When we pull back floats, we want to put them with the pushed
> +    // floats, which must live at the start of our float list, but we
> +    // want them at the end of those pushed floats.

Is this true? Couldn't some of them belong at the end of our floats list? E.g. a float that just failed to fit at the end of this block?
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #22)
> It's not clear to me why you can't just find this value via the float's
> placeholder.

I guess I could, but it might require walking up through some number of inlines.  Maybe that's still better, though?
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #24)
> Comment on attachment 700108 [details] [diff] [review]
> , patch 4:  Pull pushed floats back from the next-in-flow at the start of
> reflow.
> > +  nsFrameList *ourPushedFloats = GetPushedFloats();
> > +  if (ourPushedFloats) {
> > +    // When we pull back floats, we want to put them with the pushed
> > +    // floats, which must live at the start of our float list, but we
> > +    // want them at the end of those pushed floats.
> 
> Is this true? Couldn't some of them belong at the end of our floats list?
> E.g. a float that just failed to fit at the end of this block?

I believe the code maintains the invariant that pushed floats (which were pushed from a previous block) go earlier in the floats list than any floats whose placeholders are in this block.  This invariant should be maintained by (in the order they happen during reflow):
 * nsBlockFrame::DrainPushedFloats
 * nsBlockFrame::ReflowPushedFloats
 * (nsBlockReflowState/nsLineLayout)::AddFloat, which is what's called when we reflow a float through a placeholder, and which always appends to the end of the float list if the float isn't in the current block's float list

I'm not particularly confident that the ordering *within* the pushed floats is maintained correctly, but I think any errors there should be layout bugs rather than dangling pointers.
(In reply to David Baron [:dbaron] from comment #25)
> I guess I could, but it might require walking up through some number of
> inlines.  Maybe that's still better, though?

I think it would be, yes.

There have been a number of bugs where we exit reflow with frames still in some overflow list. Normally these bugs are relatively innocuous, but with this patch they would become more dangerous because we could delete the block and leave a dangling pointer, plus PushFloatPastBreak would miss at least one update to the property.

It just seems a lot safer to not cache frame pointers where they could live long, until we have a good performance reason to do so.

(In reply to David Baron [:dbaron] from comment #26)
> I believe the code maintains the invariant that pushed floats (which were
> pushed from a previous block) go earlier in the floats list than any floats
> whose placeholders are in this block.

I agree, but if GetPushedFloats contains a float whose placeholder is the last frame in this block, we need to put that float at the end of our floats list, do we not?
Comment on attachment 700105 [details] [diff] [review]
, patch 1:  Store the original parent of first-in-flow pushed floats.

The new version of patch 4 no longer depends on patch 1, per roc's suggestion above.

(I'm getting gradually less confident that the patches fix things, though, since I no longer have a testcase that shows the bug.)
Attachment #700105 - Attachment is obsolete: true
Attachment #700105 - Flags: review?(roc)
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #27)
> I agree, but if GetPushedFloats contains a float whose placeholder is the
> last frame in this block, we need to put that float at the end of our floats
> list, do we not?

The somewhat tricky thing about this code is that it's only pulling back the floats that are pushed from prior blocks, and *not* the floats pushed from this block.

(There'd be a new patch 4 up by now if the bugzilla API weren't timing out...)
Comment on attachment 700784 [details] [diff] [review]
patch 4: Pull pushed floats back from the next-in-flow at the start of reflow.

Review of attachment 700784 [details] [diff] [review]:
-----------------------------------------------------------------

You should probably add that information to the comment in this patch.
Attachment #700784 - Flags: review?(roc) → review+
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #31)
> Comment on attachment 700784 [details] [diff] [review]
> patch 4: Pull pushed floats back from the next-in-flow at the start of
> reflow.
> 
> Review of attachment 700784 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> You should probably add that information to the comment in this patch.

I'm assuming "that information" is comment 26.  (comment 29 is already in the comment in the patch. :-)

I need to figure out a good place to put it... I'm thinking a comment above DrainPushedFloats's implementation, with pointers to that comment from elsewhere.
(In reply to David Baron [:dbaron] from comment #32)
> I'm assuming "that information" is comment 26.  (comment 29 is already in
> the comment in the patch. :-)

I don't think it's clear then. At least, I don't think it's clear from this:
+  // If we're getting reflowed multiple times without our
+  // next-continuation being reflowed, we might need to pull back floats
+  // that we just put in the list to be pushed to our next-in-flow.
+  // But we can't pull any next-in-flows of floats on our own float list.
Just landed:
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/33ff2a97b021
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/0542ba6920b3
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/5c29097e7bbd
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/9e1f9e488479
though I managed not to see comment 33 until writing this.

Still, I'm not sure what would make it clearer than what's already there:

+        nsIFrame *floatOriginalParent = presContext->PresShell()->
+          FrameConstructor()->GetFloatContainingBlock(placeholder);
+        if (floatOriginalParent != this) {
+          // This is a first continuation that was pushed from one of our
+          // previous continuations.  Take it out of the pushed floats
Yes, it's clear in the code, but the comment I cited makes it sound like you're going to pull back all floats.
(In reply to David Baron [:dbaron] from comment #16)
> So the testcase in this bug no longer crashes for me.  Jesse, do you have a
> testcase that does, that you think is this bug?

I don't have another testcase with this crash signature handy, but the testcase in bug 621424 seems to be fixed by this patch.
Flags: needinfo?(jruderman)
A valgrind run of the crashtest directory that was failing (layout/generic/crashtests/) shows that the underlying problem causing the orange seems to be in 570160.html (which was not obvious from the log or from running individual tests or from running the directory without valgrind).  More later, once I look into it.
(It probably would have helped if I'd noticed the class="reftest-print" yesterday evening.)
Amusingly enough, the crash was due to the tiny patch 3, not to patch 4.  In particular, nsBlockFrame::SplitFloat calls StealFrame, which steals a next-in-flow from this's pushed floats list, and then deletes that list.  But nsBlockReflowState has a cached pointer to that list which is used in the aState.AppendPushedFloat() call at the end of SplitFloat.

I'm inclined to just drop patch 3; it was just cleanup of something I thought I could clean up, but actually can't.
(In reply to David Baron [:dbaron] from comment #40)
> I'm inclined to just drop patch 3; it was just cleanup of something I
> thought I could clean up, but actually can't.

And, really, it will get deleted soon enough when the next-in-flow calls its DrainPushedFloats method at the start of reflow.

(Though what if the block ends up being complete and doesn't have a next-in-flow?  Do we incorrectly mark the block incomplete and create a next-in-flow we shouldn't, or do we leave the extra empty frame list lying around until we destroy the block?)
(And while addressing roc's issue with the comments, I also added a few more FIXMEs that I noticed to the code.)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: