Closed Bug 411835 Opened 17 years ago Closed 16 years ago

Crash [@ GetChildListNameFor] with -moz-column, position:absolute

Categories

(Core :: Layout, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9.2a1
Tracking Status
status1.9.2 --- beta1-fixed
blocking1.9.1 --- .16+
status1.9.1 --- .16-fixed

People

(Reporter: jruderman, Assigned: MatsPalmgren_bugz)

References

Details

(Keywords: crash, regression, testcase, Whiteboard: [sg:critical?][depends on bug 468563, caused bug 526217] )

Crash Data

Attachments

(4 files, 3 obsolete files)

Attached file testcase
Loading the testcase makes Firefox crash [@ GetChildListNameFor] dereferencing 0xdddddddd.
Flags: blocking1.9?
We're under DeletingFrameSubtree, and the first frame in the destroyQueue is already dead.  So when we go to make the RemoveFrame() call, all sorts of bad stuff happens.

We need to fix this for 1.9, imo.
Priority: -- → P2
Whiteboard: [sg:critical?]
Attached file Frame dump #1
A restyle event leads up to DeletingFrameSubtree() on 0x12f4428 (lime).
Traversing that, and its continuations, we find placeholders for
(red) and (blue) which are added to the destroy queue.
We're destroying frames in reverse order from that list so we
destroy (blue) first, which is means (red) is already destroyed
when we get to it on the destroy queue.

We shouldn't have added these frames at all to the destroy queue since
they are descendants of the frame we originally wanted to remove (lime).
That, in itself, isn't normally a problem that would lead to a crash
as long as we find all the placeholders, and in the right order.
(ie. if (red) were last on destroy queue it would have been destroyd
and remove from the frame tree by the time we destroy (blue))

What also worries me is the hypothetical case where (blue) is *not* a 
descendant of (lime), then we would recurse into it but since
DoDeletingFrameSubtree() does not traverse continuations we would fail
to find the placeholder for (red) and simply not process it, which
could lead to crashes. (We only traverse continuations for the original
frame, in DeletingFrameSubtree()).
I don't know if this case can occur though.

https://bugzilla.mozilla.org/show_bug.cgi?id=379349#c3 says:
#  The overflow frames are still linked up as next-in-flows, 
#  but might not be contiguous in a breadth-first traversal.
Does that break the assumption on order in the destroy queue?
Given the frame tree I think this is a regression from bug 379349
which probably broke the frame tree assumptions we made in bug 310638.
Blocks: 379349
Keywords: regression
OS: Mac OS X → All
Hardware: PC → All
Attached patch wip (obsolete) — Splinter Review
This fixes the testcase but I'm not sure if it's enough...
Hmm, maybe "nsTArray<nsWeakFrame> destroyQueue" would be a more robust solution...
> nsTArray<nsWeakFrame> destroyQueue

That's dangerous.  It'll only work if you preallocate the array length.  If it has to resize, it'll memmove nsWeakFrames, which breaks them (since they hold pointers to each other).
Ok, I see what you mean, thanks for pointing it out.
Attached patch NS_FRAME_EXTERNAL_REFERENCE hack (obsolete) — Splinter Review
Here's a hack that uses NS_FRAME_EXTERNAL_REFERENCE to get a
notification to the frame constructor which then can clear out
the frame in the destroy queue.  I'm not seriously suggesting
this as the fix, just something I wanted to explore, just in case...
(it does fix the crash though)
I think the current code relies on the fact that we skip all child
lists that contain out-of-flows, and we expect to find all relevant
placeholders instead.  Also, it does not process next-in-flows other
than for the top frame because we expected to find all those within
the subtrees of the top frame continuations.  After bug 379349, the
first assumption is clearly false and I think the second assumption
might be false too.

My current thoughts on a real fix is this:
 1. continue to process OverflowContainers-list but skip all
    out-of-flows on this list.
    (same for excessOverflowContainersList? are there more?)
 2. process next-continuations for all out-of-flows, ie these two calls:
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/layout/base/nsCSSFrameConstructor.cpp&rev=1.1450&root=/cvsroot&mark=9343-9345,9348-9351#9278
    should be changed to call a new (mutual recursive) function that
    calls DoDeletingFrameSubtree() on 'outOfFlowFrame' continuations.

Does this sound like the right approach?

For the testcase this means we would not find the placeholder for (red)
when processing (cyan), since (brown) is an out-of-flow on a
OverflowContainers-list.  We will find the placeholder for (blue) when
processing the last-in-flow for (lime), we will not add (blue) to the
destroy queue but recurse on its next-continuations, including (brown)
and this time will find the placeholder for (red) but again it's within
the continuation subtrees of (lime) so it won't be added to the destroy
queue either, but we will still process its children.

For the hypothetical case I was thinking of in comment 2 where (blue)
escapes the subtree we would add it to the destroy queue and recurse on
its next-continuations where we find (red) when processing (brown), but
it's within the subtree of (blue) so it won't be added to the destroy
queue, but we will still process its children.
... although now that I think about it we should probably skip float
continuations since we're supposed to have placeholder continuations
for those IIRC.

BTW, is my assumption that the [excess]overflowContainersList lists
can contain both in-flow and out-of-flow frames correct?
(if it's only OOF frames we could just skip these lists)
Blocks: 412243
Yes. The [excess]overflowContainersList can contain in-flow frames and abspos OOF frames only. It doesn't contain floats directly.

Abspos OOF frames never have placeholder continuations; they only have one placeholder for the first-in-flow. Fixed frames should never split. Floats have placeholder continuations, but we'd like to change that eventually.
Assignee: nobody → mats.palmgren
Flags: blocking1.9? → blocking1.9+
Mats, any progress on your work in progress?  :)
Flags: wanted1.9.0.x+
Flags: blocking1.9-
Flags: tracking1.9+
Flags: blocking1.9.1?
Still crashes on trunk.
Flags: blocking1.9.1? → blocking1.9.1+
Priority: P2 → P3
This is on our "Top Security Bugs" list and it appears to have stalled a bit.  It would be great if we can get some traction here.
Blocks: 463785
The "wip2" patch also seems to fix bug 463785.  I'll try to make a patch
for review in the next couple of days, sorry for the delay.
Attached patch Patch rev. 4Splinter Review
Walk child frame next-in-flows that are overflow containers.
Skip [excess]overflowContainersList to avoid processing them more than once.
(I'm assuming overflow containers only lives on those lists and nowhere else)
Attachment #296505 - Attachment is obsolete: true
Attachment #296609 - Attachment is obsolete: true
Attachment #296632 - Attachment is obsolete: true
Attachment #348490 - Flags: review?(fantasai.bugs)
Whiteboard: [sg:critical?] → [sg:critical?][needs review]
Comment on attachment 348490 [details] [diff] [review]
Patch rev. 4

+  // The invariant that "continuing frames should be found as part of the
+  // walk over the top-most frame's continuing frames" does not hold for
+  // out-of-flow overflow containers, so we need to walk them too.
...
+    // The above is true for normal child next-in-flows but not overflow
+    // containers which we do walk because they *can* escape the subtree
+    // we're deleting.

I believe these comments are wrong. OOF overflow containers /will/ be found if their first-in-flow's ancestor's continuations are all found. I think the problem here is the loop in DoDeletingFrameSubtree. It loops over the GetNextSibling() links and it assumes for each frame it processes that its continuations will be processed next. That assumption that breaks down when overflow containers are involved because continuations are no longer contiguous. I conclude that the comments need some revising.

I think the code in the patch is correct. I'm not convinced this is the best way to fix the problem as described above: I suspect some kind of loop over continuations in general would also solve the problem and would be easier to adapt for OOF handling in the future when float continuations don't have placeholders, etc. However we're not in that world yet, and float-placeholder relationships are tricky, so this is just an idea for consideration; I won't r- you on that.
Attachment #348490 - Flags: review?(fantasai.bugs) → review-
Whiteboard: [sg:critical?][needs review] → [sg:critical?][needs patch revised for review comments]
Hi Mats, can you please address fantasai's review comments?  We would like to get this one closed out as it is one of our "Top Security Bugs".
Comment on attachment 348490 [details] [diff] [review]
Patch rev. 4

(In reply to comment #18)
> OOF overflow containers /will/ be found if
> their first-in-flow's ancestor's continuations are all found.

Yes, that's reasonable, but the crash occurs when the first-in-flow's
ancestor is not part of the walk at all. I think the first frame dump in
bug 463785 (attachment 347096 [details]) illustrates this.
The first-in-flow's ancestor (DIV 0x180c710) of the first-in-flow OOF
overflow container (TD 0x1901698/blue) isn't involved at all.

The comment above refers to the initial frame to DeletingFrameSubtree
(TR 0x1901000/green) and its continuations.

> I think the code in the patch is correct.

Since we agree on the code change I want to push the patch as is
(to start the baking for 1.9.1) and make comment improvements,
if necessary, as a follow-up patch, ok?
Attachment #348490 - Flags: superreview?(roc)
Attachment #348490 - Flags: review?(roc)
Why not just fix the comments now? Surely it won't take long.
I believe the comments are factually correct, so I don't know how to fix them.
Comment on attachment 348490 [details] [diff] [review]
Patch rev. 4

Yes, I think you're right.
Attachment #348490 - Flags: superreview?(roc)
Attachment #348490 - Flags: superreview+
Attachment #348490 - Flags: review?(roc)
Attachment #348490 - Flags: review+
http://hg.mozilla.org/mozilla-central/rev/f6aa0df95deb

I'm holding the crashtest until Firefox 3.0.x is fixed.

-> FIXED
Status: NEW → RESOLVED
Closed: 16 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Whiteboard: [sg:critical?][needs patch revised for review comments] → [sg:critical?]
Target Milestone: --- → mozilla1.9.2a1
Depends on: 468563
Blocks: 429881
Mats, is it ok if I land this on 1.9.1 for you?
I think we should fix/bake bug 468563 first and land that at the same time.
I'll do that in a day or two.
Flags: blocking1.9.0.7?
Flags: blocking1.9.0.7? → blocking1.9.0.7+
Priority: P3 → P1
Whiteboard: [sg:critical?] → [sg:critical?][needs 1.9.1 landing - depends on bug 468563]
It's been much longer than a day or two and this bug needs to land *now* to make beta3 :-(.
If this is truly a P1 blocker, and is truly dependent on bug 468563, then that other bug should also be a P1 blocker. If not, then we should either remove the dependency and land this one, or make it not a P1.
Changing this back to P3, which is where it was before dveditz changed it last week.
Priority: P1 → P3
Flags: wanted1.9.1+
Flags: blocking1.9.1-
Flags: blocking1.9.1+
Flags: blocking1.9.0.7+ → blocking1.9.0.8+
I'm not clear why this is blocking1.9.1-minus, but we appear to have a patch so why don't you just request 1.9.1 approval and land it?
Flags: blocking1.9.1- → blocking1.9.1?
Comment on attachment 348490 [details] [diff] [review]
Patch rev. 4

Yeah, I can't see any reason not to take this on 1.9.1
Attachment #348490 - Flags: approval1.9.1?
What about comment 26, which is a regression from this bug?
Comment on attachment 348490 [details] [diff] [review]
Patch rev. 4

You're right, I misunderstood.
Flags: blocking1.9.1?
Priority: P3 → P2
Flags: blocking1.9.0.8+ → blocking1.9.0.9?
Flags: blocking1.9.0.10?
Blocks: 508473
Depends on: 526217
No longer depends on: 526217
blocking1.9.1: --- → ?
blocking1.9.2: --- → ?
blocking1.9.1: ? → .11+
blocking1.9.2: ? → .5+
blocking1.9.2: .5+ → .6+
blocking1.9.1: .11+ → needed
blocking1.9.2: .7+ → needed
This was fixed on the trunk well before the 1.9.2 branch, marking it fixed for that branch.
blocking1.9.2: needed → ---
Whiteboard: [sg:critical?][needs 1.9.1 landing - depends on bug 468563] → [sg:critical?][needs 1.9.1 landing - depends on bug 468563, caused bug 526217]
Depends on: ZDI-CAN-852
blocking1.9.1: needed → .15+
1.9.1 fix pushed in bug 468563.
Whiteboard: [sg:critical?][needs 1.9.1 landing - depends on bug 468563, caused bug 526217] → [sg:critical?]
Whiteboard: [sg:critical?] → [sg:critical?][depends on bug 468563, caused bug 526217]
Group: core-security
Crash Signature: [@ GetChildListNameFor]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: