333 bytes, text/html
11.00 KB, text/html
2.64 KB, patch
|Details | Diff | Splinter Review|
7.30 KB, patch
|Details | Diff | Splinter Review|
Created attachment 296464 [details] testcase Loading the testcase makes Firefox crash [@ GetChildListNameFor] dereferencing 0xdddddddd.
We're under DeletingFrameSubtree, and the first frame in the destroyQueue is already dead. So when we go to make the RemoveFrame() call, all sorts of bad stuff happens. We need to fix this for 1.9, imo.
Created attachment 296503 [details] Frame dump #1 A restyle event leads up to DeletingFrameSubtree() on 0x12f4428 (lime). Traversing that, and its continuations, we find placeholders for (red) and (blue) which are added to the destroy queue. We're destroying frames in reverse order from that list so we destroy (blue) first, which is means (red) is already destroyed when we get to it on the destroy queue. We shouldn't have added these frames at all to the destroy queue since they are descendants of the frame we originally wanted to remove (lime). That, in itself, isn't normally a problem that would lead to a crash as long as we find all the placeholders, and in the right order. (ie. if (red) were last on destroy queue it would have been destroyd and remove from the frame tree by the time we destroy (blue)) What also worries me is the hypothetical case where (blue) is *not* a descendant of (lime), then we would recurse into it but since DoDeletingFrameSubtree() does not traverse continuations we would fail to find the placeholder for (red) and simply not process it, which could lead to crashes. (We only traverse continuations for the original frame, in DeletingFrameSubtree()). I don't know if this case can occur though. https://bugzilla.mozilla.org/show_bug.cgi?id=379349#c3 says: # The overflow frames are still linked up as next-in-flows, # but might not be contiguous in a breadth-first traversal. Does that break the assumption on order in the destroy queue?
Given the frame tree I think this is a regression from bug 379349 which probably broke the frame tree assumptions we made in bug 310638.
Created attachment 296505 [details] [diff] [review] wip This fixes the testcase but I'm not sure if it's enough...
Hmm, maybe "nsTArray<nsWeakFrame> destroyQueue" would be a more robust solution...
> nsTArray<nsWeakFrame> destroyQueue That's dangerous. It'll only work if you preallocate the array length. If it has to resize, it'll memmove nsWeakFrames, which breaks them (since they hold pointers to each other).
Ok, I see what you mean, thanks for pointing it out.
Created attachment 296609 [details] [diff] [review] NS_FRAME_EXTERNAL_REFERENCE hack Here's a hack that uses NS_FRAME_EXTERNAL_REFERENCE to get a notification to the frame constructor which then can clear out the frame in the destroy queue. I'm not seriously suggesting this as the fix, just something I wanted to explore, just in case... (it does fix the crash though)
I think the current code relies on the fact that we skip all child lists that contain out-of-flows, and we expect to find all relevant placeholders instead. Also, it does not process next-in-flows other than for the top frame because we expected to find all those within the subtrees of the top frame continuations. After bug 379349, the first assumption is clearly false and I think the second assumption might be false too. My current thoughts on a real fix is this: 1. continue to process OverflowContainers-list but skip all out-of-flows on this list. (same for excessOverflowContainersList? are there more?) 2. process next-continuations for all out-of-flows, ie these two calls: http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/layout/base/nsCSSFrameConstructor.cpp&rev=1.1450&root=/cvsroot&mark=9343-9345,9348-9351#9278 should be changed to call a new (mutual recursive) function that calls DoDeletingFrameSubtree() on 'outOfFlowFrame' continuations. Does this sound like the right approach? For the testcase this means we would not find the placeholder for (red) when processing (cyan), since (brown) is an out-of-flow on a OverflowContainers-list. We will find the placeholder for (blue) when processing the last-in-flow for (lime), we will not add (blue) to the destroy queue but recurse on its next-continuations, including (brown) and this time will find the placeholder for (red) but again it's within the continuation subtrees of (lime) so it won't be added to the destroy queue either, but we will still process its children. For the hypothetical case I was thinking of in comment 2 where (blue) escapes the subtree we would add it to the destroy queue and recurse on its next-continuations where we find (red) when processing (brown), but it's within the subtree of (blue) so it won't be added to the destroy queue, but we will still process its children.
Created attachment 296632 [details] [diff] [review] wip2, what I suggested in comment 9 ... although now that I think about it we should probably skip float continuations since we're supposed to have placeholder continuations for those IIRC. BTW, is my assumption that the [excess]overflowContainersList lists can contain both in-flow and out-of-flow frames correct? (if it's only OOF frames we could just skip these lists)
Yes. The [excess]overflowContainersList can contain in-flow frames and abspos OOF frames only. It doesn't contain floats directly. Abspos OOF frames never have placeholder continuations; they only have one placeholder for the first-in-flow. Fixed frames should never split. Floats have placeholder continuations, but we'd like to change that eventually.
Mats, any progress on your work in progress? :)
Still crashes on trunk.
This is on our "Top Security Bugs" list and it appears to have stalled a bit. It would be great if we can get some traction here.
The "wip2" patch also seems to fix bug 463785. I'll try to make a patch for review in the next couple of days, sorry for the delay.
Created attachment 348490 [details] [diff] [review] Patch rev. 4 Walk child frame next-in-flows that are overflow containers. Skip [excess]overflowContainersList to avoid processing them more than once. (I'm assuming overflow containers only lives on those lists and nowhere else)
Comment on attachment 348490 [details] [diff] [review] Patch rev. 4 + // The invariant that "continuing frames should be found as part of the + // walk over the top-most frame's continuing frames" does not hold for + // out-of-flow overflow containers, so we need to walk them too. ... + // The above is true for normal child next-in-flows but not overflow + // containers which we do walk because they *can* escape the subtree + // we're deleting. I believe these comments are wrong. OOF overflow containers /will/ be found if their first-in-flow's ancestor's continuations are all found. I think the problem here is the loop in DoDeletingFrameSubtree. It loops over the GetNextSibling() links and it assumes for each frame it processes that its continuations will be processed next. That assumption that breaks down when overflow containers are involved because continuations are no longer contiguous. I conclude that the comments need some revising. I think the code in the patch is correct. I'm not convinced this is the best way to fix the problem as described above: I suspect some kind of loop over continuations in general would also solve the problem and would be easier to adapt for OOF handling in the future when float continuations don't have placeholders, etc. However we're not in that world yet, and float-placeholder relationships are tricky, so this is just an idea for consideration; I won't r- you on that.
Hi Mats, can you please address fantasai's review comments? We would like to get this one closed out as it is one of our "Top Security Bugs".
Comment on attachment 348490 [details] [diff] [review] Patch rev. 4 (In reply to comment #18) > OOF overflow containers /will/ be found if > their first-in-flow's ancestor's continuations are all found. Yes, that's reasonable, but the crash occurs when the first-in-flow's ancestor is not part of the walk at all. I think the first frame dump in bug 463785 (attachment 347096 [details]) illustrates this. The first-in-flow's ancestor (DIV 0x180c710) of the first-in-flow OOF overflow container (TD 0x1901698/blue) isn't involved at all. The comment above refers to the initial frame to DeletingFrameSubtree (TR 0x1901000/green) and its continuations. > I think the code in the patch is correct. Since we agree on the code change I want to push the patch as is (to start the baking for 1.9.1) and make comment improvements, if necessary, as a follow-up patch, ok?
Why not just fix the comments now? Surely it won't take long.
I believe the comments are factually correct, so I don't know how to fix them.
Comment on attachment 348490 [details] [diff] [review] Patch rev. 4 Yes, I think you're right.
http://hg.mozilla.org/mozilla-central/rev/f6aa0df95deb I'm holding the crashtest until Firefox 3.0.x is fixed. -> FIXED
Mats, is it ok if I land this on 1.9.1 for you?
I think we should fix/bake bug 468563 first and land that at the same time. I'll do that in a day or two.
It's been much longer than a day or two and this bug needs to land *now* to make beta3 :-(.
If this is truly a P1 blocker, and is truly dependent on bug 468563, then that other bug should also be a P1 blocker. If not, then we should either remove the dependency and land this one, or make it not a P1.
Changing this back to P3, which is where it was before dveditz changed it last week.
I'm not clear why this is blocking1.9.1-minus, but we appear to have a patch so why don't you just request 1.9.1 approval and land it?
Comment on attachment 348490 [details] [diff] [review] Patch rev. 4 Yeah, I can't see any reason not to take this on 1.9.1
What about comment 26, which is a regression from this bug?
This was fixed on the trunk well before the 1.9.2 branch, marking it fixed for that branch.
1.9.1 fix pushed in bug 468563.