Use-after-poison in nsFloatManager::GetFlowArea (with floats, multicol, and huge width)
Categories
(Core :: Layout, defect, P3)
Tracking
()
People
(Reporter: s.paraschoudis, Assigned: MatsPalmgren_bugz)
References
Details
(5 keywords, Whiteboard: [adv-main68-])
Crash Data
Attachments
(4 files, 1 obsolete file)
Updated•10 years ago
|
Updated•10 years ago
|
| Assignee | ||
Comment 1•10 years ago
|
||
Comment 2•10 years ago
|
||
Comment 3•10 years ago
|
||
Updated•10 years ago
|
Comment 4•8 years ago
|
||
Updated•8 years ago
|
Updated•7 years ago
|
Comment 5•7 years ago
|
||
I've attached a simplified testcase. Actually the cause is the violation of pushed-float invariant in nsColumnSetFrame.
Under certain cases, it will reflow the content with a smaller height constraint when a bigger one has already overflowed. If the previous failure has pushed some float, then a smaller height constraint could make the prev-in-flow of those pushed floats overflow, thus violates the invariant of pushed-float.
I will submit a patch.
Comment 6•7 years ago
|
||
If a BSize is already infeasible, we should not try a smaller size. Because if
the previous failure leaves some pushed floats, a smaller try might make
the prev-in-flow of the pushed float overflow, which violates the invariant of
pushed float and may cause use-after-free and crash.
Comment 7•7 years ago
|
||
Hi Mats,
I'm not sure whether you're the correct person to review this patch, I set the reviewer to you because I found multiple comments from you in layout/generic code.
Could you assign this bug to me and review my patch? In case I got the reviewer wrong, could you forward the patch to a correct reviewer?
Thanks!
| Assignee | ||
Updated•7 years ago
|
| Assignee | ||
Comment 8•7 years ago
|
||
Violet, maybe you should apply for Level 1 Commit Access so you can push to Try yourself?
https://wiki.mozilla.org/ReleaseEngineering/TryServer
Comment 9•7 years ago
|
||
Ok, I will file a request for level 1 access and run the test before submitting. I've just read the instruction page, it says I need a voucher to apply, would you vouch for me?
Regarding the bug, I'm not sure whether I have explained the root cause clearly on that page. A step-by-step explanation might be very lengthy, but the timeline can be quickly figured out by running $ MOZ_LOG=ColumnSet:4 ./mach run with some printf to track nsBlockFrame::Reflow, nsBlockFrame::SplitFloat, etc.
The oddity emerges when the bsize constraint goes from 40 to 19, just a few log messages before crash happens.
| Assignee | ||
Comment 10•7 years ago
|
||
(In reply to violet.bugreport from comment #9)
Ok, I will file a request for level 1 access and run the test before submitting. I've just read the instruction page, it says I need a voucher to apply, would you vouch for me?
Sure. CC me on the bug.
Regarding the bug, I'm not sure whether I have explained the root cause clearly on that page. A step-by-step explanation might be very lengthy, but the timeline can be quickly figured out by running
$ MOZ_LOG=ColumnSet:4 ./mach runwith someprintfto tracknsBlockFrame::Reflow,nsBlockFrame::SplitFloat, etc.
I can't really review any changes unless I think I have a clear understanding what the problem is first. I can debug it myself of course but that may take some time.
Comment 11•7 years ago
|
||
The machine I use to debug firefox has some problems, so I need time to fix them before applying for try server access.
So I think it would be better to submit the update that I already tested a few days ago. Would you test it on the try server and review the change? Thanks!
| Assignee | ||
Comment 12•7 years ago
|
||
Well, it passed tests on Linux at least:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6ffdbe718e26c854bd4ad98b2d15e602453359a8
As I said earlier, I'm not going to review this until I understand
what goes wrong in the frame tree with our current code. It would
help if you can try to explain it to me, otherwise I'll have to
debug it myself.
Comment 13•7 years ago
•
|
||
Ok, I will explain the oddity in the nsColumnSetFrame with a further reduced testcase by dropping the surrounding div around dev.float-L which causes an assertion failure (see my explanation in a comment on phabricator). So the testcase would be:
<div class="multicol-b">
<div class="step"></div>
<div class="float-L"></div>
</div>
-
1px=60 logical unit. Both
.stepand.float-Lhave 1px height, thus it's totally 2px=120. And we need at least 40 BSize to place it in 3 columns (120 = 40 * 3). -
At the beginning of the log, it's the outer multicol
.multicol-arelowing the child #0 withavailBSize=1699,float-Rconsumes 26px=1560, while the inner multicol has 1px border, which consumes 2*1px=120. So theavailableContentBSizefor the inner one only remains1699 - 1560 - 120 = 19. But because of the lack of astd::minin initialization, it tries to reflow40. -
see the inline comments in the log
// outer reflows child #0
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Reflowing child #0 0x7fe0ef551be8: availBSize=1699
[00000003] nsBlockFrame:0x7fe0ef551be8 Reflow// .float-R reflows
[00000004] nsBlockFrame:0x7fe0ef551d38 Reflow
[00000004] nsBlockFrame:0x7fe0ef551d38 Reflow done// inner reflows
[Child 8618: Main Thread]: D/ColumnSet ChooseColumnStrategy: numColumns=3, colISize=4000, expectedISizeLeftOver=0, colBSize=40, colGap=0// at this point, we only have 19 available, why try
mColMaxBSize=40? To exceed 19 is pointless
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Doing column reflow pass: mLastBalanceBSize=40, mColMaxBSize=40, RTL=0, mBalanceColCount=3, mColISize=4000, mColGap=0[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Reflowing child #0 0x7fe0ef551df0: availBSize=40
[00000004] nsBlockFrame:0x7fe0ef551df0 Reflow
[00000005] nsBlockFrame:0x7fe0ef551f40 Reflow
[00000005] nsBlockFrame:0x7fe0ef551f40 Reflow done
[00000004] nsBlockFrame:0x7fe0ef551df0 Reflow done
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Reflowed child #0 0x7fe0ef551df0: status=[Complete=N,NIF=Y,Truncated=N,Break=N,FirstLetter=N], desiredSize=(4000,40), CarriedOutBEndMargin=0 (ignored)
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Next childOrigin.iCoord=4060
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Reflowing child #1 0x7fe0ef552398: availBSize=40
[00000004] nsBlockFrame:0x7fe0ef552398 Reflow
[00000005] nsBlockFrame:0x7fe0ef552450 Reflow
[00000005] nsBlockFrame:0x7fe0ef552450 Reflow done
[00000005] nsBlockFrame:0x7fe0ef551ff8 Reflow
[00000005] nsBlockFrame:0x7fe0ef551ff8 Reflow done
[00000004] nsBlockFrame 0x7fe0ef552398 SplitFloat 0x7fe0ef551ff8 -> 0x7fe0ef5522d0
[00000004] nsBlockFrame:0x7fe0ef552398 Reflow done
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Reflowed child #1 0x7fe0ef552398: status=[Complete=O,NIF=Y,Truncated=N,Break=N,FirstLetter=N], desiredSize=(4000,40), CarriedOutBEndMargin=0 (ignored)
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Next childOrigin.iCoord=8060// Why inconsistent height? Both child #0 and #1 have 40, #2 only has 19?
// That's because https://github.com/mozilla/gecko-dev/blob/master/layout/generic/nsColumnSetFrame.cpp#L712
// the last column is unbounded, thus it takes all avail size, which is 19, which is smaller than the previous columns.
// Now we take a look at float splitting state. 60+60 should be split into 3 column, thus (40)+(20+20)+(40),
// So there is already a split for 60 -> 20 + 40.
// An unfortunate consequence of the oddity here is that the 40 is further split into 40 -> 19 + 21, then
// 21 is pushed to child#2 as a pushed-float, see splitting->below.
// After this reflow, we end up with split float1ff8,22d0,2790
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Reflowing child #2 0x7fe0ef552558: availBSize=19
[00000004] nsBlockFrame:0x7fe0ef552558 Reflow
[00000005] nsBlockFrame:0x7fe0ef5522d0 Reflow
[00000005] nsBlockFrame:0x7fe0ef5522d0 Reflow done
-> [00000004] nsBlockFrame 0x7fe0ef552558 SplitFloat 0x7fe0ef5522d0 -> 0x7fe0ef552790
[00000004] nsBlockFrame:0x7fe0ef552558 Reflow done
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Reflowed child #2 0x7fe0ef552558: status=[Complete=O,NIF=Y,Truncated=N,Break=N,FirstLetter=N], desiredSize=(4000,19), CarriedOutBEndMargin=0 (ignored)
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Done column reflow pass: Infeasible :(// Now it ends with an infeasible state, we arrive at https://github.com/mozilla/gecko-dev/blob/master/layout/generic/nsColumnSetFrame.cpp#L1111
// Then we reflow at 19 since it's the available size. But we shouldn't have tried 40 in the first place
[Child 8618: Main Thread]: D/ColumnSet FindBestBalanceBSize: KnownInfeasibleBSize=40, KnownFeasibleBSize=40
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Doing column reflow pass: mLastBalanceBSize=40, mColMaxBSize=19, RTL=0, mBalanceColCount=3, mColISize=4000, mColGap=0
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Reflowing child #0 0x7fe0ef551df0: availBSize=19
[00000004] nsBlockFrame:0x7fe0ef551df0 Reflow
[00000005] nsBlockFrame:0x7fe0ef551f40 Reflow
[00000005] nsBlockFrame:0x7fe0ef551f40 Reflow done
[00000004] nsBlockFrame:0x7fe0ef551df0 Reflow done
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Reflowed child #0 0x7fe0ef551df0: status=[Complete=N,NIF=Y,Truncated=N,Break=N,FirstLetter=N], desiredSize=(4000,19), CarriedOutBEndMargin=0 (ignored)
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Next childOrigin.iCoord=4060
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Reflowing child #1 0x7fe0ef552398: availBSize=19
[00000004] nsBlockFrame:0x7fe0ef552398 Reflow
[00000005] nsBlockFrame:0x7fe0ef552450 Reflow
[00000005] nsBlockFrame:0x7fe0ef552450 Reflow done
[00000004] nsBlockFrame:0x7fe0ef552398 Reflow done
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Reflowed child #1 0x7fe0ef552398: status=[Complete=N,NIF=Y,Truncated=N,Break=N,FirstLetter=N], desiredSize=(4000,19), CarriedOutBEndMargin=0 (ignored)
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Next childOrigin.iCoord=8060// first 2 children only consume 19+19=38 height, thus the first split float
1ff8is entirely overflow in child#1,
// child#2 drains it, but one of the split float (2790) in its next-in-flow chain is still in child#2's pushed-float list.
[Child 8618: Main Thread]: D/ColumnSet ReflowChildren: Reflowing child #2 0x7fe0ef552558: availBSize=19
[00000004] nsBlockFrame:0x7fe0ef552558 Reflow
[00000004] nsBlockFrame:0x7fe0ef552558 DrainOverflowLines 0x7fe0ef551ff8 <- this one, overflow from child #1
[00000004] nsBlockFrame:0x7fe0ef552558 DrainOverflowLines: nif=0x7fe0ef5522d0
[00000004] nsBlockFrame:0x7fe0ef552558 DrainOverflowLines: nif=0x7fe0ef552790 <- this one, the third split float still in pushed float list of child#2, which violates assertion.
Assertion failure: mFloats.ContainsFrame(nif), at layout/generic/nsBlockFrame.cpp:4756 -
Conclusion: we should not try a size that is greater than
availableContentBSizein the first place, it's a wasted computation at best. It will also make some previously overflow-but-not-drained frames hanging around to do bad things.
Comment 14•7 years ago
|
||
My updated patch is to address this problem. As a note, the original code will theoretically produce a wrong layout. Suppose we have 118 content to lay out in 3 columns, with an availableContentBSize 39 after a previously successful reflow (in order to have mLastBalanceBSize=40 so that the first try will start at 40). Then with original code it will be 118 = 40 + 40 + 38 with a feasible state. But the correct layout should respect the constraint and produce 118 = 39 + 39 + 39, with 1 overflow. This is quite theoretical, since a couple of logical unit can hardly have visible effect.
Comment 15•7 years ago
|
||
Hi Mats,
It's just a kind reminder that this patch I submitted half a month a ago is waiting for your review. I've already explained the timeline of this bug with logging message in a previous comment.
Could you review this patch now? Or if you don't have time, could point me to someone who is also eligible to review this patch? I'm afraid I would probably forget some of the detail in the bug if it takes too long.
| Assignee | ||
Comment 16•7 years ago
|
||
Sorry for the delay, some other bugs required my attention.
I'm debugging this now so I'll get back to you soon...
Comment 17•7 years ago
|
||
Hi Mats,
Do you have further update of your reviewing? If you have any questions about this bug and my patch, feel free to ask me.
| Assignee | ||
Comment 18•7 years ago
•
|
||
So here's my initial analysis of what happens in the frame tree
that leads up to the crash. I've colorized the interesting float
continuation chain, and numbered the interesting frames dumps
in a grey background color and numbered them. Referring to
each of the numbered trees in this log:
- initially we have a float in the first column
- we reflow and find that it overflows (but its parent is complete),
so we put a NIF for it on the OverflowContainersList - when reflowing the next column, we find that it still doesn't
fit so we create another NIF and put it on the PushedFloatsList
and create a continuation for the parent on ExcessOverflowContainersList
CRASH
Now backtracking a bit in rr to see what happens between 3 and the crash...
- This is the frame tree we start with when reflowing our float (red).
Note that the inner ColumnSet has a NIF at this point. - Backtracking to where we create that ColumnSet NIF.
I think this is the point where things start to go wrong.
We've decided that the float's parent doesn't fit in in this
column at all and we push the lines containing it (creating
the Overflow-lines list) and create the NIF. - At this point the NIF has picked up the Overflow-lines so
the float first-in-flow is in the ColumnSet NIF and its continuations
in the previous ColumnSet. Likewise for the float's parent block
which is also in the wrong order. This seems questionable to me.
Technically, we would either be COMPLETE, in which case we'd delete
the NIFs, or INCOMPLETE, in which case we'd push them, but still...
From 6, this is essentially what happens, we decide it's COMPLETE
and delete the NIFs. The reason we crash is that one of the float
NIFs are registered in the float manager, and we try to use it
after it's been deleted (frame-poisoning makes it non-exploitable
though).
(my apologies for the verbosity of this log)
| Assignee | ||
Updated•7 years ago
|
| Assignee | ||
Comment 19•7 years ago
|
||
I wrote a WIP to fix the underlying cause of the crash here:
https://hg.mozilla.org/try/rev/36932abd37ac11308bfee0068ef5eaf74e9f7846
If a continuation pulls in children from its prev-in-flow
then it should also move any overflow containers for those
from its OverflowContainers list, to avoid the child
continuations from becoming in the wrong order.
The WIP currently moves them to the principal child list
(mFrames), but ExcessOverflowContainers might be better.
I'll investigate this some more...
| Assignee | ||
Comment 20•7 years ago
|
||
FTR, there's one more detail I want to highlight in the log above.
At the very end, I print out nsFloatManager::mFloats. We have
the (float) child of the overflow container (green) at index 0,
and the (red) child float from the line list in slot 1. This is
because we call ReflowOverflowContainerChildren before reflowing
the principal children (in the lines), so it's added first.
The stack when we add that float is:
#0 nsTArray_Impl<nsFloatManager::FloatInfo, nsTArrayInfallibleAllocator>::AppendElement
#1 nsFloatManager::AddFloat
#2 mozilla::BlockReflowInput::FlowAndPlaceFloat
#3 nsBlockFrame::ReflowPushedFloats
#4 nsBlockFrame::Reflow
#5 nsContainerFrame::ReflowChild
#6 nsContainerFrame::ReflowOverflowContainerChildren
#7 nsBlockFrame::Reflow
So after we reflow the red float and find that it's COMPLETE
and destroy its continuations, we have a dead continuation
in the float manager.
| Assignee | ||
Comment 21•7 years ago
|
||
Hope you don't mind if I steal this one Violet. I'd like to fix
the underlying crash here separately from any sizing changes.
Please file a follow-up bug with your patch since it might give
more correct layout results. Thanks.
| Assignee | ||
Comment 22•7 years ago
|
||
Comment 23•7 years ago
|
||
Comment 24•7 years ago
|
||
| bugherder | ||
Updated•7 years ago
|
Updated•7 years ago
|
Updated•6 years ago
|
Description
•