Closed Bug 1124917 Opened 10 years ago Closed 8 years ago

Very slow flexbox performance on flexified Cleopatra

Categories

(Core :: Layout, defect)

defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 1209697

People

(Reporter: mstange, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

I attempted to convert the cleopatra tree view to flexbox and ended up with something very sluggish. I then reduced the page to a standalone performance test.

I think there are at least two bugs here: Adding the marker element shouldn't need to reflow the tree, and reflowing the tree shouldn't take as long as it does in the first place. If, for some reason, the CSS is written in such a way that reflow *has* to do a lot of work, we should at least find out how we can rewrite it in a cheaper way that still gives the same result.

Chrome is very fast on this testcase but the layout looks very broken in it, so I'm not sure we can infer anything from that.
Flags: needinfo?(dholbert)
Good news -- I'm very close to fixing bug 1054010 (and its chain of dependencies), and it helps a *lot* here.

In a debug build without bug 1054010 fixed, the testcase here alerts with reflow times in the ~3100ms range.

In a debug build *with* bug 1054010 fixed, the testcase here alerts with reflow times in the ~270ms range. So, 90% speedup.

(Numbers will likely be an order of magnitude lower in opt builds, but I expect the speedup to be comparable.)
Depends on: 1054010
Flags: needinfo?(dholbert)
(In reply to Daniel Holbert [:dholbert] from comment #1)
> (Numbers will likely be an order of magnitude lower in opt builds, but I
> expect the speedup to be comparable.)

You're be correct here given the difference is an order of magnitude but I've been bitten many many times by this reasoning before I learned so I'd avoid it as much as possible. :)

Sounds great otherwise!
Sounds promising! What would be needed to move the times all the way down to zero, or at least very close to zero? I don't really see why we'd need to process the tree items at all during reflow, when we've only inserted an absolutely positioned element in some secluded part of the page.
FWIW: I tried an opt build with my patch for bug 1054010, and the reported reflow times are all less than 100ms, or not too much over 100ms -- (45,54,71,99,90,71,95,98,84,97), while current Nightly gives me values in the 300-400ms range (310,339,343,323,357,379,341,373,369,344).

So not a 90% speedup, but still a marked improvement.

(In reply to Markus Stange [:mstange] from comment #3)
> Sounds promising! What would be needed to move the times all the way down to
> zero, or at least very close to zero? I don't really see why we'd need to
> process the tree items at all during reflow, when we've only inserted an
> absolutely positioned element in some secluded part of the page.

I'd need to poke around in gdb a bit, to see what the various dirty flags look like when we hit reflow in this case, and how easy it is to optimize for it. (It's possible it might be easy.)

Ever since bug 851012, flexbox reflow is on the greedy side, and can't easily benefit from our block-layout "reflow not needed, return early" optimizations, because its children interact & influence each other's sizes in fundamentally different ways than they do in block layout.  So, unlike e.g. nsBlockFrame::Reflow, the flexbox reflow method greedily triggers a ::Reflow call on each of its children, and then trusts each of those children to return early if *they* can tell that nothing's changing.  This is pretty cheap if we have blocks inside of a flex container, but if the children themselves are flex containers, then this means we're still doing quite a bit of ::Reflowing.

There are likely optimization that can be done to improve this particular case, though, and for other easy/obvious cases.  But e.g. if all we know is that one of our descendants in one of our flex items is dirty & needs a reflow (which I suspect is effectively the information we have here), that can easily end up changing the fit-content or min-content size of that flex item, which could (by default) affect the amount of flexed free-space that it deserves, which then affects the amount of flexed free space that everything else deserves. So a single descendant being dirty can often mean that we need to reflow everything. In this case, you say the thing needing a reflow is an abspos descendant, so really it *can't* affect the sizing of our descendant chain, but I'm not sure offhand how easy it is to detect that, at nsFlexContainerFrame::Reflow.)

Unrelated to this abspos-descendant scenario -- there are of course cases when we can be sure that changes inside of a flex item won't affect its sizing -- e.g. if none of the flex item's [min-width, min-height, flex-basis, & cross-size property (height or width) are set to values that depend on content.  In that sort of scenario, if just that one flex item is dirty, we should be OK to just reflow that one flex item and leave the rest un-reflowed; but we don't yet have a special case for that sort of thing.
Er, I meant "ever since bug 851607" (which is related to bug 851012).
(In reply to Daniel Holbert [:dholbert] from comment #4)
> But e.g. if all we know is
> that one of our descendants in one of our flex items is dirty & needs a
> reflow (which I suspect is effectively the information we have here), that
> can easily end up changing the fit-content or min-content size of that flex
> item,

Can we, in that case, reflow just the dirty item, check whether its fit-content or min-content size has changed, and skip all the rest of reflow if it hasn't?
We could, though that would require that we cache those values from the previous layout, for each flex item that cares about them (& cares about detecting if they've changed).

Right now, when we hit reflow, all we have saved from the previous reflow is the flex item's mRect, which just tells us *final position & size* that we arrived at for that flex item (after running the flex algorithm), and that's not much use for telling us whether the item's intrinsic sizes have changed.

(We probably should cache more at the end of flex layout -- likely some subset of the things that we store in each FlexItem struct, similar to how we persist nsLineBoxes in block layout.)
Caching more values sounds absolutely worth it to me if it means that we can short-circuit large parts of reflow. I don't think we should rely on being able to detect special cases like absolute position.

I have another example from cleopatra: The sidebar on the right ("heaviest callstack") changes its contents whenever a different tree row is selected. It has a fixed width and overflow:auto. In my flexified version of cleopatra, it's a flex sibling of the tree wrapper div. We should ideally only reflow the contents of that sidebar and nothing else in that case. Would we be able to do that optimization even without caching the intrinsic sizes? We know that the width is fixed, so we know that the intrinsic sizes are fixed, too (is that correct?), so if the width hasn't changed (which we can detect because that's the one thing we cache), we know that the intrinsic sizes haven't changed either... would that work?
I'm not sure -- supposing we notice that this particular right-sidebar flex item is marked as dirty & needs a reflow (because something changed inside of it), and we can tell that it *currently* has a fixed width, I suspect we still can't be sure that it *had* a fixed width on the previous reflow.  Maybe we already expose enough information to reason about that via some other way, though (e.g. via the absence of effects we'd expect to see from the nsChangeHint that would've been triggered, if that flex item's width / flex-basis *had* changed since its last reflow).  I don't have the reflow dirty-state management stuff fresh enough in my memory to be able to say for sure.
Makes sense. Thanks for the answers!

Would it be useful if I created more perftests like the one I've attached here, covering the cleopatra cases I care about?
Yes, thanks! Though, maybe don't pay too much attention to specific pathological cases until next week, after bug 1054010 is (hopefully) fixed.
Sounds good, I'll wait until that happens then.
Results on Windows 7 - Intel HD3000
Firefox 49: Reflow times: 43,45,43,45,42,43,42,42,42,42 (in ms)
Chrome 52: Reflow times: 2,1,1,0,0,1,0,0,0,0 (in ms)
Before bug 1209697: 26,33,31,45,40,28,35,37,27,41
After: 5,5,6,14,4,10,5,6,7,6

David, I'm triaging the possible dupes of bug 1209697 and I'm tempted to close this as a dupe, but given chromium's numbers perhaps there's some room for improvement? (I fear it's a more general issue though)
Flags: needinfo?(dholbert)
Also, note that chromium no longer seems to display a broken layout.
Yeah, I think we can cal this "fixed", particularly that the times here were originally in the seconds (thousands of milliseconds), per comment 1.  And I don't expect this is bad enough to cause the sluggish layout described in comment 0.

(The chrome comparison still shows some room for improvement, particularly on subsequent reflows; but we'd probably want a more targeted testcase to drill down on that.)

I'll mark this as a dupe.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(dholbert)
Resolution: --- → DUPLICATE
> And I don't expect this is bad enough to cause the sluggish layout described in comment 0.

(Sorry, by "this" I mean "the new behavior in current Nightlies")
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: