BuildDisplayList takes too long on CNN.com

VERIFIED FIXED in Firefox 14

Status

()

Core
Layout
VERIFIED FIXED
5 years ago
5 years ago

People

(Reporter: jrmuizel, Assigned: roc)

Tracking

(Blocks: 1 bug)

Trunk
mozilla14
x86_64
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox14 verified, firefox15 verified, blocking-fennec1.0 beta+)

Details

(Whiteboard: [layout])

Attachments

(10 attachments, 3 obsolete attachments)

1.32 KB, patch
Details | Diff | Splinter Review
1.18 KB, patch
Details | Diff | Splinter Review
7.49 KB, patch
mattwoodrow
: review+
Details | Diff | Splinter Review
11.82 KB, patch
mattwoodrow
: review+
Details | Diff | Splinter Review
2.54 KB, patch
mattwoodrow
: review+
Details | Diff | Splinter Review
2.56 KB, patch
mattwoodrow
: review+
Details | Diff | Splinter Review
21.41 KB, patch
mattwoodrow
: review+
Details | Diff | Splinter Review
7.93 KB, patch
dbaron
: review+
Details | Diff | Splinter Review
1.17 KB, patch
dbaron
: review+
Details | Diff | Splinter Review
1.62 KB, patch
Details | Diff | Splinter Review
(Reporter)

Description

5 years ago
about 5% of our PresShell::Paint time is in BuidlDisplayList (14ms). It would be nice if this was cheaper.
(Reporter)

Updated

5 years ago
Blocks: 717774
Created attachment 612433 [details] [diff] [review]
patch to help performance testing on desktop

This patch forces display list construction and layer construction to run 100 times for every paint. It helps when profiling those phases on desktop.

I use this bookmarklet:
javascript:(function() { var d = document.createElement("div"); d.setAttribute("style", "position:absolute; top:0; left:0; width:2px; height:2px;"); document.body.appendChild(d); var iters = 0; var start = Date.now(); function doframe() { d.style.background = d.style.background ? "" : "black"; mozRequestAnimationFrame(doframe); if (++iters == 100) { alert((Date.now() - start) + "ms"); } } doframe(); })();
to force continuous tiny repaint of the page (cnn.com in this case) and report an elapsed time to paint 100 frames. On my laptop I'm getting results like this:
35861ms
38900ms
37763ms
38083ms
So each display-list construction and layer construction is taking 3-4ms maybe. This is for a browser window with outer dimensions 1146x1080.
Assignee: nobody → roc
36% of the profile is under BuildDisplayListForStackingContext (display list construction), 35% is under PaintRoot (mostly layer construction), and 25% under ComputeVisibilityForRoot.
Created attachment 612513 [details] [diff] [review]
Part 1: Cache bounds (union of all child item bounds) in nsDisplayWrapList

Results after this patch:
36880ms
34236ms
34375ms
Significant improvement.
Attachment #612513 - Flags: review?(matt.woodrow)
Comment on attachment 612513 [details] [diff] [review]
Part 1: Cache bounds (union of all child item bounds) in nsDisplayWrapList

Do we never change the list after create the wrap list item? Say in deleting items that aren't visible?
That's a good point, but it's not a problem because it's OK for the bounds to be conservative.

There is the issue that nsDisplayWrapList::GetOpaqueRegion uses GetBounds, which makes everything a bit confusing when GetBounds is conservative (although I think still correct). I make that clear by having it use mVisibleRect instead.
Comment on attachment 612513 [details] [diff] [review]
Part 1: Cache bounds (union of all child item bounds) in nsDisplayWrapList

Review of attachment 612513 [details] [diff] [review]:
-----------------------------------------------------------------

As long as it isn't possible to create an nsDisplayWrapList without ever calling GetBounds on it (and I don't think that it is), this looks good.
Attachment #612513 - Flags: review?(matt.woodrow) → review+
nsDisplayTransform::GetBounds() might not call GetBounds on its child list if we are prerendering. Don't know if that's worth worrying about though.

Updated

5 years ago
blocking-fennec1.0: --- → ?
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #5)
> There is the issue that nsDisplayWrapList::GetOpaqueRegion uses GetBounds,
> which makes everything a bit confusing when GetBounds is conservative
> (although I think still correct). I make that clear by having it use
> mVisibleRect instead.

Actually, I don't need to change that. The point is that GetOpaqueRegion returns a region where every visible pixel is opaque. It's OK for that region to include the bounds of items that were not opaque, but were also not visible. I will add a patch to change comments to make this clear.
Created attachment 613082 [details] [diff] [review]
part 0: fix comment
The patches here are on top of the patch in bug 733607.
Depends on: 733607
(In reply to Matt Woodrow (:mattwoodrow) from comment #7)
> nsDisplayTransform::GetBounds() might not call GetBounds on its child list
> if we are prerendering. Don't know if that's worth worrying about though.

No, I don't think it is.
Created attachment 613083 [details] [diff] [review]
Part 2: When nsOverflowClipWrapper is asked to clip a display item t hat is entirely inside the clip rect, don't bother wrapping the item

Results after this patch:
33300ms
32436ms
32106ms
Significant improvement.
Attachment #613083 - Flags: review?(matt.woodrow)
Created attachment 613084 [details] [diff] [review]
In DisplayBorderBackgroundOutline, when we have created an n sDisplayBackground item and it's themed, don't bother checking to create an nsDisplayBorder item

Results after this patch:
29630ms
30935ms
28158ms
28536ms
Significant improvement.
Attachment #613084 - Flags: review?(matt.woodrow)
The improvement in part 3 is probably not going to be seen on mobile since theming isn't important there.
Created attachment 613092 [details] [diff] [review]
Part 1 v2

This actually had a bad bug that triggered assertions: TryMerge implementations need to combine their mBounds. Moved the nsDisplayWrapList merging code to a MergeFrom helper.
Attachment #613092 - Flags: review?(matt.woodrow)
Created attachment 613096 [details] [diff] [review]
box-shadow patch

We create an nsDisplayBoxShadowOuter and an nsDisplayBoxShadowInner every time we see a box-shadow. We generally only need one of them (usually outer). This patch should let us avoid creating the other one.

For some reason I completely fail to understand, it seems to make things a bit slower :-(.
Attachment #612513 - Attachment is obsolete: true

Updated

5 years ago
blocking-fennec1.0: ? → beta+
Attachment #613092 - Flags: review?(matt.woodrow) → review+
Attachment #613084 - Flags: review?(matt.woodrow) → review+
Attachment #613083 - Flags: review?(matt.woodrow) → review+
Blocks: 741228

Comment 17

5 years ago
Can we land the r+'ed patches now and do the rest in follow up?
Sure.
https://hg.mozilla.org/integration/mozilla-inbound/rev/c6ac469f7247
https://hg.mozilla.org/integration/mozilla-inbound/rev/ef599d7f3f11
https://hg.mozilla.org/integration/mozilla-inbound/rev/70ef464f50e3
https://hg.mozilla.org/integration/mozilla-inbound/rev/1a204c38918e
[inbound merger: please leave this bug open.]
> [inbound merger: please leave this bug open.]

I've put [leave open] in the whiteboard;  please remove once all the patches have landed on inbound.
Whiteboard: [leave open]
Created attachment 613879 [details] [diff] [review]
Part 4: Create Inner/Outer box-shadow items only if we actually have inset/outset shadows

Looks like the performance issues were measurement error. This patch doesn't actually help much on the CNN page (the only box-shadows are in the Firefox chrome, and 2 of the 3 elements affected use both inner and outer shadows), but it would help on any page that used a lot of box-shadows, especially since inset shadows are rarely used on the Web.
Attachment #613096 - Attachment is obsolete: true
Attachment #613879 - Flags: review?(matt.woodrow)
One of these patches turned android mochitest-robocop perma-orange.
6 INFO TEST-UNEXPECTED-FAIL | test_bug720538 | Checking for iframe a few pixels from the right edge - Color rgba(0,130,0,255) not close enough to expected rgb(0,0,255)
Some Windows reftests are perma-orange as well. I'll back out.
Backed out parts 1, 2 and 3.
https://hg.mozilla.org/integration/mozilla-inbound/rev/4dc5567fc431

Updated

5 years ago
Depends on: 744384
Whiteboard: [leave open] → [leave open][layout]
https://hg.mozilla.org/mozilla-central/rev/c6ac469f7247
https://hg.mozilla.org/mozilla-central/rev/ef599d7f3f11
https://hg.mozilla.org/mozilla-central/rev/70ef464f50e3
https://hg.mozilla.org/mozilla-central/rev/1a204c38918e
(In reply to Ehsan Akhgari [:ehsan] from comment #24)
> https://hg.mozilla.org/mozilla-central/rev/c6ac469f7247
> https://hg.mozilla.org/mozilla-central/rev/ef599d7f3f11
> https://hg.mozilla.org/mozilla-central/rev/70ef464f50e3
> https://hg.mozilla.org/mozilla-central/rev/1a204c38918e

Backout: https://hg.mozilla.org/mozilla-central/rev/4dc5567fc431
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #20)
> Looks like the performance issues were measurement error. This patch doesn't
> actually help much on the CNN page (the only box-shadows are in the Firefox
> chrome, and 2 of the 3 elements affected use both inner and outer shadows),
> but it would help on any page that used a lot of box-shadows, especially
> since inset shadows are rarely used on the Web.

So what's the state of the issue here that's a beta blocker for Fennec native?
Part 2 caused reftest failures because the clip-rect is snapped to nearest pixel edges, so even if (before snapping) it contains the bounds of some item, after snapping some partially covered pixel of the item might be clipped out. So part 2 doesn't quite preserve rendering.

I'll land parts 1 and 3 again. They seem to be green on try.
(In reply to David Baron [:dbaron] from comment #26)
> So what's the state of the issue here that's a beta blocker for Fennec
> native?

The patches here should improve things a bit but probably not enough. I'm still working on this bug.
Relanded parts 1 and 3:
https://hg.mozilla.org/integration/mozilla-inbound/rev/007e4f53e361
https://hg.mozilla.org/integration/mozilla-inbound/rev/7623055c5171
Created attachment 614741 [details] [diff] [review]
Part 5: Simple tweaks to avoid wrapping empty lists and to reorder boolean tests
Attachment #613083 - Attachment is obsolete: true
Attachment #614741 - Flags: review?(matt.woodrow)

Comment 31

5 years ago
Matt, can you do these reviews asap please?
Attachment #613879 - Flags: review?(matt.woodrow) → review+
Attachment #614741 - Flags: review?(matt.woodrow) → review+
Comment on attachment 614741 [details] [diff] [review]
Part 5: Simple tweaks to avoid wrapping empty lists and to reorder boolean tests

Review of attachment 614741 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/generic/nsFrame.cpp
@@ +2163,5 @@
>    } else if (disp->IsFloating()) {
> +    if (!list.IsEmpty()) {
> +      rv = aLists.Floats()->AppendNewToTop(new (aBuilder)
> +          nsDisplayWrapList(aBuilder, child, &list));
> +      NS_ENSURE_SUCCESS(rv, rv);

Do we need to wrap when the list consists of a single item?
(In reply to Matt Woodrow (:mattwoodrow) from comment #32)
> ::: layout/generic/nsFrame.cpp
> @@ +2163,5 @@
> >    } else if (disp->IsFloating()) {
> > +    if (!list.IsEmpty()) {
> > +      rv = aLists.Floats()->AppendNewToTop(new (aBuilder)
> > +          nsDisplayWrapList(aBuilder, child, &list));
> > +      NS_ENSURE_SUCCESS(rv, rv);
> 
> Do we need to wrap when the list consists of a single item?

We can probably get away with not wrapping these at all, since nothing ever tries to dig into the Floats() list and stuff in it. I had a patch to do that but it didn't seem to make much difference and it's maybe a bit more risky than some of the other things I'm trying. I'll probably revive it later.
https://hg.mozilla.org/mozilla-central/rev/007e4f53e361
https://hg.mozilla.org/mozilla-central/rev/7623055c5171
https://hg.mozilla.org/mozilla-central/rev/c44c5a7141e3
DisplayBackgroundBorderOutline uses 7-8% of the profile, so I tried a patch that adds a frame state bit indicating that DisplayBBO doesn't need to do anything because the frame has no background/border/outline. Surprisingly it didn't help much because only about 10% of the 360-ish calls to DisplayBBO in a single display list construction pass fail to generate anything. I think this means that most frames that call DisplayBBO (and are visible, i.e. have nonempty overflow area) actually do have a background or border.
Created attachment 615239 [details] [diff] [review]
Part 6: avoid nsIFrame::GetOffsetTo calls

The call to nsIFrame::GetOffsetTo from nsDisplayItem::nsDisplayItem was using a significant chunk of the profile (around 7%). GetOffsetTo is expensive because we have to walk all the way up the (crossdoc) frame tree to the reference frame, which is typically the root frame of the root document.

This patch makes that go away by caching a frame/offset pair in nsDisplayListBuilder and efficiently updating that cache (without calling GetOffsetTo) in BuildDisplayListForChild. nsDisplayListBuilder::ToReferenceFrame gets a hit rate of over 99%. The remaining calls to GetOffsetTo in display list construction are mainly BuildDisplayListForChild entering an out-of-flow frame and setting up the cached frame/offset pair from scratch.

I'm switched to a saved copy of CNN.com so these numbers aren't quite comparable to the ones above:

Before:
27928ms
27074ms
26730ms
27351ms
26899ms

After:
25771ms
25403ms
25572ms

Significant improvement.
Checked in parts 4 and 5.
https://hg.mozilla.org/integration/mozilla-inbound/rev/1bb71108f4de
https://hg.mozilla.org/integration/mozilla-inbound/rev/7b426ec26c0a
Created attachment 615254 [details] [diff] [review]
Part 7: remove expensive nsIFrame::HasBorder checks

Turns out none of the sites that call nsIFrame::HasBorder really need to do all the work of getting the used-border. It's enough to just check nsStyleBorder. If our style doesn't have a border, then we're not going to draw one even if GetUsedBorder returns something non-zero (due to reflow not having run yet).

This is a small perf win, a bit hard to verify on my noisy environment, but the profiler shows it as an improvement in DisplayBackgroundBorderOutline.
Attachment #615254 - Flags: review?(dbaron)
Attachment #615239 - Flags: review?(matt.woodrow)
Created attachment 615305 [details] [diff] [review]
Part 8: speed up SortByZOrder

nsDisplayList::SortByZOrder was about 4% of the profile. This patch reduces it to 1%. Most of the cost was comparing the order of the display items in the content tree, but we don't really need to do that; where we call SortByZOrder, the items will already have been put into the PositionedDescendants() list in frame order (usually via placeholders), which follows content order.
Attachment #615305 - Flags: review?(dbaron)
I have a bunch of patches which make all the display item creation methods, including BuildDisplayList, return void instead of nsresult, removing a lot of conditional branches. I suspect they probably won't help perf much, although they simplify code and we should land them. But they're big patches so I'll hold off until after the next uplift.
https://hg.mozilla.org/mozilla-central/rev/1bb71108f4de
https://hg.mozilla.org/mozilla-central/rev/7b426ec26c0a
Comment on attachment 615239 [details] [diff] [review]
Part 6: avoid nsIFrame::GetOffsetTo calls

Review of attachment 615239 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me, the removal of mEffectsFrame probably should be a separate patch though.
Attachment #615239 - Flags: review?(matt.woodrow) → review+
Depends on: 745934
Comment on attachment 615254 [details] [diff] [review]
Part 7: remove expensive nsIFrame::HasBorder checks

r=dbaron
Attachment #615254 - Flags: review?(dbaron) → review+
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #39)
> nsDisplayList::SortByZOrder was about 4% of the profile. This patch reduces
> it to 1%. Most of the cost was comparing the order of the display items in
> the content tree, but we don't really need to do that; where we call
> SortByZOrder, the items will already have been put into the
> PositionedDescendants() list in frame order (usually via placeholders),
> which follows content order.

Is that true if, say, one of them was inside a float or abs pos element and one of them wasn't?  (nsBlockFrame::BuildDisplayList definitely builds display lists for floats not through placeholders.)
Whiteboard: [leave open][layout] → [leave open][layout][has patches][needs review dbaron]
(In reply to David Baron [:dbaron] from comment #44)
> (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #39)
> > nsDisplayList::SortByZOrder was about 4% of the profile. This patch reduces
> > it to 1%. Most of the cost was comparing the order of the display items in
> > the content tree, but we don't really need to do that; where we call
> > SortByZOrder, the items will already have been put into the
> > PositionedDescendants() list in frame order (usually via placeholders),
> > which follows content order.
> 
> Is that true if, say, one of them was inside a float or abs pos element and
> one of them wasn't?

Yes, I think so.

> (nsBlockFrame::BuildDisplayList definitely builds
> display lists for floats not through placeholders.)

No, it doesn't, it builds them through placeholders (except for the floats pushed into it from prev-in-flows).
Checked in parts 6 (mEffectsFrame change split off into part 5.9) and 7:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a18c49118152
https://hg.mozilla.org/integration/mozilla-inbound/rev/5d290ee96e81
https://hg.mozilla.org/integration/mozilla-inbound/rev/680493f701d9
Comment on attachment 615305 [details] [diff] [review]
Part 8: speed up SortByZOrder

ok, r=dbaron

(I hope we have good test coverage for the edge cases here, even so.  They're interesting cases to test...)
Attachment #615305 - Flags: review?(dbaron) → review+
https://hg.mozilla.org/mozilla-central/rev/a18c49118152
https://hg.mozilla.org/mozilla-central/rev/5d290ee96e81
https://hg.mozilla.org/mozilla-central/rev/680493f701d9
Backed out for reftest failures: https://hg.mozilla.org/integration/mozilla-inbound/rev/8c8b6bbaf7cd
That was part 8.

The reftest failure is in text-overflow/anonymous-block.html; a marker that's not visible before part 8 is now visible. We add overflow markers at the end of the PositionedDescendants() list, during DisplayLine for each line. Before part 8, we'd eventually sort the PositionedDescendants() into content order, and because the marker content is the block itself, this moved the markers to be before (below) all PositionedDescendants() contributed by the block descendants. In this testcase a marker is moved to be below a rel-pos element contributed by a previous line, and the rel-pos element covers it. With part 8, the marker isn't moved and the rel-pos element doesn't cover it anymore.

I think probably markers for line N should be able to appear on top of PositionedDescendants() contributed by lines < N. So, I'll just change the test.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #50)
> I think probably markers for line N should be able to appear on top of
> PositionedDescendants() contributed by lines < N. So, I'll just change the
> test.

Although it's sort of disturbing to have things be out of content order like this.

Why not just put the marker at the end of the Content() list, after the in-flow non-positioned content on the line? That leads to simpler code too.
> Why not just put the marker at the end of the Content() list

Wouldn't that put the ellipsis below backgrounds of the block's inline children?

I think you can add the markers to the bottom of PositionedDescendants() instead of
top (in TextOverflow::CreateMarkers) to keep the current text-overflow rendering.
(In reply to Mats Palmgren [:mats] from comment #52)
> Wouldn't that put the ellipsis below backgrounds of the block's inline
> children?

It would put the ellipsis below backgrounds of inline children on the following lines. That's OK isn't it?

> I think you can add the markers to the bottom of PositionedDescendants()
> instead of
> top (in TextOverflow::CreateMarkers) to keep the current text-overflow
> rendering.

That wouldn't quite work because it would put the ellipses below PostionedDescendants() of earlier siblings of the block too.
> It would put the ellipsis below backgrounds of inline children on the
> following lines.

That's not what I'm seeing.  Backgrounds of child inlines goes on the Content()
list in DisplayLine:
http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsBlockFrame.cpp#6211
so if we put the markers on that list and someone eventually content-sorts
that list the markers will be sorted below.  Your suggested fix breaks:

data:text/html,<div style="overflow:hidden;text-overflow:ellipsis;width:50px;"><span style="padding:0 40px;background:lime"></span></div>

(and many other text-overflow reftests)

> That wouldn't quite work because it would put the ellipses below
> PostionedDescendants() of earlier siblings of the block too.

The lists that the TextOverflow works on is a local collection created in
DisplayLine, I don't see how my suggested change can affect earlier siblings
of the block.
Created attachment 616824 [details] [diff] [review]
my suggested part 9 for clarity

This pass layout/reftests/text-overflow/reftest.list locally on Linux.
For a future correct solution we should probably collect the markers for all
lines on a special list, and add them after all lines are done, in
nsBlockFrame::BuildDisplayList.  It doesn't seem worth the risk doing that
in this cycle though.
(In reply to Mats Palmgren [:mats] from comment #54)
> That's not what I'm seeing.  Backgrounds of child inlines goes on the
> Content()
> list in DisplayLine:
> http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsBlockFrame.
> cpp#6211
> so if we put the markers on that list and someone eventually content-sorts
> that list the markers will be sorted below.  Your suggested fix breaks:
> 
> data:text/html,<div
> style="overflow:hidden;text-overflow:ellipsis;width:50px;"><span
> style="padding:0 40px;background:lime"></span></div>
> 
> (and many other text-overflow reftests)

Yeah, I suppose that's true if someone content-sorts the list.

> > That wouldn't quite work because it would put the ellipses below
> > PostionedDescendants() of earlier siblings of the block too.
> 
> The lists that the TextOverflow works on is a local collection created in
> DisplayLine, I don't see how my suggested change can affect earlier siblings
> of the block.

I misunderstood which PositionedDescendants list you were talking about. I think you're right.
(In reply to Mats Palmgren [:mats] from comment #55)
> Created attachment 616824 [details] [diff] [review]
> my suggested part 9 for clarity
> 
> This pass layout/reftests/text-overflow/reftest.list locally on Linux.

Actually no, I was right the first time. mMarkerList isn't a list from the DisplayLine collection, it's the PositionedDescendants() list passed into nsBlockFrame::BuildDisplayList.

I'm not going to land this again for now, so there's no urgency. When I get back to it, I'll try putting the ellipses in the PositionedDescendants() list before all the regular positioned descendants of the block.

Comment 59

5 years ago
Was review.

Roc, if we have all we're going to take for now, can we close this bug to get it off the blocker list and file separate follow up bugs?
Whiteboard: [leave open][layout][has patches][needs review dbaron] → [leave open][layout][has patches]
Sure.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Filed bug 748646 to follow-up on comment 58 (put text-overflow markers in content order).
Target Milestone: --- → mozilla14
cnn.com is loading pretty fast on the latest Nightly and Aurora builds. 
Closing bug as verified fixed on:

Firefox 15.0a1 (2012-05-29)
Firefox 14.0a2 (2012-05-29)

Device: Galaxy Nexus
OS: Android 4.0.2
Status: RESOLVED → VERIFIED
status-firefox14: --- → verified
status-firefox15: --- → verified
Blocks: 772690
(In reply to Mats Palmgren [:mats] from comment #61)
> Filed bug 748646 to follow-up on comment 58 (put text-overflow markers in
> content order).

...and now that that's been addressed, I spun off bug 772690 to re-land part 8 from here (the patch that was backed out in comment 49)
Whiteboard: [leave open][layout][has patches] → [layout][has patches]
Version: unspecified → Trunk
Whiteboard: [layout][has patches] → [layout]
You need to log in before you can comment on or make changes to this bug.