Closed
Bug 1485420
Opened 6 years ago
Closed 6 years ago
Remove extra clips on items
Categories
(Core :: Graphics: WebRender, enhancement, P2)
Tracking
()
RESOLVED
FIXED
mozilla64
Tracking | Status | |
---|---|---|
firefox64 | --- | fixed |
People
(Reporter: jrmuizel, Assigned: aosmond)
References
Details
Attachments
(2 files, 5 obsolete files)
10.39 KB,
patch
|
mstange
:
review+
|
Details | Diff | Splinter Review |
26.22 KB,
patch
|
aosmond
:
review+
|
Details | Diff | Splinter Review |
This causes a bunch of display list bloat and work down the line for WebRender.
For example, each background image has it's own clip chain instead of using the internal clip.
Reporter | ||
Updated•6 years ago
|
Assignee: nobody → aosmond
Assignee | ||
Comment 2•6 years ago
|
||
Distilled from the discussion between mstange and jrmuizel: Each WebRender display item already has a clip rect associated with it. Currently this may be duplicated in the clip chain. We should store the leaf of the clip chain in the display item instead, assuming its ASR matches the item, and it isn't a rounded rect; each display item will need to be updated to take this into consideration. Some items may already be using the clip rect, in which case it perform an intersection with the leaf. If there is no leaf, it should just use the bounds (like it does today). The clip chain itself should have excluded the leaf. If the ASR doesn't match or it is rounded, it should work as it does today.
In practical terms this means we will change ClipManager::BeginClip to filter out the leaf from the clip chain if possible, and call a new method to get the leaf of the clip chain directly from the display item, yielding it if it would have been taken out of the clip chain already. Special attention will need to be paid to the container display items (e.g. nsDisplayTransform, Opaque, etc) because instead of a normal display item, they have a stacking context that will also need to take the clip chain leaf.
Assignee | ||
Comment 3•6 years ago
|
||
Comment 2 continued, reasoning behind this:
1) WebRender is better at dealing with inline clips, vs an associated clip chain
2) It will make the display lists smaller, because the clip chain shrunk
3) We won't have to iterate over as many clips later on as the clip chain is smaller
Assignee | ||
Comment 4•6 years ago
|
||
Still missing all of the container display items.
Reporter | ||
Comment 5•6 years ago
|
||
Comment on attachment 9003487 [details] [diff] [review]
WIP v1
Review of attachment 9003487 [details] [diff] [review]:
-----------------------------------------------------------------
::: gfx/layers/wr/ClipManager.h
@@ +72,5 @@
> + static LayoutDeviceRect GetItemClipRect(nsDisplayItem* aItem,
> + const LayoutDeviceRect& aBounds);
> +
> + static wr::LayoutRect GetItemClipRoundedRect(nsDisplayItem* aItem,
> + const LayoutDeviceRect& aBounds)
GetRoundedItempClipRect() avoids the reader thinking this is returning a rect with rounded corners.
::: layout/tables/nsTableFrame.cpp
@@ +7550,5 @@
> LayoutDeviceRect borderRect = LayoutDeviceRect::FromUnknownRect(NSRectToRect(param->mBorderRect + aOffset,
> param->mAppUnitsPerDevPixel));
>
> wr::LayoutRect roundedRect = wr::ToRoundedLayoutRect(borderRect);
> + wr::LayoutRect clipRect = layers::ClipManager::GetItemClipRoundedRect(aItem, borderRect);
Instead of passing around aItem here and calling GetItemClip() multiple times we can probably just do it up front and pass around the clip rect.
Reporter | ||
Comment 6•6 years ago
|
||
Comment on attachment 9003487 [details] [diff] [review]
WIP v1
Kats, if you have the time, it'd be nice to know if you have any opposition to this.
Flags: needinfo?(kats)
Comment 7•6 years ago
|
||
Do we have numbers that give us a sense of how much perf improvement we get from this? The change can be described in simple conceptual terms but I suspect this is going to increase the code complexity dramatically and I'd like to avoid that if the perf win is small. Leaving ni on me since I'd like to look at Andrew's patch in more detail as well.
Assignee | ||
Comment 8•6 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) (parental leave) from comment #7)
> Do we have numbers that give us a sense of how much perf improvement we get
> from this? The change can be described in simple conceptual terms but I
> suspect this is going to increase the code complexity dramatically and I'd
> like to avoid that if the perf win is small. Leaving ni on me since I'd like
> to look at Andrew's patch in more detail as well.
I should have something more complete in the next day or so.
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fdab7d65aeb2cbd5862e334c9aab6f254cfc8ab6
With a bit of extra code to disable this path on the container display items. Only a couple of reftest failures so hopefully not making a meaningful difference on the results. Going to my gmail inbox, my scene-1.0.ron file from the capture went from 4.1 MB to 1.7 MB. I'll collect some proper performance numbers once I fix the reftests.
Comment 9•6 years ago
|
||
Comment on attachment 9003487 [details] [diff] [review]
WIP v1
Review of attachment 9003487 [details] [diff] [review]:
-----------------------------------------------------------------
The way I would have done this patch is in ClipManager::BeginItem, compute the item clip rect as a Maybe<LayoutDeviceRect> and then push it onto a stack in the wr::DisplayListBuilder (the call sites for BeginItem can provide a handle to it). Then, instead of changing the call sites for every display item, I would do the clip computation inside the various wr::DisplayListBuilder::PushXXX functions using the topmost thing on the stack. If the topmost item is a Nothing then just use the rect passed in, otherwise intersect the provided rect with the topmost.
That would make the change less brittle in that it's easier to enumerate all the PushXXX functions in wr::DisplayListBuilder than it is to enumerate all their call sites and make sure you don't miss anything. Also, for gecko display items that generate multiple WR display items, you're currently calling GetItemClipRect multiple times with the exact same args which is kind of inefficient. BeginItem already has the clip in question so it should cheap/free to do there.
Comment 10•6 years ago
|
||
But assuming the patch doesn't get much more complicated than this, it seems like it should be worth doing. The scene file reduction probably won't translate to an equivalent perf improvement but even so reducing the display list size that drastically seems worthwhile.
Flags: needinfo?(kats)
Assignee | ||
Comment 11•6 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) (parental leave) from comment #9)
> Comment on attachment 9003487 [details] [diff] [review]
> WIP v1
>
> Review of attachment 9003487 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> The way I would have done this patch is in ClipManager::BeginItem, compute
> the item clip rect as a Maybe<LayoutDeviceRect> and then push it onto a
> stack in the wr::DisplayListBuilder (the call sites for BeginItem can
> provide a handle to it). Then, instead of changing the call sites for every
> display item, I would do the clip computation inside the various
> wr::DisplayListBuilder::PushXXX functions using the topmost thing on the
> stack. If the topmost item is a Nothing then just use the rect passed in,
> otherwise intersect the provided rect with the topmost.
>
> That would make the change less brittle in that it's easier to enumerate all
> the PushXXX functions in wr::DisplayListBuilder than it is to enumerate all
> their call sites and make sure you don't miss anything. Also, for gecko
> display items that generate multiple WR display items, you're currently
> calling GetItemClipRect multiple times with the exact same args which is
> kind of inefficient. BeginItem already has the clip in question so it should
> cheap/free to do there.
Yeah for the multiple calls, it was just the easiest way to get this working quickly :). I like your overall approach better, I'll see about rewriting it that way.
Comment 12•6 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) (parental leave) from comment #9)
> The way I would have done this patch is in ClipManager::BeginItem, compute
> the item clip rect as a Maybe<LayoutDeviceRect> and then push it onto a
> stack in the wr::DisplayListBuilder (the call sites for BeginItem can
> provide a handle to it). Then, instead of changing the call sites for every
> display item, I would do the clip computation inside the various
> wr::DisplayListBuilder::PushXXX functions using the topmost thing on the
> stack. If the topmost item is a Nothing then just use the rect passed in,
> otherwise intersect the provided rect with the topmost.
I really like this suggestion.
Assignee | ||
Comment 13•6 years ago
|
||
This should pass all the reftests. I'll begin rewriting it as kats suggested.
try: https://treeherder.mozilla.org/perf.html#/comparechooser?newProject=try&newRevision=3d2e8cfa4d62456801a5d5e4f971b49f66c1d790
Attachment #9003487 -
Attachment is obsolete: true
Assignee | ||
Comment 14•6 years ago
|
||
Assignee | ||
Comment 15•6 years ago
|
||
Attachment #9004206 -
Attachment is obsolete: true
Assignee | ||
Comment 16•6 years ago
|
||
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fcd7e7c5b56687dd25a076aa1857004e3ea80819
This should be green on try for real this time. I should rejig if possible to make it work more like ClipManager::mItemClipStack does today (I had to add ClipManager::EndItem as an example, ideally I would not).
Reporter | ||
Updated•6 years ago
|
Priority: -- → P2
Assignee | ||
Comment 17•6 years ago
|
||
Attachment #9004624 -
Attachment is obsolete: true
Assignee | ||
Comment 18•6 years ago
|
||
green try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4ce78b2c328136b7757f0d55765ea777e44c094f
I merged the leaf extraction with ItemClips rather than having it broken out separately.
Attachment #9004625 -
Attachment is obsolete: true
Assignee | ||
Comment 19•6 years ago
|
||
Although the scene.rson file size can be cut in half, the actual binary representation of the display list appears to be reduced (caveat: on the few pages I tried) by around 6-7%. The rate of us merging in the clip rect leaf seems to be high, anywhere from 66% in general to 95% on gmail. The reuse of the clip chain on the clip stack (clips.HasSameInputs) appears to be reduced slightly, by perhaps a few percent from say 70%ish to 65%ish. There may be opportunities to improve this -- the leaf could be different, while the clip chain after stripping the leaf is the same, so we just need to regenerate the leaf rect calculation.
Assignee | ||
Updated•6 years ago
|
Attachment #9004839 -
Flags: review?(mstange)
Assignee | ||
Updated•6 years ago
|
Attachment #9004840 -
Flags: review?(mstange)
Updated•6 years ago
|
Blocks: stage-wr-trains
Comment 20•6 years ago
|
||
Comment on attachment 9004839 [details] [diff] [review]
0001-Bug-1485420-Part-1.-Expose-WebRender-bindings-to-mer.patch, v2
Review of attachment 9004839 [details] [diff] [review]:
-----------------------------------------------------------------
::: gfx/webrender_bindings/WebRenderAPI.h
@@ +540,5 @@
> std::unordered_map<layers::FrameMetrics::ViewID, wr::WrClipId> mScrollIds;
>
> + // Contains the current leaf of the clip chain to be merged with the
> + // display item's clip rect when pushing an item. May be set to Nothing() if
> + // there is clip rect to merge with.
Should this say "if there is *no* clip rect to merge with"?
::: gfx/webrender_bindings/WebRenderTypes.h
@@ +359,5 @@
> + r.origin.y = std::max(aRect.origin.y, aOther.origin.y);
> + r.size.width = std::min(aRect.origin.x - r.origin.x + aRect.size.width,
> + aOther.origin.x - r.origin.x + aOther.size.width);
> + r.size.height = std::min(aRect.origin.y - r.origin.y + aRect.size.height,
> + aOther.origin.y - r.origin.y + aOther.size.height);
I think this would be clearer if the " - r.origin.x/y" was outside the std::min(). Then it'll be more obvious that we're taking the minimum of the right/bottom edge, and then convert the edge to the size.
Attachment #9004839 -
Flags: review?(mstange) → review+
Assignee | ||
Comment 21•6 years ago
|
||
(In reply to Markus Stange [:mstange] from comment #20)
> Comment on attachment 9004839 [details] [diff] [review]
> 0001-Bug-1485420-Part-1.-Expose-WebRender-bindings-to-mer.patch, v2
>
> Review of attachment 9004839 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: gfx/webrender_bindings/WebRenderAPI.h
> @@ +540,5 @@
> > std::unordered_map<layers::FrameMetrics::ViewID, wr::WrClipId> mScrollIds;
> >
> > + // Contains the current leaf of the clip chain to be merged with the
> > + // display item's clip rect when pushing an item. May be set to Nothing() if
> > + // there is clip rect to merge with.
>
> Should this say "if there is *no* clip rect to merge with"?
>
Yes, fixed.
> ::: gfx/webrender_bindings/WebRenderTypes.h
> @@ +359,5 @@
> > + r.origin.y = std::max(aRect.origin.y, aOther.origin.y);
> > + r.size.width = std::min(aRect.origin.x - r.origin.x + aRect.size.width,
> > + aOther.origin.x - r.origin.x + aOther.size.width);
> > + r.size.height = std::min(aRect.origin.y - r.origin.y + aRect.size.height,
> > + aOther.origin.y - r.origin.y + aOther.size.height);
>
> I think this would be clearer if the " - r.origin.x/y" was outside the
> std::min(). Then it'll be more obvious that we're taking the minimum of the
> right/bottom edge, and then convert the edge to the size.
I have done so, but for reference, I copied this from:
https://searchfox.org/mozilla-central/rev/d4ef4e9747133aa2914aca2a15cf9df1e42a6aa0/gfx/2d/BaseRect.h#144
Attachment #9004839 -
Attachment is obsolete: true
Attachment #9007784 -
Flags: review+
Comment 22•6 years ago
|
||
Comment on attachment 9004840 [details] [diff] [review]
0002-Bug-1485420-Part-2.-Make-ClipManager-trim-clip-chain.patch, v2
Review of attachment 9004840 [details] [diff] [review]:
-----------------------------------------------------------------
::: gfx/layers/wr/ClipManager.cpp
@@ +158,5 @@
> }
>
> + // We may be able to reduce the number of display items (e.g. clips) by
> + // combining the leaf of the clip chain with the clip rect of the display
> + // item. Container display items are not currently supported because the clip
To me, the phrasing "We may be able to" makes it sound like this is a theoretical future optimization, and not something that we're doing here. How about phrasing it as "In most cases we can combine the leaf of the clip chain with the clip rect of the display item. This reduces the number of clip items, which avoids some overhead further down the pipeline."
@@ +193,5 @@
> + default:
> + separateLeaf = true;
> + break;
> + }
> + }
I thought about how to write this in a prettier way but I couldn't come up with anything I liked. So what you have is fine.
Attachment #9004840 -
Flags: review?(mstange) → review+
Comment 23•6 years ago
|
||
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a3ea43b5e6d8
Part 1. Expose WebRender bindings to merge a clip rect with a display item. r=mstange
https://hg.mozilla.org/integration/mozilla-inbound/rev/302dcd813942
Part 2. Make ClipManager trim clip chains which can be merged with the display item. r=mstange
Comment 24•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/a3ea43b5e6d8
https://hg.mozilla.org/mozilla-central/rev/302dcd813942
Status: NEW → RESOLVED
Closed: 6 years ago
status-firefox64:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Updated•6 years ago
|
status-firefox63:
affected → ---
Comment 25•6 years ago
|
||
a talos improvement:
== Change summary for alert #15933 (as of Sat, 15 Sep 2018 01:52:35 GMT) ==
Improvements:
3% displaylist_mutate linux64-qr opt e10s stylo 7,790.08 -> 7,551.09
3% displaylist_mutate windows10-64-qr opt e10s stylo 6,867.71 -> 6,695.37
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=15933
You need to log in
before you can comment on or make changes to this bug.
Description
•