Open Bug 1017298 Opened 10 years ago Updated 2 years ago

Poor layerization during TART resulting in a lot of buffer resizes

Categories

(Core :: Graphics: Layers, defect)

29 Branch
x86
macOS
defect

Tracking

()

People

(Reporter: mattwoodrow, Unassigned)

References

Details

Attachments

(2 files)

After we close a tab, the remaining tabs then animate to larger widths to take up the newly available space in the tab strip. This appears to cause a lot of buffer resizes, and  Bas thinks this may be hurting us in our TART scores.

Example (trimmed) layer trees look like:

Before:
    ClientThebesLayer (0x8c46b78) [visible=< (x=0, y=0, w=1781, h=46); (x=0, y=86, w=1781, h=1); (x=1, y=87, w=1779, h=988); >] [valid=< (x=0, y=0, w=1781, h=46); (x=0, y=86, w=1781, h=1); (x=1, y=87, w=1779, h=988); >]
    ClientThebesLayer (0x176f3cc8) [transform=[ 1 0; 0 1; -41 16; ]] [visible=< (x=114, y=0, w=1203, h=30); (x=1414, y=0, w=3, h=30); >] [valid=< (x=114, y=0, w=1203, h=30); (x=1414, y=0, w=3, h=30); >]
    ClientThebesLayer (0x17b06d00) [transform=[ 1 0; 0 1; -41 16; ]] [visible=< (x=64, y=0, w=1451, h=30); >] [valid=< (x=64, y=0, w=1451, h=30); >]
    
    ClientThebesLayer (0x1759c3b0) [visible=< (x=0, y=45, w=1781, h=41); >] [valid=< (x=0, y=45, w=1781, h=41); >]
    ClientThebesLayer (0x17dbe4c0) [transform=[ 1 0; 0 1; -41 16; ]] [visible=< (x=1500, y=0, w=130, h=31); >] [valid=< (x=1500, y=0, w=130, h=31); >]

After:
    ClientThebesLayer (0x8c46b78) [visible=< (x=0, y=0, w=1781, h=46); (x=0, y=86, w=1781, h=1); >] [valid=< (x=0, y=0, w=1781, h=46); (x=0, y=86, w=1781, h=1); (x=1, y=87, w=1779, h=988); >]
    ClientThebesLayer (0x176f3cc8) [transform=[ 1 0; 0 1; 1 16; ]] [visible=< (x=119, y=0, w=1275, h=30); >] [valid=< (x=117, y=0, w=1277, h=30); >]
    ClientThebesLayer (0x17b06d00) [transform=[ 1 0; 0 1; 1 16; ]] [visible=< (x=15, y=0, w=1484, h=7); (x=15, y=7, w=1484, h=18); (x=1614, y=7, w=16, h=18); (x=15, y=25, w=1484, h=5); >] [valid=< (x=15, y=0, w=1484, h=7); (x=15, y=7, w=1484, h=18); (x=1614, y=7, w=16, h=18); (x=15, y=25, w=1484, h=5); >]
    

    ClientThebesLayer (0x1759c3b0) [visible=< (x=0, y=45, w=1781, h=41); >] [valid=< (x=0, y=45, w=1781, h=41); >]
    ClientThebesLayer (0x17dbe4c0) [transform=[ 1 0; 0 1; 1 16; ]] [visible=< (x=1483, y=0, w=136, h=31); >] [valid=< (x=1483, y=0, w=136, h=31); >]


Where 0x17b06d00 resized 1451 -> 1615, and 0x17dbe4c0 resized 130 -> 136.

Getting consecutive ThebesLayers means that we have content using multiple active geometry roots. I doubt we have any active scrolling going on, so this we're probably making things active due to top/left changes.

One option to fix this would be to disable the active geometry root for content that also has a changing size. That might not always work though, since there's no guarantee we'd immediately flatten it into the background. Other layers can get between the previously active layer and it's background and we'd still get the same layer tree. We could try really hard and force flattening (like we do with BasicLayers to avoid component alpha layers), but there's a whole bunch of performance pitfalls there too.

The other options would be try detect the size changes in the buffer allocation code (RotatedContentBuffer) and allocate some extra size as a buffer. I guess we run the risk of jankyness there, since we'll hit the slower resize path every N paints instead of every paint.
Hmm.. an idea. Tabs and the tab strip are a special case; it's *the* primary UI that we have that users interact with often, and the main UI that has lots of frequent animation.  Conveniently, we also know the min/max size of the tabs.  Can we just add a surface cache for this size range?  The min size of a tab is 1/2 its max size -- if we just keep a cache of a dozen or so max-tab-size surfaces (and never reallocate down if it's within the appropriate range), we should be able to avoid all of these problems at negligible performance cost.
(In reply to Matt Woodrow (:mattwoodrow) from comment #0)
> After we close a tab, the remaining tabs then animate to larger widths to
> take up the newly available space in the tab strip. This appears to cause a
> lot of buffer resizes, and  Bas thinks this may be hurting us in our TART
> scores.

All of TART's animations are with one pinned tab, and then only one more tab which is animated and tested, and the browser window is wide enough (1024x768 IRC) to not let the tested tab bump into the end of the tabstrip.

So I can't see how this issue (which is quite possibly very real) is reflected in TART numbers.

OTOH, TART can measure such case as well - when running it manually: open few tabs such that the tabstrip is full of them, and then run one of the animation sub-tests.

But this doesn't happen during "normal" talos runs of TART.
Also, I encourage anyone who considers TART numbers, to run it at least once locally.

At this link you'll find about half a page which describes what TART measures and how, and at the bottom of it simple instructions on how to use it locally (just install the addon and visit a URL) and a link to install the addon:

https://wiki.mozilla.org/Buildbot/Talos/Tests#TART.2FCART

And do ping me with whatever concerns you have about it. I'm willing to modify anything you think is important.
(In reply to Avi Halachmi (:avih) from comment #2)
> 
> All of TART's animations are with one pinned tab, and then only one more tab
> which is animated and tested, and the browser window is wide enough
> (1024x768 IRC) to not let the tested tab bump into the end of the tabstrip.
> 
> So I can't see how this issue (which is quite possibly very real) is
> reflected in TART numbers.

Well that explains why my attempts to fix this didn't have any effect on the results.
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #1)
> Hmm.. an idea. Tabs and the tab strip are a special case; it's *the* primary
> UI that we have that users interact with often, and the main UI that has
> lots of frequent animation.  Conveniently, we also know the min/max size of
> the tabs.  Can we just add a surface cache for this size range?  The min
> size of a tab is 1/2 its max size -- if we just keep a cache of a dozen or
> so max-tab-size surfaces (and never reallocate down if it's within the
> appropriate range), we should be able to avoid all of these problems at
> negligible performance cost.

Yeah, that's quite a nice idea. By the looks of things, the majority of the tabs are all within a single layer though, except for the active layer. The larger layer (~1500px wide) wouldn't hit this cache.
When we resize a buffer, we copy all the contents from the old to the new one, but we don't take invalidation into account.

Clipping to the valid region should prevent us from copying pixels that will be overwritten later.
This is a pretty rough POC.

It detects when we resize a buffer twice in a row and pads out the buffer size to accommodate future size changes.

Unfortunately the combination of these two changes doesn't seem to have any effect on our talos results:
https://tbpl.mozilla.org/?tree=Try&rev=f3680b8b9afe
(In reply to Matt Woodrow (:mattwoodrow) from comment #4)
> Well that explains why my attempts to fix this didn't have any effect on the
> results.

Yes. However, if you run TART locally, you can test this scenario very easily and get instant feedback on your changes - run one of the sub-tests after you've manually added several tabs. Just install the addon, set 2 prefs (for ASAP if you want), and choose animations to run and measure.

It displays all the results locally, including aggregates/averages of each specific animation (and even all individual intervals), so no need for extra tools to examine the results.
Also, it might be worth noting that while I do believe the regression numbers, I don't claim to put any specific weight/priority to these numbers.

It's obviously desirable to not regress, but how important the regression[s] is or how much effort and resources we should put into reducing it is beyond my pay-grade.

The bug assignee didn't login in Bugzilla in the last 7 months, so the assignee is being reset.

Assignee: matt.woodrow → nobody
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: