Closed Bug 1247979 Opened 4 years ago Closed 4 years ago

scrolling-boxes is really slow with apz

Categories

(Core :: Layout, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
e10s + ---
firefox45 --- unaffected
firefox46 --- wontfix
firefox47 --- fixed
firefox48 --- verified

People

(Reporter: jimm, Assigned: jrmuizel)

References

(Blocks 3 open bugs, )

Details

(Keywords: perf, Whiteboard: [gfx-noted])

Attachments

(3 files, 2 obsolete files)

testcase:

http://people.mozilla.org/~roc/scrolling-boxes.html

<The_8472> speaking of performance, this is like 10x slower under e10s: http://people.mozilla.org/~roc/scrolling-boxes.html
<RyanVM> 40x slower than Chrome on my system
<RyanVM> and yeah, non-e10s is 4x slower than Chrome
<RyanVM> it's APZ
<chutten> but fixed backgrounds are the devil
<jimm> sounds similar to bug 1242969
<RyanVM> e10s w/o APZ is basically the same (if not slightly faster) than non-e10s
Blocks: e10s-perf
We're doing a lot of repainting here and we probably shouldn't. Shrinking the displayport significantly speeds this up.
Component: Panning and Zooming → Layout
Keywords: perf
Whiteboard: [gfx-noted]
Profile at [1] (taken from Nightly, where the test takes about ~11 seconds vs ~1 second on Chrome).

This is spending about 35% of the time in nsRegion::ScaleToOutsidePixels. Jeff, can that be pixman code be optimized any more? We might also be building some absurd region containing a ton of 10x10 rects and then doing region operations on it.

[1] https://cleopatra.io/#report=2258ab3fc58453546fcbbe144826f43730b293e7
Flags: needinfo?(jmuizelaar)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #2)
> Profile at [1] (taken from Nightly, where the test takes about ~11 seconds
> vs ~1 second on Chrome).
> 
> This is spending about 35% of the time in nsRegion::ScaleToOutsidePixels.
> Jeff, can that be pixman code be optimized any more? We might also be
> building some absurd region containing a ton of 10x10 rects and then doing
> region operations on it.
> 
> [1] https://cleopatra.io/#report=2258ab3fc58453546fcbbe144826f43730b293e7

ScaleToOutsidePixels can be optimized a lot. I will do so.
Flags: needinfo?(jmuizelaar)
Attached patch Optimize ScaleToOutsidePixels (obsolete) — Splinter Review
This does a union of all the rectangles at once, instead of one-by-one. It also cuts down a lot on allocation churn.

It would be possible to avoid an additional allocation/copy if we change the interface of this function to mutate the existing region instead of returning a new one.

Note, I have not tested the correctness of this patch, but it should be ok.
Attachment #8718926 - Attachment is obsolete: true
On my local debug build that patch improves performance on this page by ~28% (sample size of 1). We should get it landed and run another profile.
Attached patch Avoid a bunch of heap allocation (obsolete) — Splinter Review
This keeps around a temporary nsRegion so that we can avoid modifying these regions in place. Modifying regions in place forces a new allocation and we'd rather not.
This time without other changes.
Attachment #8718957 - Attachment is obsolete: true
Alternatively we could use AccumRegions for these regions. See bug 1248044.
Still slow with e10s.  About 10.5 seconds.
NI on myself to get another profile once this is in a Nightly. I don't want to do an opt build locally.
Flags: needinfo?(bugmail.mozilla)
Keywords: leave-open
On the latest nightly on OS X for me it takes ~5700ms compared to Chrome's ~900ms. A huge improvement already. Here's a new profile: https://cleopatra.io/#report=aaf726c63fcbd881eac6de2c739e72fe5102628c - we're still spending a lot of time in pixman region operations, for example around 5.4% of time in PaintedLayerData::AccumulateEventRegions which is all in _moz_pixman_region32_union.
Actually that 5.4% goes up to like 22% if you look at a subset of the profile that includes many paints trims out the empty space on the beginning/end of the profile. Jeff, can you take a look at this profile and see if there is more that can be done?
Flags: needinfo?(jmuizelaar)
I expect that using the AccumRegions from bug 1248044 should help with the rest of the time. I'm on PTO this week but I can look at writing a patch the week after.
Flags: needinfo?(jmuizelaar)
Assignee: nobody → jmuizelaar
Priority: -- → P1
Did somebody save this testcase locally? Roc's people is gone and he doesn't have a backup.
I'm nominating this to block APZ rollout, but that decision probably ultimately lives with Jeff (or maybe mbest). It's an artificial testcase but one that's likely to get attention.
Thanks for providing a copy of the test, Alice! I've uploaded it to my people account so it's easier for people to run.
Bug 1253860 fixes this bug; the test now takes comparable time to non-e10s. I ran it on March 10 nightly on my OS X machine 5 times in each of three configurations:

Config   no-e10s    e10s+APZ     e10s no-APZ
Samples   1882       2233         1660
          1824       1485         1350
          1614       1478         1562
          1977       1509         1465
          1772       2111         1682
--------------------------------------------
Average  1813.8     1763.2       1543.8
--------------------------------------------

e10s without APZ is still faster, but e10s with APZ is still better than non-e10s so I call this a win. We can improve this further by making the event regions code faster; the win in bug 1253860 comes from skipping the display list building entirely when we can.
Status: NEW → RESOLVED
Closed: 4 years ago
Depends on: 1253860
Keywords: leave-open
Resolution: --- → FIXED
The patch that landed here actually landed in 47 so updating flags according to that. However the bug as described "scrolling-boxes is really slow with APZ" isn't actually fixed in 47; it requires bug 1253860 to be really fixed. I hope leave-open dies in a fire.
Target Milestone: --- → mozilla47
I'm changing the target milestone to 48 as bug 1253860 is wontfix for 47.

Verified as fixed using latest Aurora 48.0a2 under Win 7 64-bit and Mac OS X 10.9.5, the average I obtained was 1382ms under Win and 1115ms under Mac.
Target Milestone: mozilla47 → mozilla48
You need to log in before you can comment on or make changes to this bug.