Scrolling performance in Release 32 is much better than in Nightly builds. http://www.nytimes.com/interactive/2013/01/29/us/where-50000-guns-in-chicago-came-from.html?ref=us&_r=2&
Created attachment 8500021 [details] scrollspeed.js Str 1. Open http://www.nytimes.com/interactive/2013/01/29/us/where-50000-guns-in-chicago-came-from.html?ref=us&_r=2 2. Run attached script in ScratchPad with chrome privileged #1 -30% regressed by Bug 1039926 Scroll speed 2400 msec -> 3100 msec #2 -64% regressed by Bug 952977 Scroll speed 2900-3200 msec -> 4900-5100 msec #3 +15% Progression by Bug 902952 Scroll speed 4900-5100 msec -> 4200-4300 msec #4 +25% Progression by Bug 932762 Scroll speed 4200-4300 msec -> 3200-3400 msec #1 Regression window(m-i) Scroll speed 2400 msec L:\trunk\2014\07\inbound\firefox inbound 23-Jul-2014 0626 1406093088 Bug 1042423 Do css background clipping using DisplayItemClip https://hg.mozilla.org/integration/mozilla-inbound/rev/06b540d3667a Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0 ID:20140722222448 Scroll speed 3100 msec L:\trunk\2014\07\inbound\firefox inbound 23-Jul-2014 0708 1406095484 https://hg.mozilla.org/integration/mozilla-inbound/rev/912c96edb352 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0 ID:20140722230444 #1 Pushlog: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=06b540d3667a&tochange=912c96edb352 #1 Suspect : Bug 1039926 #2 Regression window(m-i) Scroll speed 2900-3200 msec https://hg.mozilla.org/integration/mozilla-inbound/rev/a6f981a6431a Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0 ID:20140801052717 Scroll speed 4900-5100 msec https://hg.mozilla.org/integration/mozilla-inbound/rev/d764f90842f3 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0 ID:20140801053216 #2 pushlog: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=a6f981a6431a&tochange=d764f90842f3 #2 Suspect : Bug 952977 #3 Progression window(m-i) Scroll speed 4900-5100 msec https://hg.mozilla.org/integration/mozilla-inbound/rev/165c3fd176ec Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0 ID:20141001112720 Scroll speed 4200-4300 msec https://hg.mozilla.org/integration/mozilla-inbound/rev/d954ed24e795 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0 ID:20141001113823 #3 pushlog: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=165c3fd176ec&tochange=d954ed24e795 #3 Suspect : Bug 902952 #4 Progression window(m-i) Scroll speed 4200-4300 msec https://hg.mozilla.org/integration/mozilla-inbound/rev/21e9a5b9a4e3 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0 ID:20141004035135 Scroll speed 3200-3400 msec https://hg.mozilla.org/integration/mozilla-inbound/rev/75c93e9a7c97 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0 ID:20141004041835 #4 pushlog: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=21e9a5b9a4e3&tochange=75c93e9a7c97 #4 Suspect : Bug 932762
[Tracking Requested - why for this release]:
Matt, can you take a look at the first regression?
And David, any idea why your work to switch to Matrix4x4 would have caused such a large regression? Was there anything other than a straight swap in all those patches?
On OSX there is a much bigger regression than either of the two regressions established in comment 1. This is caused by: https://hg.mozilla.org/mozilla-central/rev/4effa50d8934 Matt Woodrow — Bug 982338 - Enable tiling for OSX
Jonathan, those patches simply switched gfx3DMatrix to Matrix4x4. The slowdown may be due to more multiplications, as you've pointed out, or due to converting matrices back and forth.
Investigating and profiling this on OSX shows a bunch of interesting stuff: * There is a crazy number of display items visible. * We're invalidating a bunch of the svg display items when scrolling, not only is this wasted work, but it's giving us a really complicated invalidation region. * We're painting a lot of the svg items within an inactive transform just to offset them. This incurs a bunch of probably unnecessary work to do with inactive layer managers. * Inactive layer managers call GetClipBounds() on the gfxContext multiple times and we spend a huge amount of time computing path bounds here. Caching the bounds of clip paths seems to help a lot. * We spend massive amounts of time in RecomputeVisibility. This is a combination of the complex invalidation, the massive number of items and some dumbness on our part. I have a patch for the latter which seems to help, not entirely sure if it's correct yet. * The remainder of the time is spent painting, mainly in Filling/Stroking paths. It's pretty likely the tiling made this part slower since we have to do a bunch of work for each tile that the paths touch. With current trunk I see paints taking up to 800ms, and scrolling is basically unusable. With the two fixes that I mentioned we get down to around ~180ms per frame and is visually a lot better (but still awful). Fixing the unnecessary invalidation, and avoiding nsDisplayTransform for translation-only transforms would also make a huge difference here. Jonathan, do you want to look into that? Probably the simplest thing to do to reduce the impact of tiling would be to increase the tile size. 256x256 is rather small for a desktop screen, increasing it should be an easy win. We could also look at rendering complex display lists to a temporary surface and then just copying to the real tiles. We have code for this already, but it's slower in the general case because the extra allocation/copy is slower than the per-tile overhead. Dynamically choosing which method to use would be 'interesting'.
I just found (in bug 554004) that we were previously simplifying complex visible region, and with tiling we lost that. A lot of my previous comments are still valid, but their effects will be somewhat hidden once we reintroduce the region simplification.
With region simplification RecomputeVisibility basically drops away to nothing, and we're hitting around ~120ms per frame. We're still painting the majority of the display items when scrolling though (paint flashing shows large rects instead of highlighting the items individually), so that needs to be fixed before we can hope to perform well.
Created attachment 8500212 [details] [diff] [review] Simplify invalid region for tiled layers
FWIW scrolling is still pretty bad even with this patch.
Forgot to add. Mozilla/5.0 (Windows NT 6.3; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0
Yeah, the patch only fixes the specific regression caused by tiling. The other regressions, and problems that have always existed still make us slow.
Thanks for the info Matt.
Matt - 1. Is there anything further to be done in this bug? 2. Your patch landed in 35. 34 is also marked as affected. Can you submit a Beta approval request?
No, this bug is fixed. I'm not sure we should rush to uplift, since the android regression (bug 1086642) is still open.
I don't see much action in bug 1086642. Unless there is a fix in the next day or two it is very unlikely that this issue will be fixed in Firefox 34.
As we're out of time on Beta and we don't have a fix for bug 1086642, I'm marking this bug as wontfix in 34. We'll have to live with this perf regression for at least one cycle.
I've verified this fix on Windows 7 x64, Mac OS X 10.9.5, Ubuntu 14.04 x64, using Firefox 35 Beta 5 (BuildID=20141218174327). The test results were mixed: - Windows 7 x64 - improvement of ~0.7 sec compared to 34.0.5; ~1.2 sec slower than 32.0.3 * 34.0.5 = 3107.64 ms * 35.0b5 = 2448.84 ms * 32.0.3 = 1272.68 ms - Mac OS X 10.9.5 - regression of ~0.5 sec compared to 34.0.5; ~2.2 sec slower than 32.0.3 * 34.0.5 = 2978.48 ms * 35.0b5 = 3421.48 ms * 32.0.3 = 1163.52 ms - Ubuntu 14.04 x64 - improvement of ~0.2 sec compared to 34.0.5; ~4.4 sec slower than 32.0.3 * 34.0.5 = 6341.24 ms * 35.0b5 = 6158.28 ms * 32.0.3 = 1743.12 ms While this fix brings some improvement for Windows, it does not seem to bring almost any improvement for Linux, and causes a half a second regression for Mac (which is probably the one reported in bug 1080845). Is the improvement for Windows and Linux the expected one? Linux still seems to be pretty bad compared to 32.0.3. Matt, do you think there's additional work to be done here (and thus we should reopen this issue), or maybe this is/should be tracked in a separate bug? Testing details: - testing was performed using steps from comment 1 - testing was done with maximized window (1 tab) and clean profiles - 5 measurements were taken for each version and OS, and the results shown here are the average
Any thoughts on results from comment 21 above?
There has been no update for a while on this issue. Should we follow up on it?
The scrolling is still very slow on the latest nightly (win32 on win8.1 64bit, D2D1.1 etc)
We should probably open another bug; with this one closed, it may not get the attention it needs.
I've filed bug 1175904 for any followup work on this issue.