Closed Bug 820061 Opened 7 years ago Closed 7 years ago

Awful performance with SVG animation when hardware accelerated

Categories

(Core :: SVG, defect)

x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla20
Tracking Status
firefox17 --- unaffected
firefox18 + verified
firefox19 + verified
firefox20 --- verified
b2g18 --- fixed

People

(Reporter: joe, Assigned: mattwoodrow)

References

()

Details

(Keywords: regression)

Attachments

(2 files, 1 obsolete file)

The URL attached is a Starbucks page that has an animation of birds that fly in on top of the "A new gift is waiting for you" text. Jeff tells me that they're made of SVG.

In software-only mode, Firefox is normally quick in rendering these birds, and they look good. But when hardware acceleration is enabled, they occasionally look very fuzzy, and their performance (and Firefox's, while they're displayed) is atrocious.
This regression shows up in FF18.
Juan - can you find a regression range here? I'd suggest starting with the dlbi nightly, given this regressed in FF18.
QA Contact: jbecerra
Flags: needinfo?(jbecerra)
This doesn't appear to be related to DLBI specifically. Juan is still trying to find a good regression range. Jet - do you have somebody that can profile why the animations have slowed down significantly?
Assignee: nobody → bugs
From a quick glance w/ inspector, it looks like the bird is dynamically generated with SVG, and is animated w/ scripted tweaks to "transform" attributes on <g> elements.

The animations appear to be generated dynamically using the Google "Swiffy" library.
s/bird is/birds are/
I went back to 17 (nightlies) and I found this regression range:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=588424024294&tochange=89dcadd42ec4

There a number of SVG fixes in that range.

Whatever caused it in 17 was probably backed out because a release version of 17 works well. This problem is very serious because the whole machine becomes non-responsive until you kill the process, and even that takes a while.
Flags: needinfo?(jbecerra)
Most likely https://hg.mozilla.org/mozilla-central/rev/b077c43a4306

The prefs set by bug 776054 got unset before 17 release IIRC can't find the bug that did it though.
The site in the URL crashes for me on a Retina Mac using Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:20.0) Gecko/20121213 Firefox/20.0. I don't get Socorro but instead the Apple Crash reporter.
I took a profile of the page after it has loaded (after the birds have flown in and are then bobbing up and down) when the CPU continues to have high load:

http://people.mozilla.com/~bgirard/cleopatra/#report=d40e776016f92794380aec126814fbe799f44028

Over 80% of the time is being spent under ThebesLayerOGL::RenderLayer.

Only 14% of the time is being spent under FlameLayerBuilder::DrawThebesLayer (under the ThebesLayerOGL::RenderLayer call) to draw SVG (nsDisplaySVGPathGeometry items).

63.5% of the time is being spent under GLContext::UploadSurfaceToTexture (which again is under the ThebesLayerOGL::RenderLayer call).

So I don't think bug 776054 is at fault here except that, as intended, it changes SVG to make use of layers. It just seems that taking the layer code paths is much more expensive than the old code paths for some reason.

Someone who knows the layers design and how it is expected to behave should take a look at the profile and see if something looks off with regards to the layers code.
I took that profile with 20.0a1 (2012-12-14) on OS X 10.8.
(In reply to Jonathan Watt [:jwatt] from comment #9)
> So I don't think bug 776054 is at fault here except that, as intended, it
> changes SVG to make use of layers. It just seems that taking the layer code
> paths is much more expensive than the old code paths for some reason.
> 
> Someone who knows the layers design and how it is expected to behave should
> take a look at the profile and see if something looks off with regards to
> the layers code.

Can you help us to understand whether disabling bug 776054 for FF18 would resolve this, even if it isn't really the regressing bug? We're looking for a low risk fix here for FF18, given the user impact in comment 6.

BenWa's offered to take a look at comment 9.
(In reply to Alex Keybl [:akeybl] from comment #11)
> Can you help us to understand whether disabling bug 776054 for FF18 would
> resolve this, even if it isn't really the regressing bug? We're looking for
> a low risk fix here for FF18, given the user impact in comment 6.

Disabling the SVG DisplayList feature will resolve this, and that may be the smart move as we haven't made acceptable progress on the regressions. :(
Jet: I'd be worried about disabling SVG display lists for 18, because DLBI is also in 18 and we haven't tested the interaction between DLBI and the old svg code at all.

Is there anything I can help with here? I don't know the svg code particularly well, but I'm happy to spend most of next week working on svg regressions if it might stop us from having to disable it.
We've got several layers that are 7k x 4k to draw birds that are wayyy smaller which is why the upload is so slow. I'm sure that once we get these layers to a manageable size the performance will be fine.
Attached image Texture preview
(In reply to Robert Longson from comment #7)
> The prefs set by bug 776054 got unset before 17 release IIRC can't find the
> bug that did it though.

The prefs are enabled for 17, so right now our current users are using the SVG display list code paths.

(In reply to Alex Keybl [:akeybl] from comment #11)
> Can you help us to understand whether disabling bug 776054 for FF18 would
> resolve this

A better and simpler solution (assuming we can't figure out why the layers are so large) would simply be to hardcode layers to always be inactive for SVG. (I share Matt's concern that disabling the SVG display list prefs simply won't work given DLBI.)

(In reply to Jet Villegas (:jet) from comment #12)
> Disabling the SVG DisplayList feature will resolve this, and that may be the
> smart move as we haven't made acceptable progress on the regressions. :(

SVG display lists are shipping in 17, and right now we have only 3 SVG display list regressions that we've deemed serious enough to track for 18, all of which I'm confident we can fix for 18. I personally don't think we need to be thinking about disabling SVG display lists, even if flipping the prefs still works given DLBI.
Thanks jwatt - whatever the final solution, we'll want to get a fix onto mozilla-beta by Tuesday. Would it make sense to already land a patch for hardcoding layers to always be inactive for SVG on m-c/m-a for early testing?
Sure, we can do that (although I'm hopeful BenWa/Matt can shed some light on why the textures are oversized and we can fix that). That's the patch "patch for mozilla-beta (17)" on bug 786894, which has approval-beta+ marked, although that was for 17. I can go ahead and land that on 18 too if you like.
Given that the texture is filled with content, I'd suggest that this isn't a bug in the layers system.

My initial guess is that we're scaling the content down massively, which would make the current behaviour 'correct', though probably not desired.

Having a look at the display list / layer dump for this page would answer that.
Display list and layer tree: http://pastebin.mozilla.org/1996021

We've got a whole bunch of active nsDisplayTransform that are downscaling by as much as 200x.

With active transform, we're drawing and retaining the content at the unscaled size (massive) and then are downscaling at composite time.

The easy option here would be to force transform to be inactive if the untransformed content size exceeds some arbitrary limitation.

A nicer option might be to modify ChooseScaleAndSetTransform (in FrameLayerBuilder.cpp) to pass the downscale into the ThebesLayer so that we only draw/retain the content at the downscaled size.

We do this already for upscaling (to prevent fuzzy rendering when we upscale by a large amount). Adding support for downscaling shouldn't be too hard.
The other problem is that instead of just adjusting the transforms, we are invalidating the content too.

Repainting these massive layers each frame is hurting us more than having them in the first place, but both are pretty bad.
(In reply to Matt Woodrow (:mattwoodrow) from comment #20)
> A nicer option might be to modify ChooseScaleAndSetTransform (in
> FrameLayerBuilder.cpp) to pass the downscale into the ThebesLayer so that we
> only draw/retain the content at the downscaled size.

Yikes. It didn't even occur to me that the layers code wouldn't be doing something like this already. Sounds to me like we definitely want to do this.
This was already r+'ed by mattwoodrow in bug 786894 and landed for 17. Just attaching it here to get explicit approval for beta 18 too as per comment 17.
Attachment #692592 - Flags: approval-mozilla-beta?
This improves performance a lot. I'm getting around 15fps in a debug build at the steady state now. I wouldn't call that good, but it's usable.

The main remaining issue is that SVG invalidates all content when we change the transform. Fixing that should help immensely. I believe jwatt is working on a patch for this?
Attachment #692669 - Flags: review?(roc)
Callstack for the invalidation when we change transform is here:

http://pastebin.mozilla.org/1998591
https://hg.mozilla.org/mozilla-central/rev/8d66bb18b4a3
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
Juanb, can we please get some verification around this?Thank you !
When hardware acceleration is enabled, I can still reproduce this issue on the latest Nightly
(In reply to Manuela Muntean from comment #29)
> When hardware acceleration is enabled, I can still reproduce this issue on
> the latest Nightly

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:20.0) Gecko/20121217 Firefox/20.0
Build ID: 20121217030850
On Mac OSX 10.7.5 I get the same faulty behaviour with the latest Nightly. The animation is very slow in comparison with its behaviour on Windows 7 64-bit using the latest Nightly.

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:20.0) Gecko/20121217 Firefox/20.0
Build ID: 20121217030850
What's about:buildconfig say is the source for your build Manuela?
Flags: needinfo?(manuela.muntean)
(In reply to Robert Longson from comment #32)
> What's about:buildconfig say is the source for your build Manuela?

http://hg.mozilla.org/mozilla-central/rev/c8a1314aa449
Flags: needinfo?(manuela.muntean)
That's Sat Dec 15 20:02:28 2012 +0100 (click on it and see) You need a nightly that's Mon Dec 17 17:30:51 2012 +1300 or later.
(In reply to Robert Longson from comment #34)
> That's Sat Dec 15 20:02:28 2012 +0100 (click on it and see) You need a
> nightly that's Mon Dec 17 17:30:51 2012 +1300 or later.

I've tried downloading the latest Nightly build from 2 sources:

1) http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-central/

2) http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2012-12-17-03-08-50-mozilla-central/

From both sources I get the result stated in comment 33, which is a build from Sat Dec 15 20:02:28 2012 +0100.

When I go to "About Nightly" in the menu, it tells me that my Nightly is up to date.
Probably best to wait till a new nightly comes out I think.
I saw that new Nightly builds appeared, but the problem mentioned in comment 36 is still there.

Although the build ID is 20121218030803, after going in about:buildconfig, I can see that the build is from Thu Dec 13 11:38:24 2012 -0800 (at Thu Dec 13 11:38:24 2012 -0800).

I am testing this on Mac OSX 10.7.5.
(In reply to Manuela Muntean from comment #37)
> I saw that new Nightly builds appeared, but the problem mentioned in comment
> 36 is still there.
> 
> Although the build ID is 20121218030803, after going in about:buildconfig, I
> can see that the build is from Thu Dec 13 11:38:24 2012 -0800 (at Thu Dec 13
> 11:38:24 2012 -0800).
> 
> I am testing this on Mac OSX 10.7.5.

Sorry, I meant to say comment 35 instead of comment 36.
Attachment #692592 - Attachment is obsolete: true
Attachment #692592 - Flags: approval-mozilla-beta?
Comment on attachment 692669 [details] [diff] [review]
Clamp to inverse powers of 2 as well

For me the starbucks site has gone from almost unusable to very slick. This has baked on m-c for a couple of days. Can we land on branches now?

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 776054
User impact if declined: huge perf hits on some SVG, possibly quite common
Testing completed (on m-c, etc.): been on m-c for a couple of days
Risk to taking this patch (and alternatives if risky): should be very low
String or UUID changes made by this patch: none
Attachment #692669 - Flags: approval-mozilla-beta?
Attachment #692669 - Flags: approval-mozilla-aurora?
Comment on attachment 692669 [details] [diff] [review]
Clamp to inverse powers of 2 as well

Low risk patch for a perf regression in FF18 and up. Approving on branches.QA to help with additional testing here for verification.
Attachment #692669 - Flags: approval-mozilla-beta?
Attachment #692669 - Flags: approval-mozilla-beta+
Attachment #692669 - Flags: approval-mozilla-aurora?
Attachment #692669 - Flags: approval-mozilla-aurora+
The Starbucks example now works well in the latest Nightly. We'll verify in beta 5 builds when they come out.
The animation runs smoothly on Firefox 18 beta 5 for me.

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:18.0) Gecko/20100101 Firefox/18.0
Build ID: 20121219074241
Dropping qawanted since there's nothing left to investigate here. Manueal, please verify this is fixed on Firefox 19 and 20 when you have time.
Keywords: qawanted
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #45)
> Dropping qawanted since there's nothing left to investigate here. Manueal,
> please verify this is fixed on Firefox 19 and 20 when you have time.

This issue is also fixed for the latest Nightly & Aurora.

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:20.0) Gecko/20121220 Firefox/20.0
Build ID: 20121220030912

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:19.0) Gecko/20121220 Firefox/19.0
Build ID: 20121220042016
mass remove verifyme requests greater than 4 months old
Keywords: verifyme
You need to log in before you can comment on or make changes to this bug.