investigate/decide how to manage Alpenglow performance for 81
Categories
(Firefox :: Messaging System, task, P1)
Tracking
()
People
(Reporter: dmosedale, Assigned: dmosedale)
References
(Blocks 2 open bugs)
Details
(Whiteboard: [fxperf])
+++ This bug was initially created as a clone of Bug #1660568 +++
In bug 1643776, we landed a new theme (Alpenglow). Enabling this new theme during the about:welcome flow (and elsewhere, such as from the addons manager theme pane) can (usually?) causes a visual flash in the titlebar. We need to figure out how much of a problem we think this is. On my 2018 Mac, it's somewhat noticable (maybe .25s by eyeball). On a Linux VM I tested, it was fairly noticable, however, it wasn't so noticeable by comparison, since all of the themes rendered slowly enough that one visibly watches them render. On a Windows 7 VM I tested, all three were so slow that it didn't stand out much unless you knew exactly what you were looking forward.
https://profiler.firefox.com/public/eyjcyjh3b50fpj3hkqxfvns341q27reh6n20y4r/flame-graph/?globalTrackOrder=7-8-9-0-1-2-3-4-5-6&hiddenGlobalTracks=1-2-3-4-5-6&hiddenLocalTracksByPid=1838888-0-1-3-4-5~1839102-0&localTrackOrderByPid=1838888-6-0-1-2-3-4-5~1838926-0~1838999-0~1839224-0~1839153-0~1839052-0~1839102-1-0~&range=3831m352&thread=0&v=5 has a profile that lgreco made on Linux. There's a bunch of interesting stuff here, where it looks like a lot of the code execution is in layout and javascript, though the compositor is doing a bunch of waiting, so that may play in here too. We did some speculating that the extra images, the color adjustments, or the gradients of the SVG are involved in the problem.
Next steps probably include checking this out on a reference machine, possibly doing some timing with the profiler there and playing around with some of the theme properties to see which of them (if any) are low hanging fruit here. We may want to take videos too...
| Assignee | ||
Updated•5 years ago
|
Comment 1•5 years ago
|
||
Ciprian, could you run a perf test for this as discussed in triage?
Comment 2•5 years ago
|
||
I just took a quick look at the profile here. There's a lot of time spent when switching lightweight themes where we do a series of style flushes - it looks like this is mainly to evaluate colours in LightweightThemeConsumer._sanitizeCSSColor. It might be a good idea to try to batch up those reads into a single spot so that we only need to flush once. That might save some time.
But that's a general problem for all lightweight themes, I think. Looking at the screenshots, I think I see the perceived performance problem that this bug is concerned with - mainly, a delay between choosing the theme, and the main theme image appearing in the background.
According to this profile, there's a delay of roughly 170ms between the user clicking, and the image appearing. I actually suspect that the time is mainly being spent decoding the theme image. How big is that image?
| Assignee | ||
Updated•5 years ago
|
| Assignee | ||
Comment 3•5 years ago
|
||
(In reply to Tim Spurway [:tspurway] from comment #1)
Ciprian, could you run a perf test for this as discussed in triage?
Ciprian and I chatted some about this, and we thought that it would make sense to get videos of these first-time-after-startup theme switches in about:welcome:
light->dark on Windows reference machine-like hardware
light->alpenglow on Windows reference machine like hardware
light->dark on somewhat less ancient Windows hardware
light->alpenglow on somewhat less ancient Windows hardware
Maybe the way to do it is to assume that older reference like hardware is likely to be running Win 7 (Win 7 is still 24% of our user base as of this writing), and that less ancient hardware is more likely running Win 10 (56% of our userbase as of this writing)? And do the videos/testing there.
I don't think we should block on getting feedback about this question before doing the videos/testing, but if we can, I'd love to hear mconley and JimT's thoughts here.
| Assignee | ||
Comment 4•5 years ago
|
||
(In reply to Mike Conley (:mconley) (:⚙️) from comment #2)
(Various helpful advice elided)
Thank you so much for the advice, I'll be digging into your suggestions and question today!
Looking at the screenshots, I think I see the perceived performance problem that this bug is concerned > with - mainly, a delay between choosing the theme, and the main theme image appearing in the
background.
In general, absolutely, and furthermore - this bug is concerned with Alpenglow in comparison with the light and dark themes. In some testing configurations, Alpenglow shows a flash of the titlebar during switching that link and dark do not. If we decide that Alpenglow can't be made sufficiently fast, than we would probably temporarily disable it on the 81 branch.
According to this profile, there's a delay of roughly 170ms between the user clicking, and the image appearing. I actually suspect that the time is mainly being spent decoding the theme image. How big is that image?
All the images are SVGs, so presumably it's not decoding per se.
Comment 5•5 years ago
|
||
(In reply to Dan Mosedale (:dmose, :dmosedale) from comment #3)
I don't think we should block on getting feedback about this question before doing the videos/testing, but if we can, I'd love to hear mconley and JimT's thoughts here.
I think that feels like more of a Product question. I can help guide efforts, but making decisions on what is ultimately acceptable and what is not should probably be up to Product. You can consult with esmyth who's done Product Performance work before, unless jimthomas feels confident making a call.
All the images are SVGs, so presumably it's not decoding per se.
Ah. In that case, I wonder if the SVG rasterization process is what's slowing us down here. Where is the alpenglow SVG image stored?
| Assignee | ||
Comment 6•5 years ago
|
||
mconley: https://searchfox.org/mozilla-central/source/browser/themes/addons/alpenglow has the various different images as SVG files; they end up in the omnijar, I believe.
I chatted with Dan a bit about this. I think best to target fixing low hanging fruit for performance on Win 7/10. Let's not dive down the rabbit hole until we have some real data about the impact of this performance, but take a few days to improve what we can and see where that gets us. At that point I'll make a call whether we want to include this theme in the default list or run an experiment to quantify the performance impact.
| Assignee | ||
Comment 8•5 years ago
|
||
I added some a profile marker in the code and did some profiling on my Mac (10.14.6, MacBook Pro - 15-inch - 2018, 2.9GHz Intel Core i9); here's an example:
https://share.firefox.dev/34tSsOU
~120ms calling addon.enable until promise resolves going from light->alpenglow
~88ms calling addon.enable until promise resolves going from alpenglow->dark
Turns out the Mac profiler doesn't yet (but should very soon as per bug 1592031) show the frames at the top of the profiler, so in some ways, this particular profile is less useful than the Linux one.
More tomorrow...
| Assignee | ||
Comment 9•5 years ago
|
||
Thanks to Florian, I've learned of about:profiling to configure the profile, and that if one forces webrender off on Mac, the screenshots work. Meaningingfully more useful:
https://share.firefox.dev/3aWorsd
~121ms calling addon.enable until promise resolves going from light->alpenglow
~90ms calling addon.enable until promise resolves going from alpenglow->dark
In that profile, the first DOM event is the pointer up event from clicking on the Alpenglow button. It's now quite clear from the marker chart that addon.enable() is far and away the majority of the problem, lightweight_theme_styling_update is the majority of that, and _sanitizeCSSColor is the biggest chunk of that. That's also true for the alpenglow->dark change.
So even though this bug is primarily about why Alpenglow is so much slower, it's actually much high leverage to optimize the _sanitizeCSSColor compositing that mconley referred to, so I'm going to dig into that first. If we can get a meaningful win there, there's a reasonable chance Alpenglow will be fast enough to not have to worry about theme-specific stuff for 81.
| Assignee | ||
Updated•5 years ago
|
| Assignee | ||
Comment 10•5 years ago
|
||
I've dug into this, spent some time chatting with :Emilio, and at his generous offer, I've filed bug 1661123 for the _sanitizeCSSColor piece, and he's going to do something there.
| Assignee | ||
Comment 11•5 years ago
|
||
Excitingly, Emilio's patch from bug 1661123 has made things a bunch faster on my (relatively fast) Mac, configuration described in comment 8, it has effectively cut addon.enable in half, when not using webrender (from my understanding of the patch, it shouldn't matter much which rendering engine is in use):
~71.5ms calling addon.enable until promise resolves going from light->alpenglow
~38.8ms calling addon.enable until promise resolves going from alpenglow->dark
In the longer light->alpenglow case means that there is a ~140ms delay from the time the user clicks on the alpenglow button to the time the alpenglow theme is displayed. Not ideal, but much faster than before.
~50ms of that time appears to be spent flushing layout while rebuilding the toolbar luminance cache, which looks like it's already been optimized in one way by Dao in bug 1061947.
There's probably more we can do here. That said, I suspect that given how dramatic a win that is, the next step would be for QA to do the testing we talked about as discussed in comment 3 as soon as there is a nightly build of mozilla-central after bug 1661123 has merged to m-c. The reasoning being if these numbers hold up, the speed of Alpenglow enabling after that patch is likely to be less than the speeds of all the themes, including the simpler ones, beforehand. Which is to say, hopefully fast enough for 81, even if not ideal. As I understand it from JimT, we should soon have some data that helps us understand how people are reacting to the theme choices which can help us in this decision.
| Assignee | ||
Comment 12•5 years ago
|
||
It looks like the optimization Dao applied above was on top of the cache implemented in https://bugzilla.mozilla.org/show_bug.cgi?id=1334642 and discussed in great detail there. There are some mentions of various possible follow-on work to improve things further from there, but I don't have enough context to analyze whether there are any more reasonable straightforward wins around the toolbar luminance cache. :mconley, you were pretty deeply involved in the discussion in that bug, do you have any thoughts here?
Comment 13•5 years ago
|
||
Circling back with some screen recordings as promised. Here is a list of the various systems we have used to see the "issue":
- Ryzen 7 3700x, 32GB, RTX 2070, Windows 10 x64
- I5-7600k, 16GB, GTX 1080ti, Windows 10 x64
- AMD Fx8320, 16GB, AMD RS780, Ubuntu 18.04 x64
- AMD Fx8320, 16GB, AMD RS780, Windows 7 x32
- i5 2.7 Ghz, 8GB, Iris Pro 1536MB, iMac 10.15.6
- i7 4700HQ, 8GB, GTX 970M, Windows 10 x64
- AMD FX-8320, 16GB, ATI Radeon 3000, Windows 7 x64
- I5 2.7 Ghz, 8GB, Intel Iris Graphics 610 1536 MB, MacBook Pro - 10.15.6
- Intel(R) Pentium(R) CPU B960 @ 2.20GHz (2 CPUs), ~2.2GHz, 6GB, Intel® HD Graphic 1760 MB, Windows 7 Ultimate x64
- Intel i3-3120M, 8GB, Intel HD Graphics 4000, Windows 8.1
We've seen the white flash on ALL the systems we've tested regardless of where the action was initiated (about:welcome, about:addons, or from the customization menu). It is most noticeable when switching from Dark Theme to Alpenglow, and it's less noticeable when switching from Light Theme to Alpenglow, but this is most likely due to the minor differences between the Light theme and the Alpenglow one.
Much like what the previous comments have said so far, the issue is only noticeable during the first ever theme switch after a start-up.
On the more performant systems the flash can be seen on a single frame (usually) and on the less performant ones the flash is present for much longer (3 frames).
I've compiled a folder with a few screen recordings that show the flash, in some cases we've had to inspect the footage frame-by-frame to say for sure if it happens or not. Link to folder.
@dmose, please let me know if we can help with anything else.
| Assignee | ||
Comment 14•5 years ago
•
|
||
Thanks, Ciprian, that's a super helpful baseline!
Since you've done that work, (the perf fix that Emilio did in bug 1661123) has merged to mozilla-central. Can you take a look at (and make videos) of say, four systems with a Nightly that contains that fix (anything built after 3pm pacific on Thursday should be new enough). In particular, could you look at the slowest systems running Windows 7 and windows 10, as well as some medium speed systems, say Windows 10 and a Mac?
With luck, that will be just the data we need to make the call on whether there's anything else we need to do for 81.
Thanks!
Comment 15•5 years ago
|
||
Yes, of course!
I've downloaded Nightly builds from the pushlog in bug 1661123. Visibly, the medium systems now perform much like the higher end ones. But, we can still see the white flash when changing the first theme. Link to screen recordings. Low speed systems seem to perform the same.
| Assignee | ||
Comment 16•5 years ago
|
||
Thanks, Ciprian!
Jim, have a look at the videos; I'd be curious to hear your thoughts about how much more we want to invest here (for 81 or for 82+, if the answers are different). Feel free me to ping me on slack if it's helpful to talk through this...
Comment 17•5 years ago
|
||
Before we go down the rabbit hole of attempting to further optimize the ToolbarIconColor luminance cache, are we certain that style flushes are where the bulk of the delay now is? Has anybody gathered more recent profiles with emilio's patch applied? I think we should probably confirm the assertion that we need to focus on trying to eliminate an initial style flush.
Updated•5 years ago
|
| Assignee | ||
Comment 18•5 years ago
|
||
Yes, comment 11 is from a profile with Emilio's patch applied, and that looked to be the biggest chunk of the delay. I should have posted a link to the profile here. I can do that later this AM.
Comment 19•5 years ago
|
||
Dan, with the current improvement I'm comfortable with where performance is at this point. We should flag to return for some polish work at a later time, but for now this looks sufficient.
| Assignee | ||
Comment 20•5 years ago
|
||
For reference, here are the measurements and profile I made on Friday, with Emilio's patch (which has now landed to m-c):
m-c with patch:
https://share.firefox.dev/2QBdwuS
dark-> alpenglow
- addon.enable: 79ms
- lwt styling update: 55ms
- click to theme displayed:180ms
https://share.firefox.dev/2DbYt7K
light -> alpenglow
- addon.enable: 80ms
- lwt styling update: 54ms
- click to theme displayed: 90ms
https://share.firefox.dev/31CVbUo
alpenglow -> dark
- add-on.enable: 42ms
- lwt styling update: 28ms
- click to theme displayed: 110ms
This bug is as fixed as it's going to get for 81, courtesy of Emilio's patch having just landed in mozilla-central, so I'm going to resolve it as fixed./
Jim, how do you want to think about if/when we continue to work on performance specifically for Alpenglow, which has the white flash? If it seems potentially desirable, I'm happy to clone this bug and do more work there.
| Assignee | ||
Comment 21•5 years ago
|
||
Note that if we don't want to drive it forward, the right thing to probably do is to clone this bug into the theme component.
| Assignee | ||
Updated•5 years ago
|
| Assignee | ||
Comment 22•5 years ago
|
||
Actually, I suspect bug 1660568 is covers this well enough, and we don't need to clone this bug.
| Assignee | ||
Updated•5 years ago
|
Description
•