Closed Bug 1365823 Opened 7 years ago Closed 1 year ago

tab switching to gmail with theme is very slow on mac

Categories

(Core :: Graphics, defect, P3)

Tracking

RESOLVED WORKSFORME
Performance Impact low

People

(Reporter: bkelly, Unassigned)

References

Details

(Keywords: perf, Whiteboard: [e10s-multi:-] [gfx-noted])

I'm running 55.0a1 (2017-05-17) (64-bit) on a MacBook Pro (15-inch, 2016).  I'm seeing pretty bad tab switch times when going to gmail.  See this profile:

https://perfht.ml/2quKq4w

This is for switching from example.com in one tab to my mozilla mail in another tab.  It shows:

* 586ms GC major... not sure if this is contiguous or many slices
* 18ms to build layers
* 131ms rasterize
* 91ms layer transaction
* 21ms composite and many other large composites

This is with the "dark" gmail theme applied.  If I remove the theme it seems somewhat better:

https://perfht.ml/2quHaGf

I realize this may be an issue with the theme, but from observing some of my family members, themes seem to be popular.

Filing this in graphics because of all the missed frame budgets there.  It's unclear to me whether the GC major is a problem or not.
The large rasterize time is caused by creating a whole bunch of new tiles. I'm guessing this might have become worse with multi-e10s because we either need to keep the tile pool per process live and take the memory hit or recreate it when you switch from one content process to the other.
That long GC is from another child process, no?
(Also, that GC has several slices, and just waiting for input between them)
Looks like the UI for showing GCs is quite misleading. It looks as if there was something processed all the time, yet when hovering one can see the slices.
(In reply to Olli Pettay [:smaug] from comment #2)
> That long GC is from another child process, no?
> (Also, that GC has several slices, and just waiting for input between them)
> Looks like the UI for showing GCs is quite misleading. It looks as if there
> was something processed all the time, yet when hovering one can see the
> slices.

Ok, let's ignore the GC parts of the profile, then.  That was just my confusion with the profiler.

(In reply to Jeff Muizelaar [:jrmuizel] from comment #1)
> The large rasterize time is caused by creating a whole bunch of new tiles.
> I'm guessing this might have become worse with multi-e10s because we either
> need to keep the tile pool per process live and take the memory hit or
> recreate it when you switch from one content process to the other.

FWIW, I saw this repeatedly when switching back and forth between the same two tabs.  So this wasn't something like a process I had not viewed in a long time.

I'll add some multi-e10s folks to the CC list since there is a suspicion this is worse with multi-e10s.  I'll try to retest with fewer content processes as well.
Note, since I'm only seeing this on mac, it may not be showing up in our multi-e10s experiment data.  My understanding is we had a low number of samples from the experiment on mac.
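For anyone retesting this, the content process count can be pinned ahead of time instead of relying on the default. A minimal user.js sketch (dom.ipc.processCount is the standard pref; the values below are just the two configurations being compared):

  // user.js sketch -- pin the number of content processes for the retest
  user_pref("dom.ipc.processCount", 1);    // single content process
  // user_pref("dom.ipc.processCount", 2); // multi-e10s configuration for comparison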
I retested:

1 content process: https://perfht.ml/2ruVNqo
2 content processes: https://perfht.ml/2ruKaj3

Not a huge difference, although the single content process case seems about 20ms faster.  I shut down all my other tabs this time, so I really only had two processes.

Also, I noticed that this pause mainly triggers when I have the mail list open.  If I have a message body open, then tab switching seems fast.

Are we maybe creating too many layers for the message list?
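One way to answer the layer-count question is to turn on the layer debugging prefs and look at what gets created for the message list. A minimal user.js sketch, assuming these debug prefs are available in this build:

  // user.js sketch -- layer debugging (assumes these prefs exist in this build)
  user_pref("layers.draw-borders", true); // draw borders around layers/tiles on screen
  user_pref("layers.dump", true);         // dump the layer tree to stderr on each transaction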
So even in the 1 content process case we're still allocating a bunch of tiles, so maybe we're just not keeping enough around. I'll see if I can dig up something that will give us more information on the size of the tile pool and the size requested.
Flags: needinfo?(jmuizelaar)
Whiteboard: [qf] → [qf] [e10s-multi:?]
(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #5)
> I retested:
> 
> 1 content process: https://perfht.ml/2ruVNqo
> 2 content processes: https://perfht.ml/2ruKaj3

For the high-level overview, the difference is:
nsDisplayList::PaintRoot regressed from 129ms to 138ms.

Mainly from ClientTiledPaintedLayer::InvalidateRegion, which regressed from 9ms to 16ms.

(In reply to Jeff Muizelaar [:jrmuizel] from comment #6)
> So even in the 1 content process case we're still allocating a bunch of
> tiles so maybe we're just not keeping enough around. I'll see if I can dig
> up something that will give us more information on the size of the tile pool
> and the size requested.

Yeah, this is what it looks like; the regression is mostly coming from _moz_pixman_region32_init_rects (from ClientTiledPaintedLayer::InvalidateRegion).

Since we will have to decide whether or not to block the release on this, could you give us an ETA for this work? Also, do you think it will be an upliftable change?
Priority: -- → P3
Whiteboard: [qf] [e10s-multi:?] → [qf] [e10s-multi:?] [gfx-noted]
Flags: needinfo?(jmuizelaar)
Flags: needinfo?(jmuizelaar)
Whiteboard: [qf] [e10s-multi:?] [gfx-noted] → [qf] [e10s-multi:-] [gfx-noted]
Whiteboard: [qf] [e10s-multi:-] [gfx-noted] → [qf:p3] [e10s-multi:-] [gfx-noted]
I guess I'm surprised tab switching on mac wouldn't be considered a higher qf priority.  I know it's not where our main user population is, but a lot of web developers use it.

Anyway, here's a profile of switching between two bugzilla tabs:

https://perfht.ml/2sbwlqj

Rasterization and LayerTransaction still completely blow our frame budget, even though it's not as bad as the gmail case.

Jeff, are there some prefs or constants I can play with to see if it helps on my machine?
Increasing layers.tile-initial-pool-size or layers.tile-pool-unused-size might help.
Flipping layers.componentalpha.enabled to false should also help.

And bug 1265824 should eliminate the rest of the slowness.
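For reference, a minimal user.js sketch of the overrides suggested above (the pref names come from comments 9 and 10; the pool sizes are illustrative values, not recommendations):

  // user.js sketch -- values are illustrative, not recommendations
  user_pref("layers.tile-initial-pool-size", 50);     // pre-allocate a larger tile pool
  user_pref("layers.tile-pool-unused-size", 50);      // keep more unused tiles alive between paints
  user_pref("layers.componentalpha.enabled", false);  // trade subpixel AA for cheaper layers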
(In reply to Markus Stange [:mstange] from comment #9)
> Increasing layers.tile-initial-pool-size or layers.tile-pool-unused-size
> might help.

These did not help.  Here are some profiles using different values:

layers.tile-initial-pool-size=50
https://perfht.ml/2s6CFlU

layers.tile-initial-pool-size=75
https://perfht.ml/2s6VLIA

layers.tile-initial-pool-size=100
https://perfht.ml/2s6y2se

layers.tile-initial-pool-size=100
layers.tile-pool-unused-size=50
https://perfht.ml/2s6RqFh
(In reply to Markus Stange [:mstange] from comment #10)
> Flipping layers.componentalpha.enabled to false should also help.

This helped a lot!

https://perfht.ml/2s6zVVJ
https://perfht.ml/2s6JZ15

Is this something we would consider flipping to false by default?  What is the long term fix here?
Flags: needinfo?(mstange)
The profiles in comment 12 had the layers.tile.* prefs set to 100/50.  Here is one with those reset:

https://perfht.ml/2s6zjiQ

An improvement, but maybe not quite as large.
(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #12)
> Is this something we would consider flipping to false by default?  What is
> the long term fix here?

The medium-term fix is to disable component alpha on HiDPI displays (originally bug 941095; discussed again in bug 1366618). The long-term fix is WebRender, which will not need component alpha layers in order to support subpixel text anti-aliasing.
Flags: needinfo?(mstange)
Depends on: 1366618
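Since the long-term fix is WebRender, it may be worth rechecking these profiles once it can be enabled in Nightly. A sketch, assuming the gfx.webrender.enabled pref is present in the build:

  // user.js sketch -- assumes the WebRender pref is available in this Nightly
  user_pref("gfx.webrender.enabled", true);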
Ok, thanks.  I had to set layers.componentalpha.enabled to false on my 2016 MacBook Pro.  It felt very sluggish without the pref change, and I noticed it on a daily basis.
This is the issue we spoke about this morning where Firefox Nightly feels sluggish on my 2016 MacBook Pro.
Flags: needinfo?(milan)
Let's see where the conversation goes in bug 1366618.
Flags: needinfo?(milan)
Keywords: perf
Performance Impact: --- → P3
Whiteboard: [qf:p3] [e10s-multi:-] [gfx-noted] → [e10s-multi:-] [gfx-noted]
Severity: normal → S3
Status: NEW → RESOLVED
Closed: 1 year ago
Depends on: fixed-by-webrender
No longer depends on: 1366618
Flags: needinfo?(jmuizelaar)
Resolution: --- → WORKSFORME