Open Bug 1389712 Opened 7 years ago Updated 15 days ago

switching to a Google Slides tab is slow (long repaint)

Categories

(Core :: Graphics: WebRender, defect, P3)

defect

Tracking

()

Performance Impact high
Tracking Status
firefox57 --- affected

People

(Reporter: asa, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug, )

Details

(4 keywords, Whiteboard: [photon-perf][triage][gfx-noted])

I have one tab that's slow to switch to. It's a Google Slides document. Here's a profile of me switching through 6 tabs, the final tab switch is the slow one. https://perfht.ml/2vuVbEr
hopefully better profile https://perfht.ml/2vupJWJ
Whiteboard: [qf][photon-perf][triage]
325ms Rasterize. Bas can you investigate if there is anything we can do to help with this.
Flags: needinfo?(bas)
It looks like a good chunk of that rasterizing is in nsDisplaySVGText::Paint, but I'm not sure whether that makes this a SVG bug (i.e. should we make those paints faster) vs. a layerization bug (i.e. should we have a cached layerization & avoid doing those repaints in the first place).
Component: General → Graphics
Product: Firefox → Core
Summary: switching to a particular tab is slow. here's a profile → switching to a Google Slides tab is slow (long repaint)
From the profile, it appears only a small amount of time in the 'rasterize' window (in both profiles), is actually spent rasterizing (I have a difficult time explaining why, other than, wow, this all went slow, when zooming in to a 325ms window, it only gives me <170ms of samples), the time actually spent rasterizing is much smaller (<40ms) and seems to be almost exclusively spent drawing text. There's nothing obvious standing out here that's actionable. On a glance I can't explain why we spend 325m in rasterizing, only have 170ms of samples, of which only 40 end up in actual rasterization. Perhaps people who know more about the actual profiler can shed some light on this.

A profile with a third-party profiler would probably be helpful here.
Flags: needinfo?(bas)
Looking at this range: https://perfht.ml/2wLo7YH

First in the chrome process:

https://perfht.ml/2wLfZHQ
The "select" event took 12ms, the profiler sampled 7ms, I think it's quite expensive for a select event.

For the content process:

https://perfht.ml/2wL5BQt
There are 3 paints, the second paint took 54ms to build display list, 65ms to build layers, but only 0.33ms to rasterize. 

https://perfht.ml/2wLTxyb
Similarly the third paint took 36ms to build display list, 72ms to build layers, but only 0.33ms to rasterize. 

Is it possible for the second and third case skip the paint entirely?

https://perfht.ml/2wLqk6G
The first paint took longer. 49ms to build display list, 89ms in layers building, 325ms in rasterization. This paint is not triggered by vsync, but triggered by TabChild::InternalSetDocShellIsActive. It's implemented in bug 1279086 to prevent script from blocking first paint.

There is a telemetry to track the time spent in the forced paint: TABCHILD_PAINT_TIME. The p75 is 45ms and p95 is 121ms https://mzl.la/2wL8aSo.

https://perfht.ml/2wL2YxT
The rasterization took 325ms but the profiler only sampled 173ms. When I filtered the symbol by "SVG" I found there are 141ms sampled time in nsDisplaySVGText::Paint and nsDisplayFilter (SVG filters).

(In reply to Daniel Holbert [:dholbert] from comment #3)
> It looks like a good chunk of that rasterizing is in
> nsDisplaySVGText::Paint, but I'm not sure whether that makes this a SVG bug
> (i.e. should we make those paints faster) vs. a layerization bug (i.e.
> should we have a cached layerization & avoid doing those repaints in the
> first place).

The first paint is forced by tab switch so I think it can not use cached layers. I'm not sure if we can use nsIPresShell::PAINT_COMPOSITE instead of nsIPresShell::PAINT_LAYERS. Make those paints faster should help. I think the second and third paint used cached layerization so avoided the repaint, but it would be nice to avoid the layer building as well.
(In reply to Bas Schouten (:bas.schouten) from comment #4)
> (I have a difficult
> time explaining why, other than, wow, this all went slow, when zooming in to
> a 325ms window, it only gives me <170ms of samples)

ehsan said in the triage meeting yesterday that this was due to a known windows profiler bug, where the reported call tree sample-time is ~half of the actual value.

So that ~170ms of samples really represents ~340ms, which is (approximately) the time you're expecting.
The reason is that we use Sleep(0) to wait 1ms on Windows, and that actually takes 2ms to yields on the sampling thread, which means we really sample at 2ms intervals even though the profiler UI setting interval is set to sample at 1ms.
Jonathan, is there anything we can do to improve the SVG text paint time? It looks like most of the time is in D2D1 DrawGlyphRun...

CJ, is there anything obvious in the SVG filter part?
Flags: needinfo?(jfkthame)
Flags: needinfo?(cku)
(In reply to Kan-Ru Chen [:kanru] (UTC+8) from comment #8)
> Jonathan, is there anything we can do to improve the SVG text paint time? It
> looks like most of the time is in D2D1 DrawGlyphRun...
> 
> CJ, is there anything obvious in the SVG filter part?
No, you can search by "PaintFilteredFrame" keyword. All filters are painted through this function
Flags: needinfo?(cku)
Oh, wait I do find PaintFilteredFrame in https://perfht.ml/2wLqk6G 
I will check
(In reply to Kan-Ru Chen [:kanru] (UTC+8) from comment #8)
> Jonathan, is there anything we can do to improve the SVG text paint time? It
> looks like most of the time is in D2D1 DrawGlyphRun...

I don't think we can do much to directly make D2D text drawing itself faster... but perhaps we're doing too much of it. Is it possible we're re-rendering the same text many times, when we should be caching a layer?
Flags: needinfo?(jfkthame)
hi asa,
Would you mind to share the slide? And may I know the exact UI steps of "the final tab switch is the slow one"(such as use arrow key or mouse?)?
Flags: needinfo?(asa)
(In reply to C.J. Ku[:cjku](UTC+8) from comment #12)
> hi asa,
> Would you mind to share the slide? And may I know the exact UI steps of "the
> final tab switch is the slow one"(such as use arrow key or mouse?)?

The slide is at https://docs.google.com/presentation/d/1GeCoonIkYEhwgivhaZQvRZNAcnn-5r-rTgKfd-uNGKA/edit?pli=1#slide=id.g1ded2000cb_0_0

Tab switching was mouse clicking the tabs.
Flags: needinfo?(asa)
Whiteboard: [qf][photon-perf][triage] → [qf:p2][photon-perf][triage]
Flags: needinfo?(cku)
https://perfht.ml/2wLqk6G 

nsFilterInstance::PaintFilterFrame took 43ms of 473 ms(around 9.09%)
* nsFilterInstance::BuildSourceImage took 27ms
* gfxPlatform::CreateOffscreenContentDrawTarget took 7ms: I don't think it's nomral, it basically allocate buffers for drawing only.

I tried to repro this symptom on local machines(windows/mac/linux), but was in vain. On my machines, nsFilterInstance::PaintFilterFrame always takes less then 5ms.
Flags: needinfo?(cku)
Whiteboard: [qf:p2][photon-perf][triage] → [qf:p2][photon-perf][triage][gfx-noted]
Keywords: perf
I need to re-evaluate this.
Assignee: nobody → bas
Status: NEW → ASSIGNED
Flags: needinfo?(bas)
This seems to perform okay for me with D2D, but seems a little slow with Skia, investigating further.
Flags: needinfo?(bas)

Currently, particularly while loading this slide set we seem to do considerably worse than chrome. The whole browser becomes janky.

This is clearly related to WebRender as can be seen here, several 500ms+ render.

https://share.firefox.dev/3h2rLWp

Assignee: bas → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(jmuizelaar)
Whiteboard: [qf:p2][photon-perf][triage][gfx-noted] → [qf:p1:responsiveness][photon-perf][triage][gfx-noted]
Blocks: gfx-triage
Flags: needinfo?(jmuizelaar)
Flags: needinfo?(jmuizelaar)

It looks like the root cause here might be blob splitting causing a bunch of stacked blobs and a lot of texture upload. Perhaps the arrows with shadow filters are causing us to split up a bunch of big things. That being said, I wasn't able to reproduce anything as bad as the profile Bas recently posted.

Nical, do you want to take a closer look and confirm?

Flags: needinfo?(jmuizelaar) → needinfo?(nical.bugzilla)
Blocks: wr-blob-perf
No longer blocks: gfx-triage

I'm seeing similar profiles althought not as spectacular as Bas, but how bad the problem is depends on how many blob layers we end up with which depends on the contents of the slides. We are hurt by texture uploads a lot more than by actual blob rasterization.

I think that the best way forward here is to stop blindly putting anything inside of an SVG element into a blob recording. Most of what I see on these slides are images and text that we could render with regular display items.

Flags: needinfo?(nical.bugzilla)

Bas, can you still reproduce the 500ms paints? I don't see anything close to that.

Flags: needinfo?(bas)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #21)

Bas, can you still reproduce the 500ms paints? I don't see anything close to that.

Yep: https://share.firefox.dev/3wzoMvQ

Flags: needinfo?(bas)
Depends on: 1739295
Performance Impact: --- → P1
Whiteboard: [qf:p1:responsiveness][photon-perf][triage][gfx-noted] → [photon-perf][triage][gfx-noted]

The Performance Priority Calculator has determined this bug's performance priority to be P1.

Platforms: Windows
Impact on site: Causes noticeable jank
[x] Affects major website
[x] Able to reproduce locally
[x] Multiple reporters

Keywords: reproducible, top50
Severity: normal → S3

The severity field for this bug is set to S3. However, the Performance Impact field flags this bug as having a high impact on the performance.
:bhood, could you consider increasing the severity of this performance-impacting bug? Alternatively, if you think the performance impact is lower than previously assessed, could you request a re-triage from the performance team by setting the Performance Impact flag to ??

For more information, please visit auto_nag documentation.

Flags: needinfo?(bhood)
Performance Impact: high → ?
Flags: needinfo?(bhood)

This depends highly on the slides. The one mentioned in comment 13 is rather complex and even with that the profile looks like https://share.firefox.dev/3PcOiQE
But the behavior is definitely worse than in some other browsers and this is a major website, so performance calculator gives still the same evaluation.

Performance Impact: ? → high

I've opened the slides in comment 13 in Fx107.0.1, with and without acceleration (WR), and I'm not seeing any high delays in rendering. At most, there may be tens of milliseconds on the initial page load, and virtually no delay once loaded and tabs are switched.

This report is already being tracked by a metabug (1582153), but I'll raise its severity profile to match the performance impact.

Severity: S3 → S2

Nical, can you take a look at this and see where we're at?

Flags: needinfo?(nical.bugzilla)

(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #25)

This depends highly on the slides. The one mentioned in comment 13 is rather complex and even with that the profile looks like https://share.firefox.dev/3PcOiQE

This profile looks mostly fine, there is a big 16s firstContentfullPaint marker but since the threads are mostly idl for a big part of it I suppose we are waiting for stuff to arrive from network.

Yep: https://share.firefox.dev/3wzoMvQ

Bas's profile shows very long texture uploads. It looks like we are rasterizing a very large amount of blob image tiles because of very poor layerization decisions.

Tim, could you queue this for some research? This has a high performance impact, and Nical isn't able to dig into it very deeply right now.

Flags: needinfo?(nical.bugzilla) → needinfo?(tnikkel)
Type: enhancement → defect

(In reply to Nicolas Silva [:nical] from comment #28)

(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #25)

This depends highly on the slides. The one mentioned in comment 13 is rather complex and even with that the profile looks like https://share.firefox.dev/3PcOiQE

This profile looks like it is switching to the google slide tab.

Yep: https://share.firefox.dev/3wzoMvQ

This profile looks like it is of loading the document.

We should decide what we want to address in this bug. Since the page is google docs, it has a very large svg image sprite sheet, which we're not currently great at. On load, and a tab switch in which those resources have been discarded, we pay the price for that. On warm tab switch those resources should still be around. Turning on svg blog images should hopefully help with the load/cold tab switch case. On warm tab switch it looks more like the fallback grouping code is the largest contributor.

Asa, is this still reproducible for you with current builds?

Flags: needinfo?(asa)
Severity: S2 → S3
Flags: needinfo?(asa)
Flags: needinfo?(tnikkel)
Component: Graphics → Graphics: WebRender
You need to log in before you can comment on or make changes to this bug.