Closed Bug 1412731 Opened 7 years ago Closed 6 years ago

Jank while browsing the Reddit pages with the integrated GPU

Categories

(Core :: Graphics: WebRender, defect, P3)

Other Branch
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox57 --- unaffected
firefox58 --- unaffected

People

(Reporter: pchang, Unassigned)

References

(Blocks 1 open bug, )

Details

(Whiteboard: [wr-reserve][gfx-noted])

I saw jank while browsing the Reddit website, which contains lots of text; it could be the same problem as this bug.

[STR] - tested on Windows 10
1. Use the scroll test tool from https://bug894128.bmoattachments.org/attachment.cgi?id=776049
2. Open https://www.reddit.com/r/NintendoSwitch/
3. Run the scroll test with the settings 50000, 10, 120

From the following profile, or from the 'webrender debug profiler', I saw that the average compositor CPU time was around 100 ms.
http://perfht.ml/2iCodPX

It looks like we spend a lot of time drawing and uploading textures.

55% -> cpu webrender::renderer::Renderer::draw_instanced_batch<webrender::gpu_types::PrimitiveInstance>
21% -> cpu webrender::device::Device::update_texture_from_pbo
Whiteboard: [wr-mvp] [triage]
I'm using a Dell XPS 13 with a dual-core CPU and an Intel Iris 540.

After applying the patches from bug 1380014, the GPU process was still CPU-bound.
With the VS 2017 profiler, I saw that build_scene + Document::render took about 55% of CPU time and Renderer::render took around 35%.

Function Name	Inclusive Samples	Exclusive Samples	Inclusive Samples %	Exclusive Samples %	Module Name
webrender::render_backend::Document::build_scene	5,231	2	40.85	0.02	xul.dll
webrender::render_backend::Document::render	1,914	0	14.95	0.00	xul.dll

webrender::renderer::{{impl}}::render::{{closure}}	4,576	0	35.74	0.00	xul.dll
-> webrender::renderer::Renderer::draw_tile_frame	2,315	1	18.08	0.01	xul.dll
-> webrender::renderer::Renderer::update_gpu_cache	2,174	0	16.98	0.00	xul.dll
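
As a rough cross-check of the percentages quoted in this comment, the Python sketch below recomputes them from the table rows. It assumes build_scene and Document::render are siblings rather than nested calls, so their inclusive percentages can simply be added. The short row labels and the helper name are illustrative only and do not come from the profiler.

```python
# Rows copied from the VS 2017 profile in this comment:
# (short label, inclusive samples, inclusive samples %).
ROWS = [
    ("Document::build_scene", 5231, 40.85),
    ("Document::render", 1914, 14.95),
    ("Renderer::render closure", 4576, 35.74),
]


def estimate_total_samples(rows):
    """Average the total sample count implied by each row (samples / fraction)."""
    estimates = [samples / (pct / 100.0) for _, samples, pct in rows]
    return sum(estimates) / len(estimates)


# Assumption: the two render-backend functions do not call each other,
# so their inclusive percentages do not double-count samples.
backend_pct = ROWS[0][2] + ROWS[1][2]
renderer_pct = ROWS[2][2]
total_samples = estimate_total_samples(ROWS)

print(f"render backend (build_scene + Document::render): {backend_pct:.2f}%")
print(f"renderer (Renderer::render): {renderer_pct:.2f}%")
print(f"implied total samples in the profile: about {total_samples:.0f}")
```

The three rows imply nearly the same total sample count, which suggests the percentages are all relative to the same process-wide total.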
Depends on: 1408174
If I disable APZ, the scrolling performance gets better.

In the profile with APZ off, build_scene is down to 24.7% of CPU time. We should fix bug 1408174.

[Profiler with APZ off]
Function Name	Inclusive Samples	Exclusive Samples	Inclusive Samples %	Exclusive Samples %	Module Name
webrender::render_backend::RenderBackend::process_document	2,251	0	51.19	0.00	xul.dll
-> webrender::render_backend::Document::render	1,151	0	26.18	0.00	xul.dll
-> webrender::render_backend::Document::build_scene	1,087	0	24.72	0.00	xul.dll

Function Name	Inclusive Samples	Exclusive Samples	Inclusive Samples %	Exclusive Samples %	Module Name
webrender::renderer::{{impl}}::render::{{closure}}	1,575	0	35.82	0.00	xul.dll
-> webrender::renderer::Renderer::update_gpu_cache	884	0	20.10	0.00	xul.dll
-> webrender::renderer::Renderer::draw_tile_frame	607	0	13.80	0.00	xul.dll
Can you try reproducing this with a yaml capture of the page?

Make sure you use -r in wrench during replay so that the scene is rebuilt.
Flags: needinfo?(howareyou322)
Priority: -- → P3
Whiteboard: [wr-mvp] [triage] → [wr-mvp] [triage][wr-reserve-candidate]
Peter, is this in a regular Nightly, or in a local build? If it's in a local build, please make sure you're testing a build with --enable-release, due to https://developer.mozilla.org/en-US/docs/Mozilla/Benchmarking#Rust_optimization_level
Whiteboard: [wr-mvp] [triage][wr-reserve-candidate] → [wr-reserve]
(In reply to Markus Stange [:mstange] from comment #4)
> Peter, is this in a regular Nightly, or in a local build? If it's in a local
> build, please make sure you're testing a build with --enable-release, due to
> https://developer.mozilla.org/en-US/docs/Mozilla/
> Benchmarking#Rust_optimization_level

I can still reproduce this with today's Nightly.

After testing on another laptop, I confirmed that the jank only happens with the integrated GPU. When I switched to the discrete GPU, the jank was gone. I'm using an Intel Iris 540.

(In reply to Jeff Muizelaar [:jrmuizel] from comment #3)
> Can you try reproducing this with a yaml capture of the page?
> 
> Make sure you use -r in wrench during replay so that the scene is rebuilt.

Sure. Working on it.
Flags: needinfo?(howareyou322)
Whiteboard: [wr-reserve] → [wr-reserve][gfx-noted]
If I skip the 'box-shadow' effects on the Reddit website, the scrolling performance gets better. I will compare profiles with and without box-shadow to figure out the next step.
Summary: Jank while browsing the Reddit pages → Jank while browsing the Reddit pages with the integrated GPU
The box-shadow problems are known and a big improvement will come from https://github.com/servo/webrender/pull/1954. It hasn't made it to mozilla-central yet though.
It's coming in bug 1412280, on autoland now.
Depends on: 1412280
(In reply to Jeff Muizelaar [:jrmuizel] from comment #7)
> The box-shadow problems are known and a big improvement will come from
> https://github.com/servo/webrender/pull/1954. It hasn't made it to
> mozilla-central yet though.

I tried this patch in a local build with the --enable-release flag. The scrolling frame rate increased from 6~10 fps to 20~25 fps. With APZ disabled, it rises to 30+ fps.
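
To put these frame rates next to the roughly 100 ms compositor CPU time reported in comment 0, the Python sketch below converts fps to per-frame time and compares it with a vsync interval. The 60 Hz refresh rate is an assumption (the display's refresh rate is not stated in this bug), and the function name is illustrative.

```python
ASSUMED_REFRESH_HZ = 60  # assumption: a 60 Hz display, not stated in this bug


def frame_time_ms(fps):
    """Return the average time per frame, in milliseconds, for a given frame rate."""
    return 1000.0 / fps


vsync_ms = frame_time_ms(ASSUMED_REFRESH_HZ)

# Frame rates reported in this comment.
for label, fps in [
    ("before patch (low end)", 6),
    ("before patch (high end)", 10),
    ("after patch (low end)", 20),
    ("after patch (high end)", 25),
    ("after patch, APZ disabled", 30),
]:
    ms = frame_time_ms(fps)
    print(f"{label}: {fps} fps = {ms:.1f} ms/frame, "
          f"about {ms / vsync_ms:.1f} vsync intervals")
```

Under that assumption, 10 fps corresponds to 100 ms per frame, which is in line with the compositor time measured originally, and even 30 fps still spans two vsync intervals per frame.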
(In reply to Peter Chang[:pchang] from comment #9)
> (In reply to Jeff Muizelaar [:jrmuizel] from comment #7)
> > The box-shadow problems are known and a big improvement will come from
> > https://github.com/servo/webrender/pull/1954. It hasn't made it to
> > mozilla-central yet though.
> 
> I tried this patch with --enable-release flag in my local. The fps of
> scrolling increases from 6~10 fps to 20~25 fps. If I disabled apz, the
> scrolling fps becomes 30+ fps.

That the framerate should improve so much with apz disabled is somewhat surprising to me. Possibly worth investigating.
(In reply to Peter Chang[:pchang] from comment #0)
> I saw jank while browsing the Reddit website which contained lots of text
> and could be the same problem with this bug.
> 
> [STR] - tested on windows 10
> 1. Use the scroll test tool from
> https://bug894128.bmoattachments.org/attachment.cgi?id=776049

Note that this bookmarklet is old, and uses main-thread-driven scrolling. I have a modified version at https://staktrace.github.io/moz-pages/scrolltest.html which does the scrolling using CSS smooth-scroll which should be taking advantage of APZ. It gives different results.

But anyway I'll take a look at scrolling on the reddit page and see if there's anything that can be improved on the APZ side.
On Windows 10, I made a --enable-release build with a logging patch applied, ran `MOZ_WEBRENDER=1 ./mach run https://www.reddit.com/r/NintendoSwitch/`, scrolled around a bunch (manually), and captured the output. I did this for both APZ enabled and disabled. You can see the patch and resulting output at [1].

Based on this output I'm seeing similar FPS for both cases, and in fact APZ enabled seems marginally better just by eyeballing the numbers. But regardless of APZ enabled vs disabled, most of the low framerate is due to the fact that we're dropping frames because WR is busy at [2]. So even if we assume APZ enabled is worse than APZ disabled, it's not because of work that the compositor thread is doing - it's because of work that WR is doing on the render thread as a result of APZ being enabled. I believe this mostly falls under bug 1408174.
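
The effect of dropping frames while the renderer is busy can be illustrated with a toy model. The Python sketch below is not the logic in WebRenderBridgeParent.cpp; it simply assumes that a composite is skipped at any vsync tick where the previous frame is still rendering, and that vsync ticks arrive every 16,667 microseconds (a 60 Hz display). All names and numbers are hypothetical.

```python
def simulate_vsync(render_us, ticks=60, vsync_us=16_667):
    """Count rendered and dropped frames over a number of vsync ticks.

    Toy model: a new frame starts at a tick only if the renderer is idle;
    otherwise that tick is counted as a dropped frame.
    """
    busy_until = 0
    rendered = 0
    dropped = 0
    for i in range(ticks):
        now = i * vsync_us
        if now >= busy_until:
            # Renderer is idle: start a frame that occupies it for render_us.
            busy_until = now + render_us
            rendered += 1
        else:
            # Renderer is still busy with the previous frame: skip this tick.
            dropped += 1
    return rendered, dropped


# Hypothetical per-frame render costs, in microseconds.
for render_us in (10_000, 40_000, 100_000):
    rendered, dropped = simulate_vsync(render_us)
    print(f"render cost {render_us / 1000:.0f} ms: "
          f"{rendered} frames rendered, {dropped} dropped in 60 ticks")
```

In this model the frame rate depends only on how long the render thread stays busy per frame, which matches the point above that the cost lies in WR's render-thread work rather than in the compositor thread.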

Another possible reason that APZ being enabled might negatively impact WR's render thread perf is if the creation of extra scrolling clips and such slows down the WR-side processing. A fix for this would again be on the WR side, to deal more efficiently with scrolling clips - having these scrolling clips is a requirement for APZ to work at all so we can't just eliminate them on the gecko side.

[1] https://gist.github.com/staktrace/d3bd8f8e4a3addad2871c2312dfbc69a
[2] http://searchfox.org/mozilla-central/rev/423b2522c48e1d654e30ffc337164d677f934ec3/gfx/layers/wr/WebRenderBridgeParent.cpp#1148
It looks like this issue was improved by bug 1408174. Now I get similar FPS with and without APZ (this only happens with the integrated GPU).
But I still see high CPU usage on the renderer thread, and one of the bottlenecks is glClear.

The following is an open issue in WR:
https://github.com/servo/webrender/issues/1440
This seems to mostly work now.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME