https://old.reddit.com/r/nier/ does not run smoothly at 120fps

UNCONFIRMED
Unassigned

Status

defect
P5
normal
UNCONFIRMED
11 months ago
4 months ago

People

(Reporter: pnm79623, Unassigned, NeedInfo)

Tracking

(Depends on 2 bugs, Blocks 1 bug, {leave-open, nightly-community})

63 Branch
x86_64
Windows 10

Firefox Tracking Flags

(Not tracked)

Attachments

(14 attachments)

4.72 MB, video/webm
17.16 KB, text/plain
9.99 MB, video/webm
9.01 MB, video/webm
9.95 MB, video/webm
4.49 MB, video/webm
47 bytes, text/x-phabricator-request
47 bytes, text/x-phabricator-request
47 bytes, text/x-phabricator-request
47 bytes, text/x-phabricator-request
47 bytes, text/x-phabricator-request
47 bytes, text/x-phabricator-request
47 bytes, text/x-phabricator-request
47 bytes, text/x-phabricator-request
Reporter

Description

11 months ago
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:62.0) Gecko/20100101 Firefox/62.0
Build ID: 20180713213322

Steps to reproduce:

Open https://old.reddit.com/r/nier/
When the page finishes loading, scrolling becomes consistently choppy.
If animation is blocked, scrolling becomes normal again.


Actual results:

Pages with complex animations that use transparency degrade scrolling performance when using WebRender.


Expected results:

Scrolling is smooth like when using Direct3D 11 Advanced Layers Compositing.
Reporter

Comment 1

11 months ago
Posted video Video demonstration
Video demonstration, first half WebRender OFF second half WebRender ON.
Could you please open about:support, click on the "Copy text to clipboard" button, paste it into a text file and upload it here (Attach File)? Thanks!
OS: Unspecified → Windows 10
Hardware: Unspecified → x86_64
Reporter

Comment 3

11 months ago
This seems to run well for me on Mac. I'll try it on Windows.
Flags: needinfo?(jmuizelaar)
Reporter

Comment 5

11 months ago
What's the refresh rate of your monitor? Can you also turn on gfx.webrender.debug.gpu-time-queries and gfx.webrender.debug.gpu-sample-queries?
Flags: needinfo?(jmuizelaar) → needinfo?(pnm79623)
Reporter

Comment 7

11 months ago
(In reply to Jeff Muizelaar [:jrmuizel] from comment #6)
> What's the refresh rate of your monitor? Can you also turn on
> gfx.webrender.debug.gpu-time-queries and
> gfx.webrender.debug.gpu-sample-queries?

Native 120Hz display. Attaching new video.
Flags: needinfo?(pnm79623)
Reporter

Comment 8

11 months ago
Great. It looks like we might be bottlenecked on the CPU side. Can you attach another video that has layers.acceleration.draw-fps on and WebRender off?
Flags: needinfo?(pnm79623)
Reporter

Comment 10

11 months ago
CPU is i7-4790k@4.4 on all cores (power states/steps/saving disabled)
Flags: needinfo?(pnm79623)
Summary: Animations with transparency causes choppy scrolling → https://old.reddit.com/r/nier/ does not run smoothly at 120fps
Depends on: 1477358
So I remembered that gfx.webrender.debug.gpu-time-queries and gfx.webrender.debug.gpu-sample-queries can have a big impact on performance. If you turn those off and set gfx.webrender.debug.compact-profiler=true, do you get a reasonably consistent frame rate of 120fps during scrolling? If so, is it still choppy even though the frame rate says 120fps?
Flags: needinfo?(pnm79623)
Reporter

Comment 12

11 months ago
(In reply to Jeff Muizelaar [:jrmuizel] from comment #11)
> So I remembered that gfx.webrender.debug.gpu-time-queries and
> gfx.webrender.debug.gpu-sample-queries can have a big impact on performance.
> If you turn those off and set gfx.webrender.debug.compact-profiler=true do
> you get a reasonably consistent frame rate of 120fps during scrolling? If so
> is it still choppy even though the frame-rate says 120fps?

With WebRender I'm getting 120FPS while "stationary". During scrolling FPS drops but rarely to double digits. You can see it in this attachment.
https://bug1476368.bmoattachments.org/attachment.cgi?id=8993540

With advanced layers FPS stays very high even during scrolling. It still dips to 113-117, but it doesn't feel weird the way it does when WR is used.

On another note, yesterday I tried forcing maximum performance mode on the GPU for Firefox, but it didn't help at all.

Is there anything else I can do that might help?
Flags: needinfo?(pnm79623)
There might be a frame rate consistency issue here. We'll try to add some better metrics to the HUD to get a better idea of what's going on.
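
To illustrate why an average frame rate can look fine while scrolling still feels choppy, here is a minimal Rust sketch of a frame pacing metric. It is not the WebRender HUD code: the `PacingStats` struct, the `pacing_stats` function, and the "more than 1.5x the frame budget counts as a missed frame" threshold are all assumptions made for this example.

```rust
/// Summary of frame pacing computed from presentation timestamps.
/// (Hypothetical type, not part of WebRender.)
#[derive(Debug, PartialEq)]
struct PacingStats {
    /// Frames per second averaged over the whole sample.
    average_fps: f64,
    /// Longest gap between two consecutive frames, in milliseconds.
    worst_interval_ms: f64,
    /// Number of intervals longer than 1.5x the refresh budget.
    missed_frames: usize,
}

/// `timestamps_ms` are frame presentation times in milliseconds and
/// `refresh_hz` is the display refresh rate. Returns `None` when there
/// are fewer than two timestamps, since no interval can be measured.
fn pacing_stats(timestamps_ms: &[f64], refresh_hz: f64) -> Option<PacingStats> {
    if timestamps_ms.len() < 2 {
        return None;
    }
    // Time available per frame, e.g. about 8.33 ms at 120 Hz.
    let budget_ms = 1000.0 / refresh_hz;
    let intervals: Vec<f64> = timestamps_ms.windows(2).map(|w| w[1] - w[0]).collect();
    let total_ms: f64 = intervals.iter().sum();
    let worst = intervals.iter().cloned().fold(0.0_f64, f64::max);
    // Assumed threshold: an interval this long means at least one vsync was skipped.
    let missed = intervals.iter().filter(|&&dt| dt > budget_ms * 1.5).count();
    Some(PacingStats {
        average_fps: 1000.0 * intervals.len() as f64 / total_ms,
        worst_interval_ms: worst,
        missed_frames: missed,
    })
}
```

For instance, calling `pacing_stats(&[0.0, 8.0, 16.0, 40.0, 48.0], 125.0)` reports one missed frame and a worst interval of 24 ms, which a plain FPS counter would only show as a lower average.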

Can you install the Gecko profiler add-on https://perf-html.io/, open the add-on and go to settings, add the following threads "RenderBackend,Renderer,WebRender,Wr" to the list and then get a profile of the scrolling slowdown?
Flags: needinfo?(pnm79623)
Priority: -- → P3
Reporter

Comment 14

10 months ago
Posted video CPU usage
So I just quickly checked whether anything changed in almost a month of nightly development, but unfortunately even more problems appeared. CPU usage is very high on that page (four times higher than without WebRender: 20% vs 5%). CPU usage drops to 0 if I close the tab or minimize Firefox.

I will test performance with that addon later.
Flags: needinfo?(pnm79623)
Priority: P3 → P4
Jeff -- Have you ever reproduced this on Windows/nvidia?  Have we added metrics (ref: Comment 13)?
Reporter -- Does this still happen for you?  What driver are you using? (nvidia?  Intel? something else?)  Have you only seen it on one machine so far?

Thanks!
Flags: needinfo?(pnm79623)
Flags: needinfo?(jmuizelaar)
I still see some weird frame pacing on this profile: https://perfht.ml/2EaT6oV
Flags: needinfo?(jmuizelaar)
Depends on: 1487864
Reporter

Comment 18

9 months ago
(In reply to Maire Reavy [:mreavy] Plz needinfo from comment #15)
> Jeff -- Have you ever reproduced this on Windows/nvidia?  Have we added
> metrics (ref: Comment 13)?
> Reporter -- Does this still happen for you?  What driver are you using?
> (nvidia?  Intel? something else?)  Have you only seen it on one machine so
> far?
> 
> Thanks!

This is still happening, and CPU usage is very high compared to advanced layers. nVidia GTX1070, drivers 416.34, Windows 10 1809
Flags: needinfo?(pnm79623)

How does this look for you now?

Flags: needinfo?(pnm79623)
Priority: P4 → P5

This page performs poorly on my X1 Carbon laptop (Linux + Intel). From a quick look in perf, there seems to be some low-hanging fruit to pick. Here's what stands out at a glance:

  • The average number of retired instructions per cycle is low (which usually means unhappy caches). On this CPU I usually get about 2.0 ins/cycle for instruction-bound workloads and here I'm getting 0.6.
  • There are a lot of page faults happening in RenderTaskTree::add. Looks like we can't recycle the allocation because the tree is sent to the render thread, but we can record the previously allocated size and pre-allocate the vectors each frame (*).
  • A lot of instruction cache misses on the render thread in driver code and in draw_instanced_batch. Dzmitry's suggestion to remove redundant gl calls might help here (Edit: I got mixed up, the suggestion was in another bug).
  • PrimitiveStore::update_visibility is high in number of samples and also in the number of data cache misses. This one might not be low-hanging fruit, but I'm pointing it out because this function consistently shows up at the top of profiles for me lately.

(*): Actually, even though we send the RenderTaskTree to the renderer it looks like we only use it there for debugging purposes ... aaand no we do use it for non-debug things as well.
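
The "record the previous size and pre-allocate each frame" idea can be sketched in Rust roughly as follows. This is only an illustration under stated assumptions, not the patch that landed: `TaskTree`, `FrameBuilder`, `prev_task_count`, and `build_frame` are hypothetical names standing in for the real render task tree and its builder.

```rust
/// Hypothetical stand-in for a per-frame structure that is handed off to
/// another thread, so its allocation cannot be reused by the builder.
struct TaskTree {
    tasks: Vec<u64>,
}

/// Hypothetical builder that remembers how large the previous tree grew.
struct FrameBuilder {
    prev_task_count: usize,
}

impl FrameBuilder {
    fn new() -> Self {
        FrameBuilder { prev_task_count: 0 }
    }

    /// Builds this frame's tree. `task_count` models the amount of work
    /// in the scene for this frame.
    fn build_frame(&mut self, task_count: usize) -> TaskTree {
        // Reserve up front using last frame's size. As long as this frame
        // is no larger than the last one, the pushes below do not have to
        // grow the vector, which avoids repeated reallocation.
        let mut tree = TaskTree {
            tasks: Vec::with_capacity(self.prev_task_count),
        };
        for i in 0..task_count {
            tree.tasks.push(i as u64);
        }
        // Record the size so the next frame can pre-allocate.
        self.prev_task_count = tree.tasks.len();
        // Ownership moves to the caller (for example, to be sent to
        // another thread), so the allocation itself is not recycled.
        tree
    }
}
```

The trade-off in this sketch is one allocation per frame of roughly the right size, instead of several growing reallocations as items are pushed.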

On a machine with a more powerful CPU and a 4k screen the biggest problem is GPU times with lots of time spent in B_Blend.
(Edit: I had picture caching disabled, my bad, it does wonders on this page).

On the CPU side, it clearly doesn't help that the banner at the top has a CSS animation on background-position, which causes us to continuously go through DL building, scene building, frame building and rendering even when it is off-screen.

Optimizations will help, but the best way to really cut CPU times for this type of page is to add support for more animated properties.

Comment 23

4 months ago
Pushed by nsilva@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b7ad79d07c44
Preallocate the render task tree. r=kvark
Keywords: leave-open

Comment 26

4 months ago
Pushed by nsilva@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/df0e32716df1
Preallocate a few more items in the render task tree. r=kvark

Here is a series of small changes in a general attempt to reduce the __memmove_avx_unaligned_erms samples that were towards the top of the profile and consolidated into a single group. Perf wouldn't give me the stacks for these samples unfortunately, so I resorted to gdb breakpoints to figure out where a lot of these memmoves come from.

Removing some of the memmoves might not yield real perf wins if the time was mostly spent waiting for cold misses (the read will still happen), but if anything these patches reduce the amount of perf samples that fall into the __memmove_avx_unaligned_erms bucket and the potential cache misses move into hopefully more helpful symbols. Also the changes are generally trivial.

One of the most notable sources of frequent small __memmove_avx_unaligned_erms on the render backend is moving TransformUpdateState in ClipScrollTree::update_tree. It isn't as straightforward to reduce as the first wave of changes, though.

Another thing that came up while profiling this page is the cost of moving/hashing/cloning FontInstance, but that required more involved surgery, so I filed bug 1529272 for that.

Comment 35

4 months ago
Pushed by nsilva@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b512bb508796
Pre-allocate vectors in HitTester::read_clip_scroll_tree. r=kvark
https://hg.mozilla.org/integration/mozilla-inbound/rev/932ab91a896e
Pre-allocate primitives vector in setup_picture_caching. r=gw
https://hg.mozilla.org/integration/mozilla-inbound/rev/d0c93c9acd66
Avoid moving texture cache entries when evicting them. r=gw
https://hg.mozilla.org/integration/mozilla-inbound/rev/659354cec17d
Reserve storage for the dynamic property vectors in the bindings before filling them. r=kvark
https://hg.mozilla.org/integration/mozilla-inbound/rev/9af23e1d86c8
Avoid moving picture primitives when destroying them. r=gw
https://hg.mozilla.org/integration/mozilla-inbound/rev/7102801e2ca8
Add VecHelper::take/take_and_preallocate. r=gw
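
As a rough idea of what helpers with these names could look like, here is a minimal Rust sketch. Only the names `VecHelper`, `take`, and `take_and_preallocate` come from the commit message above; the trait shape and method bodies are assumptions for illustration and may differ from what actually landed.

```rust
use std::mem;

/// Hypothetical helper trait. Only the method names are taken from the
/// commit message; the behavior shown here is an assumption.
trait VecHelper {
    /// Moves the contents out, leaving an empty vector behind.
    fn take(&mut self) -> Self;

    /// Moves the contents out, leaving behind an empty vector that already
    /// has room for as many items as were just taken.
    fn take_and_preallocate(&mut self) -> Self;
}

impl<T> VecHelper for Vec<T> {
    fn take(&mut self) -> Self {
        // Swaps in `Vec::new()`, which does not allocate.
        mem::take(self)
    }

    fn take_and_preallocate(&mut self) -> Self {
        // Assume the next fill will be about as large as this one.
        let len = self.len();
        mem::replace(self, Vec::with_capacity(len))
    }
}
```

In this sketch, both methods hand over the vector's buffer without copying its elements; `take_and_preallocate` additionally pays for one allocation up front so that refilling the vector does not grow it step by step.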