Closed Bug 1619093 Opened 4 years ago Closed 2 years ago

Stutters caused by texture uploads with blob images

Categories

(Core :: Graphics: WebRender, defect, P3)

73 Branch
Desktop
Windows 10
defect

Tracking


RESOLVED INCOMPLETE

People

(Reporter: dobry1407, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:73.0) Gecko/20100101 Firefox/73.0

Steps to reproduce:

Scroll a heavy page with GIFs or videos.

Actual results:

Scrolling stutters with WebRender enabled, but not with Direct3D11 (Advanced Layers).

Expected results:

Scrolling should be silky smooth, as it is with Direct3D11 (although WebRender does make pages load faster).

Component: Untriaged → Graphics: WebRender
OS: Unspecified → Windows 10
Product: Firefox → Core
Hardware: Unspecified → Desktop

It'd be great if you could provide an example page where you are seeing this.

Flags: needinfo?(dobry1407)

(In reply to Timothy Nikkel (:tnikkel) from comment #1)

It'd be great if you could provide an example page where you are seeing this.

Facebook, Reddit, Twitter, YouTube. Basically the more GIFs or videos, the worse it gets.

Flags: needinfo?(dobry1407)

(In reply to dobry1407 from comment #2)

(In reply to Timothy Nikkel (:tnikkel) from comment #1)

It'd be great if you could provide an example page where you are seeing this.

Facebook, Reddit, Twitter, YouTube. Basically the more GIFs or videos, the worse it gets.

Can you capture a profile of the problem happening? There are instructions here: https://profiler.firefox.com/

Flags: needinfo?(dobry1407)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #3)

(In reply to dobry1407 from comment #2)

(In reply to Timothy Nikkel (:tnikkel) from comment #1)

It'd be great if you could provide an example page where you are seeing this.

Facebook, Reddit, Twitter, YouTube. Basically the more GIFs or videos, the worse it gets.

Can you capture a profile of the problem happening? There are instructions here: https://profiler.firefox.com/

Here you go:

https://profiler.firefox.com/from-addon/calltree/?globalTrackOrder=0-1-2-3-4-5-6-7-8-9-10-11&hiddenGlobalTracks=1-2-3-4-5-6-8&hiddenLocalTracksByPid=8312-1&localTrackOrderByPid=11240-1-0~8312-0-1~8036-0~10976-0~&thread=8&v=4

I don't know exactly how it works, but the stuttering should be visible in it.

Flags: needinfo?(dobry1407)

(In reply to dobry1407 from comment #4)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #3)

(In reply to dobry1407 from comment #2)

(In reply to Timothy Nikkel (:tnikkel) from comment #1)

It'd be great if you could provide an example page where you are seeing this.

Facebook, Reddit, Twitter, YouTube. Basically the more GIFs or videos, the worse it gets.

Can you capture a profile of the problem happening? There are instructions here: https://profiler.firefox.com/

Here you go:

https://profiler.firefox.com/from-addon/calltree/?globalTrackOrder=0-1-2-3-4-5-6-7-8-9-10-11&hiddenGlobalTracks=1-2-3-4-5-6-8&hiddenLocalTracksByPid=8312-1&localTrackOrderByPid=11240-1-0~8312-0-1~8036-0~10976-0~&thread=8&v=4

I don't know exactly how it works, but the stuttering should be visible in it.

This is the correct link: https://perfht.ml/2Vz5ztG

Can you add the "Renderer" thread in "Threads" section of the profiler settings and get a new profile?

Flags: needinfo?(dobry1407)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #6)

Can you add the "Renderer" thread in "Threads" section of the profiler settings and get a new profile?

https://perfht.ml/3aogYR4

Flags: needinfo?(dobry1407)

Definitely a bunch of interesting things going on in that profile.

  1. We seem to still be hitting the slow path in ensureRenderTarget() which should've been fixed by bug 1615694. It's worth looking into what's going on there.
  2. We're spending a bunch of time waiting on a D3D11 lock. This is something I'd expect if there was another thread using D3D11 but I don't know what thread that would be.

Also, FWIW, "Nano adblocker" is using a lot of CPU time. I'd be interested to know if disabling it improved things for you. (I filed https://github.com/NanoAdblocker/NanoCore/issues/311 upstream)

Flags: needinfo?(dobry1407)

Also, what url was this profile recorded on?

(In reply to Jeff Muizelaar [:jrmuizel] from comment #9)

Also, what url was this profile recorded on?

I can't recall exactly, but it was just Reddit and Facebook scrolling, if I remember correctly. Disabling Nano Adblocker doesn't seem to do much, but I haven't tested this extensively. Can I use uBlock Origin or Adblock Plus instead to ensure it won't hurt performance?

Flags: needinfo?(dobry1407)

Can you get a new profile from Reddit on a public URL that you can share?

I would recommend uBlock Origin. It's more commonly used and we monitor its performance more closely.

Flags: needinfo?(dobry1407)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #8)

Definitely a bunch of interesting things going on in that profile.

  1. We seem to still be hitting the slow path in ensureRenderTarget() which should've been fixed by bug 1615694. It's worth looking into what's going on there.
  2. We're spending a bunch of time waiting on a D3D11 lock. This is something I'd expect if there was another thread using D3D11 but I don't know what thread that would be.

Also, FWIW, "Nano adblocker" is using a lot of CPU time. I'd be interested to know if disabling it improved things for you. (I filed https://github.com/NanoAdblocker/NanoCore/issues/311 upstream)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #11)

Can you get a new profile from Reddit on a public URL that you can share?

I would recommend uBlock Origin. It's more commonly used and we monitor its performance more closely.

Nano Adblocker + Nano Defender profile (resource URL included, if that's what you asked for): https://perfht.ml/2PJvuLn
uBlock Origin profile: https://perfht.ml/39jBFxy

Honestly, I can't see much, if any, performance difference.

Flags: needinfo?(dobry1407)
Blocks: wr-perf
Priority: -- → P3

(In reply to Jeff Muizelaar [:jrmuizel] from comment #11)

Can you get a new profile from Reddit on a public URL that you can share?

I would recommend uBlock Origin. It's more commonly used and we monitor its performance more closely.

Do you have any more info about what's going on? Thank you in advance.

The author of uBlock Origin, from which Nano Adblocker is derived, responded to the issue I filed. Perhaps you can respond to what he says here: https://github.com/NanoAdblocker/NanoCore/issues/311#issuecomment-595825522

Hey Dzmitry, here is another bug it would be good to have your eyes on at some point.

Flags: needinfo?(dmalyshau)

Could you attach the output of "about:support" to the bug, please?
I'm going to look at the profiles; I want to see if this is related to a particular vendor/platform.

Assignee: nobody → dmalyshau
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Flags: needinfo?(dmalyshau) → needinfo?(dobry1407)

Jeff,

The wait in the D3D11 runtime is indeed strange. Zooming into the intensive part of a profile made with uBlock Origin, we spend 31% of the Renderer CPU time just waiting for a lock (inside the D3D11 runtime) when some D3D11ShaderResourceView (associated with one of our textures) is destroyed. The only D3D-related thing on the other threads in the GPU process that I can see is mozilla::D3D11DXVA2Manager::SupportsConfig(IMFMediaType*, float). Perhaps we are erroneously calling it too often, but inspecting the relevant code doesn't reveal anything wrong.

One thing we could try as a workaround is deferring gl.delete_texture calls to after the frame is rendered.
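
To make the idea concrete, here is a minimal sketch of such a deferred-deletion queue (illustrative only; GlTextureId, schedule and flush are hypothetical names, not the real WebRender device API):

type GlTextureId = u32;

#[derive(Default)]
struct DeferredDeletes {
    pending: Vec<GlTextureId>,
}

impl DeferredDeletes {
    // Called wherever the texture cache currently deletes a texture inline.
    fn schedule(&mut self, id: GlTextureId) {
        self.pending.push(id);
    }

    // Called once per frame, after the renderer has submitted its draw calls,
    // so the potentially contended driver-side destruction happens off the
    // hot path.
    fn flush(&mut self, delete_texture: &mut dyn FnMut(GlTextureId)) {
        for id in self.pending.drain(..) {
            delete_texture(id);
        }
    }
}

fn main() {
    let mut deletes = DeferredDeletes::default();
    deletes.schedule(42); // during frame building: queue instead of deleting
    // ... render and submit the frame ...
    deletes.flush(&mut |id| println!("glDeleteTextures({})", id));
}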

We spend an awful amount of time (24% of total time in one of the stacks) trying to upload the instance data, waiting on a Map in particular. We end up hitting RtlpEnterCriticalSectionContended, which implies simultaneous access to the device. I wonder if we are memory-starved and the driver is having a hard time finding new resident memory. The Map is done with the NO_OVERWRITE flag, so no stall waiting for GPU work is expected.
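
For context, a minimal sketch of why NO_OVERWRITE should not stall (my illustration, not WebRender code): the caller has to guarantee it never touches buffer ranges the GPU may still be reading, which is typically done with a ring/bump allocator over the instance buffer that is only reset once the GPU has finished with the earlier ranges:

// Hypothetical CPU-side allocator backing a NO_OVERWRITE map pattern.
struct RingAllocator {
    capacity: usize,
    head: usize, // next free byte this frame
}

impl RingAllocator {
    fn new(capacity: usize) -> Self {
        RingAllocator { capacity, head: 0 }
    }

    // Returns the offset to Map with NO_OVERWRITE, or None if the ring is
    // full and the caller must fall back to a synchronizing DISCARD map.
    fn allocate(&mut self, size: usize) -> Option<usize> {
        if self.head + size > self.capacity {
            return None;
        }
        let offset = self.head;
        self.head += size;
        Some(offset)
    }

    // Called once the GPU is known to be done with the earlier ranges.
    fn reset(&mut self) {
        self.head = 0;
    }
}

fn main() {
    let mut ring = RingAllocator::new(1 << 20); // 1 MiB instance buffer
    let offset = ring.allocate(4096).expect("first allocation fits");
    println!("map buffer with NO_OVERWRITE at byte offset {}", offset);
    ring.reset();
}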

Another small thing is 2.7% spent in gl::TransposeMatrixType(unsigned int). This isn't much, but we could save it by providing the matrix in the proper transposition to begin with.
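
A hypothetical sketch of that idea: transpose the matrix once on the CPU when it is produced, so the data already sits in the layout the backend expects and no per-upload transpose runs on the hot path (the function name and the assumption about where the transpose happens are mine, not WebRender's or ANGLE's):

// Swap a 4x4 matrix between row-major and column-major layout once, up front.
fn transpose4(m: &[f32; 16]) -> [f32; 16] {
    let mut out = [0.0f32; 16];
    for row in 0..4 {
        for col in 0..4 {
            out[col * 4 + row] = m[row * 4 + col];
        }
    }
    out
}

fn main() {
    let row_major: [f32; 16] = [
        1.0, 0.0, 0.0, 10.0,
        0.0, 1.0, 0.0, 20.0,
        0.0, 0.0, 1.0, 0.0,
        0.0, 0.0, 0.0, 1.0,
    ];
    let column_major = transpose4(&row_major);
    println!("{:?}", column_major);
}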

Another thing is 2.7% spent in rx::BufferD3D::promoteStaticUsage(gl::Context const*, unsigned long long). I'm surprised ANGLE is trying to promote anything, given that our instance data is re-built every time, and we definitely don't expect ANGLE to scan it and try to figure out whether it changes. Moreover, I don't quite understand why ANGLE tries to call promoteStaticUsage here: the argument on the caller side is the size of what's bound, while the argument on the callee side is what's unchanged, which is a mismatch in semantics.

So the overall diagnosis is that the system/driver is in a very bad state. Every access to internally synchronized objects involves locking a contended critical section. The question is why it's contended. I think having a Gecko Profile with all threads could shed more light.

The scrolling performance on this site is horrible with WebRender enabled:
https://www.epicgames.com

Scrolling either manually or with autoscroll, the CPU and GPU usage is unacceptable. It works smoothly enough with Direct3D 11 (Advanced Layers), with a lot less CPU and GPU usage.

That seems like a separate issue. I've filed bug 1683239 for that.

(In reply to Jeff Muizelaar [:jrmuizel] from comment #19)

That seems like a separate issue. I've filed bug 1683239 for that.

Sorry, I thought it was related because that site is pretty heavy and full of high-res images and embedded videos.

I don't know if this is going to help; it was one of the ideas to check based on the profile.

I prototyped one thing that can improve this. Could you download the binary artifact from https://treeherder.mozilla.org/jobs?repo=try&revision=b90499c1a3668008ab6cfbcfae1a50ba4f6add63&selectedTaskRun=Qc3MFiKlQ56ZUXwvQcTlLQ.0 (exact zip link: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Qc3MFiKlQ56ZUXwvQcTlLQ/runs/0/artifacts/public/build/target.zip), enable WebRender, and double-check what the perf looks like on that machine on the affected websites? A fresh Firefox profile would be great!

There's an r+ patch which didn't land and no activity in this bug for 2 weeks.
:kvark, could you have a look please?
For more information, please visit auto_nag documentation.

Flags: needinfo?(dmalyshau)

dobry1407, would you be able to try out a build linked from https://bugzilla.mozilla.org/show_bug.cgi?id=1619093#c22? That would be very useful for us to know if the fix can be landed.

Flags: needinfo?(dmalyshau)

(In reply to Dzmitry Malyshau [:kvark] from comment #24)

dobry1407, would you be able to try out a build linked from https://bugzilla.mozilla.org/show_bug.cgi?id=1619093#c22? That would be very useful for us to know if the fix can be landed.

Hi Dzmitry.

It is a shame that dobry1407 has not replied yet. My apologies if I am hijacking here, but I have had the same problem for some time. I have just tested your linked build on a fresh profile, and for me it is significantly better.

I am testing using https://browserbench.org/MotionMark1.2/. I zoom in to 200% and then use auto-scroll to slowly move up and down while the background colour changes. In the latest nightly I get terrible stutter on every background colour change: https://share.firefox.dev/3sY9wa9
With your linked build I get no stutter. Unfortunately it's too old to upload directly to profiler.firefox.com. Could I send to you via another means, or would you prefer to do another build for us to confirm with? (given this issue is a bit dated)

(In reply to pexxie from comment #25)

I am testing using https://browserbench.org/MotionMark1.2/. I zoom in to 200% and then use auto-scroll to slowly move up and down while the background colour changes. In the latest nightly I get terrible stutter on every background colour change: https://share.firefox.dev/3sY9wa9
With your linked build I get no stutter. Unfortunately it's too old to upload directly to profiler.firefox.com. Could I send to you via another means, or would you prefer to do another build for us to confirm with? (given this issue is a bit dated)

FYI
v78 ESR with WebRender is smooth.
https://share.firefox.dev/3mJHePw
https://pastebin.com/6zc4Y3QG

pexxie, are you able to use mozregression to find the change that regressed it?

Flags: needinfo?(pexxie)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #27)

pexxie, are you able to use mozregression to find the change that regressed it?

Hi Jeff. Thanks. That would have been fun, but I didn't need it.

For this specific stutter: I found it in the 78 release, but not in 78 ESR. I ended up comparing the exact same release and ESR versions:
Stutter: https://ftp.mozilla.org/pub/firefox/releases/78.0.2/win64/en-US/
Smooth: https://ftp.mozilla.org/pub/firefox/releases/78.0.2esr/win64/en-US/
I did clean installs of each after removing all Mozilla folders from disk, then manually enabled WebRender using gfx.webrender.all = true.

I guess Release and ESR are being built/compiled differently. Therein the problem may lie...

Flags: needinfo?(pexxie)

There could be a number of things that cause the release vs ESR difference. It would be more valuable to have a mozregression run because that will ensure the same environment and point to the causing change.

(In reply to Jeff Muizelaar [:jrmuizel] from comment #29)

There could be a number of things that cause the release vs ESR difference. It would be more valuable to have a mozregression run because that will ensure the same environment and point to the causing change.

So I can confirm dobry1407's initial report that it was introduced in release version 73. However, there's a problem using mozregression. After about 4-5 good builds starting from release 72, I get critical errors and an "Unable to find enough data to bisect" message long before reaching release 73. I get it with both shippable and opt builds. I tried dev too, but that only keeps builds for a year and started me off with 82.

https://imgur.com/a/aHCpNoV
https://imgur.com/wScnqij

(In reply to pexxie from comment #30)
Thanks! Please give us the narrowest "pushlog url" you can get, the one from the last step.

"Unable to find enough data to bisect"

If the change is older than a year, mozregression is less precise (old autoland builds get deleted), but that's not a problem.

Got it, I think. This tool is awesome, by the way. Sorry, I got royally confused with the release versions and build dates.

Here's where it first went bad:

app_name: firefox
build_date: 2020-06-10
build_file: C:\Users\Gareth.mozilla\mozregression\persist\2020-06-10--mozilla-central--firefox-79.0a1.en-US.win64.zip
build_type: nightly
build_url: https://archive.mozilla.org/pub/firefox/nightly/2020/06/2020-06-10-21-40-41-mozilla-central/firefox-79.0a1.en-US.win64.zip
changeset: fab7c4f54054ceb06504c7ddac380858e2521fc4
pushlog_url: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=63dc5e9b1b02b0aebd6badfe5eaef7bb9aa8f430&tochange=fab7c4f54054ceb06504c7ddac380858e2521fc4
repo_name: mozilla-central
repo_url: https://hg.mozilla.org/mozilla-central

https://imgur.com/a/N2CMT5e

Sorry, accidental there with the font size. Can't edit or delete.

My guess is that this was caused by bug 1641751

(In reply to Jeff Muizelaar [:jrmuizel] from comment #34)

My guess is that this was caused by bug 1641751

Jeff, you're awesome! Thanks so much, you put me on the right path.

I found the "fix." Comment out the following 2 lines and rebuild:
gfx/wr/webrender/src/texture_cache.rs
916: self.evict_items_from_cache_if_required(profile);
917: self.expire_old_picture_cache_tiles();

With the original build, my Texture cache update, Renderer and Frame CPU Total times max out at just over 50 ms: https://imgur.com/a/4n3ZiNs
With my "fix", those times max out at just under 4 ms: https://imgur.com/a/JCZ2VM9

Renaming the bug to focus on the more recent discussion.

The problem is that the animated background is rendered via a CPU fallback which is uploaded to GPU textures. The stored pixels occupy a large amount of GPU memory (depending on the screen size), which puts pressure on the texture cache and causes things to be evicted. We then need to re-upload these textures when the color changes, because the change invalidates a large portion of the screen.
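
For a rough sense of scale (my arithmetic, not figures taken from the profiles): a single full-screen RGBA fallback layer at 2560x1440 is 2560 × 1440 × 4 bytes ≈ 14.7 MB, so a handful of such layers can quickly consume a large fraction of the texture cache budget.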

Bug 1641751 made it so that we do a better job of cleaning up the texture cache when under pressure. That saves a lot of memory, but it doesn't interact well with how this page creates regular spikes of texture cache pressure every few seconds. There are tweaks we could make, but I'd rather make them based on more common scenarios than 200% zoom on this specific page, since it's going to be a tradeoff between memory usage and texture upload time. There's also a longer-term project of not needing the CPU fallback at all.
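
As a purely hypothetical illustration of the kind of tweak being weighed here (a sketch only, not the actual texture_cache.rs logic or a proposed patch): eviction could require the cache to stay over budget for several consecutive frames before kicking in, so a brief spike every few seconds does not flush entries that will immediately have to be re-uploaded.

// Hypothetical hysteresis for cache eviction; names are illustrative.
struct EvictionHysteresis {
    frames_over_budget: u32,
    threshold: u32,
}

impl EvictionHysteresis {
    fn new(threshold: u32) -> Self {
        EvictionHysteresis { frames_over_budget: 0, threshold }
    }

    // Called once per frame with the current cache pressure.
    fn should_evict(&mut self, bytes_used: usize, budget: usize) -> bool {
        if bytes_used > budget {
            self.frames_over_budget += 1;
        } else {
            self.frames_over_budget = 0;
        }
        self.frames_over_budget >= self.threshold
    }
}

fn main() {
    let mut hysteresis = EvictionHysteresis::new(30); // ~half a second at 60 fps
    for frame in 0..40 {
        if hysteresis.should_evict(300 << 20, 256 << 20) {
            println!("frame {}: evicting", frame);
        }
    }
}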

Blocks: texture-upload-perf
No longer blocks: wr-perf
Summary: Scrolling smoothness is worse on WebRender. → Stutters caused by texture uploads with blob images

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: dmalyshau → nobody
Status: ASSIGNED → NEW

Redirect a needinfo that is pending on an inactive user to the triage owner.
:gw, since the bug has recent activity, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(dobry1407) → needinfo?(gwatson)

Nical, do we want to do anything with this? Should it be closed as incomplete?

Flags: needinfo?(gwatson) → needinfo?(nical.bugzilla)

We have a lot of higher-priority items to chew on. Normally I'd say let's keep it so that we have a test case for when we focus on texture upload perf, but I think we have enough texture upload perf bugs on file to focus on the ones that happen in more common settings.

Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(nical.bugzilla)
Resolution: --- → INCOMPLETE
