Closed Bug 1628913 Opened 4 years ago Closed 3 years ago

WebRender causes 100% CPU usage while monitor is off (DPMS) on Linux with NVIDIA drivers

Categories

(Core :: Graphics: WebRender, defect, P3)

77 Branch
x86_64
Linux
defect

Tracking


RESOLVED WORKSFORME
Tracking Status
firefox78 --- disabled
firefox79 --- disabled

People

(Reporter: bubbleguuum, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: power)

Attachments

(3 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:74.0) Gecko/20100101 Firefox/74.0

Steps to reproduce:

Issue happens on both Firefox 74.0.1 provided by openSUSE and current nightly as of the date of this bug report.

My setup:

  • Thinkpad P72 with NVIDIA Quadro P600
  • Distro: openSUSE Tumbleweed
  • Kernel 5.6.2 (also happened with older kernels)
  • NVIDIA drivers 440.82 (also happened with older versions)
  • Xorg 1.20.8
  • Firefox 74.0.1 and Firefox Nightly as of the date of this bug report (77.0a1 (2020-04-09), 64-bit)
  • WebRender enabled (gfx.webrender.all=true)
  • Desktop: Plasma 5.18.4 with compositor, or i3+picom (it does not matter if compositor is enabled or not)
  • using an external Dell P2415Q 4K monitor over Thunderbolt 3 (DisplayPort)

The problem:

Whenever my monitor goes to sleep via DPMS (either automatically or via a hotkey triggering 'xset dpms force off'), Firefox systematically jumps to 100% CPU usage after a few seconds and stays there until I wake the monitor, at which point CPU usage goes back to normal. I can hear the CPU load (the laptop fans suddenly kick in) and see it via htop in a remote ssh session.

This issue does not happen if either:

  • I disable WebRender. WebRender otherwise works fine and makes scrolling much smoother than without it.
  • I do not use the NVIDIA driver and use my Intel UHD 630 instead with the modesetting driver (with WebRender enabled)

=> this issue is specific to NVIDIA and Firefox when using OpenGL (it also happens with layers.acceleration.force-enabled=true). The desktop environment does not matter, in particular whether there is a compositor or not. It is also not specific to a recent version of the kernel, Xorg, the NVIDIA driver, or Firefox, as I have had this issue for many months.

///////////

Nightly profiler report: https://perfht.ml/2wsfib4

In this report:

  • started Firefox Nightly
  • loaded a web page
  • started the profiler
  • waited about 10s and turned off the monitor with a hotkey running 'xset dpms force off'
  • CPU usage went immediately to 100% for 2 processes, as monitored from an ssh session with htop
  • waited for about 30-40s with 100% CPU usage
  • woke up the monitor by pressing a key
  • CPU usage went back to normal; waited about 10s before stopping the profiler
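The htop-over-ssh observation in the steps above can also be captured programmatically. The helper below is hypothetical (it is not part of Firefox or the reporter's setup); it samples a PID's user+system time from /proc on Linux and reports the percentage of one core used over an interval, which should sit near 100 for each spinning Firefox process while the monitor is off:

```python
#!/usr/bin/env python3
"""Sample a process's CPU usage from /proc, roughly what htop displays."""
import os
import time

def cpu_ticks(pid):
    # /proc/<pid>/stat: after the ")" closing the comm field, utime and
    # stime are the 12th and 13th space-separated values (indices 11, 12).
    with open(f"/proc/{pid}/stat") as f:
        fields = f.read().rsplit(")", 1)[1].split()
    return int(fields[11]) + int(fields[12])

def cpu_percent(pid, interval=1.0):
    # Two samples of cumulative ticks, converted via the clock tick rate.
    hz = os.sysconf("SC_CLK_TCK")
    before = cpu_ticks(pid)
    time.sleep(interval)
    after = cpu_ticks(pid)
    return 100.0 * (after - before) / hz / interval

if __name__ == "__main__":
    # Sampling our own (mostly sleeping) process should print a low value.
    print(f"{cpu_percent(os.getpid(), 0.2):.1f}% of one core")
```

Run it against each Firefox PID over ssh to confirm which processes spin while DPMS has the monitor off.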

Actual results:

100% CPU usage when monitor is off with DPMS

Expected results:

Normal CPU usage when monitor is off with DPMS

Attached file about_support.txt

Attached content of about:support

Blocks: wr-linux
Priority: -- → P3

There's nothing obviously spinning in that profile. Can you perhaps take a profile using 'perf'?

Flags: needinfo?(bubbleguuum)

Here's the output of 'perf record', recorded via ssh for 1 minute with monitor off and 2 Firefox processes taking 100% CPU.
'perf report' shows GLXVsyncThread at the top.

Flags: needinfo?(bubbleguuum)

Ok, that's sort of what I was expecting. I think we probably need to throw our Linux vsync implementation in the trash.
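The perf profile pointing at GLXVsyncThread suggests that the wait-for-vblank call stops blocking once DPMS turns the display off, so the vsync loop degenerates into a busy loop. The following toy model is not Firefox code, and the immediate-return behaviour of the driver call is an assumption inferred from the profile; it only sketches the failure mode and the kind of software-timer fallback that would prevent spinning:

```python
"""Toy model of a vsync thread spinning at 100% CPU when the monitor is off."""
import time

def wait_for_vblank(monitor_on):
    # Stand-in for the driver's wait-for-vblank: blocks roughly one frame
    # when the monitor is on, returns immediately when it is off (the
    # assumed NVIDIA behaviour behind this bug).
    if monitor_on:
        time.sleep(1 / 60)

def vsync_loop(monitor_on, frames, min_interval=1 / 120):
    """Tick `frames` times, throttling whenever the wait returns too fast."""
    ticks = 0
    for _ in range(frames):
        start = time.monotonic()
        wait_for_vblank(monitor_on)
        elapsed = time.monotonic() - start
        if elapsed < min_interval:
            # Fallback: pace the loop with a software timer instead of
            # re-entering the driver call in a tight loop.
            time.sleep(min_interval - elapsed)
        ticks += 1
    return ticks
```

Without the fallback sleep, the monitor-off case calls wait_for_vblank back-to-back with no delay, which is exactly the 100% CPU that htop shows.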

Depends on: vsync

In case you do, it would be great if the new one would not rely on the root window - it causes several issues, among them bad xwayland support. See bug 1588589 / https://gitlab.freedesktop.org/xorg/xserver/issues/915#note_257271

Thank you for looking into it.

In the meantime, I've made an interesting discovery and found a workaround:

My laptop supports Optimus and thus has 2 video cards:

  • an Intel UHD 630
  • a NVIDIA Quadro P600

As mentioned in OP, I mainly use an external 4K/60Hz Monitor connected via DP over Thunderbolt 3. The laptop's panel is also 4K/60Hz.
My Xorg config file is setup for Optimus, thus with both the modesetting and nvidia xorg drivers loaded.
When displaying on the laptop's panel, the nvidia driver renders off-screen in a way the UHD 630 can display, with the nvidia and modesetting drivers cooperating.
When displaying on the external monitor, the nvidia driver handles it entirely.

I found out that if I start Firefox on the laptop's panel and turn off the panel with DPMS, there is no issue and CPU usage is normal.
Moreover, if I subsequently switch to the external monitor (with the laptop panel off) so Firefox is displayed there, there is no issue when the monitor is off!
=> So the workaround is to start Firefox on the laptop's panel prior to switching to the external monitor.

To conclude, this issue only happens if Firefox is started on the external monitor. However if I then switch to the laptop's panel, it does not
happen. If I switch back to the external monitor, it happens again.

Attached my Optimus xorg config

Attachment #9139936 - Attachment description: Xorg configuration is an Optimus setup → Xorg Optimus configuration

Resetting severity to default of --.

Getting a similar issue with Gentoo Linux 4.19.113, nvidia-drivers 440.64, xorg 1.20, and Firefox 75.

I have webrender disabled but HW acceleration otherwise enabled. If I disable HWaccel the problem goes away.
If I enable webrender the problem also goes away but it looks like HWaccel goes away too (at least vsync does).

I'm running just one monitor but the story is probably not that simple.

I'll adapt to disabling DPMS for the time being. Using TV panels as monitors I'm used to closing them via remote anyway.

PS. I absolutely rely on working vsync so I'd hate to lose it just because of this silly bug.

Because this bug's Severity has not been changed from the default since it was filed, and its Priority is P3 (Backlog), indicating it has been triaged, the bug's Severity is being updated to S3 (normal).

Severity: normal → S3
See Also: → 1640779
Blocks: wr-nv-linux
No longer blocks: wr-linux
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
See Also: → 1587040
See Also: → 1592530

FTR, I think an elegant way to go forward here would be implementing bug 1563075 and then going for a compositor-driven frame source on X11 using _NET_WM_FRAME_DRAWN. That is close to how GTK4's OpenGL backend works and would also solve bug 1640779 (the current vsync only works with GLX).

I'm planning to look into that soon, as it would also benefit the Wayland backend and potentially other platforms.

From the additional reports in bug 1592530, I think we can confirm this. Blocking against wr-linux for now, just so that it isn't forgotten. If we follow up on some of the suggestions like comment 11, this would impact all hardware and might be worth blocking shipping against depending on the level of effort.

Blocks: wr-linux
Status: UNCONFIRMED → NEW
Ever confirmed: true

(In reply to Robert Mader [:rmader] from comment #11)

FTR, I think an elegant way to go forward here would be implementing bug 1563075 and then going for a compositor-driven frame source on X11 using _NET_WM_FRAME_DRAWN. That is close to how GTK4's OpenGL backend works and would also solve bug 1640779 (the current vsync only works with GLX).

Update on this: apparently this is very hard to realize on top of GTK, as it uses the signals itself. On Wayland it's a slightly different story, as we already do more things on our own. The GTK people say we should consider not using upstream GTK - maybe we're getting to a point where it's easier to fork those parts of GTK (mostly GDK) that we are actually using and adapt them to our needs, just as Mutter forked Clutter.

The last update, from half a year ago:

status-firefox78: --- → disabled
status-firefox79: --- → disabled

Is this now fixed? I have no idea what "disabled" means.

It's still happening for me with Firefox 84.0 on Gentoo Linux, with kernel 5.9.12 and nvidia driver version 455.45.01.

(I've only recently noticed this happening and it took me a while to trace it back to Firefox. I usually have layers.acceleration.force-enabled set to false, and I think this bug never happened then. But with this option set to false, Foundry VTT has not initialized a WebGL context since Firefox 83.0, when it previously worked. I don't know if that's a Firefox bug, a Foundry VTT bug, or not a bug at all, though.)

@drmccoy see bug 1679671 for context on that. VTT is probably using failIfMajorPerformanceCaveat.

Blocks: wr-linux-perf
No longer blocks: wr-linux

NV specific bug, thus not blocking bug 1690581 any more.

No longer blocks: wr-linux-perf

Does this problem still occur?

If yes, can it be fixed by starting Firefox 92 (minimum 91: bug 1646135) with MOZ_X11_EGL=1 environment variable?
$ MOZ_X11_EGL=1 firefox

At least for me, this problem is gone with Firefox 92.0 (and nvidia-drivers 470.63.01).

(In reply to drmccoy from comment #20)

At least for me, this problem is gone with Firefox 92.0 (and nvidia-drivers 470.63.01).

Gone with Firefox 92 even without setting the MOZ_X11_EGL=1 env var - or only if you set it?

Ah, sorry, yeah, gone even without setting that env var.

(In reply to bubbleguuum from comment #0)
Is this bug still reproducible for you with Firefox 92?

If yes, can it be fixed by starting Firefox 92 (minimum 91: bug 1646135) with MOZ_X11_EGL=1 environment variable?
$ MOZ_X11_EGL=1 firefox

Flags: needinfo?(bubbleguuum)

It has not happened for a while (I do not remember since when), without MOZ_X11_EGL set,
with WebRender force-enabled and NVIDIA drivers 470.63.01, Firefox 92, openSUSE Tumbleweed.

Thanks for checking! :)

Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(bubbleguuum)
Resolution: --- → WORKSFORME
