Closed Bug 1678897 Opened 4 years ago Closed 4 years ago

X11/EGL Firefox causes higher power consumption on Intel CPUs in dual-gpu setups with proprietary nvidia driver, even if only integrated intel gpu is used

Categories

(Core :: Widget: Gtk, defect)

Firefox 83
x86_64
Linux
defect

Tracking

RESOLVED MOVED
Tracking Status
firefox83 --- disabled
firefox84 --- disabled
firefox85 --- ?

People

(Reporter: valentin.manea, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: power)

Attachments

(2 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0

Steps to reproduce:

  1. Start Firefox with MOZ_X11_EGL=1
  2. Start powertop and watch the Pkg(HW) power states attained
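
For anyone who wants to switch between the two backends repeatedly while watching powertop, the steps above can be sketched as a small helper. This is only a sketch: `egl_env` and `launch_firefox` are hypothetical names, and it assumes a `firefox` binary on PATH.

```python
import os
import subprocess


def egl_env(use_egl, base_env=None):
    """Return a copy of the environment with MOZ_X11_EGL set to "1" or "0"."""
    env = dict(os.environ if base_env is None else base_env)
    env["MOZ_X11_EGL"] = "1" if use_egl else "0"
    return env


def launch_firefox(use_egl, binary="firefox"):
    """Start Firefox with the chosen backend; `binary` is an assumed name on PATH."""
    return subprocess.Popen([binary], env=egl_env(use_egl))
```

Calling `launch_firefox(True)` corresponds to step 1; `launch_firefox(False)` gives the GLX baseline for comparison.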

Actual results:

The CPU/Pkg cannot reach the deepest state available (C10/pc10), which results in noticeably higher baseline power consumption for the system.

Expected results:

The CPU/Pkg should reach the deepest state available as it does with MOZ_X11_EGL=0

This problem affects Wayland as well, since I suppose Firefox on Wayland always uses EGL.

I should note that I tried Firefox 83 and Firefox 84 from the beta channel; the results are exactly the same.
I also tried with an empty profile, and the results were exactly the same.

As it stands now, it seems I have to choose between accelerated video decoding (with EGL) and the system going to deeper idle states (with GLX).

Component: Untriaged → Widget: Gtk
Product: Firefox → Core

Thanks for the report. Please open about:support, click on "Copy text to clipboard" and paste it here. Which Linux distribution, version and desktop environment are you using?

Keywords: power
OS: Unspecified → Linux
Hardware: Unspecified → x86_64

Attached about:support

I'm running this on KDE Neon, which is based on Ubuntu 20.04.
Running a quite recent Mesa from a PPA: Mesa 20.3.0-devel (git-2b977ad 2020-10-06 bionic-oibaf-ppa)

Target Frame Rate: 60

Display0: 3000x2000 default

Thanks! Can you check if this difference still occurs with Nightly?
It would also be interesting to find out (with Nightly) if Gnome X11 is likewise affected or if it only occurs on KDE X11.

Blocks: linux-egl

It's probably not due to the CPU but to the integrated Intel GPU. If not, I am almost sure that it also occurs on AMD CPUs, I mean, on both brands. Do you understand what I'm saying?

Testing nightly with X11/EGL would be appreciated as bug 1669275 landed recently. Also, could you test the Wayland backend with widget.wayland_vsync.enabled enabled?

Context: currently, both the Wayland and the X11/EGL backend run on software timers instead of "hardware" vsync sources. This could be an issue. If it's not that, I'd suspect a Mesa/Kernel issue actually :/

(In reply to Darkspirit from comment #5)

Target Frame Rate: 60

Display0: 3000x2000 default

Thanks! Can you check if this difference still occurs with Nightly?
It would also be interesting to find out (with Nightly) if Gnome X11 is likewise affected or if it only occurs on KDE X11.

Checked with Nightly, exactly the same behaviour. GL works fine: Pkg reaches PC10 and idle power consumption is ~2W. With EGL, Pkg only reaches PC3 and idle power consumption is ~5W.

Attached file Wayland + Nightly

This is Nightly with the Wayland backend on Wayland.

After I enabled widget.wayland_vsync.enabled, the window was only updated every now and then, I think only when I actually dragged it around; otherwise there was no update. But I think Plasma and Wayland are not yet usable, so that might be one factor.
In any case the power issue was still there: once I started Firefox, the CPU never reached the lower idle states.
I suppose the Wayland backend only supports EGL, so I couldn't try GL.

(In reply to Vali from comment #10)

After I enabled widget.wayland_vsync.enabled, the window was only updated every now and then, I think only when I actually dragged it around; otherwise there was no update. But I think Plasma and Wayland are not yet usable, so that might be one factor.

Oh right, you most likely ran into https://bugs.kde.org/show_bug.cgi?id=428499

In any case the power issue was still there: once I started Firefox, the CPU never reached the lower idle states.
I suppose the Wayland backend only supports EGL, so I couldn't try GL.

Yes, Wayland is EGL only, GLX is...let's call it legacy :)

Thanks for testing

One more question for clarification: do none of your cores ever reach C10, or is it just the whole package that never reaches it (pc10)?

Because here the single cores reach C10 most of the time (>50%), but Pkg(HW) apparently never reaches C10 (pc10). However, that doesn't seem to be dependent on DE or backend (even LXDE with FF GLX does not reach it).

Flags: needinfo?(valentin.manea)

(In reply to Robert Mader [:rmader] from comment #12)

One more question for clarification: do none of your cores ever reach C10, or is it just the whole package that never reaches it (pc10)?

Because here the single cores reach C10 most of the time (>50%), but Pkg(HW) apparently never reaches C10 (pc10). However, that doesn't seem to be dependent on DE or backend (even LXDE with FF GLX does not reach it).

The cores are in C10; the system is mostly idle. If it makes any difference, this is an Intel 8th gen CPU.

When Firefox is started with EGL, the system will only go as low as PC3.
When I close Firefox or start Firefox with GLX, the system will be in either PC10 or PC9 according to powertop.
As far as powertop reporting goes, the difference between the two is substantial (at least 2.5W more). I don't expect it to stay in PC9/10 all the time, but if the system is idle and Firefox is in the background (not running anything of note), it should be able to reach the lowest idle state supported by the system while running with EGL enabled.

Flags: needinfo?(valentin.manea)

Some more testing from my side: I tried many kernels from 5.4 to 5.9 and the behavior is exactly the same, with no changes between kernel versions.

One more thing I kept an eye on was the GPU state, but the integrated GPU always reaches the RC6 state, which as far as I understand means it's idle and should permit the package to go to deeper sleep states.
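
One way to double-check the RC6 observation is to sample the i915 residency counter twice and compute the share of the interval spent in RC6. The sketch below assumes the counter is exposed at `/sys/class/drm/card0/power/rc6_residency_ms`; the card index and the attribute path are assumptions and differ between kernel versions. `rc6_fraction` and `sample_rc6` are hypothetical helper names.

```python
import time
from pathlib import Path

# Assumed sysfs location; the card index and attribute path vary by kernel and driver version.
RC6_PATH = Path("/sys/class/drm/card0/power/rc6_residency_ms")


def rc6_fraction(residency_start_ms, residency_end_ms, elapsed_ms):
    """Fraction of the sampling window the GPU spent in RC6, clamped to [0, 1]."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    fraction = (residency_end_ms - residency_start_ms) / elapsed_ms
    return max(0.0, min(1.0, fraction))


def sample_rc6(interval_s=5.0, path=RC6_PATH):
    """Read the RC6 counter twice, interval_s apart, and return the RC6 fraction."""
    start = int(path.read_text().strip())
    t0 = time.monotonic()
    time.sleep(interval_s)
    end = int(path.read_text().strip())
    elapsed_ms = (time.monotonic() - t0) * 1000.0
    return rc6_fraction(start, end, elapsed_ms)
```

A value close to 1.0 from `sample_rc6()` on an idle desktop would match the observation above that the GPU itself is not what keeps the package awake.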

(In reply to Robert Mader [:rmader] from comment #12)

One more question for clarification: do none of your cores ever reach C10, or is it just the whole package that never reaches it (pc10)?

Because here the single cores reach C10 most of the time (>50%), but Pkg(HW) apparently never reaches C10 (pc10). However, that doesn't seem to be dependent on DE or backend (even LXDE with FF GLX does not reach it).

Just a note, that package C-State C10 (pc10) can probably only be reached with the display off.

You can also use the Linux utility turbostat to list power usage:

sudo turbostat --quiet -i 3 -s 'PkgWatt,CorWatt,GFXWatt,RAMWatt'
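
To compare an EGL run against a GLX run, the turbostat readings can be averaged over a few intervals. The following is a rough sketch, assuming the output consists of a whitespace-separated header row followed by numeric rows; `average_watts` is a hypothetical helper, not part of turbostat.

```python
def average_watts(turbostat_text):
    """Average each column of turbostat output that has a header row and numeric rows."""
    header = None
    sums = {}
    count = 0
    for line in turbostat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        try:
            values = [float(f) for f in fields]
        except ValueError:
            header = fields  # non-numeric row: treat it as a (possibly repeated) header
            continue
        if header is None or len(values) != len(header):
            continue  # skip rows that cannot be matched to the header
        for name, value in zip(header, values):
            sums[name] = sums.get(name, 0.0) + value
        count += 1
    return {name: total / count for name, total in sums.items()} if count else {}
```

Redirecting the command above to a file for a minute per backend and passing the file contents to `average_watts` gives one PkgWatt figure per backend to compare.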

(In reply to Paul Menzel from comment #15)

Just a note, that package C-State C10 (pc10) can probably only be reached with the display off.

Yeah, was thinking the same. So I can conclude that at least on my system I can't confirm Vali's findings: pc7 and c10 are easily reached here. This is on Fedora 33 (Gnome) with FF Wayland (the default on Fedora) and WebRender enabled.

Vali, could you quickly grab a live-iso of Fedora and check if you see the same there?

I'm more of an ARM person, not so well versed in Intel lingo. Unless powertop is lying, the package reaches PC10 when the display is on and the CPU power profile is set to Energy saving, or it reaches PC8 if the CPU power profile is balanced (I suppose it's a latency/power thing). The only modes I was aware of that require the display to be off are the S0ix states.

I'll give Fedora a try.

Fedora works fine, so I started investigating the power setup a bit using this guide from Intel: https://01.org/blogs/qwang59/2020/linux-s0ix-troubleshooting

In my setup, which is a switcheroo setup (Intel + Nvidia), I noticed that the difference between EGL and GL was the SPB not being power gated; everything else was the same. So on a hunch I just removed /usr/lib/x86_64/libEGL_nvidia.so.0 and, behold, the system is again reaching PC10 with EGL.
The weird part is that the NVIDIA device is powered off with bbswitch, so I suppose libEGL_nvidia.so.0 does some probing that keeps the PCI interface from power gating.
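
The power-gating comparison can be automated by listing which PCH IP blocks are still reported as "On". The sketch below assumes the status comes from the `pmc_core` debugfs file `pch_ip_power_gating_status` and that each line looks roughly like `PCH IP: 3  - SPB   State: On`; both the path and the line format are assumptions and may differ between kernels. `ungated_ips` is a hypothetical helper.

```python
import re

# Assumed line shape, e.g. "PCH IP: 3  - SPB    State: On"; real output may differ by kernel.
_LINE = re.compile(r"^PCH IP:\s*\d+\s*-\s*(?P<name>\S+)\s+State:\s*(?P<state>\w+)\s*$")


def ungated_ips(status_text):
    """Return names of PCH IP blocks whose state is reported as "On" (not power gated)."""
    names = []
    for line in status_text.splitlines():
        match = _LINE.match(line.strip())
        if match and match.group("state").lower() == "on":
            names.append(match.group("name"))
    return names
```

Running it once with GLX and once with EGL and diffing the two lists should single out a block such as SPB if it is the only one that differs.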

I think this is not a defect so please close it, sorry for wasting your time. Maybe if others have the same problem they can use this as a solution.

Thanks Vali, this is super helpful to know. So IIUC this affects dual-gpu setups with prop. nvidia driver, even if the integrated gpu is used. This would affect our EGL rollout, so keeping the issue open for now.

Do you mind filing an issue at the nvidia bugtracker about this?

Summary: X11/EGL Firefox causes higher power consumption on Intel CPUs → X11/EGL Firefox causes higher power consumption on Intel CPUs in dual-gpu setups with proprietary nvidia driver, even if only integrated intel gpu is used

Oh, and one little thing:

difference between EGL and GL

It's the difference between EGL and GLX. Both use GL, with EGL we can also use GLES as fallback. Sorry for being smart-assy here ;p

See Also: → 1675768

So I've been investigating this a bit and I think the culprit seems to be the Ubuntu Nvidia Driver package not playing nicely with the dual GPU setup.

optimus-manager from https://github.com/Askannz/optimus-manager is the dynamic GPU switcher I use. It tries its best to unload all the NVIDIA modules on X.org start, but through the 71-nvidia.rules udev rule from the nvidia-kernel-common-450 package, udev forces the driver to reload on each Xorg restart. Even though the GPU is switched off through bbswitch, having libEGL_nvidia.so.0 and the nvidia modules loaded causes the SPB (Intel name, SouthPort B PCI port?) to always be on. This doesn't allow the CPU package to go to deeper sleep states.

I was able to restore libEGL_nvidia.so.0; just by deleting /lib/udev/rules.d/71-nvidia.rules, my system works with dynamic GPU switching and saves power.
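
Others with a similar setup could check whether they are in the same situation with something like the following sketch. It assumes the usual `/proc/modules` layout (module name as the first field on each line) and uses the rule path quoted above as a default; `loaded_nvidia_modules` and `udev_rule_present` are hypothetical helpers.

```python
from pathlib import Path


def loaded_nvidia_modules(proc_modules_text):
    """Return names of loaded kernel modules that start with "nvidia"."""
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("nvidia"):
            names.append(fields[0])
    return names


def udev_rule_present(path="/lib/udev/rules.d/71-nvidia.rules"):
    """Report whether the udev rule named in the comment above still exists."""
    return Path(path).exists()
```

If `loaded_nvidia_modules(Path("/proc/modules").read_text())` is non-empty after an Xorg restart while the GPU is switched off, the modules were probably reloaded behind the switcher's back.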

I suppose it's not really an NVIDIA bug because they don't allow dynamic switching at all; they require a system reboot to switch PRIME profiles.

(In reply to Vali from comment #21)

I suppose it's not really an NVIDIA bug because they don't allow dynamic switching at all; they require a system reboot to switch PRIME profiles.

Ah alright, thanks. I suppose in that case optimus-manager could try to find some workaround for the issue, but it's not something that should stop us from shipping the EGL backend.

Status: UNCONFIRMED → RESOLVED
Closed: 4 years ago
Resolution: --- → MOVED

Right, if you are using the NVIDIA-recommended PRIME method (i.e. reboot to switch GPU), then everything should be OK. The Ubuntu nvidia-prime package doesn't even power off the GPU, so there is no chance of lower power use anyway.

The only issue is the overly aggressive nvidia module loading scripts in udev in the Ubuntu package.

PS: I also checked Wayland + KWin and that is fixed as well.

Can you report it to Launchpad, the Ubuntu bug tracker, so they can improve the packaging? (No idea why there should be a udev rule at all for this.)
