Closed Bug 1940924 Opened 8 months ago Closed 7 months ago

[Wayland] high compositor CPU use on idle in 135 beta series

Categories

(Core :: Graphics: WebRender, defect, P2)

Firefox 135
defect

Tracking

()

RESOLVED FIXED
136 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox-esr128 --- unaffected
firefox134 --- unaffected
firefox135 + fixed
firefox136 + fixed

People

(Reporter: s.zharkoff, Assigned: stransky)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(6 files, 1 obsolete file)

Attached file about_support.txt

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:135.0) Gecko/20100101 Firefox/135.0

Steps to reproduce:

Open a YouTube video in FF 134. Start playing it, then pause it. Compositor CPU usage goes down to about 1%.

Open a YouTube video in FF 135 Beta (tried 1, 2, and trunk from January 09 2025), play it, then pause it. Compositor CPU usage goes up to 10-12%, and the compositor stays near the top of the CPU-consuming processes, in the 5-12% range, while the video is paused.

Compositor Wayfire (development from trunk, based on wlroots 18.2 release)

I got the same effect when I applied the patches from bug 1934497 on top of the 134 tree, so most likely it is a regression from the new WaylandSurface render-to-screen approach.

Actual results:

My laptop with the same YouTube page open in FF 134 consumes 5-6 watts with the video paused and 10-11 watts with the video playing.

FF135 - 8-9 watts paused, 11-12 playing.

The issue is mostly noticed when FF is idle

Expected results:

Power consumption and CPU use by the wayland compositor should be the same (or better) with a new version.

Can you try to use mozregression to find the broken commit?
https://fedoraproject.org/wiki/How_to_debug_Firefox_problems#Use_Mozregression_tool
Thanks.

Flags: needinfo?(s.zharkoff)

Testing on Fedora 41 / Gnome I don't see any significant difference between 134.0 and latest trunk. Also the WaylandSurface rework (Bug 1934497) should not affect HW-based rendering much - we use GL / egl_surfaces for rendering, so I'd expect a regression in SW rendering here.
It may be related to broken VSync, but that also seems to work fine - when the window is hidden I see the occlusion state is set on the window.

Priority: -- → P3

The issue is that it consumes more when it does not render anything. When it plays something it looks normal; the difference is within the measurement error margin. But at idle it shows issues. Maybe it is because of some wayland protocols missing in wlroots compared to mutter. I am trying to compile with some patches excluded - maybe that will make things clearer. It will take time - compiling is happening on an Honor ultrabook (125H based), so it is slow.

Flags: needinfo?(s.zharkoff)

You can use mozregression tool to download and run binaries with and without the patches without compilation.

I've used the source of the 134 release and started applying patches.
First come some changes that upgrade the Wayland headers and protocols - without them 1934497 does not apply/compile

  • D230536.1733569089.diff
  • D230708.1733577349.diff
  • D230709.1733577330.diff
  • D230712.1733568100.diff
  • D230713.1733568080.diff

Those patches applied OK and compiled with no issues - wayfire uses around 1% CPU when firefox is not playing video and there are no animations on the tab. To make sure, I pause video playback and open a blank tab. The behavior is normal - when the video is paused it takes some time to settle down, and wayfire CPU use calms down from 6% to 1%.

Now 1934497. Unfortunately this is a patchset, so I can not just take the patches one by one - it does not compile at all this way. I've managed to remove some, but not many and not the big ones, so finally I got this set applied, compiled and running:

  • D230845.1733566649.diff
  • D230847.1733566620.diff
  • D230848.1733566604.diff
  • D230849.1733566587.diff
  • D230850.1733568634.diff
  • D230851.1733566558.diff
  • D230852.1733566542.diff
  • D230853.1733566528.diff
  • D230854.1733566507.diff
  • D230855.1733566489.diff
  • D230856.1733566469.diff
  • D230858.1733566426.diff
  • D230859.1733566392.diff

With those applied, wayfire starts eating the CPU. When I stop playing and open an empty tab, instead of calming down from 6% to 1%, wayfire CPU usage rises from 6% to 11%. Most likely it is not really rising; rather, the CPU frequencies go down because firefox itself stops using so much CPU and the overall CPU capacity goes down - not sure about this. But it seems that the new wayland surface handling keeps forcing the compositor to run something constantly, no matter whether anything is being rendered for the active tab or it is static.

Just to mention again - it is not the firefox processes that eat the CPU/battery - it is the compositor (wayfire in my case) that eats the CPU. Firefox itself, Web Content, RDD - they are all OK. And it may not look significant - 6-10% CPU of an ultrabook processor, not 100 - so it does not even fully load one E-core. But this makes a big difference for battery life as it prevents the CPU from entering deeper C-states. It could easily be an issue with wlroots/wayfire not handling some protocols efficiently - I remember some work being done by Robert Mader for the mutter compositor. Unfortunately my system does not have mutter installed, so I can not test this.

Just noticed - there is no need to start playing a video. Just start the browser with an active tab that has nothing moving, like about:support, and the compositor starts using CPU. So it is not related to the media player.

Only change I can think of is transition from fixed buffer scale to viewport scale. I see from about:support that you use 100% scale and 4K resolution, is that correct?

You can test by yourself, just change WaylandSurface::CreateViewportLocked():

https://searchfox.org/mozilla-central/rev/36f132063db8a0a0c7a4c434e52b75c8e45c5303/widget/gtk/WaylandSurface.cpp#414

to:

bool WaylandSurface::CreateViewportLocked(
    const WaylandSurfaceLock& aProofOfLock, bool aFollowsSizeChanges) {
  LOGWAYLAND("WaylandSurface::CreateViewportLocked() follow size %d",
             aFollowsSizeChanges);
  MOZ_DIAGNOSTIC_ASSERT(&aProofOfLock == mSurfaceLock);
  MOZ_DIAGNOSTIC_ASSERT(mIsMapped);
  MOZ_DIAGNOSTIC_ASSERT(!mViewport);

  // Test-only change: bail out before creating the viewport.
  return false;
}

so it's running without surface scale.

Flags: needinfo?(s.zharkoff)
Attached image new_wayland.png

top with 134.0 patched with new wayland processing , scaling removed.

Flags: needinfo?(s.zharkoff)
Attached image old_wayland.png

top with vanilla 134

Added screenshots showing what happens when running ff with several YouTube tabs (all paused - not playing anything) plus an about:support tab.

I just have 2 ff's installed, both using same profile - so I just quit one and start another without rebooting the PC or changing something else.

Both are compiled same way - using gentoo ebuild , but one is vanilla gentoo another one is gentoo with mentioned patches.

In the initial about:support I've used the pre-built 135 beta from mozilla ftp just to make sure it is not a gentoo issue , also I've tried safe profile with the same result.

Patched out the scaling - did not help.

My laptop is a Chinese Honor MagicBook Art 14 with an unusual 3:2 ratio 3K OLED display - 3120x2080. It is capable of 120 Hz but I forced it to 60 for some battery saving, as I do not play games and do not care about more Hz than my eyes can perceive. And I do not use scaling - it is at 100%. I prefer to configure applications for bigger fonts instead of scaling the final picture using wayland fractional scaling; it works way better for the eyes.

Blocks: gfx-triage

Could this be related to bug 1924932? "Excessive Windows DWM usage when screen is locked and display enters standby"

See Also: → 1924932

Can you take a performance trace profile when it's having this trouble? (https://profiler.firefox.com/ with the Graphics preset)

Flags: needinfo?(s.zharkoff)
Flags: needinfo?(s.zharkoff)

Uploaded the performance trace. The most complex thing for me here is that firefox does not use CPU itself, but it forces wayland compositor to do it. And it does not happen with the previous version.

(In reply to s.zharkoff from comment #15)

Uploaded the performance trace. The most complex thing for me here is that firefox does not use CPU itself, but it forces wayland compositor to do it. And it does not happen with the previous version.

It would be worth reporting that to the wayland compositor's tracker and profiling the compositor with a perf tool.

We suspect this is a bug in Wayfire.

Severity: -- → S3

:s.zharkoff, could you please submit this to the Wayland team (as Martin has requested in comment #16) and see what their take would be?

No longer blocks: gfx-triage
Flags: needinfo?(s.zharkoff)

Will do it when 135 is released. Meanwhile I want to try it with another wlroots-based compositor and see whether the issue is wayfire-specific or an issue in the wlroots library, which is used as an engine for many other compositors as well.

Flags: needinfo?(s.zharkoff)

Tested with sway - a tiling compositor and the classic wlroots user; wlroots was initially designed as a part of sway and for sway. Same story.

I just open firefox 135-beta8 - empty window, nothing running - sway is the first-second line in the top output using 7-8% CPU constantly.

Okay, added to wayland-sway tracker. We primarily target Gnome/KDE and we don't have the manpower to investigate/fix bugs on other compositors. I'll look at it if it also affects Gnome/KDE.

Blocks: wayland-sway
Summary: Wayland - high compositor CPU use on idle in 135 beta series → [Sway] Wayland - high compositor CPU use on idle in 135 beta series

I really recommend filing that on the Sway bug tracker and trying to solve it there - I have no idea why Sway uses such extra resources; they should be able to investigate it and find out what's going on. If that investigation reveals a bug on the Firefox side I'm happy to fix it.

Starting with bug 1934497, I observe (using the "showrepaint" plugin of Wayfire) that Firefox is constantly updating its window, and my desktop computer consumes 10 watts more when Firefox is visible and focused than when it is not shown. The power usage difference is less significant on my laptop though.

(In reply to lilydjwg from comment #23)

Starting with bug 1934497, I observe (using the "showrepaint" plugin of Wayfire) that Firefox is constantly updating its window, and my desktop computer consumes 10 watts more when Firefox is visible and focused than when it is not shown. The power usage difference is less significant on my laptop though.

Can you reproduce that with a single window and any static page like about:support for instance?

I tried one more thing. Wayfire has a tool called "show repaint". If activated, it continuously applies some slight coloring over the damaged and repainted areas.

So - I open 134 ffox. It flickers with colors until things go static, and then stops. The next time it flickers is when I hover the mouse over it.
135 ffox - with "show repaint" the area of the FF window always flickers. It always sends damage events, so the compositor repaints the ff window again and again as if the content there changes every frame. It does not matter that it is an empty white page, that the mouse does not hover over it, and that nothing happens in ff itself - it is just an empty window, no matter how long I wait.

So the root cause is clear - wlroots-based compositors treat the ff window as always damaged and requiring repainting, no matter what happens in the ff window area itself.

Opened issue for wlroots

https://gitlab.freedesktop.org/wlroots/wlroots/-/issues/3942

(In reply to Martin Stránský [:stransky] (ni? me) from comment #24)

Can you reproduce that with a single window and any static page like about:support for instance?

Yes.

It's possible that Firefox sends surface damage. Please run latest nightly on terminal as:

WAYLAND_DEBUG=1 ./firefox -P 2>&1 | grep "damage"

(use -P to select your nightly testing profile).

Do you see surface damage if the page is still but repainted by wayfire?

On Gnome I open about:support and if the mouse is still, I don't see any damage sent by Firefox. But Firefox sends a surface commit to get the frame callback / vsync, which is correct.

No, I don't see any new damage logs when I leave the Firefox window still. Instead, I see the following log lines repeating:

[3087408.904] {Default Queue} wl_callback#60.done(84233859)
[3087408.914] {Default Queue}  -> wl_surface#58.frame(new id wl_callback#60)
[3087408.921] {Default Queue}  -> wl_surface#58.commit()

(The argument in done changes for each repeat.)

(In reply to lilydjwg from comment #28)

No, I don't see any new damage logs when I leave the Firefox window still. Instead, I see the following log lines repeating:

[3087408.904] {Default Queue} wl_callback#60.done(84233859)
[3087408.914] {Default Queue}  -> wl_surface#58.frame(new id wl_callback#60)
[3087408.921] {Default Queue}  -> wl_surface#58.commit()

(The argument in done changes for each repeat.)

So Firefox doesn't send any damage to the compositor - looks like a bug on the compositor side then.
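
For anyone who wants to repeat this check on a longer log, here is a minimal C++ sketch that tallies message names in WAYLAND_DEBUG output and flags the "frame + commit, but no damage" pattern seen above. The helper names (MessageName, CountMessages, LooksLikeIdleFrameLoop) are hypothetical and are not part of Firefox or libwayland, and the parser assumes the `interface#id.message(args)` line layout shown in the log lines quoted above.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Extract "interface.message" from one WAYLAND_DEBUG line, e.g.
// "... -> wl_surface#58.commit()" yields "wl_surface.commit".
// Returns an empty string if the line does not match the assumed layout.
std::string MessageName(const std::string& line) {
  const std::size_t hash = line.find('#');
  if (hash == std::string::npos) return "";
  const std::size_t dot = line.find('.', hash);
  const std::size_t paren = line.find('(', hash);
  if (dot == std::string::npos || paren == std::string::npos || dot > paren) {
    return "";
  }
  // Walk back from '#' to the start of the interface name.
  std::size_t start = hash;
  while (start > 0) {
    const unsigned char c = static_cast<unsigned char>(line[start - 1]);
    if (!std::isalnum(c) && c != '_') break;
    --start;
  }
  if (start == hash) return "";
  return line.substr(start, hash - start) + "." +
         line.substr(dot + 1, paren - dot - 1);
}

// Count how often each message name occurs in a multi-line log.
std::map<std::string, int> CountMessages(const std::string& log) {
  std::map<std::string, int> counts;
  std::istringstream in(log);
  std::string line;
  while (std::getline(in, line)) {
    const std::string name = MessageName(line);
    if (!name.empty()) ++counts[name];
  }
  return counts;
}

// True if the log contains frame requests and commits but no new
// buffer or damage, i.e. the client only keeps re-arming frame callbacks.
bool LooksLikeIdleFrameLoop(const std::map<std::string, int>& counts) {
  auto get = [&counts](const char* name) {
    const auto it = counts.find(name);
    return it == counts.end() ? 0 : it->second;
  };
  return get("wl_surface.frame") > 0 && get("wl_surface.commit") > 0 &&
         get("wl_surface.damage") == 0 &&
         get("wl_surface.damage_buffer") == 0 &&
         get("wl_surface.attach") == 0;
}
```

Feeding it the captured stderr of a WAYLAND_DEBUG=1 run on a static page should report the idle frame loop if only the three repeating lines are present.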

https://gitlab.freedesktop.org/wlroots/wlroots/-/issues/3942#note_2750816

Here is wlroots team reply - hopefully that may clarify the issue

It's possible that Firefox runs more operations than before - can you run with the WAYLAND_DEBUG=1 env variable and check which wayland events are issued in the working and broken versions on a plain static visible page like about:support?
Thanks.

From https://gitlab.freedesktop.org/wlroots/wlroots/-/issues/3942#note_2752308

For me, 134 is completely idle when displaying a static page while 135 beta sends commits with frame callbacks in a loop. Maybe it tries to query refresh rate that way even when it's not required for some reason?

So let's investigate why Firefox sends the frame refresh now.

Flags: needinfo?(stransky)
Duplicate of this bug: 1943874

I have a wip patch if you wait for 30 mins till I can open my laptop again fwiw :)

Will look at it, Thanks. But I don't think we need so radical change here.

Blocks: wayland
No longer blocks: wayland-sway
Flags: needinfo?(stransky)
Priority: P3 → P2
Assignee: nobody → emilio
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true

ah, sorry, tested it and submitted it before seeing your reply. FWIW I confirmed that does the trick.

It's not really a radical change IMO, it's just making the "persistent callback" mechanism more targeted to its only use (vsync). But feel free to rework as needed / take the bug :)
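
To illustrate the idea in isolation, here is a small self-contained C++ simulation of a frame callback that either re-arms itself on every frame or only while a vsync consumer needs it. FakeSurface and SimulateIdle are hypothetical names invented for this sketch; this is not the landed patch and does not use the real WaylandSurface class or libwayland.

```cpp
#include <functional>

// Hypothetical stand-in for a Wayland surface. It only records how many
// commits the client sends and whether a frame callback is armed.
class FakeSurface {
 public:
  // Models wl_surface.frame followed by wl_surface.commit.
  void RequestFrameAndCommit() {
    ++mCommits;
    mCallbackPending = true;
  }

  // Models the compositor finishing a frame. If a callback is armed it
  // fires once (wl_callback.done) and is consumed. Returns whether it fired.
  bool DeliverFrameDone(const std::function<void()>& aOnDone) {
    if (!mCallbackPending) return false;
    mCallbackPending = false;
    aOnDone();
    return true;
  }

  int Commits() const { return mCommits; }

 private:
  int mCommits = 0;
  bool mCallbackPending = false;
};

// Simulates an idle window for up to aFrames compositor frames and returns
// the number of commits sent. When aVsyncNeedsCallbacks is false, the done
// handler does not re-arm the callback, so the loop goes quiet.
int SimulateIdle(bool aVsyncNeedsCallbacks, int aFrames) {
  FakeSurface surface;
  surface.RequestFrameAndCommit();  // initial request
  for (int i = 0; i < aFrames; ++i) {
    const bool fired = surface.DeliverFrameDone([&surface,
                                                 aVsyncNeedsCallbacks] {
      if (aVsyncNeedsCallbacks) surface.RequestFrameAndCommit();
    });
    if (!fired) break;  // nothing armed: the client is truly idle
  }
  return surface.Commits();
}
```

With the persistent behaviour, 60 idle frames produce 61 commits in this model; with the gated behaviour the client sends the initial commit and then stops, which is the pattern the wlroots developers observed in 134.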

Yes, I'd like to look at it a bit. But thanks for the patch.

Summary: [Sway] Wayland - high compositor CPU use on idle in 135 beta series → [Wayland] high compositor CPU use on idle in 135 beta series
Pushed by stransky@redhat.com: https://hg.mozilla.org/integration/autoland/rev/c2192211712b [Wayland] Disable frame callback emission if Vsync source doesn't need it r=emilio
Status: ASSIGNED → RESOLVED
Closed: 7 months ago
Resolution: --- → FIXED
Target Milestone: --- → 136 Branch

The patch landed in nightly and beta is affected.
:emilio, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox135 to wontfix.

For more information, please visit BugBot documentation.

Flags: needinfo?(emilio)
Assignee: emilio → stransky
Flags: needinfo?(emilio) → needinfo?(stransky)

Comment on attachment 9461857 [details]
Bug 1940924 [Wayland] Disable frame callback emission if Vsync source doesn't need it r?emilio

Beta/Release Uplift Approval Request

  • User impact if declined/Reason for urgency: Firefox uses unnecessary resources when it's idle as it keeps requesting Wayland frame callbacks if VSync is disabled.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): We disable frame callbacks when VSync is disabled.
  • String changes made/needed:
  • Is Android affected?: Yes
Attachment #9461857 - Flags: approval-mozilla-beta?
Flags: needinfo?(stransky)
Attachment #9461857 - Flags: approval-mozilla-beta? → approval-mozilla-release?

Hi, can you confirm that things are improved in Nightly builds?

Flags: needinfo?(s.zharkoff)

I've checked with today's nightly build and the issue is gone. The 135 release candidate is still affected.

Flags: needinfo?(s.zharkoff)

Great, thanks for checking & confirming!

Comment on attachment 9461857 [details]
Bug 1940924 [Wayland] Disable frame callback emission if Vsync source doesn't need it r?emilio

Fixes a pretty bad CPU usage regression introduced in the 135 cycle. Approved for 135.0rc2.

Attachment #9461857 - Flags: approval-mozilla-release? → approval-mozilla-release+

Backed out from mozilla-release for causing build bustages @ WaylandSurface.cpp

Failure log

Backout link: https://hg.mozilla.org/releases/mozilla-release/rev/fc110ff8d61e74bd1b33c83c9c1dc01a0ee4f289

Flags: needinfo?(stransky)

(In reply to Sandor Molnar[:smolnar] from comment #50)

Backed out from mozilla-release for causing build bustages @ WaylandSurface.cpp

Failure log

Backout link: https://hg.mozilla.org/releases/mozilla-release/rev/fc110ff8d61e74bd1b33c83c9c1dc01a0ee4f289

Perhaps due to conflict with backed out Bug 1942232. Will look at it.

(In reply to Martin Stránský [:stransky] (ni? me) from comment #51)

(In reply to Sandor Molnar[:smolnar] from comment #50)

Backed out from mozilla-release for causing build bustages @ WaylandSurface.cpp

Failure log

Backout link: https://hg.mozilla.org/releases/mozilla-release/rev/fc110ff8d61e74bd1b33c83c9c1dc01a0ee4f289

Perhaps due to conflict with backed out Bug 1942232. Will look at it.

Ah, I see it's mozilla release so former beta.

Attachment #9462764 - Flags: approval-mozilla-beta?
Flags: needinfo?(stransky)
Attachment #9462764 - Flags: approval-mozilla-release+
Attachment #9461857 - Flags: approval-mozilla-release+
Attachment #9462764 - Flags: approval-mozilla-beta?
Attachment #9462764 - Attachment description: Bug 1940924 [Wayland] Disable frame callback emission if Vsync source doesn't need it for beta/release r?emilio → Bug 1940924 [Wayland] Disable frame callback emission if Vsync source doesn't need it

Compiled 139 nightly today (Apr 26) and it looks like the issue is back: high compositor CPU use if gfx.wayland.hdr is enabled, all OK if disabled. Setting widget.wayland.vsync.keep-firing-at-idle = false does not help anymore.

That seems separate then, let's track it independently.

Attachment #9461850 - Attachment is obsolete: true