Closed Bug 1700200 Opened 4 years ago Closed 2 years ago

Constant CPU usage from idle Firefox 86.0 on KDE/Linux

Categories

(Core :: Graphics: WebRender, defect, P3)

Firefox 86
defect

Tracking

()

RESOLVED WORKSFORME
Performance Impact low

People

(Reporter: private_lock, Unassigned)

Details

(Keywords: perf:resource-use)

Attachments

(8 files)

Attached image sc1.png

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0

Steps to reproduce:

Running FF for days, ~300-350 tabs in 4 windows, ~100 addons

Version 86.0
Build-ID 20210222142601
Distribution ID canonical
User Agent Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Firefox/78.0
OS Linux 5.8.0-44-generic #50-Ubuntu SMP Tue Feb 9 06:29:41 UTC 2021

Actual results:

See sc1.png of the system monitor. The red numbers along the "Netzwerk-Verlauf" (network history) row mark the sections between the red lines; the full screen width covers about 2:30 min:
0. last page-load / activity

  1. FF is "misbehaving" - one CPU core constantly jumping between 30 and 80 %.
  2. recording first performance profile (29 sec)
    https://share.firefox.dev/3vSqN5k

Two spikes marked A. and B. reported separately as bug 1699922

The error state in 1. can last for at least one hour and does not "repair" itself. The CPU gets hot and the fan runs permanently. After a fresh boot it has so far not happened (startup takes less than 1 minute until the system monitor goes low). But once it has begun, closing Firefox (all processes gone!) and restarting the browser results in a slow startup of around 4-5 minutes. Switching tabs, scrolling pages, or loading new links is also perceptibly slower.

Expected results:

See sc2.png
2. same as above
3. suspending the whole laptop to RAM for about 10 sec, then waking it up (notice the flatline)
4. recording the second performance profile (again 29 sec, notice the missing high red line - most often CPU core 2)
https://share.firefox.dev/3lGViH6

The known differences between the two profiles are:

  • Process-Switch to take screenshots
  • moving the mouse over the FF window - but no scroll or interaction with any page
  • clicking the profile button in FF, that will open a new tab.
  • one more suspend2RAM

The "repair workaround" is not totally reliable, but it works in about 8 of 10 cases. After getting the machine back into good state 3, there is no guarantee how long it stays stable. I've seen it fall back to bad state 1 only seconds later, but it can also be stable for an hour. I keep the System Monitor visible on screen, and the error can start while I read an article and only occasionally move the mouse to prevent the screensaver or to scroll the page. So everything should be stable, yet it still switches into bad state 1.

Usually, after booting the machine from scratch, I see long stretches of good state 3. But over time Firefox accumulates more and more memory (e.g. 4 GB right now) and pushes some of the operating system out into the swap partition. Then it tends to become more susceptible to bad state 1. Still, there is no guarantee. Right now I am pushing the limits because I wanted to finish this bug report: I had interrupted my work several times for a short suspend2RAM, but now I have been typing here for more than an hour without interruption, even starting GIMP to edit the screenshots.

I also tried running FF with all addons disabled (but not uninstalled). The profile comes up with its usual ~300 tabs in 4 windows, and after some time it switched to bad state 1; it just takes longer and is harder to reproduce. So I speculate that it is some wild JavaScript in one of the tabs, but which one? On the other hand, my layman's view of the profiles sees some difference in the WebExtension thread.

Finally, I have been observing this behavior for quite some time, so it is not new in 86 but goes back to ~FF84.

Attached image sc2.png

The Bugbug bot thinks this bug should belong to the 'Core::Performance' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → Performance
Product: Firefox → Core

(In reply to Holger from comment #0)

Created attachment 9210842 [details]
sc1.png
2. recording first performance profile (29 sec)
https://share.firefox.dev/3vSqN5k

There is a DOM worker running at 100% due to a setTimeout handler. Can you please re-test in a recent Firefox Nightly build if that is gone now? I assume the fix for bug 1684139 also helped here. You can get the Nightly build from https://www.mozilla.org/en-US/firefox/channel/desktop/#nightly. Alternatively if you don't want to use Nightly with your profile please wait until tomorrow when Firefox 88 will be on beta.

See sc2.png
2. same as above
3. suspending the whole laptop to RAM for about 10 sec, then waking it up (notice the flatline)
4. recording the second performance profile (again 29 sec, notice the missing high red line - most often CPU core 2)
https://share.firefox.dev/3lGViH6

Looks like the same problem as above.

Flags: needinfo?(private_lock)
Attached video FF-high-restart.mp4

For now, I only managed to record a video showing the effect of a short sleep of a few seconds on the outrageous CPU consumption. Will look into the beta version next ...

I am running FF 88 beta now and the runaway DOM worker is still around: https://share.firefox.dev/3rr3cWb

Application Basics

Name: Firefox
Version: 88.0b3
Build ID: 20210325185929
Distribution ID:
Update Channel: beta
User Agent: Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Firefox/78.0
OS: Linux 5.8.0-48-generic #54-Ubuntu SMP Fri Mar 19 14:25:20 UTC 2021
Multiprocess Windows: 4/4
Fission Windows: 0/4 Disabled by default
Remote Processes: 11
Enterprise Policies: Inactive
Google Location Service Key: Found
Google Safebrowsing Key: Found
Mozilla Location Service Key: Found
Safe Mode: false

Flags: needinfo?(private_lock)

I traced the runaway DOM worker to the addon "Plasma Integration"; see bug 1701388.
I'd like to keep this bug around for the actual high CPU load, as the addon does not show up in System Monitor.

Flags: needinfo?(private_lock)

Running same FF88beta with remaining 70 addons (excluding Plasma Integration) just loading yahoo-mail and reading a few news articles:

So the issue of actual high CPU-usage definitely persists.

Flags: needinfo?(private_lock)

I managed to reproduce this with all addons deactivated after a fresh boot of the computer. At the beginning of the screenshot you can see how Firefox was started (very quickly, without loading the addons). Then I watched TV for about 1:30 minutes while the active tab in the active window was about:addons; then the CPU load started and I took this profile: https://share.firefox.dev/3lUp1fG

So there must be something else broken in my data ...

Now I'm a little confused. I tried to verify that no other process is causing the CPU load shown in System Monitor and fell back on a console with top.

Well, top points an ugly finger at "firefox-bin", and in addition at the "WebExtensions" subprocess as well as some of the "Web Content" processes. What is really strange: top insists they are consuming CPU, whereas System Monitor shows the CPU graph as almost a flatline.

On the other hand, the CPU-fan definitely follows Systemmonitor - for the flatline, it is running idle, and when Systemmonitor shows one core active, then also the fan will spin up and real heat is coming out.

Also for starting up Firefox, I distinguish two states:

  • startup time under one minute is "good" the extra CPU-load is missing.
  • startup time over four minutes is "bad" and accompanied by one core at 100% throughout.

The performance profile in comment 8 doesn't really show anything.
Do you have another performance profile captured when FF takes a significant amount of CPU time?
Hopefully taken with Nightly and without 'Plasma Integration'.

I wonder if the top vs system monitor difference is just that one shows overall cpu usage and one per logical cpu core.
If I run a busy loop in Firefox, a process takes 100% cpu in top, but System monitor may show the load to be split to several cpus.
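The difference between the two views can be sketched as follows. This is a minimal Python sketch and not something taken from this bug: it assumes the Linux /proc/stat field order (user, nice, system, idle, iowait, irq, softirq, steal), and the helper names parse_proc_stat and usage_percent as well as the sample numbers are made up for illustration. It computes the aggregate and the per-core busy percentage between two snapshots. In the sample data, a single thread pinning one of two cores shows up as 100% on that core but only 50% overall.

```python
def parse_proc_stat(text):
    """Return {cpu_name: (busy_jiffies, total_jiffies)} from /proc/stat content."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].startswith("cpu"):
            continue
        values = [int(v) for v in fields[1:]]
        # Assumed field order: user nice system idle iowait irq softirq steal
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values[:8])  # guest time is already counted in user/nice
        result[fields[0]] = (total - idle, total)
    return result


def usage_percent(before, after):
    """Busy percentage per CPU entry between two parsed snapshots."""
    usage = {}
    for name, (busy_after, total_after) in after.items():
        busy_before, total_before = before[name]
        elapsed = total_after - total_before
        usage[name] = 100.0 * (busy_after - busy_before) / elapsed if elapsed else 0.0
    return usage


# Hypothetical snapshots of a 2-core machine: cpu0 is fully busy, cpu1 is idle.
BEFORE = "cpu  100 0 0 900 0 0 0 0\ncpu0 50 0 0 450 0 0 0 0\ncpu1 50 0 0 450 0 0 0 0\n"
AFTER = "cpu  200 0 0 1000 0 0 0 0\ncpu0 150 0 0 450 0 0 0 0\ncpu1 50 0 0 550 0 0 0 0\n"

for name, percent in usage_percent(parse_proc_stat(BEFORE), parse_proc_stat(AFTER)).items():
    print(f"{name}: {percent:.0f}% busy")
```

On a real system the two snapshots would come from reading /proc/stat twice, a second or so apart. If the scheduler migrates the busy thread between cores during that interval, the same load is spread over several per-core graphs.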

Severity: -- → S3
Priority: -- → P3

I managed to reproduce this in a Kubuntu 21.04 live session (booted from a USB thumb drive). Inside it I updated to FF88 and copied my 1.5 GB profile over. Then I read news articles for about 4 hours before this bug struck again.

I took this profile inside the live session (without Plasma Integration):
https://share.firefox.dev/3dWR8Zl

Thank you for the updated profile. What was the CPU load at this time? Which process contributed most to it?

Did it happen with a specific website open in a tab, or also when a blank tab is selected? I can see a bit of activity from the Rendering thread. Also the TreeStyleTab extension consumes a bit of CPU on the WebExtension process. But given that it is unclear which CPU load we are talking about here, I'm not sure if that has an impact. But maybe try to disable it temporarily when it happens again.

According to "top" it is always the "firefox-bin" process (https://bugzilla.mozilla.org/attachment.cgi?id=9212920).
Top sees it at 15-20%. This machine has 4 physical cores, or 8 logical cores with hyper-threading, so that is one or two cores of the total.
According to "Systemmonitor" - one core is eating up between 30-80% (Phase 1 in https://bugzilla.mozilla.org/attachment.cgi?id=9210842)
If anything else is using some CPU power, that one core jumps to 100% (e.g. Phase 2)
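The arithmetic behind "one or two cores of the total" can be made explicit. The sketch below assumes that top was showing each process as a share of the whole machine (what top calls Solaris mode, toggled with `I`); in top's default Irix mode, 100% already means one full logical CPU and no conversion would be needed. The function name busy_cores is hypothetical.

```python
def busy_cores(percent_of_machine: float, logical_cpus: int) -> float:
    """Convert a whole-machine CPU share into an equivalent number of busy logical CPUs."""
    return percent_of_machine / 100.0 * logical_cpus


# 15-20% as read from top, on a machine with 8 logical CPUs.
for share in (15, 20):
    print(f"{share}% of 8 logical CPUs = {busy_cores(share, 8):.1f} cores")
```

Under that assumption, 15-20% corresponds to roughly 1.2 to 1.6 fully busy logical CPUs.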

At that time in the live session, there were five windows with around 350 tabs. I ignored most of the windows and did not even focus them, so they only loaded a few pinned tabs in the background. Only in my main window did I have the Yahoo webmail interface loaded, and I was reading articles from https://www.heise.de with one tab on the main page and approximately 10 tabs with articles opened in that session, which I read and closed one by one.

Examples:
https://www.heise.de/news/Fedora-Linux-34-prescht-bei-Sound-Server-Wayland-und-Gnome-vor-6028555.html
https://www.heise.de/news/Ubuntu-21-04-Neuer-Versuch-mit-Wayland-aber-ohne-Gnome-40-6025255.html

From the Fedora article, I opened some more pages by middle click without reading them yet:
https://fedoraproject.org/wiki/Changes/WaylandByDefaultForPlasma#Benefit_to_Fedora
https://community.kde.org/Goals/Wayland
https://www.heise.de/news/Ubuntu-21-04-Neuer-Versuch-mit-Wayland-aber-ohne-Gnome-40-6025255.html
https://i3wm.org/
https://fedoramagazine.org/getting-started-i3-window-manager/

When the bug strikes, some actions in Firefox, like loading or scrolling a page, have a noticeable lag that goes away with the workaround of briefly suspending the machine to RAM. Only in the live session I did not try suspending.

Thank you for the detailed answer. Given that it comes from the parent process, it might be related to the Renderer thread, as your last profile shows activity there. Does the CPU load drop when you have a blank tab open and selected? Maybe some ad on those pages causes the high CPU. Did you try installing uBlock Origin or NoScript to check if getting rid of those ads, or of JS altogether, helps?

Maybe you could also install my PerfChaser addon, which presents the CPU load and memory usage in a sidebar. That way you can easily observe the current values, and could also give some feedback which specific threads on the main process are actually causing the high CPU load. If it also turns out to be the Renderer thread we should move this bug into the graphics component.

Thanks!

uBlock Origin and uMatrix were both present and running for the last profile; see its "Profile Info":
https://share.firefox.dev/3dWR8Zl

Switching tabs has no visible impact on the CPU load (except when switching to an unloaded tab, where the page has to load, of course).

Will try to run PerfChaser next ...

I got you a "video" of the console running first htop and then top while Firefox exhibits the bug:
https://asciinema.org/a/uLkyCazV7XErhdMbpgTvbAZ2V

But this is the regular install of FF88 that came via Ubuntu's packages. It was running continuously since my last comment (only suspending to RAM overnight). Today is the first time it exhibited the bug.

Here is a log of my "uptime":
2021-04-27 Di 16:47:47 - 2021-04-27 Di 18:02:48 = 000 01:15:01 #2096
2021-04-27 Di 23:59:09 - 2021-04-28 Mi 01:21:10 = 000 01:22:01 #2104
2021-04-28 Mi 22:32:42 - 2021-04-29 Do 00:39:43 = 000 02:07:01 #2104
2021-04-29 Do 21:19:38 - 2021-04-29 Do 21:36:38 = 000 00:17:00 #2104
2021-04-29 Do 21:50:00 - 2021-04-29 Do 23:00:59 = 000 01:10:59 #2104
2021-04-30 Fr 17:09:50 - 2021-04-30 Fr 17:31:50 = 000 00:22:00 #2104
2021-04-30 Fr 19:56:58 - 2021-04-30 Fr 20:22:58 = 000 00:26:00 #2104
2021-04-30 Fr 20:39:05 - 2021-04-30 Fr 20:53:05 = 000 00:14:00 #2104
2021-04-30 Fr 23:49:18 - 2021-05-01 Sa 01:00:18 = 000 01:11:00 #2104
2021-05-01 Sa 16:36:34 - 2021-05-01 Sa 17:05:46 = 000 00:29:12 #2104

It gets a new process number on boot (the change from #2096 to #2104) ... it might not pick up every short suspend, as it probes only once a minute. Times are CEST (Berlin).
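For anyone who wants to cross-check the log above, here is a minimal Python sketch that parses its lines. The field layout (start, end, day count, hh:mm:ss duration, process number) is inferred from the ten lines shown and is an assumption, as are the names LINE_RE and parse_session; the weekday abbreviation is matched but otherwise ignored.

```python
import re
from datetime import datetime, timedelta

# Layout assumed from the log lines above:
# <date> <weekday> <time> - <date> <weekday> <time> = <days> <hh:mm:ss> #<process number>
LINE_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2}) \w+ (\d{2}:\d{2}:\d{2}) - "
    r"(\d{4}-\d{2}-\d{2}) \w+ (\d{2}:\d{2}:\d{2}) = "
    r"(\d{3}) (\d{2}):(\d{2}):(\d{2}) #(\d+)"
)


def parse_session(line):
    """Return (start, end, logged_duration, process_number) for one log line."""
    match = LINE_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    start = datetime.strptime(f"{match[1]} {match[2]}", "%Y-%m-%d %H:%M:%S")
    end = datetime.strptime(f"{match[3]} {match[4]}", "%Y-%m-%d %H:%M:%S")
    logged = timedelta(
        days=int(match[5]),
        hours=int(match[6]),
        minutes=int(match[7]),
        seconds=int(match[8]),
    )
    return start, end, logged, int(match[9])


# The first two lines of the log, copied verbatim.
SAMPLE = [
    "2021-04-27 Di 16:47:47 - 2021-04-27 Di 18:02:48 = 000 01:15:01 #2096",
    "2021-04-27 Di 23:59:09 - 2021-04-28 Mi 01:21:10 = 000 01:22:01 #2104",
]

for line in SAMPLE:
    start, end, logged, process_number = parse_session(line)
    consistent = (end - start) == logged
    print(f"#{process_number}: {start} for {logged} (matches end - start: {consistent})")
```

Summing the logged durations per process number gives the awake time since each boot, which could then be compared with how long it took for the bug to appear.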

So far no progress on the nightly install ... I'm at it.

Hi Holger, have you been able to reproduce the situation with Nightly yet? If yes, what did PerfChaser show? Thanks.

Flags: needinfo?(private_lock)

Running 89.0b10 (64-Bit) with Perfchaser now ... it keeps updating almost every other day.

Right after the update finishes, version 89 restarts itself and gets really slow: opening the menu takes 5 to 10 seconds, and so does scrolling a page. It is as if all my mouse commands were first sent to the moon and back. Even closing and restarting FF does not solve it. Does the previous FF keep some background process around that is not compatible with the latest version?

So I have to restart the whole Computer, to get it running smoothly again. But this also resets the timer for this bug. So far, it did not occur in beta 8, beta 9 and as of today beta 10 (about 5 days) ... Which I tentatively take as a good sign.

Flags: needinfo?(private_lock)

Yes, we ship beta releases twice a week. But you can safely ignore the update requests, and dismiss the popup.

When you notice the slowdown after the update please create a profile. This seems to be an unrelated issue, and should be worth another bug.

Attached video hi.mp4

Hi Henrik,

I ignored the update from beta10 to beta11 and now the bug manifested.
I was downloading updates for (in this order):
https://mediathekview.de/download/
(half an hour ago, installed + tested)
https://download.eclipse.org/eclipse/downloads/drops4/R-4.19-202103031800/
(download finished, not installed yet, because I was watching the news on TV)
https://www.tvbrowser.org/index.php?id=tv-browser
(not yet downloaded, only opened the tab and checked for availability)

Made a little video of PerfChaser before and after suspending to RAM.

Attached video lo.mp4

second video after resume from RAM

Oh, and I had a new kernel installed earlier today, but did not yet restart the system, as I was waiting for this bug to manifest.

Next, I'll apply the update and see, if it kills the performance again.

When you notice the slowdown after the update please create a profile. This seems to be an unrelated issue, and should be worth another bug.

Didn't happen again. Update beta 10 to 11 was right after rebooting the laptop.
I skipped beta 12 and Update from 11 to 13 also doesn't show any slowdown today.
So far no reason, to create another bug ...

Created Bug 1712396 for the update-issue from beta13 to beta15 ...

I wonder, if that is a coincidence: After installing FF90b1 I was playing with it for an hour and got the high CPU-Load quickly:
https://share.firefox.dev/3caMLIK (Platform-Profile)
https://share.firefox.dev/3cauon4 (Graphics-Profile)

And though I see the high CPU-Load in System-Monitor, I still cannot reproduce bug 1712396 - so it is definitely a separate issue.

What changed in the meantime? It seems, I had the KDE-Plasma effects off for some time and reenabled them within the last hour. Now I was wondering, if my workaround of shortly suspending to disk does somehow reset the graphics adapter.

At least I have to say, that switching off Plasma-Effects did not qualify as a workaround alone for this bug at hand.

Also restarting Firefox (as in closing FF, checking no "firefox" process left and restarting) without suspending to RAM, the bug immediately strikes again. So somehow the system itself is in a "fragile" state, that will lure the freshly started firefox immediately into the trap. The startup time significantly rises and of course, the fan is begging me to finally apply the workaround and silence this firefox-process from eating 40-80% of one core constantly...

One other thing I did: I launched FF88 with an empty profile. It showed some low-level constant CPU usage (5-10%) when it was supposed to be idle, but nothing comparable to my productive profile with 350 tabs and many addons ...

I hope you don't mind, if I SPAM this bug with more details?
Good night for now ...

Hi Holger. Sorry for the silence over the last 2 months, but this fell off my radar. Based on the profiles you provided in comment 24 I would say that we should move the bug into graphics. But before that, could you please tell us what the WebRender status is in about:support? If it's turned on, could you please check whether it makes a difference when you turn it off? This applies especially to the compositor, which is taking quite a lot of CPU. There is the preference gfx.webrender.compositor.force-enabled that you could flip in about:config. Thanks.

Flags: needinfo?(private_lock)

Hi Henrik,

WEBRENDER has two entries in the table:
available by default
disabled by user: User force-disabled WR

The bug was active just a few minutes ago ... meanwhile I have phased out the 90 betas and use the FF90 install that came via the Ubuntu package manager.

gfx.webrender.compositor.force-enabled = false as of now ... switching to true now

Thanks for investing your time in my problem.
Holger

Flags: needinfo?(private_lock)

Oh, BTW:
gfx.webrender.force-disabled = true
is also in effect since bug 1712396 comment 6

Component: Performance → Graphics: WebRender

The Performance Priority Calculator has determined this bug's performance priority to be P3. If you'd like to request re-triage, you can reset the Performance flag to "?" or needinfo the triage sheriff.

[x] Causes severe resource usage

Performance Impact: --- → low

This issue vanished once I stopped using the Wi-Fi on this laptop. With the LAN cable hooked up and Wi-Fi disabled, I haven't seen it happen again. I speculate that it's an issue with the Wi-Fi hardware getting stuck and resetting only after a short hibernate.

Closing as WORKSFORME ...

Status: UNCONFIRMED → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME