Open Bug 1860279 Opened 8 months ago Updated 20 days ago

[X11/KDE] firefox memory consumption spikes, consuming all available resources

Categories

(Core :: Widget: Gtk, defect)

Firefox 118
defect

Tracking

()

UNCONFIRMED

People

(Reporter: mindboosternoori, Unassigned)

References

(Depends on 1 open bug)

Details

Attachments

(14 files)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0

Steps to reproduce:

I tried to save a document on confluence.

Actual results:

The computer fans started spinning wildly, the mouse pointer began to lag and then stopped moving altogether. Pressing CTRL+ALT+F2 to get into a tty took a minute or so to have the desired effect; by then the OOM killer had already killed one container, but the machine's load average was still unbearably high and not dropping. I ran kill -11 on firefox.

Expected results:

Firefox should not put the machine in an unusable state.

The Bugbug bot thinks this bug should belong to the 'Core::Widget: Gtk' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Widget: Gtk
Product: Firefox → Core

Please attach your about:support page. Do you open any particular page to reproduce it?
Thanks.

Flags: needinfo?(mindboosternoori)
Attached file about:support
This has happened to me twice in this version, while doing different things (once triggered by clicking "Edit" on a confluence page, the other by opening a tab to a YouTube video). I don't have reproduction steps (and I can regularly edit confluence pages or watch YouTube videos).

Here is about:support:

Thanks. Do you have anything related in journalctl?
https://fedoraproject.org/wiki/How_to_debug_Firefox_problems#Get_system_log_after_system_freeze_/_restart

Looking at the crash stat:
https://crash-stats.mozilla.org/report/index/f141f81a-a6f7-4e2e-a5ee-79aaa0231020#allthreads

It says:
Available Virtual Memory 466,022,400 bytes (466.02 MB)
Available Physical Memory 171,020,288 bytes (171.02 MB)

whereas TotalPhysicalMemory is 8206991360 bytes (8GB)

Summary: firefox memory consumption spikes, consuming all available resources → [X11/KDE] firefox memory consumption spikes, consuming all available resources
Attached file dmesg
It happened to me again today, again while opening a tab. This time I wasn't able to get to a terminal before OOMkiller saved the day. Here's what showed on dmesg:
Attached file jctl.log

And here's the reported day, from journalctl.

Flags: needinfo?(mindboosternoori)

I see you use Snap:

snapd invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-900

I wonder if you have any strong snap sandbox restriction for used memory. Can you try to install plain Mozilla binaries?
https://fedoraproject.org/wiki/How_to_debug_Firefox_problems#Testing_Mozilla_binaries
Thanks.

Blocks: snap
Flags: needinfo?(mindboosternoori)

Can you share any more recent crash report? The one linked is 118, there were issues with crash reporter being broken on those versions that would end up in: parent process crash with OOM and/or freeze.

The memory info from crash report looks weird:

Available Virtual Memory 466,022,400 bytes (466.02 MB)
Available Physical Memory 171,020,288 bytes (171.02 MB)

especially if total ram is reported as 8GB. Alexandre, does Snap restrict available memory somehow?

Not to the best of my knowledge.

There was a bug with some older version of SystemD's OOM killer implementation that would kill Firefox a bit too much, but it was fixed months ago: https://bugzilla.mozilla.org/show_bug.cgi?id=1768765 https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1972159

There are baloo (file indexer?) processes that seem to use much more RAM than firefox:

out 20 14:46:31 dogfood kernel: [ 2077] 1000 2077 67167002 505 1224704 916 0 baloo_file
out 20 14:46:31 dogfood kernel: [ 2633] 1000 2633 67183949 2397 2293760 4751 0 baloorunner
[...]
out 20 14:46:31 dogfood kernel: [ 416342] 1000 416342 1297038 115480 4698112 28575 0 firefox

We are really only consuming ~450MB
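For reference, a minimal sketch of how the ~450MB figure can be derived from the task dump rows quoted above. It assumes the usual column order of the kernel's OOM task dump (pid, uid, tgid, total_vm, rss, pgtables_bytes, swapents, oom_score_adj, name), that total_vm and rss are counted in pages, and a 4 KiB page size. The helper names are made up for illustration.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages (the x86_64 default)

# Assumed column order after the "[ pid ]" field of each task dump row.
COLUMNS = ("uid", "tgid", "total_vm", "rss",
           "pgtables_bytes", "swapents", "oom_score_adj")


def parse_task_line(line):
    """Parse one row of the OOM task dump into a dict."""
    # Everything after the last "]" holds the numeric columns and the name.
    head, _, tail = line.rpartition("]")
    pid = int(head.rpartition("[")[2])
    # Split off the numeric columns; the remainder is the process name,
    # which may itself contain spaces (e.g. "Isolated Web Co").
    parts = tail.split(None, len(COLUMNS))
    row = {"pid": pid, "name": parts[-1].strip()}
    for key, value in zip(COLUMNS, parts):
        row[key] = int(value)
    return row


def pages_to_mib(pages):
    """Convert a page count to MiB under the 4 KiB page assumption."""
    return pages * PAGE_KIB / 1024


SAMPLE = [
    "out 20 14:46:31 dogfood kernel: [ 2077] 1000 2077 67167002 505 1224704 916 0 baloo_file",
    "out 20 14:46:31 dogfood kernel: [ 2633] 1000 2633 67183949 2397 2293760 4751 0 baloorunner",
    "out 20 14:46:31 dogfood kernel: [ 416342] 1000 416342 1297038 115480 4698112 28575 0 firefox",
]

for line in SAMPLE:
    row = parse_task_line(line)
    print(f"{row['name']:<12} rss={pages_to_mib(row['rss']):8.1f} MiB "
          f"total_vm={pages_to_mib(row['total_vm']) / 1024:7.1f} GiB")
```

Under those assumptions the firefox row works out to about 451 MiB resident, matching the ~450MB above. The large numbers on the baloo rows are in the total_vm column (virtual size, roughly 256 GiB each), while their resident sizes come out at about 2 MiB and 9 MiB.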

Out of memory: Killed process 1272762 (Isolated Web Co) total-vm:3600744kB, anon-rss:476948kB, file-rss:0kB, shmem-rss:2588kB, UID:1000 pgtables:4492kB oom_score_adj:167

I'm afraid that without STR, without an actionable crash report or about:memory it's going to be tricky to know what is going on ...

Can you share snap info snapd as well as snap info firefox ?

I came across https://bugzilla.mozilla.org/show_bug.cgi?id=1863885#c4 which contains useful info on how to debug memory issues via the DMD tool. We can try that too.

Attached file dmesg-119
(In reply to :gerard-majax from comment #9)
> Can you share any more recent crash report? The one linked is 118, there were issues with crash reporter being broken on those versions that would end up in: parent process crash with OOM and/or freeze.

https://crash-stats.mozilla.org/report/index/cb54a8e9-8c63-45c7-8b5a-85f110231109 (FF 119.0)

Note that the actual crash was me issuing a "killall -11 firefox", but while I did type it on the command line, the machine was too slow and the kill only ran *after* OOM killer killed the offending process.

I'm also attaching a dmesg log of the OOM killer in action.

(In reply to :gerard-majax from comment #15)

> Can you share snap info snapd as well as snap info firefox ?

$ snap info snapd
name: snapd
summary: Daemon and tooling that enable snap packages
publisher: Canonical✓
store-url: https://snapcraft.io/snapd
contact: https://github.com/snapcore/snapd/issues
license: GPL-3.0+
description: |
  Install, configure, refresh and remove snap packages. Snaps are
  'universal' packages that work across many different Linux systems,
  enabling secure distribution of the latest apps and utilities for
  cloud, servers, desktops and the internet of things.

  Start with 'snap list' to see installed snaps.
type: snapd
snap-id: PMrrV4ml8uWuEUDBT8dSGnKUYbevVhc4
channels:
  latest/stable: 2.60.4 2023-10-10 (20290) 42MB -
  latest/candidate: 2.60.4 2023-10-03 (20290) 42MB -
  latest/beta: 2.61 2023-10-16 (20515) 42MB -
  latest/edge: 2.61+git1481.g7a33013 2023-10-31 (20625) 42MB -

$ snap info firefox
name: firefox
summary: Mozilla Firefox web browser
publisher: Mozilla✓
store-url: https://snapcraft.io/firefox
contact: https://support.mozilla.org/kb/file-bug-report-or-feature-request-mozilla
license: MPL-2.0
description: |
  Firefox is a powerful, extensible web browser with support for modern web application
  technologies.
snap-id: 3wdHCAVyZEmYsCMFDE9qt92UV8rC8Wdk
channels:
  latest/stable: 119.0.1-1 2023-11-07 (3358) 251MB -
  latest/candidate: 119.0.1-1 2023-11-06 (3358) 251MB -
  latest/beta: 120.0b9-1 2023-11-10 (3376) 257MB -
  latest/edge: 121.0a1 2023-11-10 (3375) 288MB -
  esr/stable: 115.4.0esr-1 2023-10-24 (3280) 253MB -
  esr/candidate: 115.4.0esr-1 2023-10-17 (3280) 253MB -
  esr/beta: ↑
  esr/edge: ↑

Flags: needinfo?(mindboosternoori)

(In reply to :gerard-majax from comment #13)

> There was a bug with some older version of SystemD's OOM killer implementation that would kill Firefox a bit too much, but it was fixed months ago: https://bugzilla.mozilla.org/show_bug.cgi?id=1768765 https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1972159

This does not seem to be at all related, at least in terms of the behavior I see. What I have been having is:

  • system works as normal, firefox behaves correctly, memory consumption is fine, all is great
  • while opening a new tab (or in one case, clicking on something that, I believe, was refreshing that current tab into another view), the system starts hanging, fans start spinning, mouse starts moving slowly until it totally halts, computer becomes unresponsive
  • after a while longer (couple of minutes perhaps?), OOMkiller kills the "offending process", and computer gets back to normal (memory consumption lowers, PC becomes usable, load average lowers, fans stop spinning)

Firefox isn't slowly consuming more memory, it is that "new tab" that does it all.

(In reply to :gerard-majax from comment #14)

> I'm afraid that without STR, without an actionable crash report or about:memory it's going to be tricky to know what is going on ...

I am not sure what STR is or how you want me to use about:memory, but if you want me to do something to help collect more info, I will do it if you give me steps for it. Do realize, however, that most probably I will only be able to do it after the process is killed, since when this happens I am not even able to move the mouse cursor from one side of the screen to the other (until the process is killed).

(In reply to Martin Stránský [:stransky] (ni? me) from comment #8)

> I wonder if you have any strong snap sandbox restriction for used memory. Can you try to install plain Mozilla binaries?
> https://fedoraproject.org/wiki/How_to_debug_Firefox_problems#Testing_Mozilla_binaries

I will.

FWIW, I spent 1 week using Mozilla's binaries instead of the snap package, and haven't seen this. Of course, this isn't conclusive proof that this is a snap issue. I will keep using the binary release for a while longer.

It did happen with the binary distributed version too, after all:
https://crash-stats.mozilla.org/report/index/360dd386-b4d0-4f82-97d0-33d060231122

Computer got very slow, but this time I managed to killall -11 firefox-bin before OOMkiller did anything; at that time, the machine's load average was above 100 (!).

No longer blocks: snap

(In reply to mindboosternoori from comment #21)

> It did happen with the binary distributed version too, after all:
> https://crash-stats.mozilla.org/report/index/360dd386-b4d0-4f82-97d0-33d060231122
>
> Computer got very slow, but this time I managed to killall -11 firefox-bin before OOMkiller did anything; at that time, the machine's load average was above 100 (!).

Unfortunately it looks like this crash is just the outcome of your kill :(

Does it not give any information on the browser's state at the time of the kill? Is there anything I can do (instead or before killing the browser) in these cases, to collect information of what's going on (for the next time this happens)?

Did it get slow outside of your usage ? Suddenly or slowly ? If it was not a sudden slowness, and you could still interact with Firefox at some point while it was getting slow, can you try to profile ? https://profiler.firefox.com/ and use Nightly settings I guess ?

Capturing about:memory if you get a chance might help as well.

Since you have access to the system to issue a kill maybe you can also:

  • see htop and look which process or even thread might be consuming CPU / memory
  • get a gdb to attach to this process, and bt it so we might get a clue where it is stuck ?
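The gdb suggestion above could be scripted so it can be started with a single command while the machine is still barely responsive. This is only a sketch: the function names and the output path argument are made up for illustration, it assumes gdb is installed, and attaching to a process that is not a child of the debugger may need root on Ubuntu because of its default ptrace restrictions.

```python
import subprocess


def gdb_backtrace_argv(pid):
    """Build a non-interactive gdb command that prints every thread's stack."""
    return [
        "gdb",
        "-p", str(pid),                 # attach to the running process
        "-batch",                       # run the commands below, then exit
        "-ex", "thread apply all bt",   # backtrace of all threads
    ]


def dump_backtrace(pid, out_path, timeout=120):
    """Run gdb against `pid` and save whatever it prints to `out_path`."""
    result = subprocess.run(
        gdb_backtrace_argv(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep gdb warnings next to the stacks
        text=True,
        timeout=timeout,
        check=False,
    )
    with open(out_path, "w") as fh:
        fh.write(result.stdout)
    return result.returncode
```

For example, `dump_backtrace(pid, "/tmp/firefox-bt.txt")` with the PID of the "Isolated Web Co" process seen in top. Without debug symbols the stacks may be of limited use.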

(In reply to :gerard-majax from comment #24)

> Did it get slow outside of your usage ? Suddenly or slowly ? If it was not a sudden slowness, and you could still interact with Firefox at some point while it was getting slow, can you try to profile ? https://profiler.firefox.com/ and use Nightly settings I guess ?

It was "sudden", but not instant - fans started spinning, the mouse started to get slow.. and it took about 10 seconds until I couldn't do anything.
I will try to capture about:memory next time, but I very much doubt I will be able to even open the tab for it. Going to the profiler page will be impossible, I imagine.

Is there by any chance a way to trigger the about:memory capture from the command line, instead of having to open a tab, and click on two buttons?

I don't think so. Can you verify from htop how much you are consuming before and during ?
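Since an interactive htop may not start in time once the machine is thrashing, one option is to leave a small logger running beforehand so that the "before" and "during" numbers end up in a file. The following is a rough sketch, not a tested tool: it reads the Name and VmRSS fields from /proc/<pid>/status, and the function names, log path and sampling interval are arbitrary choices for illustration.

```python
import os
import time


def parse_status(text):
    """Extract the process name and resident size (kB) from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    name = fields.get("Name", "?")
    # VmRSS is missing for kernel threads; count those as 0 kB.
    rss_kb = int(fields.get("VmRSS", "0 kB").split()[0])
    return name, rss_kb


def snapshot(proc_root="/proc", top=10):
    """Return the `top` largest processes as (rss_kb, pid, name) tuples."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, rss_kb = parse_status(fh.read())
        except OSError:
            continue  # the process exited while we were reading
        rows.append((rss_kb, int(entry), name))
    rows.sort(reverse=True)
    return rows[:top]


def log_samples(path, samples, interval=5.0):
    """Append `samples` snapshots to `path`, one every `interval` seconds."""
    with open(path, "a") as out:
        for _ in range(samples):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            for rss_kb, pid, name in snapshot():
                out.write(f"{stamp} pid={pid} rss_kb={rss_kb} name={name}\n")
            out.flush()  # keep what we have even if the logger gets killed
            time.sleep(interval)
```

Calling something like `log_samples("/tmp/rss.log", samples=17280)` would cover roughly a day at the default interval. The logger itself may stall once the system starts swapping heavily, but the samples written just before that point should still show which process was growing.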

(In reply to :gerard-majax from comment #26)

> Since you have access to the system to issue a kill maybe you can also:
>
>   • see htop and look which process or even thread might be consuming CPU / memory

I usually peek with top, and it is always an "Isolated Web Co...". Will use htop instead next time.

>   • get a gdb to attach to this process, and bt it so we might get a clue where it is stuck ?

Might not be easy or possible: the machine takes a while to respond, and often enough the OOM killer acts first. But I will surely try; it seems more probable I'll succeed at this than at capturing about:memory if there's no way to do it from the command line...

Attached file htop-before.txt
(In reply to :gerard-majax from comment #28)
> I don't think so. Can you verify from `htop` how much you are consuming before and during ?

I'll try to get a "during", will upload a "before" right now.

I tried to get a "during": on a terminal, I typed the htop command, but unfortunately it took around a minute and then an OOMkiller message showed up (killing an "Isolated Web" process) and only after that the htop ran.

[6074574.005731] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-3.scope,task=Isolated Web Co,pid=3130753,uid=1000
[6074574.005839] Out of memory: Killed process 3130753 (Isolated Web Co) total-vm:3480180kB, anon-rss:521224kB, file-rss:0kB, shmem-rss:428kB, UID:1000 pgtables:4460kB oom_score_adj:167
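As an aside, the fields of such "Killed process" lines can be pulled apart mechanically, which makes it easier to compare the kills reported in this bug. This is a minimal sketch that assumes the line format shown above; the regular expression and the `parse_kill` helper are illustrative, not part of any existing tool.

```python
import re

# Matches the "Out of memory: Killed process ..." line format quoted above.
KILL_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]*)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB, shmem-rss:(?P<shmem_rss>\d+)kB"
)


def parse_kill(line):
    """Return the numeric fields of an OOM kill line, or None if it does not match."""
    match = KILL_RE.search(line)
    if match is None:
        return None
    info = {key: (value if key == "name" else int(value))
            for key, value in match.groupdict().items()}
    # Resident memory at kill time is the sum of the three rss counters.
    info["rss_mib"] = (info["anon_rss"] + info["file_rss"] + info["shmem_rss"]) / 1024
    return info


LINE = ("[6074574.005839] Out of memory: Killed process 3130753 (Isolated Web Co) "
        "total-vm:3480180kB, anon-rss:521224kB, file-rss:0kB, shmem-rss:428kB, "
        "UID:1000 pgtables:4460kB oom_score_adj:167")

info = parse_kill(LINE)
print(info["pid"], info["name"], f"{info['rss_mib']:.1f} MiB resident")
```

For the line above this gives roughly 509 MiB resident for the killed content process. These counters do not include whatever part of the process had already been pushed out to swap.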

The killed process in question (3130753) does not show in the htop result (proving that it only ran right after the process was already killed), but it still gives signs of what happened, since it was "right after": notice, e.g., how the used memory is 6.18G/7.6G and swap 1.99G/2.00G. In comparison, an htop run 7 minutes after the kill shows "only" 5.36G/7.6G and 1.96G/2.00G in use. Worth noting: this time I didn't kill anything on my own, and after that one process was killed the computer is in a working, usable state, including the browser, which I'm using to write this comment without having closed it or anything of the sort. At this moment (20 minutes after), the machine (running all the same things except that one process) has 5.41G of memory in use and a load average of 0.32 (in contrast with the 11.85 from the htop right after the kill). I will attach the "right after" and "minutes later" htop output files, in the hope that they'll be useful somehow.

Attached file htop-right-after
Attached file htop-minutes-later

Happened twice today; both times htop only ran after the process was killed. One of the times was probably caused by slack: when I tried to go to its tab afterwards, it had a message saying the tab had crashed. The other time the killed tab was on JIRA. htop output about to be uploaded.

Attached file htop-right-after-case2
Attached file htop-right-after-case3

Still happening with Firefox 121.0 (64-bit).

Attachment #9365654 - Attachment mime type: application/octet-stream → text/plain

Still happens in 121.0.1

Still happening on 122.0.

Unfortunately, the text content shared is full of shell escapes that seem to have been slightly modified and I can't make it visible at all ...

Any chance you can test under wayland?

Unfortunately I can't.

(In reply to :gerard-majax from comment #41)

> Unfortunately, the text content shared is full of shell escapes that seem to have been slightly modified and I can't make it visible at all ...

Yikes, sorry about that, I thought I had checked all the attachments, but for that one I might have uploaded the wrong file...
You should be able to see htop-right-after-case2 and htop-right-after-case3 without an issue (just cat the files).

Still happens on 122.0.1.

(In reply to mindboosternoori from comment #44)

> (In reply to :gerard-majax from comment #41)
>
> > Unfortunately, the text content shared is full of shell escapes that seem to have been slightly modified and I can't make it visible at all ...
>
> Yikes, sorry about that, I thought I had checked all the attachments, but for that one I might have uploaded the wrong file...
> You should be able to see htop-right-after-case2 and htop-right-after-case3 without an issue (just cat the files).

they are full of control chars.

Since you seem to reproduce reliably, have you spotted a website in particular that you use ? Do you have anything that you have constantly loaded ? Can you take a regular look at about:processes to see memory usage of tabs?

Also since it's Ubuntu 20.04, are you using the Debian package provided by Ubuntu ? snap ? external PPA ? our tarball ? our Debian package (I don't think we have a release and I don't think we support 20.04) ?

Flags: needinfo?(mindboosternoori)
Flags: needinfo?(mindboosternoori)

Thanks, in case 3 I'm a bit troubled by PIDs 5189, 5190, 10922 and 11112, they have no CLI parameter so I would guess they are main processes, are you running multiple profiles in parallel ? (each consumes 7.0% of RAM, so that is already 4*573MB = 2.3GB, out of 8GB + 2GB of swap)
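Spelled out, the estimate in that parenthesis looks like the following. It assumes "8GB" is taken as 8192 MB and that the four htop rows are separate processes rather than threads sharing the same memory (htop can list threads too, in which case the figures must not be added up).

```python
TOTAL_RAM_MB = 8192   # assumption: "8GB" taken as 8192 MB
MEM_PERCENT = 7.0     # htop MEM% quoted for each of the four PIDs
PROCESS_COUNT = 4     # PIDs 5189, 5190, 10922 and 11112

per_process_mb = TOTAL_RAM_MB * MEM_PERCENT / 100
combined_mb = per_process_mb * PROCESS_COUNT

print(f"per process: {per_process_mb:.0f} MB")
print(f"combined:    {combined_mb / 1000:.1f} GB")
```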

(In reply to :gerard-majax from comment #47)

> Also since it's Ubuntu 20.04, are you using the Debian package provided by Ubuntu ? snap ? external PPA ? our tarball ? our Debian package (I don't think we have a release and I don't think we support 20.04) ?

/usr/lib/firefox/firefox and distribution_id=canonical so it's canonical's deb

> Since you seem to reproduce reliably, have you spotted a website in particular that you use ?

Not really - it usually happens when a new page is loading in some tab, and it is usually on 'heavy pages' (with media, e.g.), but there isn't a pattern I can identify.

> Do you have anything that you have constantly loaded ?

I usually have several tabs open with 'resource intensive' websites: slack, gmail, jira, confluence, often one tab playing music (bandcamp, e.g.).

> Can you take a regular look at about:processes to see memory usage of tabs?

At this very moment I have 447MB memory and CPU fluctuating a lot (it is between 4% and 8% and then spikes up to 30% or so). Most of it seems to be caused by one tab where gmail is open.

> Also since it's Ubuntu 20.04, are you using the Debian package provided by Ubuntu ? snap ? external PPA ? our tarball ? our Debian package (I don't think we have a release and I don't think we support 20.04) ?

I am currently using the snap, but I did test using Mozilla's binaries and saw the same behavior (see comment 21).

> Thanks, in case 3 I'm a bit troubled by PIDs 5189, 5190, 10922 and 11112, they have no CLI parameter so I would guess they are main processes, are you running multiple profiles in parallel ?

I very rarely have more than one window open: two if I am sharing one of them on a video app, or sometimes I have a private window open. Having the three of them (regular + to share + private) at the same time would be very rare. I do not use multiple profiles, but I don't know if having multiple windows has the same effect. I cannot recall how many windows I had open during those particular cases, but I'm sure this also happens when I have only one.

No, multiple windows should not translate into those processes. Within about:processes you have CPU, Memory as well as PIDs, can you check on your live system if you still see those /usr/lib/firefox/firefox being there (without any parameter), and do you see them in about:processes ?

Can you share about:processes screenshot ordered by memory decreasing ? And maybe about:memory and perform a measure then save, so we have a view ?

Attached image about-processes.png
Attached file memory-report.json.gz

Following the about:processes screenshot is the memory report; for the latter I closed the Atlassian tab and the Slack tab, as they had information (even just their URLs) I'd rather not share.

The two attachments just added hopefully provide answers to your questions.
The issue still happens on 123.0.1.

Still happens on 124.0.1.

Still happens on 124.0.2.

Random observations: The about:support has WebRender on Intel built-in gfx, but the about:memory shows WebRender + SWGL? Not sure what graphics setup is active here.

about:support complains we get a corrupted file when trying to download H264 decoders: "Error: Downloaded file was 3484 bytes but expected 511815 bytes.". I've never seen that before.

I believe this is another case where bug 1823370 may be helpful to get the profiler started even if the system is under heavy load.

Depends on: 1823370

(In reply to Gian-Carlo Pascutto [:gcp] from comment #59)

> Random observations: The about:support has WebRender on Intel built-in gfx, but the about:memory shows WebRender + SWGL? Not sure what graphics setup is active here.
>
> about:support complains we get a corrupted file when trying to download H264 decoders: "Error: Downloaded file was 3484 bytes but expected 511815 bytes.". I've never seen that before.
>
> I believe this is another case where bug 1823370 may be helpful to get the profiler started even if the system is under heavy load.

I can confirm that currently (124.0.2) about:support shows:

Compositing WebRender
WebGL 1 Driver Renderer Intel -- Mesa Intel(R) UHD Graphics 620 (KBL GT2)

And about:memory shows:

Explicit Allocations

372.29 MB (100.0%) -- explicit
├──100.98 MB (27.12%) ── heap-unclassified
├───85.55 MB (22.98%) -- gfx
│ ├──84.91 MB (22.81%) -- webrender
│ │ ├──75.96 MB (20.40%) ── swgl

If this is something you think is worth investigating, I'm available to check out any details you might want, as long as you walk me through it.

Still happens in 125.0.2.

Still happens in 125.0.3.

Still happens in 126.0.

Still happens in 126.0.1.
