Set oom_score_adj in SetProcessPriority() on Linux to make it less likely for the main process to be killed under memory pressure
Categories
(Core :: Hardware Abstraction Layer (HAL), enhancement)
Tracking
()
People
(Reporter: gsvelto, Assigned: gsvelto)
References
(Blocks 2 open bugs)
Details
Attachments
(1 file)
The introduction of PSI-based means our users encountering more often cases where Firefox gets OOM-killed, especially on machines with very small or no swap files (which ironically are often machines with large amounts of physical memory which should never encounter an OOM in the first place).
To mitigate this issue we can implement hal::SetProcessPriority()
on Linux and instruct it to adjust the value of /proc/<pid>/oom_score_adj
so that background processes are more likely to be targeted by the OOM killer than foreground processes, or the main process itself.
Note that if we can do this right then we might disable tab unloading on Linux and rely on the system OOM killer to reap background tasks for us (a little bit like we did on Firefox OS where unloading an app was entirely done by signaling the OOM killer which pages we cared less about).
Assignee | ||
Comment 1•3 years ago
|
||
Some very quick testing on my machine shows that even small adjustments to oom_score_adj
(such as +100 or even +50) are more than enough to make small child processes have a larger score than a much larger main process. This will however make the affected child processes be more likely to be killed compared to other applications too. That's not necessarily a problem though, if we only make sure that background child processes get killed then Firefox will play nicely with other apps under memory pressure without visibly affecting the user. That isn't a bad outcome.
Assignee | ||
Comment 2•3 years ago
|
||
One last thing worth keeping in mind: the process priority manager was written before Fission, we should double-check that it's still detecting background/foreground processes correctly (it might not).
Comment 3•3 years ago
|
||
(In reply to Gabriele Svelto [:gsvelto] from comment #2)
One last thing worth keeping in mind: the process priority manager was written before Fission, we should double-check that it's still detecting background/foreground processes correctly (it might not).
I think @mccr8 fixed this recently.
Comment 4•3 years ago
|
||
(In reply to Gabriele Svelto [:gsvelto] from comment #0)
Note that if we can do this right then we might disable tab unloading on Linux and rely on the system OOM killer to reap background tasks for us (a little bit like we did on Firefox OS where unloading an app was entirely done by signaling the OOM killer which pages we cared less about).
It'd be helpful if we still get low memory notifications so that the remaining processes can do some compaction. I think I remember someone saying that tab unloading happes "before" low memory notifications so without tab unloading this should actually be better. I don't have a good picture of how tab unloading and low memory notifications interact.
Assignee | ||
Comment 5•3 years ago
|
||
(In reply to Paul Bone [:pbone] from comment #4)
It'd be helpful if we still get low memory notifications so that the remaining processes can do some compaction. I think I remember someone saying that tab unloading happens "before" low memory notifications so without tab unloading this should actually be better. I don't have a good picture of how tab unloading and low memory notifications interact.
Yes, a process being knowingly targeted by the OOM killer could be used in place of the low-memory detection machinery and we could react to it by sending memory pressure events to the other processes. Right now we detect low memory, try to unload tabs, and if that fails also send memory pressure events (thought that's something I wanted to change).
Comment 6•3 years ago
|
||
(In reply to Gabriele Svelto [:gsvelto] from comment #2)
One last thing worth keeping in mind: the process priority manager was written before Fission, we should double-check that it's still detecting background/foreground processes correctly (it might not).
Yes, I fixed this in bug 1618547. You just need to flip the pref on Linux, and the tests should pass and everything. (Of course, it isn't actually changing priorities on Linux, just tracking them.)
Assignee | ||
Comment 7•3 years ago
|
||
Thanks! Given the introduction of systemd-oomd in mainstream distros this is a good time as any to start using it.
Assignee | ||
Comment 8•3 years ago
|
||
This patch implements hal::SetProcessPriority() on Linux and leverages
it to make it more likely that the parent process and foreground tab
survive an OOM situation, letting Linux' OOM killer reap preallocated
processes and background tabs first when reclaiming memory.
This is achieved by setting the oom_score_adj
values of said processes
such that they will be killed in this order:
- Preallocated processes will be the first to go, they don't contain
user data and are not visible, so they're a good candidate to free up
memory - Background tabs will be killed next, we don't generate crash reports
for thoes nor do we inform the user, we just reload the tab, so in
most cases one being killed will only be a small annoyance - Background tabs playing video come next, but only if they're not also
playing audio, see below - Finally foreground tabs will be killed as a last resort, background
tabs playing audio are also considered to be in the foreground as the
user will immediately notice if they crash
Note that this patch only implements the low-level plumbing. The process
priority manager has not been enabled on Linux yet so that needs to
happen before this actually works. Additionally I need to check how
tabs having an active WebRTC session behave, they need to be considered
as foreground tabs and I'm not sure if we're accounting for them yet.
Updated•3 years ago
|
Assignee | ||
Comment 9•2 years ago
|
||
I'm running a bunch of tests before landing this because I want to be sure this doesn't interfere with other stuff in automation.
Comment 10•2 years ago
|
||
Comment 11•2 years ago
|
||
bugherder |
Assignee | ||
Comment 12•2 years ago
|
||
Release Note Request (optional, but appreciated)
[Why is this notable]: Firefox on Linux will be less likely to run out of memory and crash. Additionally if a web page is leaking memory it won't cause other applications to run out of memory, even if it's in the background.
[Affects Firefox for Android]: This is already supported on Firefox for Android, it only affects Firefox desktop on Linux
[Suggested wording]: "Firefox is less likely to run out of memory on Linux, and is better behaved towards the rest of the system when memory is running low."
[Links (documentation, blog post, etc)]:
Comment 13•2 years ago
|
||
Added to the 105 nightly release notes.
Updated•2 years ago
|
Comment 14•2 years ago
|
||
This doesn't work as expected when running as a systemd service, which ships OOMScoreAdjust=100
for user@.service and another +100 because of DefaultOOMScoreAdjust
is unset.
Maybe Firefox could read its current oom_score_adj early and adjust on that base.
Assignee | ||
Comment 15•2 years ago
|
||
(In reply to lilydjwg from comment #14)
This doesn't work as expected when running as a systemd service, which ships
OOMScoreAdjust=100
for user@.service and another +100 because ofDefaultOOMScoreAdjust
is unset.Maybe Firefox could read its current oom_score_adj early and adjust on that base.
Yes, we can do that. I'll file a follow-up.
Description
•