Open Bug 1746556 Opened 3 years ago Updated 1 year ago

Firefox in Linux running as SCHED_RR crashes

Categories

(Core :: Widget: Gtk, defect)

Firefox 95
defect

Tracking

()

UNCONFIRMED

People

(Reporter: github, Unassigned)

Details

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0

Steps to reproduce:

wget https://download-installer.cdn.mozilla.net/pub/firefox/releases/95.0.1/linux-x86_64/en-US/firefox-95.0.1.tar.bz2
tar xf firefox-95.0.1.tar.bz2

~]$ chrt -r 11 ./firefox/firefox --profile mktemp -d -no-remote

Actual results:

Use it for a short time, try to do some work, then it will crash as follows:

]$ chrt -r 11 /opt/firefox/firefox --profile mktemp -d -no-remote
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
CPU time limit exceeded

Expected results:

Shouldn't crash ;)

The original post clobbered the backticks. Here it is again:

This crashes:

chrt -r 11 ./firefox/firefox --profile `mktemp -d` -no-remote

Running it in SCHED_OTHER (the Linux default) works fine:

chrt -o 0 ./firefox/firefox --profile `mktemp -d` -no-remote

I also tried invoking Firefox via /full/path/to/firefox, so the crash isn't caused by my pathing; the relative path above is just an example.

See bug 1538435, where I tried to bisect the problem. I think that issue is different from this one, so I'm opening a new report to target this specific issue.

Hey Eric,
Could you please check whether about:crashes lists any submitted crash reports? If it does, please share them here as well.
Is this crash reproducible using a New Firefox profile or in Safe Mode?
Firefox Safe Mode:
https://support.mozilla.org/en-US/kb/troubleshoot-firefox-issues-using-safe-mode
New Firefox Profile:
https://support.mozilla.org/en-US/kb/profile-manager-create-remove-switch-firefox-profiles

Flags: needinfo?(github)

Can the metadata for the version be changed to 95, please? The OP was using 95.0.1 and I am using 95.0.2-2.fc34.

Are you actually running Firefox with SCHED_RR? There's nothing in your report that indicates so.

Exiting due to channel error.
CPU time limit exceeded

Are we just being killed because the threads exceeded their timeslice?
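
Whether the Firefox threads really run under SCHED_RR can be checked from /proc without extra tools. The following is a minimal sketch, assuming the field layout documented in proc(5) (field 40 of /proc/<pid>/stat is rt_priority, field 41 is the policy) and the policy numbers from the Linux headers. The function names are made up for illustration, and matching processes with `pgrep -f firefox` is an assumption about how they are named.

```shell
#!/bin/bash
# Map the kernel's numeric scheduling policy to its name.
policy_name() {
	case "$1" in
		0) echo SCHED_OTHER ;;
		1) echo SCHED_FIFO ;;
		2) echo SCHED_RR ;;
		3) echo SCHED_BATCH ;;
		5) echo SCHED_IDLE ;;
		6) echo SCHED_DEADLINE ;;
		*) echo "UNKNOWN($1)" ;;
	esac
}

# Print "<policy> <rt_priority>" from one line of /proc/<pid>/stat.
# The comm field may contain spaces and parentheses, so everything up to
# the last ") " is dropped before splitting.
policy_from_stat_line() {
	local rest=${1##*) }
	local -a f
	read -r -a f <<< "$rest"
	# rest starts at field 3 (state), so stat field N is f[N-3]:
	# field 40 = rt_priority -> f[37], field 41 = policy -> f[38].
	echo "$(policy_name "${f[38]}") ${f[37]}"
}

# Print "<tid> <policy> <rt_priority>" for every thread of every process
# whose command line mentions firefox.
report_firefox_threads() {
	local pid task
	for pid in $(pgrep -f firefox); do
		for task in /proc/"$pid"/task/*; do
			[[ -r $task/stat ]] || continue
			echo "${task##*/} $(policy_from_stat_line "$(< "$task/stat")")"
		done
	done
}
```

Calling report_firefox_threads while the browser is running under `chrt -r 11` should list the threads that inherited the policy as SCHED_RR with priority 11; threads that show SCHED_OTHER are not affected by the real-time limit.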

Component: Untriaged → Widget: Gtk
Product: Firefox → Core
Version: Firefox 91 → Firefox 95

I run into the same problem whenever I try to share my screen in Google Meet on a pretty slow laptop. The system gets slow and the Firefox instance gets killed after some time (usually a few seconds).

I investigated a bit. It seems the cause is that the process responsible for the Google Meet tab (Isolated Web Co) has an RTTIME limit set:

$ prlimit --pid xxx
...
RTTIME     timeout for real-time tasks             50000     200000 microsecs
...

This seems to be the cause of the issue: the process gets killed when it exceeds this limit.

If I remove the limit with sudo prlimit --pid xxx --rttime=unlimited, the process is no longer killed (at least not during the 10 minutes I tested it). Is there any way to disable this limit in Firefox?

Tested on Firefox 107.0.1, 64-bit Linux (latest KDE Neon, based on Ubuntu 22.04).

Update: it seems the master process firefox has the same limit.
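
For context, setrlimit(2) describes RLIMIT_RTTIME as a limit, in microseconds, on the CPU time a real-time scheduled process may consume without making a blocking system call. Reaching the soft limit (here 50000 us, i.e. 50 ms) sends SIGXCPU, which the shell reports as "CPU time limit exceeded"; reaching the hard limit (200000 us, i.e. 200 ms) sends SIGKILL. The same values can be read from /proc/<pid>/limits, where the row is labelled "Max realtime timeout". A minimal sketch follows; the function names are hypothetical, and matching processes with `pgrep -f firefox` is an assumption.

```shell
#!/bin/bash
# Print "<soft> <hard>" for RLIMIT_RTTIME, given the text of a
# /proc/<pid>/limits file on stdin. The unit column of that row is "us".
rttime_from_limits() {
	awk '/^Max realtime timeout/ { print $4, $5 }'
}

# Succeed if either value is finite, i.e. the process can still be
# signalled or killed through this limit.
rttime_is_finite() {
	local soft hard
	read -r soft hard <<< "$(rttime_from_limits)"
	[[ -n $soft && ( $soft != unlimited || $hard != unlimited ) ]]
}

# List PIDs of firefox processes that still carry a finite limit.
finite_rttime_firefox_pids() {
	local pid
	for pid in $(pgrep -f firefox); do
		if rttime_is_finite 2>/dev/null < "/proc/$pid/limits"; then
			echo "$pid"
		fi
	done
}
```

Running finite_rttime_firefox_pids before and after the prlimit call is one way to confirm that the limit was actually lifted on every process, including the parent.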

Redirect a needinfo that is pending on an inactive user to the triage owner.
:stransky, since the bug has recent activity, could you have a look please?


Flags: needinfo?(github) → needinfo?(stransky)
Flags: needinfo?(stransky)

Well, my firefox just exited: "cpu limit exceeded", so this bug is still a thing.

Mozilla Firefox 102.14.0esr
Debian 11.7

My machine is a potato, and I think firefox is swapping.

I think firefox is doing sound stuff (the browser that exited had no video things, but made notification sounds) and invoking pulseaudio, which may be setting real-time foo for sound. It then finds I don't have any sound output (the USB audio isn't plugged in) and leaves the realtime signal handlers uncaught. Later the browser tries to "Ding!" but swaps in the necessary Ding machinery too slowly, triggering a realtime signal for "ran out of time!". That signal gets caught by the default signal handler, who wasn't expecting it, which causes firefox to exit cleanly.

The Aristocrats!

I just hit this 3 times in ~10 minutes while attempting to read this reddit thread: https://www.reddit.com/r/PrintedCircuitBoard/comments/19ak4z7/kicad_for_professional_projects/

I hit this once or twice late last year on a slower CPU, but since upgrading to a new CPU that is 3x faster with 6x more cores, I've hit it at least 6 times this January.

All of this is on Debian 11 Firefox, which is currently at 115.6.0esr.

I can test things if needed

Another crash while doing a call on google voice. None of these show up in about:crashes

After another crash about 2 minutes after I reported the last one, I created this workaround script:

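#!/bin/bash
# Lift RLIMIT_RTTIME ("Max realtime timeout" in /proc/<pid>/limits) on every
# firefox process that still has a finite limit. Run as root so the hard
# limit can be raised too.
cd /proc || exit 1
# Only numeric directories are PIDs; this skips /proc/self and /proc/thread-self.
# "grep 000" keeps the round finite values seen in this bug (e.g. 50000/200000).
for pid in $(grep realt [0-9]*/limits 2>/dev/null | grep -v 'unlimited.*unlimited' | grep 000 | cut -f1 -d/); do
	# Only touch processes whose command line mentions firefox.
	tmp=$(ps -p "$pid" -o args --no-headers | grep firefox)
	if [[ -n "$tmp" ]]; then
		prlimit --pid "$pid" --rttime=unlimited:unlimited
	fi
done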

I put that in a file and scheduled it to run via cron as root every minute. Thus once a firefox process is up, it will run for an average of 30 seconds before this crash protection kicks in and lifts its limit.
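
The every-minute cron schedule described above could be set up with a system crontab fragment. This is only a sketch: the script location and the file name under /etc/cron.d are assumptions, since the comment does not say where the script was saved or how cron was configured.

```shell
#!/bin/bash
# Write a system crontab fragment that runs the workaround every minute as root.
# Both default paths are hypothetical; pass your own as arguments.
install_rttime_cron() {
	local script=${1:-/usr/local/sbin/firefox-rttime-unlimit}
	local cron_file=${2:-/etc/cron.d/firefox-rttime-unlimit}
	# Entries in /etc/cron.d carry a user field between the schedule
	# and the command.
	printf '%s\n' "* * * * * root $script" > "$cron_file"
	chmod 644 "$cron_file"
}
```

Run as root, install_rttime_cron with no arguments writes the fragment to the assumed default location; the workaround script itself still has to be saved at the given path and made executable.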
