Closed Bug 1718851 Opened 2 years ago Closed 2 years ago

Firefox crashed with no error report

Categories

(Core :: Widget: Gtk, defect, P3)

Firefox 89
defect

RESOLVED WORKSFORME

People

(Reporter: mystiquewolf, Assigned: jjalkanen)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 2 obsolete files)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0

Steps to reproduce:

Firefox just crashed without any error report in about:crashes. I have only 3 old crashes there. I don't know how to reproduce this, but I'm filing a bug anyway so you know that, with the current code logic, a crash can sometimes happen without an error report.

Actual results:

Firefox stopped responding and closed unexpectedly, with no error report in either /var/crash or about:crashes.

Expected results:

Some kind of error reporting.

Attached file Troubleshooting info

The Bugbug bot thinks this bug should belong to the 'Core::Widget: Gtk' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → Widget: Gtk
Product: Firefox → Core

Please run Firefox from a terminal with the WAYLAND_DEBUG=1 environment variable set, like:

WAYLAND_DEBUG=1 firefox > log.txt 2>&1

and attach the log.txt here.
Thanks.

Flags: needinfo?(liubomirwm)
Blocks: wayland
Priority: -- → P3
Attached file log.txt

Attaching the log. This log is without the crash. The crash happened only once before I reported the bug, and I don't really know how to trigger it again or when/if it will happen again.

Flags: needinfo?(liubomirwm)

I ran it for only about 2 minutes. Should I run it for longer with WAYLAND_DEBUG=1?

I see. I need the log only when the crash happens.

Please reopen if you have any further info.
Thanks.

Status: UNCONFIRMED → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME

FYI: In my case it could've been earlyoom killing Firefox. However there are cases like Bug 1721014 where it seems it isn't.

@Martin I'm not sure what's going on, but I'll try to describe it.

I was using an X11 login session on KDE (because I'm waiting for some Wayland fixes in apps other than Firefox) and Firefox was freezing. I have 8 GB of RAM and use only Zram with Zstd as swap; I don't use disk-backed swap. I'm also using EarlyOOM, which kills an app when memory use reaches 90% and free RAM is less than 10%.

On X11 the freezes lasted only about 5 seconds; on Wayland they are more like 10-15 seconds, and from time to time (3 times yesterday) Firefox closes some time after the hang starts. Interestingly, before that happened I was at 42% RAM and 65% swap, so the EarlyOOM trigger had not been reached, and the usual EarlyOOM notice is also missing from the logs. I'm trying to capture a performance profile during one of the freezes. I've also captured a WAYLAND_DEBUG log, but I can't see anything problematic in it.
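To make the "trigger not reached" reasoning concrete, here is a minimal sketch in Python. It assumes a trigger that fires only when both RAM use and swap use reach 90%; the actual EarlyOOM configuration on the reporter's machine is not shown in this bug, so the thresholds and the function names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def used_percentages(info):
    """Return (RAM used %, swap used %) from parsed meminfo values."""
    mem_used = 100.0 * (1 - info["MemAvailable"] / info["MemTotal"])
    swap_total = info.get("SwapTotal", 0)
    # Assumption: with no swap configured, treat swap as fully used.
    swap_used = 100.0 * (1 - info["SwapFree"] / swap_total) if swap_total else 100.0
    return mem_used, swap_used


def trigger_reached(mem_used_pct, swap_used_pct,
                    mem_limit_pct=90.0, swap_limit_pct=90.0):
    """Hypothetical rule: fire only when both limits are reached."""
    return mem_used_pct >= mem_limit_pct and swap_used_pct >= swap_limit_pct


if __name__ == "__main__":
    # The values reported in the comment above: 42% RAM, 65% swap.
    print(trigger_reached(42, 65))
```

Under these assumed thresholds, 42% RAM and 65% swap are well below the trigger, which matches the missing EarlyOOM notice in the logs.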

During one of the freezes I pressed Ctrl + Q and the Firefox window closed, but the Firefox processes kept running for about 1 minute. Then they crashed with this crash: https://crash-stats.mozilla.org/report/index/23f61a81-337a-4b0e-a348-511e00211007

Here is a performance profile: https://share.firefox.dev/3Bq0fLm

I have to say that the freezes are system-wide; that is, the clock in plasmashell freezes too. According to Swapview, 8 Firefox processes have memory in swap. I am also using Fission. Maybe I should look for zram/Kubuntu issues?

Actually 13 if I use this:

grep VmSwap /proc/*/status | sort -r -n -k 2 | while read name size kb; do echo "$(readlink /proc/$(echo ${name} | cut -d/ -f 3)/exe) $size"; done | grep firefox
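For anyone who wants to repeat this measurement, the one-liner above can be sketched in Python roughly as follows. The function names and the `proc_root` parameter are made up for illustration. Unlike the shell version, this sketch skips processes that report 0 kB of swap, so the counts may differ.

```python
import glob
import os


def parse_vmswap_kb(status_text):
    """Return the VmSwap value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            return int(line.split()[1])
    return None  # kernel threads have no VmSwap line


def swapped_processes(name_fragment="firefox", proc_root="/proc"):
    """List (swap_kb, exe_path) for matching processes, largest first."""
    results = []
    for status_path in glob.glob(os.path.join(proc_root, "[0-9]*", "status")):
        pid_dir = os.path.dirname(status_path)
        try:
            with open(status_path) as handle:
                swap_kb = parse_vmswap_kb(handle.read())
            exe = os.readlink(os.path.join(pid_dir, "exe"))
        except OSError:
            continue  # process exited, or we lack permission to inspect it
        if swap_kb and name_fragment in exe:
            results.append((swap_kb, exe))
    return sorted(results, reverse=True)


if __name__ == "__main__":
    for swap_kb, exe in swapped_processes():
        print(exe, swap_kb)
```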

Thanks for the profile data / backtrace. I see that Firefox does not do anything special, just waits in event loop.

Disabling Fission seems to fix this issue: there are no Firefox processes in swap and no crashes so far, after about an hour.
Before that I switched from Zstd to LZ4 (to be faster) and it still crashed without a Firefox crash report and without EarlyOOM logging a kill. It crashed twice, once at RAM 68% / swap 37% and once at RAM 65% / swap 22%.
Yes, with these values it might freeze, but it should not crash IMO. On Wayland it seems to crash. I might try it again with X11, but I believe it was only freezing there.

Do these stalls happen without Zram/etc?
Your configuration is considerably non-standard. Can you get a whole-system performance profile using perf? You can export a perf profile to the Gecko profiler UI via "perf record -g -F 999" and, after capturing, "perf script -F +pid,-cpu >/tmp/foo.perf" (though I haven't tried it for a whole system in this way, normally just for Firefox).
The profile you gave shows no significant CPU use or other issues within Firefox; my assumption is that something in your system is causing the delays.

Flags: needinfo?(liubomirwm)

Yes, these stalls / crashes happen without zram and without any swap enabled.

Firefox's memory use is not small (due to Fission, I guess), and sometimes (probably made worse by the lack of swap) early-oom kills either an Isolated Web Content process or GeckoMain and logs it. It sends SIGTERM, or SIGKILL if SIGTERM times out. When this happens, the text "Terminated" is written at the end of the bash terminal. I noticed this only once, but it probably happens every time and I wasn't watching carefully.

But sometimes Firefox crashes without early-oom logging a SIGTERM/SIGKILL and instead of the text "Terminated" there is the following text written:

###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost

Gdk-Message: 18:07:06.842: Lost connection to Wayland compositor.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.

I don't know when each of these messages is supposed to appear; if you do, you can probably judge better whether something looks off.
I probably need to take more time to check whether what I'm describing is what really happens, or whether I simply wasn't careful enough in my observations.

I also tried to record with perf, but the result was 2.6 GB and Firefox Profiler showed an error when I tried to import it. I have perf.data and perf.data.old and I'm not sure what's in these recordings. I might try a new recording.

What I'm not sure about is (1) whether Firefox will always print "Terminated" when it receives SIGTERM/SIGKILL and (2) whether it is normal for "Lost connection to Wayland compositor" to be printed on SIGKILL/SIGTERM, for example.
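On question (1), a small experiment may help. As far as I know, the "Terminated" text is printed by the shell when a foreground job dies from SIGTERM, not by the killed process itself (and a process cannot catch SIGKILL at all, so it cannot print anything in that case). The sketch below uses a sleeping Python child as a stand-in for Firefox; the helper names are hypothetical, and it only shows how a parent can tell that a child was killed by a signal.

```python
import signal
import subprocess
import sys


def run_and_signal(sig):
    """Start a sleeping child, send it a signal, and return its exit code."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    child.send_signal(sig)
    return child.wait(timeout=10)


def describe_exit(returncode):
    """Translate a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, a negative return code means "killed by signal N".
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    print(describe_exit(run_and_signal(signal.SIGTERM)))
    print(describe_exit(run_and_signal(signal.SIGKILL)))
```

If Firefox exits with "Exiting due to channel error." instead, that suggests it ran its own shutdown path rather than dying directly from a signal, though this sketch cannot confirm that.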

I've also checked sudo dmesg | grep -i kill and it's empty, so I suppose the kernel OOM killer isn't stepping in. In any case, early-oom should step in before the kernel OOM killer.
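The same check can be sketched in Python for a saved kernel log. The patterns below are the kernel OOM killer messages I am aware of ("invoked oom-killer" and "Killed process <pid> (<name>)"); exact wording varies between kernel versions, so treat the regular expression as an assumption. The helper name is hypothetical.

```python
import re

# Assumed message shapes; kernel versions differ in the surrounding text.
OOM_LINE = re.compile(r"invoked oom-killer|Killed process (\d+) \(([^)]+)\)")


def find_oom_events(log_text):
    """Return (pid, name, line) tuples for lines that look like OOM kills.

    pid and name are None for "invoked oom-killer" lines.
    """
    events = []
    for line in log_text.splitlines():
        match = OOM_LINE.search(line)
        if match:
            events.append((match.group(1), match.group(2), line.strip()))
    return events


if __name__ == "__main__":
    # Usage: sudo dmesg > dmesg.txt, then pass the file contents in.
    print(find_oom_events(""))
```

An empty result, like the empty grep output, suggests that the kernel OOM killer was not involved.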

Today Firefox again closed without crash report. In the terminal there was this message:

Gdk-Message: 13:04:17.448: Error 104 (Connection reset by peer) dispatching to Wayland display.
Exiting due to channel error.
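For reference, error 104 on Linux is ECONNRESET: the other end of the socket (here, the Wayland compositor) closed the connection. A minimal Python sketch for looking up such codes follows; the helper name is made up.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an OS error number."""
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


if __name__ == "__main__":
    # 104 is the Linux value; other platforms number ECONNRESET differently.
    print(describe_errno(104))
```

That would be consistent with the compositor going away while Firefox was running, but the log alone does not prove the cause.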

Note that I was updating KDE to 5.23.1 at the time, and the update of kwin could have been the cause. Even so, this should not cause Firefox to crash. Also note that I don't believe this crash is related to the previous crashes involving either "Lost connection to Wayland compositor" or early-oom. Maybe a new bug is needed for this?

Also, these are crashes, and I have no idea why no crash report is sent to Mozilla when they happen...

Here is a link to the last 15% of a profile. A freeze should be noticeable towards the end of the recording (that's when I remember it happening): https://share.firefox.dev/3vw6Odh

The full analysed.perf file is 3.8 GB, so I used the --time 75%-100% option of perf script to reduce the size; the freeze happened towards the end of the recording anyway.
On request I can send you the full 3.8 GB analysed.perf or the 1 GB perf.data, since the profiler tab crashes when importing the full analysed.perf.

Flags: needinfo?(liubomirwm)
See Also: → 1743144

Method FileQuotaStream::SetEOF is exposed by the subclass FileOutputStream.

Assignee: nobody → jjalkanen

A patch has been attached on this bug, which was already closed. Filing a separate bug will ensure better tracking. If this was not by mistake and further action is needed, please alert the appropriate party. (Or: if the patch doesn't change behavior -- e.g. landing a test case, or fixing a typo -- then feel free to disregard this message)

Comment on attachment 9302337 [details]
Bug 1718851 - Test if FileQuotaStream::SetEOF supports extension of files. r=#dom-storage

Revision D161435 was moved to bug 1797913. Setting attachment 9302337 [details] to obsolete.

Attachment #9302337 - Attachment is obsolete: true

Comment on attachment 9302338 [details]
Bug 1718851 - Fix FileQuotaStream::SetEOF to support extension of files. r=#dom-storage

Revision D161481 was moved to bug 1797913. Setting attachment 9302338 [details] to obsolete.

Attachment #9302338 - Attachment is obsolete: true