Open Bug 1538435 Opened 3 years ago Updated 28 days ago

[Wayland] Firefox can self closed randomly (Exiting due to channel error)

Categories

(Core :: Widget: Gtk, defect, P3)

68 Branch
defect

Tracking Status
firefox68 --- wontfix
firefox69 --- wontfix
firefox70 --- wontfix
firefox71 --- wontfix
firefox74 --- wontfix
firefox75 --- wontfix
firefox76 --- fix-optional

People

(Reporter: mikhail.v.gavrilov, Unassigned, NeedInfo)

References

(Blocks 1 open bug)

Attachments

(4 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0

Steps to reproduce:

I use the latest nightly build:

$ Downloads/firefox/firefox
*** You are running in chaos test mode. See ChaosMode.h. ***
IPDL protocol Error: Received an invalid file descriptor
IPDL protocol Error: Received an invalid file descriptor

###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost

IPDL protocol Error: Received an invalid file descriptor
IPDL protocol Error: Received an invalid file descriptor
IPDL protocol Error: Received an invalid file descriptor
[Child 12869, Chrome_ChildThread] WARNING: pipe error (56): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 357
[Child 12854, Chrome_ChildThread] WARNING: pipe error (57): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 357
[Child 12854, Chrome_ChildThread] WARNING: pipe error (55): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 357
Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=1.89439) [GFX1-]: Receive IPC close with reason=AbnormalShutdown
[Child 12869, Chrome_ChildThread] WARNING: pipe error (53): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 357
Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=2.25399) [GFX1-]: Receive IPC close with reason=AbnormalShutdown
[Child 12869, Chrome_ChildThread] WARNING: pipe error (3): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 357
Exiting due to channel error.
[Child 12854, Chrome_ChildThread] WARNING: pipe error (3): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 357
Exiting due to channel error.
Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=5.19836) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=7.18981) [GFX1-]: Receive IPC close with reason=AbnormalShutdown
[Child 12752, Chrome_ChildThread] WARNING: pipe error (52): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 357
[GFX1-]: Receive IPC close with reason=AbnormalShutdown
[Child 12752, Chrome_ChildThread] WARNING: pipe error: Broken pipe: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 726
Exiting due to channel error.
[Child 12800, Chrome_ChildThread] WARNING: pipe error (3): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 357
Exiting due to channel error.
Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=11.0966) [GFX1-]: Receive IPC close with reason=AbnormalShutdown
[Child 12661, Chrome_ChildThread] WARNING: pipe error (3): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 357
Exiting due to channel error.

Actual results:

Firefox crashed (self closed)

Has STR: --- → yes
Component: Untriaged → Graphics
Product: Firefox → Core
Priority: -- → P3

How can I provide the needed info?

I'm running Firefox and it crashes all the time (6 to 12 minutes after launching it). I have been trying to use it for 2 years. The crashes listed in about:crashes are no longer available in your bug tracker because of the 6-month limit policy.

I'm currently on Firefox 69 on Debian Sid, and I also tried the build downloaded from upstream. Both normal mode and safe mode crash. Disabling hardware acceleration does not help. Creating a new ~/.mozilla does not help.

$ firefox
[Child 20457, Chrome_ChildThread] WARNING: pipe error (3): Conexión reinicializada por la máquina remota: file /build/firefox-W4ZBKx/firefox-69.0.1/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
[GFX1-]: Receive IPC close with reason=AbnormalShutdown
Exiting due to channel error.
Exiting due to channel error.
Terminado (killed)

I was a happy Firefox user until it started to crash on my computer. Please note that other apps do not crash.

Please help me fix my Firefox issue.

Thank you!

I'm affected as well. Version 68 worked correctly.

Now Firefox 69.0.1 always crashes when I try to open the hamburger menu or type any text (a URL or search text) into the address bar. FF fails with the same error that i5513 has.

As long as I don't open the menu or enter text into the address bar, FF works.

I'm on Gentoo, KDE Plasma 5.16.5, kernel 5.3.1-gentoo, old NVIDIA graphics (GeForce GT 740M) with the 435.21 drivers, used together with the built-in Intel graphics.

Update: I returned to FF 68 and everything works, no crashes.

I have two other computers where FF 69 works.

On this one, where it fails, I had some issues with overheating. I have solved those and recompiled gcc and FF 69, and it works correctly now. So it was probably an issue with a corrupted gcc.

Today I tried the Firefox snap version, but it didn't work :(. Firefox won't run for more than 5 to 10 minutes on my laptop.
$ snap list firefox
Name Version Rev Tracking Publisher Notes
firefox 69.0.3-1 274 stable mozilla✓ -

[Child 15392, MediaDecoderStateMachine #1] WARNING: Decoder=7f4ed2df2800 Decode error: NS_ERROR_DOM_MEDIA_METADATA_ERR (0x806e0006) - static MP4Metadata::ResultAndByteBuffer mozilla::MP4Metadata::Metadata(mozilla::ByteStream *): Cannot parse metadata: file /builds/worker/workspace/build/src/dom/media/MediaDecoderStateMachine.cpp, line 3309
Gtk-Message: Failed to load module "canberra-gtk-module"
Gtk-Message: Failed to load module "canberra-gtk-module"
Gtk-Message: Failed to load module "canberra-gtk-module"
Gtk-Message: Failed to load module "canberra-gtk-module"
user-open error: no such file or directory
Gtk-Message: Failed to load module "canberra-gtk-module"
Gtk-Message: Failed to load module "canberra-gtk-module"
user-open error: no such file or directory
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
[GFX1-]: Receive IPC close with reason=AbnormalShutdown
Exiting due to channel error.
Terminado (killed)

Tried the beta version from snap, with the same result:

$ snap list firefox
Name Version Rev Tracking Publisher Notes
firefox 70.0b14-1 275 beta mozilla✓ -

LibThai: Fail to open dictionary at '/usr/share/libthai/thbrk.tri'.
LibThai: Fail to open dictionary at '/usr/share/libthai/thbrk.tri'.
LibThai: Fail to open dictionary at '/usr/share/libthai/thbrk.tri'.
[Parent 19755, Gecko_IOThread] WARNING: pipe error (143): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
[Parent 19755, Gecko_IOThread] WARNING: pipe error (256): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
[Child 20304, Chrome_ChildThread] WARNING: pipe error (3): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Terminado (killed)

How can we debug this issue? I would like to stick with Firefox ... but for now I have to use Chrome (for 2 years now :( )

Thank you !

Tried with classic Xorg and with Wayland; the result is the same.

Seeing this with Firefox 71.0 on Arch Linux.

Dec 07 16:17:27 miner firefox.desktop[3372]: [Parent 3372, Gecko_IOThread] WARNING: pipe error (260): Connection reset by peer: file /build/firefox/src/firefox-71.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
Dec 07 16:17:27 miner firefox.desktop[3372]: [Parent 3372, Gecko_IOThread] WARNING: pipe error (192): Connection reset by peer: file /build/firefox/src/firefox-71.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
Dec 07 16:17:27 miner firefox.desktop[3372]: [Parent 3372, Gecko_IOThread] WARNING: pipe error (227): Connection reset by peer: file /build/firefox/src/firefox-71.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
Dec 07 16:17:27 miner firefox.desktop[3372]: [Parent 3372, Gecko_IOThread] WARNING: pipe error (190): Connection reset by peer: file /build/firefox/src/firefox-71.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
Dec 07 16:17:28 miner firefox.desktop[3428]: ###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
Dec 07 16:17:28 miner firefox.desktop[3372]: ###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
Dec 07 16:17:28 miner firefox.desktop[3372]: [Parent 3372, Gecko_IOThread] WARNING: pipe error (82): Connection reset by peer: file /build/firefox/src/firefox-71.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
Dec 07 16:17:28 miner firefox.desktop[3372]: ###!!! [Child][MessageChannel::SendAndWait] Error: Channel error: cannot send/recv

I have this issue:

  • Firefox 71.0 (64-bit)
  • NixOS
  • Sway

This has only happened to me on reddit.com (new), although it could be pure coincidence. Each time I restart my browser and visit the same reddit pages, it crashes after some time.

(firefox:5267): Gdk-WARNING **: 16:12:05.261: (../gdk/wayland/gdkwindow-wayland.c:810):buffer_release_callback: runtime check failed: (impl->staging_cairo_surface != cairo_surface)
JavaScript error: , line 0: uncaught exception: Object
JavaScript error: , line 0: uncaught exception: undefined
JavaScript error: , line 0: uncaught exception: undefined
JavaScript error: , line 0: uncaught exception: undefined
JavaScript error: , line 0: uncaught exception: undefined
.firefox-wrapped: cairo-surface.c:930: cairo_surface_reference: Assertion `CAIRO_REFERENCE_COUNT_HAS_REFERENCE (&surface->ref_count)' failed.
[Child 5689, Chrome_ChildThread] WARNING: pipe error (3): Connection reset by peer: file /build/firefox-71.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
[Child 5502, Chrome_ChildThread] WARNING: pipe error (3): Connection reset by peer: file /build/firefox-71.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
Exiting due to channel error.
[Child 5493, Chrome_ChildThread] WARNING: pipe error (3): Connection reset by peer: file /build/firefox-71.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
Exiting due to channel error.
[Child 5530, Chrome_ChildThread] WARNING: pipe error (3): Connection reset by peer: file /build/firefox-71.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
[Child 5449, Chrome_ChildThread] WARNING: pipe error (3): Connection reset by peer: file /build/firefox-71.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
[Child 5422, Chrome_ChildThread] WARNING: pipe error (55): Connection reset by peer: file /build/firefox-71.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=69.7012) [Child 5393, Chrome_ChildThread] WARNING: pipe error (3): Connection reset by peer: file /build/firefox-71.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=69.7892) Exiting due to channel error.
[Child 5422, Chrome_ChildThread] WARNING: pipe error (3): Connection reset by peer: file /build/firefox-71.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
Exiting due to channel error.
[Child 5321, Chrome_ChildThread] WARNING: pipe error (3): Connection reset by peer: file /build/firefox-71.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 358
Exiting due to channel error.
Aborted (core dumped)

Firefox crashing also seems to be linked with opening links by calling the firefox executable.

If I have Firefox already opened and type:

$ firefox 'youtube.com'

Then when I am going to interact with the new tab that was created by that command, the browser will crash. It seems to be the best way I can find to reproduce it. I have also tried with other websites (change youtube.com in the above example), and have the same problem.

Could anyone confirm that this is the problem? And that it is reproducible?

(In reply to nils from comment #10)

I can always reproduce it.

This started happening constantly for me starting with today's nightly update using the wayland backend.
GDK_BACKEND=wayland /.../firefox-nightly/firefox

Gdk-Message: 13:04:57.986: Error flushing display: Invalid argument
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.

Firefox Version: 73.0a1 (2020-01-03) (64-bit)

Note that the above steps mentioned by :nils do not consistently reproduce the issue I'm seeing.

I also want to mention that I just switched to an SSD and my computer has a lot less latency now. For example firefox takes a lot less time to startup.

I'm still running Firefox 71 on the same OS (NixOS), but now the crash just isn't reproducible.

This could suggest this crash is caused by a race condition or something of that kind.

I get the issue on an SSD. The one pattern I've noticed is that it tends to crash shortly after waking and logging in (after automatic suspend). Does anyone else see the Firefox window go transparent when it crashes? It's as if the window becomes hollow with only a frame, and I can see through to my desktop. Firefox doesn't actually close until I hit the close icon. But I get the same "Error: Channel closing: too late to send/recv, messages will be lost" message.

I started seeing this after yesterday's nightly update as well. I'm on Wayland (swaywm).

$ GDK_BACKEND=wayland ./firefox
Gdk-Message: 15:57:53.832: Error 22 (Invalid argument) dispatching to Wayland display.
Exiting due to channel error.

This is very frequent for me—it's probably already happened like four times today.

I can't find a consistent way to reproduce the crash in comment 12, but it seems to happen at random and fairly often.

Workaround: I only see it when running with GDK_BACKEND=wayland. If I simply run firefox-nightly/firefox & (as I'm doing right now), the issue disappears.

Here's some environment info:

Firefox Version: *73.0a1* (2020-01-04) (64-bit)
Arch Linux
Gnome Version 3.34.2 (Wayland)
graphics card:
                       nVidia GP107M [GeForce GTX 1050 Ti Mobile]
                       Intel UHD Graphics 630 (Mobile)
cpu:
                       Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, 3955 MHz (6 core, 6 virtual)
disk:               SK hynix SSD

The problem on Wayland might be a duplicate of Bug 1606751.

Just to note, I'm no longer seeing this issue on the latest nightly build with Wayland. So most likely my problem (which could be different from this bug) was indeed fixed in Bug 1606751.

I also see Firefox exit randomly.

i@alexei:~$ firefox --private-window; echo $?
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
CPU time limit exceeded (core dumped)
152

It seems to correlate with an open purrli.com tab: when the site is open, Firefox exits after about 0.5 to 2 hours; otherwise it at least works much longer.
No new items appear in about:crashes and no files appear in /var/crash/, despite the phrase "core dumped" in the output and ulimit -c unlimited.

Ubuntu 18.04, Firefox 74 (and earlier versions too). Add-ons used: Adblock Plus (without it I can't reproduce; with it, Firefox exits after several hours of running one tab with purrli.com).
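
The exit status 152 printed by `echo $?` above can be decoded with the usual shell convention that a status above 128 means "terminated by signal (status - 128)". The following is a minimal sketch under that assumption; the function names are illustrative, and the signal numbering is that of the machine the script runs on (on Linux, signal 24 is SIGXCPU, which matches the "CPU time limit exceeded" message). It also reads the current CPU-time resource limit, since SIGXCPU is delivered when a process exceeds its soft RLIMIT_CPU.

```python
import resource
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell exit status (the value of $?) using the 128+N convention."""
    if status > 128:
        signum = status - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            # Number does not correspond to a named signal on this platform.
            return f"killed by unknown signal {signum}"
    return f"exited normally with code {status}"


def cpu_time_limit():
    """Return the soft CPU-time limit in seconds, or None if it is unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CPU)
    return None if soft == resource.RLIM_INFINITY else soft


if __name__ == "__main__":
    print(describe_exit_status(152))
    print("soft CPU limit (s):", cpu_time_limit())
```

If the soft limit turns out to be finite in the session that launches Firefox, that limit rather than a browser bug may explain the "CPU time limit exceeded" termination; this is only a hypothesis to check, not a confirmed cause.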

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression

For me Firefox-68.6.0esr-x86_64 ALWAYS crashes immediately after starting it with messages:

Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=1.89439) [GFX1-]: Receive IPC close with reason=AbnormalShutdown

I've tried reinstalling the system, and at the moment I'm using a Slackware64-current PLASMA5 live USB.
The notebook on which the crashes always reproduce is:
Manufacturer: Acer
CPU: AMD C-50
GPU: AMD 6250 (integrated with CPU)

I haven't had any success with recent Firefox versions on this netbook. That is, when
(and if) a version of Firefox (in fact I think those were only Slackware-provided ESRs) runs
without the mentioned crash, it is unusable anyway because of permanent
"GAH your tab has crashed" messages on every tab except text-only (or maybe completely static) ones.

The strange (or maybe good) thing is that I've booted other notebooks (with an older
Intel CPU / integrated GPU) from the same USB stick and have had NO problems at all.
So I think there may be some hardware/GPU driver incompatibility involved in these
crashes.

Any progress? I'm experiencing these crashes on firefox-78.0.1 under Wayland (Sway 1.5)

Seems to be fixed on Debian Unstable, Firefox 79.

Yesterday Firefox ran fine for about three hours, and today I have been browsing for 36 minutes without any crash.

I hope other affected people can confirm it !

I have had the problem since I upgraded Firefox to version 80.
I followed the minor update from Gentoo and am now using 80.0.1.
When Firefox crashes, I get this message in the terminal:

Crash Annotation
GraphicsCriticalError: |
[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=31.4298)
Crash Annotation
GraphicsCriticalError: |
[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=31.2482)
[GFX1-]: Receive IPC close with reason=AbnormalShutdown
[GFX1-]: Receive IPC close with reason=AbnormalShutdown
Crash Annotation
GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=32.4313)
[GFX1-]: Receive IPC close with reason=AbnormalShutdown
Crash Annotation
GraphicsCriticalError: |
[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=31.5242)
[GFX1-]: Receive IPC close with reason=AbnormalShutdown
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Erreur de segmentation (translation: segmentation fault )

The computer that crashes has 2 GPUs:
04:00.0 VGA compatible controller: NVIDIA Corporation G96C [GeForce 9500 GT] (rev a1)
09:0c.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] ES1000 (rev 02)
which are used with the "radeon" and "nouveau" drivers.

I can reproduce the bug each time I try to visit this french bank's site

It would be easy for me to build another version of Firefox with debug info to get a stack trace. Is that needed?

I was able to get a stable build by disabling link-time optimization (the -lto USE flag, for Gentoo users).

Still happens, and only on Wayland.

Happens to me as well on a Gentoo system on Firefox 84 and 85 under certain conditions (e.g., several windows open, try to open more windows). On Wayland, the actual error is:
Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=5115.14) Exiting due to channel error.
Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=3547.66)

I haven't had a chance to try "safe" mode yet, but I'm only running Vimius-C, NoScript and AgentSwitcher, not something I would describe as "heavy extension usage".

I also started wondering why WebGL has some ANDROID in it, but that may just be spurious.

I can confirm that this happens in safe mode, too.

Please run with WAYLAND_DEBUG=1 env variable set. It should reveal if it's a problem in Wayland.
Thanks.

Blocks: wayland
Component: Graphics → Widget: Gtk
Summary: Firefox can self closed randomly (Exiting due to channel error) → [Wayland] Firefox can self closed randomly (Exiting due to channel error)

Martin, happy to test the WAYLAND_DEBUG route, which I'm familiar with and is fully independent from the program, so long as it's a wayland client, that is.
However, are you sure that coredumpctl works in this case? The page you linked says "Application window simply disappear and bug report dialog will show up" and suggests that coredumpctl is one of the things that you can do. However, this is not my case (and I thought not the case for this bug either, but I may be wrong). Firefox simply crashes and there is nothing offered, no crash saved in ~/.mozilla. Let me know if I misunderstood before I start installing more debugging tools.

If you see Mozilla crashreporter dialog please use it, submit crash (from about:crashes) and paste crash ID here. Use coredumpctl only when Mozilla crashreporter dialog does not show up. Also please inspect crashes at about:crashes if they're related.

OK - I can confirm that there is nothing in about:crashes, nor in ~/.mozilla (wherever the crashes get stored) - basically Firefox has not crashed with the crashreporter in a very long time. I'll get coredumpctl and give it a try - will report if I find something relevant.

It's also possible that Firefox does not crash and is just terminated due to a Wayland protocol error. The WAYLAND_DEBUG log should reveal that.

$ tail -50 firefox.log.1 
[3306822.292] wl_display@1.delete_id(312)
[3306822.335] wl_buffer@301.release()
[3306822.379] wl_buffer@172.release()
[3306822.413] wl_callback@312.done(154124)
[3306822.618] wl_buffer@290.release()
[3306833.496] wl_display@1.delete_id(88)
[3306833.591] wl_display@1.delete_id(321)
[3306833.621] wl_display@1.delete_id(320)
[3306833.647] wl_display@1.delete_id(308)
[3306833.686] wl_callback@88.done(76365597)
[3306833.759] wl_callback@320.done(76365597)
[3306833.864] wl_callback@321.done(76365597)
[3306833.946] wl_callback@308.done(76365597)
[3307167.026]  -> wl_compositor@6.create_surface(new id wl_surface@308)
[3307257.694]  -> wl_surface@30.destroy()
[3307258.855]  -> wl_surface@27.destroy()
[3307259.868]  -> zwp_linux_dmabuf_v1@29.destroy()
[3307361.558]  -> wl_surface@308.frame(new id wl_callback@321)
[3307361.709]  -> xdg_wm_base@25.get_xdg_surface(new id xdg_surface@320, wl_surface@308)
[3307361.742]  -> xdg_surface@320.get_toplevel(new id xdg_toplevel@88)
[3307361.764]  -> xdg_toplevel@88.set_parent(nil)
[3307361.787]  -> xdg_toplevel@88.set_title("Mozilla Firefox")
[3307361.808]  -> xdg_toplevel@88.set_app_id("firefox-wayland")
[3307361.827]  -> wl_surface@308.commit()
[3307361.844]  -> org_kde_kwin_server_decoration_manager@11.create(new id org_kde_kwin_server_decoration@312, wl_surface@308)
[3307361.876]  -> org_kde_kwin_server_decoration@312.request_mode(2)
[3307380.879]  -> wl_shm@45.create_pool(new id wl_shm_pool@189, fd 228, 4)
[3307381.139]  -> wl_shm_pool@189.create_buffer(new id wl_buffer@183, 0, 1, 1, 4, 0)
[3307438.980]  -> wl_buffer@183.destroy()
[3307439.040]  -> wl_shm_pool@189.destroy()
[3307439.120]  -> xdg_toplevel@88.destroy()
[3307439.138]  -> xdg_surface@320.destroy()
[3307439.150]  -> org_kde_kwin_server_decoration@312.release()
[3307439.162]  -> wl_surface@308.destroy()
[3307443.120] wl_display@1.delete_id(183)
[3307443.179] wl_display@1.delete_id(189)
[3307443.202] wl_display@1.delete_id(88)
[3307443.226] wl_display@1.delete_id(320)
[3307443.250] wl_display@1.delete_id(312)
[3307443.277] wl_display@1.delete_id(321)
[3307443.305] wl_display@1.delete_id(308)
[3307443.331] xdg_wm_base@25.ping(Exiting due to channel error.
Exiting due to channel error.
Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=46.1424) Exiting due to channel error.
Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=46.6436) Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.

Not sure this is too useful... maybe the server killed firefox because it took a long time to return a pong? I'll come back with that coredumpctl if/when it crashes again and I get it installed :)

[3307443.331] xdg_wm_base@25.ping(Exiting due to channel error.

Yes, this is important. I suggest reporting it to KWin/Wayland for further investigation. Maybe some KWin log can help here, but I don't know how to get it. Vlad?

Flags: needinfo?(vlad.zahorodnii)
Blocks: wayland-kde
No longer blocks: wayland

Martin, FWIW, I don't use KDE/Plasma... I'm on hikari (wlroots based compositor). If you tell me what you're looking for exactly, I can poke around a bit - my understanding is that Wayland server uses ping/pong to figure out if something is deadlocked. The indication of "clients have to respond to a ping in a timely manner" doesn't really describe what ought to happen very well, and I'm not an expert here but anything I can do to help, I will do.
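
To check the ping/pong theory against a saved WAYLAND_DEBUG log, one could scan for `xdg_wm_base` ping events that were never followed by a matching pong request. The sketch below assumes the line layout visible in the log excerpt above (`[timestamp] iface@id.message(args)` for events, with a ` -> ` prefix for client requests). The function name is illustrative, and the `ping(7)`/`pong(7)` lines in the usage example are made up for demonstration; only the truncated final ping comes from the log in this bug.

```python
import re

# Matches lines such as "[3307443.331] xdg_wm_base@25.ping(1234)" and
# "[3307443.400]  -> xdg_wm_base@25.pong(1234)". The closing parenthesis is
# optional because the last line of a log may be cut off mid-write.
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>[\d.]+)\]\s+(?P<out>->\s+)?"
    r"(?P<iface>\w+)@(?P<id>\d+)\.(?P<msg>\w+)\((?P<args>.*)$"
)


def unanswered_pings(lines):
    """Return (timestamp, serial) pairs for pings with no later pong.

    A serial of None means the ping line was truncated before its argument.
    """
    pending = {}    # serial -> timestamp of the ping event
    truncated = []  # timestamps of pings whose serial could not be read
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m["iface"] != "xdg_wm_base":
            continue
        args = m["args"].rstrip().rstrip(")")
        serial = int(args) if args.isdigit() else None
        outgoing = m["out"] is not None
        if m["msg"] == "ping" and not outgoing:
            if serial is None:
                truncated.append(float(m["ts"]))
            else:
                pending[serial] = float(m["ts"])
        elif m["msg"] == "pong" and outgoing:
            pending.pop(serial, None)
    result = [(ts, serial) for serial, ts in pending.items()]
    result += [(ts, None) for ts in truncated]
    return sorted(result, key=lambda item: item[0])


if __name__ == "__main__":
    sample = [
        "[3307440.000] xdg_wm_base@25.ping(7)",          # hypothetical line
        "[3307440.100]  -> xdg_wm_base@25.pong(7)",      # hypothetical line
        "[3307443.305] wl_display@1.delete_id(308)",
        "[3307443.331] xdg_wm_base@25.ping(Exiting due to channel error.",
    ]
    print(unanswered_pings(sample))
```

An unanswered ping at the very end of the log is consistent with the compositor dropping an unresponsive client, but it does not prove it: the process may equally have died for another reason before it could reply.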

Gabriel: it would be interesting to know whether you still see the crashes when using Nightly (as there have been a bunch of crash fixes that are not yet in release), and also whether forcing WebRender on makes any difference (it should be automatically enabled on Nightly).

Thanks, Robert. I activated webrender and went ahead to refresh the 16 windows (assuming that that's the way to make this kick in), and crashed :) Attaching last 1000 lines of the wayland log.
Martin, sorry, I can see there is something related to kwin in the wayland logs :)

Martin: regarding coredumpctl, I am not on systemd, so I can't provide that. There are, however, no core dumps or any logs beyond what I provided from stdout+stderr.

Okay, thanks for the info. I think it would help to get a log from the compositor or to ask the compositor devs.

Flags: needinfo?(vlad.zahorodnii)

Martin, I haven't found anything in the compositor's logs - but I'll pay more attention in the next crash in case there's something not obvious. I'll share this bug in the community, see if anybody has seen the same.

Robert: it turns out that using WebRender didn't yield a crash, although I did see a few strange artifacts here and there, but nothing too offensive, and certainly much more pleasant than a sudden crash.
Perhaps if others want to test, it's easy to enable via about:config, and didn't even need to run nightly.
Of course luck may damn me tomorrow, in which case I will report back :)

Hit this bug again, with firefox 88 (debian unstable) :(

(In reply to Martin Stránský [:stransky] (ni? me) from comment #30)

> Please run with WAYLAND_DEBUG=1 env variable set. It should reveal if it's a problem in Wayland.
> Thanks.

Tried it, but without success:

javi@doraemon:~$ WAYLAND_DEBUG=1 firefox -no-safe
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Terminado (killed)
javi@doraemon:~$ sudo coredumpctl list # do I need to start some service?
No coredumps found.

In about:crashes:
No crash reports have been submitted.

But in the Firefox directory (no *log files found, but is this a crash report?):
javi@doraemon:~/.mozilla/firefox/Crash Reports$ cat InstallTime20210504152106
1621797825
javi@doraemon:~/.mozilla/firefox/Crash Reports$ ls -l --full-time InstallTime20210504152106
-rw------- 1 javi disk 10 2021-05-23 21:23:45.578083218 +0200 InstallTime20210504152106

Tested with and without hardware acceleration enabled.

How can I provide logs?

Thank you!

(In reply to i5513 from comment #47)

Sorry for syntax ...

javi@doraemon:~$ WAYLAND_DEBUG=1 firefox  -no-safe
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Terminado (killed)
javi@doraemon:~$ Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.

javi@doraemon:~$ sudo coredumpctl list
No coredumps found.
javi@doraemon:~/.mozilla/firefox/Crash Reports$ cat InstallTime20210504152106
1621797825javi@doraemon:~/.mozilla/firefox/Crash Reports$ ls -lrt InstallTime20210504152106
-rw------- 1 javi disk 10 may 23 21:23 InstallTime20210504152106
javi@doraemon:~/.mozilla/firefox/Crash Reports$ 
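
The 10-byte content of the InstallTime file looks like a Unix timestamp rather than crash data. As a hedged sketch (the function name is illustrative, and the reading of the file as an install-time marker for build 20210504152106 is an assumption based on its name), it can be decoded like this:

```python
from datetime import datetime, timedelta, timezone


def decode_install_time(text: str) -> datetime:
    """Interpret the file content as seconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(int(text.strip()), tz=timezone.utc)


if __name__ == "__main__":
    stamp = decode_install_time("1621797825")
    print(stamp.isoformat())
    # Same instant in the +0200 offset shown by `ls --full-time` above.
    print(stamp.astimezone(timezone(timedelta(hours=2))).isoformat())
```

The decoded value is 2021-05-23 19:23:45 UTC, i.e. 21:23:45 +0200, which is the same second as the file's modification time. That suggests the file only records when this build was first run and does not contain a crash report.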

Seeing this same problem (Exiting due to channel error) every couple days on Firefox 88.0.1 (64-bit) with Wayland, Gnome 3.38.5, on Debian GNU/Linux 11 (bullseye), with NVMe SSD, no recent crash reports.

Recently Firefox has started to crash very often this way.

$ Downloads/firefox/firefox

###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost


###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost


###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost


###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost


###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost

Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
CPU time limit exceeded (core dumped)

Please open about:support, click on "Copy text to clipboard" and paste it here.

Attached file about:support

MOZ_GMP_PATH: /usr/lib64/mozilla/plugins/gmp-gmpopenh264/system-installed

media.gmp-widevinecdm.abi: x86_64-gcc3-asan

My official Nightly doesn't have an "-asan" widevine plugin, only "x86_64-gcc3".
As far as I remember, Firefox's official ASan build doesn't open the regular crash reporter but has its own crash reporting: https://firefox-source-docs.mozilla.org/tools/sanitizer/asan_nightly.html Are you using this ASan Nightly, a self-compiled build, or one from a distribution (and if so, which one)?

Please try out the official Nightly from https://nightly.mozilla.org and remove the MOZ_GMP_PATH environment variable.

(In reply to Darkspirit from comment #53)

My official Nightly doesn't have an "-asan" widevine plugin, only "x86_64-gcc3".
As far as I remember, Firefox's official ASan build doesn't open the regular crash reporter but has its own crash reporting: https://firefox-source-docs.mozilla.org/tools/sanitizer/asan_nightly.html Are you using this ASan Nightly, a self-compiled build, or one from a distribution (and if so, which one)?

Yes, I am using Firefox's official ASan Nightly binary, downloaded from https://firefox-source-docs.mozilla.org/tools/sanitizer/asan_nightly.html.

Please try out the official Nightly from https://nightly.mozilla.org and remove the MOZ_GMP_PATH environment variable.

It turns out that the MOZ_GMP_PATH variable was set by the script /etc/profile.d/gmpopenh264.sh, which was installed by the mozilla-openh264 package, which is installed by default in Fedora.

# cat /etc/profile.d/gmpopenh264.sh
MOZ_GMP_PATH="/usr/lib64/mozilla/plugins/gmp-gmpopenh264/system-installed"
export MOZ_GMP_PATH

[root@primary-ws /]# dnf provides /etc/profile.d/gmpopenh264.sh
Last metadata expiration check: 0:25:43 ago on Wed 01 Sep 2021 01:46:44 AM +05.
mozilla-openh264-2.1.1-2.fc35.x86_64 : H.264 codec support for Mozilla browsers
Repo        : @System
Matched from:
Filename    : /etc/profile.d/gmpopenh264.sh

mozilla-openh264-2.1.1-2.fc35.x86_64 : H.264 codec support for Mozilla browsers
Repo        : fedora-cisco-openh264
Matched from:
Filename    : /etc/profile.d/gmpopenh264.sh

[root@primary-ws /]# rpm -ql mozilla-openh264
/etc/profile.d/gmpopenh264.sh
/usr/lib/.build-id
/usr/lib/.build-id/3d
/usr/lib/.build-id/3d/fdd6769fec61f5e3a1ddc2a5c9239df8a4bcaf
/usr/lib64/firefox
/usr/lib64/firefox/defaults
/usr/lib64/firefox/defaults/pref
/usr/lib64/firefox/defaults/pref/gmpopenh264.js
/usr/lib64/mozilla/plugins/gmp-gmpopenh264
/usr/lib64/mozilla/plugins/gmp-gmpopenh264/system-installed
/usr/lib64/mozilla/plugins/gmp-gmpopenh264/system-installed/gmpopenh264.info
/usr/lib64/mozilla/plugins/gmp-gmpopenh264/system-installed/libgmpopenh264.so
/usr/lib64/mozilla/plugins/gmp-gmpopenh264/system-installed/libgmpopenh264.so.2.1.1
/usr/lib64/mozilla/plugins/gmp-gmpopenh264/system-installed/libgmpopenh264.so.6

I removed this package.
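
For anyone who wants to test without MOZ_GMP_PATH while keeping the package installed, here is a minimal sketch. It assumes `env` supports the `-u` option (GNU coreutils does), and `run_without_gmp_path` is a hypothetical helper name, not something Firefox or Fedora ships.

```bash
#!/bin/bash

# Hypothetical helper: run a command with MOZ_GMP_PATH removed from its
# environment, leaving the calling shell's environment untouched.
run_without_gmp_path() {
	env -u MOZ_GMP_PATH "$@"
}

# Example (not run here):
#   run_without_gmp_path Downloads/firefox/firefox
```

The variable is only hidden from the launched process, so other applications that rely on the profile script are unaffected.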

Me too, I have been experiencing this problem.

[Child 5314, MediaDecoderStateMachine #3] WARNING: Decoder=7f7f82067000 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - RefPtr<MediaSourceTrackDemuxer::SamplesPromise> mozilla::MediaSourceTrackDemuxer::DoGetSamples(int32_t): manager is detached.: file /builds/worker/checkouts/gecko/dom/media/MediaDecoderStateMachine.cpp, line 3470

###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost

###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost

###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost

[Child 5524, MediaDecoderStateMachine #1] WARNING: Decoder=7fa7a0bb8400 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - RefPtr<MediaSourceTrackDemuxer::SamplesPromise> mozilla::MediaSourceTrackDemuxer::DoGetSamples(int32_t): manager is detached.: file /builds/worker/checkouts/gecko/dom/media/MediaDecoderStateMachine.cpp, line 3470
[Child 5524, MediaDecoderStateMachine #1] WARNING: Decoder=7fa7a0bb8400 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - RefPtr<MediaSourceTrackDemuxer::SamplesPromise> mozilla::MediaSourceTrackDemuxer::DoGetSamples(int32_t): manager is detached.: file /builds/worker/checkouts/gecko/dom/media/MediaDecoderStateMachine.cpp, line 3470

###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost

[Child 8024, MediaDecoderStateMachine #3] WARNING: Decoder=7f02360fb400 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - RefPtr<MediaSourceTrackDemuxer::SamplesPromise> mozilla::MediaSourceTrackDemuxer::DoGetSamples(int32_t): manager is detached.: file /builds/worker/checkouts/gecko/dom/media/MediaDecoderStateMachine.cpp, line 3470
[Child 8024, MediaDecoderStateMachine #3] WARNING: Decoder=7f02360f1400 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - RefPtr<MediaSourceTrackDemuxer::SamplesPromise> mozilla::MediaSourceTrackDemuxer::DoGetSamples(int32_t): manager is detached.: file /builds/worker/checkouts/gecko/dom/media/MediaDecoderStateMachine.cpp, line 3470
[Child 8024, MediaDecoderStateMachine #3] WARNING: Decoder=7f0226906400 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - RefPtr<MediaSourceTrackDemuxer::SamplesPromise> mozilla::MediaSourceTrackDemuxer::DoGetSamples(int32_t): manager is detached.: file /builds/worker/checkouts/gecko/dom/media/MediaDecoderStateMachine.cpp, line 3470
[Child 8024, MediaDecoderStateMachine #3] WARNING: Decoder=7f020bcb8c00 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - RefPtr<MediaSourceTrackDemuxer::SamplesPromise> mozilla::MediaSourceTrackDemuxer::DoGetSamples(int32_t): manager is detached.: file /builds/worker/checkouts/gecko/dom/media/MediaDecoderStateMachine.cpp, line 3470
[Child 8024, MediaDecoderStateMachine #3] WARNING: Decoder=7f021652e000 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - RefPtr<MediaSourceTrackDemuxer::SamplesPromise> mozilla::MediaSourceTrackDemuxer::DoGetSamples(int32_t): manager is detached.: file /builds/worker/checkouts/gecko/dom/media/MediaDecoderStateMachine.cpp, line 3470
[Child 8024, MediaDecoderStateMachine #3] WARNING: Decoder=7f021652e000 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - RefPtr<MediaSourceTrackDemuxer::SamplesPromise> mozilla::MediaSourceTrackDemuxer::DoGetSamples(int32_t): manager is detached.: file /builds/worker/checkouts/gecko/dom/media/MediaDecoderStateMachine.cpp, line 3470
[Child 5314, MediaDecoderStateMachine #23] WARNING: Decoder=7f7f8c4cf800 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - RefPtr<MediaSourceTrackDemuxer::SamplesPromise> mozilla::MediaSourceTrackDemuxer::DoGetSamples(int32_t): manager is detached.: file /builds/worker/checkouts/gecko/dom/media/MediaDecoderStateMachine.cpp, line 3470


###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost

When Firefox runs from an HDD or an external HDD, web page performance is poor and slow, and the performance of Firefox extensions suffers as well.
When Firefox runs from memory it is fast, but I still need to test that and will report back.

Sorry, I have tested with Firefox v78, v85, and v89 on Linux and Windows. The problem is the same.

Happening on dev edition 93 (Arch build, on sway).

I think I found one quite reliable way to reproduce this after some experimenting (not the only way but a reliable one). If you open a new Firefox window in sway (tabbed layout), have your focus on any text input such as url bar or input, close the tab with a three finger click on the tab, then switch focus to a terminal window running htop by clicking the sway tab, CPU usage will be 100% until you do something with the keyboard.

I couldn't see what was using all the CPU, but about 15% (with 2 cores) was sway and also some other wayland clients.

Sometimes this will result in a crash, but the probability might be a bit higher when I have a large amount of windows open, though that could also be because having many windows means I'll also be closing them more often.

Probably lots of the information above is irrelevant, but I feel that this might be somehow related to keyboard input in Wayland. I have fcitx5 IME installed, but I think I've seen crashes with and without.

Attached file aboutsupport-miika.txt

[2563171.072] wl_registry@2655.global(37, "wl_seat", 7)
[2563171.088] -> wl_registry@2655.bind(37, "wl_seat", 1, new id [unknown]@2718)

Firefox appears to be repeatedly binding a new wl_seat from the registry, allocating a new ID each time. Can you run Firefox with

MOZ_LOG="Widget:5" WAYLAND_DEBUG=1

variables and attach the log here?
Thanks.
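
One way to collect that log into a single file for attaching is sketched below. `capture_widget_log` and the log file name are hypothetical; the sketch simply redirects both stdout and stderr of the launched command, on the assumption that the debug output goes to one of them.

```bash
#!/bin/bash

# Hypothetical helper: run a command with the requested logging variables
# set, and write everything it prints (stdout and stderr) to a log file.
capture_widget_log() {
	local logfile=$1
	shift
	MOZ_LOG="Widget:5" WAYLAND_DEBUG=1 "$@" >"$logfile" 2>&1
}

# Example (not run here):
#   capture_widget_log firefox-wayland.log firefox
```

The variables are set only for the launched process, so they do not leak into the rest of the shell session.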

Flags: needinfo?(miika9764)

On GNOME + Xorg (without Wayland), Debian Unstable (92.0-1), this bug happens with every Firefox version I tried (Debian package, Flatpak, download from Mozilla, Snap).

Today it ran longer than on other days, but it eventually crashed again.

Thank you

Do you have anything at about:crashes? If so can you please submit the crashreport and paste crash ID here?
Thanks.

The only crash that appears in about:crashes is from May, with Firefox 89.0, so I don't think it would help here.

Today it crashed 3 times (now 4: it crashed again while I was writing this comment), but no about:crashes IDs were generated.
Maybe I need to set an environment variable to generate something useful?

Thank you very much!
PS: the reply button on your comment does not seem to work, at least while I'm writing this comment.

Hello,

Today I was able to browse with Firefox for an hour without a crash after upgrading to 93.0-1 (GNOME + Xorg).

Maybe tomorrow I can try 91.2.0esr-1 too, or check 93.0-1 with Wayland.

I hope to return to Firefox as my default browser!

Thank you!

I had the problem too and bisected the nightlies. I'm running CentOS 7.

2:27.31 INFO: Last good revision: 0cf9eded35d8150796eda1f892666d0a778bd488 (2019-08-26)
2:27.31 INFO: First bad revision: 4c09da80722fcf62db02f19dc67f3f1a6b88f84d (2019-08-27)

I don't have the resources to bisect a full build at the moment, but here is how I pinned it down if others can try too:

mozregression --good 2019-06-02 --bad 2019-10-29 --command '/usr/src/upstream/mozregression/venv/firefox-test {binary}'

Here is the firefox-test command that it runs looking for a SIGXCPU:

==== ./firefox-test ====
#!/bin/bash

set -x

binary=$1

(sleep 15; wmctrl -c firefox) &

(
prof=mktemp -d
$binary --profile $prof
rc=$?
echo rc=$rc;
rm -rf "$prof"
2>&1) | tee firefox.log

if grep "rc=152" firefox.log; then
exit 1
else
exit 0
fi

Trying again with markdown, it clobbered some of my paste in the previous comment:

This bisect shows the first bad nightly on my CentOS 7 system:

2:27.31 INFO: Last good revision: 0cf9eded35d8150796eda1f892666d0a778bd488 (2019-08-26)
2:27.31 INFO: First bad revision: 4c09da80722fcf62db02f19dc67f3f1a6b88f84d (2019-08-27)

This is how the nightlies were bisected:

mozregression --good 2019-06-02 --bad 2019-10-29 --command './firefox-test {binary}'

=== ./firefox-test bash script ===

#!/bin/bash

set -x 

binary=$1

(sleep 15; wmctrl -c firefox) &

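(
	# Use a throwaway profile so each run starts clean.
	prof=$(mktemp -d)
	"$binary" --profile "$prof"
	rc=$?
	# 152 = 128 + 24 (SIGXCPU on Linux); the grep below looks for it.
	echo rc=$rc
	rm -rf "$prof"
) 2>&1 | tee firefox.log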

if grep "rc=152" firefox.log; then 
	exit 1
else
	exit 0
fi
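
As a side note on the `rc=152` check: a shell reports a process killed by a signal as 128 plus the signal number, and SIGXCPU is 24 on common Linux architectures. The sketch below decodes such a status using the shell's own `kill -l`; `signal_from_status` is a hypothetical helper name.

```bash
#!/bin/bash

# Hypothetical helper: translate a shell exit status into a signal name.
# Statuses above 128 mean "terminated by signal (status - 128)".
signal_from_status() {
	if [ "$1" -gt 128 ]; then
		kill -l "$1"
	else
		echo "exit $1"
	fi
}

# Example (not run here):
#   signal_from_status 152    # the status the bisect script greps for
```

The same helper distinguishes the SIGKILL case (status 137) from SIGXCPU, which matters when a test script is meant to detect only one of them.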

If someone can tell me how to do a bisect by building these versions then I can give it a shot, but I've not built firefox in CentOS 7 before so I'm not sure what I need to do.

Well I'm having a horrible time just trying to bootstrap the build environment from 2019.

I've tried

hg clone https://hg.mozilla.org/mozilla-central/ mozilla-central
cd mozilla-central/
hg update -r 4c09da80722fcf62db02f19dc67f3f1a6b88f84d
./mach bootstrap

but it tries to download an ancient toolchain artifact that is no longer available and bootstrap blows up.

How can I build this tree from 2019 so we can bisect and find the actual cause?

It's building! I was able to ignore the bootstrap errors and run ./mach clobber and ./mach build. It is now building the "bad" commit so we can bisect.

This is el7 so I also had to install a few build dependencies like so in case someone else needs a reference:

wget https://www.mercurial-scm.org/release/centos7/mercurial.repo
yum install mercurial
yum install centos-release-scl
yum-config-manager --enable rhel-server-rhscl-7-rpms
yum install llvm-toolset-7
yum install rust-toolset-7-cargo-vendor.x86_64
cargo install cbindgen

I also had to symlink the clang root and cbindgen to ~/.mozbuild:

mv .mozbuild/clang .mozbuild/clang-orig
ln -s /opt/rh/llvm-toolset-7/root/ ~/.mozbuild/clang 

mv  .mozbuild/cbindgen/cbindgen{,-old}
ln -s ~/.cargo/bin/cbindgen .mozbuild/cbindgen/cbindgen

From there:

hg update -r 4c09da80722fcf62db02f19dc67f3f1a6b88f84d
./mach bootstrap
./mach clobber
./mach build

I may have missed a step in the explanation above, but this is roughly correct.

Once we get a good build bisecting can begin if the bug will (kindly) continue to present.

I'm getting a huge number of build errors like the one below. I've tried rust 1.37, 1.39, 1.41, and 1.56.

Bug#1595218 has changes backed out that fix this exact error, but they are in December '19 and these commits are in October '19. A nightly build did happen, so somehow it built with some version of rust. Maybe it was even older...

Ideas?

 0:28.72 error[E0119]: conflicting implementations of trait `std::clone::Clone` for type `gecko_bindings::structs::root::mozilla::GeckoEffects`:
 0:28.72      --> /usr/src/upstream/mozilla-central/obj-x86_64-pc-linux-gnu/x86_64-unknown-linux-gnu/release/build/style-3157d1b0b8b47e0a/out/gecko_properties.rs:18362:1
 0:28.72       |
 0:28.72 18362 | impl Clone for GeckoEffects {
 0:28.72       | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation for `gecko_bindings::structs::root::mozilla::GeckoEffects`
 0:28.72       | 
 0:28.72      ::: /usr/src/upstream/mozilla-central/obj-x86_64-pc-linux-gnu/x86_64-unknown-linux-gnu/release/build/style-3157d1b0b8b47e0a/out/gecko/structs.rs:38226:31
 0:28.73       |
 0:28.73 38226 |         #[derive(Debug, Copy, Clone)]
 0:28.73       |                               ----- first implementation here

The "good" version doesn't have the error above, it did compile completely. Hurray, this is my first firefox build!

Ok, more info for anyone following along:

The file taskcluster/ci/toolchain/rust.yml hints at the rust version being used, so I'm going to try to build the "bad" version with 1.36.0:

rustup default 1.36.0

Also I had weird python unicode issues in the v2.7.5 that comes with CentOS 7 so I installed 2.7.18 and used VirtualEnv to re-bootstrap and rebuild:

wget https://www.python.org/ftp/python/2.7.18/Python-2.7.18.tgz
tar xzf Python-2.7.18.tgz
cd Python-2.7.18
./configure --enable-optimizations --prefix=/usr/local/python-2.7.18
make -j24 && make install
virtualenv --python=/usr/local/python-2.7.18/bin/python venv
. venv/bin/activate
./mach clobber
./mach bootstrap
./mach build
For more information on what to do now, see https://developer.mozilla.org/docs/Developer_Guide/So_You_Just_Built_Firefox

ok, on to bisecting...

BISECTED!

Here it is:

The first bad revision is:
changeset:   490166:1f4a5ebd5c7b
user:        Paul Adenot <paul@paul.cx>
date:        Tue Aug 27 08:00:43 2019 +0000
summary:     Bug 1576168 - mach vendor rust. r=pehrsons

And there is a related commit just before on the same day:

changeset:   490165:6e85d1ee673a
user:        Paul Adenot <paul@paul.cx>
date:        Tue Aug 27 08:00:36 2019 +0000
summary:     Bug 1576168 - Update audio_thread_priority to 0.19.1. r=pehrsons

Here is the description from Paul:

From Paul Adenot (:padenot):

Attached file Bug 1576168 - Update audio_thread_priority to 0.19.1. r?pehrsons — Details

This changes the hard-limit of RLIMIT_RTTIME to be the maximum available
(200ms on my system), and keep the soft limit to the same number.

Having different numbers allows catching SIGXCPU before getting SIGKILL.

So the fix to bug#1576168 caused (this) bug#1538435.

Strange thing is, this is just a version change of audio_thread_priority to 0.19.1, but somehow that broke it.
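
To illustrate the soft/hard distinction in that commit message, here is a small sketch using the ordinary CPU-time limit (`ulimit -t`, RLIMIT_CPU) as a stand-in. This is an analogy only: it is not RLIMIT_RTTIME and not what audio_thread_priority does, but the kernel applies the same rule of SIGXCPU at the soft limit and SIGKILL at the hard limit. `demo_soft_cpu_limit` is a hypothetical name.

```bash
#!/bin/bash

# Hypothetical demo: a busy loop in a subshell with a soft CPU limit of 1 s
# and a hard limit of 2 s. The soft limit is hit first, so the subshell
# receives SIGXCPU rather than SIGKILL.
demo_soft_cpu_limit() {
	(
		ulimit -St 1   # soft limit: SIGXCPU after 1 s of CPU time
		ulimit -Ht 2   # hard limit: SIGKILL after 2 s of CPU time
		while :; do :; done
	) 2>/dev/null
	echo "status=$?"
}

# Example (not run here; takes about one second of CPU time):
#   demo_soft_cpu_limit
```

If both limits were set to the same value, there would be no window in which the process could observe SIGXCPU before being killed, which is the situation the commit message describes avoiding.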

If it is useful, here is the bisect history:

~]# hg log -r "bisect(good) or bisect(bad)" --template "{node|short} {bisect} {summary}\n"
0cf9eded35d8 good 
7ce6426b7653 good 
c8268335f83d good 
de8a6bdc5d73 good 
2ba2516d4a2c good 
f4da3ae7efa6 good 
6e85d1ee673a good 
1f4a5ebd5c7b bad    <<<
b51268c0040b bad 
4c09da80722f bad 

Build details:

  • CentOS 7
  • Python 2.7.18
  • Rust 1.36.0
  • Clang 5.0.1
  • gcc 4.8.5 (Red Hat 4.8.5-44)

Eric, are you playing any kind of audio at all when it's crashing? What is the return value of:

dbus-send --system --dest=org.freedesktop.RealtimeKit1 --print-reply /org/freedesktop/RealtimeKit1 org.freedesktop.DBus.Properties.Get string:org.freedesktop.RealtimeKit1 string:RTTimeUSecMax

on the command-line? I'm very surprised we're seeing a crash there. In any case, it's probably best to create a different bug for your issue. This one appears to be Wayland related. The message we're seeing in any case is of a child process being killed.

Flags: needinfo?(github)

It seems my bisect test case wasn't testing what I thought it was: it detected the change from SIGKILL to SIGXCPU, but this isn't the right commit. Hopefully the procedure above is useful in case someone can reproduce and bisect.

(In reply to Paul Adenot (:padenot) from comment #73)

Eric, are you playing any kind of audio at all when it's crashing? What is the return value of:

dbus-send --system --dest=org.freedesktop.RealtimeKit1 --print-reply /org/freedesktop/RealtimeKit1 org.freedesktop.DBus.Properties.Get string:org.freedesktop.RealtimeKit1 string:RTTimeUSecMax

on the command-line? I'm very surprised we're seeing a crash there. In any case, it's probably best to create a different bug for your issue. This one appears to be Wayland related. The message we're seeing in any case is of a child process being killed.

~]$ dbus-send --system --dest=org.freedesktop.RealtimeKit1 --print-reply /org/freedesktop/RealtimeKit1 org.freedesktop.DBus.Properties.Get string:org.freedesktop.RealtimeKit1 string:RTTimeUSecMax
method return time=1639600826.705050 sender=:1.3 -> destination=:1.15896 serial=2072 reply_serial=2

variant int64 200000
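
For reference, the value is in microseconds, so 200000 corresponds to the 200 ms mentioned in the commit message. A minimal sketch for pulling the number out of the `dbus-send` reply and converting it follows; both function names are hypothetical, and the parsing assumes the reply has the `variant int64 <value>` shape shown above.

```bash
#!/bin/bash

# Hypothetical helper: read a dbus-send --print-reply reply on stdin and
# print the int64 value carried in the variant line.
rttime_usec_max_from_reply() {
	awk '$1 == "variant" && $2 == "int64" { print $3 }'
}

# Hypothetical helper: convert microseconds to whole milliseconds.
usec_to_ms() {
	echo $(( $1 / 1000 ))
}

# Example (not run here):
#   dbus-send --system ... string:RTTimeUSecMax | rttime_usec_max_from_reply
```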

Flags: needinfo?(github)

Ok I found the problem on my system. I run most of my UI apps in SCHED_RR to get above VMs that take up CPU in the background:

~]$ chrt -p $$
pid 24658's current scheduling policy: SCHED_RR
pid 24658's current scheduling priority: 11

If I run firefox in SCHED_OTHER it runs fine:

~]$ chrt -o 0 firefox

You can probably reproduce the failure pretty easily on any system like so:

~]$ chrt -r 11 firefox
Exiting due to channel error.
CPU time limit exceeded
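
A launcher could check for this situation before starting the browser. The sketch below parses the `chrt -p` output format shown above and classifies the policy; the function names are hypothetical, and treating only SCHED_RR and SCHED_FIFO as real-time is an assumption of this sketch.

```bash
#!/bin/bash

# Hypothetical helper: extract the policy name from `chrt -p <pid>` output
# read on stdin (the "current scheduling policy: ..." line).
policy_from_chrt_output() {
	sed -n 's/.*scheduling policy: //p'
}

# Hypothetical helper: succeed if the given policy is a real-time one.
is_realtime_policy() {
	case $1 in
		SCHED_RR|SCHED_FIFO) return 0 ;;
		*) return 1 ;;
	esac
}

# Example (not run here): fall back to SCHED_OTHER when the shell is real-time.
#   policy=$(chrt -p $$ | policy_from_chrt_output)
#   if is_realtime_policy "$policy"; then chrt -o 0 firefox; else firefox; fi
```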

Wow, what a crazy issue! It worked with Firefox 68 ESR but stopped working in 78 ESR. Maybe I'll rework the bisect and see where firefox started failing under SCHED_RR.

I'm no longer certain that this bug is the same issue that others are reporting. It seems unlikely that there would be so many people running in SCHED_RR.

It seems to only be a problem when firefox is loading. If I set SCHED_RR after it is loaded it doesn't seem to crash (or at least, it hasn't yet):

chrt -o 0 firefox &
for i in `pgrep -fw firefox`; do sudo chrt -p -r 11 $i; done

(In reply to Eric Wheeler from comment #77)

It seems to only be a problem when firefox is loading. If I set SCHED_RR after it is loaded it doesn't seem to crash (or at least, it hasn't yet):

chrt -o 0 firefox &
for i in `pgrep -fw firefox`; do sudo chrt -p -r 11 $i; done

It turns out that this is false, firefox does eventually crash even if SCHED_RR is set after it loads: the tab crashes when I select a field that makes an AJAX request.

So, firefox under SCHED_RR just isn't stable.

Ok, so starting over now that I have a good test script using chrt -r 11, mozregression still shows the same nightly good/bad versions:

 2:32.58 INFO: Last good revision: 0cf9eded35d8150796eda1f892666d0a778bd488 (2019-08-26)
 2:32.58 INFO: First bad revision: 4c09da80722fcf62db02f19dc67f3f1a6b88f84d (2019-08-27)

Here is my test script

#!/bin/bash

set -x 

timeout 7 chrt -r 11 "$@"
rc=$?

# Good if it times out, bad if it crashes within 7 seconds:
if [ $rc = 124 ]; then 
	echo GOOD
	echo hg bisect --good
	exit 0
else
	echo BAD
	echo hg bisect --bad
	exit 1
fi

Now let's see if the same behavior manifests while bisecting the build tree with hg...

Well it seems that the bisect of the nightly builds was not representative of the bug I'm trying to pin down. Even the earliest changeset indicated by the

Perhaps it is the difference between the nightly build environment from 2019 and the one I've built in CentOS 7.

For all builds that we built and all nightly builds after and including 2019-08-27:

This works:

  chrt -o 0 firefox

and this does not:

  chrt -r 11 firefox

Not sure what else to troubleshoot, but I'll just run it in SCHED_OTHER. I'm done troubleshooting, but I'll open a dedicated ticket for this issue.

Here is the bug specific to the chrt issue: Bug#1746556 .

I'm done troubleshooting unless someone asks for more information so hopefully all this helps someone along the way!

Cheers,

-Eric

I'm getting reproducible crashes in wayland when using a wacom stylus (One by Wacom) in Blackboard education software. I'm running KDE Plasma 5.24.2 on Manjaro. A terminal spits out:
ExceptionHandler::GenerateDump cloned child 116268
ExceptionHandler::SendContinueSignalToChild sent continue signal to child
ExceptionHandler::WaitForContinueSignal waiting for continue signal...
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.

(In reply to i5513 from comment #64)

Hello,

Today I was able to browse with Firefox for an hour without a crash after upgrading to 93.0-1 (GNOME + Xorg).

Maybe tomorrow I can try 91.2.0esr-1 too, or check 93.0-1 with Wayland.

I hope to return to Firefox as my default browser!

Thank you!

Firefox (98) and firefox-esr (91) are crashing or failing again. There is no clue in about:crashes. GNOME + Xorg environment.

I downloaded Firefox 93 and it does not seem to crash. I may test versions 93 through 97; I will check Firefox 96 first (the previous Firefox Debian package I had installed).

(In reply to i5513 from comment #84)

Firefox 96.0.3 from Mozilla does not crash on my laptop. I have been using it since I wrote the last comment.

I'm using Firefox 96.0.3 (firefox-wayland package) on NixOS and it crashes regularly.
I'm running jackd2 with pulse audio plugin as audio source.
I switched to wayland because X11 is pretty slow on higher workloads. X11 firefox is also quite unstable and now even firefox-wayland crashes every 2-3 hours :(

This might also be because I was running a PREEMPT_VOLUNTARY kernel with ZFS, which causes periodic hangs. Anyway, closing Firefox as a result is definitely not right.

The bug has a release status flag that shows some version of Firefox is affected, thus it will be considered confirmed.

Status: UNCONFIRMED → NEW
Ever confirmed: true

Redirect a needinfo that is pending on an inactive user to the triage owner.
:stransky, since the bug has recent activity, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(miika9764) → needinfo?(stransky)
Flags: needinfo?(stransky) → needinfo?(moz)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #90)

If you see any crashes please attach crash ID here:
https://fedoraproject.org/wiki/How_to_debug_Firefox_problems?rd=Bug_info_Firefox#Collect_information_for_a_bug_report
Thanks.

When it crashes for me, I do not get any crash report from FF. In about:crashes, I see:

No crash reports have been submitted.

I also tried running FF in gdb, and when it crashes I see the following. Not sure if this is useful though.

###!!! [Parent][PBackgroundParent] Error: RunMessage(msgname=PBackgroundIDBDatabase::Msg_Close) Channel closing: too late to send/recv, messages will be lost
Thread 5 "IPC I/O Parent" received signal SIGPIPE, Broken pipe.
[Switching to Thread 0x7fffe8c43640 (LWP 772678)]
0x00007ffff7b6946d in sendmsg () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff7b6946d in sendmsg () at /usr/lib/libc.so.6
#1  0x00007fffed373a3a in  () at /usr/lib/firefox/libxul.so
#2  0x00007fffed54d5d9 in  () at /usr/lib/firefox/libxul.so
#3  0x00007fffedfd7681 in  () at /usr/lib/firefox/libxul.so
#4  0x00007fffee581be9 in  () at /usr/lib/firefox/libxul.so
#5  0x00007ffff23124ce in  () at /usr/lib/firefox/libxul.so
#6  0x00005555555ba1c5 in set_alt_signal_stack_and_start(PthreadCreateParams*) ()
#7  0x00007ffff7ae25c2 in start_thread () at /usr/lib/libc.so.6
#8  0x00007ffff7b67584 in clone () at /usr/lib/libc.so.6
(gdb) q
A debugging session is active.

	Inferior 1 [process 772596] will be killed.

Quit anyway? (y or n) y
Exiting due to channel error.
Exiting due to channel error.

###!!! [Child][PBackgroundChild] Error: SendAndWait(msgname=PBackgroundLSDatabase::Msg_PBackgroundLSSnapshotConstructor) Channel error: cannot send/recv

Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.

PBackgroundIDBDatabase() looks like a problem with access to your profile. Please file a new bug against Core / Storage:Indexed DB.
