Firefox crashes due to wayland display returning invalid argument 22
Categories
(Core :: Widget: Gtk, defect, P2)
Tracking
()
Release | Tracking | Status |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox72 | --- | unaffected |
firefox73 | --- | fixed |
firefox74 | --- | fixed |
People
(Reporter: nagisa, Assigned: stransky)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: regression)
Attachments
(2 files)
879.37 KB, text/plain
47 bytes, text/x-phabricator-request (RyanVM: approval-mozilla-beta+)
Starting today's Firefox (2020-01-02) on Wayland under sway from git master (sway version 1.2-d510684c, Jan 2 2020, branch 'master'), the browser starts up fine but "crashes" fairly quickly, within a couple of seconds. Firefox does not treat this as a crash in the usual sense.
Standard output shows:
(firefox:12834): Gtk-WARNING **: 00:12:20.450: Loading IM context type 'xim' failed
Gdk-Message: 00:12:27.125: Error 22 (Invalid argument) dispatching to Wayland display.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
[GFX1-]: Receive IPC close with reason=AbnormalShutdown
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
[GFX1-]: Receive IPC close with reason=AbnormalShutdown
Exiting due to channel error.
Exiting due to channel error.
I have upgraded from a fairly old version of nightly, but I also upgraded sway at the same time, so I cannot really rule out sway being the issue. OTOH I do not see the same problem with other gtk applications.
Feel free to ask for additional information.
Will test with a clean profile in a moment.
Reporter
Comment 1•5 years ago
Launching with --ProfileManager seems to prevent this issue from surfacing. I'll investigate more later.
I found the same today after upgrading both sway/wlroots and firefox-nightly.
I am seeing several different log patterns before exit, which is strange.
3 concurrent sessions each exited at startup but with different output:
❯ firefox-nightly
Gdk-Message: 21:15:53.194: Error 22 (Invalid argument) dispatching to Wayland display.
Exiting due to channel error.
[GFX1-]: Receive IPC close with reason=AbnormalShutdown
Exiting due to channel error.
~ 6034s
❯ firefox-nightly
ExceptionHandler::GenerateDump cloned child 24956
ExceptionHandler::SendContinueSignalToChild sent continue signal to child
ExceptionHandler::WaitForContinueSignal waiting for continue signal...
###!!! [Parent][RunMessage] Error: Channel error: cannot send/recv
###!!! [Parent][RunMessage] Error: Channel error: cannot send/recv
###!!! [Parent][MessageChannel] Error: (msgtype=0x37006D,name=PContent::Msg_SuspendInputEventQueue) Channel error: cannot send/recv
###!!! [Parent][MessageChannel] Error: (msgtype=0x37006B,name=PContent::Msg_FlushInputEventQueue) Channel error: cannot send/recv
###!!! [Parent][MessageChannel] Error: (msgtype=0x37006C,name=PContent::Msg_ResumeInputEventQueue) Channel error: cannot send/recv
###!!! [Parent][MessageChannel] Error: (msgtype=0x37004D,name=PContent::Msg_Shutdown) Channel error: cannot send/recv
###!!! [Parent][RunMessage] Error: Channel error: cannot send/recv
###!!! [Parent][RunMessage] Error: Channel error: cannot send/recv
Gdk-Message: 21:16:52.146: Error 71 (Protocol error) dispatching to Wayland display.
[GFX1-]: Receive IPC close with reason=AbnormalShutdown
Exiting due to channel error.
I found that running with MOZ_ENABLE_WAYLAND=0 firefox-nightly works for now.
Assignee
Comment 4•5 years ago
Can you please run firefox with WAYLAND_DEBUG=1 env variable set with wayland enabled and attach the log here?
Thanks.
Reporter
Comment 5•5 years ago
Attempts to reproduce with WAYLAND_DEBUG=1 result in https://crash-stats.mozilla.org/report/index/89ead2cf-c245-4c2d-8b68-1c5e90200103
Reporter
Comment 6•5 years ago
Downgrading Firefox to the nightly from 2019-12-09 fixes the issue.
Therefore, the issue started occurring between 2019-12-09 (works fine) and 2020-01-03 (fails).
Here are a few more identical crashes from attempts to debug with WAYLAND_DEBUG=1:
https://crash-stats.mozilla.org/report/index/0d5ac7e2-ea83-4f18-8ce4-a37020200103
https://crash-stats.mozilla.org/report/index/bp-f02eddfb-da52-4616-b3e5-c23c30200103
https://crash-stats.mozilla.org/report/index/bp-3318e8b2-fef3-47d0-bfe0-adb1c0200103
https://crash-stats.mozilla.org/report/index/bp-33b9a2f7-b8cc-41f6-8dad-52c0b0200103
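The regression window in this comment (2019-12-09 good, 2020-01-03 bad) spans 25 days, so it can be narrowed by bisecting nightlies; Mozilla's mozregression tool automates that. A minimal sketch of the search itself, assuming builds are identified by day offsets from the last known-good nightly. `first_bad_build` and the `is_good` predicate are hypothetical names for illustration, not part of any Mozilla tool:

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: returns the first failing build in the range (good, bad].
// Builds are day offsets from the last known-good nightly. is_good is an assumed
// predicate, e.g. "launch that build and check whether it exits with the error".
// Precondition: build `good` works and build `bad` fails.
int first_bad_build(int good, int bad, const std::function<bool(int)>& is_good) {
    assert(good < bad);
    while (bad - good > 1) {
        int mid = good + (bad - good) / 2;  // test the build in the middle
        if (is_good(mid)) {
            good = mid;                     // regression landed after mid
        } else {
            bad = mid;                      // regression landed at or before mid
        }
    }
    return bad;
}
```

With a 25-day window this needs at most five builds to be tested, provided the failure reproduces reliably on every affected build.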
Reporter
Comment 7•5 years ago
Not sure if this will be helpful at all, but here's the log collected for a session that ultimately resulted in one of the wl_log_set_handler crashes. I don't know if or how it is relevant to the original issue, but here goes.
Assignee
Comment 8•5 years ago
Thanks, the backtraces are clear: it's a problem with setting an opaque region.
Assignee
Comment 9•5 years ago
I suspect this is a sway bug that occurs when a null opaque region is set. I filed https://github.com/swaywm/sway/issues/4875 for further work.
Assignee
Comment 10•5 years ago
See also Bug 1606848
Assignee
Comment 11•5 years ago
Can you try to disable webrender, i.e. run Firefox with basic compositor?
Set gfx.webrender.force-disabled to true at about:config and restart Firefox.
Thanks.
Reporter
Comment 12•5 years ago
Setting gfx.webrender.force-disabled does not make this issue go away. I think WebRender is already disabled by default on my machine anyway because, as per about:support, WEBRENDER_QUALIFIED is "blocked-device-too-old by env: Device too old".
Assignee
Comment 13•5 years ago
I tried Sway on my Fedora 31 box, but I can't reproduce it with the latest nightly.
Assignee
Comment 14•5 years ago
I can reproduce it now. It's because we use an already released region.
Assignee
Comment 15•5 years ago
It can be reproduced reliably when doing drag & drop operations.
Assignee
Comment 17•5 years ago
It's really a multi-thread issue (https://bugzilla.mozilla.org/show_bug.cgi?id=1606848#c2); here's a log of it:
[(null) 69489: Main Thread]: D/WidgetWayland moz_gtk_widget_get_wl_surface [0x7fffdc246a60] wl_surface 0x7fffdc052060 ID 44
[(null) 69489: Main Thread]: D/Widget nsWindow::UpdateTopLevelOpaqueRegionWayland()
[2031351.728] -> wl_compositor@33.create_region(new id wl_region@110)
[2031351.738] -> wl_region@110.add(26, 23, 960, 1020)
[2031351.747] -> wl_surface@44.set_opaque_region(wl_region@110)
[2031351.752] -> wl_region@110.destroy()
[2031351.758] -> wl_compositor@33.create_region(new id wl_region@108)
[2031351.793] -> wl_region@108.add(0, 0, 960, 1020)
[(null) 69489: Compositor]: D/WidgetWayland moz_container_get_wl_surface [0x7fffdc2dc830] surface 0x7fffd92f6510 ready_to_draw 1
wl_surface_set_opaque_region id 107 0x7fffc93d51f0
moz_container_set_opaque_region region id 108 0x7fffcc217f60 BEGIN
[2031351.902] -> wl_region@107.destroy()
moz_container_set_opaque_region region id END, new region is 0x7fffcc217f60
[(null) 69489: Main Thread]: D/Widget END nsWindow::UpdateTopLevelOpaqueRegionWayland() END
Compositor]: D/WidgetWayland moz_container_get_wl_surface
-> We're on the Compositor thread (moz_container_get_wl_surface) while the opaque region is being updated from the Main thread (UpdateTopLevelOpaqueRegionWayland()).
[2031351.943] -> wl_surface@61.set_opaque_region(
Thread 25 "Compositor" received signal SIGSEGV, Segmentation fault.
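One way to read the log above: the Compositor thread picks up the region stored in the container (id 107), the Main thread then stores a new region (id 108) and destroys the old one, and the Compositor thread finally issues set_opaque_region with the object it read earlier. A minimal single-threaded model of that interleaving, using the region ids from the log. `RegionTable` and `replay_unsynchronized` are hypothetical names for illustration and do not exist in Firefox or libwayland:

```cpp
#include <cassert>
#include <set>

// Toy stand-in for the set of wl_region objects that are still alive.
// Hypothetical type for illustration; not part of Firefox or libwayland.
struct RegionTable {
    std::set<int> live;
    void create(int id) { live.insert(id); }
    void destroy(int id) { live.erase(id); }
    bool is_live(int id) const { return live.count(id) != 0; }
};

// Replays the suspected interleaving on one thread, step by step.
// Returns true if the id used by the final "Compositor" step is still live.
bool replay_unsynchronized() {
    RegionTable regions;
    int container_region = 107;  // region currently stored in the container
    regions.create(container_region);

    // Compositor thread: reads the stored region to pass to set_opaque_region.
    int seen_by_compositor = container_region;

    // Main thread: stores a new region and destroys the old one.
    regions.create(108);
    regions.destroy(container_region);
    container_region = 108;
    assert(regions.is_live(container_region));  // the replacement itself is fine

    // Compositor thread: resumes with its stale copy.
    return regions.is_live(seen_by_compositor);
}
```

`replay_unsynchronized()` returns `false`: the final step holds an id that has already been destroyed. In the real process the stale object is a pointer rather than an integer, which is consistent with the SIGSEGV at the end of the log.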
Comment 18•5 years ago
As seen in bug 1606848, this is not restricted to Sway but also affects GNOME.
Assignee
Comment 19•5 years ago
nsWindow::UpdateOpaqueRegion() is called from the Main thread, and it collides with
moz_container_get_wl_surface(), which uses the opaque region and is called from the Compositor thread.
As a fix, don't set the opaque region directly on mozcontainer; instead, set a flag to signal
that an update is needed, and calculate/set the opaque region in moz_container_get_wl_surface() or
moz_container_egl_window_set_size().
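The deferred-update idea described in this comment can be sketched as follows. This is a simplified illustration under stated assumptions, not the landed patch: `Container`, `Rect`, `RequestOpaqueRegion` and `GetSurfaceAndApplyRegion` are hypothetical names, and guarding the flag with a mutex is this sketch's own choice, since the comment does not say how the flag is protected.

```cpp
#include <cassert>
#include <mutex>

struct Rect { int x, y, width, height; };

// Hypothetical stand-in for the container widget; not the real MozContainer.
class Container {
public:
    // Main thread: only record what is wanted. No region object is touched here.
    void RequestOpaqueRegion(const Rect& r) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_ = r;
        needs_update_ = true;
    }

    // Compositor thread: called where the surface is fetched. Applies a pending
    // update, so the region is created, used and released on one thread only.
    // Returns true if an update was applied by this call.
    bool GetSurfaceAndApplyRegion() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!needs_update_) {
            return false;
        }
        applied_ = pending_;  // real code would create, set and destroy a wl_region here
        needs_update_ = false;
        return true;
    }

    // Last region that was actually applied (all zeros before the first update).
    Rect AppliedRegion() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return applied_;
    }

private:
    mutable std::mutex mutex_;
    Rect pending_{};
    Rect applied_{};
    bool needs_update_ = false;
};
```

The point of the design is that the thread requesting a change never owns or frees the region object; it only leaves a note, and the thread that already works with the surface acts on it the next time it asks for the surface.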
Comment 20•5 years ago
Pushed by nerli@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9b54914b2037 [Wayland] Manage opaque region of mozcontainer internally, r=heftig
Comment 21•5 years ago
bugherder
Comment 22•5 years ago
Hi Martin, does this need a Beta uplift request for 73?
Assignee
Comment 23•5 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #22)
> Hi Martin, does this need a Beta uplift request for 73?
Yes please. I'll file the uplift request.
Assignee
Comment 24•5 years ago
Comment on attachment 9118925 [details]
Bug 1606751 [Wayland] Manage opaque region of mozcontainer internally, r?heftig
Beta/Release Uplift Approval Request
- User impact if declined: Crashes on Wayland backend caused by concurrent writes to mozcontainer.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Linux/Wayland only.
- String changes made/needed: none
Comment 25•5 years ago
Comment on attachment 9118925 [details]
Bug 1606751 [Wayland] Manage opaque region of mozcontainer internally, r?heftig
Wayland crash fix. Approved for 73.0b3.
Comment 26•5 years ago
bugherder uplift