Crash in XGetWindowAttributes / GtkCompositorWidget::GtkCompositorWidget
Categories
(Core :: Graphics: WebRender, defect)
Tracking
()
Version | Tracking | Status |
---|---|---|
firefox-esr78 | --- | unaffected |
firefox90 | --- | affected |
firefox91 | --- | affected |
firefox92 | --- | affected |
People
(Reporter: aosmond, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: crash)
Crash Data
This bug is for crash report bp-016f1069-c4fe-47e3-b937-140a00200722.
Top 10 frames of crashing thread:
0 libX11.so.6 XGetWindowAttributes
1 libxul.so mozilla::widget::GtkCompositorWidget::GtkCompositorWidget widget/gtk/GtkCompositorWidget.cpp:42
2 libxul.so mozilla::widget::CompositorWidgetParent::CompositorWidgetParent widget/gtk/CompositorWidgetParent.cpp:16
3 libxul.so mozilla::layers::CompositorBridgeParent::AllocPCompositorWidgetParent gfx/layers/ipc/CompositorBridgeParent.cpp:2223
4 libxul.so mozilla::layers::PCompositorBridgeParent::OnMessageReceived ipc/ipdl/PCompositorBridgeParent.cpp:1100
5 libxul.so mozilla::layers::PCompositorManagerParent::OnMessageReceived ipc/ipdl/PCompositorManagerParent.cpp:197
6 libxul.so mozilla::ipc::MessageChannel::DispatchMessage ipc/glue/MessageChannel.cpp:2074
7 libxul.so mozilla::ipc::MessageChannel::MessageTask::Run ipc/glue/MessageChannel.cpp:1953
8 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1234
9 libxul.so mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:302
Bug 1603839 (and bug 1633462) fixed a related crash, but we still see this one with X11 and the GPU process.
Reporter
Comment 1•4 years ago
The first build with the signature is 20200424114754, the build in which bug 1603839 landed.
Reporter
Comment 2•4 years ago
We don't see this without the GPU process, and things work slightly differently without it. We are passed a non-null nsWindow, and we get mXDisplay from there. Given that mXWindow is an ID that it probably (?) verifies, I assume this line gives us a nullptr:
Assuming that is true, disabling the GPU process should work around these crashes. We need to monitor this when bug 1653443 lands.
Reporter
Updated•4 years ago
Comment 3•3 years ago
Alexandre, this crash signature (now [@ libglib-2.0.so.0@0x5f0f8] instead of [@ XGetWindowAttributes]) spiked starting in Nightly 91. Could this recent spike be a regression from Linux semi-headless mode (bug 1635451)?
GtkCompositorWidget::GtkCompositorWidget is crashing deep inside a call to XGetWindowAttributes(DefaultXDisplay(), mXWindow, &windowAttrs) here:
Crash report: https://crash-stats.mozilla.org/report/index/1ee95652-a060-445c-bbff-89a950210713
Reason: SIGTRAP
Top 10 frames of crashing thread:
0 libglib-2.0.so.0 libglib-2.0.so.0@0x5f0f8
1 libglib-2.0.so.0 libglib-2.0.so.0@0x5adf4
2 libglib-2.0.so.0 libglib-2.0.so.0@0x5aff0
3 libgdk-3.so.0 libgdk-3.so.0@0x919d6
4 libX11.so.6 libX11.so.6@0x43a34
5 libX11.so.6 libX11.so.6@0x406a7
6 libX11.so.6 libX11.so.6@0x40744
7 libX11.so.6 libX11.so.6@0x417a4
8 libX11.so.6 libX11.so.6@0x277b9
9 libX11.so.6 libX11.so.6@0x2792a
Comment 4•3 years ago
(In reply to Chris Peterson [:cpeterson] from comment #3)
Alexandre, this crash signature (now [@ libglib-2.0.so.0@0x5f0f8] instead of [@ XGetWindowAttributes]) spiked starting in Nightly 91. Could this recent spike be a regression from Linux semi-headless mode (bug 1635451)?
GtkCompositorWidget::GtkCompositorWidget is crashing deep inside a call to XGetWindowAttributes(DefaultXDisplay(), mXWindow, &windowAttrs) here:
Crash report: https://crash-stats.mozilla.org/report/index/1ee95652-a060-445c-bbff-89a950210713
Reason: SIGTRAP
Top 10 frames of crashing thread:
0 libglib-2.0.so.0 libglib-2.0.so.0@0x5f0f8
1 libglib-2.0.so.0 libglib-2.0.so.0@0x5adf4
2 libglib-2.0.so.0 libglib-2.0.so.0@0x5aff0
3 libgdk-3.so.0 libgdk-3.so.0@0x919d6
4 libX11.so.6 libX11.so.6@0x43a34
5 libX11.so.6 libX11.so.6@0x406a7
6 libX11.so.6 libX11.so.6@0x40744
7 libX11.so.6 libX11.so.6@0x417a4
8 libX11.so.6 libX11.so.6@0x277b9
9 libX11.so.6 libX11.so.6@0x2792a
I'm not sure. Most of the crashes with this signature are reported with buildid 20210705095222, according to https://crash-stats.mozilla.org/signature/?product=Firefox&signature=libglib-2.0.so.0%400x5f0f8&date=%3E%3D2021-07-07T15%3A23%3A00.000Z&date=%3C2021-07-14T15%3A23%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_columns=startup_crash&_sort=-date&page=1#reports
However,
- https://hg.mozilla.org/mozilla-central/rev/c49de061c1fa
- https://hg.mozilla.org/mozilla-central/rev/c1bd0996764c
- https://hg.mozilla.org/mozilla-central/rev/2974b1ba2beb
all landed on 20210706214242, so those crashes come from a build made before the changes landed.
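The build ID comparison in this comment can be checked mechanically. The following is a minimal sketch, assuming build IDs are fixed-width YYYYMMDDhhmmss timestamps as they appear in this bug; `buildPredates` is a hypothetical helper written for illustration, not a Mozilla API.

```cpp
#include <string>

// Hypothetical helper: returns true if build ID `a` is strictly older than `b`.
// Assumes both IDs are 14-character YYYYMMDDhhmmss strings, so that
// lexicographic order matches chronological order.
bool buildPredates(const std::string& a, const std::string& b) {
  return a.size() == 14 && b.size() == 14 && a < b;
}
```

Called as `buildPredates("20210705095222", "20210706214242")`, it returns true, which matches the reasoning that the crashing build is older than the landing.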
Comment 5•3 years ago
This is the changelog for the buildid 20210705095222 and build before it:
@ Martin, might your DMABufSurface changes in bug 1712588 have caused an increase in these GtkCompositorWidget crashes calling XGetWindowAttributes?
Comment 6•3 years ago
(In reply to Chris Peterson [:cpeterson] from comment #5)
This is the changelog for the buildid 20210705095222 and build before it:
@ Martin, might your DMABufSurface changes in bug 1712588 have caused an increase in these GtkCompositorWidget crashes calling XGetWindowAttributes?
No, this is not related.
Comment 7•3 years ago
It's unfortunate we don't have call stacks from local libraries. Is there any way to reproduce it locally?
Comment 8•3 years ago
(In reply to Martin Stránský [:stransky] (ni? me) from comment #7)
It's unfortunate we don't have call stack from local libraries.
@ Gabriele, I thought I read that we now have debug symbols for popular Linux distros' libraries. If so, which distros are those? All of this bug's libglib-2.0.so.0 crashes are from Arch Linux and Manjaro Linux (a fork of Arch Linux).
Is there any way to reproduce it locally?
Unfortunately, I don't know of STR. None of the crash reports have user comments about what they were doing. The crash reports' URLs are pretty random, just the usual sites with no patterns.
I suspect this is a regression in Arch Linux's glib. All these crashes are on variants of Arch Linux. They all started on 2021-06-16 on all Firefox channels at the same time (Nightly 91, Beta 90, and Release 89), even though there were no crash reports from Nightly 91 or Beta 89.
Comment 9•3 years ago
(In reply to Chris Peterson [:cpeterson] from comment #8)
@ Gabriele, I thought I read that we now have debug symbols for popular Linux distros' libraries. If so, which distros are those? All of this bug's libglib-2.0.so.0 crashes are from Arch Linux and Manjaro Linux (a fork of Arch Linux).
Yes. Unfortunately, Arch does not provide debug packages; you have to build the package locally to get them, and I never found the time to do that. We used to scrape the public symbols from their libraries, but since we replaced dump_syms we haven't been able to because of a limitation in the new tool (not that they were particularly useful, but at least they caused crashes to clump together under a signature).
Unfortunately, I don't know of STR. None of the crash reports have user comments about what they were doing. The crash reports' URLs are pretty random, just the usual sites with no patterns.
I found an interesting comment in this crash:
playing a video, opened a new tab, attempted to detach the video tab and move it to other monitor
This led me to this signature on Debian's ESR build: @ handle_response | XShapeGetRectangles. The stack trace is similar (though without Arch symbols I can't be sure), and the comments point to a similar issue:
it is someething to do with dragging the tab out im on MATE desktop environment on debian with multiple monitors thanks for trying to fix it
i tried to drag a tab out of firefox
Crashes when draggin anything. Loaded defaul google page. Click-dragged google logo.
Anytime I drag/drop anything. A link, a tab, anything. The browser crashes.
And so on. It could be a different issue, but I'd start by trying to reproduce the first comment, about detaching a tab with a playing video.
Comment 10•3 years ago
There's another thing that all the Arch crashes seem to have in common: they're on X11; none of them are using Wayland.
Comment 11•3 years ago
Searched a bit for XGetWindowAttributes:
Latest GPU process XGetWindowAttributes crash seems to be bp-449eccc2-2b1d-48eb-92bc-539310210824 with version 82.0b9.
- Flash has been removed afterwards in 89 by bug 1682030.
- There was a Flash player problem with XGetWindowAttributes: bug 921848, https://bugs.webkit.org/show_bug.cgi?id=163159
- XGetWindowAttributes "deadlock", "known thread-safety issues": https://gitlab.freedesktop.org/xorg/lib/libx11/-/issues/26
(Chris Peterson [:cpeterson] from comment #3)
Crash report: https://crash-stats.mozilla.org/report/index/1ee95652-a060-445c-bbff-89a950210713
[@ libglib-2.0.so.0@0x5f0f8 ]
That one has:
GraphicsCriticalError |[G0][GFX1-]: Failed to create EGLSurface!: 0x3009 (t=1.05895) |[G1][GFX1-]: Failed to create EGLSurface (t=1.05898)
GLX is still the default.
If I (non-programmer) understand correctly, MOZ_X11_EGL=1 with the proprietary Nvidia driver requires a perfectly matching visual between X11 and EGL; otherwise it crashes. Mesa does not seem to be this strict.
Example of such a crash on X11 Mate desktop environment: bug 1677314
Nvidia driver 470 has many improvements for EGL/Dmabuf/Xwayland/Wayland.
EGL/X11 XFCE desktop/proprietary Nvidia seems(!) to work fine now: https://bug1729900.bmoattachments.org/attachment.cgi?id=9240298
This change was required for WebRender in the GPU process and OOP Webextensions: https://hg.mozilla.org/mozilla-central/rev/bb6817615317
It seems windowAttrs could be used with the struct's default values (=no visual?) if XGetWindowAttributes() fails.
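The concern about using windowAttrs after a failed call can be illustrated with a small model. This is only a sketch under stated assumptions: `FakeVisual`, `FakeWindowAttributes`, `fakeGetWindowAttributes`, `WidgetInit` and `initWidget` are hypothetical stand-ins, not Xlib or Firefox code. The stand-in query follows the Xlib convention that a zero status means failure, and it leaves the output struct untouched in that case, which is the situation described above.

```cpp
// Stand-in types for illustration only; the real ones are Xlib's
// Visual and XWindowAttributes.
struct FakeVisual { int id; };
struct FakeWindowAttributes {
  FakeVisual* visual = nullptr;  // default value: no visual
  int depth = 0;
};

// Hypothetical stand-in for the attribute query. Returns 0 on failure and
// does not write to `out`, so the caller is left with default values.
int fakeGetWindowAttributes(bool windowIsValid, FakeVisual* visual,
                            FakeWindowAttributes* out) {
  if (!windowIsValid) return 0;
  out->visual = visual;
  out->depth = 24;
  return 1;
}

// Result of a hypothetical widget setup: `ok` records whether the
// attributes were actually filled in.
struct WidgetInit {
  bool ok = false;
  FakeWindowAttributes attrs;
};

// Checks the status instead of assuming the struct was populated.
WidgetInit initWidget(bool windowIsValid, FakeVisual* visual) {
  WidgetInit result;
  result.ok =
      fakeGetWindowAttributes(windowIsValid, visual, &result.attrs) != 0;
  return result;
}
```

A caller would test `ok` before touching `attrs.visual`. In real Xlib code a bad window ID is also reported through the installed X error handler, so a status check alone may not be enough.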
Comment 12•3 years ago
(In reply to Darkspirit from comment #11)
Latest GPU process XGetWindowAttributes crash seems to be bp-449eccc2-2b1d-48eb-92bc-539310210824 with version 82.0b9.
Correction, there are also:
bp-bc4f3b47-71fa-4f73-93ce-6f8080210503 90.0a1 Ubuntu 20.10
Adapter Vendor ID Intel Corporation (0x8086)
App Notes WR! WR+ EGL? EGL- GL Context? GL Context+
KDE Connect
"screenWidth": 1600,
"screenHeight": 900
e.g. Visual problem on non-composited KDE or something like that? Or multitouch tablet with X11 thread safety problem?
(bp-1ee95652-a060-445c-bbff-89a950210713 [@ libglib-2.0.so.0@0x5f0f8 ] from comment 3 also has "GL Context+", but mentions "Failed to create EGLSurface", while this one does not.)
bp-5b9b65bc-e2c6-492b-bf6a-8979d0210429 90.0a1 Ubuntu 20.10
Adapter Vendor ID Intel Corporation (0x8086)
App Notes WR! WR+ EGL? EGL- GL Context? GL Context+
GraphicsCriticalError |[G0][GFX1-]: TOpAddBlobImage failed (t=313362) <--------------------------
KDE Connect
"screenWidth": 1600,
"screenHeight": 900
Comment 13•3 years ago
Please re-test with the latest Nightly; we now remove the outdated XWindow from GtkCompositorWidget.
Comment 14•3 years ago
Should be fixed now.