Closed Bug 1655282 Opened 4 years ago Closed 3 years ago

[wayland] Crash in [@ wl_proxy_marshal_constructor | moz_container_wayland_request_parent_frame_callback]

Categories

(Core :: Widget: Gtk, defect, P1)

80 Branch
x86_64
Linux
defect

Tracking

()

RESOLVED DUPLICATE of bug 1648698
Tracking Status
firefox80 --- affected
firefox81 --- affected
firefox82 --- affected
firefox83 --- affected
firefox84 --- affected
firefox85 --- affected

People

(Reporter: matt.fagnani, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: crash, nightly-community)

Crash Data

This bug is for crash report bp-8553f2a5-313e-4023-b2fa-6d1c30200725.

Top 10 frames of crashing thread:

0 libwayland-client.so.0 <name omitted> src/wayland-client.c:830
1 libxul.so moz_container_wayland_request_parent_frame_callback widget/gtk/MozContainerWayland.cpp:219
2 libxul.so moz_container_wayland_get_surface_locked widget/gtk/MozContainerWayland.cpp:378
3 libxul.so moz_container_wayland_surface_lock widget/gtk/MozContainerWayland.cpp:441
4 libxul.so mozilla::widget::WindowSurfaceWayland::CommitWaylandBuffer widget/gtk/WindowSurfaceWayland.cpp:1054
5 libxul.so RunnableFunction<void  ipc/chromium/src/base/task.h:324
6 libxul.so {virtual override thunk} 
7 libxul.so nsTimerImpl::Fire xpcom/threads/nsTimerImpl.cpp:565
8 libxul.so nsTimerEvent::Run xpcom/threads/TimerThread.cpp:251
9 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1234

I was using Firefox Nightly 80.0a1 (2020-7-25) on Wayland in Plasma 5.19.3 in Fedora Rawhide. I opened a second tab, then closed it by pressing the x button. Firefox had a segmentation fault in wl_proxy_marshal_constructor(struct wl_proxy *proxy, uint32_t opcode, const struct wl_interface *interface, ...) at src/wayland-client.c:830 in libwayland-client-1.18.0-1.fc33.x86_64. The code at wayland-client.c:830 was:
wl_argument_from_va_list(proxy->object.interface->methods[opcode].signature, args, WL_CLOSURE_MAX_ARGS, ap);

The crash address was 0x0. proxy might've been null resulting in a null pointer dereference. I haven't seen this particular trace before, but similar crashes have happened without crash reports being saved.
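
To illustrate why a null proxy would fault on that expression, here is a minimal C++ sketch. The `Fake*` structs and `SignatureFor` are hypothetical stand-ins invented for this illustration, not libwayland's real definitions; only the shape of the pointer chain `proxy->object.interface->methods[opcode].signature` is taken from the line quoted above.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the structs behind the crashing expression.
// The real definitions live in libwayland and are not reproduced here.
struct FakeMessage { const char* signature; };
struct FakeInterface {
  const FakeMessage* methods;
  std::uint32_t method_count;
};
struct FakeObject { const FakeInterface* interface; };
struct FakeProxy { FakeObject object; };

// Walks the same chain as proxy->object.interface->methods[opcode].signature,
// but returns nullptr instead of dereferencing a null or incomplete proxy.
const char* SignatureFor(const FakeProxy* proxy, std::uint32_t opcode) {
  if (proxy == nullptr || proxy->object.interface == nullptr) {
    return nullptr;  // the unguarded expression would fault here
  }
  const FakeInterface* iface = proxy->object.interface;
  if (iface->methods == nullptr || opcode >= iface->method_count) {
    return nullptr;
  }
  return iface->methods[opcode].signature;
}
```

The unguarded form reads `proxy->object.interface` first, so a null `proxy` fails on the very first load in the chain.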

Blocks: wayland
Crash Signature: [@ <name omitted> | moz_container_wayland_request_parent_frame_callback] → [@ <name omitted> | moz_container_wayland_request_parent_frame_callback] [@ wl_proxy_marshal_constructor | moz_container_wayland_request_parent_frame_callback ]
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
See Also: → 1641778
Summary: Crash in [@ <name omitted> | moz_container_wayland_request_parent_frame_callback] → [wayland] Crash in [@ <name omitted> | moz_container_wayland_request_parent_frame_callback]

Hm, I don't quite understand how that is possible. It looks like gtk_container_surface is null, but we test for that:
https://hg.mozilla.org/mozilla-central/file/3ad2fc2915b1a66bb7180dee6796144121042dfe/widget/gtk/MozContainerWayland.cpp#l219

I wonder what else can be wrong there.

Priority: -- → P3

Jan, can you please try to reproduce it again with the WAYLAND_DEBUG=1 environment variable set and attach the log here when it crashes?
Thanks.

Flags: needinfo?(matthew.fagnani)

I tried to reproduce this crash many times, but I haven't seen one with this trace again. I've seen other crashes, including some where crash reports weren't saved. A race condition might be involved: the Wayland surface is usually used unchanged at wayland-client.c:830, but on rare occasions it might be freed or otherwise set to null before that point. Thanks.

Flags: needinfo?(matthew.fagnani)

I saw another segmentation fault involving a null pointer dereference in wl_proxy_marshal_constructor at wayland-client.c:830 in libwayland-client-1.18.0-2.fc33.x86_64 with a similar trace while using Firefox Nightly 82.0a1 (2020-8-29) on Wayland with WebRender enabled in Plasma 5.19.4 in Fedora 33. https://crash-stats.mozilla.org/report/index/8dde9039-b6f9-41ad-8a46-0b72e0200829
I clicked on the Bookmarks menu in the menu bar and moved the cursor down over the bookmarks folders when Firefox crashed. The contents of the bookmarks folders weren't shown before the crash. I've seen occasional crashes when doing this before in Nightly on Wayland without crash reports being saved.

I clicked on Bookmarks in the menu bar and moved the mouse cursor down over the bookmarks folders when Firefox Nightly 82.0a1 (2020-9-11) on Wayland with WebRender enabled crashed in Plasma 5.19.5 in Fedora 33. A segmentation fault involving a null pointer dereference happened in wl_proxy_marshal_constructor at wayland-client.c:830 in libwayland-client-1.18.0-2.fc33.x86_64: https://crash-stats.mozilla.org/report/index/cda7ab8c-f26b-43ed-afaf-50a8d0200911 I've seen at least 4 crashes when using the Bookmarks menu in the last few days, but this is the only one after which the crash reporter appeared. The crashes might've involved a race condition with a use-after-free error and null pointer dereference of the Wayland proxy of the bookmarks menu surface.

Summary: [wayland] Crash in [@ <name omitted> | moz_container_wayland_request_parent_frame_callback] → [wayland] Crash in [@ wl_proxy_marshal_constructor | moz_container_wayland_request_parent_frame_callback]

Change the status for beta to have the same as nightly and release.
For more information, please visit auto_nag documentation.

I clicked on Help in the menu bar when Firefox Nightly 83.0a1 (2020-9-25) on Wayland with WebRender enabled crashed in Plasma 5.19.5 in Fedora 33. A segmentation fault involving a null pointer dereference in wl_proxy_marshal_constructor at wayland-client.c:830 in libwayland-client-1.18.0-2.fc33.x86_64 happened as in the previous crashes I reported here. https://crash-stats.mozilla.org/report/index/b82a8b8b-7e58-4a46-a9e5-a94f70200926

I clicked on Bookmarks in the menu bar and moved the mouse cursor down over the bookmarks folders when Firefox Nightly 83.0a1 on Wayland with WebRender enabled crashed in Plasma 5.19.5 in Fedora 33. A segmentation fault happened involving a null pointer dereference in wl_proxy_marshal_constructor at wayland-client.c:830 in libwayland-client-1.18.0-2.fc33.x86_64. The Wayland proxy of the surfaces of the contents of the bookmark folders might occasionally have been freed before being used. I've seen a few crashes happen in such a way in recent days, but the following two were the only ones after which the crash reporter appeared.
https://crash-stats.mozilla.org/report/index/8f81075a-51af-4927-8c68-df9e70201017
https://crash-stats.mozilla.org/report/index/b614150b-c26b-4b40-bc70-33f920201014

140 reports from Fedora and 101 from other distributions with this signature have been submitted since the 78 branch.
https://crash-stats.mozilla.org/signature/?signature=%3Cname%20omitted%3E%20%7C%20moz_container_wayland_request_parent_frame_callback&date=%3E%3D2020-04-17T14%3A55%3A00.000Z&date=%3C2020-10-17T14%3A55%3A00.000Z
https://crash-stats.mozilla.org/signature/?signature=wl_proxy_marshal_constructor%20%7C%20moz_container_wayland_request_parent_frame_callback&date=%3E%3D2020-04-17T15%3A01%3A00.000Z&date=%3C2020-10-17T15%3A01%3A00.000Z#summary

Priority: P3 → P1

I clicked on Bookmarks in the menu bar and moved the mouse cursor down over the bookmarks folders when Firefox Nightly 85.0a1 on Wayland with WebRender enabled crashed in Plasma 5.20.3 in Fedora 33. A segmentation fault happened involving a null pointer dereference in wl_proxy_marshal_constructor at wayland-client.c:830 in libwayland-client-1.18.0-2.fc33.x86_64. I've seen a few crashes happen in such a way recently, but the following was the only one after which the crash reporter appeared. https://crash-stats.mozilla.org/report/index/71620246-b1fc-4a91-8e59-fbffe0201128

kwin_wayland 5.20.3 has crashed seven times while moving the cursor over the bookmarks folders in Firefox Nightly 84.0a1-85.0a1 on Wayland in the last two weeks or so. kwin_wayland segmentation faulted in QScopedPointer<KWaylandServer::SurfaceInterfacePrivate, QScopedPointerDeleter<KWaylandServer::SurfaceInterfacePrivate> >::operator->() at /usr/include/qt5/QtCore/qscopedpointer.h:116 in qt5-qtbase-devel-0:5.15.1-7.fc33.x86_64. The pointer this=0x10 in frame 0 was likely invalid, which might be due to this=0x0 in KWaylandServer::SurfaceInterface::subSurface in frame 1. If the Wayland subsurface of the bookmark folders were sometimes freed before being used, this=0x0 in KWaylandServer::SurfaceInterface::subSurface in frame 1 might be how that would show up. I reported those crashes at https://bugs.kde.org/show_bug.cgi?id=429086 and https://bugzilla.redhat.com/show_bug.cgi?id=1897969
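
The step from this=0x0 in frame 1 to this=0x10 in frame 0 matches the usual pattern of a member being reached through a null object pointer: the member's address is the null base plus the member's offset. A rough C++ sketch follows. `FakeSurfaceInterface` and `MemberAddress` are hypothetical, and the assumed layout (two pointer-sized fields ahead of the private-data pointer) is a guess chosen to match the 0x10 in the trace, not the verified KWaylandServer layout.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout: two pointer-sized fields, then the private-data
// pointer that would play the role of the QScopedPointer member.
struct FakeSurfaceInterface {
  void* vtable_slot;  // assumed: offset 0
  void* base_data;    // assumed: offset sizeof(void*)
  void* d;            // assumed: offset 2 * sizeof(void*), 0x10 on 64-bit
};

// Address that a member access computes for an object located at
// `object_address`: simply base plus offset, with no validity check.
constexpr std::uintptr_t MemberAddress(std::uintptr_t object_address,
                                       std::size_t member_offset) {
  return object_address + member_offset;
}
```

Under these assumptions, `MemberAddress(0, offsetof(FakeSurfaceInterface, d))` gives twice the pointer size, so calling a method on the `d` member of a null object would pass a small non-null `this` like the one seen in frame 0.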

It just occurred to me that with WebRender enabled, Mesa will commit our surface in dri2_wl_swap_buffers_with_damage, which we call in GLContextEGL::SwapBuffers(). So even when using moz_container_wayland_surface_lock there's still a chance that the surface will get committed behind our back, potentially freeing it IIUC.

So maybe we have to extend our surface locking to GLContextEGL::SwapBuffers() as well somehow.
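
A minimal sketch of that idea, assuming a single mutex guards the surface pointer and that every path touching the surface takes it. All names here (`FakeContainer`, `WithLockedSurface`, `CommitWaylandBufferSketch`, `SwapBuffersSketch`, `SetSurface`) are hypothetical and use `std::mutex` for illustration; this is not the Firefox or Mesa API, and the `int*` merely stands in for a `wl_surface*`.

```cpp
#include <mutex>

// Hypothetical container owning a surface; `int*` stands in for wl_surface*.
struct FakeContainer {
  std::mutex surface_lock;  // guards `surface`
  int* surface = nullptr;
  int commits = 0;          // counts commits that actually ran
};

// Runs fn(surface) with the lock held, and only if the surface still exists.
template <typename Fn>
bool WithLockedSurface(FakeContainer& container, Fn&& fn) {
  std::lock_guard<std::mutex> guard(container.surface_lock);
  if (container.surface == nullptr) {
    return false;  // surface already gone: skip instead of crashing
  }
  fn(container.surface);
  return true;
}

// Basic compositor path: commits while holding the surface lock.
bool CommitWaylandBufferSketch(FakeContainer& container) {
  return WithLockedSurface(container,
                           [&container](int*) { ++container.commits; });
}

// GL path: the swap (which commits inside the driver) takes the same lock.
bool SwapBuffersSketch(FakeContainer& container) {
  return WithLockedSurface(container,
                           [&container](int*) { ++container.commits; });
}

// Creating or destroying the surface goes through the same lock.
void SetSurface(FakeContainer& container, int* surface) {
  std::lock_guard<std::mutex> guard(container.surface_lock);
  container.surface = surface;
}
```

The point of the sketch is only that the null check and the use happen under one lock on both paths; whether the real swap call can be wrapped this way is an open question.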

Martin, do you think the above idea makes sense?

Flags: needinfo?(stransky)
See Also: → 1629140
See Also: → 1680505
See Also: → 1680961

As for the crashes in WindowSurfaceWayland::CommitWaylandBuffer (implying basic compositor), they will likely get fixed in bug 1648698. As for the WebRender/popup case, it might be fixed in bug 1681107.

See Also: → 1648698, 1681107

Let's close it as a dupe of Bug 1648698 which should fix this one. Please reopen if there's any new report in FF86.

Status: UNCONFIRMED → RESOLVED
Closed: 3 years ago
Flags: needinfo?(stransky)
Resolution: --- → DUPLICATE