Closed Bug 1716796 Opened 2 years ago Closed 2 years ago

Crash in [@ moz_container_wayland_surface_lock]

Categories

(Core :: Widget: Gtk, defect, P3)

Firefox 91
x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
91 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox89 --- unaffected
firefox90 --- unaffected
firefox91 --- fixed

People

(Reporter: matt.fagnani, Assigned: stransky)

References

(Blocks 2 open bugs, Regression)

Details

(Keywords: regression, Whiteboard: [not-a-fission-bug])

Crash Data

Attachments

(1 file)

I closed Firefox Nightly 91.0a1 (2021-06-16) on Wayland in Plasma 5.21.5 in Fedora 34, and the crash reporter appeared. A segmentation fault happened in moz_container_wayland_surface_lock at widget/gtk/MozContainerWayland.cpp:549. The same type of crash happened the next time I closed Nightly: https://crash-stats.mozilla.org/report/index/e1ca2349-28e8-4f21-a1d3-fc3c80210616
I don't remember seeing crashes with this trace before today.

Maybe Fission related. (DOMFissionEnabled=1)

Crash report: https://crash-stats.mozilla.org/report/index/a3e6ca8a-9b84-4bc2-ad50-093d60210616

Reason: SIGSEGV /0x00000080

Top 10 frames of crashing thread:

0 libxul.so moz_container_wayland_surface_lock widget/gtk/MozContainerWayland.cpp:549
1 libxul.so mozilla::widget::WindowSurfaceWayland::FlushPendingCommitsLocked widget/gtk/WindowSurfaceWayland.cpp:667
2 libxul.so mozilla::widget::WindowSurfaceWayland::FlushPendingCommits widget/gtk/WindowSurfaceWayland.cpp:621
3 libxul.so mozilla::widget::WaylandBufferFlushPendingCommits widget/gtk/WindowSurfaceWayland.cpp:632
4 libglib-2.0.so.0 g_timeout_dispatch /usr/src/debug/glib2-2.68.2-1.fc34.x86_64/glib/gmain.c:4889
5 libglib-2.0.so.0 g_main_context_dispatch /usr/src/debug/glib2-2.68.2-1.fc34.x86_64/glib/gmain.c:4055
6 libglib-2.0.so.0 g_main_context_iterate.constprop.0 /usr/src/debug/glib2-2.68.2-1.fc34.x86_64/glib/gmain.c:4131
7 libglib-2.0.so.0 g_main_context_iteration /usr/src/debug/glib2-2.68.2-1.fc34.x86_64/glib/gmain.c:4196
8 libxul.so {virtual override thunk} 
9 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1078

Looks like nsWindow is already destroyed.

Blocks: wayland
Priority: -- → P3

Similar issues are happening with turning on tests for WebRender + devtools + Wayland:

https://treeherder.mozilla.org/logviewer?job_id=342992786&repo=try&lineNumber=4706

On nsWindow::Destroy() we set mContainer to null and that causes this crash.

Is this a regression? The earliest build ID from a crash report is 20210616094608, suggesting maybe a regression landed in mozilla-central on June 15 or 16.

(Adding [not-a-fission-bug] whiteboard tag because this doesn't look like a Fission bug, even though many of the recent crash reports (including comment 0's) have "DOMFissionEnabled=1".)

Whiteboard: [not-a-fission-bug]

(In reply to Chris Peterson [:cpeterson] from comment #4)

> Is this a regression? The earliest build ID from a crash report is 20210616094608, suggesting maybe a regression landed in mozilla-central on June 15 or 16.
>
> (Adding [not-a-fission-bug] whiteboard tag because this doesn't look like a Fission bug, even though many of the recent crash reports (including comment 0's) have "DOMFissionEnabled=1".)

I think this problem is a regression: I've seen Nightly crash 10 times when closing with builds from 20210616094608 and later, but I hadn't seen crashes with this trace before that. There don't appear to be any other reports with the same signature before then. The crashes have happened about 10% of the times I've closed Nightly since 20210616094608, so they might be due to a race condition. Given what Martin wrote and the crash address being null, the segmentation fault might be due to a null pointer dereference where container was null at the crashing line, widget/gtk/MozContainerWayland.cpp:549:
    if (!container->wl_container.surface ||
        !container->wl_container.ready_to_draw) {
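To illustrate the suspected failure mode, here is a minimal sketch of a deferred flush that runs after the window has dropped its container, with a null check added before the dereference. All types and functions here (Surface, Container, Window, surface_lock) are hypothetical stand-ins, not the Firefox code, and the guard is only an illustration of the null dereference, not necessarily the fix that landed.

```cpp
// Hypothetical stand-ins for the real widget types; not Firefox code.
struct Surface {
  int id;
};

struct Container {
  Surface* surface = nullptr;
  bool ready_to_draw = false;
};

// Returns the surface when it is safe to draw, otherwise nullptr.
// The leading null check is the illustrative part: without it, a null
// `container` would be dereferenced by the second condition.
Surface* surface_lock(Container* container) {
  if (!container) {
    return nullptr;
  }
  if (!container->surface || !container->ready_to_draw) {
    return nullptr;
  }
  return container->surface;
}

struct Window {
  Container* container = nullptr;

  // Mirrors the observation in this bug that destroying the window
  // clears its container pointer.
  void Destroy() { container = nullptr; }

  // Stand-in for a deferred flush that may still fire after Destroy().
  // Returns false when there is nothing to commit.
  bool FlushPendingCommits() {
    Surface* s = surface_lock(container);
    return s != nullptr;
  }
};
```

With this shape, a flush that fires after Destroy() simply reports that there is nothing to commit instead of touching a null container.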

I saw Thunderbird Nightly 91.0a1 (2021-06-17) on Wayland crash once with the same trace when closing https://crash-stats.mozilla.org/report/index/b69bf195-1eee-4c87-8cfe-57dfd0210617

Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: regression

I can provoke this one in a fresh profile pretty quickly by rapidly clicking alternately on the Pocket button and the hamburger menu button.

Does setting widget.wayland.multi-buffer-software-backend.enabled in the latest Nightly work around the issue?

Yeah, it does.

It's because we hold an extra ref to WindowSurface, so WindowSurface is not released when WindowSurfaceProvider is deleted.
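The effect of an extra owning reference can be sketched with standard smart pointers. This is a simplified model under stated assumptions: the type names echo the ones in this bug, but the code uses std::shared_ptr and std::weak_ptr rather than Mozilla's refcounting, and ContainerStrong, ContainerWeak and the helper function are hypothetical. It shows only the extra reference keeping the surface alive, not the full cycle.

```cpp
#include <memory>

// Hypothetical simplified types; the names echo the bug but this is not
// the Firefox code.
struct WindowSurface {
  explicit WindowSurface(int& live) : live_count(live) { ++live_count; }
  ~WindowSurface() { --live_count; }
  int& live_count;  // number of surfaces currently alive
};

// Variant 1: the container keeps a strong (owning) reference.
struct ContainerStrong {
  std::shared_ptr<WindowSurface> surface;
};

// Variant 2: the container only observes the surface.
struct ContainerWeak {
  std::weak_ptr<WindowSurface> surface;
};

// The provider is meant to be the owner of the surface.
struct WindowSurfaceProvider {
  std::shared_ptr<WindowSurface> surface;
};

// Creates a provider, hands its surface to the container, destroys the
// provider, and returns how many surfaces are still alive afterwards.
// `live` must outlive `container`.
template <typename ContainerT>
int live_after_provider_deleted(ContainerT& container, int& live) {
  {
    WindowSurfaceProvider provider;
    provider.surface = std::make_shared<WindowSurface>(live);
    container.surface = provider.surface;  // extra ref, or weak observer
  }  // provider is destroyed here
  return live;
}
```

With ContainerStrong the surface is still alive after the provider is gone; with ContainerWeak it is released together with the provider.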

Assignee: nobody → stransky
Regressed by: 1716850
Has Regression Range: --- → yes
  • Poll the mozcontainer remap state via moz_container_wayland_remapped()
  • Remove WindowSurface::Reset()
  • Remove moz_container_wayland_set_window_surface(), as it adds an extra ref to WindowSurface and causes a cyclic dependency.
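The polling approach listed above can be sketched roughly as follows. This is only an illustration of the idea: the names (container_on_map, container_take_remapped, PrepareToDraw) and the test-and-clear behaviour are assumptions, and the real moz_container_wayland_remapped() may have a different signature and semantics.

```cpp
// Hypothetical sketch; names and signatures are illustrative only.
struct Container {
  bool remapped = false;  // set when the container has been mapped again
};

// Stand-in for the container's map handler.
void container_on_map(Container& c) { c.remapped = true; }

// Test-and-clear: reports whether a remap happened since the last poll.
bool container_take_remapped(Container& c) {
  bool was_remapped = c.remapped;
  c.remapped = false;
  return was_remapped;
}

struct Surface {
  int resets = 0;

  // Before drawing, the surface asks the container whether it was
  // remapped and resets its own state. The container holds no pointer
  // back to the surface, so it adds no reference to it.
  void PrepareToDraw(Container& c) {
    if (container_take_remapped(c)) {
      ++resets;  // stand-in for discarding cached buffers
    }
  }
};
```

The design point is the direction of the dependency: the surface pulls state from the container when it needs it, so the container never has to keep the surface alive in order to notify it.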
Pushed by stransky@redhat.com:
https://hg.mozilla.org/integration/autoland/rev/2b20b560fe61
[Wayland] Poll mozcontainer remap state, r=rmader
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 91 Branch
Crash Signature: [@ moz_container_wayland_surface_lock] → [@ moz_container_wayland_surface_lock] [@ GetMozContainer] [@ mozilla::widget::WindowSurfaceWayland::FlushPendingCommitsLocked]