Closed Bug 1761870 Opened 2 years ago Closed 2 years ago

Dragging tab window hangs/crashes browser

Categories: Core :: Widget: Gtk, defect, P3
Version: Firefox 100
Status: RESOLVED FIXED
Target Milestone: 101 Branch
Tracking Status: firefox101 --- fixed
People: Reporter: gliu10000, Assigned: stransky
References: Blocks 2 open bugs
Attachments: 5 files

User Agent: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:98.0) Gecko/20100101 Firefox/98.0

Steps to reproduce:

  1. Open Firefox Nightly
  2. Open a ton of tabs
  3. Pin a couple of tabs (I'm not sure if this is necessary)
  4. Randomly drag tabs to the left or right until the browser hangs/crashes

Actual results:

Browser hangs/crashes.
Anecdotally, I'm pretty sure when I drag a tab, the mouse cursor is the "move" icon instead of the "drag" icon. I may be wrong about this so don't quote me on that.

Log of crash: https://paste.mozilla.org/v8M4U8ip

Expected results:

No crashes

See Also: → 1751887
See Also: → 1754789

The Bugbug bot thinks this bug should belong to the 'Core::DOM: Copy & Paste and Drag & Drop' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → DOM: Copy & Paste and Drag & Drop
Product: Firefox → Core

I should point out that you have to drag tabs left and right in quick succession. It could be some sort of race condition?

Component: DOM: Copy & Paste and Drag & Drop → Widget: Gtk
Attached file gdb-results
Here's the backtrace

I can reproduce this in stock Nightly with a fresh profile.

My STR:

  1. Start Firefox with fresh profile.
  2. Have 3 pinned tabs (and nothing else): one tab at https://www.mozilla.org/en-US/privacy/firefox/ and 2 at https://www.example.org/ (the exact sites and count probably don't matter)
  3. Rapidly click and drag the pinned tabs back and forth left to right

ACTUAL RESULTS:
Within a minute or so, the whole browser suddenly crashes (i.e. I get a crash reporter dialog).

EXPECTED RESULTS:
No such crash.

If I visit about:crashes, I can see my crash report IDs:
bp-8ea3d3fd-f042-4c96-a8ed-162ba0220329 [@ wl_log ]
bp-0c6d55b7-4f38-4967-9246-199260220329 [@ wl_proxy_marshal | mozilla::widget::WindowSurfaceWaylandMB::Commit ]

Crash Signature: [@ wl_log ] [@ wl_proxy_marshal | mozilla::widget::WindowSurfaceWaylandMB::Commit ]

When performing the STR in a debug build locally, I see assortments of these warnings (before the crash):

[Parent 592899, Main Thread] WARNING: Quit unfinished Wayland Drag and Drop operation. Buggy Wayland compositor?: file /scratch/work/builds/mozilla-central/mozilla/widget/gtk/nsWindow.cpp:7180

(firefox-default:592899): Gtk-WARNING **: 21:34:32.126: Attempting to add a widget with type GtkWindow to a container of type GtkWindow, but the widget is already inside a container of type GtkWindow, please remove the widget from its existing container first.

(firefox-default:592899): Gtk-WARNING **: 21:34:34.891: Attempting to add a widget with type GtkWindow to a container of type GtkWindow, but the widget is already inside a container of type GtkWindow, please remove the widget from its existing container first.

[Child 593196, Main Thread] WARNING: DispatchEvent called on non-current inner window, dropping. Please check the window in the caller instead.: file /scratch/work/builds/mozilla-central/mozilla/dom/base/nsGlobalWindowInner.cpp:4290

Also: I managed to catch this with rr, and I've uploaded it to pernosco; hopefully we can learn more from that once pernosco has processed the trace.

See Also: → 1622107

Is that KDE or something different?

(In reply to Daniel Holbert [:dholbert] from comment #7)

https://pernos.co/debug/VxeXsyYhGHP0XokMVyjfvA/index.html

I see 'unauthorized' error (I use stransky@anakreon.cz on github).
https://static.pernos.co/server/799dc0669e943cf0066633894aec84dd70af5dc8/unauthorized.html

Flags: needinfo?(gliu10000)
Flags: needinfo?(dholbert)

Also I guess this is not a recent regression, right?

Priority: -- → P3

(In reply to Martin Stránský [:stransky] (ni? me) from comment #8)

Is that KDE or something different?

It's GNOME 41.5 on Wayland.

Flags: needinfo?(gliu10000)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #8)

Is that KDE or something different?

I'm using Gnome 40.4.0 with Wayland, on Ubuntu 21.10.

I see 'unauthorized' error (I use stransky@anakreon.cz on github).
https://static.pernos.co/server/799dc0669e943cf0066633894aec84dd70af5dc8/unauthorized.html

Gotcha, it seems our Pernosco uploads are only viewable by moco pernosco accounts by default; I've requested that khuey open this one up so you can take a look if you've got cycles to do so, though (thanks!)

Flags: needinfo?(dholbert)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #9)

Also I guess this is not a recent regression, right?

Possibly not. I just reproduced in Nightly 2021-12-01 96.0a1, launched with MOZ_ENABLE_WAYLAND=1 (since it predated bug 1749174 which enabled wayland by default on Nightly; I'm assuming that wayland-enabled-Firefox is required to trigger this)

Crash Signature: [@ wl_log ] [@ wl_proxy_marshal | mozilla::widget::WindowSurfaceWaylandMB::Commit ]
Status: UNCONFIRMED → NEW
Ever confirmed: true

Ah, I just got a crash in that same Nightly (2021-12-01 96.0a1) using comment 5 STR, without needing to use MOZ_ENABLE_WAYLAND (so about:support reports Window Protocol xwayland). The exit code is slightly different, though: I think I got exit code -11 in current Nightly and with Wayland enabled, whereas in this case I got:

4:14.46 WARNING: Process exited with code -133
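
For anyone comparing the two exit codes, here is a rough sketch of one way to read them. It rests on two assumptions that are not confirmed anywhere in this bug: that the runner reports a signal death as a negated number, and that magnitudes above 128 follow the shell's 128 + N convention. The function name `signal_from_exit_code` is made up for illustration.

```c
#include <assert.h>

/* Hypothetical helper, not part of any Mozilla tooling.
 * Assumption 1: a code of -N means "killed by signal N".
 * Assumption 2: a magnitude above 128 encodes the signal as 128 + N.
 * Returns 0 when the code does not look like a signal death. */
static int signal_from_exit_code(int code) {
  int magnitude = code < 0 ? -code : code;
  if (magnitude > 128) {
    return magnitude - 128; /* shell-style 128 + N */
  }
  return code < 0 ? magnitude : 0; /* plain -N, or an ordinary exit status */
}
```

Under those assumptions, -11 maps to signal 11 (SIGSEGV on Linux) and -133 maps to signal 5 (SIGTRAP on Linux). The second reading in particular is a guess, so treat it as a hint about where to look rather than a diagnosis.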

(In reply to Daniel Holbert [:dholbert] from comment #11)

I've requested that khuey open this one up so you can take a look if you've got cycles to do so, though (thanks!)

Hopefully it should work now (at https://pernos.co/debug/VxeXsyYhGHP0XokMVyjfvA/index.html ); let me know if not.

Looking at the pernosco trace myself a little: it looks like we're crashing in this function, in wayland-client.c:


WL_EXPORT void
wl_proxy_marshal(struct wl_proxy *proxy, uint32_t opcode, ...)
{
	union wl_argument args[WL_CLOSURE_MAX_ARGS];
	va_list ap;

	va_start(ap, opcode);
	wl_argument_from_va_list(proxy->object.interface->methods[opcode].signature,
				 args, WL_CLOSURE_MAX_ARGS, ap);
	va_end(ap);

	wl_proxy_marshal_array_constructor(proxy, opcode, args, NULL);
}

...because proxy->object.interface is a null pointer.

This proxy object is passed down as wl_container->surface from Mozilla code, here:
https://searchfox.org/mozilla-central/source/widget/gtk/MozContainerWayland.cpp#118

// Route input to parent wl_surface owned by Gtk+ so we get input
// events from Gtk+.
static void moz_container_clear_input_region(MozContainer* container) {
[...]
  wl_surface_set_input_region(wl_container->surface, region);

Here, ((struct wl_proxy*)(wl_container->surface))->object.interface is null, and that's what we end up crashing on (two levels deeper, in wl_proxy_marshal in my first code snippet quoted above).
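
To make the failure mode concrete, here is a minimal sketch of the pointer chain that wl_proxy_marshal walks. The `mock_*` structs and `mock_signature_or_null` are hypothetical stand-ins written for this comment, not libwayland's real (private) definitions, and the null check is only there to show where the dereference goes wrong. It is not a proposed fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for libwayland's structs. */
struct mock_message { const char *signature; };
struct mock_interface { const struct mock_message *methods; };
struct mock_object { const struct mock_interface *interface; };
struct mock_proxy { struct mock_object object; };

/* Mirrors the expression
 *   proxy->object.interface->methods[opcode].signature
 * but returns NULL instead of dereferencing a null interface pointer. */
static const char *mock_signature_or_null(const struct mock_proxy *proxy,
                                          unsigned opcode) {
  if (proxy == NULL || proxy->object.interface == NULL) {
    return NULL; /* the state seen in the trace: interface is null */
  }
  return proxy->object.interface->methods[opcode].signature;
}
```

The real function has no such check, so a proxy whose interface pointer is null faults as soon as the signature is looked up. The more interesting question is why wl_container->surface is in that state when moz_container_clear_input_region runs.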

Flags: needinfo?(stransky)

Works now, Thanks!

moz_container_wayland_frame_callback_handler() may be called after the unmap event if we draw to the parent surface, so don't assert there.
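
As a rough illustration of that idea (not the landed patch), a callback that can legitimately arrive after unmap can treat the unmapped state as a normal case and return early instead of asserting. The `demo_container` struct and `demo_frame_callback` function below are hypothetical names invented for this sketch and do not reflect the real MozContainerWayland layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container state, invented for this sketch. */
struct demo_container {
  bool mapped;        /* false once the unmap event has been handled */
  int frames_handled; /* counts callbacks that did real work */
};

/* Hypothetical frame callback: a late callback after unmap is ignored
 * rather than asserted against. Returns true if the frame was handled. */
static bool demo_frame_callback(struct demo_container *container) {
  if (container == NULL || !container->mapped) {
    return false; /* late or stray callback: nothing to do */
  }
  container->frames_handled++;
  return true;
}
```

The design choice is simply that "callback after unmap" becomes an expected ordering rather than an invariant violation, so a debug build no longer aborts on it.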

Assignee: nobody → stransky
Status: NEW → ASSIGNED
Flags: needinfo?(stransky)
Pushed by stransky@redhat.com:
https://hg.mozilla.org/integration/autoland/rev/b25395f8ae6f
[Wayland] Don't assert if there's active draw callback after unmap event r=emilio
https://hg.mozilla.org/integration/autoland/rev/d2b258ffc8fb
[Wayland] Clear button press count on D&D workaround if D&D is finished r=emilio
https://hg.mozilla.org/integration/autoland/rev/860d0663ac1b
[Wayland] Mark MozContainer as mapped to make sure moz_container_wayland_unmap() is called on hide/withdraw r=emilio
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 101 Branch

Unfortunately the reporter and I can both still reproduce a crash (with my STR from comment 5, in my case), but let's keep this closed to avoid having too much going on here (since 3 patches have already landed).

[we're both testing with local mozilla-central builds more-recent than comment 20, BTW.]

I'll file a new bug with my new crash and I'll post a new pernosco trace there (I captured it).

Crash Signature: [@ libxul.so@0x3569103 | libxul.so@0x356913a | libxul.so@0xc642d5 | libxul.so@0xc4dfa7 | libxul.so@0xc4d318 | libxul.so@0xc4d50e | libxul.so@0xc65d51 | libxul.so@0xc5988c | libxul.so@0xc5ddd7 | libxul.so@0x12bd4d7 | libxul.so@0x1270f65 | libxul.so@0x34fd3…