Closed Bug 1754789 Opened 2 years ago Closed 2 years ago

[wayland] Deadlock when dragging a pinned tab on Plasma

Categories

(Core :: Widget: Gtk, defect, P3)

Tracking

()

RESOLVED FIXED
99 Branch
Tracking Status
firefox99 --- fixed

People

(Reporter: emilio, Assigned: emilio)

References

(Blocks 2 open bugs)

Details

Attachments

(3 files)

Attached file gdb.txt

Not reliably reproducible, but I managed to catch it in gdb; see the attachment.

It seems the main thread is waiting for the compositor, which is waiting for the renderer, which grabs a lock that the main thread is holding.

Relevant threads:

Renderer, stuck in WindowSurfaceWaylandMB::Commit

Thread 58 (Thread 0x7f25ee0ca640 (LWP 5538) "Renderer"):
#0  futex_wait (private=0, expected=2, futex_word=0x7f240b2bc950) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=futex@entry=0x7f240b2bc950, private=0) at lowlevellock.c:49
#2  0x00007f262d8a2607 in ___pthread_mutex_lock (mutex=0x7f240b2bc950) at pthread_mutex_lock.c:145
#3  0x00007f262de5d43e in mozilla::detail::MutexImpl::lock() ()
#4  0x00007f2621a3a7c9 in std::_Function_handler<void (), mozilla::widget::WindowSurfaceWaylandMB::Commit(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&, mozilla::gfx::IntRegionTyped<mozilla::LayoutDevicePixel> const&)::$_4>::_M_invoke(std::_Any_data const&) () from /home/emilio/firefox/libxul.so
#5  0x00007f2621a30280 in mozilla::widget::WindowSurfaceWaylandMB::Commit(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&, mozilla::gfx::IntRegionTyped<mozilla::LayoutDevicePixel> const&) () from /home/emilio/firefox/libxul.so
#6  0x00007f2621a300da in mozilla::widget::WindowSurfaceWaylandMB::Commit(mozilla::gfx::IntRegionTyped<mozilla::LayoutDevicePixel> const&) () from /home/emilio/firefox/libxul.so
#7  0x00007f262064a1ef in mozilla::wr::RenderCompositorSWGL::CommitMappedBuffer(bool) () from /home/emilio/firefox/libxul.so
#8  0x00007f262064a26e in mozilla::wr::RenderCompositorSWGL::EndFrame(nsTArray<mozilla::wr::Box2D<int, mozilla::wr::DevicePixel> > const&) () from /home/emilio/firefox/libxul.so
#9  0x00007f26249431d5 in mozilla::wr::RenderThread::UpdateAndRender(mozilla::wr::WrWindowId, mozilla::layers::BaseTransactionId<mozilla::VsyncIdType> const&, mozilla::TimeStamp const&, bool, mozilla::Maybe<mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> > const&, mozilla::Maybe<mozilla::wr::ImageFormat> const&, mozilla::Maybe<mozilla::Range<unsigned char> > const&, bool*) () from /home/emilio/firefox/libxul.so
#10 0x00007f2624942af7 in mozilla::wr::RenderThread::HandleFrameOneDoc(mozilla::wr::WrWindowId, bool) () from /home/emilio/firefox/libxul.so
#11 0x00007f2624945db7 in mozilla::detail::RunnableMethodImpl<mozilla::wr::RenderThread*, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, bool), true, (mozilla::RunnableKind)0, mozilla::wr::WrWindowId, bool>::Run() () from /home/emilio/firefox/libxul.so
#12 0x00007f2623a87247 in nsThread::ProcessNextEvent(bool, bool*) () from /home/emilio/firefox/libxul.so
#13 0x00007f26246c61af in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) () from /home/emilio/firefox/libxul.so
#14 0x00007f262469c7af in MessageLoop::Run() () from /home/emilio/firefox/libxul.so
#15 0x00007f26244e5e61 in nsThread::ThreadFunc(void*) () from /home/emilio/firefox/libxul.so
#16 0x00007f262cd62113 in _pt_root () from /home/emilio/firefox/libnspr4.so
#17 0x00007f262de8b4ff in set_alt_signal_stack_and_start(PthreadCreateParams*) ()
#18 0x00007f262d89f0e7 in start_thread (arg=<optimized out>) at pthread_create.c:442
#19 0x00007f262d924780 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Compositor, waiting on Renderer

Thread 75 (Thread 0x7f25ed4fd640 (LWP 5555) "Compositor"):
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7f2498c1b380) at futex-internal.c:57
#1  __futex_abstimed_wait_common (futex_word=futex_word@entry=0x7f2498c1b380, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0, cancel=cancel@entry=true) at futex-internal.c:87
#2  0x00007f262d89bd0f in __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x7f2498c1b380, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0) at futex-internal.c:139
#3  0x00007f262d89e431 in __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7f2498c1b2f8, cond=0x7f2498c1b358) at pthread_cond_wait.c:503
#4  ___pthread_cond_wait (cond=0x7f2498c1b358, mutex=0x7f2498c1b2f8) at pthread_cond_wait.c:618
#5  0x00007f262cd6b1fc in PR_Wait () from /home/emilio/firefox/libnspr4.so
#6  0x00007f26248ebef4 in mozilla::layers::SynchronousTask::Wait() () from /home/emilio/firefox/libxul.so
#7  0x00007f2620651dcb in mozilla::wr::WebRenderAPI::Pause() () from /home/emilio/firefox/libxul.so
#8  0x00007f2620595c69 in mozilla::layers::CompositorBridgeParent::PauseComposition() () from /home/emilio/firefox/libxul.so
#9  0x00007f2620595bf9 in mozilla::layers::CompositorBridgeParent::RecvPause() () from /home/emilio/firefox/libxul.so
#10 0x00007f26246fc34b in mozilla::layers::PCompositorBridgeParent::OnMessageReceived(IPC::Message const&, IPC::Message*&) () from /home/emilio/firefox/libxul.so
#11 0x00007f26246fed9c in mozilla::layers::PCompositorManagerParent::OnMessageReceived(IPC::Message const&, IPC::Message*&) () from /home/emilio/firefox/libxul.so
#12 0x00007f26246c4814 in mozilla::ipc::MessageChannel::DispatchSyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&, IPC::Message*&) () from /home/emilio/firefox/libxul.so
#13 0x00007f2623ace41d in mozilla::ipc::MessageChannel::MessageTask::Run() () from /home/emilio/firefox/libxul.so
#14 0x00007f2623a87247 in nsThread::ProcessNextEvent(bool, bool*) () from /home/emilio/firefox/libxul.so
#15 0x00007f26246c61af in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) () from /home/emilio/firefox/libxul.so
#16 0x00007f262469c7af in MessageLoop::Run() () from /home/emilio/firefox/libxul.so
#17 0x00007f26244e5e61 in nsThread::ThreadFunc(void*) () from /home/emilio/firefox/libxul.so
#18 0x00007f262cd62113 in _pt_root () from /home/emilio/firefox/libnspr4.so
#19 0x00007f262de8b4ff in set_alt_signal_stack_and_start(PthreadCreateParams*) ()
#20 0x00007f262d89f0e7 in start_thread (arg=<optimized out>) at pthread_create.c:442
#21 0x00007f262d924780 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Main, waiting on Compositor

Thread 1 (Thread 0x7f262d809780 (LWP 5435) "firefox-bin"):
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7f25d563725c) at futex-internal.c:57
#1  __futex_abstimed_wait_common (futex_word=futex_word@entry=0x7f25d563725c, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0, cancel=cancel@entry=true) at futex-internal.c:87
#2  0x00007f262d89bd0f in __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x7f25d563725c, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0) at futex-internal.c:139
#3  0x00007f262d89e431 in __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7f25d5637200, cond=0x7f25d5637230) at pthread_cond_wait.c:503
#4  ___pthread_cond_wait (cond=0x7f25d5637230, mutex=0x7f25d5637200) at pthread_cond_wait.c:618
#5  0x00007f262de5d1d6 in mozilla::detail::ConditionVariableImpl::wait_for(mozilla::detail::MutexImpl&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&) ()
#6  0x00007f26246c2699 in mozilla::ipc::MessageChannel::Send(mozilla::UniquePtr<IPC::Message, mozilla::DefaultDelete<IPC::Message> >, IPC::Message*) () from /home/emilio/firefox/libxul.so
#7  0x00007f26246cc365 in mozilla::ipc::IProtocol::ChannelSend(IPC::Message*, IPC::Message*) () from /home/emilio/firefox/libxul.so
#8  0x00007f262020e5e6 in mozilla::layers::PCompositorBridgeChild::SendPause() () from /home/emilio/firefox/libxul.so
#9  0x00007f2624dbfc39 in nsWindow::PauseCompositorHiddenWindow() () from /home/emilio/firefox/libxul.so
#10 0x00007f2624dbdf1a in nsWindow::DisableRenderingToWindow() () from /home/emilio/firefox/libxul.so
#11 0x00007f2624dc6486 in widget_unrealize_cb(_GtkWidget*) () from /home/emilio/firefox/libxul.so
#12 0x00007f262bfc2d30 in g_closure_invoke (closure=0x7f25c6e26710, return_value=0x0, n_param_values=1, param_values=0x7ffe05407aa0, invocation_hint=0x7ffe05407a20) at ../gobject/gclosure.c:830
#13 0x00007f262bfece36 in signal_emit_unlocked_R.isra.0 (node=node@entry=0x7f262d6c0400, detail=detail@entry=0, instance=instance@entry=0x7f2553c68660, emission_return=emission_return@entry=0x0, instance_and_params=instance_and_params@entry=0x7ffe05407aa0) at ../gobject/gsignal.c:3744
#14 0x00007f262bfe009e in g_signal_emit_valist (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=var_args@entry=0x7ffe05407c50) at ../gobject/gsignal.c:3497
#15 0x00007f262bfe0323 in g_signal_emit (instance=instance@entry=0x7f2553c68660, signal_id=<optimized out>, detail=detail@entry=0) at ../gobject/gsignal.c:3554
#16 0x00007f262c5c5b9a in gtk_widget_unrealize (widget=0x7f2553c68660) at ../gtk/gtkwidget.c:5582
#17 0x00007f262c5c5da4 in gtk_widget_unparent (widget=0x7f2553c68660) at ../gtk/gtkwidget.c:4674
#18 0x00007f262c31e0fa in gtk_bin_remove (container=0x7f240b35da60, child=0x7f2553c68660) at ../gtk/gtkbin.c:151
#19 0x00007f262bfc17c5 in g_cclosure_marshal_VOID__OBJECTv (closure=0x7f2611347ba0, return_value=<optimized out>, instance=0x7f240b35da60, args=<optimized out>, marshal_data=<optimized out>, n_params=<optimized out>, param_types=0x7f261135a1e8) at ../gobject/gmarshal.c:1910
#20 0x00007f262bfe01e9 in _g_closure_invoke_va (param_types=<optimized out>, n_params=<optimized out>, args=0x7ffe05407fb0, instance=<optimized out>, return_value=<optimized out>, closure=0x7f2611347ba0) at ../gobject/gclosure.c:893
#21 g_signal_emit_valist (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=var_args@entry=0x7ffe05407fb0) at ../gobject/gsignal.c:3407
#22 0x00007f262bfe0323 in g_signal_emit (instance=instance@entry=0x7f240b35da60, signal_id=<optimized out>, detail=detail@entry=0) at ../gobject/gsignal.c:3554
#23 0x00007f262c37ba65 in gtk_container_remove (container=0x7f240b35da60, widget=0x7f2553c68660) at ../gtk/gtkcontainer.c:1907
#24 0x00007f262c60887d in gtk_drag_remove_icon (info=0x7f243ba2fc00) at ../gtk/gtkdnd.c:2752
#25 0x00007f262c60a52d in gtk_drag_source_info_free (info=0x7f243ba2fc00) at ../gtk/gtkdnd.c:2761
#26 0x00007f262c041d07 in g_datalist_clear (datalist=<optimized out>) at ../glib/gdataset.c:273
#27 0x00007f262c1c697a in gdk_wayland_drag_context_finalize (object=0x7f24033fa500) at ../gdk/wayland/gdkdnd-wayland.c:96
#28 0x00007f262bfcfd52 in g_object_unref (_object=<optimized out>) at ../gobject/gobject.c:3678
#29 g_object_unref (_object=0x7f24033fa500) at ../gobject/gobject.c:3553
#30 0x00007f262bfdf8a9 in g_signal_emit_valist (instance=instance@entry=0x7f24033fa500, signal_id=signal_id@entry=49, detail=<optimized out>, var_args=var_args@entry=0x7ffe05408330) at ../gobject/gsignal.c:3457
#31 0x00007f262bfe0528 in g_signal_emit_by_name (instance=instance@entry=0x7f24033fa500, detailed_signal=detailed_signal@entry=0x7f262c2001c7 "dnd-finished") at ../gobject/gsignal.c:3596
#32 0x00007f262c1c6f84 in data_source_dnd_finished (data=<optimized out>, source=0x7f24187fec90) at ../gdk/wayland/gdkselection-wayland.c:1145
#33 0x00007f262b59f746 in ffi_call_unix64 () at ../src/x86/unix64.S:105
#34 0x00007f262b59c4d2 in ffi_call_int (cif=<optimized out>, fn=<optimized out>, rvalue=<optimized out>, avalue=<optimized out>, closure=<optimized out>) at ../src/x86/ffi64.c:672
#35 0x00007f262b6c4e03 in wl_closure_invoke (closure=closure@entry=0x7f242ba61660, target=<optimized out>, target@entry=0x7f24187fec90, opcode=opcode@entry=4, data=<optimized out>, flags=1) at ../src/connection.c:1025
#36 0x00007f262b6c5573 in dispatch_event (display=0x7f262d6adac0, queue=<optimized out>) at ../src/wayland-client.c:1583
#37 0x00007f262b6c573c in dispatch_queue (queue=0x7f262d6adb90, display=0x7f262d6adac0) at ../src/wayland-client.c:1729
#38 wl_display_dispatch_queue_pending (display=0x7f262d6adac0, queue=0x7f262d6adb90) at ../src/wayland-client.c:1971
#39 0x00007f262b6c5790 in wl_display_dispatch_pending (display=<optimized out>) at ../src/wayland-client.c:2034
#40 0x00007f262c1bf14b in _gdk_wayland_display_queue_events (display=<optimized out>) at ../gdk/wayland/gdkeventsource.c:201
#41 0x00007f262c1853bb in gdk_display_get_event (display=0x7f262d6a3400) at ../gdk/gdkdisplay.c:442
#42 0x00007f262c1c2da6 in gdk_event_source_dispatch (base=<optimized out>, callback=<optimized out>, data=<optimized out>) at ../gdk/wayland/gdkeventsource.c:120
#43 0x00007f262c0601bf in g_main_dispatch (context=0x7f2611310660) at ../glib/gmain.c:3413
#44 g_main_context_dispatch (context=0x7f2611310660) at ../glib/gmain.c:4131
#45 0x00007f262c0b52d8 in g_main_context_iterate.constprop.0 (context=context@entry=0x7f2611310660, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4207
#46 0x00007f262c05db40 in g_main_context_iteration (context=0x7f2611310660, may_block=1) at ../glib/gmain.c:4272
#47 0x00007f2623a879d8 in nsThread::ProcessNextEvent(bool, bool*) () from /home/emilio/firefox/libxul.so
#48 0x00007f2623ad09d1 in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) () from /home/emilio/firefox/libxul.so
#49 0x00007f262469c7af in MessageLoop::Run() () from /home/emilio/firefox/libxul.so
#50 0x00007f2624db6849 in nsBaseAppShell::Run() () from /home/emilio/firefox/libxul.so
#51 0x00007f262278d715 in nsAppStartup::Run() () from /home/emilio/firefox/libxul.so
#52 0x00007f2622804ce1 in XREMain::XRE_mainRun() () from /home/emilio/firefox/libxul.so
#53 0x00007f262280574e in XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) () from /home/emilio/firefox/libxul.so
#54 0x00007f2622805aa6 in XRE_main(int, char**, mozilla::BootstrapConfig const&) () from /home/emilio/firefox/libxul.so
#55 0x00007f262de69029 in main ()
Assignee: emilio → nobody
Blocks: wayland

Ah, so I think the issue is just in WindowSurfaceWaylandMB::Commit. We have mSurfaceLock held there. If the surface is null, we call moz_container_wayland_add_initial_draw_cb, but if the container->ready_to_draw is true, we invoke that synchronously, trying to grab the lock again, dead-locking.

So I think this check should be:

  if (wl_container->ready_to_draw && wl_container->surface) {

Or so, wdyt? Maybe instead we need some more subtle logic elsewhere?
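
To make the re-entrancy concrete, here is a minimal sketch of the pattern described above. It is not the Gecko code: FakeContainer, FakeSurface, add_initial_draw_cb and flush_initial_draw_cbs are hypothetical stand-ins, and the guard is simply the check proposed in this comment. With ready_to_draw set but no surface, the callback is queued instead of being run while the lock is held, so the non-recursive mutex is never taken twice on the same thread.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for the container state discussed in this bug.
struct FakeContainer {
  bool ready_to_draw = false;
  void* surface = nullptr;
  std::vector<std::function<void()>> initial_draw_cbs;
};

// Runs the callback synchronously only when the container is ready AND has a
// surface (the proposed check); otherwise queues it for later.
// Returns true if the callback ran synchronously.
bool add_initial_draw_cb(FakeContainer& c, std::function<void()> cb) {
  if (c.ready_to_draw && c.surface) {
    cb();
    return true;
  }
  c.initial_draw_cbs.push_back(std::move(cb));
  return false;
}

// Hypothetical stand-in for the window surface; `lock` plays the role of
// mSurfaceLock and is deliberately non-recursive.
struct FakeSurface {
  std::mutex lock;
  int commits = 0;

  // Holds the lock for the whole call. With no surface, it registers a
  // callback that takes the same lock again; that callback would
  // self-deadlock if it were invoked before Commit returns.
  bool Commit(FakeContainer& c) {
    std::lock_guard<std::mutex> guard(lock);
    if (!c.surface) {
      add_initial_draw_cb(c, [this] {
        std::lock_guard<std::mutex> inner(lock);
        ++commits;
      });
      return false;
    }
    ++commits;
    return true;
  }
};

// Runs the queued callbacks; must be called without the surface lock held.
void flush_initial_draw_cbs(FakeContainer& c) {
  std::vector<std::function<void()>> cbs;
  cbs.swap(c.initial_draw_cbs);
  for (auto& cb : cbs) {
    cb();
  }
}
```

If the guard in add_initial_draw_cb only tested ready_to_draw, a Commit call with ready_to_draw set and a null surface would run the lambda under the held lock and hang, which matches the Renderer backtrace above.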

Flags: needinfo?(robert.mader)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #2)

> Ah, so I think the issue is just in WindowSurfaceWaylandMB::Commit. We have mSurfaceLock held there. If the surface is null, we call moz_container_wayland_add_initial_draw_cb, but if the container->ready_to_draw is true, we invoke that synchronously, trying to grab the lock again, dead-locking.
>
> So I think this check should be:
>
>   if (wl_container->ready_to_draw && wl_container->surface) {
>
> Or so, wdyt? Maybe instead we need some more subtle logic elsewhere?

Thanks for digging into this! AFAICS wl_container->ready_to_draw should never be true if there's no surface, so that might be the actual bug.
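
The invariant stated here can be written down as a small predicate. This is only a sketch: ContainerState and invariant_holds are hypothetical names, not part of MozContainer, and the two fields just mirror the ones mentioned in this thread.

```cpp
#include <cassert>

// Hypothetical snapshot of the two fields discussed in this thread.
struct ContainerState {
  bool ready_to_draw;
  const void* surface;
};

// ready_to_draw implies a non-null surface; every other combination is fine.
inline bool invariant_holds(const ContainerState& s) {
  return !s.ready_to_draw || s.surface != nullptr;
}
```

The only state it rejects is ready_to_draw set with a null surface, which is the state that leads to the deadlock in this bug.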

Flags: needinfo?(robert.mader)
Blocks: 1756349

This papers over it by dealing with surface==null but
ready_to_draw==true, like a bunch of other code does.

See bug 1756349 for an example where this state can be reached right
now.

Depends on D139238

Assignee: nobody → emilio
Status: NEW → ASSIGNED

We should do more locking here; I'll create a new bug for it.

Pushed by stransky@redhat.com:
https://hg.mozilla.org/integration/autoland/rev/517d4c0bec83
Paper over a deadlock in Wayland code. r=stransky
https://hg.mozilla.org/integration/autoland/rev/ad61e85bc9a8
More strongly assert MozContainer invariants. r=stransky
Priority: -- → P3
Pushed by emilio@crisal.io:
https://hg.mozilla.org/integration/autoland/rev/347e0ef280bb
Partially back out ad61e85bc9a8 because the assertions are known not to quite hold.
See Also: → 1754555
See Also: → 1751887
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 99 Branch
Duplicate of this bug: 1691738