Closed Bug 1166741 Opened 5 years ago Closed 5 years ago

Crash on GTK3 when closing a native file picker's parent

Categories

(Core :: Widget: Gtk, defect)

Unspecified
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: acomminos, Assigned: acomminos, Mentored)

References

Details

Attachments

(2 files, 4 obsolete files)

The test 'widget/tests/test_picker_no_crash.html' crashes the browser on GTK3 builds of Firefox due to the file picker being destroyed by its parent before setup commands are processed on the glib event loop. When the setup commands (e.g. setting the current folder) are processed, Firefox segfaults.

GDB backtrace:

#0  update_current_folder_get_info_cb (cancellable=0x7fffdbf1f250, info=0x0, 
    error=0x7fffdd6e08c0, user_data=0x7fffc0c604c0)
    at /tmp/buildd/gtk+3.0-3.14.5/./gtk/gtkfilechooserwidget.c:4268
#1  0x00007ffff521552a in query_info_callback (source_object=<optimized out>, 
    result=<optimized out>, user_data=0x7fffc0fb0a30)
    at /tmp/buildd/gtk+3.0-3.14.5/./gtk/gtkfilesystem.c:419
#2  0x00007ffff267c4bb in g_task_return_now (task=0x7fffc7bb1c20)
    at /tmp/buildd/glib2.0-2.42.1/./gio/gtask.c:1077
#3  0x00007ffff267c4d9 in complete_in_idle_cb (task=0x7fffc7bb1c20)
    at /tmp/buildd/glib2.0-2.42.1/./gio/gtask.c:1086
#4  0x00007ffff20e2b6d in g_main_dispatch (context=0x7ffff6b52140)
    at /tmp/buildd/glib2.0-2.42.1/./glib/gmain.c:3111
#5  g_main_context_dispatch (context=context@entry=0x7ffff6b52140)
    at /tmp/buildd/glib2.0-2.42.1/./glib/gmain.c:3710
#6  0x00007ffff20e2f48 in g_main_context_iterate (
    context=context@entry=0x7ffff6b52140, block=block@entry=0, 
    dispatch=dispatch@entry=1, self=<optimized out>)
    at /tmp/buildd/glib2.0-2.42.1/./glib/gmain.c:3781
#7  0x00007ffff20e2ffc in g_main_context_iteration (context=0x7ffff6b52140, 
    may_block=0) at /tmp/buildd/glib2.0-2.42.1/./glib/gmain.c:3842
#8  0x00007fffe984e36d in nsAppShell::ProcessNextNativeEvent (
    this=0x7fffdd618b70, mayWait=false)
    at /home/andrew/src/mozilla-central/widget/gtk/nsAppShell.cpp:158
Attached patch Workaround for GTK bug 725164 (obsolete) — Splinter Review
Here's a workaround to properly increment the refcount of the file chooser delegate (i.e. the GtkFileChooserWidget backing the GtkFileChooserDialog) to fix affected versions of GTK. Unfortunately, the delegate is not exposed, so we need to rely on hackish container queries. I'd rather not have to do this, but it doesn't look like an upstream fix will land until GTK 3.18.

Thanks for taking a look at this :karlt!
Attachment #8608704 - Flags: review?(karlt)
Comment on attachment 8608704 [details] [diff] [review]
Workaround for GTK bug 725164

>+      // Workaround for problematic refcounting in GTK3 before 3.18.

https://bugzilla.gnome.org/show_bug.cgi?id=725164 is actually fixed in 3.16,
but a workaround is still needed, thanks.

https://git.gnome.org/browse/gtk+/tree/gtk/gtkfilechooserwidget.c?id=3.16.0#n4575

>+            *((GtkFileChooserWidget**)result) = GTK_FILE_CHOOSER_WIDGET(widget);

Please use static_cast instead of C-style casts.

>+      MOZ_ASSERT(delegate);
>+      g_object_ref(delegate);

As finding delegate depends on GTK internals, please test for null here
instead of asserting, in order to gracefully handle any future changes in GTK.
The unref will need a null check also.

>+      // Properly deref our acquired reference.
>+      g_idle_add([](gpointer data) -> gboolean {
>+          g_object_unref(data);
>+          return G_SOURCE_REMOVE;
>+      }, delegate);

_gtk_file_system_get_info() uses g_file_query_info_async().
I don't think there is any guarantee that this will run its callback before
the next idle event.

gtk_widget_destroy() in Done() should be enough to trigger
cancel_all_operations() in gtkfilechooserwidget.c, and g_cancellable_cancel()
will queue an event to run the callback, so the unref runnable can be
dispatched after that gtk_widget_destroy() call.

Unfortunately, it seems Done() is too late to find the chooser widget because
gtk_container_destroy destroys children before dispatching the "destroy" event
on itself.
Attachment #8608704 - Flags: review?(karlt) → review-
Thanks, will update the patch with your suggestions momentarily.

(In reply to Karl Tomlinson (ni?:karlt) from comment #3)
> _gtk_file_system_get_info() uses g_file_query_info_async().
> I don't think there is any guarantee that this will run its callback before
> the next idle event.

While it's certainly not a great idea to depend on this sort of behaviour, the async info query is dispatched using G_PRIORITY_DEFAULT, which should occur before any events of priority G_PRIORITY_DEFAULT_IDLE as per the documentation for g_idle_add (https://developer.gnome.org/glib/unstable/glib-The-Main-Event-Loop.html#g-idle-add).
Attachment #8608704 - Attachment is obsolete: true
Attachment #8608893 - Flags: review?(karlt)
Update ensures that cancelled events are dispatched properly by registering the idle event in Done().
Attachment #8608893 - Attachment is obsolete: true
Attachment #8608893 - Flags: review?(karlt)
Attachment #8609032 - Flags: review?(karlt)
Comment on attachment 8609032 [details] [diff] [review]
GTK3 file chooser refcounting workaround v3

I don't see gfile.c calling g_task_set_return_on_cancel() so it seems the
cancel callback won't run until the task completes after being cancelled.

>+        // We call this in Done() to ensure that pending file info queries
>+        // caused by updating the current folder have been cancelled.

Please replace Done() with "after gtk_widget_destroy()" to be more explicit.

However, I see now that cancelling a GTask doesn't actually cancel the task
and queue the callback immediately unless g_task_set_return_on_cancel() has
been called and I don't see gfile.c calling that.  It seems the callback won't
run until the task completes after being cancelled for that implementation of
GFileIface.

So, please add something like "though we don't know when the callback will run
after cancelled".

I'm marking r+ because if this is resolving the crash because I don't really have any better ideas.  Perhaps a timeout of a minute or so, or use the xul file picker with broken GTK versions.
Attachment #8609032 - Flags: review?(karlt) → review+
Attachment #8609032 - Attachment is obsolete: true
GTK3 try push (try builds aren't in a great state now, but widget/tests/test_picker_no_crash.html should pass):

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6cb24f7ebf84
Keywords: checkin-needed
This busts linux clang builds (without gtk3 enabled) with this build warning (treated as an error):
{
widget/gtk/nsFilePicker.h:77:25: error: private field 'mFileChooserDelegate' is not used [-Werror,-Wunused-private-field]
}

Looks like all the usages of this variable are guarded with "#if (MOZ_WIDGET_GTK == 3)", but its declaration/initialization aren't guarded. So it just looks like an unused variable, to the compiler.

We should just add #ifdefs around the declaration/initialization, I think.
(er, previous followup patch doesn't build, since it leaves the init list with a trailing comma in gtk2 builds.  This version puts the comma at the start of each init-list line, to make it more ifdeffable.)
Attachment #8610649 - Attachment is obsolete: true
Attachment #8610649 - Flags: review?(karlt)
Attachment #8610654 - Flags: review?(karlt)
https://hg.mozilla.org/mozilla-central/rev/5e3736bd6c33
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Comment on attachment 8610654 [details] [diff] [review]
followup patch: add ifdef guard around member-var decl/initialization

>+  : mSelectedType(0)
>+    , mRunning(false)
>+    , mAllowURLs(false)
>+#if (MOZ_WIDGET_GTK == 3)
>+    , mFileChooserDelegate(nullptr)
>+#endif

Convention is to place the ',' under the ':' when placing at the beginning of the line, or probably simpler, use a member initializer at the declaration instead.
Attachment #8610654 - Flags: review?(karlt) → review+
Thanks - I just fixed the indentation, and landed ^^.  (Didn't bother switching to a member initializer, since I don't want to move code around too much for no reason, in a minor followup. Also, I haven't used member initializers before, and I don't want to accidentally screw it up somehow. :))
You need to log in before you can comment on or make changes to this bug.