Use-after-free crash in [@ g_type_check_instance_cast]
Categories
(Core :: Widget: Gtk, defect, P1)
Tracking
()
People
(Reporter: mccr8, Assigned: stransky)
References
(Regression, )
Details
(4 keywords, Whiteboard: [adv-main111+r][adv-esr102.9+r])
Crash Data
Attachments
(2 files)
48 bytes,
text/x-phabricator-request
|
dmeehan
:
approval-mozilla-beta+
tjr
:
sec-approval+
|
Details | Review |
886 bytes,
patch
|
dmeehan
:
approval-mozilla-esr102+
|
Details | Diff | Splinter Review |
Crash report: https://crash-stats.mozilla.org/report/index/76fdcfa7-05c8-4b27-8cb9-2cc250230120
Reason: SIGSEGV / SI_KERNEL
Top 10 frames of crashing thread:
0 libgobject-2.0.so.0 g_type_check_instance_cast /usr/src/debug/glib2-2.74.1-2.fc37.x86_64/gobject/gtype.c:4127
1 libgtk-3.so.0 gtk_container_accessible_get_n_children /usr/src/debug/gtk3-3.24.36-1.fc37.x86_64/gtk/a11y/gtkcontaineraccessible.c:51
2 libatk-bridge-2.0.so.0 impl_get_ChildCount /usr/src/debug/at-spi2-atk-2.38.0-5.fc37.x86_64/atk-adaptor/adaptors/accessible-adaptor.c:139
3 libatk-bridge-2.0.so.0 impl_prop_GetSet /usr/src/debug/at-spi2-atk-2.38.0-5.fc37.x86_64/droute/droute.c:364
4 libatk-bridge-2.0.so.0 handle_properties /usr/src/debug/at-spi2-atk-2.38.0-5.fc37.x86_64/droute/droute.c:443
4 libatk-bridge-2.0.so.0 handle_message /usr/src/debug/at-spi2-atk-2.38.0-5.fc37.x86_64/droute/droute.c:597
5 libdbus-1.so.3 _dbus_object_tree_dispatch_and_unlock /usr/src/debug/dbus-1.14.4-1.fc37.x86_64/dbus/dbus-object-tree.c:1021
5 libdbus-1.so.3 <name omitted> /usr/src/debug/dbus-1.14.4-1.fc37.x86_64/dbus/dbus-connection.c:4742
5 libdbus-1.so.3 dbus_connection_dispatch /usr/src/debug/dbus-1.14.4-1.fc37.x86_64/dbus/dbus-connection.c:4574
6 libatspi.so.0 message_queue_dispatch /usr/src/debug/at-spi2-core-2.44.1-2.fc37.x86_64/atspi/atspi-gmain.c:89
This looks like a UAF inside GTK accessibility code. There are 28 crashes like that in the last week, which isn't a huge number in absolute terms, but seems like it must be a lot for Linux accessibility code.
Reporter | ||
Updated•2 years ago
|
Reporter | ||
Comment 1•2 years ago
|
||
I'm not sure if this should go here or in Widget: GTK.
Comment 2•2 years ago
|
||
I wonder if this is related to bug 1760819? It's really hard to debug this without any idea of what the GTK a11y code was even trying to access.
Updated•2 years ago
|
Comment 3•2 years ago
|
||
I checked more than 10 of these reports. All of them were during shutdown and all with Wayland.
The fact that these occur during shutdown makes me think an Accessible is being destroyed somehow before the widget is destroyed. I don't see how that could happen, though. The RootAccessible associated with the widget should have been destroyed when the PresShell was destroyed. That said, I don't have a solid understanding of the order of events when a root widget shuts down. Is it possible for the PresShell to outlive the widget? I wouldn't think so, but if it is, maybe that could cause a problem?
Alternatively, maybe we're destroying things correctly but not sending some notification we should be sending. I'm somewhat unclear as to how our ATK code notifies the OS about created and destroyed Accessibles. There are create and destroy signals, but we don't seem to use them. We do unref ATK objects when we destroy them, but maybe that isn't enough somehow?
I also can't figure out how we associate our custom ATKObject with the widget (if we do at all), nor how the OS learns about our ApplicationAccessible (we fire children changed notifications on that, but how does the system learn about it in the first place?).
Eitan, do you have any thoughts on any of this?
Assignee | ||
Comment 4•2 years ago
|
||
Please run with MOZ_LOG="Widget:5 WidgetPopup:5" and you get info when Gtk widgets are deleted (it's by nsWindow::Destroy()).
Assignee | ||
Comment 5•2 years ago
|
||
From the code:
static dbus_bool_t
impl_get_ChildCount (DBusMessageIter * iter, void *user_data)
{
AtkObject *object = (AtkObject *) user_data;
int childCount;
g_return_val_if_fail (ATK_IS_OBJECT (user_data), FALSE);
childCount = (ATK_IS_SOCKET (object) && atk_socket_is_occupied (ATK_SOCKET (object)))
? 1
: atk_object_get_n_accessible_children (object);
return droute_return_v_int32 (iter, childCount);
}
It looks like ATK has stored already released object (the void *user_data param).
Comment 6•2 years ago
|
||
(In reply to Martin Stránský [:stransky] (ni? me) from comment #4)
Please run with MOZ_LOG="Widget:5 WidgetPopup:5" and you get info when Gtk widgets are deleted (it's by nsWindow::Destroy()).
Thanks. However, it's still unclear to me how this fits in with the PresShell, which is where the RootAccessible is held.
(In reply to Martin Stránský [:stransky] (ni? me) from comment #5)
It looks like ATK has stored already released object (the void *user_data param).
Which leads to the question: is that object something created by Gecko or something created by GTK? I don't see anywhere we tell the widget to use our AtkObject instead of its default implementation, but that seems odd to me, so I must be missing something.
Assignee | ||
Comment 7•2 years ago
|
||
Can you use ASAN build or rr to trace it? You can but breakpoint to impl_get_ChildCount() and trace it back where the object is created.
Comment 8•2 years ago
|
||
I can't reproduce it - I'm not even set up to run Linux builds currently (but even people on my team using a11y on Linux don't appear to see this) - so unfortunately no.
Comment 9•2 years ago
|
||
I'm going to try to reproduce this with a local asan build and record it in rr.
Comment 10•2 years ago
|
||
Martin,
This looks like a bug in GTK3, if a widget is finalized before it is destroyed it de-references and drops the pointer to its accessible. But the accessible still keeps a reference to the widget. So if the accessible is still alive and queried via dbus it uses the freed address space where the widget used to be.
GtkWidget's destroy method does this correctly:
https://gitlab.gnome.org/GNOME/gtk/-/blob/3.24.36/gtk/gtkwidget.c#L12335-12340
But the finalize method unilaterally drops the accessible pointer value without clearing the widget's pointer value in the accessible:
https://gitlab.gnome.org/GNOME/gtk/-/blob/3.24.36/gtk/gtkwidget.c#L12374
Note that this is a problem because the accessible does not hold a strong reference to the widget:
https://gitlab.gnome.org/GNOME/gtk/-/blob/3.24.36/gtk/gtkaccessible.c#L169-172
I believe our MozContainer is finalized before it is destroyed and leaves a dangling reference to itself in its corresponding accessible that outlives it.
To fix the bug on our end we need to override some assured cleanup method in MozContainer (finalize
?) to remove the MozContainer reference from the accessible.
As for an upstream fix, this is a UAF, so I don't know what their protocol is for that kind of thing.
Assignee | ||
Comment 11•2 years ago
|
||
Thanks for the investigation, will look at it.
Assignee | ||
Comment 12•2 years ago
|
||
Gtk4 seems to have it fixed and doesn't delete accessible reference from finalize. From my understanding finalize is called when reference to GObject drops to 0.
Assignee | ||
Comment 13•2 years ago
|
||
Updated•2 years ago
|
Assignee | ||
Comment 14•2 years ago
•
|
||
Comment on attachment 9316555 [details]
Bug 1811637 [Linux] Detach accessible from moz_container before it's deleted r?emilio
Security Approval Request
- How easily could an exploit be constructed based on the patch?: It reliably crashes when any Firefox window is closed due to access to freed memory. The crash happens because ATK toolkit access (via DBus callback) to already deleted GtkWidget which holds Firefox main window.
- Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
- Which older supported branches are affected by this flaw?: all
- If not all supported branches, which bug introduced the flaw?: Seems to be caused by Gtk3 library which doesn't clear GtkWidget reference from ATK object. It happens for custom GtkWidgets which doesn't call stock (and private) gtk_widget_real_destroy() but only gtk_widget_finalize(). gtk_widget_finalize() doesn't clear back reference from ATK object to GtkWidget so when ATK object is used (after GtkWidget free) the freed GtkWidget is accessed.
- Do you have backports for the affected branches?: Yes
- If not, how different, hard to create, and risky will they be?: No risk, we just clear GtkWidget reference from ATK object so the callback is not performed. It's the same for ESR line.
- How likely is this patch to cause regressions; how much testing does it need?: Should not cause a regression, at already crashes whole browser.
- Is Android affected?: No
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 15•2 years ago
|
||
Reported as https://gitlab.gnome.org/GNOME/gtk/-/issues/5578
Assignee | ||
Comment 17•2 years ago
•
|
||
Guys, this is easy and simple fix, can we land it now also for release? Otherwise I'm going to ship it as Fedora/RHEL downstream patch.
Comment 18•2 years ago
|
||
Copying crash signatures from duplicate bugs.
Comment 19•2 years ago
|
||
The following field has been copied from a duplicate bug:
Field | Value | Source |
---|---|---|
Regressed by | bug 1530052 | bug 1760819 |
For more information, please visit auto_nag documentation.
Comment 20•2 years ago
|
||
Set release status flags based on info from the regressing bug 1530052
Comment 21•2 years ago
|
||
Comment on attachment 9316555 [details]
Bug 1811637 [Linux] Detach accessible from moz_container before it's deleted r?emilio
Approved to request uplift and land
Updated•2 years ago
|
Comment 22•2 years ago
|
||
(In reply to Martin Stránský [:stransky] (ni? me) from comment #17)
Guys, this is easy and simple fix, can we land it now also for release? Otherwise I'm going to ship it as Fedora/RHEL downstream patch.
Hey Martin, sorry I didn't see this when I did the sec-approval - sec-approvals get paused once the uplift period ends, which was on February 2nd, and resume after the merge. (Future dates are visible on fx-trains.) If there is something after the last uplift that is needed urgently in the release then the best thing to do is to contact relman - the Release Owner for a given release is on fx-trains - you would check who the owner is for the beta you want to uplift it into.
![]() |
||
Comment 23•2 years ago
|
||
[Linux] Detach accessible from moz_container before it's deleted r=emilio
https://hg.mozilla.org/integration/autoland/rev/1c9296c074ff122ecf67b8ec97f3e987c3f22009
https://hg.mozilla.org/mozilla-central/rev/1c9296c074ff
Comment 24•2 years ago
|
||
The patch landed in nightly and beta is affected.
:stransky, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set
status-firefox111
towontfix
.
For more information, please visit auto_nag documentation.
Assignee | ||
Comment 25•2 years ago
|
||
Comment on attachment 9316555 [details]
Bug 1811637 [Linux] Detach accessible from moz_container before it's deleted r?emilio
Beta/Release Uplift Approval Request
- User impact if declined: Firefox crashes if any opened Firefox window is closed and accessibility is enabled (screen reader for instance).
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Should not cause a regression, it already crashes whole browser when accessibility is enabled.
- String changes made/needed:
- Is Android affected?: No
Assignee | ||
Comment 27•2 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #26)
Do we need this on ESR102 also?
Yes, it's affected too.
Comment 28•2 years ago
|
||
Comment on attachment 9316555 [details]
Bug 1811637 [Linux] Detach accessible from moz_container before it's deleted r?emilio
Approved for 111.0b5
Comment 29•2 years ago
|
||
uplift |
Comment 30•2 years ago
|
||
Please attach a rebased patch for ESR102 and nominate it for uplift when you get a chance.
Assignee | ||
Comment 31•2 years ago
|
||
Sorry, I'm unable to submit as phabricator patch due to python error:
CommandError: command 'status' failed to complete successfully
so attaching as plain diff here.
Comment 32•2 years ago
|
||
Comment on attachment 9320436 [details] [diff] [review]
esr102.patch
Approved for 102.9esr.
Comment 33•2 years ago
|
||
uplift |
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Updated•1 year ago
|
Description
•