Closed Bug 1894100 Opened 1 year ago Closed 10 months ago

Poison crash in [@ mozilla::widget::WaylandBuffer::BufferReleaseCallbackHandler]

Categories

(Core :: Widget: Gtk, defect)

defect

Tracking

()

RESOLVED FIXED
137 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox-esr128 138+ fixed
firefox135 --- wontfix
firefox136 --- wontfix
firefox137 + fixed

People

(Reporter: mccr8, Assigned: stransky)

References

Details

(Keywords: crash, csectype-uaf, sec-high, Whiteboard: [fixed by 1942232] [adv-main137+r][adv-esr128.10+r])

Crash Data

Attachments

(1 file)

Crash report: https://crash-stats.mozilla.org/report/index/bf4a8405-9924-4e8f-b575-274b20240416

Reason: SIGSEGV / SI_KERNEL

Top 10 frames:

0  libxul.so  mozilla::widget::WaylandBuffer::BufferReleaseCallbackHandler(wl_buffer*)  widget/gtk/WaylandBuffer.cpp:114
0  libxul.so  mozilla::widget::WaylandBuffer::BufferReleaseCallbackHandler(void*, wl_buffer*)  widget/gtk/WaylandBuffer.cpp:121
1  libffi.so.8  ffi_call_unix64  /build/libffi-XrD2gB/libffi-3.4.4/src/x86/unix64.S:104
2  libffi.so.8  ffi_call_int  /build/libffi-XrD2gB/libffi-3.4.4/src/x86/ffi64.c:673
3  libffi.so.8  ffi_call  /build/libffi-XrD2gB/libffi-3.4.4/src/x86/ffi64.c:710
4  libwayland-client.so.0  wl_closure_invoke  /usr/src/wayland-1.22.0-2.1/src/connection.c:1025
5  libwayland-client.so.0  dispatch_event  /usr/src/wayland-1.22.0-2.1/src/wayland-client.c:1631
6  libwayland-client.so.0  dispatch_queue  /usr/src/wayland-1.22.0-2.1/src/wayland-client.c:1777
6  libwayland-client.so.0  wl_display_dispatch_queue_pending  /usr/src/wayland-1.22.0-2.1/src/wayland-client.c:2019
7  libgdk-3.so.0  _gdk_wayland_display_queue_events  /usr/src/gtk+3.0-3.24.38-5ubuntu1/gdk/wayland/gdkeventsource.c:201

I'm not sure how exploitable this is, but there's a fairly steady volume of these. At a glance, it looks like a classic case of some kind of runnable thing holding a weak reference to something that goes away before the runnable can fire.

There were only a couple of crashes in all of January and February, and now there are that many a day. Is there an update on the Wayland side involved?

the crash is a sec-high type, but if this affects only a limited population that might mitigate it a little

mBufferReleaseFunc/mBufferReleaseData should be nullptr is AFAIK it not used at all.

Looks like WaylandBuffer was released before callback call so mBufferReleaseFunc was changed from nullptr to 0xe5e5e5e5e5e5e5e5.

I have patches for it, should be covered by Bug 1942232.

See Also: → 1942232
Flags: needinfo?(stransky)

BufferReleaseCallbackHandler was removed at Bug 1942232.

Flags: needinfo?(stransky)

Could we uplift bug 1942232 to other branches? I don't see this on any branch newer than 135.0a1, so maybe something else fixed it before bug 1942232. I do see it on 128 ESR (as recently as a 20250127191809 build) so it would be good if there was something we could do there.

Assignee: nobody → stransky
Status: NEW → RESOLVED
Closed: 10 months ago
Depends on: 1942232
Flags: needinfo?(stransky)
Resolution: --- → FIXED
See Also: 1942232

[Tracking Requested - why for this release]:

(In reply to Martin Stránský [:stransky] (ni? me) from comment #5)

BufferReleaseCallbackHandler was removed at Bug 1942232.

But that didn't land until 2025-02-05, while this signature seemed to go away in nightlies and betas mid-December 2024 (135 nightlies and after 134b1). ESR-128.7 builds are affected and that should correspond with 135 in theory, but since we don't see this signature in 135 release is it likely this was actually fixed by an earlier "non-security" patch that never got backported to ESR-128?

That makes me worry about backporting bug 1942232: it might not actually fix the problem on 128, and in fact it might cause ESR 128 stability issues if there's an unknown dependency on an earlier fix that wasn't backported.

Group: gfx-core-security → core-security-release
Whiteboard: [fixed by 1942232]

(In reply to Andrew McCreight [:mccr8] from comment #6)

Could we uplift bug 1942232 to other branches? I don't see this on any branch newer than 135.0a1, so maybe something else fixed it before bug 1942232. I do see it on 128 ESR (as recently as a 20250127191809 build) so it would be good if there was something we could do there.

Bug 1942232 (make WaylandBuffer robust) depends on Bug 1934497 (WaylandSurface implementation) which we can't backport to ESR.

The core issue here is missing wl_buffer_destroy() implemented via wl_display_sync() :
https://searchfox.org/mozilla-central/rev/f9f9b422f685244dcd3f6826b70d34a496ce5853/widget/gtk/WaylandBuffer.cpp#203

The crashing scenario is:

  1. create WaylandBuffer with wl_buffer, register 'detach' callback which indicates wl_buffer is no longer used by compositor
  2. nsWindow::Unmap() - usually when a popup window is closed
  3. delete wl_buffer on our side, but it's still live in wayland-client library
  4. delete WaylandBuffer class
  5. get wl_buffer::detach callback which points to released WaylandBuffer

It's because wl_buffer delete is queued to Wayland even queue and it doesn't happens instantly so we need to use wl_display_sync() and delete WaylandBuffer after wl_buffer delete confirmation.

Flags: needinfo?(stransky)

The crashes here may be also related to WebGL/canvas where dmabuf based wl_buffer is used.

I'm not sure how to backport it to ESR128. In 135.0 we introduced the WaylandSurface which holds reference to WaylandBuffer until it's safely released but we don't have anything like that on ESR and we just call wl_buffer delete on ~WaylandBuffer.

I think we may just addref WaylandBuffer when its attached to wl_surface and unref it on BufferReleaseCallbackHandler.

It may lead to WaylandBuffer memleak as it's not guaranteed to get BufferReleaseCallbackHandler and buggy compositors may skip that call for come cases (for instance mutter / Ubuntu 22.04 in Mozilla testsuite does it) but it may be better than UAF crash.

We may also delete underlying wl_shm_create_pool() and keep wl_buffer object only which is quite harsh but it releases entire underlying shm memory and keeps WaylandBuffer shell only so it makes the memleak tiny.

Flags: needinfo?(stransky)
QA Whiteboard: [post-critsmash-triage]
Flags: qe-verify-

(In reply to Martin Stránský [:stransky] (ni? me) from comment #10)

We're basically out of time to do anything for ESR128 at this point in this current cycle, but is this still something you're thinking about doing? Trading a security bug for a small memory leak seems like a reasonable compromise.

Target Milestone: --- → 137 Branch
Attachment #9474512 - Flags: approval-mozilla-esr128?
Flags: needinfo?(stransky)

Comment on attachment 9474512 [details]
Bug 1894100 [Linux] Addref WaylandBuffer when it's attached to Wayland compositor r?emilio

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: You need to repeatedly and quickly open/close popup windows.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: Yes
  • Which branches (beta, release, and/or ESR) are affected by this flaw, and do the release status flags reflect this affected/unaffected state correctly?: esr
  • If not all supported branches, which bug introduced the flaw?: None
  • Do you have backports for the affected branches?: Yes
  • If not, how different, hard to create, and risky will they be?:
  • How likely is this patch to cause regressions; how much testing does it need?: causes memleak instead of UAF crash.
  • Is the patch ready to land after security approval is given?: Yes
  • Is Android affected?: No
Attachment #9474512 - Flags: sec-approval?
Whiteboard: [fixed by 1942232] → [fixed by 1942232] [adv-main137+r]

Approved to land, but we need to untangle the Advisory situation because we reported this fixed in 137.

AIUI we should remove it from the 137 advisory and include it in 138?

Flags: needinfo?(stransky)

(In reply to Tom Ritter [:tjr] from comment #14)

Approved to land, but we need to untangle the Advisory situation because we reported this fixed in 137.

AIUI we should remove it from the 137 advisory and include it in 138?

Not sure that do you mean, the patch is for 128.

Flags: needinfo?(stransky)

Not sure that do you mean, the patch is for 128.

Following the whole thread it seems there's no patch yet for ESR 128 (neither I can see it in the commits log). Am I missing something? (asking for a friend doing Tor Browser security backports :) )

(In reply to Martin Stránský [:stransky] (ni? me) from comment #15)

(In reply to Tom Ritter [:tjr] from comment #14)

Approved to land, but we need to untangle the Advisory situation because we reported this fixed in 137.

AIUI we should remove it from the 137 advisory and include it in 138?

Not sure that do you mean, the patch is for 128.

Ah okay, I understand now; fixed in 137 and 128.10.0 (which corresponds to 138)

(In reply to Giorgio Maone [:ma1] from comment #16)

Not sure that do you mean, the patch is for 128.

Following the whole thread it seems there's no patch yet for ESR 128 (neither I can see it in the commits log). Am I missing something? (asking for a friend doing Tor Browser security backports :) )

D243015 is the patch for 128 AIUI. We will be landing it in ESR next cycle.

Comment on attachment 9474512 [details]
Bug 1894100 [Linux] Addref WaylandBuffer when it's attached to Wayland compositor r?emilio

Woops, I commented but never set the flag

Attachment #9474512 - Flags: sec-approval? → sec-approval+
Attachment #9474512 - Flags: approval-mozilla-esr128? → approval-mozilla-esr128+
Whiteboard: [fixed by 1942232] [adv-main137+r] → [fixed by 1942232] [adv-main137+r][adv-esr128.23+r]
Whiteboard: [fixed by 1942232] [adv-main137+r][adv-esr128.23+r] → [fixed by 1942232] [adv-main137+r][adv-esr128.10+r]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: