Closed Bug 1893270 (CVE-2024-4770) Opened 9 months ago Closed 9 months ago

AddressSanitizer: attempting free on address which was not malloc()-ed on cairo_cff_font_destroy -> free

Categories

(Core :: Printing: Output, defect)


Tracking


VERIFIED FIXED
127 Branch
Tracking Status
firefox-esr115 126+ verified
firefox125 --- wontfix
firefox126 + verified
firefox127 + verified

People

(Reporter: sourc7, Assigned: dholbert)

References

Details

(5 keywords, Whiteboard: [reporter-external] [client-bounty-form] [verif?] [adv-main126+] [adv-ESR115.11+] )

Crash Data

Attachments

(9 files)

Attached file asan.attemptfree1.txt

After using Save to PDF from the print dialog, the entire browser crashed. Fortunately I was able to get a detailed report from AddressSanitizer: the crash is "attempting free on address which was not malloc()-ed" and points into gfx/cairo.

Tested on:

  • Firefox Nightly 127.0a1 (2024-04-24) (64-bit) on Arch Linux (X11)
  • Firefox Beta 126.0b5 (64-bit) on Arch Linux (X11)

Steps to reproduce:

  1. Visit attached testcase.html
  2. Wait for the Print dialog to appear
  3. Click Save
  4. Save the PDF file to a destination path
  5. After a few seconds the entire browser crashes
Flags: sec-bounty?
Group: firefox-core-security → gfx-core-security
Component: Security → Printing: Output
Keywords: csectype-uaf
Product: Firefox → Core

The test case is missing.

Flags: needinfo?(susah.yak)

(In reply to Andrew McCreight [:mccr8] from comment #1)

The test case is missing.

Yes, please wait, I'm attaching it in a moment.

Attached file testcase.html
Flags: needinfo?(susah.yak)

(In reply to Andrew McCreight [:mccr8] from comment #1)

The test case is missing.

OK, I've tested it and it reproduces on Arch Linux on both of my PCs (one with an Intel CPU, one with AMD).

I'm still investigating further; this is probably a regression, since the release channel version doesn't crash. I'll post more updates.

I can reproduce a crash locally. I started by testing an ASAN build (127.0a1 (2024-04-24) downloaded from this push: https://treeherder.mozilla.org/jobs?repo=autoland&revision=6457c29f67bb798a7ef5c9de1e2de3829814d6a9 ). It seems to reliably crash with a null-deref, for what it's worth, like so:

AddressSanitizer:DEADLYSIGNAL
=================================================================
==160759==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x61365cfabe12 bp 0x000000000000 sp 0x7ffcb1e320d0 T0)
==160759==The signal is caused by a READ memory access.
==160759==Hint: this fault was caused by a dereference of a high value address (see register values below).  Disassemble the provided pc to learn which register was used.
    #0 0x61365cfabe12 in atomic_compare_exchange_strong<__sanitizer::atomic_uint8_t> /builds/worker/fetches/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_atomic_clang.h:81:10
    #1 0x61365cfabe12 in AtomicallySetQuarantineFlagIfAllocated /builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_allocator.cpp:668:10
    #2 0x61365cfabe12 in __asan::Allocator::Deallocate(void*, unsigned long, unsigned long, __sanitizer::BufferedStackTrace*, __asan::AllocType) /builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_allocator.cpp:724:10
    #3 0x61365d04430b in free /builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:53:3
    #4 0x737f78d83611 in cairo_cff_font_destroy /builds/worker/checkouts/gecko/gfx/cairo/cairo/src/cairo-cff-subset.c:2942:2
    #5 0x737f78d85cd4 in _cairo_cff_fallback_init /builds/worker/checkouts/gecko/gfx/cairo/cairo/src/cairo-cff-subset.c:3441:5
    #6 0x737f78df12e3 in _cairo_pdf_surface_emit_cff_fallback_font /builds/worker/checkouts/gecko/gfx/cairo/cairo/src/cairo-pdf-surface.c:5955:14
    #7 0x737f78df12e3 in _cairo_pdf_surface_emit_unscaled_font_subset /builds/worker/checkouts/gecko/gfx/cairo/cairo/src/cairo-pdf-surface.c:6656:14
    #8 0x737f78f196e5 in _cairo_sub_font_collect /builds/worker/checkouts/gecko/gfx/cairo/cairo/src/cairo-scaled-font-subsets.c:741:30
    #9 0x737f78f196e5 in _cairo_scaled_font_subsets_foreach_internal /builds/worker/checkouts/gecko/gfx/cairo/cairo/src/cairo-scaled-font-subsets.c:1062:6
    #10 0x737f78dd3182 in _cairo_pdf_surface_emit_font_subsets /builds/worker/checkouts/gecko/gfx/cairo/cairo/src/cairo-pdf-surface.c:6694:14

Also: side note: before the crash, my debug build spams out a bunch of copies of this assertion:

###!!! ASSERTION: If it's incomplete and has no nif yet, it must flag a nif reflow.: 'nextFrame || kidStatus.NextInFlowNeedsReflow()', file /scratch/work/builds/mozilla-central/mozilla/layout/generic/nsCanvasFrame.cpp:731

...and then the following which might be more closely related (particularly the libfreetype/fillglyphs ones):

[GFX3-]: Surface size too large (exceeds extent limit)!
[GFX2-]: Allowing surface with invalid size (Cairo) Size(48960,63360)
[GFX3-]: Ending FillGlyphs with a bad surface 1
[GFX2-]: DrawTargetCairo context in error state: error occurred in libfreetype(40)
[GFX3-]: FillGlyphs bad surface 1
[GFX3-]: FillGlyphs bad surface 1
[GFX3-]: FillGlyphs bad surface 1
[GFX3-]: FillGlyphs bad surface 1

I captured a crash in a debug build in rr and submitted it to pernosco for further investigation.

(In reply to Daniel Holbert [:dholbert] from comment #6)

It seems to reliably crash with a null-deref, for what it's worth, like so:

That's not to say the root cause is necessarily a null-deref, though (in fact I caught it as a UAF/double-free in pernosco, see below). Looking closer at my backtrace in comment 6, it looks like I'm a few levels deep inside of free, with atomic_compare_exchange_strong being the thing that actually triggers a null deref. So it's possible we're actually calling free with some bogus non-null address, and in this particular build, that triggers a null-deref in that helper function for whatever reason.

Here's my pernosco trace:
https://pernos.co/debug/DwQdO1tRGNBBo3fLUoecdQ/index.html

In the backtrace there, we've got free calling ReplaceMalloc::free (arg1=0xe5e5e5e5e5e5e5e5)

So it looks like this is cairo being a bit clumsy with failing to initialize members, in one of its code paths.

Here's what happens that triggers the crash:

  • we hit _cairo_cff_fallback_init, which declares a local variable font that is initially null;
  • that calls status = _cairo_cff_font_fallback_create (font_subset, &font, subset_name);
  • ...and that function _cairo_cff_font_fallback_create fails to initialize a handful of member variables, including fd_local_sub_bias.
  • So when we go to clean up the font later in cairo_cff_font_destroy, we call free on those never-initialized member variables (font->fd_local_sub_bias in particular in this crash). In other words, we're calling free on an uninitialized pointer, and we crash. (A minimal sketch of this pattern follows below.)
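
To make the failure mode concrete, here's a condensed, hypothetical sketch of the pattern. The struct and function names below are made-up stand-ins, not the real cairo code (which is quoted further down); it just shows how a partially-initialized malloc()'ed struct ends up handing a garbage pointer to free:

/* Condensed, hypothetical sketch of the bug pattern -- NOT the real cairo
 * structs/functions -- showing how a partially-initialized malloc()'ed
 * struct leads to free() on a garbage pointer. */
#include <stdlib.h>

typedef struct {
    int   is_cid;             /* never initialized on the fallback path */
    void *fd_local_sub_bias;  /* never initialized on the fallback path */
    void *fdselect;           /* this one IS initialized */
} fake_cff_font_t;

/* Analogous to _cairo_cff_font_fallback_create: sets some members
 * but forgets others. */
static fake_cff_font_t *
fallback_create (void)
{
    fake_cff_font_t *font = malloc (sizeof (*font));
    if (!font)
        return NULL;
    font->fdselect = NULL;
    /* BUG: font->is_cid and font->fd_local_sub_bias still hold whatever
     * bytes happened to be in this heap chunk. */
    return font;
}

/* Analogous to cairo_cff_font_destroy. */
static void
destroy (fake_cff_font_t *font)
{
    if (font->is_cid)                   /* reads uninitialized memory */
        free (font->fd_local_sub_bias); /* may free a bogus pointer: UB / crash */
    free (font->fdselect);
    free (font);
}

int
main (void)
{
    fake_cff_font_t *font = fallback_create ();
    if (font)
        destroy (font); /* crashes if the stale is_cid bytes are nonzero */
    return 0;
}

Whether the sketch actually crashes depends on the stale heap contents; in the trace above the stale bytes were the nonzero 0xe5e5e5e5 pattern, so the guarded branch is taken and the bogus pointer is freed.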

In particular, compare this chunk of (better, I think?) _cairo_cff_font_create:
https://searchfox.org/mozilla-central/rev/6121b33709dd80979a6806ff59096a561e348ae8/gfx/cairo/cairo/src/cairo-cff-subset.c#2861-2875

font->fdselect = NULL;
font->fd_dict = NULL;
font->fd_private_dict = NULL;
font->fd_local_sub_index = NULL;
font->fd_local_sub_bias = NULL;
font->fdselect_subset = NULL;
font->fd_subset_map = NULL;
font->private_dict_offset = NULL;
font->global_subs_used = NULL;
font->local_subs_used = NULL;
font->fd_local_subs_used = NULL;

*font_return = font;

return CAIRO_STATUS_SUCCESS;

...vs. this buggy-I-think analogous code in _cairo_cff_font_fallback_create:
https://searchfox.org/mozilla-central/rev/6121b33709dd80979a6806ff59096a561e348ae8/gfx/cairo/cairo/src/cairo-cff-subset.c#3213-3226

font->global_subs_used = NULL;
font->local_subs_used = NULL;
font->subset_subroutines = FALSE;
font->fdselect = NULL;
font->fd_dict = NULL;
font->fd_private_dict = NULL;
font->fd_local_sub_index = NULL;
font->fdselect_subset = NULL;
font->fd_subset_map = NULL;
font->private_dict_offset = NULL;

*font_return = font;

return CAIRO_STATUS_SUCCESS;

Both of these are working with a freshly-allocated cairo_cff_font_t object, but the latter snippet (in _cairo_cff_font_fallback_create) leaves some fields uninitialized, e.g. fd_local_subs_used, fd_local_sub_bias, and others.

Keywords: sec-high

It looks like these uninitialized variables might've been introduced in a cairo update at some point (which set them in one initializer function but not the other, I think?)

Bug 739096 (specifically this commit) introduced fd_local_sub_bias, and the code to free it, and the code to null-initialize it in _cairo_cff_font_create, but not any code to initialize it in _cairo_cff_font_fallback_create. So this might be theoretically reproducible as far back as that bug, for cases where content can cause _cairo_cff_font_fallback_create to be invoked...

Also notable: in cairo_cff_font_destroy, the doomed free (font->fd_local_sub_bias); call is guarded by an if (font->is_cid) check:
https://searchfox.org/mozilla-central/rev/6121b33709dd80979a6806ff59096a561e348ae8/gfx/cairo/cairo/src/cairo-cff-subset.c#2894-2895,2927,2942

static void
cairo_cff_font_destroy (cairo_cff_font_t *font)
...
    if (font->is_cid) {
...
	free (font->fd_local_sub_bias);

But that is_cid member is still uninitialized when we read it here in the fatal run-through. (It's 0xe5e5e5e5, which is truthy, so we enter that clause.)

If we were to somehow reach cairo_cff_font_fallback_generate (which seems like a reasonable thing to reach for the product of _cairo_cff_font_fallback_create), then we would be OK because we'd hit this assignment:
https://searchfox.org/mozilla-central/rev/6121b33709dd80979a6806ff59096a561e348ae8/gfx/cairo/cairo/src/cairo-cff-subset.c#3264-3265

/* Create Top Dict */
font->is_cid = FALSE;

Maybe that's what cairo is expecting to happen, and why it doesn't bother to initialize members like fd_local_sub_bias? Not sure. But anyway, in this case we never make it that far; is_cid is uninitialized and truthy, so cairo_cff_font_destroy does enter the fatal chunk of code and attempts to free the uninitialized fd_local_sub_bias pointer. (See the sketch below for how zero-filled memory would defuse this path.)
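
For contrast, here's a small sketch (reusing the same hypothetical stand-in struct as the earlier sketch, not real cairo code) of why zero-filled memory makes this destroy path harmless in two independent ways:

/* Sketch (hypothetical fake_cff_font_t again): with zero-filled memory,
 * the guarded free is defused twice over. */
#include <stdlib.h>

typedef struct {
    int   is_cid;
    void *fd_local_sub_bias;
    void *fdselect;
} fake_cff_font_t;

int
main (void)
{
    fake_cff_font_t *font = calloc (1, sizeof (*font)); /* all bytes zero */
    if (!font)
        return 1;

    /* 1) is_cid is 0, so the guarded block is simply skipped. */
    if (font->is_cid)
        free (font->fd_local_sub_bias);

    /* 2) Even if it weren't skipped, fd_local_sub_bias is NULL, and
     *    free(NULL) is defined by the C standard to do nothing. */
    free (font->fd_local_sub_bias);

    free (font->fdselect); /* also NULL, also a no-op */
    free (font);
    return 0;
}

That is why a blanket zero-initialization of the struct (as in the eventual fix) is enough on its own to neutralize this destroy path.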

Assignee: nobody → dholbert
Status: NEW → ASSIGNED

It looks like this issue affects upstream, too (upstream cairo hasn't found/fixed this yet). Here's today's upstream cairo code for _cairo_cff_font_fallback_create, which is identical to the code I quoted in the bottom half of comment 9:
https://gitlab.freedesktop.org/cairo/cairo/-/blob/b3d578ec2ff97450cccdf9275c503fc0afa7cc80/src/cairo-cff-subset.c#L3219-3228

Severity: -- → S2
Crash Signature: [@ arena_dalloc | BaseAllocator::free | MozJemalloc::free | PageFree]

I see PContentParent::OnMessageReceived() in the stack, so it looks like this could act as a sandbox escape.

Looking on crash stats for crashes with RemotePrintJobParent::RecvFinalizePrint() in the proto signature on poison values, I found this crash which looks similar, from earlier this month on version 124: bp-2e38df06-05ad-4ae6-91ba-b3a870240410 (There was also one from today on Nightly, but I assume that is somebody testing this out.)

(In reply to Andrew McCreight [:mccr8] from comment #15)

(There was also one from today on Nightly, but I assume that is somebody testing this out.)

I did submit one crash report for this, yeah -- bp-60f096f2-1415-4c1f-92ed-e70e90240424 is me. My crash report's backtrace isn't quite the same as in the crash from 124 that mccr8 found, but it still might be related.

Using mozregression with the testcase here, I can reproduce this quite a ways back -- the regression range is:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=7c8f755bb185b66dbaf3d2f61f4702ea28a300fd&tochange=96604e3a217e9a5b59a6f31a4c426ee7f5df03d1

...but that's just when we enabled CSS nesting, which the testcase does use (but not in a particularly important way; nesting is just syntactic sugar). So this presumably goes back further if we adjust the testcase to not use CSS nesting.

(I added my crash signature to this bug, too, but I'm not sure whether that's especially useful, since this crash signature is [@ arena_dalloc | BaseAllocator::free | MozJemalloc::free | PageFree ], which seems to cover all crashes in free, basically. Perhaps we should add a rule to make crash-stats drill up to whatever called free and include that in the signature, so that we'd have a signature that looks like cairo_cff_font_destroy | free or whatever -- I forget the exact syntax/ordering.)

With testcase 2, I get this regression range which might be the "true" regression range here:
Last good revision: bd494168b95a83344e778b5b4679b67101482118 (2021-12-23)
First bad revision: 818c851d0aeb08a5596fdf72410a7033e69ab700 (2021-12-24)

Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=bd494168b95a83344e778b5b4679b67101482118&tochange=818c851d0aeb08a5596fdf72410a7033e69ab700

Not sure what in there would have made this go from not-crashing to crashing; maybe Bug 1747114? In any case, that range is probably not directly responsible; per comment 9, I think this is a bug in the upstream cairo code that we pulled in as part of Bug 739096 a few months earlier, and the December regression range is probably something that just enabled us to start tripping over that bug with this particular testcase.

(and that means 115 is "affected", I think, with testcase 2 at least.)

Oops, I meant to mark 115 as affected. I'm not sure how I managed that.

Comment on attachment 9398501 [details]
Bug 1893270: Make cairo font-creation functions consistently use calloc. r?jfkthame

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: I'd call it "moderate". (Though an exploit would require completing a print job -- this code is only used when we actually proceed with a print operation, and I think in particular a print to the Save to PDF backend. [UPDATE: This crash affects printing to "real" printers, too, I guess because we use cairo to generate the rendering that we send to the printer.])
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
  • Which branches (beta, release, and/or ESR) are affected by this flaw, and do the release status flags reflect this affected/unaffected state correctly?: all supported branches (beta, release, ESR115) are affected
  • If not all supported branches, which bug introduced the flaw?: None
  • Do you have backports for the affected branches?: No
  • If not, how different, hard to create, and risky will they be?: I'm pretty sure the patch applies cleanly to all affected branches; this is a snapshot of upstream cairo code, which changes very rarely (and this piece in particular seems to not have changed in a while).
  • How likely is this patch to cause regressions; how much testing does it need?: Unlikely to cause regressions; just initializing some fields that were otherwise left uninitialized.
  • Is the patch ready to land after security approval is given?: Yes
  • Is Android affected?: Yes
Attachment #9398501 - Flags: sec-approval?
Attachment #9398501 - Attachment description: Bug 1893270: Make cairo font initialization functions more consistent. r?jfkthame → Bug 1893270: Make cairo font-creation functions consistently use calloc. r?jfkthame

(In reply to Daniel Holbert [:dholbert] from comment #22)

Security Approval Request

Note: I updated the patch and re-requested review, but the patch is essentially the same (just using a blanket approach to zero-initialize the whole struct instead of initializing specific members).

Answers in comment 22 are still accurate, except that the answer to "How easily could an exploit be constructed based on the patch?" is now "slightly harder" (hooray!), since it's less obvious to an attacker which of the previously-uninitialized members are the ones we actually care about zero-initializing.
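
To illustrate the general shape of that "blanket" approach, here is a sketch of the calloc idiom only, applied to a made-up stand-in creation function; it is not the actual patch contents:

/* Sketch of the "blanket zero-initialization" idiom -- a hypothetical
 * stand-in creation function, not the actual patch. */
#include <stdlib.h>

typedef struct {
    int   is_cid;
    void *fd_local_sub_bias;
    void *fdselect;
    /* ...the real cairo_cff_font_t has many more members... */
} fake_cff_font_t;

static int
fallback_create (fake_cff_font_t **font_return)
{
    /* Before: malloc() followed by a long, easy-to-get-wrong list of
     *   font->foo = NULL;  assignments (which is where members were missed).
     * After: calloc() zero-fills the whole struct up front, so every
     * pointer member starts out NULL and every flag starts out 0. */
    fake_cff_font_t *font = calloc (1, sizeof (*font));
    if (!font)
        return -1; /* stand-in for CAIRO_STATUS_NO_MEMORY */

    /* Only members that need non-zero values would be set explicitly here. */

    *font_return = font;
    return 0; /* stand-in for CAIRO_STATUS_SUCCESS */
}

int
main (void)
{
    fake_cff_font_t *font = NULL;
    if (fallback_create (&font) == 0)
        free (font);
    return 0;
}

The trade-off is that every member gets zeroed whether it needs it or not, but for a struct created only during print/PDF font subsetting that cost should be negligible compared to the robustness gained.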

Attached file testcase 3 (reduced)

Lowering severity to sec-moderate because this requires a potential victim to actually "print" (or save to PDF); it's not triggered by the "print preview", which can be forced on a victim by calling window.print().

With testcase 2, I get this regression range which might be the "true" regression range here:

are these tests being run with the pref print.always_print_silent enabled? If so, that could explain that regression range, but it would also suggest this likely isn't the true source of the regression. It doesn't matter too much when it's that far back.

The signature associated with this crash appears to be a regression in 120. You may have triggered it using this testcase once, but it's not happening in the wild enough to get captured. If I search on "cff_font" in the proto-signature I see a bunch of crashes involving Firefox ESR and Thunderbird that might be related (though some might not -- that's really overbroad).

Crash Signature: [@ arena_dalloc | BaseAllocator::free | MozJemalloc::free | PageFree] → [@ arena_dalloc | BaseAllocator::free | MozJemalloc::free | PageFree] [@ arena_dalloc | cairo_cff_font_destroy ]

Comment on attachment 9398501 [details]
Bug 1893270: Make cairo font-creation functions consistently use calloc. r?jfkthame

sec-approval = dveditz
also approved for beta uplift

Attachment #9398501 - Flags: sec-approval?
Attachment #9398501 - Flags: sec-approval+
Attachment #9398501 - Flags: approval-mozilla-beta+

Although re-rated as sec-moderate, we should still land this in the upcoming ESR because the fix will be released as part of the upstream package.

(In reply to Daniel Veditz [:dveditz] from comment #26)

With testcase 2, I get this regression range which might be the "true" regression range here:

are these tests being run with the pref print.always_print_silent enabled?

I think I was using that pref to speed up mozregression there, yeah. You're right, that's likely why I got the regression range I did - there is a commit about silent printing being broken in there. So probably this does go back as far as our cairo update (Bug 739096) that pulled in this code (only a few months earlier in 2021). I don't have my mozregression env available to test that right now, but it doesn't much matter for regressions that far back, as you note.

Attachment #9399004 - Flags: approval-mozilla-esr115?
Attachment #9399005 - Flags: approval-mozilla-esr115?
Pushed by dholbert@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ebaf603aaadb
Make cairo font-creation functions consistently use calloc. r=jfkthame
https://hg.mozilla.org/integration/autoland/rev/8cce05f88b61
followup: save a copy of the cairo patch in-tree. r=jfkthame
Group: gfx-core-security → core-security-release
Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → 127 Branch
Attachment #9398516 - Attachment filename: testcase.html → testcase2.html

FWIW I've verified the fix in Nightlies -- i.e. I've confirmed that Nightly before the fix does crash, and that the current Nightly 127.0a1 (2024-04-29) (64-bit) does not crash, when I click through to perform a print operation to Firefox's save-to-PDF print backend with the attached testcases.

I haven't yet been able to reproduce the crash in release or beta for some reason, though (testing 125.0.1 and 126.0b7). But I can reproduce the crash in Nightly versions going back quite a while (before the fix that just landed a few days ago).

So: it's possible this bug is somehow mitigated in release versions, but I'm not sure how. I suspect it's not reliably mitigated in any case.

(Unfortunately my main internet connection happens to be down today, so my ability to bisect / test different releases is limited at the moment.)

esr115 Uplift Approval Request

  • User impact if declined: Potentially-exploitable crashes when printing specially crafted documents
  • Code covered by automated testing: no
  • Fix verified in Nightly: yes
  • Needs manual QE test: yes
  • Steps to reproduce for manual QE testing: Load a saved copy of one of the testcases. Proceed with the print operation, using Save-to-PDF print target. See if you crash (hopefully not).
  • Risk associated with taking this patch: Very low risk.
  • Explanation of risk level: This just zero-initializes a freshly allocated struct, making this codepath consistent with another one that initializes the same type of struct. It turns any potential uninitialized-value reads into reads of zero-filled memory (which means we end up with null pointers that get handled gracefully, rather than bogus uninitialized pointers).
  • String changes made/needed: None
  • Is Android affected?: yes
Flags: qe-verify+
QA Whiteboard: [post-critsmash-triage]
QA Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][qa-triaged]
Attachment #9399004 - Flags: approval-mozilla-esr115? → approval-mozilla-esr115+
Attachment #9399005 - Flags: approval-mozilla-esr115? → approval-mozilla-esr115+
Flags: sec-bounty? → sec-bounty+

I've replicated this issue with Nightly 127.0a1 (2024-04-24) on Ubuntu 22.04, following the STR from comment 0 and using testcase 3 (reduced).
Verified as fixed in the latest Nightly 127.0a1 (2024-05-01) and Firefox 126.0b8 under the same configuration, as the issue no longer occurs.

Status: RESOLVED → VERIFIED
QA Whiteboard: [post-critsmash-triage][qa-triaged]
Flags: qe-verify+

The issue can no longer be reproduced in Firefox 115.11.0esr on Ubuntu 22.04. Consequently, I've updated the flag to reflect this.

Whiteboard: [reporter-external] [client-bounty-form] [verif?] → [reporter-external] [client-bounty-form] [verif?] [adv-main126+] [adv-ESR115.11+]
Alias: CVE-2024-4770
Group: core-security-release