Closed Bug 1826791 (CVE-2023-6858) Opened 2 years ago Closed 1 year ago

Access violation on BuildTextRunsScanner::ScanFrame at layout/generic/nsTextFrame.cpp

Categories

(Core :: Layout: Text and Fonts, defect)

defect

Tracking

()

VERIFIED FIXED
121 Branch
Tracking Status
firefox-esr115 121+ fixed
firefox119 --- wontfix
firefox120 --- wontfix
firefox121 + fixed

People

(Reporter: sourc7, Assigned: jfkthame)

Details

(5 keywords, Whiteboard: [reporter-external] [client-bounty-form] [verif?][adv-main121+][adv-esr115.6+])

Attachments

(9 files, 3 obsolete files)

Attached file testcase-setrangetext1.html (obsolete) —

After run setRangeText on textarea with long string, then run window.scrollByPages and new worker repeatedly on queueMicroTask, Firefox tab able to crash on BuildTextRunsScanner::ScanFrame(nsIFrame*) at layout/generic/nsTextFrame.cpp, the crash address able to change e.g. 0xeb4ff006, 0x73982014, 0x73997014, 0x9fbff006, and more.

Tested on:

  • Firefox Nightly 113.0a1 (2023-04-06) (32-bit)
  • Firefox 111.0.1 (32-bit)
  • Firefox ESR 102.9.0esr (32-bit)

Steps to reproduce:

  1. Open Firefox 32-bit
  2. Visit attached testcase-setrangetext1.html
  3. Analyze the crash dump

If Firefox crash at EXCEPTION_BREAKPOINT, try again until it crash at ACCESS_VIOLATION_READ

Flags: sec-bounty?

Reduce pre-allocate memory from 2800 mb to 2700 mb for better reliability.

Attachment #9327382 - Attachment is obsolete: true
Group: firefox-core-security → layout-core-security
Component: Security → Layout
Product: Firefox → Core

I was unable to reproduce the issue on Linux with a 32-bit ASan build, unfortunately only OOMs.

How reliable is this for you? I guess at least somewhat reliable if you reproduced on 3 different versions.

How much memory does your machine or VM have? Tyson spent quite a bit of time trying to reproduce it with no luck

Jonathan: you've looked at previous crashes in the BuildTextRunsScanner... does this stack give you any ideas of things to look at?

Flags: needinfo?(susah.yak)
Flags: needinfo?(jfkthame)
Keywords: crash, testcase
Attached file assert-stack.txt

In a 32-bit debug build on Windows I get:
Assertion failure: aIndex < mState.mLength (bad index), at /builds/worker/checkouts/gecko/dom/base/nsTextFragment.h:219

(In reply to Daniel Veditz [:dveditz] from comment #5)

Tyson spent quite a bit of time trying to reproduce it with no luck

FWIW I was trying with a 32-bit ASan build on Linux to try to get more logs.

I suspect this is fundamentally an out-of-memory failure, but we're not handling it cleanly somewhere and leaving something (an nsTextFragment?) in an inconsistent state.

(If it's the nsTextFragment that's broken, this might be more of a DOM than Layout bug. Hard to be sure at this point, though.)

Flags: needinfo?(jfkthame)

(In reply to Daniel Veditz [:dveditz] from comment #5)

How reliable is this for you? I guess at least somewhat reliable if you reproduced on 3 different versions.

The testcase is pretty reliable, when running the testcase on Firefox Nightly (32-bit) using Ryzen 7 5700G on Windows 11, on best case running this for 20x run on ffpuppet launcher I able to get 12x success with AV_READ on BuildTextRunsScanner::ScanFrame, 5x breakpoint, and 3x no minidump, overall it got 60% success rate. On the next run I get 35% and 30% success rate on the same 20x run.

How much memory does your machine or VM have? Tyson spent quite a bit of time trying to reproduce it with no luck

I'm on Windows 11 with 64GB of RAM, I tried this recently on my i5-1035G1 with 8GB of RAM, it also able to reproduce the AV_READ.

Flags: needinfo?(susah.yak)

The severity field is not set for this bug.
:jfkthame, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jfkthame)
Attached file minidump_64bit.txt

It's also possible to hit the ACCESS_VIOLATION_READ on Firefox Nightly 115.0a1 (2023-05-26) (64-bit).

From the CPU register rdx = 0x00000273bd81eb00 and crash address 0x00000273bd8ff012 it looks like out-of-bounds read.

Ok great! I able to find the root cause of this!

The root cause is when JS execute textarea1.setRangeText then Firefox code execute nsTextFragment::SetTo to allocate hdr buffer using nsStringBuffer::Alloc when OOM the hdr set to NULL, then it directly execute return already_AddRefed(hdr);

To simulate this we can add following code on nsStringBuffer.cpp above nsStringBuffer* hdr = (nsStringBuffer*)malloc(sizeof(nsStringBuffer) + aSize); as follow:

  if(aSize == 1048858) {
    printf("Hit already_AddRefed((nsStringBuffer*)NULL);\n");
    return already_AddRefed((nsStringBuffer*)NULL);
  }

After Firefox was compiled with AddressSanitizer, then visit the testcase-setrangetext1.html, it will show output AddressSanitizer: heap-buffer-overflow with READ of size 2

Observations:

nsTextFragment::SetTo seems to think it's fine to leave m2b as a null pointer (in case of OOM):
https://searchfox.org/mozilla-central/rev/d8c5478bd730644dab4336fdb83271ac1ca1d6b5/dom/base/nsTextFragment.cpp#299-302

m2b = nsStringBuffer::Alloc(m2bSize.value()).take();
if (!m2b) {
  return false;
}

(In another case it opts to handle this by explicitly crashing). But I think it's supposed to be ~fine to leave it null, too? It looks like nsTextFragment::SetTo does call ReleaseText before experiencing this allocation failure, that seems to leave the fragment in a "string of length 0 with null pointer" state, and that appears to be a normal state to be in (when it does legitimately have a length of 0). So when we try and fail to set it to something non-null (without changing any other member data I think), that should be ~fine.

Then, when nsTextFrame calls into CharAt, we immediately index off of our (null) m2b pointer (with no null-check, though we do have a debug-only assert precondition that aIndex is less than the length, which in this case won't hold up, since the ReleaseText call in SetTo should've left us with zero length):
https://searchfox.org/mozilla-central/rev/d8c5478bd730644dab4336fdb83271ac1ca1d6b5/dom/base/nsTextFragment.h#218-221

char16_t CharAt(uint32_t aIndex) const {
  MOZ_ASSERT(aIndex < mState.mLength, "bad index");
  return mState.mIs2b ? Get2b()[aIndex]
                      : static_cast<unsigned char>(m1b[aIndex]);

So somehow nsTextFrame isn't realizing that its nsTextFragment has zero length. The caller does conditionally check the length here:
https://searchfox.org/mozilla-central/rev/d8c5478bd730644dab4336fdb83271ac1ca1d6b5/layout/generic/nsTextFrame.cpp#642-644

int32_t nsTextFrame::GetContentEnd() const {
  nsTextFrame* next = GetNextContinuation();
  return next ? next->GetContentOffset() : TextFragment()->GetLength();

...but not if next is truthy, which is maybe the problem here.

So essentially, based on that above-quoted GetContentEnd impl: if nsTextFrame has a next-continuation, then it seems possible that it may not realize that its TextFragment has zero length (due to allocation failure) and hence that it shouldn't be indexed into via CharAt/etc.

It's a bit silly to be in this situation where we have zero text and yet also have a next-continuation. Maybe the allocation failure should cause us to recreate all of these frames (such that we won't have a next continuation since we don't have any text)? But that adds some complexity for a pretty-rare situation (CharacterData::SetTextInternal, or something in its callstack, would need to detect the allocation failure and ensure that frame reconstruction happens before we touch the frame tree).

It might just be simpler and more robust just to handle this gracefully, by giving the relevant nsTextFrame code a check to test whether the text fragment has zero length? (We may need more cleanup to make sure things proceed nicely after that point; e.g. I'm not sure if some invariants would be broken elsewhere for this frame or its next continuation due to this fragment having a zero length.)

Actually, it's something a bit more subtle than what I described above... The allocation failure is followed by a successful allocation a little bit later, but we're left with an apparently bogus mContentOffset value on the second nsTextFrame continuation (8344), whereas our nsTextFragment only has a length of 8324. So we've got a mContentOffset value that's out of bounds, overshooting by 20.

(Those values correspond to things in the testcase -- there are initially 19 characters in the nsTextArea (which probably rounds up to 20 via a whitespace or null terminator or something), and 8324 is the number of semicolon characters that we keep inserting at the start of the textarea via setRangeText.)

Here's a pernosco recording of the crash that I'm currently poking at: https://pernos.co/debug/WS_u7yvuO_HvEvJEZPqjyA/index.html

Toggling ni=me to push further here.

Flags: needinfo?(dholbert)
Attached patch CharAt.patchSplinter Review

(In reply to Daniel Holbert [:dholbert] from comment #15)

It might just be simpler and more robust just to handle this gracefully, by giving the relevant nsTextFrame code a check to test whether the text fragment has zero length? (We may need more cleanup to make sure things proceed nicely after that point; e.g. I'm not sure if some invariants would be broken elsewhere for this frame or its next continuation due to this fragment having a zero length.)

I found by adding return char16_t{0}; when the aIndex > mState.mLength on CharAt code, I no longer able to reproduce the EXCEPTION_ACCESS_VIOLATION_READ on BuildTextRunsScanner::ScanFrame, tried replaying many times and still did not receive that, which is great.

char16_t CharAt(uint32_t aIndex) const {
  MOZ_ASSERT(aIndex < mState.mLength, "bad index");
  if(aIndex > mState.mLength) {
    return char16_t{0}; // Represents "null" or no value
  }
  return mState.mIs2b ? Get2b()[aIndex]
                      : static_cast<unsigned char>(m1b[aIndex]);
}

However, does it may display the null text value to the user? or other unwanted behavior? I think the more good approach is when CharAt detect aIndex > mState.mLength then it return false so other code able to output console.error to user i.e. uncaught exception: out of memory.

I think it's a good start knowing that by adding return char16_t{0}; when the aIndex > mState.mLength on CharAt code is resolved the access violation on ScanFrame, thanks!

It looks to me like we can protect against this by making nsTextFrame::GetContentEnd() check the fragment length in addition to looking at the next-continuation's offset, so that in the case where OOM during text fragment updates left things in an unexpected state, we don't proceed to look beyond the end of the fragment that actually exists. This should be lower-cost than adding a bounds check to every CharAt call.

In the pernosco recording from comment 16, this would mean that HasTerminalNewline() does not try to look beyond the end of the fragment.

(There's a similar pattern in BuildTextRunsScanner::MappedFlow::GetContentEnd() where we should do the same thing, I guess, though I haven't specifically hit this case in testing.)

Just trying a local ASAN build for some further testing....

Flags: needinfo?(jfkthame)

With this, I can no longer reproduce an ASAN error or crash using the
simulated-OOM patch from comment 12.

(In a debug build we still spew a bunch of [non-fatal] NS_ASSERTIONs about
"negative length" and "overlapping or discontiguous frames" after forcing
the string allocation failure, which is unsurprising given that an update
to the underlying data has failed, but appears to be handled safely.)

Assignee: nobody → jfkthame
Status: NEW → ASSIGNED
Pushed by jkew@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/7b45b836f672 Always check against fragment length in nsTextFrame::GetContentEnd(). r=dholbert
Group: layout-core-security → core-security-release
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 121 Branch
Component: Layout → Layout: Text and Fonts

(In reply to Jonathan Kew [:jfkthame] from comment #21)

Created attachment 9363532 [details]
Bug 1826791 - Always check against fragment length in nsTextFrame::GetContentEnd(). r=dholbert,#layout

With this, I can no longer reproduce an ASAN error or crash using the
simulated-OOM patch from comment 12.

(In reply to Pulsebot from comment #22)

Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7b45b836f672
Always check against fragment length in nsTextFrame::GetContentEnd().
r=dholbert

Thanks Jonathan for the patch! After check out to the git commit, I also confirm I no longer able to reproduce the access violation on BuildTextRunsScanner::ScanFrame

Status: RESOLVED → VERIFIED

Please nominate this for ESR115 approval when you get a chance.

Flags: needinfo?(jfkthame)

Comment on attachment 9363532 [details]
Bug 1826791 - Always check against fragment length in nsTextFrame::GetContentEnd(). r=dholbert,#layout

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Possible out-of-bounds access after soft allocation failure, leading to crash, or perhaps data leakage or other kinds of exploit
  • User impact if declined: Possible vulnerability (although seems tricky to exploit reliably without simply OOM-crashing the process)
  • Fix Landed on Version: 121
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Simple bounds checks against the actually-allocated buffer size
Flags: needinfo?(jfkthame)
Attachment #9363532 - Flags: approval-mozilla-esr115?

Comment on attachment 9363532 [details]
Bug 1826791 - Always check against fragment length in nsTextFrame::GetContentEnd(). r=dholbert,#layout

Approved for 115.6esr.

Attachment #9363532 - Flags: approval-mozilla-esr115? → approval-mozilla-esr115+
Flags: sec-bounty? → sec-bounty+
Whiteboard: [reporter-external] [client-bounty-form] [verif?] → [reporter-external] [client-bounty-form] [verif?][adv-main121+]
Whiteboard: [reporter-external] [client-bounty-form] [verif?][adv-main121+] → [reporter-external] [client-bounty-form] [verif?][adv-main121+][adv-esr115.6+]
Attached file advisory.txt (obsolete) —
Attached file advisory.txt (obsolete) —
Attachment #9367984 - Attachment is obsolete: true
Attached file advisory.txt
Attachment #9368014 - Attachment is obsolete: true
Alias: CVE-2023-6858

Bulk-unhiding security bugs fixed in Firefox 119-121 (Fall 2023). Use "moo-doctrine-subsidy" to filter

Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: