Access violation on BuildTextRunsScanner::ScanFrame at layout/generic/nsTextFrame.cpp
Categories
(Core :: Layout: Text and Fonts, defect)
Tracking
()
People
(Reporter: sourc7, Assigned: jfkthame)
Details
(5 keywords, Whiteboard: [reporter-external] [client-bounty-form] [verif?][adv-main121+][adv-esr115.6+])
Attachments
(9 files, 3 obsolete files)
10.36 KB,
text/plain
|
Details | |
861 bytes,
text/html
|
Details | |
14.86 KB,
text/plain
|
Details | |
4.04 KB,
text/plain
|
Details | |
671 bytes,
patch
|
Details | Diff | Splinter Review | |
12.74 KB,
text/plain
|
Details | |
1.08 KB,
patch
|
Details | Diff | Splinter Review | |
48 bytes,
text/x-phabricator-request
|
RyanVM
:
approval-mozilla-esr115+
|
Details | Review |
184 bytes,
text/plain
|
Details |
After run setRangeText on textarea with long string, then run window.scrollByPages and new worker repeatedly on queueMicroTask, Firefox tab able to crash on BuildTextRunsScanner::ScanFrame(nsIFrame*) at layout/generic/nsTextFrame.cpp, the crash address able to change e.g. 0xeb4ff006, 0x73982014, 0x73997014, 0x9fbff006, and more.
Tested on:
- Firefox Nightly 113.0a1 (2023-04-06) (32-bit)
- Firefox 111.0.1 (32-bit)
- Firefox ESR 102.9.0esr (32-bit)
Steps to reproduce:
- Open Firefox 32-bit
- Visit attached testcase-setrangetext1.html
- Analyze the crash dump
If Firefox crash at EXCEPTION_BREAKPOINT, try again until it crash at ACCESS_VIOLATION_READ
Reporter | ||
Comment 1•2 years ago
|
||
Comment hidden (obsolete) |
Reporter | ||
Comment 3•2 years ago
|
||
Reduce pre-allocate memory from 2800 mb to 2700 mb for better reliability.
Updated•2 years ago
|
Comment 4•2 years ago
|
||
I was unable to reproduce the issue on Linux with a 32-bit ASan build, unfortunately only OOMs.
Comment 5•2 years ago
|
||
How reliable is this for you? I guess at least somewhat reliable if you reproduced on 3 different versions.
How much memory does your machine or VM have? Tyson spent quite a bit of time trying to reproduce it with no luck
Jonathan: you've looked at previous crashes in the BuildTextRunsScanner... does this stack give you any ideas of things to look at?
Updated•2 years ago
|
Comment 6•2 years ago
|
||
In a 32-bit debug build on Windows I get:
Assertion failure: aIndex < mState.mLength (bad index), at /builds/worker/checkouts/gecko/dom/base/nsTextFragment.h:219
Comment 7•2 years ago
|
||
(In reply to Daniel Veditz [:dveditz] from comment #5)
Tyson spent quite a bit of time trying to reproduce it with no luck
FWIW I was trying with a 32-bit ASan build on Linux to try to get more logs.
Assignee | ||
Comment 8•2 years ago
|
||
I suspect this is fundamentally an out-of-memory failure, but we're not handling it cleanly somewhere and leaving something (an nsTextFragment?) in an inconsistent state.
(If it's the nsTextFragment that's broken, this might be more of a DOM than Layout bug. Hard to be sure at this point, though.)
Reporter | ||
Comment 9•2 years ago
|
||
(In reply to Daniel Veditz [:dveditz] from comment #5)
How reliable is this for you? I guess at least somewhat reliable if you reproduced on 3 different versions.
The testcase is pretty reliable, when running the testcase on Firefox Nightly (32-bit) using Ryzen 7 5700G on Windows 11, on best case running this for 20x run on ffpuppet launcher I able to get 12x success with AV_READ on BuildTextRunsScanner::ScanFrame
, 5x breakpoint, and 3x no minidump, overall it got 60% success rate. On the next run I get 35% and 30% success rate on the same 20x run.
How much memory does your machine or VM have? Tyson spent quite a bit of time trying to reproduce it with no luck
I'm on Windows 11 with 64GB of RAM, I tried this recently on my i5-1035G1 with 8GB of RAM, it also able to reproduce the AV_READ.
Updated•2 years ago
|
Comment 10•2 years ago
|
||
The severity field is not set for this bug.
:jfkthame, could you have a look please?
For more information, please visit auto_nag documentation.
Reporter | ||
Comment 11•2 years ago
|
||
It's also possible to hit the ACCESS_VIOLATION_READ
on Firefox Nightly 115.0a1 (2023-05-26) (64-bit).
From the CPU register rdx = 0x00000273bd81eb00
and crash address 0x00000273bd8ff012
it looks like out-of-bounds read.
Reporter | ||
Comment 12•2 years ago
|
||
Ok great! I able to find the root cause of this!
The root cause is when JS execute textarea1.setRangeText
then Firefox code execute nsTextFragment::SetTo
to allocate hdr
buffer using nsStringBuffer::Alloc
when OOM the hdr
set to NULL
, then it directly execute return already_AddRefed(hdr);
To simulate this we can add following code on nsStringBuffer.cpp above nsStringBuffer* hdr = (nsStringBuffer*)malloc(sizeof(nsStringBuffer) + aSize);
as follow:
if(aSize == 1048858) {
printf("Hit already_AddRefed((nsStringBuffer*)NULL);\n");
return already_AddRefed((nsStringBuffer*)NULL);
}
After Firefox was compiled with AddressSanitizer, then visit the testcase-setrangetext1.html, it will show output AddressSanitizer: heap-buffer-overflow
with READ of size 2
Reporter | ||
Comment 13•2 years ago
|
||
Comment 14•2 years ago
•
|
||
Observations:
nsTextFragment::SetTo
seems to think it's fine to leave m2b as a null pointer (in case of OOM):
https://searchfox.org/mozilla-central/rev/d8c5478bd730644dab4336fdb83271ac1ca1d6b5/dom/base/nsTextFragment.cpp#299-302
m2b = nsStringBuffer::Alloc(m2bSize.value()).take();
if (!m2b) {
return false;
}
(In another case it opts to handle this by explicitly crashing). But I think it's supposed to be ~fine to leave it null, too? It looks like nsTextFragment::SetTo
does call ReleaseText
before experiencing this allocation failure, that seems to leave the fragment in a "string of length 0 with null pointer" state, and that appears to be a normal state to be in (when it does legitimately have a length of 0). So when we try and fail to set it to something non-null (without changing any other member data I think), that should be ~fine.
Then, when nsTextFrame calls into CharAt
, we immediately index off of our (null) m2b pointer (with no null-check, though we do have a debug-only assert precondition that aIndex is less than the length, which in this case won't hold up, since the ReleaseText
call in SetTo
should've left us with zero length):
https://searchfox.org/mozilla-central/rev/d8c5478bd730644dab4336fdb83271ac1ca1d6b5/dom/base/nsTextFragment.h#218-221
char16_t CharAt(uint32_t aIndex) const {
MOZ_ASSERT(aIndex < mState.mLength, "bad index");
return mState.mIs2b ? Get2b()[aIndex]
: static_cast<unsigned char>(m1b[aIndex]);
So somehow nsTextFrame
isn't realizing that its nsTextFragment
has zero length. The caller does conditionally check the length here:
https://searchfox.org/mozilla-central/rev/d8c5478bd730644dab4336fdb83271ac1ca1d6b5/layout/generic/nsTextFrame.cpp#642-644
int32_t nsTextFrame::GetContentEnd() const {
nsTextFrame* next = GetNextContinuation();
return next ? next->GetContentOffset() : TextFragment()->GetLength();
...but not if next
is truthy, which is maybe the problem here.
Comment 15•2 years ago
•
|
||
So essentially, based on that above-quoted GetContentEnd
impl: if nsTextFrame
has a next-continuation, then it seems possible that it may not realize that its TextFragment
has zero length (due to allocation failure) and hence that it shouldn't be indexed into via CharAt/etc.
It's a bit silly to be in this situation where we have zero text and yet also have a next-continuation. Maybe the allocation failure should cause us to recreate all of these frames (such that we won't have a next continuation since we don't have any text)? But that adds some complexity for a pretty-rare situation (CharacterData::SetTextInternal
, or something in its callstack, would need to detect the allocation failure and ensure that frame reconstruction happens before we touch the frame tree).
It might just be simpler and more robust just to handle this gracefully, by giving the relevant nsTextFrame code a check to test whether the text fragment has zero length? (We may need more cleanup to make sure things proceed nicely after that point; e.g. I'm not sure if some invariants would be broken elsewhere for this frame or its next continuation due to this fragment having a zero length.)
Comment 16•2 years ago
•
|
||
Actually, it's something a bit more subtle than what I described above... The allocation failure is followed by a successful allocation a little bit later, but we're left with an apparently bogus mContentOffset
value on the second nsTextFrame continuation (8344), whereas our nsTextFragment only has a length of 8324. So we've got a mContentOffset value that's out of bounds, overshooting by 20.
(Those values correspond to things in the testcase -- there are initially 19 characters in the nsTextArea (which probably rounds up to 20 via a whitespace or null terminator or something), and 8324 is the number of semicolon characters that we keep inserting at the start of the textarea via setRangeText
.)
Here's a pernosco recording of the crash that I'm currently poking at: https://pernos.co/debug/WS_u7yvuO_HvEvJEZPqjyA/index.html
Comment hidden (typo) |
Reporter | ||
Comment 19•1 year ago
|
||
(In reply to Daniel Holbert [:dholbert] from comment #15)
It might just be simpler and more robust just to handle this gracefully, by giving the relevant nsTextFrame code a check to test whether the text fragment has zero length? (We may need more cleanup to make sure things proceed nicely after that point; e.g. I'm not sure if some invariants would be broken elsewhere for this frame or its next continuation due to this fragment having a zero length.)
I found by adding return char16_t{0};
when the aIndex > mState.mLength
on CharAt
code, I no longer able to reproduce the EXCEPTION_ACCESS_VIOLATION_READ
on BuildTextRunsScanner::ScanFrame
, tried replaying many times and still did not receive that, which is great.
char16_t CharAt(uint32_t aIndex) const {
MOZ_ASSERT(aIndex < mState.mLength, "bad index");
if(aIndex > mState.mLength) {
return char16_t{0}; // Represents "null" or no value
}
return mState.mIs2b ? Get2b()[aIndex]
: static_cast<unsigned char>(m1b[aIndex]);
}
However, does it may display the null text value to the user? or other unwanted behavior? I think the more good approach is when CharAt
detect aIndex > mState.mLength
then it return false
so other code able to output console.error
to user i.e. uncaught exception: out of memory
.
I think it's a good start knowing that by adding return char16_t{0};
when the aIndex > mState.mLength
on CharAt
code is resolved the access violation on ScanFrame
, thanks!
Assignee | ||
Comment 20•1 year ago
|
||
It looks to me like we can protect against this by making nsTextFrame::GetContentEnd()
check the fragment length in addition to looking at the next-continuation's offset, so that in the case where OOM during text fragment updates left things in an unexpected state, we don't proceed to look beyond the end of the fragment that actually exists. This should be lower-cost than adding a bounds check to every CharAt
call.
In the pernosco recording from comment 16, this would mean that HasTerminalNewline()
does not try to look beyond the end of the fragment.
(There's a similar pattern in BuildTextRunsScanner::MappedFlow::GetContentEnd()
where we should do the same thing, I guess, though I haven't specifically hit this case in testing.)
Just trying a local ASAN build for some further testing....
Assignee | ||
Comment 21•1 year ago
|
||
With this, I can no longer reproduce an ASAN error or crash using the
simulated-OOM patch from comment 12.
(In a debug build we still spew a bunch of [non-fatal] NS_ASSERTIONs about
"negative length" and "overlapping or discontiguous frames" after forcing
the string allocation failure, which is unsurprising given that an update
to the underlying data has failed, but appears to be handled safely.)
Updated•1 year ago
|
Comment 22•1 year ago
|
||
Comment 23•1 year ago
|
||
Updated•1 year ago
|
Updated•1 year ago
|
Reporter | ||
Comment 24•1 year ago
|
||
(In reply to Jonathan Kew [:jfkthame] from comment #21)
Created attachment 9363532 [details]
Bug 1826791 - Always check against fragment length in nsTextFrame::GetContentEnd(). r=dholbert,#layoutWith this, I can no longer reproduce an ASAN error or crash using the
simulated-OOM patch from comment 12.
(In reply to Pulsebot from comment #22)
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7b45b836f672
Always check against fragment length in nsTextFrame::GetContentEnd().
r=dholbert
Thanks Jonathan for the patch! After check out to the git commit, I also confirm I no longer able to reproduce the access violation on BuildTextRunsScanner::ScanFrame
Updated•1 year ago
|
Comment 25•1 year ago
|
||
Please nominate this for ESR115 approval when you get a chance.
Assignee | ||
Comment 26•1 year ago
|
||
Comment on attachment 9363532 [details]
Bug 1826791 - Always check against fragment length in nsTextFrame::GetContentEnd(). r=dholbert,#layout
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: Possible out-of-bounds access after soft allocation failure, leading to crash, or perhaps data leakage or other kinds of exploit
- User impact if declined: Possible vulnerability (although seems tricky to exploit reliably without simply OOM-crashing the process)
- Fix Landed on Version: 121
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Simple bounds checks against the actually-allocated buffer size
Comment 27•1 year ago
|
||
Comment on attachment 9363532 [details]
Bug 1826791 - Always check against fragment length in nsTextFrame::GetContentEnd(). r=dholbert,#layout
Approved for 115.6esr.
Comment 28•1 year ago
|
||
uplift |
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Comment 29•1 year ago
|
||
Comment 30•1 year ago
|
||
Comment 31•1 year ago
|
||
Updated•1 year ago
|
Comment 32•7 months ago
|
||
Bulk-unhiding security bugs fixed in Firefox 119-121 (Fall 2023). Use "moo-doctrine-subsidy" to filter
Updated•6 months ago
|
Description
•