[32-bit] Access violation on nsExpatDriver::HandleError -> AppendErrorPointer
Categories
(Core :: XML, defect)
Tracking
()
People
(Reporter: sourc7, Assigned: shravanrn)
References
Details
(5 keywords, Whiteboard: [reporter-external] [client-bounty-form] [verif?][adv-main113+][adv-ESR102.11+][qa-triaged])
Attachments
(6 files)
729.79 KB,
text/html
|
Details | |
3.58 KB,
text/plain
|
Details | |
863.40 KB,
text/html
|
Details | |
3.58 KB,
text/plain
|
Details | |
48 bytes,
text/x-phabricator-request
|
RyanVM
:
approval-mozilla-beta+
RyanVM
:
approval-mozilla-esr102+
tjr
:
sec-approval+
|
Details | Review |
110 bytes,
text/plain
|
Details |
When running new DOMParser().parseFromString(document.documentElement.outerHTML, 'application/xml')
on large HTML continuously in loop, Firefox 32-bit able to crash with AV_READ
on nsExpatDriver::HandleError()
at changeable address that depends on DOM size or the length of HTML.
When running the testcase on Firefox single process ./mach run --disable-e10s
then attach gdb debugger, the crash is shown at following code:
954 // Last character will be '^'.
955 int32_t last = aColNumber - 1;
956 int32_t i;
957 uint32_t minuses = 0;
958 for (i = 0; i < last; ++i) {
→ 959 if (aSourceLine[i] == '\t') {
960 // Since this uses |white-space: pre;| a tab stop equals 8 spaces.
961 uint32_t add = 8 - (minuses % 8);
962 aSourceString.AppendASCII("--------", add);
963 minuses += add;
964 } else {
It looks like Firefox tried to accessing out-of-bounds element on aSourceLine
array. In case the aSourceLine
array address is 0x25ce0008
then the crash address is 0x25cff000
.
Tested on:
- Firefox Nightly 113.0a1 (2023-03-27) (32-bit) on Windows 11
- Firefox 111.0.1 (32-bit) on Windows 11
- Firefox ESR 102.9.0esr (32-bit) on Windows 11
Steps to reproduce:
- Visit attached testcase-0x2.html
- After a few seconds, the Firefox tab crashes
If Firefox tab doesn't crash after a while, try close the tab and try again.
Reporter | ||
Comment 1•2 years ago
|
||
Reporter | ||
Comment 2•2 years ago
|
||
Reporter | ||
Comment 3•2 years ago
|
||
Updated•2 years ago
|
Comment 4•2 years ago
|
||
peterv, you might be familiar with the relevant code.
Updated•2 years ago
|
Comment 5•2 years ago
|
||
I see some OOM crashes here in crash-stats, but that's a bit different from an access violation. Two examples in this area are bug 1674989 and https://crash-stats.mozilla.org/signature/?proto_signature=~nsExpatDriver%3A%3AHandleError&date=%3E%3D2023-03-21T14%3A24%3A00.000Z&date=%3C2023-03-28T14%3A24%3A00.000Z&_sort=-date&signature=OOM%20%7C%20large%20%7C%20NS_ABORT_OOM%20%7C%20nsTSubstring%3CT%3E%3A%3AAllocFailed%20%7C%20nsTSubstring%3CT%3E%3A%3AAppend%20%7C%20AppendErrorPointer
Updated•2 years ago
|
Updated•2 years ago
|
Reporter | ||
Comment 6•2 years ago
|
||
When visiting the testcase on Firefox 64-bit, it also possible to crash on out-of-bounds address.
But the Firefox tab process have to fill OS free memory which take a couple of seconds, it's faster to see the crash on lower system RAM, when the crash doesn't occur after the page has loaded, try reloading the page until it crashes.
Comment 7•2 years ago
|
||
Marking S2, since I expected rlbox to catch this (or maybe this is how rlbox catching it manifests?). peterv, please downgrade if this should be S3 instead.
Comment 9•2 years ago
|
||
Looking at this briefly, the situation appears to be that we accept an integer |colNumber| from the RLBoxed expat without verifying it, then pass it to AppendErrorPointer, where it's used to index an array.
Which is not good, since we're supposed to assume that |colNumber| is potentially attacker-controlled. I'm guessing we missed this because the earlier usage of |colNumber| just passes it into a format string. Luckily it's only a read, and I don't see any obvious way to exploit it beyond triggering a segfault.
(In reply to Andrew McCreight [:mccr8] from comment #8)
This could be similar to bug 1825445.
Yes, looks like Irvan is poking at RLBox and trying to break it, which is awesome — thanks Irvan! By the way — I assume you've seen our Exploit Mitigation Bug Bounty Program, but I wanted to make sure you're aware that it's permissible to explicitly introduce a memory corruption flaw into the sandboxed library.
Assignee | ||
Comment 10•2 years ago
|
||
@bholley This does appear to be removing tainting early. Investigating more now.
Assignee | ||
Comment 11•2 years ago
•
|
||
In my attempt to repro the crash, I am only seeing valid OOMs from Firefox. That said, after examination of the code, it is entirely possible that the code path mentioned in the original post could have occurred. I am working a patch to ensure the codepath mentioned in the original post (and other similar codepaths in this file) does not lead to OOB access
@peterv: I'm marking myself as the Assignee for this bug. Let me know if you have any concerns
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 12•2 years ago
|
||
Assignee | ||
Comment 13•2 years ago
•
|
||
Patch is up for this. Copying over the comment I had in the phab here, so we have a record
When investigating Bug 1824892 I saw some other anti-pattern instances of removing tainting earlier. While, for the most, the issue identified in Bug 1824892 is the only major OOB read (apart from non-null terminated string reads, which we are ok with in this context), I did more generally try to fix any anti-patterns I saw in this file.
Additionally, I did want to discuss a separate question
@bholley: Typically bugs like this are exploitable only if we find a memory safety violation in the sandboxed code. However, here, we seem to be potentially triggering the same code path because, we don't abort on sandbox OOM. This makes sense because of how Wasm sandboxed code is setup. Today, if a Wasm sandbox attempts to grow (to support a malloc call from sandboxed code) and there is no more space, the malloc returns null and the sandboxed program continuous to execute (Most of this code doesn't check that malloc return nonnull). This continued execution corrupts the internal sandbox memory, and the sandbox returns garbage. Assuming correct tainted checks, this shouldn't be an issue. However, I was thinking we may want to employ a defense in depth here.
Should we consider "abort on memorygrowth/malloc fail" behavior to avoid scenarios like this in the future.
We do interpose on memory growth/malloc failures of sandboxed code, but we use this as an only to annotate crash reports.
Updated•2 years ago
|
Comment 14•2 years ago
|
||
(In reply to Shravan Narayan from comment #13)
Should we consider "abort on memorygrowth/malloc fail" behavior to avoid scenarios like this in the future.
We do interpose on memory growth/malloc failures of sandboxed code, but we use this as an only to annotate crash reports.
(discussed elsewhere)
Comment 15•2 years ago
•
|
||
Backed out for crashes on nsExpatDriver.cpp.
Backout link: https://hg.mozilla.org/integration/autoland/rev/14b2b1ada6301a671e004fef1899cce5fb56577f
Log link: https://treeherder.mozilla.org/logviewer?job_id=413091487&repo=autoland&lineNumber=2650"
Updated•2 years ago
|
Assignee | ||
Comment 16•2 years ago
|
||
I have updated the patch now. Waiting for tests to complete. Assuming they pass, we can land this
Comment 17•2 years ago
|
||
I'm not sure it exactly matters given that it already landed, but this should have gotten sec-approval prior to the initial landing.
Comment 18•2 years ago
|
||
We should still do the sec-approval request on this for posterity. Also, we're nearing the end of the cycle before building RCs next week, so we'll need approval requests on this soon.
Assignee | ||
Comment 19•2 years ago
|
||
Comment on attachment 9328469 [details]
Bug 1824892: Fix tainting use in the nsExpatDriver r=glandium,bholley
Security Approval Request
- How easily could an exploit be constructed based on the patch?: This allows a memory safety error in expat to potentially lead to an out of bounds read. However, this process contains no secrets, so this is not the worst cased.
- Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: Unknown
- Which older supported branches are affected by this flaw?:
- If not all supported branches, which bug introduced the flaw?: None
- Do you have backports for the affected branches?: No
- If not, how different, hard to create, and risky will they be?:
- How likely is this patch to cause regressions; how much testing does it need?: Unlikely
- Is Android affected?: Yes
Updated•2 years ago
|
Assignee | ||
Comment 20•2 years ago
•
|
||
Comment on attachment 9328469 [details]
Bug 1824892: Fix tainting use in the nsExpatDriver r=glandium,bholley
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: Low risk and at worst will help avoid crashes in low memory env.
- User impact if declined:
- Fix Landed on Version:
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Minimum change that only prevents error paths
Beta/Release Uplift Approval Request
- User impact if declined:
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: Yes
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Minimum change that only prevents error paths
- String changes made/needed:
- Is Android affected?: Yes
Comment 21•2 years ago
|
||
Comment on attachment 9328469 [details]
Bug 1824892: Fix tainting use in the nsExpatDriver r=glandium,bholley
approved to land
![]() |
||
Comment 22•2 years ago
|
||
Fix tainting use in the nsExpatDriver r=glandium
https://hg.mozilla.org/integration/autoland/rev/c7a3230cea33b1d1ea2cf9441c3abe8d678ca458
https://hg.mozilla.org/mozilla-central/rev/c7a3230cea33
Comment 23•2 years ago
|
||
Comment on attachment 9328469 [details]
Bug 1824892: Fix tainting use in the nsExpatDriver r=glandium,bholley
Approved for 113.0b9 and 102.11esr.
Comment 24•2 years ago
|
||
uplift |
Comment 25•2 years ago
|
||
uplift |
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Comment 26•2 years ago
|
||
I managed to reproduce the crash following the steps mentioned in the description of the bug on Windows 11 on both Firefox Nightly 113.0a1 (2023-03-27) (32-bit) and (64-bit).
I still could reproduce the crash on all the latest Firefox versions(Nightly 114.0a1(2023-05-01), Firefox 113-candidate and on Firefox 102.11esr) on both Firefox 32-bit and 64-bit.
Irvan, could you please check if the issue is reproducible also on your end or if there are other steps not mentioned that I should follow.
Thanks.
Updated•2 years ago
|
Reporter | ||
Comment 27•2 years ago
|
||
(In reply to Hani Yacoub, Desktop QA from comment #26)
I managed to reproduce the crash following the steps mentioned in the description of the bug on Windows 11 on both Firefox Nightly 113.0a1 (2023-03-27) (32-bit) and (64-bit).
I still could reproduce the crash on all the latest Firefox versions(Nightly 114.0a1(2023-05-01), Firefox 113-candidate and on Firefox 102.11esr) on both Firefox 32-bit and 64-bit.
Irvan, could you please check if the issue is reproducible also on your end or if there are other steps not mentioned that I should follow.
Thanks.
After testing this repeatedly on Firefox Nightly 114.0a1 (2023-05-01) (32-bit) on Windows 11, I'm no longer able to crash the tab on EXCEPTION_ACCESS_VIOLATION_READ
(as on attached minidump.txt), the testcase still able to crash the tab, but now it crashed at EXCEPTION_BREAKPOINT
on nsExpatDriver.cpp:977 at below assertion code:
MOZ_RELEASE_ASSERT(val <= aSourceLineLength,
"Unexpected value of last column");
The assertion was added recently from current bug patch which intended to fix the AV_READ
OOB issue.
Updated•2 years ago
|
Assignee | ||
Comment 28•2 years ago
|
||
Hmm... this is interesting. In theory this shouldn't happen, but given that description we can at least confirm that RLBox is now operating as expected and stopping the issue from escalating into anything serious. I will try to look into the root cause of why the MOZ_RELEASE_ASSERT is being hit.
Comment 29•2 years ago
|
||
After testing this repeatedly on Firefox Nightly 114.0a1 (2023-05-01) (32-bit) on Windows 11, I'm no longer able to crash the tab on
EXCEPTION_ACCESS_VIOLATION_READ
(as on attached minidump.txt), the testcase still able to crash the tab, but now it crashed atEXCEPTION_BREAKPOINT
on nsExpatDriver.cpp:977 at below assertion code:MOZ_RELEASE_ASSERT(val <= aSourceLineLength, "Unexpected value of last column");
The assertion was added recently from current bug patch which intended to fix the
AV_READ
OOB issue.
The crash isn't reproducible on my end using Firefox Nightly 113.0a1 (2023-03-27) (32-bit) and (64-bit) with Windows 11, accessing both of the 2 test cases attached (testcase-0x2-minidump.txt and testcase-0x4-minidump.txt).
Reporter | ||
Comment 30•2 years ago
|
||
(In reply to Hani Yacoub, Desktop QA from comment #29)
The crash isn't reproducible on my end using Firefox Nightly 113.0a1 (2023-03-27) (32-bit) and (64-bit) with Windows 11, accessing both of the 2 test cases attached (testcase-0x2-minidump.txt and testcase-0x4-minidump.txt).
No, the minidump.txt is crash information output from the Firefox crash, I was getting the output.txt when visiting the testcase-0x2.html or testcase-0x4.html by launching Firefox Nightly 113.0a1 (2023-03-27) (32-bit) through FFPuppet.
(In reply to Hani Yacoub, Desktop QA from comment #26)
I still could reproduce the crash on all the latest Firefox versions(Nightly 114.0a1(2023-05-01), Firefox 113-candidate and on Firefox 102.11esr) on both Firefox 32-bit and 64-bit.
Can you submit the crash reports (from about:crashes) and share the crash report link? I think it will be the same as Comment 27 crash at MOZ_RELEASE_ASSERT(val <= aSourceLineLength, "Unexpected value of last column");
or another EXCEPTION_BREAKPOINT
Comment 31•2 years ago
|
||
The crash report on Firefox 113.0 (32-bit): https://crash-stats.mozilla.org/report/index/c5bafb28-6c61-46d5-8604-b07d80230503
The crash report on Firefox 102.11.0esr (32-bit): https://crash-stats.mozilla.org/report/index/d1c4b84b-a615-4e61-88e5-e30530230503
I didn't manage to catch the crash on Firefox Nightly this time, the page kept loading without crashing.
Comment 32•2 years ago
|
||
(In reply to Hani Yacoub, Desktop QA from comment #31)
The crash report on Firefox 113.0 (32-bit): https://crash-stats.mozilla.org/report/index/c5bafb28-6c61-46d5-8604-b07d80230503
The crash report on Firefox 102.11.0esr (32-bit): https://crash-stats.mozilla.org/report/index/d1c4b84b-a615-4e61-88e5-e30530230503
I didn't manage to catch the crash on Firefox Nightly this time, the page kept loading without crashing.
These are both OOM crashes, different from what Irvan is describing in comment 27.
Regardless, I don't think we need Hani to reproduce that failure just yet. I think the next step is for Shravan to investigate as he offered to do in comment 28.
Assignee | ||
Comment 33•2 years ago
|
||
I did get the MOZ_RELEASE_ASSERT(val <= aSourceLineLength, "Unexpected value of last column");
to hit, although it is inconsistent. I am investigating further.
Comment 34•2 years ago
|
||
Updated•2 years ago
|
Assignee | ||
Comment 35•2 years ago
|
||
Hmm... this is interesting. In theory this shouldn't happen, but given that description we can at least confirm that RLBox is now operating as expected and stopping the issue from escalating into anything serious. I will try to look into the root cause of why the MOZ_RELEASE_ASSERT is being hit.
I did get the MOZ_RELEASE_ASSERT(val <= aSourceLineLength, "Unexpected value of last column"); to hit, although it is inconsistent. I am investigating further.
This issue and fix is now being tracked in Bug 1831654.
Updated•1 year ago
|
Updated•8 months ago
|
Description
•