Bug 702466 (CVE-2012-0449)

Crash with EIP=0x00000000 when parsing SVG+XSLT

VERIFIED FIXED

Status

()

Core
XSLT
--
major
VERIFIED FIXED
6 years ago
3 years ago

People

(Reporter: Nicolas Grégoire, Assigned: johns)

Tracking

({verified-aurora, verified-beta, verified1.9.2})

Trunk
verified-aurora, verified-beta, verified1.9.2
Points:
---
Bug Flags:
sec-bounty +

Firefox Tracking Flags

(firefox9 wontfix, firefox10+ fixed, firefox11+ verified, firefox12+ verified, firefox-esr1010+ verified, blocking1.9.2 .26+, status1.9.2 .26-fixed)

Details

(Whiteboard: [sg:critical][see comment 6 - comment 9 for what's busted][qa!], crash signature)

Attachments

(7 attachments, 5 obsolete attachments)

233 bytes, image/svg+xml
Details
20.16 KB, text/plain
Details
79.54 KB, image/png
Details
1.10 KB, patch
sicking
: review+
Details | Diff | Splinter Review
14.36 KB, patch
Details | Diff | Splinter Review
8.96 KB, patch
dveditz
: review+
Details | Diff | Splinter Review
1.32 KB, patch
dbaron
: review+
Details | Diff | Splinter Review
(Reporter)

Description

6 years ago
Created attachment 574456 [details]
Minimized SVG file (load from local file to test)

When opening a specific SVG file (created by a mutation fuzzer), the browser will crash with EIP=0x00000000. This was tested on Firefox 3.6.24 / Ubuntu 10.04 LTS and Firefox 8.0 / WinXP SP3.

Under Linux : a call to *(%eax) at  _ZN20txStylesheetCompiler15flushCharactersEv+26 will set EIP to 0x000000. An annotated gdb session is attached. A similar behavior is constated on Firefox 8.0 (a screenshot of windbg !exploitable is attached too).
(Reporter)

Comment 1

6 years ago
Created attachment 574457 [details]
GDB session showing the crash on FF 3.6.24-dbg / Ubuntu
(Reporter)

Comment 2

6 years ago
Created attachment 574458 [details]
WinGDB !exploitable screenshot
I'm not seeing a crash on current trunk (on Mac).  Just an XML parser error from the failed XSLT load....
Component: SVG → XSLT
QA Contact: general → xslt
I initially just saw an XML parser error, too, but then I was able to crash.
bp-890b1eac-dd1e-4bd9-b3d8-2660e2111114

With a locally-saved version of the testcase, I was able to crash on the first try w/ a fresh profile, in my nightly as well as in my debug build.
Status: UNCONFIRMED → NEW
Ever confirmed: true
We crash here:
> 494         rv = (mHandlerTable->mTextHandler)(mCharacters, *this);
http://hg.mozilla.org/mozilla-central/annotate/b51eb4007926/content/xslt/src/xslt/txStylesheetCompiler.cpp#l494

...because mHandlerTable->mTextHandler is a null pointer.
Hardware: x86 → All
Version: 8 Branch → Trunk
OK, this looks kinda bad...

We get the faulty mHandlerTable (mentioned in comment 5) from a call to "popHandlerTable", which pops a raw pointer off of "mOtherStack" (a stack of void*'s) and casts this raw pointer to a txHandlerTable*.

However, the raw pointer is *not* really a txHandlerTable* -- it's in fact a txElementHandler*, which was pushed onto the stack here:
http://hg.mozilla.org/mozilla-central/annotate/eb84780783ed/content/xslt/src/xslt/txStylesheetCompiler.cpp#l345

So we're effectively casting a txElementHandler* to a txHandlerTable*, and then trying to work with the txHandlerTable* pointer, and crashing.
Note that we never actually get a call to "pushHandlerTable" on this testcase (though we do get popHandlerTable as noted above.  I don't know our XSLT code very well, but I'd expect that we'd intend for our pushHandlerTable/popHandlerTable calls to be balanced...
So the evil popHandlerTable() call comes from txFnEndEmbed(), which calls txFnEndStylesheet(), which calls popHandlerTable().

Balancing this, we have an earlier call to txFnStartEmbed(), which in an ideal world would call txFnStartStylesheet(), which would call pushHandlerTable().

(this is all nicely balanced, with just "start" vs "end" and "push" vs "pop")

However, with this bug's testcase, txFnStartEmbed takes an early-return, because 'aState.handleEmbeddedSheet()' fails.  So basically, we're in a sort of error state, and we bail out early & never get to the pushHandlerTable() call.

So, it looks like we need to detect this same error state when we're cleaning up, so we can skip the unbalanced popHandlerTable call there.
Ah, so we actually do have the same "handleEmbeddedSheet()" check in txFnEndEmbed() -- but we pass that check there, because the state has changed between txFnStartEmbed and txFnEndEmbed.

In particular, txStylesheetCompiler::startElementInternal() sets mEmbedStatus = eInEmbed...
http://hg.mozilla.org/mozilla-central/annotate/eb84780783ed/content/xslt/src/xslt/txStylesheetCompiler.cpp#l317
...which will make "handleEmbeddedSheet()" pass, when we get to txFnEndEmbed().
Beyond the above, I don't really understand what's going on here or what the right fix is, since this XSLT code is pretty complex and not-really-touched-for-a-long-time.

I'm hoping that perhaps sicking knows what to do (or can suggest what to do)... Hopefully the info I've posted above will help him or someone else familiar with this code to figure out what needs fixing. :)
Whiteboard: [see comment 6 - comment 9 for what's busted]
Is it possible to get a non-null txElementHandler* ? This is a very small testcase so it's hard to tell if it's null because that's the only bogus value possible or just due to the testcase.
Duplicate of this bug: 701806
Crash Signature: [@ @0x0 | txStylesheetCompiler::flushCharacters ]
John will be looking into this bug, re-assigning.
Assignee: nobody → jschoenick
status1.9.2: --- → wanted
status-firefox10: --- → affected
status-firefox11: --- → affected
status-firefox12: --- → affected
status-firefox9: --- → affected
Sicking agrees this is probably exploitable -> sg:critical
Whiteboard: [see comment 6 - comment 9 for what's busted] → [sg:critical][see comment 6 - comment 9 for what's busted]
Unrealistic to fix for fx10 with the release next week (builds already in QA hands I think) but we need to fix by Fx11. Discovered independently by two researchers on the same weekend, we should assume it's fairly easy to fine should anyone else also be poking in this area.
status-firefox9: affected → wontfix
tracking-firefox10: --- → -
tracking-firefox11: --- → +
tracking-firefox12: --- → +
I want to understand something here:  Are we planning to ship with this bug unfixed because of the train schedule?
This affects 1.9.2.26 as well. We should respin to take this if we're going to take it in the other simultaneous releases.
(Assignee)

Comment 21

5 years ago
Created attachment 592312 [details] [diff] [review]
WIP - Add type checking to pushPtr/popPtr in XSLT code

I'm not sure I like this, but it works. Right now it just calls NS_ABORT_IF_FALSE, but it should ideally gracefully fail on the xslt compile. The string typenames should probably be swapped with an enum or somesuch.

I had a nicer version that used macros to avoid having any additional arguments to pushPtr, but I gave up on finding a simple way to tokenize the typenames in a macro.
Comment on attachment 592312 [details] [diff] [review]
WIP - Add type checking to pushPtr/popPtr in XSLT code

Yeah, something like an enum is needed here. These string operations will be too slow.
(Assignee)

Comment 23

5 years ago
Created attachment 592333 [details] [diff] [review]
Fix state corruption when embedded stylesheet is of wrong node type

The issue as I understand it is this:
1) mEmbedStatus is eNeedEmbed, as the stylesheet declaration referenced an embedded stylesheet
2) txFnStartEmbed handler is matched for a parent element, but because eInEmbed was not set it leaves without touching it.
3) An element with the ID matching said embedded stylesheet is found, and mEmbedStatus is set to eInEmbed
4) The element does not otherwise match an embedded stylesheet, so txFnStartEmbed is not one of the handlers that gets called. mEmbedStatus is still eInEmbed
5) The stack begins unwinding, calling endElement on all elements
6) Because mEmbedStatus is still eInEmbed, the exit handler for the embed handler on the stack from part 2 believes it is exiting from *it's* embed, and calls txFnEndStylesheet.
7) Mismatched stack.

This patch should fix this by adding a new state, eTryEmbed, which is set before calling the element handlers. If a startembed handler is reached, it calls startEmbedding() to move state into inEmbed, otherwise the startElement call moves state back to eNeedEmbed to attempt to match other elements (the XML document is invalid at this point, so we could also move to eNoEmbed).

... That is, assuming I fully understand what is going on here.
Attachment #592333 - Flags: feedback?(jonas)
(Assignee)

Comment 24

5 years ago
Created attachment 592334 [details] [diff] [review]
Fix state corruption when embedded stylesheet is of wrong node type

(Attached wrong version of patch)
Attachment #592333 - Attachment is obsolete: true
Attachment #592333 - Flags: feedback?(jonas)
Attachment #592334 - Flags: feedback?(jonas)
(Assignee)

Comment 25

5 years ago
Created attachment 592350 [details] [diff] [review]
Fix state corruption when embedded stylesheet is of wrong node type
Attachment #592334 - Attachment is obsolete: true
Attachment #592334 - Flags: feedback?(jonas)
(Assignee)

Comment 26

5 years ago
Created attachment 592357 [details] [diff] [review]
Fix state corruption by directing LRE nodes through the handler

After talking to Jonas, he believes the only way the handler is skipped is due to LREs. Because the handler checks namespace, directing LREs to the embed handler should also fix this issue. This patch updates the HandlerTable to point at Stop/EndEmbed regardless.
(Assignee)

Comment 27

5 years ago
Created attachment 592358 [details] [diff] [review]
Validate stack usage throughout the compiler

This validates the stack pushes/pops throughout the parser - which would catch other similar unforseen edge cases.
Attachment #592312 - Attachment is obsolete: true
Attachment #592350 - Attachment is obsolete: true
(Assignee)

Updated

5 years ago
Attachment #592357 - Flags: review?(jonas)
(Assignee)

Updated

5 years ago
Attachment #592358 - Flags: review?(jonas)
Comment on attachment 592357 [details] [diff] [review]
Fix state corruption by directing LRE nodes through the handler

Review of attachment 592357 [details] [diff] [review]:
-----------------------------------------------------------------

Yeah, this should work fine. I'm pretty sure this was how I intended it to be written originally.
Attachment #592357 - Flags: review?(jonas) → review+
Comment on attachment 592358 [details] [diff] [review]
Validate stack usage throughout the compiler

Review of attachment 592358 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #592358 - Flags: review?(jonas) → review+
Either of these patches seem very safe. Safe enough that I think we should take them in FF10.

The variable-stack patch is probably slightly safer in that it won't affect any behavior other than when things have gone really really bad and we'd likely crash (exploitably) anyway.

The LRE patch will only adjust behavior in very rare conditions, even for XSLT usage. It only affects embedded XSLT stylesheets which I don't think I've never seen in the wild, and it only does so with erroneous such stylesheets. In fact, we've likely always crashed for such stylesheets so it's unlikely going to affect anyone in a negative way.
(In reply to Jonas Sicking (:sicking) from comment #30)
> Either of these patches seem very safe. Safe enough that I think we should
> take them in FF10.

Are both of these patches together necessary to plug the exploit hole? It's not clear if they're both targeting the same problem or serving different purposes. If it's the same problem, I'd prefer to take less code change.

We should also check to make sure that these patches apply to the 1.9.2 branch (marked as affected).
(Assignee)

Comment 32

5 years ago
Either patch by itself will fix the security hole, but both should eventually be landed.

The "Validate stack usage" patch adds more safety to the XSLT compiler by type checking all elements pushed/pop'd from it's stack. Thus, stack desync in the compiler (such as caused by this bug) just cause an un-exploitable browser abort:

> ###!!! ABORT: Expected type does not match top element type on stack: 'type == aType', file /home/johns/moz/mozilla-central/content/xslt/src/xslt/txStylesheetCompiler.cpp, line 727

The other patch fixes this specific bug, but doesn't address the lack of safety for the stack, meaning any undiscovered bugs in the compiler could easily become exploitable.

I'll ensure these apply to 1.9.2 and attach an alternate version if necessary
(Assignee)

Comment 33

5 years ago
Created attachment 592363 [details] [diff] [review]
Validate stack usage throughout the compiler. r=sicking

Fix commit message
Attachment #592358 - Attachment is obsolete: true
(Assignee)

Comment 34

5 years ago
Created attachment 592367 [details] [diff] [review]
[1.9.2] Validate stack usage throughout the compiler

Rebased the validate patch on 1.9.2 -- I don't have a working 3.6 build environment right now, and it's 11pm, so I can't test it other than the-involved-files-still-compile.

The other patch applies cleanly to 1.9.2
Pushed the "Validate stack usage throughout the compiler" patch as https://hg.mozilla.org/mozilla-central/rev/dba0b31edfbf
Attachment #574456 - Attachment description: Minimized SVG file → Minimized SVG file (load from local file to test)
I applied the 1.9.2 patch and it does convert the crash from this exploitable one to the ABORT mentioned in comment 32 -- in a debug build. but NS_ABORT_IF_FALSE is debug-only so I'm not sure this fixed anything. About to test the LRE patch.
Created attachment 592444 [details] [diff] [review]
Abort in release builds too
Attachment #592444 - Flags: review?(dbaron)
Comment on attachment 592444 [details] [diff] [review]
Abort in release builds too

r=dbaron (trusting Jonas that the *intent* was to abort in release builds)
Attachment #592444 - Flags: review?(dbaron) → review+
Checked in the "Abort in release builds" patch.

https://hg.mozilla.org/mozilla-central/rev/79d0b6168a53
I don't think the LRE patch made any difference. Either I crash on the abort added in the first patch, or if I comment that out I get basically the same stack as the one Nicolas got on Ubuntu with 3.6.24. The only difference is mine ends at txStylesheetCompiler::endElement (txStylesheetCompiler.cpp:379), his goes one more frame to txStylesheetCompiler::flushCharacters.

If you convert the NS_ABORT_IF_FALSE to NS_RUNTIMEABORT as in Jonas's patch then that's where we die, only under control.

This is throwing ourselves on the grenade to save our users. Should we leave this bug open until we get a fix for whatever is causing the stack to get unbalanced, clone this into a new bug for the now non-exploitable DoS (my preference), or just forget it as too rare to worry about?
Comment on attachment 592444 [details] [diff] [review]
Abort in release builds too

[Triage Comment]
Approved for 1.9.2.26 (applies with fuzz), a=dveditz
Approved for Aurora and Beta.
Attachment #592444 - Flags: approval1.9.2.26+
Attachment #592444 - Flags: approval-mozilla-beta+
Attachment #592444 - Flags: approval-mozilla-aurora+
Comment on attachment 592367 [details] [diff] [review]
[1.9.2] Validate stack usage throughout the compiler

r=dveditz for the backport
approved for 1.9.2.26, a=dveditz
Attachment #592367 - Flags: review+
Attachment #592367 - Flags: approval1.9.2.26+
(Assignee)

Comment 43

5 years ago
Daniel - the crash in this bug should be fixed entirely by the LRE patch, whereas any other bugs that break the stack trigger the fall-on-the-grenade behaviour (that is, the abort should never be 'proper' behaviour).

If the LRE patch on 3.6 is still reaching an abort case, it's possible there's a separate issue there, which we should spin off into a new bug. The abort patch should mean it is a DoS at worst.
Landed on 1.9.2
  http://hg.mozilla.org/releases/mozilla-1.9.2/rev/4e2214da4a82
  http://hg.mozilla.org/releases/mozilla-1.9.2/rev/45b10d6e1176

and GECKO19226_2012012406_RELBRANCH
  http://hg.mozilla.org/releases/mozilla-1.9.2/rev/2202afecc4e7
  http://hg.mozilla.org/releases/mozilla-1.9.2/rev/308e9289894d

Marking status1.9.2 and status-firefox12 "fixed" to represent the security bug fixes that landed.
status1.9.2: wanted → .26-fixed
status-firefox12: affected → fixed
Thanks for fixing the patches and verifying that the security hole is now plugged up, Dan!

John - would you mind putting together a list of areas (and strategies) that QA should focus on when testing for functional regressions?
(In reply to John Schoenick [:johns] from comment #43)
> If the LRE patch on 3.6 is still reaching an abort case,

Turns out I hand-applied it wrong ("just one line," I thought, "how hard can it be?"). Actually using 'patch' did work to stop the crash on 3.6.
Comment on attachment 592357 [details] [diff] [review]
Fix state corruption by directing LRE nodes through the handler

[Triage Comment]
Talked this over with akeybl and we're going to approve the LRE fix for all the branches, too.
Attachment #592357 - Flags: approval1.9.2.26+
Attachment #592357 - Flags: approval-mozilla-beta+
Attachment #592357 - Flags: approval-mozilla-aurora+
Comment on attachment 592363 [details] [diff] [review]
Validate stack usage throughout the compiler. r=sicking

[Triage Comment]
Approved for Aurora and Beta.
Attachment #592363 - Flags: approval-mozilla-beta+
Attachment #592363 - Flags: approval-mozilla-aurora+
Checked in the LRE fix [attachment 592357 [details] [diff] [review]] on 1.9.2 and the 1.9.2.26 relbranch

http://hg.mozilla.org/releases/mozilla-1.9.2/rev/177e42a7014e
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/18ac94cc350c
And the LRE fix checked into m-c
https://hg.mozilla.org/mozilla-central/rev/aee879a3190a

both m-c and 1.9.2 now have three patches applied and should be done.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Alias: CVE-2012-0449
http://hg.mozilla.org/releases/mozilla-aurora/rev/4a961cfc4408
http://hg.mozilla.org/releases/mozilla-aurora/rev/1c2291b526cb
http://hg.mozilla.org/releases/mozilla-aurora/rev/95dd0a69c672

http://hg.mozilla.org/releases/mozilla-beta/rev/97e6aa5f0523
http://hg.mozilla.org/releases/mozilla-beta/rev/26ec9f4367b2
http://hg.mozilla.org/releases/mozilla-beta/rev/c5898e7a8d36
blocking1.9.2: --- → .26+
status-firefox10: affected → fixed
status-firefox11: affected → fixed
tracking-firefox10: - → +
(Assignee)

Comment 52

5 years ago
(In reply to Alex Keybl [:akeybl] from comment #45)
> Thanks for fixing the patches and verifying that the security hole is now
> plugged up, Dan!
> 
> John - would you mind putting together a list of areas (and strategies) that
> QA should focus on when testing for functional regressions?

The crash does not occur in 100% of cases, but I've been able to reproduce it nearly 100% via
1) Save the crash .svg file locally
2) Visit the local file in firefox on a fresh profile

Make sure you can reproduce the crash reliably on a known-affected build - the builds with all three patches should merely print a parse error message.

For regressions - This code is 100% in XSLT parsing, so any regressions would be in XSLT documents. It specifically affects embedded stylesheets inside of XSLT documents - something which is extremely rare. I can't even find a page using them as an example, and a regression there would probably not be particularly severe.

XSLT in general is actually pretty rare - you'll mostly find it on tutorial sites explaining how to use it, e.g. http://www.w3schools.com/xsl/xsl_examples.asp

The most likely regressions would be:
1 - Valid XSLT documents with embedded stylesheets fail to render, instead giving a parse error.
2 - XSLT documents of any kind cause the browser to immediately close with an ABORT message on console. Such a crash, if found, would not be a security risk -- It would indicate our patches had exposed another bug in the XSLT compiler.
(In reply to John Schoenick [:johns] from comment #52)
> an ABORT message on console. Such a crash, if found, would not be a security
> risk -- It would indicate our patches had exposed another bug in the XSLT
> compiler.

Such a crash would signal the presence of a previously unknown lurking vulnerability mitigated by this patch, in other words also a success.
blocking1.9.2: .26+ → ---
status1.9.2: .26-fixed → wanted
status-firefox10: fixed → affected
status-firefox11: fixed → affected
status-firefox12: fixed → affected
tracking-firefox10: + → -
Sorry if the accidental flag unsetting alarmed folks. I was trying to agree with John that regressions like #2 would not be a security problem but in fact would be the opposite: symptoms of a security problem that has been de-fanged.
blocking1.9.2: --- → .26+
status1.9.2: wanted → .26-fixed
status-firefox10: affected → fixed
status-firefox11: affected → fixed
status-firefox12: affected → fixed
tracking-firefox10: - → +
Verified on latest nightly, aurora, and 10RC on Win7, Mac OS X,  I tried the test case on previous builds and the application crashed, then I tried on the latest, fixed, builds, and I got an XSLT parsing error message. 

I tried several of the examples in comment #52 aftewards, and I saw no parsing problems, and all of the XSLT edits applied as expected on the sample XML files.
Keywords: verified-aurora, verified-beta
And linux, ubuntu.
Verified fixed for Firefox 3.6.26. As with Juan's testing in comment 55, I used the attached testcase and some of the examples in comment 52. I confirmed the behaviour in Firefox 3.6.25 and verified that it is indeed fixed in Firefox 3.6.26. Tested Windows 7, Mac OS X 10.6.8, and Ubuntu 11.10.
Keywords: verified1.9.2
status-firefox11: fixed → verified
status-firefox12: fixed → verified
Whiteboard: [sg:critical][see comment 6 - comment 9 for what's busted] → [sg:critical][see comment 6 - comment 9 for what's busted][qa!]
Sorry for the bug spam. I think this can be called VERIFIED FIXED based on recent comments.
Status: RESOLVED → VERIFIED

Updated

5 years ago
status-firefox-esr10: --- → fixed
tracking-firefox-esr10: --- → 10+
Group: core-security
Verified fixed with Firefox 10.0.6esrpre 2012-06-21.
status-firefox-esr10: fixed → verified
rforbes-bugspam-for-setting-that-bounty-flag-20130719
Flags: sec-bounty+
You need to log in before you can comment on or make changes to this bug.