Open Bug 485941 (CVE-2009-1232) Opened 11 years ago Updated 2 days ago
Stack overflow using overly-deep XML tree (DoS)
230.00 KB, application/x-tar
57.02 KB, text/plain
223.60 KB, application/xml
3.15 KB, patch
41.69 KB, image/png
30.60 KB, image/png
655 bytes, text/html
63.68 KB, application/zip
5.28 KB, patch
Testcase that can be used to determine the max allowed depth for an XML parser in XMLHttpRequest, as long as the limit is < 1e6
755 bytes, text/html
47 bytes, text/x-phabricator-request
I tested this both via HTTP and via file:// on trunk (Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2a1pre) Gecko/20090323 Minefield/3.6a1pre), 3.5b4pre (Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1b4pre) Gecko/20090330 Shiretoko/3.5b4pre), and 3.0.9pre (Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.9pre) Gecko/2009033004 GranParadiso/3.0.9pre). In all three (x2) cases, all I got was "XML Parsing Error: no element found", "<file path>", and "Line Number 1, Column 228891:". No crash or anything...
I get this stack on 64 bit linux/debug/1.9.0 when reloading the testcase. 1.9.1 and trunk work fine. Stack 'loops' here:
0x00002aaab35eb36b in nsCSSFrameConstructor::ConstructFrameInternal (this=0x1f0e390, aState=@0x7fffea9e8830, aContent=0x2aaab86866b0, aParentFrame=0x2aaab9666ca8, aTag=0x2aaab8686650, aNameSpaceID=0, aStyleContext=0x2aaab9666d08, aFrameItems=@0x7fffe9dee510, aXBLBaseTag=0) at /home/smaug/mozilla/mozilla_cvs/mozilla/layout/base/nsCSSFrameConstructor.cpp:7758
0x00002aaab35eb72c in nsCSSFrameConstructor::ConstructFrame (this=0x1f0e390, aState=@0x7fffea9e8830, aContent=0x2aaab86866b0, aParentFrame=0x2aaab9666ca8, aFrameItems=@0x7fffe9dee510) at /home/smaug/mozilla/mozilla_cvs/mozilla/layout/base/nsCSSFrameConstructor.cpp:7580
0x00002aaab35eb948 in nsCSSFrameConstructor::ProcessInlineChildren (this=0x1f0e390, aState=@0x7fffea9e8830, aContent=0x2aaab86864f0, aFrame=0x2aaab9666ca8, aCanHaveGeneratedContent=1, aFrameItems=@0x7fffe9dee510, aKidsAllInline=0x7fffe9dee52c) at /home/smaug/mozilla/mozilla_cvs/mozilla/layout/base/nsCSSFrameConstructor.cpp:12851
0x00002aaab35e9f6d in nsCSSFrameConstructor::ConstructInline (this=0x1f0e390, aState=@0x7fffea9e8830, aDisplay=<value optimized out>, aContent=0x2aaab86864f0, aParentFrame=0x2aaab9666bb0, aStyleContext=0x2aaab9666c10, aIsPositioned=0, aNewFrame=0x2aaab9666ca8) at /home/smaug/mozilla/mozilla_cvs/mozilla/layout/base/nsCSSFrameConstructor.cpp:12622
#20 0x00002aaab35ea989 in nsCSSFrameConstructor::ConstructFrameByDisplayType (this=0x1f0e390, aState=@0x7fffea9e8830, aDisplay=0x2aaab94ac270, aContent=0x2aaab86864f0, aNameSpaceID=<value optimized out>, aTag=<value optimized out>, aParentFrame=0x2aaab9666bb0, aStyleContext=0x2aaab9666c10, aFrameItems=@0x7fffe9deebf0, aHasPseudoParent=0) at /home/smaug/mozilla/mozilla_cvs/mozilla/layout/base/nsCSSFrameConstructor.cpp:6636
We should probably get this fixed sometime soon since it was posted publicly. Dan, what say you? Also: This looks like CSS stuff. bz or dbaron: Can you take a look at this and see what fixed it on trunk/1.9.1 so we can backport it?
OK, some really quick comments without having managed to reproduce this yet:
1) There is no XUL involved anywhere here. The fact that there is a claim that it's involved makes me suspect whoever's making said claims.
2) nsCSSFrameConstructor has to do with "CSS" only insofar as it's the thing that builds the frame tree.
3) The poc.xml file is a malformed XML file which is chiefly notable for having absolutely no end tags. It does have 30000 start tags, though. If we should happen to actually try to lay the document out before detecting that it's malformed, it's quite likely that we will get a stack overflow during the recursive frame construction process.
Is that what's being reported? Or something else? The url in the url field is short on detail. ;)
For what it's worth, on my testcase Opera seems to hang for a few minutes, then crash, while Safari gives an XML parsing error that's completely unrelated to the actual way the file is malformed. In any case, I'd love to know whether anyone's claiming that there's anything here other than us being unable to make a function call due to C stack exhaustion, and if so what the basis for claiming it is...
Oh, and to be clear, directly loading attachment 370116 [details] is sufficient to crash any Gecko that does incremental XML parsing (3.0 and later). No need for the silly HTML iframe thing.
Note that the Opera thing is also http://www.milw0rm.com/exploits/8320 and explicitly points to this testcase. So my general take on this is that yes, this is a crash, no it's not exploitable, unless I'm seriously missing something here.
So the problem might be that we don't cap the depth of an XML tree like we do for HTML? Should probably add that to the Expat driver.
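For reference, the shape of the check being proposed for the parser driver is roughly this (a hedged sketch, not the actual Gecko/Expat code; the `MAX_XML_TREE_DEPTH` name and the event shape are made up for illustration):

```javascript
// Track element nesting depth while consuming an incremental stream of
// start/end-tag events, and abort the parse once a fixed cap is exceeded.
const MAX_XML_TREE_DEPTH = 200; // hypothetical cap, mirroring the HTML parser

function checkDepth(events) {
  // events: array of "open" / "close" tokens from an incremental tokenizer
  let depth = 0;
  for (const ev of events) {
    if (ev === "open") {
      depth++;
      if (depth > MAX_XML_TREE_DEPTH) {
        // In the real driver this would surface as a parse error page.
        throw new Error("XML nesting too deep: aborting parse");
      }
    } else if (ev === "close") {
      depth--;
    }
  }
  return depth; // nonzero means unclosed tags remain, as in poc.xml
}
```

Because the check runs per start tag, a poc.xml-style document with 30000 start tags and no end tags would be rejected almost immediately instead of reaching frame construction.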
Assignee: nobody → peterv
Status: NEW → ASSIGNED
Priority: -- → P1
Target Milestone: --- → mozilla1.9.2a1
Do we want to just ignore things above a certain level of nesting or show the error page? This throws a generic error, if we do that we can probably add a better error message on trunk. Note that pretty printing doubles or triples the nesting level, but we don't have to account for that because the XSLT processor already cuts things off.
Not for 1.9.0.9 since it's just a DoS, and opening up because it was reported publicly.
Whiteboard: [sg:investigate] → [sg:dos]
Comment on attachment 370209 [details] [diff] [review] v1 Not sure who comment 10 was directed at, trying parser guys who weren't CC'd on the bug and probably didn't see the question.
(no evidence of memory corruption here).
Summary: Possible XML XUL parser memory corruption (DoS) → Stack overflow using overly-deep XUL XML (DoS)
Or XUL for that matter.
Summary: Stack overflow using overly-deep XUL XML (DoS) → Stack overflow using overly-deep XML tree (DoS)
Comment on attachment 370209 [details] [diff] [review] v1 I *don't* think we should do this. It's not stopping any DoS attacks anyway, since you can create arbitrarily deep trees using the DOM. Additionally, the current patch will affect pure data documents, such as DOMParser and XMLHttpRequest documents, where it not only makes a lot more sense to create very deeply nested DOMs, it's also much safer. The limit we have in the HTML parser makes a lot more sense IMHO, since it's easy there to create tag soup that produces very deeply nested trees by accident. Feel free to re-request reviews if you disagree.
So I guess you're arguing for wontfixing this bug on branch (and on trunk?). I personally don't think it makes much sense to allow overly-deep trees. There are a number of places within our code that will just cause a stack overflow with them, so their usefulness will be limited anyway.
Hi, I've found that bug and posted it to milw0rm. When I attached to Firefox with OllyDbg, I found out that I'm within the xul module after that crash. Also, the ESP register can be modified; however, I couldn't find any interesting attack vector other than DoS (and yes, I know that ESP is volatile by nature ;)). I did some tests: Opera crashed, Mozilla crashed; however, Internet Explorer parsed that malformed XML and... it recovered after nearly 10 minutes. Interesting...
> I'm within xul module If you mean libxul, that's just the big "statically link everything together" module. It contains all the DOM, layout, networking, etc, etc code. Most of it is not XUL related. > Also ESP register can be modified You mean the stack frame on which the overflow happens varies depending on precisely what's on the stack? Sure thing!
So maybe change the name from libxul to libeverything :) Also, as I said, ESP is volatile by default and unstable; however, it gets a really strange address after that crash. I tried to play with it for a while, but I couldn't find any way to use it for code execution purposes.
Whiteboard: [sg:dos] stack-overflow → [sg:dos] stack-overflow [needs patch]
Sicking: see comment 17, are you arguing for wontfixing this? If not, what do you think we should do?
Jonas: An update here would be nice. I really don't think we should WONTFIX a bug that's clearly a crash. Stability issues are important. Anyway, this isn't exploitable and is, thus, no longer blocking our 1.9.0 or 1.9.1 releases. We do, however, *want* it if a patch has landed on trunk and baked. Please request approval after it lands on m-c.
Arg, wrote a comment here but apparently it got lost. I don't think fixing this in the XML parser is the right solution. That doesn't solve the sg:dos problem here since you can just build a deep DOM using appendChild etc. It would solve the problem of people accidentally feeding deeply nested XML documents and crashing themselves/their users, but I haven't heard of this being a real-world problem. (This is where the HTML parser differs, since it's easy to accidentally create an HTML document that produces deeply nested DOM trees.) If we want to reduce the number of accidental crashes, then IMHO we should look at making the layout code stop creating frames beyond a certain depth, since layout is extra sensitive to deeply nested trees due to placing large amounts of data on the stack. We could also look into creating a tree-iterator that lets you iterate non-recursively over a subtree. Obviously this wouldn't work everywhere, but could possibly reduce the number of accidental crashes.
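The tree-iterator idea can be sketched like this (illustrative only; the `{ name, children }` node shape is an assumption, not our actual content node API):

```javascript
// Non-recursive pre-order traversal of a subtree using an explicit stack,
// so C-stack usage stays constant no matter how deep the tree is.
function* preorder(root) {
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    yield node;
    // Push children in reverse so they are visited left-to-right.
    for (let i = node.children.length - 1; i >= 0; i--) {
      stack.push(node.children[i]);
    }
  }
}
```

A 100000-deep chain of nodes iterates fine with this, whereas the equivalent recursive walk would exhaust the stack long before reaching the leaf.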
This needs someone from layout to do comment 24 then. Should this move to a different component?
Assignee: peterv → nobody
Status: ASSIGNED → NEW
Sure. One thing I forgot to add is that we could also add code to the DOM that checks that we never produce trees deeper than X levels. That would be the only way to truly fix the sg:dos problem here. However, I'm worried about perf issues with that, though maybe it's not a big deal. I still think that we should also make layout stop creating frames beyond a certain depth, so that the X above isn't limited by layout.
Component: XML → Layout
QA Contact: xml → layout
How is solving that problem in layout any easier than limiting DOM tree depth? I agree it might be more appropriate, but it's not any easier. It's the same problem, in fact.
And aren't there plenty of things in content that are recursive, e.g., BindToTree?
The main reason isn't that it's easier. It's that I'm pretty sure that layout is the most sensitive to deep DOM trees. Say that we fixed this bug by only changing the DOM to prevent trees deeper than X. This means that we'd have to make X whatever limit the layout code has (seemingly 200, based on the current MAX_REFLOW_DEPTH define). This even for things like XMLHttpRequest-loaded documents and other types of non-displayed data documents, which I think would be unfortunate. If we do want to fix people unexpectedly running into crashes due to too-deep DOMs, then I think layout is where we should start fixing that. On the other hand, if we are worried about the sg:dos problem, then I think we should fix both layout and the DOM.
Maybe the thing to do is to limit DOM depth to some fixed limit for non-data documents, and ensure that all the algorithms we use to walk data documents are non-recursive.
We could try doing that... That said, the depth check on each DOM mutation could get a little expensive. Another option is to keep track of "current depth" in all DOM nodes (parent depth + 1), which should be doable via Bind/Unbind without too much trouble. Then recursive algorithms could check depth and stop as needed... Having depth in tree stored on nodes would actually be possibly-helpful for other things too (e.g. parallelizing selector matching).
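A minimal sketch of the depth-on-nodes idea (the `MAX_DOM_DEPTH` name and the `appendChild` bookkeeping are illustrative, not Gecko's actual Bind/Unbind code):

```javascript
// Each node caches its depth as parent depth + 1, maintained when it is
// bound to a parent. A mutation-time cap check is then O(1): no tree walk.
const MAX_DOM_DEPTH = 200; // hypothetical cap

class Node {
  constructor(name) {
    this.name = name;
    this.parent = null;
    this.depth = 0;
    this.children = [];
  }
  appendChild(child) {
    child.parent = this;
    child.depth = this.depth + 1; // the Bind step in this sketch
    if (child.depth > MAX_DOM_DEPTH) {
      throw new Error("refusing to build DOM deeper than " + MAX_DOM_DEPTH);
    }
    this.children.push(child);
    return child;
  }
}
```

Note this sketch only handles appending a single node; re-parenting a whole subtree would also need the descendants' cached depths updated, which is the part Bind/Unbind would have to cover in a real implementation.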
blocking2.0: ? → -
Is this url related to this bug? http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2008.xml
Target Milestone: mozilla1.9.2a1 → ---
Version: unspecified → Trunk
People keep testing this. Hooking up Socorro... From Sept 1 - October 4, http://niebezpiecznik.pl/PoC/Firefox_3.6:
356 nsGenericElement::UnbindFromTree(int, int)
53 nsBindingManager::RemovedFromDocument(nsIContent*, nsIDocument*)
19 xul.dll@0x120a7e
17 xul.dll@0x91ed4
13 xul.dll@0x13e23e
12 nsGenericElement::UnbindFromTree(bool, bool)/poc.html
1 nsGenericElement::GetChildAt(unsigned int)
1 SelectorMatches
1 txXPathNodeUtils::localNameEquals(txXPathNode const&, nsIAtom*)
1 xul.dll@0x196731
Crash Signature: int) ] [@ nsXMLElement::UnbindFromTree(int, int) ] [@ xXPathNodeUtils::localNameEquals(txXPathNode const&, nsIAtom*) ] [@ SelectorMatches ] [@ nsBindingManager::RemovedFromDocument(nsIContent*, nsIDocument*) ] [@ nsGenericElement::GetChildAt(unsigned …
When do you think it will be fixed? It still works on 10.0.2.
Not fixed very fast; 13.0a1 has the same crash.
Hi all. I have a crash case that I found as part of my project. Basically, when creating a grid generator algorithm, I stumbled into a recursion case where JS is creating DIVs (float and clear types). This reminded me of February 1997, when I attempted to write pixels using 1x1 divs in an attempt to make a canvas app. Here we are. The use case is: http://www.telasocial.com/p/d/grid-layout/ The crash case is: https://github.com/taboca/TelaSocial-Grid-Type/issues/1 My motivation for writing in this bug: 1) bClary pointed me to 3 possible bugs; 2) comment 33 is interesting, as I feel this goes all the way to user experience (understood, it does not happen a lot online); 3) I am attaching the experience from the Chrome browser, simply as a case of the UI they show when they detect the condition.
Still causes a crash on the current release version (12.0) with Win7.
Attachment #627097 - Attachment mime type: text/plain → text/html
Attachment #627097 - Attachment description: some js that triggers this crash reliably → some js that triggers this crash reliably (WARNING: will crash your browser if using ff)
Whiteboard: [sg:dos] stack-overflow [needs patch] → [sg:dos][stack-overflow][needs patch][tbird crash]
Whiteboard: [sg:dos][stack-overflow][needs patch][tbird crash] → [sg:dos][stack-exhaustion][needs patch][tbird crash]
Please remember: this bug may be a duplicate, but it still works and crashes Mozilla, even version 26.
1. http://niebezpiecznik.pl/PoC/Firefox_3.6/poc.html
2. http://niebezpiecznik.pl/PoC/Opera_10.10/poc.html
3. http://taboca.github.com/TelaSocial-Grid-Type/test.html
4. https://bugzilla.mozilla.org/attachment.cgi?id=539481&action=edit
Crash Signature: , int) ] [@ nsXMLElement::UnbindFromTree(int, int) ] [@ xXPathNodeUtils::localNameEquals(txXPathNode const&, nsIAtom*) ] → , int) ] [@ nsXMLElement::UnbindFromTree(int, int) ] [@ xXPathNodeUtils::localNameEquals(txXPathNode const&, nsIAtom*) ] [@ nsBindingManager::RemovedFromDocument ] [@ nsGenericElement::GetChildAt ] [@ nsGenericElement::UnbindFromTree ] [@ nsXMLEleme…
Investigating a crash a friend encountered when visiting http://forum.palemoon.org/. It appears to be very similar in nature to this bug. Added a test case attachment. Crash report in the latest Nightly: https://crash-stats.mozilla.com/report/index/09dcb3fc-8108-49fb-9e23-848e82160602 but it affects versions right down to 38.0, maybe lower. This issue also affects Google Chrome.
More crash reports that are related to this issue: https://crash-stats.mozilla.com/report/index/077b961b-178c-4e8a-9e0b-9ed172160601 https://crash-stats.mozilla.com/report/index/12c1fac2-a5ce-4c84-98ea-025fa2160601 https://crash-stats.mozilla.com/report/index/c81db66e-6e13-4f64-962b-683192160601 These crashes also occurred on http://forum.palemoon.org/. The site owner has confirmed that this bug was the cause of the crashes, and has modified the web site to work around the problem, so that particular site will no longer cause crashes. The same problem was experienced with version FF 46.0.1 and 38.8.0 ESR.
I think this is showing up in Nightly as [@ mozilla::dom::Element::UnbindFromTree ], which is a stack overflow, with Pale Moon URLs.
Crash Signature: , nsIAtom*) ] [@ nsBindingManager::RemovedFromDocument ] [@ nsGenericElement::GetChildAt ] [@ nsGenericElement::UnbindFromTree ] [@ nsXMLElement::UnbindFromTree ] [@ xXPathNodeUtils::localNameEquals ] → , nsIAtom*) ] [@ nsBindingManager::RemovedFromDocument ] [@ nsGenericElement::GetChildAt ] [@ nsGenericElement::UnbindFromTree ] [@ nsXMLElement::UnbindFromTree ] [@ xXPathNodeUtils::localNameEquals ] [@ mozilla::dom::Element::UnbindFromTree ]
The attached test case (https://bugzilla.mozilla.org/show_bug.cgi?id=485941#c58) was extracted from the site's HTML. It contains the iframe and the XML file that were embedded in their forum's HTML code: <iframe width="970" height="2" style="opacity:0;" src="/BingSiteData.xml"></iframe> They have removed this iframe from their forum HTML, resolving the crash on their site. So if you need to test for the crash, the attached case can reproduce it for you.
I wonder if we're going to see this age to 10 years before it's mitigated or solved. I mean... really? You just want to let a stack overflow issue persist? What would it take to get this solved -- an actual exploit?
The only progress is that not all of Firefox crashes now; only the exact tab does.
(In reply to pawel from comment #70)
> The only progress is that not all of Firefox crashes now; only the exact tab does.
Nonsense. That has nothing to do with the cause of this issue and is only a side-effect of e10s (and if a content process crashes, it will still take all other content that is in that process with it, not just that one tab, IIUC). You can set an arbitrary depth limit (I believe other browsers also do this) to prevent too-deep nesting from crashing the parser. That will solve this issue; that is what the original patch on this bug also suggests. An arbitrary depth limit of, say, 200 would satisfy even the most complex real-world examples of XML files, IMHO.
Here's an example patch that will stop this crash. It's just a git diff, since I don't use hg myself, as-adopted in my own tree. Feel free to re-format it to whatever format is required to land it on m-i/m-c
> What would it take to get this solved -- an actual exploit?
This isn't an exploitable stack overflow, so an actual exploit would be quite impressive.
> An arbitrary depth limit of, say, 200 would satisfy even the most complex real-world examples of XML files, IMHO.
I'd want to see data for this, honestly. For example, you say other browsers cap the depth; what caps do they use?
> I'd want to see data for this, honestly. For example, you say other browsers cap the depth; what caps do they use?
Hmm... I know for a fact that I tested Chrome way back when and it capped the depth. It currently crashes (I reported this to them as a regression). I *distinctly* remember when I first adopted a solution for this in Pale Moon that I tested most mainstream browsers and the only one crashing was Firefox; the others would simply not display the file or display an error based on capped depth, but not crash. IE didn't like it and would hang a long while trying to parse all of it, but that's no real surprise. It wouldn't crash, though. Edge limits it to 4096 levels (and complains about not responding, so it's likely too high for comfort). An old, old version (Chrome 40 based) of Vivaldi I still had lying around limits it to 5000 levels after a long delay, with the error: "Excessive node nesting." I'm assuming that is what Chrome used pre-crashing as well. If you want to be more lenient, you can of course pick something closer to what others use, but I'd advise against it, since you'll be pushing the stack rapidly towards its limit. A depth of 200 is already quite excessive, and you shouldn't even get close to it with any real-world XML -- I'd be interested in knowing of non-theoretical XML files that get to that depth or beyond.
MSIE also uses 4096 levels as a cap, it just takes a hell of a lot longer to get there (several minutes).
Update on Chrome. Chrome Version 61.0.3163.79 (Official Build) (32-bit) fixes their crash regression again. The deeply-nested XML once again aborts at 5000 nesting levels with the following error: > This page contains the following errors: > error on line 5002 at column 7: Excessive node nesting. > Below is a rendering of the page up to the first error.
[Tracking Requested - why for this release]: Crash when opening deeply-nested XML; affects all versions.
Old bug, low volume crash, not exploitable, so AFAICT no need to track for 57.
Too late to fix for 57. Reevaluate for 58, please.
(In reply to Julien Cristau [:jcristau] from comment #78) > Old bug, low volume crash, not exploitable, so AFAICT no need to track for > 57. So how do you know it isn't exploitable? Low volume crash? How was that determined? Just because most sites aren't out to crash Firefox users it isn't important because it is low volume? What does the bug being old have to do with the price of eggs? There are lots of old bugs.. Are you simply waiting until you finish obliterating your own past to close this out? Why not fix it. Just adopt what other browsers are doing and work out a nesting limit and be done with it. You all love parity here right? Or is it parody of what Mozilla once was.. Not too sure anymore. Get with it boys and girls, it's been nine years.
> So how do you know it isn't exploitable? https://blogs.technet.microsoft.com/srd/2009/01/28/stack-overflow-stack-exhaustion-not-the-same-as-stack-buffer-overflow/ is the first search result on Google for "is stack exhaustion exploitable?", for what it's worth. > Low volume crash? How was that determined? By looking at our crash report statistics. > it isn't important because it is low volume? It's _less_ important than things that are _high_ volume, everything else being equal. > What does the bug being old have to do with the price of eggs? All else being equal, a newly introduced bug, for which we don't have a good estimate of impact in the wild, is a higher priority than a longstanding bug. > Just adopt what other browsers are doing and work out a nesting limit and be done with it. Note that that may not even help with all the testcases attached to this bug, some of which don't use an XML parser at all. But even if we just stick to the XML testcases... well, I don't know. I just tried and attachment 370016 [details] and attachment 8759052 [details] both don't crash on 64-bit Linux. Attachment 370116 [details] does cause a crash there, and Mark's proposed patch fixes that, but uses a much lower limit than the other browsers do. Using a limit as high as the 5000 that other browsers have again doesn't crash on 64-bit Linux; the question is what it does on 32-bit Windows...
Peter, how would you feel about just taking that limit patch, but with a 5000 tag depth?
Component: Layout → XML
Happy to volunteer testing Win32 nightly when the limit is in to let you know if the 5000 limit fixes it or if it needs to be lower.
If that patch would be the end of these kinds of bug reports, I'd just take it. My patch was a quick fix, but if we really care about this we should probably fix more than just the parsing case; otherwise we'll just end up with another dormant bug report with JS testcases and full of useless comments. I kind of liked the depth check idea from comment 34, though obviously I'm a bit worried about the memory impact of storing the depth.
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #85) > Created attachment 8925042 [details] Not sure what your testcase is supposed to be doing, because all it does is lock up Firefox and make it non-responding.
> because all it does is lock up Firefox and make it non-responding Well, in Firefox there is no cap, so the testcase takes a very long time trying to find one and not succeeding... Try running it in Chrome or Safari or Edge.
Right, sorry. MSIE: 4096 Pale Moon: 199 Chrome: 5000 Edge: 4096 Safari on Windows doesn't provide any output. So, this is exactly what I reported before.
> So, this is exactly what I reported before. Right, and I appreciate you doing the testing! Safari on Mac reports 5000 as well.
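For anyone curious how a cap-probing testcase can stay under the 1e6 ceiling mentioned in the attachment description, a binary search over nesting depth does the trick. A sketch; `parses` here is a stand-in predicate for feeding a document of the given depth to DOMParser and checking for a parse error:

```javascript
// Binary-search the deepest nesting level the parser still accepts.
// Returns 0 if nothing parses, and `hi` if no cap was hit below `hi`.
function findDepthCap(parses, hi = 1e6) {
  let lo = 1; // deepest depth known to parse
  if (!parses(lo)) return 0;
  if (parses(hi)) return hi; // no cap below the ceiling
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    if (parses(mid)) lo = mid; // mid still parses: cap is higher
    else hi = mid;             // mid fails: cap is lower
  }
  return lo; // maximum depth that still parses
}
```

About 20 probes suffice for a 1e6 ceiling, which is why the testcase can report the exact cap (4096, 5000, ...) quickly in browsers that have one, while browsers with no cap just crash or hang during a probe.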
> otherwise we'll just end up with another dormant bug report with JS testcases and full of useless comments I suspect it's a lot harder to hit this by accident with JS than it is with something dumping out XML. Especially if the something dumping out XML just forgets to close all its tags. I would be OK with a followup to consider the depth-in-nodes approach, but we'd need to figure out what to do on "too deep", exactly; it's pretty non-obvious.