Bug 323394 (ffoxdie)

Crash due to too much recursion in frame construction (ffoxdie)

NEW
Unassigned

Status

()

defect
--
critical
14 years ago
9 months ago

People

(Reporter: jruderman, Unassigned)

Tracking

({crash, testcase})

Trunk
Points:
---
Dependency tree / graph
Bug Flags:
wanted1.9 -
wanted1.8.1.x +
wanted1.8.0.x +

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [sg:dos], )

Attachments

(2 attachments)

Reporter

Description

14 years ago
 
Reporter

Comment 2

14 years ago
Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_INVALID_ADDRESS (0x0001) at 0xbf7ffff0

Thread 0 Crashed:
0   non-virtual thunk [nv:-480] to nsHTMLDocument::QueryCommandValue(nsAString_internal const&, nsAString_internal&) + 360
1   PL_DHashTableOperate + 120
2   RuleHash::EnumerateTagRules(nsIAtom*, void (*)(nsICSSStyleRule*, nsCSSSelector*, void*), void*) + 36
3   nsCSSRuleProcessor::RulesMatching(PseudoRuleProcessorData*) + 56
4   nsStyleSet::ResolveStyleForNonElement(nsStyleContext*) + 212
5   nsStyleSet::FileRules(int (*)(nsIStyleRuleProcessor*, void*), RuleProcessorData*) + 56
6   nsStyleSet::ProbePseudoStyleFor(nsIContent*, nsIAtom*, nsStyleContext*) + 224
7   nsCSSFrameConstructor::CreateGeneratedContentFrame(nsFrameConstructorState&, nsIFrame*, nsIContent*, nsStyleContext*, nsIAtom*, nsIFrame**) + 132
8   nsCSSFrameConstructor::ProcessInlineChildren(nsFrameConstructorState&, nsIContent*, nsIFrame*, int, nsFrameItems&, int*) + 132

9   nsCSSFrameConstructor::ConstructInline(nsFrameConstructorState&, nsStyleDisplay const*, nsIContent*, nsIFrame*, nsStyleContext*, int, nsIFrame*) + 168
10  nsCSSFrameConstructor::ConstructFrameByDisplayType(nsFrameConstructorState&, nsStyleDisplay const*, nsIContent*, int, nsIAtom*, nsIFrame*, nsStyleContext*, nsFrameItems&, int) + 1516
11  nsCSSFrameConstructor::ConstructFrameInternal(nsFrameConstructorState&, nsIContent*, nsIFrame*, nsIAtom*, int, nsStyleContext*, nsFrameItems&, int) + 1568
12  nsCSSFrameConstructor::ConstructFrame(nsFrameConstructorState&, nsIContent*, nsIFrame*, nsFrameItems&) + 308
13  nsCSSFrameConstructor::ProcessInlineChildren(nsFrameConstructorState&, nsIContent*, nsIFrame*, int, nsFrameItems&, int*) + 412

14  nsCSSFrameConstructor::ConstructInline(nsFrameConstructorState&, nsStyleDisplay const*, nsIContent*, nsIFrame*, nsStyleContext*, int, nsIFrame*) + 168
15  nsCSSFrameConstructor::ConstructFrameByDisplayType(nsFrameConstructorState&, nsStyleDisplay const*, nsIContent*, int, nsIAtom*, nsIFrame*, nsStyleContext*, nsFrameItems&, int) + 1516
16  nsCSSFrameConstructor::ConstructFrameInternal(nsFrameConstructorState&, nsIContent*, nsIFrame*, nsIAtom*, int, nsStyleContext*, nsFrameItems&, int) + 1568
17  nsCSSFrameConstructor::ConstructFrame(nsFrameConstructorState&, nsIContent*, nsIFrame*, nsFrameItems&) + 308
18  nsCSSFrameConstructor::ProcessInlineChildren(nsFrameConstructorState&, nsIContent*, nsIFrame*, int, nsFrameItems&, int*) + 412

...
I get a different crash, stack overflow recursing in nsHTMLDocument::RegisterNamedItems, TB13941953

http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/content/html/document/src/nsHTMLDocument.cpp&mark=3158&rev=MOZILLA_1_8_0_BRANCH#3158

Probably doesn't like a 10,000-deep call stack.

Comment 4

14 years ago
I get a stack overflow in TB13958161Q Win98

nsPseudoFrameData::nsPseudoFrameData  [c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/base/nsCSSFrameConstructor.cpp, line 890]

code below repeated for a total of 12 times:

nsCSSFrameConstructor::ConstructInline  [c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/base/nsCSSFrameConstructor.cpp, line 12892]
nsCSSFrameConstructor::ConstructFrameByDisplayType  [c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/base/nsCSSFrameConstructor.cpp, line 7010]
nsCSSFrameConstructor::ConstructFrameInternal  [c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/base/nsCSSFrameConstructor.cpp, line 8131]
nsCSSFrameConstructor::ConstructFrame  [c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/base/nsCSSFrameConstructor.cpp, line 7956]
nsCSSFrameConstructor::ProcessInlineChildren  [c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/base/nsCSSFrameConstructor.cpp, line 13080]

... stack ending here:

nsCSSFrameConstructor::ConstructInline  [c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/base/nsCSSFrameConstructor.cpp, line 12892]
nsCSSFrameConstructor::ConstructFrameByDisplayType  [c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/base/nsCSSFrameConstructor.cpp, line 7010]
nsCSSFrameConstructor::ConstructFrameInternal  [c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/base/nsCSSFrameConstructor.cpp, line 8131]
Reporter

Updated

14 years ago
Attachment #208471 - Attachment description: testcase → testcase (10000 <x>s followed by 10000 </x>s, in XHTML namespace)
Note that in HTML we somewhat limit content model depth, exactly to prevent such things from happening....
Reporter

Comment 6

13 years ago
Why doesn't MAX_FRAME_DEPTH prevent this too-much-recursion crash?
MAX_FRAME_DEPTH/MAX_REFLOW_DEPTH is only checked for reflow, not frame construction.

It would be possible to do something similar for frame construction, probably by keeping a counter on the frame constructor state.  The question is what limit to set....

And we'd probably want similar checks in ReResolveStyleContext, by the way.  And various DOM methods that recurse.  And anything else in general that recurses down the content tree.  :(
Flags: blocking1.9?
Reporter

Comment 8

13 years ago
Fixing this would make it easier to figure out a claimed security hole, bug 348514.

Comment 9

13 years ago
(In reply to comment #7)
> 
> It would be possible to do something similar for frame construction, probably
> by keeping a counter on the frame constructor state.  The question is what
> limit to set....

What would be the right place(s) to stop too deep frame construction?

> 
> And we'd probably want similar checks in ReResolveStyleContext, by the way. 
> And various DOM methods that recurse.  And anything else in general that
> recurses down the content tree.  :(

Why do we need to add a check in /various/ DOM methods and not just DOM construction methods? Wouldn't limiting at construction time, limit all other recursions too?
If you're trying to limit content tree depth, you could do the DOM construction methods....  Generally speaking, that will break pages.  But yes, that would be a little simpler.  You'd have to change the XML parser, various nsINode methods, XBL, XFT and a few other things, probably...

Comment 11

13 years ago
(In reply to comment #10)
> methods....  Generally speaking, that will break pages.  But yes, that would be
> a little simpler.  You'd have to change the XML parser, various nsINode

Simpler than what? I don't see the alternate approach outlined here. What's the idea ... how can we tackle this without breaking pages that nest that deep?
> Simpler than what?

Simpler than putting checks in various recursive methods.

There is no way to fix this without breaking very deeply nested pages ... except by rewriting Gecko to avoid the use of recursion. That would take a while.
I think limiting the depth allowed by the XML parser, just like the HTML parser does, would be a good start.

Comment 15

13 years ago
This bug crashes Fx 1.5.x to 2.0 RC1 at: http://lcamtuf.coredump.cx/ffoxdie.html

Comment 16

13 years ago
*** Bug 356404 has been marked as a duplicate of this bug. ***
*** Bug 357002 has been marked as a duplicate of this bug. ***

Comment 18

13 years ago
Test cases ffoxdie.html and ffoxdie2.html (from http://lcamtuf.coredump.cx/) both affect 1.5.0.7 (tried on FC5 and RHEL4), however ffoxdie3.html doesn't.

The backtraces of both contain tens of thousands of calls to
NSGetModule () from /usr/lib64/firefox-1.5.0.7/components/libgklayout.so

That is quite different from what both Jesse and Hermann reported.

Comment 19

13 years ago
This is more than 32000 lines long, that's why I gzipped it.
Reporter

Updated

13 years ago
OS: Mac OS X 10.2 → All
Hardware: Macintosh → All
> The backtraces of both contain tens of thousands of calls to
> NSGetModule () from /usr/lib64/firefox-1.5.0.7/components/libgklayout.so

That indicates that you're using a stripped optimized build, so the stack is pretty much useless.
Summary: Crash due to too much recursion in frame construction → Crash due to too much recursion in frame construction (ffoxdie)

Comment 21

13 years ago
are we thinking about roc's suggestion in https://bugzilla.mozilla.org/show_bug.cgi?id=323394#c14 for 1.9 or the branches?

any other ideas on fixes?
Limiting in the parser is only going to get us partly there since you can still use the DOM to create any depth you want. And even if we plug that there's XSLT, and most likely other things (templates? overlays?)

What we could do is to limit it in nsINode::InsertChildAt. That could be done efficiently (time-wise) by letting each node keep track of its own nesting depth. We will have quite a few bits left over in mFlagsOrSlots, so if we grab, say, 15 bits there we could allow trees up to 32000 nesting levels (or lower depending on where we want to limit it).

(Fwiw, such information could be used to create much faster implementations of nsContentUtils::ComparePosition)
That should say: We *still* have quite a few bits...

Comment 24

13 years ago
(In reply to comment #21)
> are we thinking about roc's suggestion in
> https://bugzilla.mozilla.org/show_bug.cgi?id=323394#c14 for 1.9 or the
> branches?
> 
> any other ideas on fixes?

JS engine uses a generic way to check for too-much recursion and treats the native stack overflow roughly like out-of-memory, see bug 192414. The same mechanics can be used in the rest of the browser where recursion is possible.
Flags: blocking1.8.1.1?
Would be great to get a fix here. Even if we don't think it's exploitable it's causing a great deal of grief.
Flags: blocking1.8.1.1?
Flags: blocking1.8.1.1+
Flags: blocking1.8.0.9+

Comment 26

13 years ago
Could someone add [@nsIFrame::SetStyleContext] in the summary. See crash #19 on http://talkback-public.mozilla.org/reports/firefox/FF20/index.html .

A good bunch of crash are coming from http://lcamtuf.coredump.cx/ffoxdie_orig.html,  	http://lcamtuf.coredump.cx/ffoxdie.html .
Reporter

Comment 27

13 years ago
I don't think that would be right.  Any given too-much-recursion crash bug can have many things at the "top" of the stack.
This is just a crash, and not exploitable.  Still looking for a fix.
Flags: wanted1.8.1.x+
Flags: wanted1.8.0.x+
Flags: blocking1.8.1.1+
Flags: blocking1.8.0.9+

Comment 29

13 years ago
(In reply to comment #28)
> This is just a crash, and not exploitable.

However. This bug was reported on many IT-sites (e.g. the german it magazine www.heise.de). If the next version of FF will arrive and there is the information, that this bug is still not fixed it sounds to the people like MS.
It would be a better publicity for FF/Mozilla if this bug is fixed in the next version.
All true, but given the choice between fixing security bugs and fixing issues that are unfortunate, not security bugs, but blown out of proportion by the press, we would rather do the former.  Security is more important than the appearance of security...
Alias: ffoxdie
Flags: blocking1.8.1.2+
Flags: blocking1.8.0.10+

Updated

13 years ago
Flags: blocking1.8.1.2+
Flags: blocking1.8.0.10+
One thing we would definitely like for 1.9 is to limit the depth of frame trees in the frame constructor. This would prevent a lot of problems.

Limiting the content tree depth would also be good but it sounds a bit harder, especially in terms of what you do when content tree depth is exceeded by someone manipulating the DOM.
Whiteboard: [wanted-1.9]
The JS engine has a stack address limit set by DOM code based on a low stack, and assumption that the stack grows down.  Many recursive JS engine functions monitor the address of a stack dummy variable and throw when this limit is reached or exceeded.  It might be better to have more Gecko code use this mechanism, or one like it, rather than counting recursion levels and limiting them with brute-force numbers, which are hard to relate to stack memory use.

/be
Reporter

Comment 33

13 years ago
I imagine the two strategies would have different implications for how hard it is to maintain Gecko's key invariants.  They might even require changes to the invariants themselves.

Comment 34

13 years ago
(In reply to comment #31)
> Limiting the content tree depth would also be good but it sounds a bit harder,
> especially in terms of what you do when content tree depth is exceeded by
> someone manipulating the DOM.

The large content tree depth is a problem only for recursive code, right? Then just checking for the stack depth during the recursion should be enough to cover that.

Comment 35

13 years ago
(In reply to comment #32)
> The JS engine has a stack address limit set by DOM code based on a low stack,
> and assumption that the stack grows down.

It is xpconnect code, not DOM, that these days sets the limit.
(In reply to comment #33)
> I imagine the two strategies would have different implications for how hard it
> is to maintain Gecko's key invariants.  They might even require changes to the
> invariants themselves.

I don't think this is the case.  There are no "key invariants" that depend on magic recursion depth limits, as far as I can see.  The issue is simply DOS attack or unintended stack overflow crash prevention, which means a hard quota of some kind on stack growth.  Ad-hoc and varying recursion depth limits are a poor way to implement such a quota.  Address sampling is better.

(In reply to comment #35)
> (In reply to comment #32)
> > The JS engine has a stack address limit set by DOM code based on a low stack,
> > and assumption that the stack grows down.
> 
> It is xpconnect code, not DOM, that these days sets the limit.

Right, thanks (XPConnect, DOM; Tomato, Tomahtoe ;-).

/be
> There are no "key invariants" that depend on magic recursion depth limits, as
> far as I can see.

Well.  There sort of are.  Frame destruction is recursive, and we absolutely depend on the invariant that Destroy() is called on all frames in a presshell before ~nsPresShell finishes.

That's off the top of my head, of course.  There may be other similar things.
(In reply to comment #37)
> > There are no "key invariants" that depend on magic recursion depth limits, as
> > far as I can see.
> 
> Well.  There sort of are.  Frame destruction is recursive, and we absolutely
> depend on the invariant that Destroy() is called on all frames in a presshell
> before ~nsPresShell finishes.

Ah, I see (and stand corrected).  But really, if we must limit stack consumption to avoid DOS and unintended overflow crashes, then we should revise destruction to use an iterative algorithm if possible.

One question: is the stack consumption from destruction bounded by prior recursive construction or traversal, so that if we have a frame tree of height H, we know we can destroy it without stack overflow?  This depends on depth of stack at destruction time vs. at creation or traversal, so may be hard to bound.

> That's off the top of my head, of course.  There may be other similar things.

It would be good to know of others.

/be
I think it'd be hard to limit the recursion itself while maintaining all invariants that we have. For example the BindToTree implementation does some stuff that is really important that we do and we simply can't stop the recursion there safely. Same thing when for example reparenting wrappers when nodes are moved around in the DOM (especially when moved between documents using .adoptNode)
> One question: is the stack consumption from destruction bounded by prior
> recursive construction or traversal,

Thanks to the wonders of DOM events being reentrant, any operation that can be triggered from script (and frame destruction is amongst those) may begin when the stack is arbitrarily deep.

Same thing for UnbindFromTree and so forth.

Of course if we take this approach we should stack-limit DOM event dispatch such that it can't get us too far into trouble.

But we have a _lot_ of recursive algorithms in our code, thanks to the whole "the data structure is a tree" business.  :(
How about using the JS-maintained thread stack limit, but instead of failing, switch to an iterative algorithm such as Deutsch-Schorr-Waite?  It would be best to have a single tree-walking higher-order function, e.g.

traverse(node, doit) {
    doit(node);
    for (each kid in node)
        traverse(kid, doit);
}

that we could extend via D-S-W, but I suspect (someone pleas confirm) that we have not used a HOF to consolidate tree-walking recursion.

/be
Yeah, we simply inline all the tree walking. The problem with the above approach is that for many of our recursive algorithms we keep state that is either passed on to child calls, or return state from child calls. Though maybe that can be solved using closure parameters.

And this still wouldn't solve the problem when we go in and out of javascript. That part would still need to throw an exception.

In any case, it will be *a lot* of work doing this. I think the time spent on trying to convert all our recursive algorithms into iterative ones would be better spent on other things. These crashes are after all not exploitable.

I suspect a much easier approach would be to limit the depth of the DOM trees and put limits on places where we can recurse over the tree multiple times, such as for mutation events. 
What sicking said. Unwinding all our recursive treewalkers into code with explicit closures would be a ton of work.
Flags: wanted1.9+
Whiteboard: [wanted-1.9]
Flags: wanted1.9+ → wanted1.9-
Whiteboard: [sg:dos]
note: this http://lcamtuf.coredump.cx/ffoxdie_orig.html is still crashing opt-builds.
Duplicate of this bug: 505350
Recent stack of this crash

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_PROTECTION_FAILURE at 0x00000000bf7ffffc
Crashed Thread:  0

Thread 0 Crashed:
0   libxpcom_core.dylib               0x0051f935 NS_LogCtor_P + 5
(nsTraceRefcntImpl.cpp:1032)
1   libgklayout.dylib                 0x12338b35 nsCSSRect::nsCSSRect() + 69
2   libgklayout.dylib                 0x12338b9c nsCSSDisplay::nsCSSDisplay() +
76
3   libgklayout.dylib                 0x123a4f0c
nsRuleNode::GetVisibilityData(nsStyleContext*) + 28 (nsRuleNode.cpp:1361)
4   libgklayout.dylib                 0x123a50d8
nsRuleNode::GetStyleVisibility(nsStyleContext*, int) + 264
(nsStyleStructList.h:103)
5   libgklayout.dylib                 0x123a6947
nsStyleContext::GetStyleVisibility() + 55
6   libgklayout.dylib                 0x12209c44
nsFrame::DidSetStyleContext(nsStyleContext*) + 340 (nsFrame.cpp:574)
7   libgklayout.dylib                 0x12212ce5 nsFrame::Init(nsIContent*,
nsIFrame*, nsIFrame*) + 293 (nsIFrame.h:2159)
8   libgklayout.dylib                 0x12291298
nsSplittableFrame::Init(nsIContent*, nsIFrame*, nsIFrame*) + 40
(nsSplittableFrame.cpp:55)
9   libgklayout.dylib                 0x121fafac
nsContainerFrame::Init(nsIContent*, nsIFrame*, nsIFrame*) + 44
(nsContainerFrame.cpp:93)
10  libgklayout.dylib                 0x1211ebed
nsCSSFrameConstructor::InitAndRestoreFrame(nsFrameConstructorState const&,
nsIContent*, nsIFrame*, nsIFrame*, nsIFrame*, int) + 221 (nsIFrame.h:1095)
11  libgklayout.dylib                 0x12136f8e
nsCSSFrameConstructor::ConstructInline(nsFrameConstructorState&,
nsCSSFrameConstructor::FrameConstructionItem&, nsIFrame*, nsStyleDisplay
const*, nsFrameItems&, nsIFrame**) + 142 (nsCSSFrameConstructor.cpp:10740)
...

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre)
Gecko/20090720 Minefield/3.6a1pre
If we crash due to excessive recursion, the interesting part of the stack is the repeating part, not the part at the very top.
Reporter

Updated

10 years ago
Duplicate of this bug: 456954
Reporter

Updated

10 years ago
Duplicate of this bug: 160228

Updated

9 years ago
Blocks: 468240
Duplicate of this bug: 557180

Comment 51

9 years ago
There are several more recent duplicates of this, some claim to be somewhat "fixed" (like https://bugzilla.mozilla.org/show_bug.cgi?id=557180 which had only one testcase, but I get the crash in current Minefield with another input file). Would another example input help here?
> Would another example input help here?

It depends.

It's pretty easy to trigger this crash if you're _trying_ to.  If your example input is just a real-life page that accidentally triggers it, then it would be helpful to see it.
Duplicate of this bug: 582535
Reporter

Updated

9 years ago
Duplicate of this bug: 592532
Duplicate of this bug: 621850
Blocks: 354161
Blocks: 485024
Reporter

Comment 56

8 years ago
In bug 644526, we're considering capping the DOM tree depth for DOMs generated by the XML parser, in order to work around these layout recursion issues.

Updated

9 months ago
Product: Core → Core Graveyard
Component: Layout: Misc Code → Layout
Product: Core Graveyard → Core
You need to log in before you can comment on or make changes to this bug.