Mozilla M6 (talkback) on a 450 mhz PII with 128 megs APPRUNNER caused an invalid page fault in module KERNEL32.DLL at 015f:bff9d709. Registers: EAX=c002fb08 CS=015f EIP=bff9d709 EFLGS=00010212 EBX=00000000 SS=0167 ESP=0068fdf4 EBP=00690090 ECX=00000000 DS=0167 ESI=008b0f80 FS=18cf EDX=780373c0 ES=0167 EDI=00403d90 GS=0000 Bytes at CS:EIP: 53 8b 15 dc 9c fc bf 56 89 4d e4 57 89 4d dc 89 Stack dump:
http://www.newdream.net/crash/ describes what's going on
Chris, this appears to be an old bug that has plagued even previous versions of ur browser ...
Eric, the only way to fix this bug is to implement a MAX_NUM_NESTED_FRAMES concept. Currently, framesets don't even check for recursion and prevent it. The max num idea would solve that problem as well. I'm not sure what the max should be, possibly 100. This can be fairly easily implemented by going up the nsWebShell parentage until the root is found. We may still want to check for recursion.
My 2 cents: since previous browsers crashed on this too and it's a stupid kind of page with no legitimate application, I would say that enhancing the browser to catch this stop-me-before-I-shoot-myself edge case should be a lower priority than fixing all other outstanding standards compliance bugs. Unless it's trivially easy for you to fix this, I recommend we mark it LATER and put our energy elsewhere. It *is* a crash, but if people deliberately set out to crash the browser, they'll always find a way. Let's focus on shipping a working 5.0 ASAP and postpone silly edge cases to 5.1. Your final call based on time required to fix.
Moving [FEATURE] to Summary field for queries.
LATER. Stick this on the 5.1 stack. If people want to create HTML pages that serve no function whatsoever than to deliberately crash the browser, let 'em. All performance and standards support bugs are a higher priority than this. We gotta start making tough decisions to ship the product, and this one isn't even close to tough.
Marking verified later as per last comments.
*** Bug 30013 has been marked as a duplicate of this bug. ***
*** Bug 31326 has been marked as a duplicate of this bug. ***
*** Bug 34957 has been marked as a duplicate of this bug. ***
*** Bug 32389 has been marked as a duplicate of this bug. ***
By the way Eric, earlier nav code flat out refuses to load a frame that points to itself (i.e., parent.location == SRC value of FRAME). The twist for this URL (above) is that it works around that rule by generating "unique" URLs for each descent. (That's why Chris notes above that the only complete solution is to test for some maximum number). I agree that this is not a high priority fix, but you might consider doing the 'easy' fix for mozilla '1.0' -- just refuse to recur on an _identical_ URL. Defer the MAX_NUM_NESTED_FRAMES fix for mozilla '1.1'.
By the way, I'm not arguing that this is a high-priority (any time before the last beta is good), but I do feel that closing off the simplest case (a direct self-reference) is the right thing to do. [By direct self-reference I mean a frame that points to a URL already visited by this frameset]. While there could be legitimate uses of this type of self-referencing / recursion, I think that "in the wild" it is too easy for someone to shoot themselves (or their audience) in the foot. [Aware programmers can always use dummy query strings to work around the restriction on direct self-reference, subject to an upper limit on depth as a safeguard against bad logic]. This is especially true when, when in the absence of cookies (which is only a 'pref' away), HTTP is stateless. Combine that with any type of recursion, and that's not a good thing. Yes, it's not in the standard, but it is documented legacy behaviour, at least in prior implementations of Navigator: From http://home.netscape.com/assist/net_sites/frame_implement.html "Any frame that attempts to assign its SRC URL to be the same as the URL of any of its ancestors will be treated as if it has no SRC URL at all (basically a blank frame). While this doesn't stop all malicious documents, it eliminates a whole troublesome class of them." Just a thought.
Adding crash keyword.
<tough_decision>Even if Eric has the time to mess with this issue, those cycles would be better spent fixing a bug from someone else's list which actually affected real, valid content that was trying to do something legitimate (as opposed to these bogus look-I-can-crash-your-browser pages that have to real purpose). In the interest of shipping a high-quality FCS, it's vital that we focus on real problems with real content. (e.g. Pulling a bug off of Troy's list would be a much better use of time from the standpoint of helping real customers.) Marking FUTURE and helpwanted. Eric, if we've fixed all the real crashers, memory leaks, and semantic correctness bugs prior to FCS, let's revisit this bug then. ;-> </tough_decision>
adding testcase kw
I just thought I would point out that this works for iframes as well. http://www.students.bucknell.edu/deneen/iframe/test.html
I have a fix for this. It's small (~20 lines of code) does only a depth check, not a "circular reference" check. This is a simple solution that does not limit functionality on most legitimate uses of recursion I can think of. This fix is highly desirable because there are malicious people out there, and I'm imagining how ticked off I would be if I had just finished writing an epic email and clicked on a link that crashed the browser. I chose a limit of 25 frames on the depth because Window NT starts failing trying to create widgets beyond a depth of 30 or so anyway. The limit and the check only apply to content frames and not chrome, so recursive chrome can still bring the browser down. Chrome should be coming from a trusted source and not do this silly kind of thing, right? :)
Updating URL, platforms to All. The testcase is internal, but is simply a series of pages of the form: <html> <body> <iframe src=page2.html> </body> </html> Then page2 refers to page3, and so on. I wrote a perl script to generate 100 pages. Would not recommend looking at these in IE, because it freezes. ;)
is it the depth or the high number of webshells that's making mozilla crash?
I am surprised, that securing the browser against Denial-Of-Service attacks has such a low priority on the managers' list. Eric, do you have a review and the module owner ok?
Jesse - not sure, but I think it is the depth that causes problems. I've seen pages that have 100 web shells (10x10 frameset) with depth 1 load with no problems. Ben - true - this is an easy fix, but not completely without risk. There are some minor performance implications, and this could make sites that have depth greater than MAXDEPTH (25) stop working (though I've never seen sites with depth more than 4 or so). I am the closest thing to module owner for framesets, so that' okay - haven't gotten a review or super-review because this wasn't nsbeta3+ approved. If you would like to open this bug for reconsideration by the PDT team please remove the nsbeta3- from the status whiteboard. I'll attach my fix for review if anyone wants to look at it.
*** Bug 62541 has been marked as a duplicate of this bug. ***
*** Bug 63046 has been marked as a duplicate of this bug. ***
Set milestone to mozilla0.9. Eric if the patch doesn't work, mark this bug future.
*** Bug 64020 has been marked as a duplicate of this bug. ***
Thanks for the test cases! I just tested again on Windows. The patch works fine with both of the above testcases. The reason it was not checked in earlier was that it didn't recieve nsbeta3+. Test case 1: NS6 release: memory usage grows unbounded. (I gave up at 151MB) After 30 levels deep, WinNT just stops showing further nested windows. Debug Moz with fix: memory grows but stops at 50.6MB The nesting is limited to 25 levels deep, rendering is very similar to what the unbounded test case looks like, but it stops just before shooting itself in the foot. :) Test case 2: NS6 release: memory usage grows unbounded. (I gave up at 151MB) After 30 levels deep, WinNT just stops showing further nested windows. Debug Moz with fix: memory grows but stops at 42.8MB The nesting is limited to 25 levels deep, rendering is identical to what the unbounded test case looks like, but it stops just before shooting itself in the foot. :) 40-50MB of memory usage seems somewhat reasonable to me, as this is an edge case - at any rate, it's certainly more reasonable than growing unbounded, infinite looping, and eventually crashing the browser. Will get r= and sr= for this bug.
Fix checked in. To verify, view either of the to attached test cases. The browser should not crash.
Er, Eric ... you excluded XUL frames from this check? Is that correct? Why would that be the case.
> Er, Eric ... you excluded XUL frames from this check? Is that correct? > Why would that be the case. I excluded chrome from this check, which is a little different than XUL per se - but the reason was pretty simple: I think the 99% case for this bug is a user viewing a random page, and the author of that page has done something accidental or malicious (recursive frames) that could cause the browser to crash. We don't want that. However, that must be weighed against allowing the browser to be a flexible, powerful, and consistent as possible. Basically, I wanted to allow for complex chrome around the browser window - and not have that affect the depth to which content you are viewing is nested arbitrarily. And chrome is, by definition, trusted content - capable far more malicious things than just making the browser crash!
> I excluded chrome from this check, which is a little different than > XUL per se - but the reason was pretty simple: Ah. So, XUL over HTTP is subject to this check. Yes? (i.e. a random, malicious, web page can't serve crashing content by using XUL instead of HTML).
Eric: Better than before. Congragulations. Unfortunately, I still see a problem. When running testcase 1, Mozilla bloated to 64 Megs before I had to end-task it. It brought my system to a stand-still. I agree that the nested frames end finally, but not soon enough in this case. I have an "average" system. A PII Laptop with Win2k and 128 Meg of Ram (but only 400 megs free hd). I finally got a message that Virtual Mem is too low (around 400 megs vm). I think that there should be a memory limit as opposed to a nesting limit. IE: If a page is, say, 80K and takes up 200K of mem when parsed and calls itself recursively, and the user has 100MB of virtual mem left, then maybe Mozilla can take up 2% total mem for that page. Therefore, 2MB can be used. That equals 10 times before stopping. I prefer a more ambitious theme such as this than to just putting a number limit. Also, another thing that bothered me was that it gave no message such as "Infinitely recursive frames will be rendered no further".
Sorry for the spam. Eric: Do you think you can reopen this and change the summary to "Memory bloat on infinitely recursive frames"? Or should a seperate bug be filed?
> Ah. So, XUL over HTTP is subject to this check. Yes? (i.e. a random, > malicious, web page can't serve crashing content by using XUL instead of > HTML). I don't think I know enough about XUL to answer your question definitively, but my guess is that this fix does NOT prevent an author from creating a recursive XUL page that might crash the browser - though I'm not sure of all the specifics on how one might do that. The check I put in was specific to the frame that is used to display HTML framesets and iframes. Can you create an example? Brian, you mention two different, but very important points: 1) This fix was merely to address the crash, it doesn't try to calculate memory usage. (To be fair, I think you might be underestimating the complexity of such a change. First, we would need to calculate free memory (in a platform independent way so that it would work on Windows 98, Mac, Linux, Win NT, Solaris, ... Second, we would need to somehow magically know that our user was okay with 2% of memory use per document, and that we should only limit it to this if the reason for excess was due to nesting of a frameset.) I believe there may be ongoing work to address this at an architectural level with a "memory pressure" scheme, where a global notification will be made when a low memory condition exists, and anyone listening to this notification will then purge excess data (flush caches, ...). However, suggestions are more than welcome - if you think this might be better to handle as a special case for framesets, please feel free to file a bug (probably better as a separate issue) 2) No error message is reported to the user if we truncate frameset nesting. This should probably be filed as an enhancement request - with suggestions for the UI (do you think a dialog box would work? too invasive? Should the innermost document display a message?)
Actually, since a xul <iframe> derives its implementation directly from the HTML iframe, I think that this might be fixed. Simple example is : ----save as c:\temp\file.xul (or other OS path) ---- <?xml version="1.0"?> <?xml-stylesheet href="chrome://global/skin" type="text/css"?> <window id="main-window" xmlns:html="http://www.w3.org/TR/REC-html40" xmlns="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul"> <text value="hello, goodbye, hello"/> <iframe style="border:2px solid red;" src="file:///c:/temp/file.xul" flex="100%"/> <text value="hello, goodbye, hello"/> </window>
Pollmann, thanks for creating the fix and checking it in. IMO, the fix of limited to a certain number of nesting levels is just fine. If an attacker wants to, he can create large pages in other ways as well - memory shortage and related DoSes are a separate problem. > Mozilla bloated to 64 Megs That's just fine, isn't it? Mailnews can grow larger in normal operation... > Virtual Mem is too low (around 400 megs vm) What now? 400MB or 64MB(+system)?
Eric. I didn't mean to attack your code. You did solve the bug. I just wanted to mention something again that I had mentioned earlier. I don't know if people saw it the first time.
Mozilla does really need some memory control, but I guess that is a seperate issue.
Fixed in the Feb 28th build.
See also bug 70821 about adding the ability to limit memory usage per browser window.
*** Bug 110682 has been marked as a duplicate of this bug. ***