Last Comment Bug 676934 - Large stack traces can hang browser for a long time
: Large stack traces can hang browser for a long time
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: XPConnect (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla9
Assigned To: Luke Wagner [:luke]
:
Mentors:
Depends on:
Blocks: 682758
  Show dependency treegraph
 
Reported: 2011-08-05 13:55 PDT by Bill McCloskey (:billm)
Modified: 2011-10-18 05:31 PDT (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
first stack trace (1.26 KB, text/plain)
2011-08-05 13:59 PDT, Bill McCloskey (:billm)
no flags Details
second trace (4.08 KB, text/plain)
2011-08-05 14:00 PDT, Bill McCloskey (:billm)
no flags Details
quick fix (1.38 KB, patch)
2011-08-22 13:51 PDT, Luke Wagner [:luke]
no flags Details | Diff | Splinter Review
limit stack depth when building XPC stack (1.52 KB, patch)
2011-08-25 14:57 PDT, Luke Wagner [:luke]
mrbkap: review+
Details | Diff | Splinter Review
testcase (146 bytes, text/html)
2011-10-18 05:31 PDT, Bob Clary [:bc:]
no flags Details

Description Bill McCloskey (:billm) 2011-08-05 13:55:30 PDT
Dvander gave me some stack traces from someone who gets Firefox hangs pretty often. Both of them seem related to exceptions and big stack traces. I'll attach them below.
Comment 1 Bill McCloskey (:billm) 2011-08-05 13:59:29 PDT
Created attachment 551148 [details]
first stack trace

This one seems to be caused by inefficient stack frame iteration. The relevant code seems to be XPCJSStackFrame::CreateStack.
Comment 2 Bill McCloskey (:billm) 2011-08-05 14:00:39 PDT
Created attachment 551152 [details]
second trace

This one seems to happen when we try to free the massive exception stack trace recursively.
Comment 3 Bill McCloskey (:billm) 2011-08-05 14:04:24 PDT
The obvious solution here is to add some sort of cutoff to the number of frames we capture. Luke said that this problem may have gotten worse recently because the JS engine increased the max number of frames.

Blake, how much freedom do we have here? You mentioned that there are some web interfaces that we have to support. I'm sure that we could make the algorithms and data structures more efficient. But if we have to build up a giant stack trace object every time someone feels like writing a recursive infinite loop, it's not going to do much good.
Comment 4 Luke Wagner [:luke] 2011-08-05 14:22:10 PDT
To be clear: for the case of stack iteration, we don't need a cutoff, we just need to not use a quadratic iteration algorithm (i.e., StackFrame::pcQuadratic) when n = 50,000.
Comment 5 Bill McCloskey (:billm) 2011-08-05 14:25:03 PDT
(In reply to Luke Wagner [:luke] from comment #4)
> To be clear: for the case of stack iteration, we don't need a cutoff, we
> just need to not use a quadratic iteration algorithm (i.e.,
> StackFrame::pcQuadratic) when n = 50,000.

That would make the problem a lot better. But it seems like we still don't want to be creating giant stack trace objects that the user is unlikely to want. And if we have a cutoff, it would be during creation.
Comment 6 Luke Wagner [:luke] 2011-08-05 14:31:37 PDT
Ah, I forgot that iteration was building the exception object; I thought it was iterating to... I don't know what I was thinking.
Comment 7 Luke Wagner [:luke] 2011-08-22 13:51:33 PDT
Created attachment 554952 [details] [diff] [review]
quick fix

Stacks can get quite large now (50K frames).  If the only issue was pcQuadratic from JS_GetFramePC, we could just switch to FrameRegsIter.  But it seems like the time to create all these XPCJSStackFrames is also a big deal.  This patch just caps the number of stack frames we create.  This seems fine as long as code isn't depending on the contents of the stack for some hard security judgment or defined by a spec.  Blake, do you know anything about this?

Pushing to try...
Comment 8 Luke Wagner [:luke] 2011-08-22 15:52:05 PDT
Very green.  Now all we need to know is is it totally wrong ;-)
Comment 9 Blake Kaplan (:mrbkap) 2011-08-22 16:26:32 PDT
Comment on attachment 554952 [details] [diff] [review]
quick fix

I'm fine with this. I don't see where you increment numFrames, though (also: no space after if in XPConnect).

No security code that I know of uses this code. Even with this fix, though, I think we should file a followup bug on trying to clean this up to be less of a memory hog. I don't see a reason why we need an XPCOM object per stack frame. That seems wasteful, especially in C++. So, we should file a followup bug on cleaning that up, either by changing the API to be more of an iterator API or making the current implementation defer work until it's needed or other similar optimizations.
Comment 10 Luke Wagner [:luke] 2011-08-23 11:57:46 PDT
This time with the increment ;-).  Still green on try.

dvander, can you or the other fellow verify the fix:

wget --progress=dot:mega -N http://stage.mozilla.org/pub/mozilla.org/firefox/try-builds/lwagner@mozilla.com-381a04c7117b/try-win32/firefox-9.0a1.en-US.win32.zip
Comment 11 Luke Wagner [:luke] 2011-08-24 09:05:12 PDT
Still green.  Waiting for verification of the fix before requesting review.
Comment 12 Luke Wagner [:luke] 2011-08-25 14:57:49 PDT
Created attachment 555862 [details] [diff] [review]
limit stack depth when building XPC stack

David said that the reporter used the build for a day and the hangs went away.
Comment 13 Blake Kaplan (:mrbkap) 2011-08-25 15:05:28 PDT
Comment on attachment 555862 [details] [diff] [review]
limit stack depth when building XPC stack

Review of attachment 555862 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/xpconnect/src/xpcstack.cpp
@@ +189,5 @@
>                  self->mLanguage = nsIProgrammingLanguage::CPLUSPLUS;
>              }
>          }
>  
> +        if (++numFrames > MAX_FRAMES)

if( please: no space after if!
Comment 15 Ed Morley [:emorley] 2011-08-30 04:40:38 PDT
http://hg.mozilla.org/mozilla-central/rev/363c40e06667
Comment 16 Bob Clary [:bc:] 2011-10-18 05:31:50 PDT
Created attachment 567715 [details]
testcase

http://www.in-town.nl/web/ appears to exercise this on Beta/8. Reduction crashes Beta/8 debug but not opt on Mac OS X 10.5

Note You need to log in before you can comment on or make changes to this bug.