Closed Bug 484693 Opened 15 years ago Closed 15 years ago

TM: Crash [@ _chkstk]

Categories

(Core :: JavaScript Engine, defect, P1)

x86
Windows XP
defect

Tracking

()

VERIFIED FIXED
mozilla1.9.2a1

People

(Reporter: gkw, Assigned: graydon)

References

Details

(Keywords: crash, testcase, verified1.9.1, Whiteboard: fixed-in-tracemonkey)

Crash Data

Attachments

(7 files, 4 obsolete files)

Here's the first jsfunfuzz bug I found from Windows XP, and I'm sure there's a lot more too. It took me a long time to reduce this, but it's something jsfunfuzz on Win keeps hitting.

for (var x = 0; x < 100000; ++x) {
  for (y = 0; y < 3; ++y) {}
  var a = eval;
  delete eval;
  eval = a;
  var b = toString;
  delete toString;
  toString = b;
}

crashes TM both opt and debug WinXP builds with -j.

Being unable to yet use Windbg to get a stack, I resorted to putting the script into a HTML file, and a latest-tracemonkey WinXP build does indeed crash with the signature in http://crash-stats.mozilla.com/report/index/545904af-938a-46c2-840c-35d262090322?p=1

0  	js3250.dll  	_chkstk  	chkstk.asm:99
1 	js3250.dll 	js_MonitorLoopEdge(JSContext*,unsigned int&) 	js/src/jstracer.cpp:4255
2 	js3250.dll 	js_Interpret 	js/src/jsinterp.cpp:3129 

I'm turning this security-sensitive just in case an exploit is possible.
Flags: blocking1.9.1?
2 bugs here. 

1) we generate a crazy amount of traces (for every outer loop iterations). Need to find out why we are not blacklisting.

2) after a while something bad happens in the heap.

We should fix 2) first, and then eliminate 1).
CCing nick. I think this is a job for valgrind.
This works fine on macosx debug (way too many traces, but no crash).
Attachment #368801 - Attachment description: testcase (Warning: Crashes latest-tracemonkey Minefield) → testcase (Warning: Crashes Firefox 3.1 Beta 3 and latest-tracemonkey Minefield on WinXP)
Attached image backtrace image from cygwin (obsolete) —
I installed gdb on cygwin to try and get a stack trace, but I don't know why I had no symbols even though I had compiled debug and didn't change the paths. Nonetheless, the stack shows:

0 - js!JS_CompareValues
1 - js_GetSrcNoteOffset
2 - js_Invoke
3 - js_Invoke
4 - js!JS_ExecuteScript

I'm not sure if this is entirely accurate though.
Flags: blocking1.9.1? → blocking1.9.1+
Priority: -- → P2
Comment #4 is wrong. Here's the call stack from VS 2008 Express:

=====

 	js3250.dll!_chkstk()  Line 99	Asm
>	js3250.dll!js_MonitorLoopEdge(JSContext * cx=0x052bd9b0, unsigned int & inlineCallCount=0)  Line 4255 + 0x15 bytes	C++
 	js3250.dll!js_Interpret(JSContext * cx=0x052bd9b0)  Line 3740 + 0x56d bytes	C++
 	js3250.dll!js_Execute(JSContext * cx=0x052bd9b0, JSObject * chain=0x06791b40, JSScript * script=0x078127c8, JSStackFrame * down=0x00000000, unsigned int flags=0, int * result=0x00000000)  Line 1595 + 0x9 bytes	C++
 	js3250.dll!JS_EvaluateUCScriptForPrincipals(JSContext * cx=0x052bd9b0, JSObject * obj=0x06791b40, JSPrincipals * principals=0x077ef8cc, const unsigned short * chars=0x07810fe8, unsigned int length=165, const char * filename=0x077ec000, unsigned int lineno=7, int * rval=0x00000000)  Line 5112 + 0x19 bytes	C++
 	gklayout.dll!nsJSContext::EvaluateString(const nsAString_internal & aScript={...}, void * aScopeObject=0x06791b40, nsIPrincipal * aPrincipal=0x077ef8c8, const char * aURL=0x077ec000, unsigned int aLineNo=7, unsigned int aVersion=0, nsAString_internal * aRetValue=0x00000000, int * aIsUndefined=0x0012edc4)  Line 1603 + 0x42 bytes	C++
 	gklayout.dll!nsScriptLoader::EvaluateScript(nsScriptLoadRequest * aRequest=0x07810f70, const nsString & aScript={...})  Line 671 + 0x7a bytes	C++
 	gklayout.dll!nsScriptLoader::ProcessRequest(nsScriptLoadRequest * aRequest=0x07810f70)  Line 585 + 0x13 bytes	C++
 	gklayout.dll!nsScriptLoader::ProcessScriptElement(nsIScriptElement * aElement=0x07810c54)  Line 539 + 0x17 bytes	C++
 	gklayout.dll!nsScriptElement::MaybeProcessScript()  Line 193 + 0x13 bytes	C++
 	gklayout.dll!nsHTMLScriptElement::MaybeProcessScript()  Line 547 + 0xb bytes	C++
 	gklayout.dll!nsHTMLScriptElement::DoneAddingChildren(int aHaveNotified=1)  Line 485	C++
 	gklayout.dll!HTMLContentSink::ProcessSCRIPTEndTag(nsGenericHTMLElement * content=0x07810c30, int aMalformed=0)  Line 3134 + 0x12 bytes	C++
 	gklayout.dll!SinkContext::CloseContainer(nsHTMLTag aTag=eHTMLTag_script, int aMalformed=0)  Line 1023 + 0x12 bytes	C++
 	gklayout.dll!HTMLContentSink::CloseContainer(nsHTMLTag aTag=eHTMLTag_script)  Line 2389 + 0x11 bytes	C++
 	gkparser.dll!CNavDTD::CloseContainer(nsHTMLTag aTag=eHTMLTag_script, int aMalformed=0)  Line 2798 + 0x26 bytes	C++
 	gkparser.dll!CNavDTD::HandleEndToken(CToken * aToken=0x077ef2a8)  Line 1677 + 0x13 bytes	C++
 	gkparser.dll!CNavDTD::HandleToken(CToken * aToken=0x077ef2a8, nsIParser * aParser=0x077efe90)  Line 761 + 0xc bytes	C++
 	gkparser.dll!CNavDTD::BuildModel(nsIParser * aParser=0x077efe90, nsITokenizer * aTokenizer=0x077ff1b8, nsITokenObserver * anObserver=0x00000000, nsIContentSink * aSink=0x077f0014)  Line 333 + 0x16 bytes	C++
 	gkparser.dll!nsParser::BuildModel()  Line 2384 + 0x24 bytes	C++
 	gkparser.dll!nsParser::ResumeParse(int allowIteration=1, int aIsFinalChunk=0, int aCanInterrupt=1)  Line 2257 + 0xe bytes	C++
 	gkparser.dll!nsParser::OnDataAvailable(nsIRequest * request=0x077ec0d8, nsISupports * aContext=0x00000000, nsIInputStream * pIStream=0x077ec8d0, unsigned int sourceOffset=0, unsigned int aLength=253)  Line 2910 + 0x17 bytes	C++
 	docshell.dll!nsDocumentOpenInfo::OnDataAvailable(nsIRequest * request=0x077ec0d8, nsISupports * aCtxt=0x00000000, nsIInputStream * inStr=0x077ec8d0, unsigned int sourceOffset=0, unsigned int count=253)  Line 306 + 0x30 bytes	C++
 	necko.dll!nsBaseChannel::OnDataAvailable(nsIRequest * request=0x077ec568, nsISupports * ctxt=0x00000000, nsIInputStream * stream=0x077ec8d0, unsigned int offset=0, unsigned int count=253)  Line 708 + 0x52 bytes	C++
 	necko.dll!nsInputStreamPump::OnStateTransfer()  Line 508 + 0x40 bytes	C++
 	necko.dll!nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream * stream=0x077ec8d0)  Line 398 + 0xb bytes	C++
 	xpcom_core.dll!nsInputStreamReadyEvent::Run()  Line 112	C++
 	xpcom_core.dll!nsThread::ProcessNextEvent(int mayWait=1, int * result=0x0012f848)  Line 511	C++
 	xpcom_core.dll!NS_ProcessNextEvent_P(nsIThread * thread=0x00d7e530, int mayWait=1)  Line 230 + 0x16 bytes	C++
 	gkwidget.dll!nsBaseAppShell::Run()  Line 170 + 0xc bytes	C++
 	tkitcmps.dll!nsAppStartup::Run()  Line 192 + 0x1c bytes	C++
 	xul.dll!XRE_main(int argc=1, char * * argv=0x00d7b650, const nsXREAppData * aAppData=0x00d7bbe8)  Line 3220 + 0x25 bytes	C++
 	firefox.exe!NS_internal_main(int argc=1, char * * argv=0x00d7b650)  Line 156 + 0x12 bytes	C++
 	firefox.exe!wmain(int argc=1, wchar_t * * argv=0x00d79560)  Line 107 + 0xd bytes	C++
 	firefox.exe!__tmainCRTStartup()  Line 583 + 0x19 bytes	C
 	firefox.exe!wmainCRTStartup()  Line 403	C
 	kernel32.dll!7c817067() 	
 	[Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]	
 	nspr4.dll!_PR_NT_InitSids()  Line 119 + 0x4 bytes	C
 	0d020206()	



=====


Here's !exploitable output along with the stack.


0:000> !load winext\msec.dll
0:000> !exploitable -v
HostMachine\HostUser
Executing Processor Architecture is x86
Debuggee is in User Mode
Debuggee is a live user mode debugging session on the local machine
Event Type: Exception
*** WARNING: Unable to verify checksum for C:\objdir-tm\dist\bin\components\docshell.dll
*** WARNING: Unable to verify checksum for C:\objdir-tm\dist\bin\components\necko.dll
*** WARNING: Unable to verify checksum for C:\objdir-tm\dist\bin\xpcom_core.dll
*** WARNING: Unable to verify checksum for C:\objdir-tm\dist\bin\components\gkwidget.dll
*** WARNING: Unable to verify checksum for C:\objdir-tm\dist\bin\components\tkitcmps.dll
*** WARNING: Unable to verify checksum for C:\objdir-tm\dist\bin\xul.dll
Exception Faulting Address: 0x6ff0d7
First Chance Exception Type: STATUS_STACK_OVERFLOW (0xC00000FD)

Faulting Instruction:006ff0d7 test dword ptr [eax],eax

Basic Block:
    006ff0d7 test dword ptr [eax],eax
       Tainted Input Operands: eax
    006ff0d9 jmp js3250!_chkstk+0x14 (006ff0c4)

Exception Hash (Major/Minor): 0x7005322b.0x1f0e780d

Stack Trace:
js3250!_chkstk+0x27
js3250!js_MonitorLoopEdge+0x2d7
js3250!js_Interpret+0x5cb9
js3250!js_Execute+0x2dc
js3250!JS_EvaluateUCScriptForPrincipals+0xe5
gklayout!nsJSContext::EvaluateString+0x2ba
gklayout!nsScriptLoader::EvaluateScript+0x37a
gklayout!nsScriptLoader::ProcessRequest+0xfd
gklayout!nsScriptLoader::ProcessScriptElement+0x1031
gklayout!nsScriptElement::MaybeProcessScript+0x149
gklayout!nsHTMLScriptElement::MaybeProcessScript+0x24
gklayout!nsHTMLScriptElement::DoneAddingChildren+0x1f
gklayout!HTMLContentSink::ProcessSCRIPTEndTag+0xcf
gklayout!SinkContext::CloseContainer+0x31d
gklayout!HTMLContentSink::CloseContainer+0xa0
gkparser!CNavDTD::CloseContainer+0x18a
gkparser!CNavDTD::HandleEndToken+0x1dd
gkparser!CNavDTD::HandleToken+0x4ae
gkparser!CNavDTD::BuildModel+0x295
gkparser!nsParser::BuildModel+0xe2
gkparser!nsParser::ResumeParse+0x1bc
gkparser!nsParser::OnDataAvailable+0x228
docshell!nsDocumentOpenInfo::OnDataAvailable+0x4c
necko!nsBaseChannel::OnDataAvailable+0x6d
necko!nsInputStreamPump::OnStateTransfer+0x23d
necko!nsInputStreamPump::OnInputStreamReady+0x80
xpcom_core!nsInputStreamReadyEvent::Run+0x4a
xpcom_core!nsThread::ProcessNextEvent+0x1fa
xpcom_core!NS_ProcessNextEvent_P+0x53
gkwidget!nsBaseAppShell::Run+0x5d
tkitcmps!nsAppStartup::Run+0x6b
xul!XRE_main+0x3388
firefox!NS_internal_main+0x2b2
firefox!wmain+0x119
firefox!__tmainCRTStartup+0x1a8
firefox!wmainCRTStartup+0xf
kernel32!BaseProcessStart+0x23
Instruction Address: 0x6ff0d7

Description: Stack Overflow
Short Description: StackOverflow
Exploitability Classification: UNKNOWN
Recommended Bug Title: Stack Overflow starting at js3250!_chkstk+0x27 (Hash=0x7005322b.0x1f0e780d)
Attachment #368861 - Attachment is obsolete: true
Priority: P2 → P1
Assignee: general → gal
Reproduced under windows.

This might be an alloca overflow.
Stack overflows should not be exploitable -- anyone know otherwise?

/be
I would like to know whats up here. Then we can lift the security classification.
Breakpoint 2, js_ExecuteTree (cx=0x30bc90, f=0x30e650, inlineCallCount=@0xbffff210, innermostNestedGuardp=0xbfffedb4) at ../jstracer.cpp:3910
3910	    TreeInfo* ti = (TreeInfo*)f->vmprivate;
1: globalObj.dslots[-1] = 260
(gdb) c
Continuing.

Breakpoint 2, js_ExecuteTree (cx=0x30bc90, f=0x30e1c0, inlineCallCount=@0xbffff210, innermostNestedGuardp=0xbfffedb4) at ../jstracer.cpp:3910
3910	    TreeInfo* ti = (TreeInfo*)f->vmprivate;
1: globalObj.dslots[-1] = 260
(gdb) c
Continuing.

Breakpoint 2, js_ExecuteTree (cx=0x30bc90, f=0x30de60, inlineCallCount=@0xbffff210, innermostNestedGuardp=0xbfffedb4) at ../jstracer.cpp:3910
3910	    TreeInfo* ti = (TreeInfo*)f->vmprivate;
1: globalObj.dslots[-1] = 516
(gdb) c
Continuing.

The global object keeps growing new slots, eventually overflowing alloca.

This is 2 bugs in one. The global object shouldn't grow like this. And we shouldn't die. I fix the latter. The former sounds like a graydon bug, since he was poking around in that part of the attic.
Attached patch patchSplinter Review
Don't allow the global object to grow so many slots that alloca fails.
Attachment #370338 - Flags: review?(graydon)
Comment on attachment 370338 [details] [diff] [review]
patch

Not comfortable with this. We should not be using alloca() period. I know
it's fast, but it's also a way to crash relatively easily. This is C++, we
can do scoped smart pointers with destructors, there's no need to worry
about how hard it is to ensure all exit paths from a function call free().

Also don't know why the global object is growing endlessly, nor why we've got
an explosion of traces. Possibly global shape changes. 

Anyway, I'll post a patch shortly that adds a small smart-pointer class,
and axes our alloca() calls.
Attachment #370338 - Flags: review?(graydon) → review-
(In reply to comment #11)
> We should not be using alloca() period.

I second this. alloca never checks for stack overflow. So either we prefix it with custom checks for that (like the rest of code does when checking for too much native stack recursion) or eliminate its usage.
Isn't alloca for fixed size allocations kosher? We can crash on an overdeep stack without alloca, and stack overflow should not be exploitable or there's an OS bug to get fixed.

/be
I am not a big fan of alloca either. It has bitten us before. There is no way we can tolerate allocation cost within ExecuteTree though. Its already way too fat. We can re-allocate and grow a per-thread area for this if we want to. But not in the hot path.
(a patch for #14 is somewhere in bugzilla)
(In reply to comment #13)
> and stack overflow should not be exploitable

This is simply not true with alloca. An OS may try to leave unreadable memory between stacks from different stacks, but if alloca is script-controllable, an evil script can try to over-allocate so alloca spans over the protected pages and reaches the stack for another thread.
(In reply to comment #13)
> Isn't alloca for fixed size allocations kosher? We can crash on an overdeep
> stack without alloca, and stack overflow should not be exploitable or there's
> an OS bug to get fixed.

Depends what you mean by kosher. Stack overflows, in general, are ways of crashing C++ programs. There's nothing to do but "be as conservative as you can", keep the stack shallow. Alloca's no different than stack-local buffers or what have you. They all push you closer to "crash".

(In reference to Igor's point, I *believe* chkstk.asm goes page-by-page, making sure to probe all the pages it's advancing over. So if you're on a toolchain that's safety-conscious enough to do so, you're possibly safe. It's still a way of crashing.)
Yes, of course we should not crash on overdeep stacks! SpiderMonkey has lots of JS_CHECK_STACK_SIZE calls to avoid this.

But my point was that a stack overflow, like a null pointer deref, on a *sane* OS, is not exploitable. So why am I looking at a light-orange bug?

/be
#17 just gave us a great reason to stop using alloca here. It has to touch all the pages in between. A pre-malloced buffer doesn't have to do that. Perf win. Security win. I think we are beating a dead horse here. Lets do some patching.
Attached patch First cut (obsolete) — Splinter Review
This removes all the alloca() calls aside from the one gal's concerned about on the hot path.

I'll post a followup that includes the hot path persistent-buffer approach from bug 475998 shortly. Meanwhile, any thoughts? Is this ok?
Attachment #370501 - Flags: review?(gal)
The name is really ugly (jsTmpArray). Otherwise looks good. Waiting for the final piece for r+.
Comment on attachment 370501 [details] [diff] [review]
First cut

>+class jsTmpArray

JS typenames are all JSFooType spelling, but with C++ we can (should, for internal types) lose the poor-man's JS namespace. Old Unix hackers like me love tmp or /tmp, but Tmp not so much, so TempArray seems best. We have TempValueRooter (with JS prefix but no matter).

/be
(In reply to comment #22)
> (From update of attachment 370501 [details] [diff] [review])
> >+class jsTmpArray
> 
> JS typenames are all JSFooType spelling,

The readme (now MDC docs?) talks about the obvious jsscalartype alternative style.

/be
Assignee: gal → graydon
Attached patch An improved variant. (obsolete) — Splinter Review
As predicted, it's ... challenging to compete with alloca on speed! But I
think I've got it as close to noise as matters, and I think this is an
important adjustment: the ability to be alloca-crashed given overlarge user
input is not good.

This variant switches to using a JS arena pool for the temp arrays, and
incorporates gal's previous work on making the global and interp-state area
persist between executions. 

There was some initial real cost due to misfeatures in the way we're using
arena pools; it's easy to make them thrash on malloc, losing all their
advantage. I've attempted to fix that a bit in this patch, and it seems
that I can make up most-to-all of the speed by such corrections: the patch
is faster on linux and slower on mac, but either way within less-than-1%
difference. About 0.8% on mac.

(In investigating this, I've also noticed that the "reserve then release"
idiom we are using to make stack reconstruction "infallible" on exit is
... probably wrong? I don't see any reason why the stack arena pool will
*necessarily* succeed in re-supplying you with a chunk of memory you've
already released. But I guess that's an issue for a later patch.)
Attachment #370501 - Attachment is obsolete: true
Attachment #370562 - Flags: review?(gal)
Attachment #370501 - Flags: review?(gal)
The first argument is not entirely correct. The previous 2-liner already stopped any issues related to overlarge user input. This patch is about avoiding alloca in general.

I am not sure a 10-15ms perf loss is really the way to go. This patch still has ceilings for the maximum number of slots supported. If we already go with fixed ceilings, we can as well just use fixed size stack allocation, which is safe and unbeatable in performance. Also, why do the maximums change? 1024->512 and 64->32?
(In reply to comment #25)
> The first argument is not entirely correct. The previous 2-liner already
> stopped any issues related to overlarge user input. This patch is about
> avoiding alloca in general.

Yes, because alloca in general is a risky call to make. It's open-coded as a single unchecked sp decrement on many toolchains. Do you believe there are no ways to get any of the other allocas to hit large values?

> I am not sure a 10-15ms perf loss is really the way to go. This patch still has
> ceilings for the maximum number of slots supported. If we already go with fixed
> ceilings, we can as well just use fixed size stack allocation, which is safe
> and unbeatable in performance. Also, why do the maximums change? 1024->512 and
> 64->32?

You're mixing issues, though I understand your hesitation to not want to regress performance-wise. Hence why I've been poking around trying to make up the difference elsewhere. I'm really only focused on removing alloca here; it's a bad thing to get in the habit of using. This has nothing much to do with whether there's a fixed cap on the number we pass to a particular alloca. There are lots of allocas floating around jstracer. One was even in a spec-illegal position (inside a function argument list).

Currently, for example, all the perf loss is focused in leavetree, in its use of the FOREACH macro, which allocas an array I'm turning into a TempArray. Nobody's even proposing to make this allocation carry a fixed-cap. I don't think it'd even be correct anymore if it did.

The maximums should change because the current maximums add up to a MAX_INTERP_STACK_BYTES that's over 11,000 bytes. This is a performance problem: it means every time you currently JS_ARENA_ALLOCATE in js_ExecuteTree you're overflowing the stackPool's arena size (8192) and hitting the malloc slow-path. This is a major cost center in the profiles I'm looking at. I only found it while squeezing cost out of the TempArray's uses of arena pools as well. This is the meaning of the comment above wrt arena pools: "it's easy to make them thrash on malloc".
My point is if we have ceilings, why use pool allocation? Why are we not using fixed[maxsize] stack allocation in ExecuteTree?
(In reply to comment #27)
> My point is if we have ceilings, why use pool allocation? Why are we not using
> fixed[maxsize] stack allocation in ExecuteTree?

Because all such things chew up stack. Just like alloca. We keep talking past one another.
I don't think this bug should be security-sensitive. Rob, can you make a final ruling? Thanks.

If we do not nest js_ExecuteTree, why not just use fixed-size stack buffers? No risk of perf loss, and no risk of stack overflow beyond any unrelated unchecked recursion leading to overdeep stack.

If we need to JS_CHECK_STACK_SIZE somewhere let's do that. I just do not see why we'd malloc (and retain) space that could be LIFO allocated more quickly off the stack.

/be
We do nest js_ExecuteTree in case of LeaveTree from inside an interpreter recursion. Its rare but possible.
I mean to write, if we do not nest js_ExecuteTree, then it should not use heap instead of stack just in case the stack is overdeep -- the bug in that case, or in any case, is lack of JS_CHECK_STACK_SIZE somewhere.

If js_ExecuteTree does nest (tree-call, duh) then the same argument applies but the missing JS_CHECK_STACK_SIZE would seem to belong in js_ExecuteTree or its caller.

/be
Loops don't nest so deeply that we should move small-ish stack arrays to the heap. Again, the problem with unchecked stack growth leading to a fatal (and on any OS we should support unexploitable) crash is a lack of JS_CHECK_STACK_SIZE, *not* an abuse of stack allocation over heap allocation. If the arrays were truly huge, then sure. They aren't.

/be
Simple nested loop don't cause recursion here. They all use the same stack areas. Only deep-bailing and then triggering another tree while js_ExecuteTree is still on the stack causes it.
Oh right -- so anyway, the deal with stack overflow checking in SpiderMonkey is that it's incumbent on recursive functions to JS_CHECK_STACK_SIZE. So pick the "lowest" recursive function implicated here and use that macro, and we don't need to heap-allocated anything.

/be
(In reply to comment #34)
> Oh right -- so anyway, the deal with stack overflow checking in SpiderMonkey is
> that it's incumbent on recursive functions to JS_CHECK_STACK_SIZE. So pick the
> "lowest" recursive function implicated here and use that macro, and we don't
> need to heap-allocated anything.

No, the deal with stack overflow checking in spidermonkey is to rewrite things so they don't chew up stack in a way that lets users DoS you (or worse: you've conveniently ignored the notion that it *can* be made to land in other thread's stacks, or the heap). You've argued exactly this point to me in the past, recall the acrobatics the parser does to limit its stack.

The arrays are multiple pages per nesting. This makes them smell like fine attack targets to me. The whole point of the bug is that someone *did* manage to hit input that already overshoots the stack segment. The arguments so far about it being rare or unlikely or easily contained are pure hand-waving. CHECK_STACK_SIZE'ing all over the code might work; but it might be required in places we don't have safe non-stack-chewing bail-out paths for either. Nobody's tried it yet; and either way it's more fragile than just staying shallow(er) by design.

Anyway, I'm tired of fighting this and there are other pressing blockers. If you want to overrule and commit something bandaid-y that leaves alloca or fixed stack-buffers in place, ask another reviewer. I'm not going to r+ it, because I think it's a wrong approach. I've said as much and proposed a number of alternatives. We're just wasting time rehashing this disagreement.

(You should still pick up the change to lower the maximums; that's a pure performance win in your existing use of arena pools. You are *already hitting* malloc each time through executetree.)
Assignee: graydon → gal
(In reply to comment #35)
> No, the deal with stack overflow checking in spidermonkey is to rewrite things
> so they don't chew up stack in a way that lets users DoS you (or worse: you've
> conveniently ignored the notion that it *can* be made to land in other thread's
> stacks, or the heap).

I have done no such thing. The OS grows the stack without exploitable crash or we do not support it. Do you have evidence that any OS we support does otherwise for stack allocations (*not* for alloca)?

> You've argued exactly this point to me in the past,
> recall the acrobatics the parser does to limit its stack.

What do you mean?

The parser is highly recursive and can overflow common OS stack ulimits, which is why we added JS_CHECK_STACK_SIZE opportunistically to its recursive functions.

> The arrays are multiple pages per nesting. This makes them smell like fine
> attack targets to me.

Again, what OS will fail to redzone and crash safely?

The way to avoid that is JS_CHECK_STACK_SIZE.

> The whole point of the bug is that someone *did* manage
> to hit input that already overshoots the stack segment. The arguments so far
> about it being rare or unlikely or easily contained are pure hand-waving.

No, you're not reading what I wrote. The bug is lack of JS_CHECK_STACK_SIZE, not abuse of heap instead of stack for LIFO store.

/be
(In reply to comment #36)
> (In reply to comment #35)
> Again, what OS will fail to redzone and crash safely?
> 
> The way to avoid that is JS_CHECK_STACK_SIZE.

I meant to write *unsafely* -- the question was rhetorical.

The next statement was about the way to avoid safe but to-be-avoided stack overflow crashes.

Less heat, more light.

/be
I will not apply any patch for this issue until there is consensus.
Ok, I lied. This is actually not recursive. In case of deep bail we move all the data from the native areas to the interpreter stack/global object, so we can just paint over it. So one per-thread (per trace monitor) heap allocated area is fine.
Ok then!

Graydon, does the separate stack depth checking argument make sense? You are right that if a recursive function has a big enough stack buffer, we may not check stack depth enough to avoid crashing. We set the stack limit to 512K, which is small enough that any slop before the next check is not enough to crash.

This would be adjustable if js_ExecuteTree were recursive. But it's not so sorry for the digression. To make it up all I can say is the patch looks (and looked) good, apart from any stack vs. heap confusion. Sorry for that,

/be
Attached file proof of concept
With respect to:

> Again, what OS will fail to redzone and crash safely?

For future reference, here's what it looks like when I run the attached code:

$ ./stk
write var is at 0xb7d373b4
read var is at 0xb75363c4
read i=0
writing to buf[13]
read i=255

IOW, both normal C++ dynamic-sized stack allocation and, if you switch the ifdef, the use of alloca generate code that (on linux-gcc) happily walks off your thread's stack and into your neighbour's, and doesn't crash. No matter if you're "inside" the legal range of the supposed buffer. No gradual probe-every-page redzone triggering or anything. Hence quite exploitable.

Naturally, also lets you scribble on the heap.

As far as I know OSX does this too, though with slightly different spatial arrangements. The loader randomizes only a little; it's only got so many megs of possible-places to stick the thread stacks. Some day we'll have 64-bit processes, for real :)
Get a real OS! :-P (typed from a Mac -- you sure it fails too? NeXT infusion notwithstanding?

I bet $1 (US rubles) that Windows does better.

Our existing stack depth checks and 512K limit seem enough to save us given default settings, but a small enough ulimit could cause a crash too, at least on toy OSes. Bleah.

/be
(In reply to comment #40)

> This would be adjustable if js_ExecuteTree were recursive. But it's not so
> sorry for the digression. To make it up all I can say is the patch looks (and
> looked) good, apart from any stack vs. heap confusion. Sorry for that,

Keep in mind that the 2nd version of the patch has no mallocs, only arena pools, and it'll only hit slow paths if someone's messing with us. So it's pretty "LIFO" as-is.

Anyway, I'm sorry I snitted. Inappropriate of me. 

It sounds like the whole thing can be done in a set of fixed per-thread buffers. I'll try to write up a variant that does that.

(The thing that sort of boggles me is how it's slower on OSX; the arena-pool pointer bump should be very nearly identical cost to an esp bump -- I JS_ALWAYS_INLINE it in the version I'm testing -- and I'm carving out a bunch of very expensive mallocs the existing code does. It *should* be faster.)
Assignee: gal → graydon
Attached patch faster variant (obsolete) — Splinter Review
Bleh. I think I must have got confused about which flavour I was testing. The affect I was seeing earlier was failure-to-inline in a few cases. Removing those and not getting my testing environment messed up, then we're hitting fast-paths on the arena pool in all cases, and it's inlined so it's really just doing a compare-and-bump on arena pointers.

That *might* have still cost fractionally more than alloca (which doesn't even do the compare) but as I said, the patch is paying its way performance-wise by removing the existing mallocs: the *combined* patch runs sunspider faster than unpatched now, on mac as well as linux.

(I'll grant, more fixed-size buffers might be fractionally faster too, but then you're stuck needing to insert/rearrange a bunch of code to bail out when we overflow limits on those fixed sizes. This is comparatively uninvasive.)

Sufficient penance? I can chuck it at a tryserver to be sure.
Attachment #370562 - Attachment is obsolete: true
Attachment #370762 - Flags: review?(gal)
Attachment #370562 - Flags: review?(gal)
Attachment #370762 - Flags: review?(gal) → review+
Comment on attachment 370762 [details] [diff] [review]
faster variant

The smaller sizes may hurt recursion, increasing them will hit the pool slow path. Maybe an assert to guard that we don't make them so slow that we hit the bad path. And I think I would prefer if we do the non-pool approach eventually. But we can certainly go with this for now.
Confirmed on windows VM, this patch fixes the bug also.
http://hg.mozilla.org/tracemonkey/rev/b512be855093
Whiteboard: fixed-in-tracemonkey
Backed out due to massive failures in jsfunfuzz http://hg.mozilla.org/tracemonkey/rev/a55a5e3c4276
Whiteboard: fixed-in-tracemonkey
see 486812 for test case
(In reply to comment #47)
> http://hg.mozilla.org/tracemonkey/rev/b512be855093

caused js1_8/regress/regress-477234.js

Assertion failure: *(uint64*)&global[STOBJ_NSLOTS(JS_GetGlobalForObject(cx, cx->fp->scopeChain))] == 0xdeadbeefdeadbeefLL, at ../jstracer.cpp:4241;
Thanks bob. That actually explains the crashes we were seeing quite well. Maybe we should land my minimal patch until this is sorted out.
also, this changed js1_2/regexp/alphanumeric.js and js1_2/regexp/digit.js opt/shell from a crash to TypeError: (void 0) is undefined.
Sounds like the chunk of code copied from bug 475998. Maybe I misunderstood or mis-copied it.
Attached patch updated patchSplinter Review
Previous variant left odd (stale) debugging machinery in place. Caused clobbers to un-owned memory in debug builds, quite random and quite platform-specific. Working on win32 presently. This variant is (I think) free from such flaws; it fixes as many of the cases as I've been able to reproduce here, anyways.

Will post changes-to-patch as separate diff momentarily to clarify what changed.

Not sure why bc saw other changes; would appreciate a double-check on this patch from bc and jruderman as it is behaving spooky, and I'm not able to reproduce all their specific failures. Also not sure why this variant causes change in trace stats for one test. 

(Also not sure why global object is growing quickly in the first place. But that's a separate bug.)
Attachment #370762 - Attachment is obsolete: true
Attachment #371368 - Flags: review?(gal)
This is just a sub-patch showing the differences between the last 2 variants I uploaded, for review (minus one hunk that simply drifted context, but was otherwise identical).
(In reply to comment #24)
> (In investigating this, I've also noticed that the "reserve then release"
> idiom we are using to make stack reconstruction "infallible" on exit is
> ... probably wrong? I don't see any reason why the stack arena pool will
> *necessarily* succeed in re-supplying you with a chunk of memory you've
> already released. But I guess that's an issue for a later patch.)

When I introduced that, I read the arena code to make sure it does, even if you free a big chunk then allocate in several smaller chunks.  mrbkap agreed at the time that it was safe.  If this is no longer true, or I misread, it's definitely a bug.
The arena code has not changed. We need to agree on how it works and make that requirement stick, with assertions more than docs, but both of course.

/be
It looks to me like JS_ArenaRelease calls FreeArenaList on the tail of the pool (after the mark), and that subsequently calls free() on all the tail arenas. Perhaps I'm misreading.
FYI I've set up the fuzzer here and while the former patch certainly provokes a crash within moments, the new one appears to run to completion (whatever its iteration count is).
Attachment #371369 - Flags: review+
Landed again in http://hg.mozilla.org/tracemonkey/rev/143e997c858e, bounced off a mac tinderbox and was backed out again in http://hg.mozilla.org/tracemonkey/rev/6eca563de031. Also the ARM tinderboxen were (naturally!) disagreeing over the drift in trace-stats. Which as of yesterday, appears on my workstation to be triggerable by leaking 8 bytes of random malloc, unrelated to anything else. Sigh.

I'm beginning to despair on this bug. I think I'll land gal's initial patch for now. It fixes the original case minimally, it's true, and I cannot off the top of my head construct an attack that actually abuses the current state of affairs to the point of exploitation. I'm just worried longer-term about stack growth.

The remainder of this bug can then be taken off blocking status, have its critical/P1 priority dropped to P2/enhancement or such, and left assigned to me for mopping up after this release cycle.
Attachment #370338 - Flags: review- → review+
Probably should clone this bug into a followup bug.

/be
As you like. Minimal version landed as http://hg.mozilla.org/tracemonkey/rev/6b6158fa172d. Residual work transferred to new bug 487324.
I believe that stuck. Can close when merged to trunk. Sorry for all the fuss.
Whiteboard: fixed-in-tracemonkey
http://hg.mozilla.org/mozilla-central/rev/6b6158fa172d
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
oops, wrong changeset (the one above was later backed out)
Keywords: fixed1.9.1
can't repro on mac. i'll try win later.
Flags: in-testsuite+
I did a testrun with the latest nightly builds on trunk and 1.9.1 on WinXP and cannot see the crash anymore. But while running the testcase I got the unresponsive script warning, clicked on stop and quit Firefox. Now a process hangs in memory for about 2mins and consumes 100% cpu load. Don't we really stop on processing the script?

Shall I file a new bug on that issue?
Target Milestone: --- → mozilla1.9.2a1
Henrik: can you use shark if you are on a Mac (you'll need XCode and the CHUD tools) to see where the 100% CPU load is being spent? Or similar for other OSes, just need to find the best profiler and attach it to the process.

/be
I followed whimboo's STR and got the same "hang for awhile on shutdown" - did a Shark analysis off a Shark latest-tracemonkey build, and here's what I got.

It seems to be spending lots of time in js_SearchScope.
graydon, your favorite bug is back, in a different flavor. See #72
(In reply to comment #72)
> Created an attachment (id=374175) [details]
> Shark profile screenshot
> 
> I followed whimboo's STR and got the same "hang for awhile on shutdown" - did a
> Shark analysis off a Shark latest-tracemonkey build, and here's what I got.
> 
> It seems to be spending lots of time in js_SearchScope.

David or Graydon, is this an already known bug or shall I file a new one?
(In reply to comment #74)

> David or Graydon, is this an already known bug or shall I file a new one?

Hmm, I'm already hunting a couple in this space this afternoon but feel free to file a fresh one and assign to me. I'll mark as dupe if it happens to be so.
Henrik, Gary: this looks like bug 489636 (see bug 488967 comment 12). Could file new and see about dup'ing, or just optimize for that (likely, IMHO) outcome by commenting there. But no need to comment more in this bug about 100% CPU in GC.

/be
(In reply to comment #72)
> Created an attachment (id=374175) [details]
> Shark profile screenshot
> 
> I followed whimboo's STR and got the same "hang for awhile on shutdown" - did a
> Shark analysis off a Shark latest-tracemonkey build, and here's what I got.
> 
> It seems to be spending lots of time in js_SearchScope.

Shark lies. Need full symbols, should see static functions in the same file whose address "rounds down" to js_SearchScope, the nearest extern function.

/be
Verified fixed with testcase given in comment 0 on trunk and 1.9.1 with the
following debug builds:

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre)
Gecko/20090422 Minefield/3.6a1pre ID:20090422224452

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b4pre)
Gecko/20090422 Shiretoko/3.5b4pre ID:20090422122043

(In reply to comment #76)
> Henrik, Gary: this looks like bug 489636 (see bug 488967 comment 12). Could

Thanks Brendan. Let's watch this bug for now.
Status: RESOLVED → VERIFIED
Attachment #371368 - Flags: review?(gal)
Flags: wanted1.9.0.x-
cvsroot/mozilla/js/tests/js1_8_1/trace/trace-test.js,v  <--  trace-test.js
new revision: 1.14; previous revision: 1.13

/cvsroot/mozilla/js/tests/shell.js,v  <--  shell.js
when this bug is opened, the test should be checked in.
Flags: in-testsuite+ → in-testsuite?
Crash Signature: [@ _chkstk]
Filter on qa-project-auto-change:

Bug in removed tracer code, setting in-testsuite- flag.
Group: core-security
Flags: in-testsuite? → in-testsuite-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: