Closed
Bug 942984
Opened 11 years ago
Closed 11 years ago
Emscripten web page hard-crashes the browser.
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
VERIFIED
FIXED
mozilla28
People
(Reporter: jujjyl, Assigned: bhackett1024)
References
Details
(Keywords: regression)
Attachments
(1 file)
7.88 KB,
patch
|
billm
:
review+
lsblakk
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
This one occurs on nightly build 2013-11-25 on OSX 10.8.4. 1. Visit https://dl.dropboxusercontent.com/u/40949268/emcc/bugs/nehe-emscripten/lesson7.html (to download a locally executable version, get https://dl.dropboxusercontent.com/u/40949268/emcc/bugs/nehe-emscripten/lesson7.zip ) The browser will crash hard to desktop with a dialog "Firefox had a problem and crashed. We'll try to restore ..." message.
Reporter | ||
Comment 1•11 years ago
|
||
Reports: https://crash-stats.mozilla.com/report/index/4ff43ba3-0381-4436-9c6f-a223b2131125 https://crash-stats.mozilla.com/report/index/b4f0a53d-a53c-494c-9580-f0fff2131125 https://crash-stats.mozilla.com/report/index/c639209b-df8d-41c3-aaa2-def512131125 https://crash-stats.mozilla.com/report/index/532dcc9c-8287-4b0a-904a-115b92131125
Reporter | ||
Comment 2•11 years ago
|
||
Current Aurora as of today did not crash.
Comment 3•11 years ago
|
||
Hmm. All of those crash reports are empty. :( That said, this is quite reproducible: Program received signal EXC_BAD_ACCESS, Could not access memory. Reason: KERN_PROTECTION_FAILURE at address: 0x0000000114a00ff8 [Switching to process 46321 thread 0x7d13] js::gc::BarrieredCell<JSString>::zoneFromAnyThread (this=0x125bb18a0) at Barrier.h:155 155 JS_ALWAYS_INLINE JS::Zone *zoneFromAnyThread() const { return tenuredZoneFromAnyThread(); } (gdb) At this point, my stack has 3234 frames on it, most of them like so: #3076 0x00000001016e607a in js::frontend::Parser<js::frontend::SyntaxParseHandler>::statement (this=0x114a805b0, canHaveDirectives=false) at Parser.cpp:5205 #3077 0x00000001016e6fc1 in js::frontend::Parser<js::frontend::SyntaxParseHandler>::ifStatement (this=0x114a805b0) at Parser.cpp:3958 #3078 0x00000001016e607a in js::frontend::Parser<js::frontend::SyntaxParseHandler>::statement (this=0x114a805b0, canHaveDirectives=false) at Parser.cpp:5205 #3079 0x00000001016e6fc1 in js::frontend::Parser<js::frontend::SyntaxParseHandler>::ifStatement (this=0x114a805b0) at Parser.cpp:3958 (in fact, all the frames from frame 25 to frame 3220 kinda look like that). I bet we're running out of stack space. Looking at the lesson7.js script, the function __ZN5Regal5Token14GLenumToStringEj has a long if/else cascade with 3313 else clauses. So it's entirely possible that this used to fit in the available stack space by accident and now just doesn't....
tracking-firefox28:
--- → ?
Component: General → JavaScript Engine
Keywords: regression,
regressionwindow-wanted
Comment 4•11 years ago
|
||
The regression range is http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a4e9c9c9dbf9&tochange=b9029b1de410 and the crash is for an async script, while parsing on a worker thread. So presumably a regression from bug 906371. But then why is it not an issue on Aurora? Are we sure the crash doesn't happen there? It looks like the JS compile helper threads are started via PR_CreateThread with a default thread stack size (over in jsworkers.cpp). If https://developer.apple.com/library/mac/qa/qa1419/_index.html is not out of date, that default stack size is 512KB for pthreads on Mac, whereas the main thread uses an 8MB stack. So I can definitely see how we might have fit in the latter but not in the former. It looks like the worker GC threads use a 1MB stack. But all that would do is buy us a few more thousand if nesting levels... The real right fix probably involves doing this parse non-recursively or something, if emscripten outputs long if/else chains like this. Brian, does it make sense to just bump up the thread stack size or these worker threads for now? On Linux, looks like pthreads default to a 2MB stack. On Windows, I would assume we end up with the default 1MB thread stacks. So the Mac's 512KB is the lowest of the lot.
Blocks: 906371
tracking-firefox26:
--- → ?
tracking-firefox27:
--- → ?
Flags: needinfo?(bhackett1024)
Reporter | ||
Comment 5•11 years ago
|
||
I would recommend not increasing the thread stack size, since that would potentially hide the bug, waiting for someone else to write a new application that consumes the bigger stack space and crash again. It's ok (for now) if that page doesn't run but throws a "stack overflow" exception or similar, to start with.
Reporter | ||
Comment 6•11 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #4) > It looks like the worker GC threads use a 1MB stack. But all that would do > is buy us a few more thousand if nesting levels... The real right fix > probably involves doing this parse non-recursively or something, if > emscripten outputs long if/else chains like this. The compiled code for the symbol __ZN5Regal5Token14GLenumToStringEj is from here: https://github.com/p3/regal/blob/53fc58e83f68a6f10e164b631f9e628bc411ed3b/src/regal/RegalToken.cpp#L712 It might be possible to improve emscripten output, azakai knows more, but we should not change emscripten codegen behavior before this crash is fixed, to not hide the bug.
Assignee | ||
Comment 7•11 years ago
|
||
I agree with Jukka, we have a mechanism to detect stack overflows during parsing and that should be doing the right thing here.
Comment 8•11 years ago
|
||
So the parser does have JS_CHECK_RECURSION bits. In my case, these seem to macro-expand to: do { int stackDummy_; if (!((uintptr_t)(&stackDummy_) > (js::GetNativeStackLimit(cx))-(0))) { js_ReportOverRecursed(cx); return false; } } while (0); js::GetNativeStackLimit is over in jscntxtinlines in this case and ends up returning cx->perThreadData->nativeStackLimit[kind]. On this thread, cx->perThreadData->nativeStackLimit is an array of three 0 entries. So the RHS of that compare is always 0, the &stackDummy is always greater than it, and we never stop the recursion.
Comment 9•11 years ago
|
||
We seem to only set the nativeStackLimit stuff to anything nonzero in js::RecomputeStackLimit, which only recomputes it for rt->mainThread.
Comment 10•11 years ago
|
||
It's not clear to me whether NSPR has a way to ask what the thread stack size is of a thread you created. In any case, we'd presumably want the same stack limit across architectures....
Assignee | ||
Comment 11•11 years ago
|
||
Yeah, we're not setting the native stack limit anywhere. This patch just uses 512K as trying to figure out what the 'default' setting should be is super baroque, see the XPCJSRuntime constructor. This patch also propagates the overrecursed bit to the parse task (the worker thread can't report exceptions directly) so that a message appears in the console after the parse fails.
Attachment #8338626 -
Flags: review?(wmccloskey)
Flags: needinfo?(bhackett1024)
Assignee | ||
Updated•11 years ago
|
Assignee: nobody → bhackett1024
Attachment #8338626 -
Flags: review?(wmccloskey) → review+
Assignee | ||
Comment 12•11 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/c5714ab2828b
Comment 13•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/c5714ab2828b
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
Comment 14•11 years ago
|
||
Unless this is something we expect a lot of users to hit, the example looks like an edge case specific to the link and not a widely impacting site so we should only take this up to FF27 given that we're about to build 26 RC on Monday. Brian or bz - ni? me if you disagree strongly with this assessment, if I'm missing something here.
Comment 15•11 years ago
|
||
One concern I have about the different stack size on main thread and workers: a script will compile fine until it's made async, and then it suddenly won't... I'm not sure I have a proposed solution, unfortunately.
Assignee | ||
Comment 16•11 years ago
|
||
Comment on attachment 8338626 [details] [diff] [review] patch [Approval Request Comment] Bug caused by (feature/regressing bug #): bug 906371 User impact if declined: stack overflow crashes on unusual web scripts Testing completed (on m-c, etc.): on m-c Risk to taking this patch (and alternatives if risky): none, except potential but nonsensical windows tp5 regressions (Bug 943924)
Attachment #8338626 -
Flags: approval-mozilla-aurora?
Updated•11 years ago
|
Updated•11 years ago
|
Attachment #8338626 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 17•11 years ago
|
||
This has non-trivial conflicts on Aurora.
Assignee | ||
Comment 18•11 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/d1a206d2c6b3
Flags: needinfo?(bhackett1024)
Updated•11 years ago
|
Updated•11 years ago
|
Keywords: branch-patch-needed,
regressionwindow-wanted
Comment 19•11 years ago
|
||
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:28.0) Gecko/20100101 Firefox/28.0 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:27.0) Gecko/20100101 Firefox/27.0 Verified as fixed on latest Aurora 28.0a2 (buildID: 20131218004001) and 27 Beta 2 (buildID: 20131216183647).
Status: RESOLVED → VERIFIED
Keywords: verifyme
Comment 20•10 years ago
|
||
Reflagging this for verification as per https://bugzilla.mozilla.org/show_bug.cgi?id=943924#c14
Keywords: verifyme
Comment 21•10 years ago
|
||
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:28.0) Gecko/20100101 Firefox/28.0 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0 Verified as fixed on Firefox 28 beta 2.
Keywords: verifyme
You need to log in
before you can comment on or make changes to this bug.
Description
•