Closed Bug 887459 Opened 11 years ago Closed 10 years ago

Crash [@ js::ion::DoGetNameFallback] with recursive overflow

Categories

(Core :: JavaScript Engine, defect)

All
Linux
defect
Not set
critical

Tracking

()

RESOLVED DUPLICATE of bug 909094
Tracking Status
firefox22 --- unaffected
firefox23 --- affected
firefox24 --- affected
firefox25 --- affected
firefox-esr17 --- unaffected
b2g18-v1.0.0 --- unaffected
b2g18-v1.0.1 --- unaffected
b2g-v1.1hd --- unaffected

People

(Reporter: gkw, Unassigned)

Details

(Keywords: crash, regression, testcase, Whiteboard: [jsbugmon:update])

Crash Data

Attachments

(2 files, 1 obsolete file)

Attached file stack
try {
    a2 = [];;
    (function() {
        Object.defineProperty(this, "p0", {
            get: function() {
                return p0.map(2, f2);
            }
        })
    })()
    a2[0] = Math.atan2(/x/g, -29)
} catch (e) {}
try {
    (function() {
        Object.defineProperty(this, "t2", {
            get: function() {
                return new Int32Array(a2);
            }
        });
    })()
    for (var v of this.a2) {
        Array.prototype.push.call(a2, t2);
    }
} catch (e) {}
try {
    v1 = p0;
} catch (e) {}


crashes 32-bit js debug shell on m-c changeset e681476216e6 without any CLI arguments at js::ion::DoGetNameFallback.

It also seems to crash 64-bit, but takes a longer time to due so.

autoBisect shows this is probably related to the following changeset:

The first bad revision is:
changeset:   http://hg.mozilla.org/mozilla-central/rev/15850ed77719
user:        Luke Wagner
date:        Wed May 01 11:04:06 2013 -0700
summary:     Bug 865960 - JS OOM should throw instead of silently stopping execution (r=billm)
Flags: needinfo?(luke)
This is pretty interesting: we're crashing after too much recursion ($esp is not valid memory), however $esp - rt->mainThread.nativeStackBase is within the limit (so JS_CHECK_RECURSION is not failing).

I think what's happening is that we first exhaust all available memory (top shows 3.9G resident), which throws out of memory, which is now catchable (in the blamed bug), so now it gets caught and we now start an infinite recursion which I think fails earlier than the stack limit because there is no more address space available.

One fix would be to shrink the stack limit.  Another would be to change OOM reporting to be more like what bug 865960 comment 20 says, which makes sense.
Flags: needinfo?(luke)
Cutting gStackSize in the shell by half indeed causes a safe exit.

Even if catchable-oom makes this easy to hit, there are other ways to exhaust memory and hit these problems, so it seems like we just need to tweak the stack limits more to get a more conservative bound.  I see there already seems to be a lot of tweaking for various OS and build configurations...
Assigning to Luke since he has analysis in comment 1 and comment 2 to keep this on the radar. Please feel free to forward this on if this is not the case, thanks!
Assignee: general → luke
Status: NEW → ASSIGNED
Flags: needinfo?(luke)
(setting needinfo to hopefully move this forward)
I'm no longer able to reproduce this crash (on debug 32-bit linux); are you?  If you can, the fix, I think, is just to make gMacStackSize slightly smaller in shell/js.cpp.
Flags: needinfo?(luke)
Flags: needinfo?(gary)
I've retested that this still reproduces with m-c tip a6a046acc881.

CC="gcc -m32" CXX="g++ -m32" AR=ar PKG_CONFIG_LIBDIR=/usr/lib/pkgconfig sh ./configure --target=i686-pc-linux --enable-optimize --enable-debug --enable-profiling --enable-gczeal --enable-debug-symbols --enable-methodjit --enable-type-inference --disable-tests --enable-more-deterministic --with-ccache --enable-threadsafe <other NSPR flags>


However, neither gStackSize nor gMacStackSize exist now.

http://mxr.mozilla.org/mozilla-central/search?string=gMacStackSize&find=js%2Fsrc&findi=&filter=^[^\0]*%24&hitlimit=&tree=mozilla-central

http://mxr.mozilla.org/mozilla-central/search?string=gStackSize&find=js%2Fsrc&findi=&filter=^[^\0]*%24&hitlimit=&tree=mozilla-central
Flags: needinfo?(gary) → needinfo?(luke)
Oops, I mistyped: gMaxStackSize
Flags: needinfo?(luke)
What would you suggest gMaxStackSize to be changed to?

Code in question is:

#if defined(MOZ_ASAN) || (defined(DEBUG) && !defined(XP_WIN))
static size_t gMaxStackSize = 2 * 128 * sizeof(size_t) * 1024;
#else
static size_t gMaxStackSize = 128 * sizeof(size_t) * 1024;
#endif
Flags: needinfo?(luke)
I dunno, 127?
Flags: needinfo?(luke)
(That is, s/128/127/)
Attached patch patch v1 (obsolete) — Splinter Review
Assignee: luke → gary
Attachment #8339013 - Flags: review?(luke)
(Does this fix the crash?)
Comment on attachment 8339013 [details] [diff] [review]
patch v1

Unfortunately, no, it does not seem to fix the crash.
Attachment #8339013 - Flags: review?(luke)
No idea what's next here, back to the pool.
Assignee: gary → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(luke)
I reproduced this intermittently using --enable-posix-nspr-emulation:

CC="gcc -m32" AR=ar PKG_CONFIG_LIBDIR=/usr/lib/pkgconfig CXX="g++ -m32" sh ./configure --target=i686-pc-linux --enable-optimize --enable-debug --enable-profiling --enable-gczeal --enable-debug-symbols --enable-methodjit --enable-type-inference --disable-tests --enable-more-deterministic --with-ccache --enable-threadsafe --enable-posix-nspr-emulation

on m-c a6a046acc881, on Ubuntu 13.10.
I'm not sure what action to take here.  I looked again and it definitely does seem like this is a fundamental issue: if we allow the host JS program to make an unbounded amount of heap allocation, it can consume all available virtual address space (only feasible on 32-bit) so that the lazily-mapped stack OOMs (even though we're still within the conservative stack bounds).  Bug 865960 isn't really the culprit here, it just makes it easier to hit.  A "real" fix would I think be some form of heap allocation quota which is not a simple task.
No longer blocks: 865960
Flags: needinfo?(luke)
> A "real" fix would I think be some form of heap allocation quota which is not
> a simple task.

So, I guess this is still worth fixing..
QA Contact: general
Will using the same approach in bug 942547 resulting in bug 942547 comment 17, such that we can ignore all of such too-much-recursion crashes, work out well?

e.g. show a "Assertion failure: [too much recursion]" message?
Flags: needinfo?(luke)
The problem is that we don't necessarily hit "too much recursion" before we crash.  I forget, does this testcase report out of memory before crashing?  Because, roughly, that's what's happening: we use up all the available address space so the OS can't even lazily commit more stack.  I wonder if there is an OS flag that says "don't give away my stack's address space".
Flags: needinfo?(luke)
> I forget, does this testcase report out of memory before crashing? 

Yes, it does. FWIW, there's a similar bug 937550 about OOM crashing.

$ ./js-dbg-32-dm-ts-linux-599100c4ebfe 887459.js
js_ReportOutOfMemory called
js_ReportOutOfMemory called
Bus error (core dumped)
Flags: needinfo?(luke)
Attachment #8339013 - Attachment is obsolete: true
Right, well, then that could be useful for ignoring these crashes.
Flags: needinfo?(luke)
Well, I guess if you were able to detect that it will crash, then you couldn't need to force the crash right?

Reporting the crash as a controlled crash would only help if we could reliably detect that we will exceed stack size, but we cannot handle the situation gracefully. I don't think that's the case, is it?
Flags: needinfo?(luke)
(In reply to Luke Wagner [:luke] from comment #19)
> The problem is that we don't necessarily hit "too much recursion" before we
> crash.  I forget, does this testcase report out of memory before crashing? 
> Because, roughly, that's what's happening: we use up all the available
> address space so the OS can't even lazily commit more stack.  I wonder if
> there is an OS flag that says "don't give away my stack's address space".

Luke, Is this the same problem as in bug 909094?
(In reply to Jesse Ruderman from comment #24)
Yeah, totally.
Flags: needinfo?(luke)
> Luke, Is this the same problem as in bug 909094?

> Yeah, totally.

Duping to bug 909094 then.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: