Closed
Bug 1269705
Opened 8 years ago
Closed 7 years ago
Crash [@ js::ToPrimitiveSlow] with OOM and SIGBUS
Categories
(Core :: JavaScript Engine, defect)
Tracking
RESOLVED
DUPLICATE
of bug 909094
| | Tracking | Status |
|---|---|---|
| firefox47 | --- | fix-optional |
| firefox48 | --- | fix-optional |
| firefox49 | + | wontfix |
| firefox50 | + | wontfix |
People
(Reporter: decoder, Assigned: sfink)
References
(Blocks 1 open bug)
Details
(Keywords: crash, regression, testcase, Whiteboard: [jsbugmon:])
Crash Data
The following testcase crashes on mozilla-central revision 77cead2cd203 (build with --enable-optimize --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --target=i686-pc-linux-gnu --disable-tests --enable-simulator=arm --disable-debug, run with --fuzzing-safe --thread-count=2 --disable-oom-functions --ion-eager --baseline-eager --ion-offthread-compile=off --ion-extra-checks --arm-asm-nop-fill=1 --arm-sim-icache-checks --arm-hwcap=vfp):

function testSetTypedFloat32Array(k) {
  var ar = new Float32Array(8);
  var t = k + 555;
  var L = ar[k+7] = t & 5;
  if (t !== -1) {
    testSetTypedFloat32Array('#2: y = 1; y /= -1; y === -1. Actual: ' + (t));
  }
}
try {
  for (var i = 0; i <= 10; i++) {
    testSetTypedFloat32Array(0);
  }
} catch(exc1) {}
function assertEqArr(a1, a2) {}
var order = "";
var o1 = { toString: function() { order += "b"; return print + (this) - 98 >= a1; }};
var o2 = { valueOf: function() { order += "a"; return 1; }};
var res = "xyz-xyz".split(o1, o2);

Backtrace:

Program terminated with signal SIGBUS, Bus error.
#0  js::ToPrimitiveSlow (cx=cx@entry=0xf7275040, preferredType=preferredType@entry=JSTYPE_VOID, vp=vp@entry=...) at js/src/jsobj.cpp:3087
(gdb) bt 16
#0  js::ToPrimitiveSlow (cx=cx@entry=0xf7275040, preferredType=preferredType@entry=JSTYPE_VOID, vp=vp@entry=...) at js/src/jsobj.cpp:3087
#1  0x084e3c8e in ToPrimitive (vp=..., cx=0xf7275040) at js/src/jsobj.h:1034
#2  AddOperation (res=..., rhs=..., lhs=..., cx=0xf7275040) at js/src/vm/Interpreter.cpp:1334
#3  js::AddValues (cx=cx@entry=0xf7275040, lhs=lhs@entry=..., rhs=rhs@entry=..., res=res@entry=...) at js/src/vm/Interpreter.cpp:4410
#4  0x082c5eeb in js::jit::DoBinaryArithFallback (cx=0xf7275040, payload=payload@entry=0x0, stub_=stub_@entry=0x78780010, lhs=lhs@entry=..., rhs=rhs@entry=..., ret=ret@entry=...) at js/src/jit/SharedIC.cpp:928
#5  0x08370f89 in js::jit::Simulator::softwareInterrupt (this=0xf721c000, instr=0xf7202d34) at js/src/jit/arm/Simulator-arm.cpp:2380
#6  0x0837130d in js::jit::Simulator::decodeType7 (this=this@entry=0xf721c000, instr=instr@entry=0xf7202d34) at js/src/jit/arm/Simulator-arm.cpp:3502
#7  0x083727fc in js::jit::Simulator::instructionDecode (this=this@entry=0xf721c000, instr=instr@entry=0xf7202d34) at js/src/jit/arm/Simulator-arm.cpp:4424
[...]
(gdb) info reg
eax            0xffa01158   -6287016
ecx            0xf51f4298   -182500712
edx            0xf51f4298   -182500712
ebx            0x94e3418    156120088
esp            0xffa00fa0   0xffa00fa0
ebp            0xffa01158   0xffa01158
esi            0xf7275040   -148418496
edi            0x25         37
eip            0x83faa78    0x83faa78 <js::ToPrimitiveSlow(JSContext*, JSType, JS::MutableHandle<JS::Value>)+24>
eflags         0x10286      [ PF SF IF RF ]
cs             0x23         35
ss             0x2b         43
ds             0x2b         43
es             0x2b         43
fs             0x0          0
gs             0x63         99
(gdb) x /i $pc
=> 0x83faa78 <js::ToPrimitiveSlow(JSContext*, JSType, JS::MutableHandle<JS::Value>)+24>: call 0x807eb60 <__x86.get_pc_thunk.bx>

This could be a hit on the stack space boundary (the call instruction plus the value of esp looks like it could be near the end of the address space), but I'm not entirely sure. Marking s-s until investigated by a developer.
Updated•8 years ago
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:bisect]
Comment 1•8 years ago
JSBugMon: Cannot process bug: Unable to automatically reproduce, please track manually.
Updated•8 years ago
Whiteboard: [jsbugmon:bisect] → [jsbugmon:]
Assignee
Comment 4•8 years ago
Definitely looking like stack exhaustion so far, which is strange, since there's a JS_CHECK_RECURSION check 3 frames up. Looking...
Comment 5•8 years ago
There are some crashes on 47 and 48 showing up on crash-stats, and this issue is still unrated, so if we come up with a fix, please ask for sec-approval and uplift. Thanks.
status-firefox47:
--- → fix-optional
status-firefox48:
--- → affected
status-firefox50:
--- → affected
tracking-firefox50:
--- → +
Updated•8 years ago
Assignee: nobody → sphink
Assignee
Comment 6•8 years ago
This particular test only reproduces for me on a 32-bit ARM simulator build. It does not reproduce on either 64-bit or 32-bit Linux. https://crash-stats.mozilla.com/signature/?_sort=-date&signature=js%3A%3AToPrimitiveSlow#aggregations shows 82% of these crashes on FennecAndroid (which makes sense, and indicates that it's not just a simulator issue), but also 17% on desktop Firefox. None of those are SIGBUS, though. (And note that those are marked as bug 1248343, which is a dupe of bug 1232685, which was fixed in 47. And yet these crashes are in 47 and later.) Anyway, I can reproduce on arm-sim, so I'll look into it there.
Assignee
Comment 7•8 years ago
Bug 1286943 is related -- the script in comment 0 produces a different error message on an arm-sim build than on a 32-bit Linux build. So far, it seems like the crash in this bug occurs after you've OOMed once and then try to report the error. (The arm-sim error appears to be correct, btw; it matches a linux64 build. But I think this bug requires 32 bits.) Also note that I can trigger the bug much, much faster by doing:

ulimit -S -s 128     # limit stack size
ulimit -S -v 500000  # limit memory size

(The -S isn't actually needed; it's just there to prevent you from painting yourself into a corner.)
Assignee
Comment 8•8 years ago
Ok, documenting my current state of confusion. What is happening is that the test case has an infinite loop that eats up the C stack. It ends up segfaulting inside jemalloc when it tries to write to an address just below the end of the stack. I do not understand why it doesn't just extend the stack:

- ulimit -s says the stack is max 8MB.
- /proc/<pid>/maps says the stack is currently fff7e000-ffffe000 (512KB).
- JS_CHECK_RECURSION would cut things off at around 0xffefd001, which corresponds to a 1027KB stack.
- The stack pointer just before going into the crashing code is 0xfff7e0b0, very near the bottom of the mapped range.
- The stack pointer at the point of the crash is 0xfff7dfd0, which is indeed in the unmapped range; gdb won't let me read or write that address either.

I don't know why the kernel isn't growing the stack automatically when that address is touched. There must be something else limiting the stack to 512KB.

Needinfo'ing glandium because he's the only person I can think of who might have a clue about this type of craziness. glandium: do you know of additional limits?
Flags: needinfo?(sphink) → needinfo?(mh+mozilla)
Assignee
Comment 9•8 years ago
Oh, interesting: /proc/<pid>/limits lists a hard and soft stack limit of 512KB. Never mind; it looks like I set the limit myself and was running ulimit -s from the wrong shell. :(
Flags: needinfo?(mh+mozilla)
Assignee
Comment 10•8 years ago
So the easy reproductions I have were from artificially limiting the stack space, which unfortunately makes them invalid. But I can get a very similar-looking crash even when ulimit -s says 8MB. Unfortunately, that generates a truncated 4GB core file that gdb can't make much sense of.
Comment 11•8 years ago
What does /proc/$pid/maps look like when the crash happens?
Assignee
Comment 12•8 years ago
Ok, so the thing I can reproduce is another instance of bug 909094. I have ulimit -s set to the default of 8MB. The stack grows to about 1MB (I'm printing out stack sizes in CallJSNative every time it grows) and then segfaults. (Note that this bug's title refers to SIGBUS, not SIGSEGV; I'm not sure when you would see a SIGBUS instead, but it seems plausible.)

If I wait until the stack grows to 800KB or so and then dump out /proc/self/maps, I see a piece of the heap getting in the way of the stack:

f7ffd000-f7ffe000 rw-p 00022000 fd:00 3984528   /usr/lib/ld-2.22.so
f8000000-fff00000 rw-p 00000000 00:00 0
fff37000-ffffe000 rw-p 00000000 00:00 0         [stack]

This matches the behavior of the test case. It first OOMs on the heap, creating enough pressure for the kernel to decide to grab a piece of the stack's address range. Then it recurses, growing the stack until it hits the mmapped region.

Given that we appear to be seeing this in the field, it seems like we may need to more seriously consider the mitigations discussed in that bug. Note that this is probably primarily an Android issue, since Android has a 32-bit address space as well as low physical memory. So it's annoying if prefaulting the stack is going to commit all of its memory. Perhaps we could completely defeat the kernel's normal stack mechanism, for the main JS-running thread at least, by mmapping the whole stack (up to the rlimit, or up to our own JS_CHECK_RECURSION limits plus a big chunk of slop space) and madvise(MADV_DONTNEED)ing it?
Comment 13•8 years ago
Marking this fix-optional for now, but if this gets a rating and needs uplift, please let relman know or reset the flag to affected.
Updated•8 years ago
Group: javascript-core-security
Hi Steve, is this something that will be fixed in time for 50 or 51? It's a pretty low-volume crash, so I have tagged it as fix-optional.
Flags: needinfo?(sphink)
This is a pretty low-volume crash on the current release 49.x and also on beta 50. We are a week away from RC, so it's too late; this is now a wontfix for 50.
Updated•7 years ago
Assignee
Updated•7 years ago
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(sphink)
Resolution: --- → DUPLICATE