Closed Bug 855536 Opened 12 years ago Closed 12 years ago

Crash [@ js::EncapsulatedValue::writeBarrierPre] with [@ js::CloneFunctionAtCallsite] and [@ js::gc::MarkString] on the stack

Categories

(Core :: JavaScript Engine, defect)

x86
All
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla23
Tracking Status
firefox19 --- unaffected
firefox20 --- unaffected
firefox21 --- unaffected
firefox22 + fixed
firefox23 + fixed
firefox-esr17 --- unaffected
b2g18 --- unaffected

People

(Reporter: gkw, Assigned: billm)

References

Details

(4 keywords, Whiteboard: [fuzzblocker] [jsbugmon:testComment=22][adv-main22-])

Crash Data

Attachments

(7 files, 1 obsolete file)

Attached file almost-fully-reduced testcase (obsolete) —
The attached testcase crashes js opt shell on m-c changeset 279078670022 without any CLI arguments at autoBisect shows this is probably related to the following changeset: The first bad revision is: changeset: 126214:4b4f4b9192d9 user: Nicholas Nethercote date: Tue Mar 26 21:40:00 2013 +1100 summary: Bug 854807 - Fix thinko in CloneFunctionObjectIfNotSingleton(). r=smaug. s-s because bug 854807 is s-s.
Attached file opt stack
This doesn't seem to affect debug shells.
(tested with a 32-bit opt deterministic non-threadsafe shell on Win7, though I'm not sure if determinism is needed)
Attachment #730421 - Attachment is obsolete: true
Crash Signature: [@ js::EncapsulatedValue::writeBarrierPre] [@ js::CloneFunctionAtCallsite] → [@ js::EncapsulatedValue::writeBarrierPre] [@ js::CloneFunctionAtCallsite]
Whiteboard: [jsbugmon:] → [jsbugmon:update]
Crash Signature: [@ js::EncapsulatedValue::writeBarrierPre] [@ js::CloneFunctionAtCallsite] → [@ js::EncapsulatedValue::writeBarrierPre] [@ js::CloneFunctionAtCallsite]
Whiteboard: [jsbugmon:update] → [jsbugmon:]
JSBugMon: Cannot process bug: Unable to automatically reproduce, please track manually.
Crash Signature: [@ js::EncapsulatedValue::writeBarrierPre] [@ js::CloneFunctionAtCallsite] → [@ js::EncapsulatedValue::writeBarrierPre] [@ js::CloneFunctionAtCallsite]
Whiteboard: [jsbugmon:update] → [jsbugmon:]
JSBugMon: Cannot process bug: Unable to automatically reproduce, please track manually.
This variant crashes a 32-bit opt deterministic threadsafe shell on Mac on rev cee46001a526 (was on m-i): ./js --no-ion -a -d testcase.js
Weird address 0xb0000000 seems to be accessed, and subsequent runs morph addresses, I think 0x50000000 was also accessed.
Setting sec-critical based on comment 6.
Crash Signature: [@ js::EncapsulatedValue::writeBarrierPre] [@ js::CloneFunctionAtCallsite] → [@ js::EncapsulatedValue::writeBarrierPre] [@ js::CloneFunctionAtCallsite] [@ js::gc::MarkString]
Keywords: sec-critical
OS: Windows 7 → All
Summary: Crash [@ js::EncapsulatedValue::writeBarrierPre] with [@ js::CloneFunctionAtCallsite] on the stack → Crash [@ js::EncapsulatedValue::writeBarrierPre] with [@ js::CloneFunctionAtCallsite] and [@ js::gc::MarkString] on the stack
This is showing up somewhat frequently with assorted crash signatures.
Whiteboard: [jsbugmon:] → [fuzzblocker][jsbugmon:]
njn, is bug 854807 a possible regressor?
Flags: needinfo?(n.nethercote)
I can't reproduce this. What is a "deterministic threadsafe shell"?
Flags: needinfo?(n.nethercote)
--enable-more-deterministic --enable-threadsafe
Fyi, I'm also seeing various crashes around these functions, even without the special build options. I'll try to isolate a testcase but these are usually hard to reduce. Resolving this is very important for JS fuzzing as it generates lots of signatures.
Flags: needinfo?(n.nethercote)
Here's a test that reproduces for me on m-c revision aae004a3c5d9, 32 bit opt build (--disable-debug --enable-optimize --enable-valgrind): var lfcode = new Array(); lfcode.push(""); lfcode.push(""); lfcode.push("1"); lfcode.push("assertEq(arguments.length, 0)"); lfcode.push(""); lfcode.push("gczeal(10, 3);"); lfcode.push("\ var p = new ParallelArray([2, 3,, 4, 5, 6]);\ var r = p.scatter([0,1,0,3,4], (1048575), function (a,b) {});\ "); while (true) { var file = lfcode.shift(); if (file == undefined) { break; } loadFile(file) } function loadFile(lfVarx) { try { if (lfVarx.substr(-3) != ".js" && lfVarx.length != 1) { switch (lfRunTypeId) { case 1: eval(lfVarx); break; } } else if (!isNaN(lfVarx)) { lfRunTypeId = parseInt(lfVarx); } } catch (lfVare) {} } The crash looks like this: Program received signal SIGSEGV, Segmentation fault. markIfUnmarked (cell=<optimized out>, this=<optimized out>, color=0) at ../gc/Heap.h:683 683 if (*word & mask) (gdb) bt #0 markIfUnmarked (cell=<optimized out>, this=<optimized out>, color=0) at ../gc/Heap.h:683 #1 markIfUnmarked (color=0, this=<optimized out>) at ../gc/Heap.h:971 #2 PushMarkStack (gcmarker=0x85cbe20, thing=0x0) at js/src/gc/Marking.cpp:797 #3 0x082642a7 in ScanShape (gcmarker=0x85cbe20, shape=0xf7530df0) at js/src/gc/Marking.cpp:805 #4 0x08267786 in PushMarkStack (thing=0xf7530df0, gcmarker=0x85cbe20) at js/src/gc/Marking.cpp:776 #5 processMarkStackTop (budget=..., this=<optimized out>) at js/src/gc/Marking.cpp:1371 #6 js::GCMarker::drainMarkStack (this=0x85cbe20, budget=...) at js/src/gc/Marking.cpp:1432 #7 0x080ceb3c in DrainMarkStack (phase=js::gcstats::PHASE_MARK, sliceBudget=..., rt=0x85cbd00) at js/src/jsgc.cpp:3783 #8 IncrementalCollectSlice (rt=0x85cbd00, budget=<optimized out>, reason=JS::gcreason::DEBUG_GC, gckind=js::GC_NORMAL) at js/src/jsgc.cpp:4259 #9 0x080d0c2d in GCCycle (rt=0x85cbd00, incremental=<optimized out>, budget=-1025, gckind=js::GC_NORMAL, reason=JS::gcreason::DEBUG_GC) at js/src/jsgc.cpp:4417 #10 0x080d1001 in Collect (rt=0x85cbd00, incremental=true, budget=-1025, gckind=js::GC_NORMAL, reason=JS::gcreason::DEBUG_GC) at js/src/jsgc.cpp:4545 #11 0x080d1d59 in Collect (reason=JS::gcreason::DEBUG_GC, gckind=js::GC_NORMAL, budget=-1025, incremental=true, rt=0x85cbd00) at js/src/jsgc.cpp:4463 #12 js::gc::RunDebugGC (cx=0x85ddea0) at js/src/jsgc.cpp:4737 #13 0x08160b8a in NewGCThing<JSScript, (js::AllowGC)1> (cx=0x85ddea0, kind=<optimized out>, thingSize=<optimized out>, heap=<optimized out>) at ../jsgcinlines.h:490 #14 js_NewGCScript (cx=0x85ddea0) at ../jsgcinlines.h:583 #15 JSScript::Create (cx=0x85ddea0, enclosingScope=(JSObject * const) 0xf7530d30 [object Function "scatter"], savedCallerFun=false, options=..., staticLevel=2, ss=0x85fa210, bufStart=79185, bufEnd=79271) at js/src/jsscript.cpp:1638 #16 0x08166aa7 in js::CloneScript (cx=0x85ddea0, enclosingScope=(JSObject * const) 0xf7530d30 [object Function "scatter"], fun=(JSFunction * const) 0xf7533ac0 [object Function <unnamed>], src=0xf7529268) at js/src/jsscript.cpp:2324 #17 0x080c1693 in js::CloneInterpretedFunction (cx=0x85ddea0, enclosingScope=(JSObject * const) 0xf7530d30 [object Function "scatter"], srcFun= (JSFunction * const) 0xf75380a0 [object Function "forceDivideScatterVector"]) at js/src/jsfun.cpp:467 #18 0x08166f01 in js::CloneScript (cx=0x85ddea0, enclosingScope=0x0, fun=(JSFunction * const) 0xf7530d30 [object Function "scatter"], src=0xf7529808) at js/src/jsscript.cpp:2285 #19 0x081671c5 in js::CloneFunctionScript (cx=0x85ddea0, original=(JSFunction * const) 0xf7530be0 [object Function "scatter"], clone=(JSFunction * const) 0xf7530d30 [object Function "scatter"]) at js/src/jsscript.cpp:2413 #20 0x080c276d in js::CloneFunctionObject (cx=0x85ddea0, fun=(JSFunction * const) 0xf7530be0 [object Function "scatter"], parent=(JSObject * const) 0xf7527040 [object ParallelArray] delegate, allocKind= js::gc::FINALIZE_OBJECT4_BACKGROUND) at js/src/jsfun.cpp:1549 #21 0x0809349d in js::CloneFunctionAtCallsite (cx=0x85ddea0, fun=(JSFunction * const) 0xf7530be0 [object Function "scatter"], script=0xf75292e0, pc=0x85dd093 ":") at js/src/jscntxt.cpp:295 #22 0x080f9d61 in js::Interpret (cx=0x85ddea0, entryFrame=0xf77160e0, interpMode=js::JSINTERP_NORMAL) at js/src/jsinterp.cpp:2344 #23 0x080fdf43 in js::RunScript (cx=0x85ddea0, fp=0xf77160e0) at js/src/jsinterp.cpp:341 #24 0x08103c74 in js::ExecuteKernel (cx=0x85ddea0, script=0xf75292e0, scopeChainArg=(JSObject &) @0xf7530d90 [object Call] delegate, thisv=..., type=js::EXECUTE_DIRECT_EVAL, evalInFrame=..., result=0xf77160b8) at js/src/jsinterp.cpp:529 #25 0x082477d8 in EvalKernel (cx=0x85ddea0, args=..., evalType=DIRECT_EVAL, caller=..., scopeobj=(JSObject * const) 0xf7530d90 [object Call] delegate, pc=0x85f2c73 "{") at js/src/builtin/Eval.cpp:303 #26 0x08248f96 in js::DirectEval (cx=0x85ddea0, args=...) at js/src/builtin/Eval.cpp:422 #27 0x080f8c1e in js::Interpret (cx=0x85ddea0, entryFrame=0xf7716028, interpMode=js::JSINTERP_NORMAL) at js/src/jsinterp.cpp:2304 #28 0x080fdf43 in js::RunScript (cx=0x85ddea0, fp=0xf7716028) at js/src/jsinterp.cpp:341 #29 0x08103fda in ExecuteKernel (result=0x0, evalInFrame=..., thisv=..., scopeChainArg=..., script=0xf7529088, cx=0x85ddea0, type=<optimized out>) at js/src/jsinterp.cpp:529 #30 js::Execute (cx=0x85ddea0, script=0xf7529088, scopeChainArg=(JSObject &) @0xf7525040 [object global] delegate, rval=0x0) at js/src/jsinterp.cpp:569 #31 0x0806a688 in JS_ExecuteScript (cx=0x85ddea0, objArg=(JSObject *) 0xf7525040 [object global] delegate, scriptArg=0xf7529088, rval=0x0) at js/src/jsapi.cpp:5578 #32 0x0805523e in Process (cx=0x85ddea0, obj_=<optimized out>, filename=0xffffd098 "min.js", forceTTY=false) at js/src/shell/js.cpp:473 #33 0x08058886 in ProcessArgs (op=0xffffcdd0, obj_=(JSObject *) 0xf7525040 [object global] delegate, cx=0x85ddea0) at js/src/shell/js.cpp:5082 #34 Shell (cx=0x85ddea0, op=0xffffcdd0, envp=0xffffcef0) at js/src/shell/js.cpp:5119 #35 0x0804b6e0 in main (argc=2, argv=0xffffcee4, envp=0xffffcef0) at js/src/shell/js.cpp:5343 (gdb) x /i $pc => 0x8264062 <PushMarkStack(js::GCMarker*, js::RawBaseShape)+66>: mov (%edx),%ebp (gdb) info reg edx edx 0xfc0b0 1032368 I assume it's the same issue or related. If not, please let me know so I can open a new bug.
Whiteboard: [fuzzblocker][jsbugmon:] → [fuzzblocker][jsbugmon:update,testComment=13,bisect]
This should be backported, so still setting tracking-firefox22.
(In reply to Christian Holler (:decoder) from comment #13) > Here's a test that reproduces for me on m-c revision aae004a3c5d9, 32 bit > opt build (--disable-debug --enable-optimize --enable-valgrind): > [...] > The crash looks like this: > > Program received signal SIGSEGV, Segmentation fault. > markIfUnmarked (cell=<optimized out>, this=<optimized out>, color=0) at > ../gc/Heap.h:683 > 683 if (*word & mask) > (gdb) bt > #0 markIfUnmarked (cell=<optimized out>, this=<optimized out>, color=0) at > ../gc/Heap.h:683 > #1 markIfUnmarked (color=0, this=<optimized out>) at ../gc/Heap.h:971 > #2 PushMarkStack (gcmarker=0x85cbe20, thing=0x0) at > js/src/gc/Marking.cpp:797 > #3 0x082642a7 in ScanShape (gcmarker=0x85cbe20, shape=0xf7530df0) at > js/src/gc/Marking.cpp:805 That stack doesn't contain js::EncapsulatedValue::writeBarrierPre or js::gc::MarkString, so I suspect it's different.
Flags: needinfo?(n.nethercote)
decoder, it would also be interesting to know what revision introduced the problem you see in comment 13.
> That stack doesn't contain js::EncapsulatedValue::writeBarrierPre or > js::gc::MarkString, so I suspect it's different. This bug does involve js::gc::MarkString, see the stack in comment 6. I've also seen assorted signatures involving PushMarkStack and ScanShape related to the testcase in this bug, so I think I agree with decoder here.
> > That stack doesn't contain js::EncapsulatedValue::writeBarrierPre or > > js::gc::MarkString, so I suspect it's different. > > This bug does involve js::gc::MarkString, see the stack in comment 6 Your stack involves MarkString. decoder's doesn't. That's why I'm suggesting they might not be related. gkw, what are your exact build steps for a 32-bit deterministic threadsafe shell on Mac?
> gkw, what are your exact build steps for a 32-bit deterministic threadsafe > shell on Mac? I can reproduce with decoder's testcase in comment 13 on m-c changeset 0b7c27024048. ./js-opt-32-darwin-0b7c27024048 --no-ion -a -d testcase.js Bus error: 10 with the following build arguments: LD=ld CROSS_COMPILE=1 MOZILLA_CENTRAL_PATH=<your m-c repo path> CXX="clang++ -Qunused-arguments -arch i386" RANLIB=ranlib CC="clang -Qunused-arguments -arch i386" AS=$CC AR=ar STRIP="strip -x -S" HOST_CC="clang -Qunused-arguments" HOST_CXX="clang++ -Qunused-arguments" sh ./configure --target=i386-apple-darwin9.2.0 --enable-macos-target=10.5 --enable-optimize --disable-debug --enable-profiling --enable-gczeal --enable-debug-symbols --enable-methodjit --enable-type-inference --disable-tests --with-ccache
(In reply to Nicholas Nethercote [:njn] from comment #16) > decoder, it would also be interesting to know what revision introduced the > problem you see in comment 13. Btw, decoder mentioned changeset aae004a3c5d9 at the top of comment 13. I just retested against m-c tip just to be sure, however, using the testcase in comment 13, autoBisect seems to point out that bug 568953 may have fixed this: autoBisect shows this is probably related to the following changeset: The first good revision is: changeset: 126958:172651edb28e user: Eddy Bruel date: Tue Apr 02 18:00:49 2013 +0200 summary: Bug 568953 - Added reflection support for module declarations; r=jorendorff Is this a possible fix?
Whiteboard: [fuzzblocker][jsbugmon:update,testComment=13,bisect] → [fuzzblocker][jsbugmon:update,testComment=13,bisectfix]
Seems very unlikely to me and the test(s) we have in here are very fragile.
Attached file Another test
Here's another test, that still reproduces on tip, and it has [@ js::CloneFunctionAtCallsite] on the stack. It reproduces only in opt-builds (--disable-debug --enable-optimize --enable-valgrind). See the README file in the archive for details how to run it. I was not able to combine it further in any way. If you cannot reproduce it yourself, I can also give you access to the machine that reproduced this.
Flags: needinfo?(n.nethercote)
I suspect the crucial option is --enable-gczeal, since that's normally disabled in opt builds. Also, it's pretty suspicious that all these test cases involve parallel arrays.
Yeah there's a pretty huge number of parallel array failures recently...
Attached patch patchSplinter Review
Never mind, this is a simple fix. Bug 851421 was the real regressor. The call to setExtendedSlot was triggering a write barrier because the extended slots are HeapValues. However, the original value was uninitialized. This patch switches to us to use .init() so no write barrier is invoked. This sort of thing is kind of a footgun, but I'm not sure how to avoid it.
Assignee: general → wmccloskey
Status: NEW → ASSIGNED
Attachment #732957 - Flags: review?(n.nethercote)
Blocks: 851421
No longer blocks: 854807
Whiteboard: [fuzzblocker][jsbugmon:update,testComment=13,bisectfix] → [fuzzblocker][jsbugmon:update,testComment=22]
Crash Signature: [@ js::EncapsulatedValue::writeBarrierPre] [@ js::CloneFunctionAtCallsite] [@ js::gc::MarkString] → [@ js::EncapsulatedValue::writeBarrierPre] [@ js::CloneFunctionAtCallsite] [@ js::gc::MarkString]
Whiteboard: [fuzzblocker][jsbugmon:update,testComment=22] → [fuzzblocker] [jsbugmon:testComment=22]
JSBugMon: Cannot process bug: Error: Failed to isolate original revision for test
Comment on attachment 732957 [details] [diff] [review] patch Review of attachment 732957 [details] [diff] [review]: ----------------------------------------------------------------- > Bug 851421 was the real regressor. Makes sense. Bug 851421 landed, then I did a follow-up fix for some bustage, then I did a thinko fix for the bustage fix. The code in the first and third of these patches was functionally equivalent, and I guess bisection unluckily fingered the third patch. Thanks for fixing this, billm! I never would have worked this out by myself. ::: js/src/jsfun.h @@ +296,1 @@ > inline void setExtendedSlot(size_t which, const js::Value &val); Wow, that *is* a footgun :( A comment here explaining when to use |init| rather than |set| (and why) might be a good idea.
Attachment #732957 - Flags: review?(n.nethercote) → review+
Flags: needinfo?(n.nethercote)
This'll definitely need backporting to Aurora, BTW.
Crash Signature: [@ js::EncapsulatedValue::writeBarrierPre] [@ js::CloneFunctionAtCallsite] [@ js::gc::MarkString] → [@ js::EncapsulatedValue::writeBarrierPre] [@ js::CloneFunctionAtCallsite] [@ js::gc::MarkString]
Comment on attachment 732957 [details] [diff] [review] patch [Approval Request Comment] Bug caused by (feature/regressing bug #): bug 851421 User impact if declined: Crashes Testing completed (on m-c, etc.): On m-c (in a bit) Risk to taking this patch (and alternatives if risky): Very low. String or IDL/UUID changes made by this patch: None.
Attachment #732957 - Flags: approval-mozilla-aurora?
As a security critical bug affecting more than one branch, this should have gotten sec-approval before going onto inbound. At this point, it might as well continue but please note that for the future.
> As a security critical bug affecting more than one branch, this should have > gotten sec-approval before going onto inbound. I didn't know that. What's the rationale for that?
This bug was indeed a recent regression on m-c when I first filed it (which doesn't require sec-approval if it's only on m-c), but when it landed after an r+ we just branched for aurora, so it affected both aurora and m-c as well, so it needs sec-approval. See https://wiki.mozilla.org/Security/Bug_Approval_Process
(basically this was an unfortunate instance where bug filing, patch review & landing straddled a merge window)
(In reply to Nicholas Nethercote [:njn] from comment #32) > > As a security critical bug affecting more than one branch, this should have > > gotten sec-approval before going onto inbound. > > I didn't know that. What's the rationale for that? Because it has been policy for about five months now? https://wiki.mozilla.org/Security/Bug_Approval_Process The rationale is that when we checkin security fixes, it is a red flag for any third party (read: security "researchers") that "Here is a security problem!" They then begin to look at the changed code to try to construct the original problem. Because of that, we make checking in fixes for security issues a very explicit decision during the six week cycle. Changes that are central only don't need approval, nor do changes that are sec-moderate or lower. If it has gone out on a branch (and thus may get released), the sec-approval? flag needs to be set and sec-approval+ given before a checkin anywhere.
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #34) > (basically this was an unfortunate instance where bug filing, patch review & > landing straddled a merge window) Right... AIUI if it hadn't gone to aurora sec-approval wouldn't have been required?
(In reply to Nicholas Nethercote [:njn] from comment #36) > Right... AIUI if it hadn't gone to aurora sec-approval wouldn't have been > required? Yes, sec-approval is not needed for trunk-only problems.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Attachment #732957 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Whiteboard: [fuzzblocker] [jsbugmon:testComment=22] → [fuzzblocker] [jsbugmon:testComment=22][adv-main22-]
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: