Closed Bug 491620 Opened 15 years ago Closed 15 years ago

TM: debug build crash with side exit from inner trace with extra args on stack


(Core :: JavaScript Engine, defect, P1)






(Reporter: dmandelin, Assigned: dmandelin)



(Keywords: fixed1.9.1, Whiteboard: fixed-in-tracemonkey)


(3 files, 2 obsolete files)

Attached file Test Case
I discovered this because a related program that uses upvars to cause the side exit crashes my patch for bug 458838. 

Stack trace below. I don't know what the problem is yet, but as you can see from the stack trace, TM is trying to flush the value 0x63 (99, the 'extra' argument) from the native stack to the interpreter stack using a type of 7 (function). So I'm guessing the type map is not accounting for the extra arguments or something like that.

0  0x000f03e1 in STOBJ_GET_CLASS (obj=0x63) at jsobj.h:254
#1  0x0012d868 in NativeToValue (cx=0x30bc80, v=@0x812b38, type=7 '\a', slot=0xbfffcb78) at ../jstracer.cpp:1710
#2  0x0012e596 in FlushNativeStackFrame (cx=0x30bc80, callDepth=1, mp=0x2b40bf "\a\a\005", np=0xbfffcb78, stopFrame=0x812b58) at ../jstracer.cpp:1785
#3  0x00136ef0 in LeaveTree (state=@0xbfffc6a0, lr=0x2b42a0) at ../jstracer.cpp:4322
#4  0x00138181 in js_ExecuteTree (cx=0x30bc80, f=0x30dd90, inlineCallCount=@0xbffff0b8, innermostNestedGuardp=0xbfffecd4) at ../jstracer.cpp:4219
#5  0x001515af in js_MonitorLoopEdge (cx=0x30bc80, inlineCallCount=@0xbffff0b8) at ../jstracer.cpp:4534
#6  0x000768b8 in js_Interpret (cx=0x30bc80) at ../jsinterp.cpp:3857
#7  0x0009678a in js_Execute (cx=0x30bc80, chain=0x2ab000, script=0x30dc70, down=0x0, flags=0, result=0x0) at jsinterp.cpp:1612
#8  0x0001e65a in JS_ExecuteScript (cx=0x30bc80, obj=0x2ab000, script=0x30dc70, rval=0x0) at ../jsapi.cpp:5040
#9  0x000088cf in Process (cx=0x30bc80, obj=0x2ab000, filename=0xbffff7fd "../uvc3a.js", forceTTY=0) at ../../shell/js.cpp:412
#10 0x0000a102 in ProcessArgs (cx=0x30bc80, obj=0x2ab000, argv=0xbffff70c, argc=2) at ../../shell/js.cpp:806
#11 0x0000b5ac in main (argc=2, argv=0xbffff70c, envp=0xbffff718) at ../../shell/js.cpp:4728
Flags: blocking1.9.1+
Priority: -- → P1
The day of pointer forging. IF you chose a different value than 99 you can make yourself a function object.
Group: core-security
Probably also a 1-line fix.
Jesse, any idea why the fuzzers never hit this case?
This testcase does a few things my JIT-specific fuzzers don't do much of: multiplying, swapping arrays, and calling externally-defined functions from loops.  I'll try to add each of those things to at least one of my fuzzers.

It's also a fairly complicated testcase ;)
Definitely a complicated test case. I exposed the bug by accident while testing my new patch, but it took me quite a while to figure out how to replicate on tip. The trickiest part was making the inner tree side exit: the array element types have to change after the outer tree has been compiled to use the inner tree, but in a way that won't invalidate either tree.
OK, I think I have the cause and a candidate fix. The problem is that FORALL_SLOTS_IN_FRAME uses fp->fun->nargs to get the number of values pushed onto the the frame as the function's actual arguments. But this is wrong: it's actually JS_MAX(fp->fun->nargs, fp->argc). If fp->argc is greater, that means the function was called with more actuals than formals. The extra actuals do go on the stack so they have to be counted. 

The typemap is set up using FORALL_SLOTS_IN_FRAME, so the typemap won't have any extra arguments. But the arguments are on the interpreter stack and also the native stack, so values end up being copied or converted using the wrong type.

Besides the main flaw, there are some assertions that also use the wrong computation and need to be fixed. I'll search for fp->nargs in a bit and see which ones need to be replaced (I want an inline function js_ActualArgCount or some such). In the meantime, if someone wants to steal and do that, feel free. Otherwise I should have it at the end of the day.
We had a ridiculous number of bugs in that macro and it is extremely fragile. This will need a good amount of testing.
Bad memories of late nights. The original intention was to number stack slots and exclude extra args, because they could not be accessed except via the arguments object, which wasn't traced. We didn't think about how synthesizing stack frames when leaving a trace might be broken by this intentional skipping.

It seems to me we could make it work either way, but numbering the extra args is better in the long run, as there should be fast arguments[i] support soon.

Attached patch Patch (obsolete) — Splinter Review
The main fix is to make FORALL_SLOTS_IN_FRAME aware that when excess args are passed, there are extra slots in the frame. Thus, this patch changes the typemap to match the stack format (the extra args were on the stack in both interpreter and native stacks before this patch).

There is a tricky bit in js_CheckEntryTypes that should get special reviewer attention. Consider a function 'f' with 1 arg that contains a traceable loop. First, call 'f(1)'. The stack typemap will be created with N elements (for some N). This is the same as the value returned by js_NativeStackSlots(). Now, call 'f(1, 2)'. This time, js_NativeStackSlots() will return N+1, because there is in fact one more element on the native stack (and the interpreter stack). But the stack typemap saved in the TreeInfo still has N elements. So the old assert will fail.

I believe the fundamental issue in the previous paragraph is that the number of native stack slots (return value of js_NativeStackSlots) is not in fact invariant with respect to PC if there can be extra arguments. So the assert just can't be right. I fixed that and so far the patch seems to be passing tests, but I can't be sure that the invariant stack size assumption is not implicit in parts of the code other than assertions.
Assignee: general → dmandelin
Attachment #376153 - Flags: review?(gal)
Attachment #376153 - Flags: review?(gal) → review+
Comment on attachment 376153 [details] [diff] [review]

Discussing a bit more with david.
Attachment #376153 - Flags: review+
This is not safe when an inner tree is called from two activations of the same function that itself was called with different argcs.
Dave, the test looks pretty good. Please call jit(false) after the test code and before the reportCompare. Don't forget to end statements with a ; please. 

Also, since this bug is security sensitive please separate out the test and attach it to this bug rather than checking it in with your patch.
Attached patch Patch v2 (obsolete) — Splinter Review
This patch is about what I want to commit but I'm not sure about it yet because it seems to expose an assert as discussed with Andreas. Also, I want to check if this can replace the fix to bug 480244, but I can't test that until the adjustCallerTypes bug is fixed.
Attachment #376153 - Attachment is obsolete: true
Attachment #376362 - Flags: review?(gal)
Blocks: 458838
Depends on: 489836
Blocks: 453730
Attached patch Patch v3Splinter Review
Mostly just updated for bitrot. I also removed the band-aid introduced for bug 480244, as it seems not to be necessary after this patch.

Most tests check out OK, but I've been having trouble running Mochitest. It locks up for me on dom/tests/mochitest/dom-level2-html/test_HTMLBodyElement07.html, with or without this patch. Maybe we can just land on TM and back out if needed?
Attachment #376362 - Attachment is obsolete: true
Attachment #376795 - Flags: review?(gal)
Attachment #376362 - Flags: review?(gal)
Comment on attachment 376795 [details] [diff] [review]
Patch v3

>@@ -1215,7 +1238,7 @@
>         if ((type == JSVAL_INT) && oracle.isStackSlotUndemotable(cx, unsigned(m - map)))
>             type = JSVAL_DOUBLE;
>         JS_ASSERT(type != JSVAL_BOXED);
>-        debug_only_v(printf("capture stack type %s%d: %d=%c\n", vpname, vpnum, type, typeChar[type]);)
>+                                   debug_only_v(printf("capture stack type %s%d: %d=%c vp=%p v=%x\n", vpname, vpnum, type, typeChar[type], vp, (uint32) *vp);)

Nit: Indentation issue.

>@@ -2960,7 +2984,7 @@
>     }
>     debug_only_v(js_DumpPeerStability(traceMonitor, peer_root->ip,
>-                                      peer_root->globalObj, peer_root->globalShape);)
>+        peer_root->globalObj, peer_root->globalShape, peer_root->argc);)

Looks weird too. Not sure how to improve. Maybe 3-line split?

>+#if 0
>     /* Cannot handle treecalls with callDepth > 0 and argc > nargs, see bug 480244. */
>     if (r->getCallDepth() > 0 && cx->fp->argc > cx->fp->fun->nargs) {
>         js_AbortRecording(cx, "Can't call inner tree with extra args in pending frame");
>         return false;
>     }

Remove the code instead of #if 0?
Attachment #376795 - Flags: review?(gal) → review+
Pushed to TM as 85d280b05450.

Summary of the solution, for posterity: Say that outer trace T1 calls inner trace T2 with JS stack frame F on top at the time of the call. This was dangerous because T2 was specialized for a given value of argc (all the args are on the interpreter and native stack), but it can be called by T1 with varying values of argc. Conceptually, that's clearly wrong. In detail, it shows up in a couple of different ways, particularly by doing unsafe stuff or crashing when T2 exits and tries to repopulate F using an incorrect typemap (one specialized for a different value of argc).

The fix is to (a) correct the typemap-building functions so they know about argc (argSlots() and its uses), and (b) make argc part of the key used to look up and call traces.
Nit: JS_MAX on unsigned args returns unsigned, so argSlots should have return type unsigned too. Shouldn't matter on any target platform we care about, though.

Closed: 15 years ago
Resolution: --- → FIXED
I could not reproduce a crash with this test.
Flags: in-testsuite+
when this bug is opened, the test should be checked in.
Flags: in-testsuite+ → in-testsuite?
Group: core-security
You need to log in before you can comment on or make changes to this bug.