Closed Bug 494269 Opened 13 years ago Closed 13 years ago

Consider tracing lambda_fc

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.9.1
Tracking Status
status1.9.2 --- beta1-fixed

People

(Reporter: bzbarsky, Assigned: dmandelin)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 2 obsolete files)

Blocks: 460050
I think we should try to fix this for 1.9.1. Why? Because when upvar2 landed, while LAMBDA_FC formed off-trace was a win that helped avoid Call objects out of reach (below the entry frame), since then dmandelin got js_GetUpvarOnTrace going.

So now we are strictly worse off when the LAMBDA_FC happens on trace! We abort where before, a LAMBDA would be recorded and its upvars would be reachable if on trace, or gettable via js_GetUpvarOnTrace since that landed and stabilized.

Again, now that the js_GetUpvarOnTrace work is solid, LAMBDA_FC is trivial.

/be
Flags: wanted1.9.1?
OS: Mac OS X → All
Hardware: x86 → All
Target Milestone: --- → mozilla1.9.1
Assignee: general → gal
(In reply to comment #1)
> So now we are strictly worse off when the LAMBDA_FC happens on trace! We abort
> where before, a LAMBDA would be recorded and its upvars would be reachable if
> on trace, or gettable via js_GetUpvarOnTrace since that landed and stabilized.

Correction: js_GetUpvarOnTrace is used only by non-escaping closures, which are selected over flat closures always (saves memory and closure-forming runtime at slighly higher JSOP_{CALL,GET}UPVAR cost in interpreter or from the trace-helper builtin).

The "or gettable via" above should refer to the builtin dmandlin is working on, for getting call object properties when the call's frame is still active. The go-game wants this for cases where the upvar is mutated by the closure, or other assignments to it that don't dominate the closure expression defeat the flat closure optimization.

Closure optimization decision logic:

    if (non-escaping)
        use JSOP_{GET,CALL}UPVAR;
    else if (upvar inits dominate closure)
        make a flat closure and use JSOP_{GET,CALL}DSLOT;
    else
        HEAVYWEIGHT, use Call objects on scope chain;

HTH,

/be
(In reply to comment #2)
> The "or gettable via" above should refer to the builtin dmandlin is working on,
> for getting call object properties when the call's frame is still active.

... but that frame is above the trace entry frame.

I won't attempt the logic for tracing, through which the closure optimization decision logic refracts! The issues are:

(a) making a flat closure on trace (this bug);

(b) getting slots from frames still active that have call objects, where the frames are above the entry frame (dmandelin has this one).

/be
Duplicate of this bug: 495323
Blocks: 494268
Assignee: gal → dmandelin
Attached patch Patch (obsolete) — Splinter Review
This gives a 5x speedup on a lambda_fc microbenchmark. I want to test more thoroughly, although it's not easy to write code that traces a lambda_fc. I also want to check on the assumption that flat closures don't refer to their scope chains. If so, maybe we should set the scope chain to NULL all the time.
In the interpreter, a closure needs a scope chain pointing to its global object, to find free variables. It's true that on trace, there's only one globalObj that could be used, but this means you can't get branch exits due to flat closures wanting different JSSLOT_PARENT values -- they must all equal the one globalObj.

Since the traced code has to side-exit to the interpreter to stop looping, you'll need that non-null parent slot. So setting to null is not possible, at least not without fixup that could span a larger number of flat closures. Again if the one true globalObj parent isn't causing branch exits, it's best to use the same parent as the interpreter wants.

/be
(In reply to comment #6)
> In the interpreter, a closure needs a scope chain pointing to its global
> object, to find free variables. 

Let me see if I understand. Are you saying that a flat closure can still have free variables (other than the imported closure variables, which are no longer free), but that they have to refer to the global object? So a closure's environment (i.e. parent) should point to the global object active when the closure was created?

> It's true that on trace, there's only one
> globalObj that could be used, but this means you can't get branch exits due to
> flat closures wanting different JSSLOT_PARENT values -- they must all equal the
> one globalObj.

Are you talking about the guard generated for function calls? So we could potentially optimize calls to flat closures a bit by ripping out the guard on the callee's parent?

> Since the traced code has to side-exit to the interpreter to stop looping,
> you'll need that non-null parent slot. So setting to null is not possible, at
> least not without fixup that could span a larger number of flat closures. Again
> if the one true globalObj parent isn't causing branch exits, it's best to use
> the same parent as the interpreter wants.

I see why NULL is no go. But does this mean the code I have in the patch is "correct"? I'm referring to the fact that on trace I let the closure's parent be initialized to cx->fp->scopeChain. cx->fp is not necessarily right, but if the important thing is that the scope chain from the closure reaches the global object, it should be OK, right? Or does this mean it's actually even better to simply initialize a flat closure's parent to the active global object?
(In reply to comment #7)
> (In reply to comment #6)
> > In the interpreter, a closure needs a scope chain pointing to its global
> > object, to find free variables. 
> 
> Let me see if I understand. Are you saying that a flat closure can still have
> free variables (other than the imported closure variables, which are no longer
> free),

Right, free means we didn't find a lexical (well, statically scoped, let's say) var/let/const/function "up" the scope chain. The top level in global code is a mutable mess, except for any let bindings in active blocks enclosing the tree of functions being analyzed.

Free variables may resolve to properties of the global object, or they could be found in active with statements (but those turn off flat closure optimizations), or they might end up throwing reference errors if evaluated and no such property is found on the scope chain.

> but that they have to refer to the global object?

Yes.

> So a closure's
> environment (i.e. parent) should point to the global object active when the
> closure was created?

That's right.
 
> > It's true that on trace, there's only one
> > globalObj that could be used, but this means you can't get branch exits due to
> > flat closures wanting different JSSLOT_PARENT values -- they must all equal the
> > one globalObj.
> 
> Are you talking about the guard generated for function calls? So we could
> potentially optimize calls to flat closures a bit by ripping out the guard on
> the callee's parent?

Yes, if we can abort the recording when we see a cross-global call, and prove that recording time and runtime must agree on same-global vs. cross-global.

> I see why NULL is no go. But does this mean the code I have in the patch is
> "correct"? I'm referring to the fact that on trace I let the closure's parent
> be initialized to cx->fp->scopeChain. cx->fp is not necessarily right, but if
> the important thing is that the scope chain from the closure reaches the global
> object, it should be OK, right?

We should prove that the one true globalObj for the trace is and will always be the parent of flat closures created on trace, and then use that TraceRecorder member and not cx->fp->scopeChain.

> Or does this mean it's actually even better to
> simply initialize a flat closure's parent to the active global object?

With the right assertions and proofs about recording vs. runtime invariance, yes.

/be
Attached patch Patch v2 (obsolete) — Splinter Review
I'm told that we don't currently trace cross-global calls, so I think we're safe. But I don't precisely understand the hazards involved, so I might be wrong.
Attachment #380330 - Attachment is obsolete: true
Attachment #380523 - Flags: review?(brendan)
Attachment #380523 - Flags: review?(gal)
Comment on attachment 380523 [details] [diff] [review]
Patch v2

>+
>+JSObject *
>+js_NewFlatClosure(JSContext *cx, JSFunction *fun)
>+{
>+    JSObject *global = JS_GetGlobalForObject(cx, cx->fp->scopeChain);

How about we hand this in and burn the identity of the global object onto trace? (faster)

>+    JSObject *closure = js_AllocFlatClosure(cx, fun, global);
>+
>     JSUpvarArray *uva = JS_SCRIPT_UPVARS(fun->u.i.script);
>     JS_ASSERT(uva->length <= size_t(closure->dslots[-1]));
> 
>     uintN level = fun->u.i.script->staticLevel;

>-            uintN nativeStackFramePos = state->callstackBase[0]->caller_argc;
>+            uintN nativeStackFramePos = state->callstackBase[0]->spadj;

Maybe spoffset? spadj is already used as sp_adj in side exits.
Attachment #380523 - Flags: review?(gal) → review+
Comment on attachment 380523 [details] [diff] [review]
Patch v2

>-js_NewFlatClosure(JSContext *cx, JSFunction *fun)
>+js_AllocFlatClosure(JSContext *cx, JSFunction *fun, JSObject* global)
> {
>     JS_ASSERT(FUN_FLAT_CLOSURE(fun));
> 
>-    JSObject *closure = js_CloneFunctionObject(cx, fun, cx->fp->scopeChain);
>+    JSObject *closure = js_CloneFunctionObject(cx, fun, global);

Thinking about this more, it's wrong to use the global object. There could be Call objects on the scope chain that the analysis couldn't see, in which free variables will resolve. Testcase:

js> function heavy(s, t, u) {return eval(s)}
js> flat = heavy("(function () {var x = t * t; return function(){return x + u}})()", 2, 3);
function () {
    return x + u;
}
js> dis(flat);
flags: LAMBDA FLAT_CLOSURE
main:
00000:  getdslot 0
00003:  name "u"
00006:  add
00007:  return
00008:  stop

Source notes:
js> flat();
7

>     if (!closure || fun->u.i.script->upvarsOffset == 0)

Oops, I left the || and its right operand there unintentionally -- they were from an earlier point in development when flat closures with 0 upvars would be compiled, but that should "never happen" now. So assert first thing, right after the JS_ASSERT(FUN_FLAT_CLOSURE(fun)), that fun->u.i.script->upvarsOffset != 0.

>+JSObject *
>+js_NewFlatClosure(JSContext *cx, JSFunction *fun)
>+{
>+    JSObject *global = JS_GetGlobalForObject(cx, cx->fp->scopeChain);
>+    JSObject *closure = js_AllocFlatClosure(cx, fun, global);

See above.

>@@ -293,17 +293,26 @@ struct FrameInfo {
>     jsbytecode*     imacpc;     // caller fp->imacpc
>     union {
>         struct {
>             uint16  spdist;     // distance from fp->slots to fp->regs->sp at JSOP_CALL
>             uint16  argc;       // actual argument count, may be < fun->nargs
>         } s;
>         uint32      word;       // for spdist/argc LIR store in record_JSOP_CALL
>     };

Please lose the union -- it dates from when we emitted code to store frameinfo fields!

Blank line here before the major comment, per house style.

>-    uint32          caller_argc;   // fp->argv - stackBase
>+    /*
>+     * Stack pointer adjustment needed for navigation of native stack in
>+     * js_GetUpvarOnTrace. spadj is the number of slots in the native
>+     * stack frame for the caller *before* the slots covered by spdist.
>+     * This may be negative if the caller is the top level script.
>+     * The key fact is that if we let 'cpos' be the start of the caller's
>+     * native stack frame, then (cpos + spdist) points to the first 
>+     * non-argument slot in the callee's native stack frame.
>+     */
>+    int32          spadj;

Still a plausible name, but sp_adj precedent kind of messes it up. Maybe that's sp_adj's problem, though ;-) -- I mean, we might do better to rename sp_adj (Nanojit's ip_adj is gone).

/be
Attachment #380523 - Flags: review?(brendan) → review-
The union and spadj naming are/will be taken care of in other patches.

On the subject of the scope chain, can the scope chain look things up other than in the global object on trace, or can it only do that in the interpreter? I know that eval doesn't currently trace, but maybe there is a way to create a trace that requires the real scope chain? I got some try server failures, and they could be due to using the wrong version of the scope chain in the interpreter version, but I'm not sure yet.
Attached patch Patch v3Splinter Review
Attachment #380523 - Attachment is obsolete: true
Attachment #381121 - Flags: review?(brendan)
I forgot to mention in comment 13 that the key changes in v3 of the patch are:

- initialize the closure using the scope chain of the callee object from the native stack
- convert if condition to assertion per comment 11.
Comment on attachment 381121 [details] [diff] [review]
Patch v3

>-JSObject *
>-js_NewFlatClosure(JSContext *cx, JSFunction *fun)
>+/*
>+ * Create a new flat closure, but don't initialize the imported upvar
>+ * values. The tracer calls this function and then initializes the upvar
>+ * slots on trace.
>+ */
>+JSObject* JS_FASTCALL
>+js_AllocFlatClosure(JSContext *cx, JSFunction *fun, JSObject* scopeChain)

Nit: consistency favors C-style declarator mode cuddling of declarator name, not C++ type-cuddling: JSObject *scopeChain, and JSObject * JS_FASTCALL for the line before.

>+/*
>+ * Record LIR to get the given upvar. Return the LIR instruction for
>+ * the upvar value. NULL is returned only on a can't-happen condition
>+ * with an invalid typemap. The value of the upvar is returned as v.
>+ */
>+JS_REQUIRES_STACK LIns*
>+TraceRecorder::get_upvar(JSScript* script, JSUpvarArray* uva, uintN index,
>+                         jsval& v)

Does it make sense to call this upvar, by analogy to arg and var?

>+    LIns* sc_ins = get(&cx->fp->argv[-2]);
>+    JS_ASSERT(sc_ins);
>+
>+    LIns* args[] = {
>+        sc_ins,

Nit: scopeChain_ins? Matches style elsewhere, and avoids cryptic (and novel) sc shorthand (cx has lots of precedent and is used so much it merits its two chars -- not so scopeChain-derived variable names). Plenty of room for those extra chars here :-P.

>+        INS_CONSTPTR(fun),
>+        cx_ins
>+    };
>+    LIns* call_ins = lir->insCall(&js_AllocFlatClosure_ci, args);
>+    guard(false,
>+          addName(lir->ins2(LIR_eq, call_ins, INS_CONSTPTR(0)),
>+                  "guard(js_NewFlatClosureBase)"),

Change the parenthesized string to match the name of the builtin?

>+          STATUS_EXIT);

OOM_EXIT, right?

>     JS_REQUIRES_STACK void box_jsval(jsval v, nanojit::LIns*& v_ins);
>     JS_REQUIRES_STACK void unbox_jsval(jsval v, nanojit::LIns*& v_ins, VMSideExit* exit);
>+    JS_REQUIRES_STACK nanojit::LIns* get_upvar(JSScript* script, 
>+                                               JSUpvarArray* uva, uintN index,
>+                                               jsval& v);

Put this one near the arg and var pairs?

r=me with nits picked.

/be
Attachment #381121 - Flags: review?(brendan) → review+
Pushed to TM with nits fixed as a16ed38ff63a.
http://hg.mozilla.org/mozilla-central/rev/a16ed38ff63a
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
I highly suspect this is the cause of me now getting the following crash:

http://crash-stats.mozilla.com/report/index/9661ba8f-c912-40c7-9372-2ff7b2090604

http://crash-stats.mozilla.com/query/query?do_query=1&date=&range_value=1&range_unit=weeks&query_search=signature&query_type=contains&query=GetUpvarOnTraceTail

0  	js3250.dll  	GetUpvarOnTraceTail  	 js/src/jstracer.cpp:1857
1 	js3250.dll 	js_GetUpvarOnTrace 	js/src/jstracer.cpp:1895
2 		@0x4752c2b 	
3 		@0x12e2ef 	
4 	js3250.dll 	js_Interpret 	js/src/jsinterp.cpp:3875
5 	js3250.dll 	js_Execute 	js/src/jsinterp.cpp:1622
6 	js3250.dll 	obj_eval 	js/src/jsobj.cpp:1502
7 	js3250.dll 	js_Invoke 	js/src/jsinterp.cpp:1386
8 	js3250.dll 	js_Interpret 	js/src/jsinterp.cpp:5171
9 	js3250.dll 	js_Execute 	js/src/jsinterp.cpp:1622
10 	js3250.dll 	JS_EvaluateUCScriptForPrincipals 	js/src/jsapi.cpp:5155
11 	xul.dll 	nsJSContext::EvaluateString 	dom/base/nsJSEnvironment.cpp:1631
12 	xul.dll 	nsScriptLoader::EvaluateScript 	content/base/src/nsScriptLoader.cpp:686
13 	xul.dll 	nsScriptLoader::ProcessRequest 	content/base/src/nsScriptLoader.cpp:600
14 	xul.dll 	nsCOMArray_base::RemoveObject 	obj-firefox/xpcom/build/nsCOMArray.cpp:129
15 	xul.dll 	xul.dll@0x397de4


I can't figure out a reliable way to reproduce with a new profile, so I haven't opened a new bug.  It is however, a TM content JIT bug.
The crash happens on browser restart, during session restore.  I can't identify any particular web page.
yep. going to have to deny this on branch, given the regressions. This patch isn't currently on 1.9.1, where we don't see bug 496319, the regression that IU is reporting.

Invalid read of size 8
  at 0x3ADCA1: GetUpvarOnTraceTail(InterpState*, unsigned int, unsigned int, unsigned char*, double*)
  by 0x3ADE2E: js_GetUpvarOnTrace(JSContext*, unsigned int, unsigned int, unsigned int, double*) 
  by 0x1F13DC04: ???
  by 0xBFFFC7F7: ???
  by 0x3D448C: js_MonitorLoopEdge(JSContext*, unsigned int&) 
  by 0x2E490F: js_Interpret
  by 0x309946: js_Execute 
  by 0x32A64C: obj_eval(JSContext*, JSObject*, unsigned int, long*, long*) 
  by 0x30AF99: js_Invoke 
  by 0x2F4FFF: js_Interpret 
  by 0x309946: js_Execute 
  by 0x287B40: JS_EvaluateUCScriptForPrincipals (
Address 0xbffba6b0 is not stack'd, malloc'd or (recently) free'd
Depends on: 496319
Flags: wanted1.9.1? → wanted1.9.1-
Flags: blocking1.9.2+
Mass change: adding fixed1.9.2 keyword

(This bug was identified as a mozilla1.9.2 blocker which was fixed before the mozilla-1.9.2 repository was branched (August 13th, 2009) as per this query: http://is.gd/2ydcb - if this bug is not actually fixed on mozilla1.9.2, please remove the keyword. Apologies for the bugspam)
Keywords: fixed1.9.2
You need to log in before you can comment on or make changes to this bug.