Closed Bug 577708 Opened 14 years ago Closed 14 years ago

JM: remove Algol-like display optimization

Categories

(Core :: JavaScript Engine, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: cdleary, Assigned: cdleary)

References

Details

Attachments

(2 files, 2 obsolete files)

Attached patch Without tracing support. (obsolete) — Splinter Review
Shows ~3% win on v8, in the noise on sunspider.

I'm guessing the patch needs to add tracing support before it can go in.
Attached file Perf comparison. (obsolete) —
Attachment #456602 - Attachment is patch: true
Attachment #456602 - Attachment mime type: application/octet-stream → text/plain
To describe the patch a little bit: "imported" closure creation and JSOP_{GET,CALL}UPVAR still call into js_GetUpvar, but that routine now traverses the frame links looking for a frame with the appropriate static level, as determined by the parser.

From the research in bug 543149 we understand that to be very few links in the common case - reducing our stack frame size and avoiding state save/restore overhead appears to pay off for JM. A port to TM is now on my todo list.
Okay, should be ready to go on the tracing side as well. One thing I haven't fully investigated is that the max recursion limit before exception went from 3001 to 3000 under debug, demonstrated by the debug bug 522136 failure. It never actually hits GetUpvar, so it's a bit puzzling... other trace tests and reftests pass, though.
Attachment #456602 - Attachment is obsolete: true
Attachment #456616 - Flags: review?
Where's the profile data (maybe another bug) that shows display maintenance accounts for the difference in v8 perf?

I'm all in favor of losing the display, just looking for the evidence. It seems high and there might be something else going on.

In the same vein, comment 3 is a mystery. We hate those -- need those pesky kids and that big brown dog to solve it.

/be
Is the recursion limit failing on TM or JM? JM has known "bugs" here - TM shouldn't, if it's there I can take a look.
(In reply to comment #5)
> Is the recursion limit failing on TM or JM? JM has known "bugs" here - TM
> shouldn't, if it's there I can take a look.

Patch is against JM-moo tip. Glad to hear you say that -- that's why I mentioned it on the bug instead of trying to hunt it down immediately. :-)
Attached file MacBook perf stats.
3.8% on V8, 1.6% on sunspider, doing some comparative sharking now.
Attachment #456605 - Attachment is obsolete: true
This is great data.
Shark shows that 1.2% of the time saved on V8 can be attributed to time spent
in jitted functions under stubs::Call and less than .5% in Invoke and friends.
The rest looks like it may be secondary effects that I can't really pick out
with this profiler.

The Invoke reduction tells us that the effect of eliminating a stack frame field is mixed in with the display value save/restore overhead, so it'll be enlightening to see what happens when we remove a more simple redundant stack frame field. Projecting ~1% for each field we remove might actually be reasonable. *crosses fingers*
This is definitely a win on my machine - 2.8% on v8, SS not so much.

I can't reproduce the recursion limit thing... which test is failing?

Who can review this? Let the stack frame evisceration begin!
Comment on attachment 456616 [details] [diff] [review]
Remove Algol-like display.

># HG changeset patch
># Parent 9ba6da86ba7bae921de12dd18aa5c3df10a9fff2
>
>diff --git a/js/src/jsbuiltins.cpp b/js/src/jsbuiltins.cpp
>--- a/js/src/jsbuiltins.cpp
>+++ b/js/src/jsbuiltins.cpp
>@@ -349,20 +349,16 @@ js_PopInterpFrame(JSContext* cx, TracerS
>         return JS_FALSE;
>     if (fp->imacpc)
>         return JS_FALSE;
>     if (fp->blockChain)
>         return JS_FALSE;
> 
>     fp->putActivationObjects(cx);
>     
>-    /* Update display table. */
>-    if (fp->script->staticLevel < JS_DISPLAY_SIZE)
>-        cx->display[fp->script->staticLevel] = fp->displaySave;
>-
>     /* Pop the frame and its memory. */
>     cx->stack().popInlineFrame(cx, fp, fp->down);
> 
>     /* Update the inline call count. */
>     *state->inlineCallCountp = *state->inlineCallCountp - 1;
>     return JS_TRUE;
> }
> JS_DEFINE_CALLINFO_2(extern, BOOL, js_PopInterpFrame, CONTEXT, TRACERSTATE, 0, ACC_STORE_ANY)
>diff --git a/js/src/jscntxt.h b/js/src/jscntxt.h
>--- a/js/src/jscntxt.h
>+++ b/js/src/jscntxt.h
>@@ -1719,23 +1719,16 @@ struct JSContext
>      */
>     volatile jsword     interruptFlags;
> 
>     static const jsword INTERRUPT_OPERATION_CALLBACK = 0x1;
> 
>     /* JSRuntime contextList linkage. */
>     JSCList             link;
> 
>-    /*
>-     * Classic Algol "display" static link optimization.
>-     */
>-#define JS_DISPLAY_SIZE 16U
>-
>-    JSStackFrame        *display[JS_DISPLAY_SIZE];
>-
>     /* Runtime version control identifier. */
>     uint16              version;
> 
>     /* Per-context options. */
>     uint32              options;            /* see jsapi.h for JSOPTION_* */
> 
>     /* Locale specific callbacks for string conversion. */
>     JSLocaleCallbacks   *localeCallbacks;
>@@ -3046,16 +3039,32 @@ CanLeaveTrace(JSContext *cx)
>     JS_ASSERT(JS_ON_TRACE(cx));
> #ifdef JS_TRACER
>     return cx->bailExit != NULL;
> #else
>     return JS_FALSE;
> #endif
> }
> 
>+/*
>+ * Search the call stack for the nearest frame with static level targetLevel.
>+ * @param baseFrame If not provided, the context's current frame is used.
>+ */
>+static JS_INLINE JSStackFrame *
>+FindFrameAtLevel(JSContext *cx, uint16 targetLevel, JSStackFrame * const baseFrame = 0)
>+{
>+    JSStackFrame *it = baseFrame ? baseFrame : cx->fp;
>+    JS_ASSERT(it && it->script);
>+    while (it->script->staticLevel != targetLevel) {
>+        it = it->down;
>+        JS_ASSERT(it && it->script);
>+    }
>+    return it;
>+}
>+
> extern void
> SetPendingException(JSContext *cx, const Value &v);
> 
> } /* namespace js */
> 
> /*
>  * Get the current cx->fp, first lazily instantiating stack frames if needed.
>  * (Do not access cx->fp directly except in JS_REQUIRES_STACK code.)
>diff --git a/js/src/jsdbgapi.cpp b/js/src/jsdbgapi.cpp
>--- a/js/src/jsdbgapi.cpp
>+++ b/js/src/jsdbgapi.cpp
>@@ -1338,79 +1338,36 @@ JS_SetDestroyScriptHook(JSRuntime *rt, J
> JS_PUBLIC_API(JSBool)
> JS_EvaluateUCInStackFrame(JSContext *cx, JSStackFrame *fp,
>                           const jschar *chars, uintN length,
>                           const char *filename, uintN lineno,
>                           jsval *rval)
> {
>     JS_ASSERT_NOT_ON_TRACE(cx);
> 
>-    JSObject *scobj;
>-    JSScript *script;
>-    JSBool ok;
>-
>-    scobj = JS_GetFrameScopeChain(cx, fp);
>+    JSObject *scobj = JS_GetFrameScopeChain(cx, fp);
>     if (!scobj)
>         return JS_FALSE;
> 
>     /*
>      * NB: This function breaks the assumption that the compiler can see all
>      * calls and properly compute a static level. In order to get around this,
>      * we use a static level that will cause us not to attempt to optimize
>      * variable references made by this frame.
>      */
>-    script = Compiler::compileScript(cx, scobj, fp, JS_StackFramePrincipals(cx, fp),
>-                                     TCF_COMPILE_N_GO, chars, length, NULL,
>-                                     filename, lineno, NULL, JS_DISPLAY_SIZE);
>+    JSScript *script = Compiler::compileScript(cx, scobj, fp, JS_StackFramePrincipals(cx, fp),
>+                                               TCF_COMPILE_N_GO, chars, length, NULL, filename,
>+                                               lineno, NULL, UpvarCookie::MAX_LEVEL);
> 
>     if (!script)
>         return JS_FALSE;
> 
>-    JSStackFrame *displayCopy[JS_DISPLAY_SIZE];
>-    if (cx->fp != fp) {
>-        memcpy(displayCopy, cx->display, sizeof displayCopy);
>+    bool ok = !!Execute(cx, scobj, script, fp, JSFRAME_DEBUGGER | JSFRAME_EVAL,
>+                        Valueify(rval));
> 
>-        /*
>-         * Set up cx->display as it would have been when fp was active.
>-         *
>-         * NB: To reconstruct cx->display for fp, we have to follow the frame
>-         * chain from oldest to youngest, in the opposite direction to its
>-         * single linkge. To avoid the obvious recursive reversal algorithm,
>-         * which might use too much stack, we reverse in place and reverse
>-         * again as we reconstruct the display. This is safe because cx is
>-         * thread-local and we can't cause GC until the call to js_Execute
>-         * below.
>-         */
>-        JSStackFrame *fp2 = fp, *last = NULL;
>-        while (fp2) {
>-            JSStackFrame *next = fp2->down;
>-            fp2->down = last;
>-            last = fp2;
>-            fp2 = next;
>-        }
>-
>-        fp2 = last;
>-        last = NULL;
>-        while (fp2) {
>-            JSStackFrame *next = fp2->down;
>-            fp2->down = last;
>-            last = fp2;
>-
>-            JSScript *script = fp2->script;
>-            if (script && script->staticLevel < JS_DISPLAY_SIZE)
>-                cx->display[script->staticLevel] = fp2;
>-            fp2 = next;
>-        }
>-    }
>-
>-    ok = Execute(cx, scobj, script, fp, JSFRAME_DEBUGGER | JSFRAME_EVAL,
>-                 Valueify(rval));
>-
>-    if (cx->fp != fp)
>-        memcpy(cx->display, displayCopy, sizeof cx->display);
>     js_DestroyScript(cx, script);
>     return ok;
> }
> 
> JS_PUBLIC_API(JSBool)
> JS_EvaluateInStackFrame(JSContext *cx, JSStackFrame *fp,
>                         const char *bytes, uintN length,
>                         const char *filename, uintN lineno,
>diff --git a/js/src/jsemit.cpp b/js/src/jsemit.cpp
>--- a/js/src/jsemit.cpp
>+++ b/js/src/jsemit.cpp
>@@ -1910,17 +1910,17 @@ MakeUpvarForEval(JSParseNode *pn, JSCode
>     JSAtom *atom = pn->pn_atom;
> 
>     uintN index;
>     JSLocalKind localKind = js_LookupLocal(cx, fun, atom, &index);
>     if (localKind == JSLOCAL_NONE)
>         return true;
> 
>     JS_ASSERT(cg->staticLevel > upvarLevel);
>-    if (cg->staticLevel >= JS_DISPLAY_SIZE || upvarLevel >= JS_DISPLAY_SIZE)
>+    if (cg->staticLevel >= UpvarCookie::MAX_LEVEL || upvarLevel >= UpvarCookie::MAX_LEVEL)
>         return true;
> 
>     JSAtomListElement *ale = cg->upvarList.lookup(atom);
>     if (!ale) {
>         if (cg->inFunction() &&
>             !js_AddLocal(cx, cg->fun, atom, JSLOCAL_UPVAR)) {
>             return false;
>         }
>@@ -2159,17 +2159,17 @@ BindNameToSlot(JSContext *cx, JSCodeGene
>                 return JS_TRUE;
>         }
>         pn->pn_op = op;
>         pn->pn_cookie.set(cookie);
>         pn->pn_dflags |= PND_BOUND;
>         return JS_TRUE;
>     }
> 
>-    uintN level = cookie.level();
>+    uint16 level = cookie.level();
>     JS_ASSERT(cg->staticLevel >= level);
> 
>     /*
>      * A JSDefinition witnessed as a declaration by the parser cannot be an
>      * upvar, unless it is the degenerate kind of upvar selected above (in the
>      * code before the PND_GVAR test) for the special case of compile-and-go
>      * code generated from eval called from a function, where the eval code
>      * uses local vars defined in the function. We detect this upvar-for-eval
>@@ -2207,31 +2207,31 @@ BindNameToSlot(JSContext *cx, JSCodeGene
>             pn->pn_cookie = cookie;
>             pn->pn_dflags |= PND_BOUND;
>             return JS_TRUE;
>         }
> 
>         return MakeUpvarForEval(pn, cg);
>     }
> 
>-    uintN skip = cg->staticLevel - level;
>+    uint16 skip = cg->staticLevel - level;
>     if (skip != 0) {
>         JS_ASSERT(cg->inFunction());
>         JS_ASSERT_IF(cookie.slot() != UpvarCookie::CALLEE_SLOT, cg->lexdeps.lookup(atom));
>         JS_ASSERT(JOF_OPTYPE(op) == JOF_ATOM);
>         JS_ASSERT(cg->fun->u.i.skipmin <= skip);
> 
>         /*
>          * If op is a mutating opcode, this upvar's static level is too big to
>          * index into the display, or the function is heavyweight, we fall back
>          * on JSOP_*NAME*.
>          */
>         if (op != JSOP_NAME)
>             return JS_TRUE;
>-        if (level >= JS_DISPLAY_SIZE)
>+        if (level >= UpvarCookie::MAX_LEVEL)
>             return JS_TRUE;
>         if (cg->flags & TCF_FUN_HEAVYWEIGHT)
>             return JS_TRUE;
> 
>         if (FUN_FLAT_CLOSURE(cg->fun)) {
>             op = JSOP_GETDSLOT;
>         } else {
>             /*
>diff --git a/js/src/jsemit.h b/js/src/jsemit.h
>--- a/js/src/jsemit.h
>+++ b/js/src/jsemit.h
>@@ -286,17 +286,17 @@ struct JSTreeContext {              /* t
>     union {
>         JSFunction  *fun;           /* function to store argument and variable
>                                        names when flags & TCF_IN_FUNCTION */
>         JSObject    *scopeChain;    /* scope chain object for the script */
>     };
> 
>     JSAtomList      lexdeps;        /* unresolved lexical name dependencies */
>     JSTreeContext   *parent;        /* enclosing function or global context */
>-    uintN           staticLevel;    /* static compilation unit nesting level */
>+    uint16          staticLevel;    /* static compilation unit nesting level */
> 
>     JSFunctionBox   *funbox;        /* null or box for function we're compiling
>                                        if (flags & TCF_IN_FUNCTION) and not in
>                                        Compiler::compileFunctionBody */
>     JSFunctionBox   *functionList;
> 
>     JSParseNode     *innermostWith; /* innermost WITH parse node */
> 
>diff --git a/js/src/jsfun.cpp b/js/src/jsfun.cpp
>--- a/js/src/jsfun.cpp
>+++ b/js/src/jsfun.cpp
>@@ -2497,19 +2497,19 @@ js_NewFlatClosure(JSContext *cx, JSFunct
> 
>     JSObject *closure = js_AllocFlatClosure(cx, fun, scopeChain);
>     if (!closure || fun->u.i.nupvars == 0)
>         return closure;
> 
>     JSUpvarArray *uva = fun->u.i.script->upvars();
>     JS_ASSERT(uva->length <= closure->dslots[-1].toPrivateUint32());
> 
>-    uintN level = fun->u.i.script->staticLevel;
>+    uint16 level = fun->u.i.script->staticLevel;
>     for (uint32 i = 0, n = uva->length; i < n; i++)
>-        closure->dslots[i] = js_GetUpvar(cx, level, uva->vector[i]);
>+        closure->dslots[i] = GetUpvar(cx, level, uva->vector[i]);
> 
>     return closure;
> }
> 
> JSObject *
> js_NewDebuggableFlatClosure(JSContext *cx, JSFunction *fun)
> {
>     JS_ASSERT(cx->fp->fun->flags & JSFUN_HEAVYWEIGHT);
>diff --git a/js/src/jsinterp.cpp b/js/src/jsinterp.cpp
>--- a/js/src/jsinterp.cpp
>+++ b/js/src/jsinterp.cpp
>@@ -425,28 +425,21 @@ class AutoPreserveEnumerators {
> 
> struct AutoInterpPreparer  {
>     JSContext *cx;
>     JSScript *script;
> 
>     AutoInterpPreparer(JSContext *cx, JSScript *script)
>       : cx(cx), script(script)
>     {
>-        if (script->staticLevel < JS_DISPLAY_SIZE) {
>-            JSStackFrame **disp = &cx->display[script->staticLevel];
>-            cx->fp->displaySave = *disp;
>-            *disp = cx->fp;
>-        }
>         cx->interpLevel++;
>     }
> 
>     ~AutoInterpPreparer()
>     {
>-        if (script->staticLevel < JS_DISPLAY_SIZE)
>-            cx->display[script->staticLevel] = cx->fp->displaySave;
>         --cx->interpLevel;
>     }
> };
> 
> JS_REQUIRES_STACK bool
> RunScript(JSContext *cx, JSScript *script, JSFunction *fun, JSObject *scopeChain)
> {
>     JS_ASSERT(script);
>@@ -546,17 +539,16 @@ InvokeCommon(JSContext *cx, JSFunction *
>     fp->argc = argc;
>     fp->argv = vp + 2;
>     fp->rval = (flags & JSINVOKE_CONSTRUCT) ? fp->thisv : UndefinedValue();
>     fp->annotation = NULL;
>     fp->scopeChain = NULL;
>     fp->blockChain = NULL;
>     fp->imacpc = NULL;
>     fp->flags = flags;
>-    fp->displaySave = NULL;
> 
>     /* Initialize regs. */
>     if (script) {
>         regs.pc = script->code;
>         regs.sp = fp->slots() + script->nfixed;
>     } else {
>         regs.pc = NULL;
>         regs.sp = fp->slots();
>@@ -1378,27 +1370,24 @@ js_DoIncDec(JSContext *cx, const JSCodeS
>         return JS_FALSE;
>     (cs->format & JOF_INC) ? ++d : --d;
>     vp->setNumber(d);
>     *vp2 = *vp;
>     return JS_TRUE;
> }
> 
> const Value &
>-js_GetUpvar(JSContext *cx, uintN level, UpvarCookie cookie)
>+js::GetUpvar(JSContext *cx, uint16 closureLevel, UpvarCookie cookie)
> {
>-    level -= cookie.level();
>-    JS_ASSERT(level < JS_DISPLAY_SIZE);
>+    JS_ASSERT(closureLevel >= cookie.level() && cookie.level() > 0);
>+    const uint16 targetLevel = closureLevel - cookie.level();
>+    JSStackFrame *fp = FindFrameAtLevel(cx, targetLevel);
> 
>-    JSStackFrame *fp = cx->display[level];
>-    JS_ASSERT(fp->script);
>-
>-    uintN slot = cookie.slot();
>+    uint16 slot = cookie.slot();
>     Value *vp;
>-
>     if (!fp->fun || (fp->flags & JSFRAME_EVAL)) {
>         vp = fp->slots() + fp->script->nfixed;
>     } else if (slot < fp->fun->nargs) {
>         vp = fp->argv;
>     } else if (slot == UpvarCookie::CALLEE_SLOT) {
>         vp = &fp->argv[-2];
>         slot = 0;
>     } else {
>diff --git a/js/src/jsinterp.h b/js/src/jsinterp.h
>--- a/js/src/jsinterp.h
>+++ b/js/src/jsinterp.h
>@@ -155,18 +155,16 @@ struct JSStackFrame
>      * This lazy cloning is implemented in js_GetScopeChain, which is
>      * also used in some other cases --- entering 'with' blocks, for
>      * example.
>      */
>     JSObject        *scopeChain;
>     JSObject        *blockChain;
> 
>     uint32          flags;          /* frame flags -- see below */
>-    JSStackFrame    *displaySave;   /* previous value of display entry for
>-                                       script->staticLevel */
> 
>     /* Members only needed for inline calls. */
>     void            *hookData;      /* debugger call hook data */
>     JSVersion       callerVersion;  /* dynamic version of calling script */
> 
>     void putActivationObjects(JSContext *cx) {
>         /*
>          * The order of calls here is important as js_PutCallObject needs to
>@@ -406,24 +404,27 @@ GetInstancePrivate(JSContext *cx, JSObje
>     if (!InstanceOf(cx, obj, clasp, argv))
>         return NULL;
>     return obj->getPrivate();
> }
> 
> extern bool
> ValueToId(JSContext *cx, const Value &v, jsid *idp);
> 
>-} /* namespace js */
>-
> /*
>- * Given an active context, a static scope level, and an upvar cookie, return
>- * the value of the upvar.
>+ * @param closureLevel      The static level of the closure that the cookie
>+ *                          pertains to.
>+ * @param cookie            Level amount is a "skip" (delta) value from the
>+ *                          closure level.
>+ * @return  The value of the upvar.
>  */
> extern const js::Value &
>-js_GetUpvar(JSContext *cx, uintN level, js::UpvarCookie cookie);
>+GetUpvar(JSContext *cx, uint16 closureLevel, js::UpvarCookie cookie);
>+
>+} /* namespace js */
> 
> /*
>  * JS_LONE_INTERPRET indicates that the compiler should see just the code for
>  * the js_Interpret function when compiling jsinterp.cpp. The rest of the code
>  * from the file should be visible only when compiling jsinvoke.cpp. It allows
>  * platform builds to optimize selectively js_Interpret when the granularity
>  * of the optimizations with the given compiler is a compilation unit.
>  *
>diff --git a/js/src/jsobj.cpp b/js/src/jsobj.cpp
>--- a/js/src/jsobj.cpp
>+++ b/js/src/jsobj.cpp
>@@ -6625,19 +6625,16 @@ js_DumpStackFrame(JSContext *cx, JSStack
>             fprintf(stderr, " overridden_args");
>         fputc('\n', stderr);
> 
>         if (fp->scopeChain)
>             fprintf(stderr, "  scopeChain: (JSObject *) %p\n", (void *) fp->scopeChain);
>         if (fp->blockChain)
>             fprintf(stderr, "  blockChain: (JSObject *) %p\n", (void *) fp->blockChain);
> 
>-        if (fp->displaySave)
>-            fprintf(stderr, "  displaySave: (JSStackFrame *) %p\n", (void *) fp->displaySave);
>-
>         fputc('\n', stderr);
>     }
> }
> 
> #ifdef DEBUG
> bool
> IsSaneThisObject(JSObject &obj)
> {
>diff --git a/js/src/jsops.cpp b/js/src/jsops.cpp
>--- a/js/src/jsops.cpp
>+++ b/js/src/jsops.cpp
>@@ -230,19 +230,16 @@ BEGIN_CASE(JSOP_STOP)
> 
>     interpReturnOK = true;
>     if (inlineCallCount)
>   inline_return:
>     {
>         JS_ASSERT(!fp->blockChain);
>         JS_ASSERT(!js_IsActiveWithOrBlock(cx, fp->scopeChain, 0));
> 
>-        if (JS_LIKELY(script->staticLevel < JS_DISPLAY_SIZE))
>-            cx->display[script->staticLevel] = fp->displaySave;
>-
>         void *hookData = fp->hookData;
>         if (JS_UNLIKELY(hookData != NULL)) {
>             if (JSInterpreterHook hook = cx->debugHooks->callHook) {
>                 hook(cx, fp, JS_FALSE, &interpReturnOK, hookData);
>                 CHECK_INTERRUPT_HANDLER();
>             }
>         }
> 
>@@ -2248,21 +2245,16 @@ BEGIN_CASE(JSOP_APPLY)
>             newfp->fun = fun;
>             newfp->argc = argc;
>             newfp->argv = vp + 2;
>             newfp->rval = UndefinedValue();
>             newfp->annotation = NULL;
>             newfp->scopeChain = obj->getParent();
>             newfp->flags = flags;
>             newfp->blockChain = NULL;
>-            if (JS_LIKELY(newscript->staticLevel < JS_DISPLAY_SIZE)) {
>-                JSStackFrame **disp = &cx->display[newscript->staticLevel];
>-                newfp->displaySave = *disp;
>-                *disp = newfp;
>-            }
>             JS_ASSERT(!JSFUN_BOUND_METHOD_TEST(fun->flags));
>             JS_ASSERT_IF(!vp[1].isPrimitive(), IsSaneThisObject(vp[1].toObject()));
>             newfp->thisv = vp[1];
>             newfp->imacpc = NULL;
> 
>             /* Push void to initialize local variables. */
>             Value *newsp = newfp->base();
>             SetValueRangeToUndefined(newfp->slots(), newsp);
>@@ -2850,17 +2842,17 @@ END_SET_CASE(JSOP_SETLOCAL)
> BEGIN_CASE(JSOP_GETUPVAR)
> BEGIN_CASE(JSOP_CALLUPVAR)
> {
>     JSUpvarArray *uva = script->upvars();
> 
>     uintN index = GET_UINT16(regs.pc);
>     JS_ASSERT(index < uva->length);
> 
>-    const Value &rval = js_GetUpvar(cx, script->staticLevel, uva->vector[index]);
>+    const Value &rval = GetUpvar(cx, script->staticLevel, uva->vector[index]);
>     PUSH_COPY(rval);
> 
>     if (op == JSOP_CALLUPVAR)
>         PUSH_NULL();
> }
> END_CASE(JSOP_GETUPVAR)
> 
> BEGIN_CASE(JSOP_GETUPVAR_DBG)
>diff --git a/js/src/jsparse.cpp b/js/src/jsparse.cpp
>--- a/js/src/jsparse.cpp
>+++ b/js/src/jsparse.cpp
>@@ -690,17 +690,17 @@ Parser::parse(JSObject *chain)
>         }
>     }
>     return pn;
> }
> 
> JS_STATIC_ASSERT(UpvarCookie::FREE_LEVEL == JS_BITMASK(JSFB_LEVEL_BITS));
> 
> static inline bool
>-SetStaticLevel(JSTreeContext *tc, uintN staticLevel)
>+SetStaticLevel(JSTreeContext *tc, uint16 staticLevel)
> {
>     /*
>      * This is a lot simpler than error-checking every UpvarCookie::set, and
>      * practically speaking it leaves more than enough room for upvars.
>      */
>     if (UpvarCookie::isLevelReserved(staticLevel)) {
>         JS_ReportErrorNumber(tc->parser->context, js_GetErrorMessage, NULL,
>                              JSMSG_TOO_DEEP, js_function_str);
>@@ -714,17 +714,17 @@ SetStaticLevel(JSTreeContext *tc, uintN 
>  * Compile a top-level script.
>  */
> JSScript *
> Compiler::compileScript(JSContext *cx, JSObject *scopeChain, JSStackFrame *callerFrame,
>                         JSPrincipals *principals, uint32 tcflags,
>                         const jschar *chars, size_t length,
>                         FILE *file, const char *filename, uintN lineno,
>                         JSString *source /* = NULL */,
>-                        unsigned staticLevel /* = 0 */)
>+                        uint16 staticLevel /* = 0 */)
> {
>     JSArenaPool codePool, notePool;
>     TokenKind tt;
>     JSParseNode *pn;
>     uint32 scriptGlobals;
>     JSScript *script;
>     bool inDirectivePrologue;
> #ifdef METER_PARSENODES
>diff --git a/js/src/jsparse.h b/js/src/jsparse.h
>--- a/js/src/jsparse.h
>+++ b/js/src/jsparse.h
>@@ -1126,17 +1126,17 @@ struct Compiler
>                         const char *filename, uintN lineno);
> 
>     static JSScript *
>     compileScript(JSContext *cx, JSObject *scopeChain, JSStackFrame *callerFrame,
>                   JSPrincipals *principals, uint32 tcflags,
>                   const jschar *chars, size_t length,
>                   FILE *file, const char *filename, uintN lineno,
>                   JSString *source = NULL,
>-                  unsigned staticLevel = 0);
>+                  uint16 staticLevel = 0);
> };
> 
> } /* namespace js */
> 
> /*
>  * Convenience macro to access Parser.tokenStream as a pointer.
>  */
> #define TS(p) (&(p)->tokenStream)
>diff --git a/js/src/jsscript.h b/js/src/jsscript.h
>--- a/js/src/jsscript.h
>+++ b/js/src/jsscript.h
>@@ -79,16 +79,17 @@ class UpvarCookie
>     static const uint32 FREE_VALUE = 0xfffffffful;
> 
>   public:
>     /*
>      * All levels above-and-including FREE_LEVEL are reserved so that
>      * FREE_VALUE can be used as a special value.
>      */
>     static const uint16 FREE_LEVEL = 0x3fff;
>+    static const uint16 MAX_LEVEL = FREE_LEVEL - 1;
>     static const uint16 CALLEE_SLOT = 0xffff;
>     static bool isLevelReserved(uint16 level) { return level >= FREE_LEVEL; }
> 
>     bool isFree() const { return value == FREE_VALUE; }
>     uint32 asInteger() const { return value; }
>     /* isFree check should be performed before using these accessors. */
>     uint16 level() const { JS_ASSERT(!isFree()); return value >> 16; }
>     uint16 slot() const { JS_ASSERT(!isFree()); return value; }
>diff --git a/js/src/jstracer.cpp b/js/src/jstracer.cpp
>--- a/js/src/jstracer.cpp
>+++ b/js/src/jstracer.cpp
>@@ -3107,18 +3107,18 @@ GetUpvarOnTrace(JSContext* cx, uint32 up
>         *result = state->stackBase[native_slot];
>         return state->callstackBase[0]->get_typemap()[native_slot];
>     }
> 
>     /*
>      * If we did not find the upvar in the frames for the active traces,
>      * then we simply get the value from the interpreter state.
>      */
>-    JS_ASSERT(upvarLevel < JS_DISPLAY_SIZE);
>-    JSStackFrame* fp = cx->display[upvarLevel];
>+    JS_ASSERT(upvarLevel < UpvarCookie::FREE_LEVEL);
>+    JSStackFrame* fp = FindFrameAtLevel(cx, upvarLevel);
>     Value v = T::interp_get(fp, slot);
>     JSValueType type = getCoercedType(v);
>     ValueToNative(v, type, result);
>     return type;
> }
> 
> // For this traits type, 'slot' is the argument index, which may be -2 for callee.
> struct UpvarArgTraits {
>@@ -5632,21 +5632,16 @@ SynthesizeFrame(JSContext* cx, const Fra
> #endif
>     newfp->rval = UndefinedValue();
>     newfp->annotation = NULL;
>     newfp->scopeChain = NULL; // will be updated in FlushNativeStackFrame
>     newfp->flags = fi.is_constructing() ? JSFRAME_CONSTRUCTING : 0;
>     newfp->blockChain = NULL;
>     newfp->thisv.setNull(); // will be updated in FlushNativeStackFrame
>     newfp->imacpc = NULL;
>-    if (newscript->staticLevel < JS_DISPLAY_SIZE) {
>-        JSStackFrame **disp = &cx->display[newscript->staticLevel];
>-        newfp->displaySave = *disp;
>-        *disp = newfp;
>-    }
> 
>     /*
>      * Note that fp->script is still the caller's script; set the callee
>      * inline frame's idea of caller version from its version.
>      */
>     newfp->callerVersion = (JSVersion) fp->script->version;
> 
>     /* Push inline frame. (Copied from js_Interpret.) */
>@@ -5708,17 +5703,16 @@ SynthesizeSlowNativeFrame(TracerState& s
>     fp->argv = state.nativeVp + 2;
>     fp->fun = GET_FUNCTION_PRIVATE(cx, fp->callee());
>     fp->rval = UndefinedValue();
>     fp->annotation = NULL;
>     JS_ASSERT(cx->fp->scopeChain);
>     fp->scopeChain = cx->fp->scopeChain;
>     fp->blockChain = NULL;
>     fp->flags = exit->constructing() ? JSFRAME_CONSTRUCTING : 0;
>-    fp->displaySave = NULL;
> 
>     state.bailedSlowNativeRegs = *cx->regs;
> 
>     cx->stack().pushSynthesizedSlowNativeFrame(cx, cs, fp, state.bailedSlowNativeRegs);
> 
>     state.bailedSlowNativeRegs.pc = NULL;
>     state.bailedSlowNativeRegs.sp = fp->slots();
> }
>@@ -12920,35 +12914,35 @@ JS_DEFINE_CALLINFO_5(extern, UINT32, Get
>  * value. NULL is returned only on a can't-happen condition with an invalid
>  * typemap. The value of the upvar is returned as v.
>  */
> JS_REQUIRES_STACK LIns*
> TraceRecorder::upvar(JSScript* script, JSUpvarArray* uva, uintN index, Value& v)
> {
>     /*
>      * Try to find the upvar in the current trace's tracker. For &vr to be
>-     * the address of the jsval found in js_GetUpvar, we must initialize
>+     * the address of the jsval found in js::GetUpvar, we must initialize
>      * vr directly with the result, so it is a reference to the same location.
>      * It does not work to assign the result to v, because v is an already
>      * existing reference that points to something else.
>      */
>     UpvarCookie cookie = uva->vector[index];
>-    const Value& vr = js_GetUpvar(cx, script->staticLevel, cookie);
>+    const Value& vr = GetUpvar(cx, script->staticLevel, cookie);
>     v = vr;
> 
>     if (LIns* ins = attemptImport(&vr))
>         return ins;
> 
>     /*
>      * The upvar is not in the current trace, so get the upvar value exactly as
>      * the interpreter does and unbox.
>      */
>     uint32 level = script->staticLevel - cookie.level();
>     uint32 cookieSlot = cookie.slot();
>-    JSStackFrame* fp = cx->display[level];
>+    JSStackFrame* fp = FindFrameAtLevel(cx, level);
>     const CallInfo* ci;
>     int32 slot;
>     if (!fp->fun || (fp->flags & JSFRAME_EVAL)) {
>         ci = &GetUpvarStackOnTrace_ci;
>         slot = cookieSlot;
>     } else if (cookieSlot < fp->fun->nargs) {
>         ci = &GetUpvarArgOnTrace_ci;
>         slot = cookieSlot;
>diff --git a/js/src/methodjit/Compiler.cpp b/js/src/methodjit/Compiler.cpp
>--- a/js/src/methodjit/Compiler.cpp
>+++ b/js/src/methodjit/Compiler.cpp
>@@ -1449,42 +1449,29 @@ mjit::Compiler::jsop_getglobal(uint32 in
>     Address address = masm.objSlotRef(globalObj, reg, slot);
>     frame.freeReg(reg);
>     frame.push(address);
> }
> 
> void
> mjit::Compiler::emitReturn()
> {
>-    RegisterID t0 = frame.allocReg();
>-
>     /*
>      * if (!f.inlineCallCount)
>      *     return;
>      */
>     Jump noInlineCalls = masm.branchPtr(Assembler::Equal,
>                                         FrameAddress(offsetof(VMFrame, inlineCallCount)),
>                                         ImmPtr(0));
>     stubcc.linkExit(noInlineCalls, Uses(frame.frameDepth()));
> #if defined(JS_CPU_ARM)
>     stubcc.masm.loadPtr(FrameAddress(offsetof(VMFrame, scriptedReturn)), JSC::ARMRegisters::lr);
> #endif
>     stubcc.masm.ret();
> 
>-    /* Restore display. */
>-    if (script->staticLevel < JS_DISPLAY_SIZE) {
>-        RegisterID t1 = frame.allocReg();
>-        masm.loadPtr(FrameAddress(offsetof(VMFrame, cx)), t0);
>-        masm.loadPtr(Address(JSFrameReg, offsetof(JSStackFrame, displaySave)), t1);
>-        masm.storePtr(t1, Address(t0,
>-                                  offsetof(JSContext, display) +
>-                                  script->staticLevel * sizeof(JSStackFrame*)));
>-        frame.freeReg(t1);
>-    }
>-
>     JS_ASSERT_IF(!fun, JSOp(*PC) == JSOP_STOP);
> 
>     /*
>      * If there's a function object, deal with the fact that it can escape.
>      * Note that after we've placed the call object, all tracked state can
>      * be thrown away. This will happen anyway because the next live opcode
>      * (if any) must have an incoming edge.
>      *
>diff --git a/js/src/methodjit/InvokeHelpers.cpp b/js/src/methodjit/InvokeHelpers.cpp
>--- a/js/src/methodjit/InvokeHelpers.cpp
>+++ b/js/src/methodjit/InvokeHelpers.cpp
>@@ -237,22 +237,16 @@ CreateFrame(VMFrame &f, uint32 flags, ui
>                                cx->debugHooks->callHookData);
>         // CHECK_INTERRUPT_HANDLER();
>     } else {
>         newfp->hookData = NULL;
>     }
> 
>     stack.pushInlineFrame(cx, fp, cx->regs->pc, newfp);
> 
>-    if (newscript->staticLevel < JS_DISPLAY_SIZE) {
>-        JSStackFrame **disp = &cx->display[newscript->staticLevel];
>-        newfp->displaySave = *disp;
>-        *disp = newfp;
>-    }
>-
>     return true;
> }
> 
> static inline void
> FixVMFrame(VMFrame &f, JSStackFrame *fp)
> {
>     f.inlineCallCount++;
>     f.fp->ncode = f.scriptedReturn;
>@@ -295,19 +289,16 @@ InlineCall(VMFrame &f, uint32 flags, voi
> static bool
> InlineReturn(JSContext *cx, JSBool ok)
> {
>     JSStackFrame *fp = cx->fp;
> 
>     JS_ASSERT(!fp->blockChain);
>     JS_ASSERT(!js_IsActiveWithOrBlock(cx, fp->scopeChain, 0));
> 
>-    if (fp->script->staticLevel < JS_DISPLAY_SIZE)
>-        cx->display[fp->script->staticLevel] = fp->displaySave;
>-
>     // Marker for debug support.
>     void *hookData = fp->hookData;
>     if (JS_UNLIKELY(hookData != NULL)) {
>         JSInterpreterHook hook;
>         JSBool status;
> 
>         hook = cx->debugHooks->callHook;
>         if (hook) {
>@@ -489,22 +480,16 @@ CreateLightFrame(VMFrame &f, uint32 flag
> 
> #ifdef DEBUG
>     newfp->savedPC = JSStackFrame::sInvalidPC;
> #endif
>     newfp->down = fp;
>     fp->savedPC = f.regs.pc;
>     cx->setCurrentFrame(newfp);
> 
>-    if (newscript->staticLevel < JS_DISPLAY_SIZE) {
>-        JSStackFrame **disp = &cx->display[newscript->staticLevel];
>-        newfp->displaySave = *disp;
>-        *disp = newfp;
>-    }
>-
>     return true;
> }
> 
> /*
>  * stubs::Call is guaranteed to be called on a scripted call with JIT'd code.
>  */
> void * JS_FASTCALL
> stubs::Call(VMFrame &f, uint32 argc)
>diff --git a/js/src/methodjit/StubCalls.cpp b/js/src/methodjit/StubCalls.cpp
>--- a/js/src/methodjit/StubCalls.cpp
>+++ b/js/src/methodjit/StubCalls.cpp
>@@ -1443,17 +1443,17 @@ stubs::InitElem(VMFrame &f, uint32 last)
> 
> void JS_FASTCALL
> stubs::GetUpvar(VMFrame &f, uint32 ck)
> {
>     /* :FIXME: We can do better, this stub isn't needed. */
>     uint32 staticLevel = f.fp->script->staticLevel;
>     UpvarCookie cookie;
>     cookie.fromInteger(ck);
>-    f.regs.sp[0] = js_GetUpvar(f.cx, staticLevel, cookie);
>+    f.regs.sp[0] = GetUpvar(f.cx, staticLevel, cookie);
> }
> 
> JSObject * JS_FASTCALL
> stubs::DefLocalFun(VMFrame &f, JSFunction *fun)
> {
>     /*
>      * Define a local function (i.e., one nested at the top level of another
>      * function), parented by the current scope chain, stored in a local
Attachment #456616 - Flags: review? → review?(dvander)
Comment on attachment 456616 [details] [diff] [review]
Remove Algol-like display.

>+static JS_INLINE JSStackFrame *
>+FindFrameAtLevel(JSContext *cx, uint16 targetLevel, JSStackFrame * const baseFrame = 0)

NULL instead of 0. Also should this go in jscntxtinlines.h?

Should get this on TM, too.
Attachment #456616 - Flags: review?(dvander) → review+
(In reply to comment #12)
> Also should this go in jscntxtinlines.h?

I'll make a follow up bug to move the inlines that are in jscntxt.h into that file.

> Should get this on TM, too.

Will do after I perf-test it there.
Status: NEW → ASSIGNED
I don't know if they should all be moved, just ones that are only used in select places.
Comment on attachment 456616 [details] [diff] [review]
Remove Algol-like display.

>+ * @param closureLevel      The static level of the closure that the cookie
>+ *                          pertains to.
>+ * @param cookie            Level amount is a "skip" (delta) value from the
>+ *                          closure level.
>+ * @return  The value of the upvar.
>  */
> extern const js::Value &
>-js_GetUpvar(JSContext *cx, uintN level, js::UpvarCookie cookie);
>+GetUpvar(JSContext *cx, uint16 closureLevel, js::UpvarCookie cookie);

Don't change from intentional uintN to uint16 -- that just narrows types from natural width, potentially resulting in extra chopping/masking instructions. This applies elsewhere in the patch.

>diff --git a/js/src/jsparse.cpp b/js/src/jsparse.cpp

The jsparse.cpp patch is too small. The analysis under Parser::setFunctionKinds should shrink by a lot. More tomorrow, but this patch needs review from me -- and it should land on tm first or at the same time, for best regression blame.

/be
(In reply to comment #15)
> Don't change from intentional uintN to uint16 -- that just narrows types from
> natural width, potentially resulting in extra chopping/masking instructions.
> This applies elsewhere in the patch.

The static level for a function may never exceed the free static level (see jsparse.cpp:SetStaticLevel), which restricts the closure's static level to 16b. It really shouldn't result in much chopping on x86 if any, just word-sized instruction modes. Would be interesting to switch it to uintN and see how valgrind says the total instruction count changes, though.

> The jsparse.cpp patch is too small. The analysis under Parser::setFunctionKinds
> should shrink by a lot.

It didn't shrink because we still support JSOP_GETUPVAR, we just retrieve the values by traversing fp->down links.

> More tomorrow, but this patch needs review from me --
> and it should land on tm first or at the same time, for best regression blame.

I'll make a patch against TM tip and r?you.
Agree w/ Brendan - let's use uintN or uint32 instead. uint16 causes more expensive instructions to be emitted. At the very least, it looks a little weird since we don't tend to pass shortened integers around.
Again, for locals and params, use "natural width" int types, not the precise width types used in packed structs. It wins.

/be
(In reply to comment #18)
> Again, for locals and params, use "natural width" int types, not the precise
> width types used in packed structs. It wins.

Will do. Out of curiosity, can you cite sources on the win? I'd be interested to read about precisely why.
Regarding uintN versus uint16 discussion. Consider the following C-file that tests uintN/uint16 argument passing and result returns:

typedef unsigned uintN;
typedef unsigned short uint16;

extern void js_static_assert(int arg[sizeof(uint16) == 2 ? 1 : -1]);

struct Data {
    int i;
    uint16 a;
    uint16 b;
};

uintN f_uint(void *p, int i);
uintN f_uint16(void *p, uint16 s);

uintN test_uint(void *p, struct Data* d)
{
    return f_uint(p, d->b | 1);
}

uintN test_uint16(void *p, struct Data* d)
{
    return f_uint16(p, d->b | 1);
}

uint16 test_arg_uint16(void *p, uint16 arg)
{
    return arg | 1;
}

uintN test_arg_uint(void *p, int arg)
{
    return arg | 1;
}


I compile it with GCC 4.4 on Linux for i386 using:
gcc -m32 -c -S -O2 x.c

This gives the following assembly output:

test_arg_uint16:
	pushl	%ebp
	movl	%esp, %ebp
	movzwl	12(%ebp), %eax
	popl	%ebp
	orl	$1, %eax
	ret

test_arg_uint:
	pushl	%ebp
	movl	%esp, %ebp
	movl	12(%ebp), %eax
	popl	%ebp
	orl	$1, %eax
	ret

test_uint16:
	pushl	%ebp
	movl	%esp, %ebp
	subl	$8, %esp
	movl	12(%ebp), %eax
	movzwl	6(%eax), %eax
	orl	$1, %eax
	movzwl	%ax, %eax
	movl	%eax, 12(%ebp)
	leave
	jmp	f_uint16

test_uint:
	pushl	%ebp
	movl	%esp, %ebp
	subl	$8, %esp
	movl	12(%ebp), %eax
	movzwl	6(%eax), %eax
	orl	$1, %eax
	movzwl	%ax, %eax
	movl	%eax, 12(%ebp)
	leave
	jmp	f_uint


This shows that with GCC 4.4 there is no penalty in passing uint16 as parameters.

Still throughout the code SM style is to use uintN, not a narrower type, when passing values that fits by design uint8 or uint16 types. Typically the actual range of values is even less than the narrow type so one needs asserts in any case to check for value ranges. Using uint16 for arguments and returns just distrusts the reader who is accustom to SM style.
(In reply to comment #20)

> This shows that with GCC 4.4 there is no penalty in passing uint16 as
parameters.

Isn't that because the ABI specifies that the parameters will be 16-bit zero-extended on entry? If you actually call those functions you have a higher risk of getting extra MOVZW or CWT instructions, esp. if you accidentally use a 32-bit inparam (which, I think, this patch does in one place).

This,
        call    _rand
        movl    %eax, 8(%ebp)
        leave
        jmp     _test_arg_uint

versus

        call    _rand
        movzwl  %ax, %eax
        movl    %eax, (%esp)
        call    _test_arg_uint16

So, more evidence :)
(In reply to comment #21)
> 
> This,
>         call    _rand
>         movl    %eax, 8(%ebp)
>         leave
>         jmp     _test_arg_uint
> 
> versus
> 
>         call    _rand
>         movzwl  %ax, %eax
>         movl    %eax, (%esp)
>         call    _test_arg_uint16
> 
> So, more evidence :)

Is this on MAC?
FWIW, I'd like to see this land because it'll make things simpler for me when I'm adding more access regions, to improve alias analysis.  This is because after it lands JSStackFrames will always come from cx->fp, which simplifies some sanity checking I'm doing.
(In reply to comment #24)
> FWIW, I'd like to see this land because it'll make things simpler for me when
> I'm adding more access regions, to improve alias analysis.  This is because
> after it lands JSStackFrames will always come from cx->fp, which simplifies
> some sanity checking I'm doing.

Actually, I'm mistaken about that, never mind.  Sorry for the noise.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
> we understand that to be very few links in the common case 

But we introduced a pathological case as a result: recursive code using the module pattern is now super-slow.  See bug 614834.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: