Closed
Bug 65308
Opened 25 years ago
Closed 24 years ago
Nested function invocation too slow
Categories
(Core :: JavaScript Engine, defect, P2)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla0.9
People
(Reporter: brendan, Assigned: shaver)
Details
(Keywords: js1.5)
Attachments
(11 files)
2.66 KB,
text/plain
|
Details | |
15.39 KB,
patch
|
Details | Diff | Splinter Review | |
15.37 KB,
patch
|
Details | Diff | Splinter Review | |
16.20 KB,
patch
|
Details | Diff | Splinter Review | |
1.95 KB,
patch
|
Details | Diff | Splinter Review | |
17.32 KB,
patch
|
Details | Diff | Splinter Review | |
18.58 KB,
patch
|
Details | Diff | Splinter Review | |
18.64 KB,
patch
|
Details | Diff | Splinter Review | |
18.95 KB,
patch
|
Details | Diff | Splinter Review | |
19.10 KB,
patch
|
Details | Diff | Splinter Review | |
19.22 KB,
patch
|
Details | Diff | Splinter Review |
Call object creation is one cost, but worse: the engine pessimistically creates
an arguments object and syncs actual argument values with it. Patch coming up
in a day or two.
/be
Reporter | ||
Updated•25 years ago
|
Status: NEW → ASSIGNED
Keywords: js1.5,
mozilla0.8
Priority: -- → P2
Target Milestone: --- → mozilla0.8
Reporter | ||
Comment 1•25 years ago
|
||
Moving out, this may miss 1.5 and be an optimization for some future 1.5.1 or
whatever.
/be
Keywords: mozilla0.8 → mozilla0.9
Target Milestone: mozilla0.8 → mozilla0.9
Assignee | ||
Comment 2•24 years ago
|
||
What's an example of this nested function invocation? Can you attach a test
case?
Reporter | ||
Comment 3•24 years ago
|
||
Reporter | ||
Comment 4•24 years ago
|
||
Try uncommenting the // XXX commented function foo() inside quickSortHelper, and
measure performance with and without, in js and xpcshell. Also, try reversing
the array (the second // XXX comment) to get O(n**2) behavior on a sorted array,
just to give a different input to the sort code with and without the nested foo
function.
Thanks for your interest -- hit us here with results, ok?
/be
Reporter | ||
Comment 5•24 years ago
|
||
Now that activation (Call) objects are censored per ECMA, I think a lot of the
js_PutCallObject nonsense can go away -- maybe all of it? Need to get some time
to think the cases through, but feel free to beat me to it.
/be
Assignee | ||
Comment 6•24 years ago
|
||
The cases where we currently become HEAVYWEIGHT are:
- import:
http://lxr.mozilla.org/mozilla/js/src/jsparse.c#1103
- export:
http://lxr.mozilla.org/mozilla/js/src/jsparse.c#1084
- arguments:
http://lxr.mozilla.org/mozilla/js/src/jsparse.c#1340
http://lxr.mozilla.org/mozilla/js/src/jsparse.c#2014
http://lxr.mozilla.org/mozilla/js/src/jsparse.c#2084
http://lxr.mozilla.org/mozilla/js/src/jsparse.c#2350
- eval:
http://lxr.mozilla.org/mozilla/js/src/jsparse.c#2566
- __parent__/__proto__:
http://lxr.mozilla.org/mozilla/js/src/jsparse.c#2870
- with:
http://lxr.mozilla.org/mozilla/js/src/jsparse.c#1642
- debugger
http://lxr.mozilla.org/mozilla/js/src/jsparse.c#1728
I'm without a useful profiler right now, but I'll spend some time looking at how
we can live without a Call object in those cases, or at least in the
nested-function case. (Once I figure out exactly why we needed it in the first
place. =) )
Assignee | ||
Comment 7•24 years ago
|
||
My LXR links are, as brendan put it so eloquently, ``the shitz''.
- import:
http://lxr.mozilla.org/mozilla/source/js/src/jsparse.c#1103
- export:
http://lxr.mozilla.org/mozilla/source/js/src/jsparse.c#1084
- arguments:
http://lxr.mozilla.org/mozilla/source/js/src/jsparse.c#1340
http://lxr.mozilla.org/mozilla/source/js/src/jsparse.c#2014
http://lxr.mozilla.org/mozilla/source/js/src/jsparse.c#2084
http://lxr.mozilla.org/mozilla/source/js/src/jsparse.c#2350
- eval:
http://lxr.mozilla.org/mozilla/source/js/src/jsparse.c#2566
- __parent__/__proto__:
http://lxr.mozilla.org/mozilla/source/js/src/jsparse.c#2870
- with:
http://lxr.mozilla.org/mozilla/source/js/src/jsparse.c#1642
- debugger
http://lxr.mozilla.org/mozilla/source/js/src/jsparse.c#1728
Reporter | ||
Comment 8•24 years ago
|
||
In jsparse.c source order, we have these cases that induce JSFUN_HEAVYWEIGHT:
- export
- import
- for (arguments in o) ...;
- with (o) ...;
- debugger;
- var arguments = ...;
- arguments = ...;
- arguments++, --arguments, etc.
- eval(...)
- __proto__, __parent__ unqualified (no o. to left)
I don't believe any of these can be changed to lightweight. I tried recently to
make the eval() (and o.eval()) case work by lazily converting from lightweight
to heavyweight, but it was too complicated to get right in a few days, which I
think means it's just too damn complicated to be worth the trouble. It might in
fact require more labeling of caller stack frame "intentions", so we can know
how to splice the activation object that's lazily created into the scope chain,
and whether to use it for varobj, etc.
This bug is really asking (as I should have stated more clearly) for cheaper
js_GetCallObject and (especially!) js_PutCallObject implementations once you
know a function has to be heavyweight.
shaver, you wanna take this bug away from me? Calgon take me away!
/be
Assignee | ||
Comment 9•24 years ago
|
||
Brendan and I chatted about this at length, and we've got a strategy that should
make many more functions lightweight, by optimizing local (inner) function
references to stack slots á la SETVAR and friends. Only outer functions whose
inner functions are themselves heavy or make unqualified references to non-local
variables will pay the HEAVYWEIGHT price.
I've got much of it working now:
js> function f() { function g() { } print (g); var g = 5; print (g); }
defining local function g as slot 0 in obj 0x80d1750
defining local function g as slot 0 in obj 0x80d1768
js> f()
function g() {
}
5
js> dis(f)
00000: object function g() {}
00003: setvar 0
main:
00006: name "print"
00009: pushobj
00010: getvar 0
00013: call 1
00016: popv
00017: uint16 5
00020: setvar 0
00023: pop
00024: name "print"
00027: pushobj
00028: getvar 0
00031: call 1
00034: popv
The decompiler, as always, needs my help, but we're making progress. If I had a
profiler, I'd generate some timings before and after for Brendan's little
quickSort thingy. But I don't. Oh well.
Assignee: brendan → shaver
Status: ASSIGNED → NEW
Reporter | ||
Comment 10•24 years ago
|
||
Got shaver's patch working, thanks to his initiative and great effort more than
to my dogged pursuit of js testsuite purity (achieved). Attaching patch. Let's
find real-world and synthetic benchmarks to measure.
/be
Reporter | ||
Comment 11•24 years ago
|
||
Reporter | ||
Comment 12•24 years ago
|
||
Reviewing my own code:
+ /* We couldn't optimize it, so it's not a arg or local var name. */
typo: "an arg"
+ JSBool ok = JS_TRUE;
in js_PutCallObject is unnecessary -- removing.
+ /*
+ * Define a property on enclosing function, so that LookupArgOrVar
+ * can properly optimize later accesses.
+ */
+ if (!OBJ_DEFINE_PROPERTY(cx, cx->fp->varobj, (jsid)funAtom,
+ OBJECT_TO_JSVAL(fun->object),
+ js_GetLocalVariable, js_SetLocalVariable,
+ JSPROP_ENUMERATE | JSPROP_PERMANENT,
+ (JSProperty **)&sprop)) {
+ return NULL;
+ }
Should we use JSPROP_READONLY too? We're stashing the inner function as a
property of the outer only for use during compilation. We could use an internal
data structure here and for local variables, but now's not the time to invent
another mechanism. SpiderMonkey has always used function object as "static
link" or compiler scope chain to keep track of args and locals.
Anyway, it doesn't matter if someone overwrites, e.g., f.g = 42; after function
f(x){function g(y){return x*y}; return g} is compiled, except from the point of
view of observability/introspection. Thoughts?
/be
Reporter | ||
Comment 13•24 years ago
|
||
Reporter | ||
Comment 14•24 years ago
|
||
Uh, "reviewing shaver and my code". Actually, the typo was mine! Anyway, I
fixed a few more glitches (comments, tabs), so look at the last patch, all you
reviewers and testers.
/be
Assignee | ||
Comment 15•24 years ago
|
||
I like JSOP_DEFLOCALFUN, especially the part where it works. If we're just
concerned about observability, let's make the inner function enumerable only,
and forego PERMANENT as well.
The patch looks good, and I'm more than happy to take the cvsblame for it. I'll
try and generate some profiles now to compare.
Status: NEW → ASSIGNED
Assignee | ||
Comment 16•24 years ago
|
||
I'm able to use the profiler only for gross execution timing, and it doesn't
give units, but hey: it's all I've got.
5 runs of each, averaged.
BEFORE AFTER
sorted-inner 920 131
sorted-no-inner 138 131
unsorted-inner 976 160
unsorted-no-inner 168 158
Whee! Looks like we killed it, and good.
Reporter | ||
Comment 17•24 years ago
|
||
/me squints at the patch.
Do we need to ensure that if JSOP_CLOSURE is emitted, that the outer function is
heavyweight, and that no variable is defined? Example:
function f(x){print(g);if(x)function g(){}}
f(0) prints undefined, f(1) prints undefined and then assert-botches at line
3293 of jsinterp.c. fp->varobj is null, fp->scopeChain is [global]. Revised
patch coming up.
/be
Reporter | ||
Comment 18•24 years ago
|
||
Reporter | ||
Comment 19•24 years ago
|
||
js> function f(x){
print(typeof g == 'undefined' || g);
if (x) function g(){};
print(g);
}
js> f(0)
true
6: ReferenceError: g is not defined
js> f(1)
true
function g() {
}
js>
Anyone got another bad case? I'm liking this patch.
/be
Reporter | ||
Comment 20•24 years ago
|
||
Assignee | ||
Comment 21•24 years ago
|
||
You're so quick to concede to the lure of HEAVYWEIGHT!
Why should JSOP_CLOSURE force us to be heavy, a priori?
js> f
function f(x, call) {
if (x) {
call(function g() {return true;});
} else {
call(function g() {return false;});
}
}
js> f2
function f2(x, call) {
if (x) {
function g() {
return true;
}
} else {
function g() {
return false;
}
}
call(g);
}
js> dis(f)
main:
00000: getarg 0
00003: ifeq 20 (17)
...
js> dis(f2)
flags: HEAVYWEIGHT
main:
00000: getarg 0
00003: ifeq 12 (9)
...
Why should f2 be heavy there? It has no inners that are themselves heavy or
peek out, and I thought that was the rule we were using...
Reporter | ||
Comment 22•24 years ago
|
||
We still have a fun vs. var case, with closure:
js> function f(x){var g;print(typeof g);if (x) function g(){}; print(g)}
js> f(0)
undefined
undefined
js> f(1)
undefined
undefined
The last use of g, the final print's arg, is optimized to JSOP_GETVAR. Patch
coming up, once more unto the breach!
/be
Component: Javascript Engine → Keyboard Navigation
Reporter | ||
Comment 23•24 years ago
|
||
shaver: yeah, I thought about keeping the outer lightweight when the inner is a
closure, but it's messy: in the following, the first use of g *must* throw a 'g
is not defined' exception:
function f(x){print(g); if (x) function g(){}}
f(0) || f(1)
/be
Reporter | ||
Comment 24•24 years ago
|
||
Reporter | ||
Comment 25•24 years ago
|
||
Assignee | ||
Comment 26•24 years ago
|
||
I love what you've done with the place.
I'll take your mods as sr=, and prepare to check in if we can sucker jband into
coughing up r=.
For future consideration:
- to make JSOP_CLOSURE not be the heavy, we need to <brendan>avoid a
JSOP_DEFVAR when there is a var and a closure in a function, *and* emit a
JSOP_DEFLOCALFUN at the closure site</brendan>
- we could use a bit in sprop->spare (``they're just _sitting_ there'') to flag
a collision betwen var and closure, rather than putting the whole function out
of bounds.
Reporter | ||
Comment 27•24 years ago
|
||
Yes, I'll r/sr= this, so we can get the patch in and put future work in either
this bug targeted at a later date, or in a separate bug.
jband, whaddya say?
/be
Reporter | ||
Comment 28•24 years ago
|
||
Reporter | ||
Comment 29•24 years ago
|
||
One last (?) bug caught by review:
+ if (!tc->topStmt && (tc->flags & TCF_IN_FUNCTION)) {
+ /*
+ * Define a property on the outer function so that LookupArgOrVar
+ * can properly optimize accesses.
+ *
+ * XXX Here and in Variables, we use the function object's scope,
+ * XXX arguably polluting it, when we could use a compiler-private
+ * XXX scope structure. Tradition!
+ */
+ if (!OBJ_DEFINE_PROPERTY(cx, cx->fp->varobj, (jsid)funAtom,
+ OBJECT_TO_JSVAL(fun->object),
+ js_GetLocalVariable, js_SetLocalVariable,
+ JSPROP_ENUMERATE,
+ (JSProperty **)&sprop)) {
+ return NULL;
+ }
+
+ /* Allocate a slot for this property. */
+ sprop->id = INT_TO_JSVAL(cx->fp->fun->nvars++);
+ }
(1) This code must be defining a native property (sprop, not prop), as it knows
cx->fp->varobj is a function object (whose private data is cx->fp->fun). So it
should use js_DefineProperty, not OBJ_LOOKUP_PROPERTY.
(2) There's no OBJ_DROP_PROPERTY on sprop after we're done -- bad stuck lock and
refcount bug for JS_THREADSAFE mode.
Patch coming up, last one from me.
/be
Reporter | ||
Comment 30•24 years ago
|
||
Comment 31•24 years ago
|
||
I need help understanding the timing. How is TCF_FUN_USES_NONLOCALS getting set
on an inner function statement in time to inform the outer function? I was
trying to get the code to stop at:
(jsparse.c, line 744)
if ((!lambda && funAtom && tc->topStmt) ||
(funtc.flags & TCF_FUN_USES_NONLOCALS)) {
tc->flags |= TCF_FUN_HEAVYWEIGHT;
}
but couldn't invent a case that hit this?
Reporter | ||
Comment 32•24 years ago
|
||
rogerl: good question. That test in jsparse.c follows a recursive descent into
FunctionBody, but the TCF_FUN_USES_NONLOCALS is not set in any jsparse.c code.
It's set only in jsemit.c. But as shaver and I discussed, but failed to record
here, we want to be flexible about which pass is used (Parse or CodeGen) to flag
non-local use. So that test you cite,
if ((!lambda && funAtom && tc->topStmt) ||
*** (funtc.flags & TCF_FUN_USES_NONLOCALS)) {
tc->flags |= TCF_FUN_HEAVYWEIGHT;
}
is unnecessary, but could be considered "future-proofing" (if jsparse.c ever
grew another pass, it could determine non-local use without any jsemit.c code
having to do the job), or a "proof point".
The crucial test is in js_EmitTree, in the TOK_FUNCTION case:
if (!js_EmitFunctionBody(cx, &cg2, pn2, fun))
return JS_FALSE;
+
+ /* We need an activation object if an inner peeks out. */
+ if (cg2.treeContext.flags & TCF_FUN_USES_NONLOCALS)
+ cg->treeContext.flags |= TCF_FUN_HEAVYWEIGHT;
js_FinishCodeGenerator(cx, &cg2);
The set of TCF_FUN_USES_NONLOCALS is in LookupArgOrVar, when an arg or local var
name is *not* found. LookupArgOrVar is called for every TOK_NAME node in the
parsenode tree.
Whoops -- looks like LookupArgOrVar has early returns that fail to set this
flag! Shame on us, new patch coming up.
/be
Reporter | ||
Comment 33•24 years ago
|
||
My mistake, the early returns in LookupArgOrVar are ok (it's ok that they don't
set TCF_FUN_USES_NONLOCALS). The cases are
- pn_slot already set non-negative, or pn_op set to JSOP_ARGUMENTS, i.e.
LookupArgOrVar already optimized.
- TCF_FUN_CLOSURE_VS_VAR set, which implies TCF_FUN_HEAVYWEIGHT (see the
paragraph rogerl cited in his last comment).
- cx->fp->varobj is not a function or call object: we simply aren't compiling
function code.
- we are compiling function code, but cx->fp->scopeChain != cx->fp->varobj,
which means we must be in an eval in a with statement in a function. The with
and the eval both must have already forced JSFUN_HEAVYWEIGHT to be set.
- we're compiling a with statement containing the name that's being considered
for optimization by LookupArgOrVar.
- we're compiling a catch-block that contains the name in question.
NB:
function f(x){function g(x){try{throw x}catch(x){return x}}; return g(x)};
f(42);
throws 42, catches it in the same-named variable, therefore does not optimize
the x in return x into a JSOP_GETARG, leaving the unoptimized JSOP_NAME finds x
in the scope chain, which contains only the With object that catch arranges to
push on the scope chain head, followed by the global object.
Neither f nor g is heavyweight. Their activation objects have been properly
optimized away, even though a JSOP_NAME was left in the compiled bytecode,
precisely because LookupArgOrVar knew that that name referred to a catch var.
- the JSOP_ARGUMENTS optimization special case returns early, but 'arguments' in
function code is clearly a local name.
- early error return after js_LookupProperty failure.
All is well; sorry for the false alarm. I'll post a slightly prettier patch
anyway (cosmetic change; it orders TCF_FUN_* flags to align better, and to make
TCF_FUN_HEAVYWEIGHT have the same 0x80 value as JSFUN_HEAVYWEIGHT).
Ask me questions if anything's unclear.
/be
Reporter | ||
Comment 34•24 years ago
|
||
Comment 35•24 years ago
|
||
r=rogerl
Reporter | ||
Comment 36•24 years ago
|
||
Reporter | ||
Comment 37•24 years ago
|
||
I credited shaver in the log message and checked this in; itchy commit trigger,
just like hyatt (and who knows when or how long the tree will be open again?).
Credit to shaver, blame to me. Thanks to rogerl for the r=.
A spin-off bug about avoiding heavyweight outers when inners are closures that
use no outer names is a fine thing, but I'm lazy right now. We can always find
this bug from the CVS log message, if not from elsewhere.
/be
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Comment 38•20 years ago
|
||
Fixing component, please ignore.
Component: Keyboard: Navigation → JavaScript Engine
You need to log in
before you can comment on or make changes to this bug.
Description
•