Closed Bug 65308 Opened 25 years ago Closed 24 years ago

Nested function invocation too slow

Categories

(Core :: JavaScript Engine, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla0.9

People

(Reporter: brendan, Assigned: shaver)

Details

(Keywords: js1.5)

Attachments

(11 files)

2.66 KB, text/plain
Details
15.39 KB, patch
Details | Diff | Splinter Review
15.37 KB, patch
Details | Diff | Splinter Review
16.20 KB, patch
Details | Diff | Splinter Review
1.95 KB, patch
Details | Diff | Splinter Review
17.32 KB, patch
Details | Diff | Splinter Review
18.58 KB, patch
Details | Diff | Splinter Review
18.64 KB, patch
Details | Diff | Splinter Review
18.95 KB, patch
Details | Diff | Splinter Review
19.10 KB, patch
Details | Diff | Splinter Review
19.22 KB, patch
Details | Diff | Splinter Review
Call object creation is one cost, but worse: the engine pessimistically creates an arguments object and syncs actual argument values with it. Patch coming up in a day or two. /be
Status: NEW → ASSIGNED
Keywords: js1.5, mozilla0.8
Priority: -- → P2
Target Milestone: --- → mozilla0.8
Moving out, this may miss 1.5 and be an optimization for some future 1.5.1 or whatever. /be
Keywords: mozilla0.8mozilla0.9
Target Milestone: mozilla0.8 → mozilla0.9
What's an example of this nested function invocation? Can you attach a test case?
Try uncommenting the // XXX commented function foo() inside quickSortHelper, and measure performance with and without, in js and xpcshell. Also, try reversing the array (the second // XXX comment) to get O(n**2) behavior on a sorted array, just to give a different input to the sort code with and without the nested foo function. Thanks for your interest -- hit us here with results, ok? /be
Now that activation (Call) objects are censored per ECMA, I think a lot of the js_PutCallObject nonsense can go away -- maybe all of it? Need to get some time to think the cases through, but feel free to beat me to it. /be
The cases where we currently become HEAVYWEIGHT are: - import: http://lxr.mozilla.org/mozilla/js/src/jsparse.c#1103 - export: http://lxr.mozilla.org/mozilla/js/src/jsparse.c#1084 - arguments: http://lxr.mozilla.org/mozilla/js/src/jsparse.c#1340 http://lxr.mozilla.org/mozilla/js/src/jsparse.c#2014 http://lxr.mozilla.org/mozilla/js/src/jsparse.c#2084 http://lxr.mozilla.org/mozilla/js/src/jsparse.c#2350 - eval: http://lxr.mozilla.org/mozilla/js/src/jsparse.c#2566 - __parent__/__proto__: http://lxr.mozilla.org/mozilla/js/src/jsparse.c#2870 - with: http://lxr.mozilla.org/mozilla/js/src/jsparse.c#1642 - debugger http://lxr.mozilla.org/mozilla/js/src/jsparse.c#1728 I'm without a useful profiler right now, but I'll spend some time looking at how we can live without a Call object in those cases, or at least in the nested-function case. (Once I figure out exactly why we needed it in the first place. =) )
In jsparse.c source order, we have these cases that induce JSFUN_HEAVYWEIGHT: - export - import - for (arguments in o) ...; - with (o) ...; - debugger; - var arguments = ...; - arguments = ...; - arguments++, --arguments, etc. - eval(...) - __proto__, __parent__ unqualified (no o. to left) I don't believe any of these can be changed to lightweight. I tried recently to make the eval() (and o.eval()) case work by lazily converting from lightweight to heavyweight, but it was too complicated to get right in a few days, which I think means it's just too damn complicated to be worth the trouble. It might in fact require more labeling of caller stack frame "intentions", so we can know how to splice the activation object that's lazily created into the scope chain, and whether to use it for varobj, etc. This bug is really asking (as I should have stated more clearly) for cheaper js_GetCallObject and (especially!) js_PutCallObject implementations once you know a function has to be heavyweight. shaver, you wanna take this bug away from me? Calgon take me away! /be
Brendan and I chatted about this at length, and we've got a strategy that should make many more functions lightweight, by optimizing local (inner) function references to stack slots á la SETVAR and friends. Only outer functions whose inner functions are themselves heavy or make unqualified references to non-local variables will pay the HEAVYWEIGHT price. I've got much of it working now: js> function f() { function g() { } print (g); var g = 5; print (g); } defining local function g as slot 0 in obj 0x80d1750 defining local function g as slot 0 in obj 0x80d1768 js> f() function g() { } 5 js> dis(f) 00000: object function g() {} 00003: setvar 0 main: 00006: name "print" 00009: pushobj 00010: getvar 0 00013: call 1 00016: popv 00017: uint16 5 00020: setvar 0 00023: pop 00024: name "print" 00027: pushobj 00028: getvar 0 00031: call 1 00034: popv The decompiler, as always, needs my help, but we're making progress. If I had a profiler, I'd generate some timings before and after for Brendan's little quickSort thingy. But I don't. Oh well.
Assignee: brendan → shaver
Status: ASSIGNED → NEW
Got shaver's patch working, thanks to his initiative and great effort more than to my dogged pursuit of js testsuite purity (achieved). Attaching patch. Let's find real-world and synthetic benchmarks to measure. /be
Reviewing my own code: + /* We couldn't optimize it, so it's not a arg or local var name. */ typo: "an arg" + JSBool ok = JS_TRUE; in js_PutCallObject is unnecessary -- removing. + /* + * Define a property on enclosing function, so that LookupArgOrVar + * can properly optimize later accesses. + */ + if (!OBJ_DEFINE_PROPERTY(cx, cx->fp->varobj, (jsid)funAtom, + OBJECT_TO_JSVAL(fun->object), + js_GetLocalVariable, js_SetLocalVariable, + JSPROP_ENUMERATE | JSPROP_PERMANENT, + (JSProperty **)&sprop)) { + return NULL; + } Should we use JSPROP_READONLY too? We're stashing the inner function as a property of the outer only for use during compilation. We could use an internal data structure here and for local variables, but now's not the time to invent another mechanism. SpiderMonkey has always used function object as "static link" or compiler scope chain to keep track of args and locals. Anyway, it doesn't matter if someone overwrites, e.g., f.g = 42; after function f(x){function g(y){return x*y}; return g} is compiled, except from the point of view of observability/introspection. Thoughts? /be
Uh, "reviewing shaver and my code". Actually, the typo was mine! Anyway, I fixed a few more glitches (comments, tabs), so look at the last patch, all you reviewers and testers. /be
I like JSOP_DEFLOCALFUN, especially the part where it works. If we're just concerned about observability, let's make the inner function enumerable only, and forego PERMANENT as well. The patch looks good, and I'm more than happy to take the cvsblame for it. I'll try and generate some profiles now to compare.
Status: NEW → ASSIGNED
I'm able to use the profiler only for gross execution timing, and it doesn't give units, but hey: it's all I've got. 5 runs of each, averaged. BEFORE AFTER sorted-inner 920 131 sorted-no-inner 138 131 unsorted-inner 976 160 unsorted-no-inner 168 158 Whee! Looks like we killed it, and good.
/me squints at the patch. Do we need to ensure that if JSOP_CLOSURE is emitted, that the outer function is heavyweight, and that no variable is defined? Example: function f(x){print(g);if(x)function g(){}} f(0) prints undefined, f(1) prints undefined and then assert-botches at line 3293 of jsinterp.c. fp->varobj is null, fp->scopeChain is [global]. Revised patch coming up. /be
Attached patch best fixSplinter Review
js> function f(x){ print(typeof g == 'undefined' || g); if (x) function g(){}; print(g); } js> f(0) true 6: ReferenceError: g is not defined js> f(1) true function g() { } js> Anyone got another bad case? I'm liking this patch. /be
You're so quick to concede to the lure of HEAVYWEIGHT! Why should JSOP_CLOSURE force us to be heavy, a priori? js> f function f(x, call) { if (x) { call(function g() {return true;}); } else { call(function g() {return false;}); } } js> f2 function f2(x, call) { if (x) { function g() { return true; } } else { function g() { return false; } } call(g); } js> dis(f) main: 00000: getarg 0 00003: ifeq 20 (17) ... js> dis(f2) flags: HEAVYWEIGHT main: 00000: getarg 0 00003: ifeq 12 (9) ... Why should f2 be heavy there? It has no inners that are themselves heavy or peek out, and I thought that was the rule we were using...
We still have a fun vs. var case, with closure: js> function f(x){var g;print(typeof g);if (x) function g(){}; print(g)} js> f(0) undefined undefined js> f(1) undefined undefined The last use of g, the final print's arg, is optimized to JSOP_GETVAR. Patch coming up, once more unto the breach! /be
Component: Javascript Engine → Keyboard Navigation
shaver: yeah, I thought about keeping the outer lightweight when the inner is a closure, but it's messy: in the following, the first use of g *must* throw a 'g is not defined' exception: function f(x){print(g); if (x) function g(){}} f(0) || f(1) /be
I love what you've done with the place. I'll take your mods as sr=, and prepare to check in if we can sucker jband into coughing up r=. For future consideration: - to make JSOP_CLOSURE not be the heavy, we need to <brendan>avoid a JSOP_DEFVAR when there is a var and a closure in a function, *and* emit a JSOP_DEFLOCALFUN at the closure site</brendan> - we could use a bit in sprop->spare (``they're just _sitting_ there'') to flag a collision betwen var and closure, rather than putting the whole function out of bounds.
Yes, I'll r/sr= this, so we can get the patch in and put future work in either this bug targeted at a later date, or in a separate bug. jband, whaddya say? /be
One last (?) bug caught by review: + if (!tc->topStmt && (tc->flags & TCF_IN_FUNCTION)) { + /* + * Define a property on the outer function so that LookupArgOrVar + * can properly optimize accesses. + * + * XXX Here and in Variables, we use the function object's scope, + * XXX arguably polluting it, when we could use a compiler-private + * XXX scope structure. Tradition! + */ + if (!OBJ_DEFINE_PROPERTY(cx, cx->fp->varobj, (jsid)funAtom, + OBJECT_TO_JSVAL(fun->object), + js_GetLocalVariable, js_SetLocalVariable, + JSPROP_ENUMERATE, + (JSProperty **)&sprop)) { + return NULL; + } + + /* Allocate a slot for this property. */ + sprop->id = INT_TO_JSVAL(cx->fp->fun->nvars++); + } (1) This code must be defining a native property (sprop, not prop), as it knows cx->fp->varobj is a function object (whose private data is cx->fp->fun). So it should use js_DefineProperty, not OBJ_LOOKUP_PROPERTY. (2) There's no OBJ_DROP_PROPERTY on sprop after we're done -- bad stuck lock and refcount bug for JS_THREADSAFE mode. Patch coming up, last one from me. /be
Attached patch my final answer!Splinter Review
I need help understanding the timing. How is TCF_FUN_USES_NONLOCALS getting set on an inner function statement in time to inform the outer function? I was trying to get the code to stop at: (jsparse.c, line 744) if ((!lambda && funAtom && tc->topStmt) || (funtc.flags & TCF_FUN_USES_NONLOCALS)) { tc->flags |= TCF_FUN_HEAVYWEIGHT; } but couldn't invent a case that hit this?
rogerl: good question. That test in jsparse.c follows a recursive descent into FunctionBody, but the TCF_FUN_USES_NONLOCALS is not set in any jsparse.c code. It's set only in jsemit.c. But as shaver and I discussed, but failed to record here, we want to be flexible about which pass is used (Parse or CodeGen) to flag non-local use. So that test you cite, if ((!lambda && funAtom && tc->topStmt) || *** (funtc.flags & TCF_FUN_USES_NONLOCALS)) { tc->flags |= TCF_FUN_HEAVYWEIGHT; } is unnecessary, but could be considered "future-proofing" (if jsparse.c ever grew another pass, it could determine non-local use without any jsemit.c code having to do the job), or a "proof point". The crucial test is in js_EmitTree, in the TOK_FUNCTION case: if (!js_EmitFunctionBody(cx, &cg2, pn2, fun)) return JS_FALSE; + + /* We need an activation object if an inner peeks out. */ + if (cg2.treeContext.flags & TCF_FUN_USES_NONLOCALS) + cg->treeContext.flags |= TCF_FUN_HEAVYWEIGHT; js_FinishCodeGenerator(cx, &cg2); The set of TCF_FUN_USES_NONLOCALS is in LookupArgOrVar, when an arg or local var name is *not* found. LookupArgOrVar is called for every TOK_NAME node in the parsenode tree. Whoops -- looks like LookupArgOrVar has early returns that fail to set this flag! Shame on us, new patch coming up. /be
My mistake, the early returns in LookupArgOrVar are ok (it's ok that they don't set TCF_FUN_USES_NONLOCALS). The cases are - pn_slot already set non-negative, or pn_op set to JSOP_ARGUMENTS, i.e. LookupArgOrVar already optimized. - TCF_FUN_CLOSURE_VS_VAR set, which implies TCF_FUN_HEAVYWEIGHT (see the paragraph rogerl cited in his last comment). - cx->fp->varobj is not a function or call object: we simply aren't compiling function code. - we are compiling function code, but cx->fp->scopeChain != cx->fp->varobj, which means we must be in an eval in a with statement in a function. The with and the eval both must have already forced JSFUN_HEAVYWEIGHT to be set. - we're compiling a with statement containing the name that's being considered for optimization by LookupArgOrVar. - we're compiling a catch-block that contains the name in question. NB: function f(x){function g(x){try{throw x}catch(x){return x}}; return g(x)}; f(42); throws 42, catches it in the same-named variable, therefore does not optimize the x in return x into a JSOP_GETARG, leaving the unoptimized JSOP_NAME finds x in the scope chain, which contains only the With object that catch arranges to push on the scope chain head, followed by the global object. Neither f nor g is heavyweight. Their activation objects have been properly optimized away, even though a JSOP_NAME was left in the compiled bytecode, precisely because LookupArgOrVar knew that that name referred to a catch var. - the JSOP_ARGUMENTS optimization special case returns early, but 'arguments' in function code is clearly a local name. - early error return after js_LookupProperty failure. All is well; sorry for the false alarm. I'll post a slightly prettier patch anyway (cosmetic change; it orders TCF_FUN_* flags to align better, and to make TCF_FUN_HEAVYWEIGHT have the same 0x80 value as JSFUN_HEAVYWEIGHT). Ask me questions if anything's unclear. /be
r=rogerl
I credited shaver in the log message and checked this in; itchy commit trigger, just like hyatt (and who knows when or how long the tree will be open again?). Credit to shaver, blame to me. Thanks to rogerl for the r=. A spin-off bug about avoiding heavyweight outers when inners are closures that use no outer names is a fine thing, but I'm lazy right now. We can always find this bug from the CVS log message, if not from elsewhere. /be
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Fixing component, please ignore.
Component: Keyboard: Navigation → JavaScript Engine
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: