Last Comment Bug 730497 - Remove the flat closure optimization
: Remove the flat closure optimization
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: All All
-- normal with 1 vote (vote)
: mozilla14
Assigned To: Luke Wagner [:luke]
: Jason Orendorff [:jorendorff]
: 687185 (view as bug list)
Depends on: 731132 732308
Blocks: 647225 733950
  Show dependency treegraph
Reported: 2012-02-24 16:40 PST by Luke Wagner [:luke]
Modified: 2012-10-02 13:08 PDT (History)
18 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---

rm flat closures (106.41 KB, patch)
2012-02-25 16:13 PST, Luke Wagner [:luke]
no flags Details | Diff | Splinter Review
rm flat closures v.2 (114.80 KB, patch)
2012-02-27 12:33 PST, Luke Wagner [:luke]
bhackett1024: review+
jwalden+bmo: review+
Details | Diff | Splinter Review
preparatory sanity changes (3.89 KB, patch)
2012-03-01 00:17 PST, Luke Wagner [:luke]
jorendorff: review+
Details | Diff | Splinter Review

Description User image Luke Wagner [:luke] 2012-02-24 16:40:27 PST
For bug 659577, we want closed-over variables to live exclusively in the activation objects (viz., call and block).  The flat closure optimization complicates this development in several ways: at runtime and in the frontend.  The flat closure optimization also requires a large quantity of sophisticated code, so it is worth re-evaluating whether it is still necessary.  I think, now, they are not for the following reasons:

 1. Bug 659577 removes the need to use JSOP_NAME for upvars: the scope chain will have a fixed depth which means we can emit/jit JSOP_GETALIASEDVAR(i,j) which loads the jth slot i hops up the scope chain; no branching.  Thus, except in cases of extreme lexical nesting, JSOP_GETFCSLOT won't be winning anything.

 2. Call object are about to be a lot cheaper: bug 659577 will remove 'put' and we are going to inline call-obj creation (for early boyer) anyway (which, with objshrink, will only be a few ops).
Comment 1 User image Brian Hackett (:bhackett) 2012-02-24 17:00:39 PST
Sounds good to me.
Comment 2 User image Luke Wagner [:luke] 2012-02-25 16:13:34 PST
Created attachment 600726 [details] [diff] [review]
rm flat closures

 28 files changed, 195 insertions(+), 1011 deletions(-)

This would have been a trivial patch except for the bytecode pattern-matching in str_replace (worth 8% on SS, I see) which depended on the lambda being a flat closure.  I had to write a bit of object-shape-fiddly code which is not by expertise so could both Waldo and Brian look extra careful at LambdaIsGetElem and CallObject::containsLocal?
Comment 3 User image Brian Hackett (:bhackett) 2012-02-27 07:20:46 PST
Comment on attachment 600726 [details] [diff] [review]
rm flat closures

Review of attachment 600726 [details] [diff] [review]:

::: js/src/jsstr.cpp
@@ +2253,5 @@
> +        return NULL;
> +
> +    Value b = UndefinedValue();
> +    if (!scopeChain.asCall().containsLocal(name, &b, cx))
> +        return NULL;

I think this level of pattern matching is excessive --- instead of looking for the local in the immediate parent, you can just do a FindProperty and figure out the current value of that name.  The str_replace optimization will bail out if it ever might call a script, so the value of that name won't change while the optimization is still in play.

There is a secondary problem with this solution though, which is that FindProperty can I think call script, if the lambda is wrapped in code like 'with (crazyProxy) {...}', which could make the optimization visible to scripts.  This is more of an issue with proxies themselves I think and the code is probably littered with places where proxies can make implementation details visible, so I'd be fine with just ignoring this.  It could, however, be solved with a obj->nativeFind variant that is guaranteed non-effectful (similar to obj->nativeLookup vs. LookupProperty).

Everything else looks great, but it would be good to see a followup patch with this addressed.

::: js/src/vm/ScopeObject.cpp
@@ +369,5 @@
>      return true;
>  }
> +bool
> +CallObject::containsLocal(PropertyName *name, Value *vp, JSContext *cx)

Unnecessary with the LambdaIsGetElem stuff discussed above.

::: js/src/vm/ScopeObject.h
@@ +171,5 @@
>      static JSBool setArgOp(JSContext *cx, JSObject *obj, jsid id, JSBool strict, Value *vp);
>      static JSBool setVarOp(JSContext *cx, JSObject *obj, jsid id, JSBool strict, Value *vp);
> +
> +    /* Return whether this environment contains 'name' and, if so, its value. */
> +    bool containsLocal(PropertyName *name, Value *vp, JSContext *cx);

Comment 4 User image Luke Wagner [:luke] 2012-02-27 09:07:55 PST
(In reply to Brian Hackett (:bhackett) from comment #3)
> I think this level of pattern matching is excessive

Haha; you realize the previous version was hard-coded to only accept flat closure upvars -- an optimization that depends on not just properties of closure, but the tokens of the enclosing function -- right?

> There is a secondary problem with this solution though, which is that
> FindProperty can I think call script,

Yes, this is exactly why I didn't want to go this direction.  But I guess there are really only two types of acceptable scope objects here: call and block objects.  How about I hoist containsLocal to ScopeObject and make it search while (isCall || isBlock).  That should strictly generalize the previous optimization.
Comment 5 User image Brian Hackett (:bhackett) 2012-02-27 09:15:22 PST
That's fine, but can you name it nativeFind instead of containsLocal (containsLocal in the patch will also get arguments, so the name already seems off).
Comment 6 User image Luke Wagner [:luke] 2012-02-27 09:23:58 PST
As I was writing the patch I realized that the conservative scope search belongs in LambdaIsGetElem (which calls CallObject::containsVarOrArg and ClonedBlockObject::containsVar) since the scope search isn't well-defined (its just a best effort) whereas containVarOrArg/containsVar are.
Comment 7 User image Luke Wagner [:luke] 2012-02-27 12:33:53 PST
Created attachment 601017 [details] [diff] [review]
rm flat closures v.2

I also changed FindReplaceLength to reuse HasDataProperty which appears to give a mild speedup.

Another bug fixed in this patch is that JSFunction::FinalizeKind needs to change to the BACKGROUND equivalent.  This is good news b/c it means now functions can be finalized on the background thread.
Comment 8 User image Brian Hackett (:bhackett) 2012-02-27 13:15:25 PST
Comment on attachment 601017 [details] [diff] [review]
rm flat closures v.2

Review of attachment 601017 [details] [diff] [review]:


::: js/src/vm/ScopeObject.cpp
@@ +385,5 @@
> +    
> +    if (op == getArgOp) {
> +        JS_ALWAYS_TRUE(getArgOp(cx, this, INT_TO_JSID(shape->shortid()), vp));
> +        return true;
> +    }

It looks like this and the above 'if' block can be combined.
Comment 9 User image Jeff Walden [:Waldo] (remove +bmo to email) 2012-02-29 17:09:19 PST
Comment on attachment 601017 [details] [diff] [review]
rm flat closures v.2

Review of attachment 601017 [details] [diff] [review]:

::: js/src/jit-test/tests/basic/testReplaceWithLambda.js
@@ +11,5 @@
> +        assertEq("abc".replace(/a|b/g, function(a) { return b[a] }), 'ABc');
> +    }
> +})();
> +(function() {
> +    var b = {a:'A', b:'B' };

Add a variant of this that introduces |var b| through an eval, in the non-optimized section, of course, with both inline and in-called-function variants.

::: js/src/jsscript.h
@@ +141,2 @@
> +    uintN countLocalNames() const { return nargs + nvars; }

Could even shorten this to countNames(), I think, without being unclear.  Up to you.

@@ +178,5 @@
>       * destructuring argument.  Return true on success.
>       *
>       * The parser builds shape paths for functions, usable by Call objects at
>       * runtime, by calling an "add" method. All ARGUMENT bindings must be added
> +     * before before any VARIABLE or CONSTANT bindings.

Fix "before before" while you're here.  o_O

::: js/src/jsstr.cpp
@@ +2213,5 @@
> + * Dean Edwards packer) to efficiently encode large scripts. We only handle the
> + * code patterns generated by such packers here.
> + */
> +static JSObject *
> +LambdaIsGetElem(JSObject &lambda, JSContext *cx)


::: js/src/vm/ScopeObject.cpp
@@ +387,5 @@
> +        JS_ALWAYS_TRUE(getArgOp(cx, this, INT_TO_JSID(shape->shortid()), vp));
> +        return true;
> +    }
> +
> +    return false;

I guess we hit this only for eval-induced vars or something?  If you can, assert something about what op might be here.  Not sure if that's possible offhand.
Comment 10 User image Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2012-02-29 19:07:49 PST
I'm a bit worried about this.  This increases the likelihood of functions being "heavyweight", right?  Won't that cause additional "leaks"?  I'm especially interested in addons that don't leak now but might start leaking.
Comment 11 User image Luke Wagner [:luke] 2012-02-29 20:48:34 PST
The flat closure optimization applies to maybe 2-4% of functions and already entrains/roots the global (it is just intermediate scopes that are added by flat closure removal).  Thus the overall probability of leak is rather small and it is just the 1M lines of JS that produce the 2 or so hits I've found.  Not to mention the fact that code balanced on such a tenuous ledge is a ticking time bomb.  I'd rather not keep baggage in the JS engine based on this spectre alone.
Comment 12 User image Luke Wagner [:luke] 2012-03-01 00:17:47 PST
Created attachment 601891 [details] [diff] [review]
preparatory sanity changes

Oops, I forgot the prerequisite sanity-bringing patch.
Comment 13 User image Jeff Walden [:Waldo] (remove +bmo to email) 2012-03-01 00:27:40 PST
It's also worth considering that leakiness is not the only axis here.  Complexity of implementation is another.  And corresponding to that, attack surface and security-trustworthiness are others.  A thousand lines of code removal is nothing to sneeze at, particularly when it's in some of the hairiest code in SpiderMonkey.
Comment 14 User image Jan de Mooij [:jandem] 2012-03-01 04:41:36 PST
*** Bug 687185 has been marked as a duplicate of this bug. ***
Comment 15 User image Andrew McCreight [:mccr8] 2012-03-01 08:20:56 PST
Sure, it would be great to get rid of this code, but we should try to figure out if it has any horrible impact on memory usage.
Comment 16 User image Justin Lebar (not reading bugmail) 2012-03-01 08:26:54 PST
Worst comes to worst, we can always check it in to m-c and then watch the telemetry data.  Would it be easy to pref this on/off, so we could do a proper A/B test?
Comment 17 User image Brian Hackett (:bhackett) 2012-03-01 08:34:23 PST
I don't think this will make a measurable difference, but agree that it would be good to get a sense of how things change.  The patch is almost all code removal, so can't be pref'ed on afterwards, and leaving it in so telemetry data can accumulate will slow down bug 659577, which is important to get done quickly.

This could be done by modifying about:memory to distinguish call objects from other objects, and use membench to compare before/after on a suite of websites.
Comment 18 User image Luke Wagner [:luke] 2012-03-01 13:22:24 PST
An important observation about the two leaks I've fixed: they aren't *real* leaks!  They are cases where a finite amount of stuff that was *already* living as long as the browser just lived a little longer (wrt shutdown) than it used to.  Peterv said that if we put the final cycle collection a little later, they wouldn't be leaks (but, if we did that, we'd miss other, real, leaks, so we don't).

Yes, we can imagine a worst case where an addon started entraining every content global but this is, so far, just speculation.  If we consider probable use cases for closures:
 1. long lived closures (the ones where it would be bad if they entrained content): I'd expect these to be created/registered/bound at startup
 2. closures created when dealing with content: I'd expect these to be:
   a. registered on the content where they may freely entrain the content w/o problem
   b. registered on some long-lived chrome object until some content-unloaded event at which point they would be unregistered

In particular, 2.c. "dealing with content but rooted by a long-lived chrome object" seems weird and sound like an existing leak.  Combine that with the low probability of the flat closure optimization applying (2-6% I've generally seen) and I would not expect there to be real risk.
Comment 19 User image Luke Wagner [:luke] 2012-03-01 21:04:10 PST
And with a 3rd leak down (another global-ish thing peculiar to xpcshell), we are looking green on try.
Comment 20 User image Luke Wagner [:luke] 2012-03-02 10:15:49 PST
And to really put the fear-of-leaks issue to rest: bug 659577 (which all this is for) will only store *closed over* variables in the call object (non-closed-over variables will live and die on the stack).  From a whats-rooted perspective, that is *almost* equivalent to flat closures  (but not quite in the case of multiple nested flat closures with different sets of upvars and different lifetimes...).
Comment 21 User image Jason Orendorff [:jorendorff] 2012-03-02 10:52:53 PST
Comment on attachment 601891 [details] [diff] [review]
preparatory sanity changes

>         /* Optimize accesses to undeclared globals. */
>-        if (!bce->mightAliasLocals() && !TryConvertToGname(bce, pn, &op))
>+        if (!TryConvertToGname(bce, pn, &op))
>             return JS_TRUE;

Huh. What's going on here? Was the mightAliasLocals() check here ever necessary? It looks totally bogus to me.

Comment 22 User image David Anderson [:dvander] 2012-03-02 10:58:15 PST
I think it was because:

var g;
function f() {
    if (x) {
        function g() { }

You can't emit a GNAME for the call to g() because "g" might intervene on the scope chain.
Comment 23 User image Luke Wagner [:luke] 2012-03-02 11:36:29 PST
Whoa.  Initially I thought I was just removing a redundant guard: TryConvertToGname already tests bce->mightAliasLocals before returning 'true'.  But now that I look at it, this changes control flow.  Previously, if mightAliasLocals was true, we'd continue and set the pn to "bound".  With the patch we return early (without converting to a global op), just like all the other conditions.  So, was the previous code incorrect or is there some subtlety about mightAliasLocals that I'm missing here?
Comment 24 User image David Anderson [:dvander] 2012-03-02 14:41:47 PST
Yeah as far as I can tell the old code does not look correct.
Comment 25 User image Jason Orendorff [:jorendorff] 2012-03-02 15:15:52 PST
I don't think PND_BOUND makes much of a difference. As far as I can tell it really just means "we don't need to try binding this name again". The old code looks bogus to me.
Comment 26 User image Jason Orendorff [:jorendorff] 2012-03-02 15:18:31 PST
Luke, if mightAliasLocals() is shorthand for "this scope is dynamic (subject to non-strict direct eval, with, or a function statement)", then a comment would help.
Comment 27 User image Luke Wagner [:luke] 2012-03-02 15:21:26 PST
I don't actually know what it means.  TCF_FUN_MIGHT_ALIAS_LOCALS seems to have a comment.
Comment 28 User image Luke Wagner [:luke] 2012-03-02 15:58:15 PST
Here is the preparatory patch (still waiting on reviews for blocking leak fixes):

Note to sheriff, the bug is still open.
Comment 29 User image Ed Morley [:emorley] 2012-03-04 04:53:13 PST
Comment 32 User image Boris Zbarsky [:bz] (still a bit busy) 2012-03-20 13:11:56 PDT
This apparently caused bug 737575.
Comment 33 User image Luke Wagner [:luke] 2012-03-20 14:46:37 PDT
Update: this bug isn't the cause (bug 733950 is).

Note You need to log in before you can comment on or make changes to this bug.