Closed Bug 753145 Opened 12 years ago Closed 12 years ago

JSScripts should carry complete scope nesting information

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla16

People

(Reporter: jimb, Assigned: luke)

References

Details

(Whiteboard: [js:t])

Attachments

(1 file, 1 obsolete file)

It's hard for the debugger to accurately present frames' and functions' environments because we don't have complete static information about the scopes surrounding functions --- only the nesting of functions themselves.

That is, if we have:

var g;
let (x = 1) {
  g = function g() { return 42; }
}
debugger;

and the onDebuggerStatement handler gets its hands on g, then there's no way for it to discover that g's scope chain ought to include a binding for x. Statically, the block chains only mention other blocks within the same immediately surrounding function; and the function tree only has functions. Dynamically, we won't create any real scope chain element for x's block.

It would be cool to have some static information, available from the script, that provided all this data. Then we could construct a proper ScopeIter for g's enclosing scope.
Blocks: 753158
Whiteboard: [js:t]
Just remember that there are *lots* of JSScript objects, so we don't want to make it bigger for something that's only used for .  For example, I recently moved some debug-only info into a script-to-info map in the compartment (JSCompartment::debugScriptMap).
(In reply to Nicholas Nethercote [:njn] from comment #1)
> Just remember that there are *lots* of JSScript objects, so we don't want to
> make it bigger for something that's only used for .

I think the missing word is "debugging"?  If so: this isn't just for debugging but also for maintaining enough information so that TI can create subset constraints between get/sets of upvars in enclosing scripts.
(That is the dependent bug 753158.)
Almost getting to this.  I came up with a scheme I liked for how to represent this nesting information; perhaps jimb or jorendorff would like to evaluate:

Every script (function, global, eval) has a "JSObject *enclosing" field.  When there is no 'let', this is the enclosing (compiler-created) function object (when nested inside a function) or NULL (when directly inside the global scope).  If the script is nested inside 'let', 'enclosing' points to the innermost static block object.  We can then steal the 'enclosingBlock' pointer of the outermost static block chain to point to the enclosing function (or NULL if in global scope).  Eval, as ever, is a special case:
 - strict eval also creates a lexical scope so, for this, we can use the CallObject as the 'enclosing' of nested functions or let blocks
 - non-strict eval doesn't add a scope so, for example in:

  function g() { eval("let (x=1) { eval('let (y=1) { function f() {} }') }") }

we'd have the chain (arrow means 'enclosing'):

  f -> y's block -> x's block -> g

This makes it difficult to see that x and y are not let-blocks in g, but rather let blocks in two eval activations in g.  Stack frame information helps, but is lost when the eval returns (and a nested closure is debugged).  TI won't care (no GETALIASEDVAR across such scopes) but I suspect the debugger would.  For this, I think a single bit indicating "I'm the outermost static block of an eval activation" would suffice.

(Note: if it makes njn feel better, I'll be removing the globalObject pointer in JSScript in bug 753158, keeping sizeof(JSScript) invariant.)
> (Note: if it makes njn feel better, I'll be removing the globalObject
> pointer in JSScript in bug 753158, keeping sizeof(JSScript) invariant.)

I've already removed it in patch 3 in bug 764714...
(In reply to Nicholas Nethercote [:njn] from comment #5)
I see BytecodeEmitter::globalScope removed, not JSScript::globalObject...
> I see BytecodeEmitter::globalScope removed, not JSScript::globalObject...

Yeah, but that's only because you looked at the real code, rather than the bogus copy of it in my head :P

More seriously, I'm not implacably opposed to any change that increases the size of JSScript.  It's just that until I shrunk it there were a bunch of fields in there that were only used in unusual cases, and given how many scripts we have we should certainly avoid that.  It sounds like this field(s) will be used frequently, so that's good.
Attached patch WIP 1 (obsolete) — Splinter Review
This patch adds the nesting information and a little iterator to walk over it (StaticScopeIter).  I added one use which is just a big assert that checks that the dynamic scope chain matches the static scope chain.  The patch is green on try, but I'm still tweaking it for use in bug 753158.

One thing to note: the patch only links together scopes compiled by the same frontend::CompileScript/CompileFunctionBody invocation.  Thus the scope information isn't really "complete".  However: given that the incompleteness is confined to direct eval which forced all enclosing scopes to be created, I don't think this is a problem for the purposes of the debugger (or bug 753158).
Assignee: general → luke
Status: NEW → ASSIGNED
Comment on attachment 639231 [details] [diff] [review]
WIP 1

Review of attachment 639231 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsscript.h
@@ +417,2 @@
>      js::HeapPtrFunction function_;
> +    js::HeapPtrObject   enclosing_;

Given that elsewhere we have both |enclosingScript| and |enclosingScope|, would renaming this |enclosingScope_| would be clearer?
(In reply to Nicholas Nethercote [:njn] from comment #9)
Sounds good to me.
(In reply to Luke Wagner [:luke] from comment #4)
> Eval, as ever, is a special case:
>  - strict eval also creates a lexical scope so, for this, we can use the
> CallObject as the 'enclosing' of nested functions or let blocks
>  - non-strict eval doesn't add a scope so, for example in:
> 
>   function g() { eval("let (x=1) { eval('let (y=1) { function f() {} }') }")
> }
> 
> we'd have the chain (arrow means 'enclosing'):
> 
>   f -> y's block -> x's block -> g

In this case, y's static block could happily point to x's dynamic block, and x's static block could point to g's CallObject. Given eval, there's no reason not to have static-to-dynamic links, especially since they provide more detail.
(In reply to Jim Blandy :jimb from comment #11)
That could work, but I'm not sure if it would gain us anything: given the simple static scope chain in the current patch, ScopeIter should be able to identify all missing scopes.  I could be missing a use case, though.
Comment on attachment 639231 [details] [diff] [review]
WIP 1

Review of attachment 639231 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/Stack.cpp
@@ +222,5 @@
>      return regs.fp()->script()->code;
>  }
>  
> +static inline void
> +AssertDynamicScopeMatchesStaticScope(JSScript *script, JSObject *scope)

I like this a lot.
(In reply to Luke Wagner [:luke] from comment #12)
> (In reply to Jim Blandy :jimb from comment #11)
> That could work, but I'm not sure if it would gain us anything: given the
> simple static scope chain in the current patch, ScopeIter should be able to
> identify all missing scopes.  I could be missing a use case, though.

So you're expecting that ScopeIter will walk the static and dynamic chains in parallel, and thus it will have the dynamic information at hand anyway? In that case, yeah, there's no real win.
(In reply to Jim Blandy :jimb from comment #14)
Exactly.  Really, it is already doing so (via StaticBlockObject::block_), it's just missing whole functions scopes.  Probably ScopeIter can be refactored to use StaticScopeIter directly.
(Actually, I bet this additional static information will allow some significant simplifications to ScopeIter::settle.)
Alright, green on try, ready to go.
Attachment #639231 - Attachment is obsolete: true
Attachment #640000 - Flags: review?(jimb)
Comment on attachment 640000 [details] [diff] [review]
add script parent

Review of attachment 640000 [details] [diff] [review]:
-----------------------------------------------------------------

Some preliminary comments. Nothing substantive, for better or worse.

::: js/src/tests/js1_5/extensions/regress-300079.js
@@ +26,5 @@
>    }
>    else {
>      expect = 'PASSED';
>  
> +    f = Function("return a * a;");

What's this change about?

::: js/src/vm/ScopeObject.cpp
@@ +45,5 @@
> +        obj = obj->toFunction()->script()->enclosingStaticScope();
> +    } else {
> +        onNamedLambda = true;
> +    }
> +    JS_ASSERT_IF(obj, obj->isStaticBlock() || obj->isFunction());

You could also:
JS_ASSERT_IF(onNamedLambda, obj->isFunction());

@@ +75,5 @@
> +JSScript *
> +StaticScopeIter::funScript() const
> +{
> +    JS_ASSERT(type() == FUNCTION);
> +    return obj->toFunction()->script();

Since this refuses to operate on NAMED_LAMBDA static scopes, there really isn't anything we can ask about them --- not even the name of the bound function.

But I believe this is okay, because when we don't have a dynamic DeclEnvObject, GetDebugScopeForMissing will create a fake CallObject/DeclEnvObject pair anyway --- and populate the DeclEnvObject appropriately then. Is that right?

::: js/src/vm/ScopeObject.h
@@ +24,5 @@
> + * reconstruction of the lexical scope for debugging or compiling upvars. The
> + * static scope is represented at runtime by a tree of compiler-created
> + * objects representing each scope:
> + *  - a StaticBlockObject is created for 'let' and 'catch' scopes
> + *  - a JSFunction+JSScript+Bindings trio is created for funtion scopes

"function"

@@ +35,5 @@
> + *
> + * g's innermost enclosing scope will first be the function scope containing
> + * 'x', enclosed by a scope containing only the name 'f'. (This separate scope
> + * is necessary due to the fact that declarations in the function scope shadow
> + * (dynamically, in the case of 'eval') the lambda name.)

In the first paragraph you talk about a tree of compiler-created objects representing the scopes. Then, here, you talk about "an additional scope". Is that also represented by a compiler-created static object? If not, the comment should make it clear what's used instead to represent it statically.

@@ +50,5 @@
> + *    f does not have an enclosing static scope. This is fine for current uses
> + *    for the same reason as 'with'.
> + *
> + * (See also AssertDynamicScopeMatchesStaticScope.)
> + */

This is a great comment --- thanks!

@@ +300,5 @@
> +     */
> +    bool containsVarAtDepth(uint32_t depth);
> +
> +    /*
> +     * A let binding is aliased is accessed lexically by nested functions or

"is aliased if"

@@ +329,5 @@
>      /*
> +     * The parser uses 'enclosingBlock' as the prev-link in the tc->blockChain
> +     * stack. Note: in the case of hoisting, this prev-link will not ultimately
> +     * be the same as enclosingBlock, initEnclosingStaticScope must be called
> +     * separately in the emitter. 'reset' is just for asserting stackiness.

I don't understand the sentence beginning with "Note:".
(In reply to Jim Blandy :jimb from comment #18)
> > +    f = Function("return a * a;");
> 
> What's this change about?

XDR/CloneScript now assert (perhaps like they always should have) that you only clone top-level functions and this test was cloning a nested function (!!).

> But I believe this is okay, because when we don't have a dynamic
> DeclEnvObject, GetDebugScopeForMissing will create a fake
> CallObject/DeclEnvObject pair anyway --- and populate the DeclEnvObject
> appropriately then. Is that right?

Oh, I'm really just starting out with strict assertions to catch misuse (any use atm would be an error) but I'd be happy to add, e.g. declEnvCallee(), as needed.

> I don't understand the sentence beginning with "Note:".

Yeah, it probably doesn't belong; it's just a rather esoteric detail of why we can't just leave the block chain set from the initial parsing pass.  I'll cut it out.

Addressed/agreed with the rest of the comments; thanks!
(In reply to Luke Wagner [:luke] from comment #19)
> (In reply to Jim Blandy :jimb from comment #18)
> > > +    f = Function("return a * a;");
> > 
> > What's this change about?
> 
> XDR/CloneScript now assert (perhaps like they always should have) that you
> only clone top-level functions and this test was cloning a nested function
> (!!).

Okay. I see now that Firefox is only applying JS_CloneFunctionObject to functions it got directly by compiling source code --- not functions obtained from arbitrary sources.
(In reply to Jim Blandy :jimb from comment #20)
> (In reply to Luke Wagner [:luke] from comment #19)
> > (In reply to Jim Blandy :jimb from comment #18)
> > > > +    f = Function("return a * a;");
> > > 
> > > What's this change about?
> > 
> > XDR/CloneScript now assert (perhaps like they always should have) that you
> > only clone top-level functions and this test was cloning a nested function
> > (!!).
> 
> Okay. I see now that Firefox is only applying JS_CloneFunctionObject to
> functions it got directly by compiling source code --- not functions
> obtained from arbitrary sources.

Should CloneFunctionObject or JS_CloneFunctionObject assert that they're not being applied to a nested function?
(In reply to Jim Blandy :jimb from comment #21)
They do (JS_ASSERT(!script->enclosingStaticScope()).
> XDR/CloneScript now assert (perhaps like they always should have) that you
> only clone top-level functions

Ah... I'd noticed that CloneScript doesn't clone every field in JSScript and wondered why.  Maybe a comment could be added as well?
(In reply to Nicholas Nethercote [:njn] from comment #23)
Sounds good.  I had one in the body of JS_CloneFunctionObject; I'll put one in jsapi.h and another near the assert in js_CloneFunctionObject.
(In reply to Luke Wagner [:luke] from comment #22)
> (In reply to Jim Blandy :jimb from comment #21)
> They do (JS_ASSERT(!script->enclosingStaticScope()).

Right you are --- they do now.
Comment on attachment 640000 [details] [diff] [review]
add script parent

Review of attachment 640000 [details] [diff] [review]:
-----------------------------------------------------------------

Wow, looks awesome. I'm very excited to have this data available.

::: js/src/jsfun.h
@@ +71,5 @@
>      bool isHeavyweight()     const { return JSFUN_HEAVYWEIGHT_TEST(flags); }
>      bool isNullClosure()     const { return kind() == JSFUN_NULL_CLOSURE; }
>      bool isFunctionPrototype() const { return flags & JSFUN_PROTOTYPE; }
>      bool isInterpretedConstructor() const { return isInterpreted() && !isFunctionPrototype(); }
> +    bool isNamedLambda()     const { return (flags & JSFUN_LAMBDA) && atom; }

Yay.

@@ -254,5 @@
>  }
>  
> -inline bool
> -js_IsNamedLambda(JSFunction *fun) { return (fun->flags & JSFUN_LAMBDA) && fun->atom; }
> -

Yay.

::: js/src/jsscript.cpp
@@ +341,5 @@
>  
> +/*
> + * If there's a parent id, then get the parent out of our script's object
> + * array. We know that we clone block objects in outer-to-inner order, which
> + * means that getting the parent now will work.

This comment doesn't make sense in this context. When is "now"? And it doesn't get a parent given an id; it finds a parent and returns its id.

@@ +662,5 @@
>  
>      /*
> +     * Here looping from 0-to-length to xdr objects is essential to ensure that
> +     * all references to enclosing blocks (via FindBlockIndex below) happen
> +     * after the enclosing block has been XDR'd.

I think *this* is the "now" that FindBlockIndex's comment means to cite. But it doesn't make sense as written.

@@ +689,5 @@
> +                }
> +            }
> +            if (!xdr->codeUint32(&funEnclosingScopeIndex))
> +                return false;
> +            Rooted<JSObject*> funEnclosingScope(cx);

(I still find Rooted<JSObject *> clearer than RootedObject...)

@@ +1771,5 @@
> +                clone = CloneStaticBlockObject(cx, enclosingScope, innerBlock);
> +            } else {
> +                Rooted<JSFunction*> innerFun(cx, obj.toFunction());
> +
> +                StaticScopeIter ssi = innerFun->script()->enclosingStaticScope();

This is really applying the StaticScopeIter constructor; could we use the constructor syntax here?

::: js/src/vm/ScopeObject.h
@@ +57,5 @@
> +    JSObject *obj;
> +    bool onNamedLambda;
> +
> +  public:
> +    StaticScopeIter(JSObject *obj);

Should this be 'explicit'? It seems a little odd to automatically convert, and initializations can use the constructor syntax.

::: js/src/vm/Stack.cpp
@@ +225,5 @@
> +static inline void
> +AssertDynamicScopeMatchesStaticScope(JSScript *script, JSObject *scope)
> +{
> +#ifdef DEBUG
> +    for (StaticScopeIter i = script->enclosingStaticScope(); !i.done(); i++) {

I came into this thinking about everything that can appear on a scope chain, so I was surprised not to see a case for 'eval' here, until I remembered that StaticScopeIter just stops before reaching such scopes. It'd be nice to have a comment to that effect.
Attachment #640000 - Flags: review?(jimb) → review+
(In reply to Jim Blandy :jimb from comment #26)
Thanks Jim!  Comments applied.
Unfortunately bug 772303 (https://hg.mozilla.org/integration/mozilla-inbound/rev/fad7d06d7dd5) had to be backed out for winxp pgo-only jsreftest failures.

This bug (amongst others) either caused unresolvable (for someone who doesn't work on the JS engine at least) conflicts with the backout of fad7d06d7dd5, or else bugzilla dependencies indicated that one of it's dependants had now been backed out. 

Since there was no one in #developers that could resolve the conflicts, unfortunately this bug has been backed out:

https://hg.mozilla.org/integration/mozilla-inbound/rev/f0be4b70b814
Target Milestone: mozilla16 → ---
Target Milestone: --- → mozilla16
https://hg.mozilla.org/mozilla-central/rev/7221c50cb5b4
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.