Closed Bug 962599 Opened 10 years ago Closed 10 years ago

Store let-bound variables in the fixed part of stack frames

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla30

People

(Reporter: wingo, Assigned: wingo)

References

Details

Attachments

(1 file, 4 obsolete files)

This will allow generators to cheaply capture let-bound variables, as they won't appear on the operand stack.  It will also fix some analyses in Ion that expect phi variables to have stack slots (bug 956166).
Blocks: 956166
\o/
As bug 956166 notes, currently "stackDepth()" is computed at compile-time as the common characteristic of NestedScope objects (with scopes, static block scopes, and cloned block scopes).

(As an aside, it doesn't seem that ClonedBlockObject needs stackDepth at all, as it has a link to a corresponding StaticBlockObject, but it has to have it in order to have the same shape as CallObject.  Boo.)

One use of stackDepth is to record the slot at which the first block local starts.  This is used when entering scopes with aliased locals, to copy the locals from the stack out to the heap.  It is also used when debugging is enabled, to allow DebugScopeObject to modify the stack, and to copy stack values off to the heap if there is a live DebugScopeObject.

The other use of stackDepth is to provide a partial order over scopes in a function.  There is a use of this in a ScopeIter constructor, but that constructor is dead and I have a patch to remove it.  The real use is in Interpreter.cpp::UnwindScope, when unwinding the scope chain before a "catch".  We need to know what the scope chain was when the try block began.  Currently we can use stackDepth to tell us this, as both block scopes and with scopes add to the stack.  If let scopes don't add to the stack, this option is unavailable.

I think probably what we should do is instead of recording the stack depth in a try note, we should record the scope.  If the topmost scope in a try is either the activation or a block scope, we can record an index into the script's object array.  Otherwise if the topmost scope is a with, we record the  scope depth in the try note, and record the scope depth instead of the stack depth in WithObject.

Sound like a plan?  I'm on it if that is OK with yall.
That sounds quite reasonable.

'with' is kindof a royal pain b/c it isn't present in the static scope information.  I was just thinking, would it make sense to make something analogous to StaticScopeObject that went into the static scope chain?  Then try notes could just point to a static scope object.  This might also simplify the hairy logic in ScopeIter.  Maybe that's more trouble than it's worth for this bug, though.  (In my magic future world, StaticScopeObject wouldn't be a real object, but be more like Bindings: a struct with info and the shape of cloned block objects.)
(In reply to Luke Wagner [:luke] from comment #3)
> analogous to StaticScopeObject

I meant "StaticBlockObject", in case that wasn't clear.
I like the idea of an analogue for WithObject on the static scope chain.  It is a very neat solution, and for now can be implemented in the same way as StaticBlockObject/ClonedBlockObject -- the dynamic one has a proto, and the static one doesn't.  This meshes well with the existing WithObject implementation, which links to the target object via the proto field.

JSTryNote can then change to record the static scope depth instead of the operand stack size.
I have a couple patches locally -- one that renames most functions that operate on "static blocks" to operate on "static scopes", and the next that adds StaticWithObject nodes to the static scope chain (complementing DynamicWithObject nodes on the reified run-time scope chain).

Now I am looking on how to mesh these with TryNoteIter, so that we can avoid using stackDepth as a measure of try dispatch state.  See Interpreter.cpp:TryNoteIter::settle.  I'm not quite sure how to do this; the straightforward thing to do is to duplicate finally blocks everywhere they are needed, and that way everything works out as it should.  That would allow you to get rid of GOSUB and RETSUB as well.  Dunno though, I would like to not get sucked into that :P
Depends on: 965243
Thinking on it some more, it looks like for now I can avoid messing with try/finally versus for-in: as for-in always pushes on the iter (unlike lexical scopes in the future) the existing logic is sufficient, though hacky.  I'll file another bug for thoughts on try/finally.
Depends on: 966912
I should note that without more GC support, this change will make garbage collection less precise, as let scopes will never be popped off the stack and so could be seen as live for longer.  There are a few ways to avoid this; probably the cheapest would be a map from call/allocation locations to some kind of "fixed stack height"; the marker could null out fixed slots above that height.  It's a relatively minor point, however.
Block-scoped bindings will not be present in the jsscript bindings, as any given slot might be occupied by different bindings at different points in the script.  However we do need to know what portion of the fixed slots set has Binding* backings; so I am adding a uint32_t field to Bindings.  Bindings will have numArgs, numVars, and also numBlockScoped.  This increases the size of Bindings from 4 to 5 words :/  Oh well.
Hum, scratch that -- it would be 5 words on a 32-bit system, but only 3 on a 64-bit system.  Not so bad I guess.
Oh boo, bindings is embedded in jsscript.  Maybe I can salvage this somehow.  I certainly don't want to make jsscript bigger!
(In reply to Andy Wingo [:wingo] from comment #8)
> I should note that without more GC support, this change will make garbage
> collection less precise, as let scopes will never be popped off the stack
> and so could be seen as live for longer.

Alternatively, we could go back to emitting JSOP_POPBLOCKSCOPE always and then, if !blockObj->needsClone(), we assign undefined to the associated fixed slot.

(In reply to Andy Wingo [:wingo] from comment #9)
> However we do need to know what portion of the fixed slots set
> has Binding* backings; so I am adding a uint32_t field to Bindings. 

Could you explain a bit more in what context this query is needed?
(In reply to Luke Wagner [:luke] from comment #12)
> (In reply to Andy Wingo [:wingo] from comment #9)
> > However we do need to know what portion of the fixed slots set
> > has Binding* backings; so I am adding a uint32_t field to Bindings. 
> 
> Could you explain a bit more in what context this query is needed?

Many things need to know script->nfixed().  Currently this is bindings.numVars().  In the future it will also include block-scoped locals, but there's no space to store that information in jsscript right now, and those locals won't have corresponding Binding* pointers in the bindings vector.

My current strategy is to use the uint16_t space that is in Bindings to store numBlockScoped.  It's a limitation but oh well, there are other limits right now, and this one doesn't really add to them.
(In reply to Andy Wingo [:wingo] from comment #13)
Ah, of course, thanks for explaining.  The uint16_t limitation makes sense.  Assuming numVars_ is padded out to the next multiple of 4 on 32/64-bit, then we'd effectively get this for free, right?
Blocks: 942810
(In reply to Luke Wagner [:luke] from comment #14)
> (In reply to Andy Wingo [:wingo] from comment #13)
> Ah, of course, thanks for explaining.  The uint16_t limitation makes sense. 
> Assuming numVars_ is padded out to the next multiple of 4 on 32/64-bit, then
> we'd effectively get this for free, right?

Yes I think that's right.  Fingers crossed and in any case the static asserts will carp otherwise :)
Assignee: nobody → wingo
This is a WIP patch; still many many failures.  But I was getting a little paranoid and wanted to back it up somewhere.
Still a WIP patch, but passing the jit-test suite.
Attachment #8370854 - Attachment is obsolete: true
Attachment #8374112 - Attachment is obsolete: true
Comment on attachment 8374760 [details] [diff] [review]
Store let-bound variables in the fixed part of stack frames

Some notes on this patch:

The operand stack is no longer accessed by GETLOCAL.  GETLOCAL is only used to access slots in the fixed part of stack frames.

The fixed part of stack frames is now composed of "fixed vars" plus block-scoped locals.  Only function frames have fixed vars (nfixedvars()).  All kinds of code may have block-scoped locals.

JSOP_POPNV is gone, as let scopes no longer push onto the stack, and so we don't need to pop off the stack while leaving a value.  Let expressions and statements are differentiated by whether their body is an expression or a statement; in the former case there will be a value left on the stack, and in the latter not.

There were a couple of cases that used GETLOCAL to address the operand stack: array comprehensions, and destructuring binding.  This wasn't really possible given that when we emit bytecode for a particular statement, we don't know the fixed size of a stack frame.  When compiling a script, for example, parsing and compiling later statements can indicate a need for more block-scoped locals, so nfixed() expands as we parse and compile the script.  So instead I introduced DUPAT, which is kinda like GETLOCAL except instead of addressing the fixed section from the fp, it addresses the operand stack from the sp.

There was some trickiness in the OSR side of things to determine whether a fixed slot was aliased or not.  There is one open FIXME that I'd like jandem to look at.
Attachment #8374760 - Flags: review?(luke)
Attachment #8374760 - Flags: review?(jdemooij)
https://tbpl.mozilla.org/?tree=Try&rev=f36685e53d94 with a style fix and a fix to CloneStaticBlockObject.
Comment on attachment 8374760 [details] [diff] [review]
Store let-bound variables in the fixed part of stack frames

Review of attachment 8374760 [details] [diff] [review]:
-----------------------------------------------------------------

Nice work! r=me with comments below addressed. I only skimmed the frontend, Bindings and ScopeObject changes though as Luke is more familiar with that.

::: js/src/frontend/BytecodeEmitter.cpp
@@ +2428,5 @@
>  
> +static bool
> +PushUndefinedValues(ExclusiveContext *cx, BytecodeEmitter *bce, unsigned n)
> +{
> +    for (unsigned i = 0; i < n; ++i)

Nit: { } around loop body, and there's some trailing whitespace after this function and the next one.

::: js/src/jit/BaselineCompiler.cpp
@@ +853,5 @@
>      return true;
>  }
>  
>  bool
> +BaselineCompiler::emit_JSOP_DUPAT()

You can now remove the |if| in BaselineCompiler::JSOP_GETLOCAL:

    if (local >= frame.nlocals()) {
        // Destructuring assignments may use GETLOCAL to access stack values.
        ...
    }

And add an assert to BaselineFrameInfo::pushLocal: JS_ASSERT(local < nlocals()).

You can also remove the #ifdef DEBUG code in BaselineFrameInfo::addressOfLocal and add the same local < nlocals() assert there. Very nice!

::: js/src/jit/CompileInfo.h
@@ +171,2 @@
>      // Number of slots needed for local variables.
>      unsigned nlocals() const {

Nit: people may wonder what's the difference between nlocals and nfixedvars. Please change the comment to explain it's vars + block-scoped locals.

@@ +252,5 @@
> +
> +            // Otherwise, it might be part of a block scope.
> +            for (; staticScope; staticScope = staticScope->enclosingNestedScope()) {
> +                if (staticScope->is<StaticBlockObject>()) {
> +                    StaticBlockObject &blockObj = staticScope->as<StaticBlockObject>();

Nit: use a continue to get rid of an extra level of indentation:

if (!staticScope->is<StaticBlockObject>())
    continue;

::: js/src/jit/IonBuilder.cpp
@@ +99,5 @@
> +    for (size_t i = frame->script()->nfixedvars(); i < frame->script()->nfixed(); i++) {
> +        // FIXME: If this slot corresponds to a scope that is active at this PC,
> +        // and the slot is unaliased, we should initialize the type from the
> +        // slot value, as above.
> +        inspector->varTypes.infallibleAppend(types::Type::UndefinedType());

If you pass a CompileInfo &info to this function, you can do:

if (info.isSlotAliasedAtOsr(info.firstLocalSlot() + i)) // please double check
    ... append UndefinedType() ...
else
    ... append type from frame value ...

If you use frame->unaliasedLocal() you can combine this loop with the previous one maybe, but if you think it's more efficient or nicer to have separate loops that's fine as well.

@@ +1566,2 @@
>        {
> +        current->pushSlot(current->stackDepth() - 1 - GET_UINT24(pc));

Nit: can remove the {}.

::: js/src/jsanalyze.cpp
@@ +934,5 @@
>            }
>  
>            case JSOP_CALLLOCAL:
>            case JSOP_SETLOCAL: {
> +            JS_ASSERT(GET_LOCALNO(pc) < script_->nfixed());

Nit: you can combine this with the previous case, since the bodies are the same, and also drop the {}.

@@ +939,5 @@
>              break;
>            }
>  
>            case JSOP_PUSHBLOCKSCOPE:
>              localsAliasStack_ = true;

Nit: this is the only place where we set localsAliasStack_. I think a better name for it now is hasLetVars_, hasBlockScope_ or something. Can you rename it?

::: js/src/vm/Stack.cpp
@@ +1267,5 @@
>  {
>      if (!checkAliasing)
>          return;
>  
>      JS_ASSERT(i < script->nslots());

Can we change this to script->nfixed() now?
Attachment #8374760 - Flags: review?(jdemooij) → review+
Oh you should also bump XDR_BYTECODE_VERSION in Xdr.h
Comment on attachment 8374760 [details] [diff] [review]
Store let-bound variables in the fixed part of stack frames

Review of attachment 8374760 [details] [diff] [review]:
-----------------------------------------------------------------

\o/

::: js/src/frontend/BytecodeCompiler.cpp
@@ +362,5 @@
> +                // The statements of a script are parsed and compiled one at a
> +                // time, but into one script object.  We need to make sure that
> +                // the maximum blockScopeDepth is correctly accumulated, even
> +                // when we have to restart the parse.
> +                pc.ref().blockScopeDepth = script->bindings.numBlockScoped();

Could you instead give ParseContext's constructor an initialBlockScopeDepth argument?  That makes it a tiny bit less ad hoc that we have to manually reset blockScopeDepth on failure and potentially catches this bug in other contexts (should they arise) where we restart parsing by destroy();construct().

::: js/src/frontend/BytecodeEmitter.cpp
@@ +748,5 @@
> +        Rooted<NestedScopeObject *> outer(cx, bce->staticScope);
> +        for (; outer; outer = outer->enclosingNestedScope()) {
> +            if (outer->is<StaticBlockObject>()) {
> +                localOffset = outer->as<StaticBlockObject>().localOffset() +
> +                    outer->as<StaticBlockObject>().slotCount();

Perhaps:
  StaticBlockObject &blockObj = outer->as<StaticBlockObject>();
  localOffset = blockObj.localOffset() + blockObj.slotCount();
?

@@ +765,5 @@
> +//
> +// A nested scope is a region of a compilation unit (function, script, or eval
> +// code) with an additional node on the scope chain.  This node may either be a
> +// "with" object or a "block" object.  "With" objects represent "with" scopes.
> +// Block objects represent lexical scopes, and contain let bindings: set of

Should this be "and contain let or catch bindings"?

@@ +767,5 @@
> +// code) with an additional node on the scope chain.  This node may either be a
> +// "with" object or a "block" object.  "With" objects represent "with" scopes.
> +// Block objects represent lexical scopes, and contain let bindings: set of
> +// named variables.  Those variables may be local and thus accessible directly
> +// from the stack, or "aliased" and only accessible through the scope chain.

Perhaps '... or "aliased" (accessed by nested functions or via dynamic name lookup) and ...'

@@ +784,5 @@
> +// scope chain at any pre-retire PC can be retrieved using
> +// JSScript::getStaticScope(jsbytecode *pc).
> +//
> +// Block scopes store their locals in the fixed part of a stack frame, after the
> +// "fixed var" bindings.  A fixed var binding is a "var" binding that occurs in

a "var" or "const" binding

@@ +2433,5 @@
> +        if (Emit1(cx, bce, JSOP_UNDEFINED) < 0)
> +            return false;
> +    return true;
> +}
> +        

Trailing whitespace here and below (bugzilla diff viewer makes 'em red; you can get hg diff to do the same by adding

[color]
diff.trailingwhitespace = bold red_background

to your .hgrc.)

@@ +2494,5 @@
> +
> +        if (!EnterNestedScope(cx, bce, &stmtInfo, pn2->pn_objbox, STMT_BLOCK))
> +            return false;
> +
> +        if (!InitializeBlockScopedLocalsFromStack(cx, bce, blockObj))

This PushUndefinedValues;EnterNestedScope;InitializeBlockScopedLocalsFromStack pattern appears 5 times and the only variation I see is the alreadyPushed business in EmitLexicalScope.  Could you hoist this out into some PushBlock function?

@@ +4953,5 @@
>          BindingIter bi(bce->script);
>          while (bi->name() != fun->atom())
>              bi++;
> +        JS_ASSERT(bi->kind() == Binding::VARIABLE || bi->kind() == Binding::CONSTANT
> +                  || bi->kind() == Binding::ARGUMENT);

I think SM style is to have the || be at the end of the first line, not the begin of the second.

::: js/src/frontend/FullParseHandler.h
@@ +625,5 @@
>      ParseNode *pn = LexicalScopeNode::create(PNK_LEXICALSCOPE, this);
>      if (!pn)
>          return nullptr;
>  
> +    pn->setOp(JSOP_POP);

The change in the ParseNode comment would seem to indicate that you could remove this altogether (or, if not, update the ParseNode comment).

::: js/src/jsanalyze.cpp
@@ +1805,5 @@
>              stack[stackDepth - 1].v = stack[stackDepth - 3].v = code->poppedValues[0];
>              stack[stackDepth - 2].v = stack[stackDepth - 4].v = code->poppedValues[1];
>              break;
>  
> +          case JSOP_DUPAT: {

This reminds me I should get around to filing a bug for removing jsanalyze and handling the arguments optimization directly in Ion and the frontend as discussed previously...

::: js/src/jsscript.cpp
@@ +79,5 @@
>      JS_ASSERT(!(uintptr_t(bindingArray) & TEMPORARY_STORAGE_BIT));
>      JS_ASSERT(numArgs <= ARGC_LIMIT);
>      JS_ASSERT(numVars <= LOCALNO_LIMIT);
> +    JS_ASSERT(numBlockScoped <= LOCALNO_LIMIT);
> +    JS_ASSERT(numVars + numBlockScoped <= LOCALNO_LIMIT);

I think numVars + numBlockScoped could overflow so it'd be better to use subtraction.

::: js/src/vm/ScopeObject.h
@@ +448,5 @@
> +    }
> +
> +    // Return the local corresponding to the 'var'th binding where 'var' is in the
> +    // range [0, slotCount()).
> +    uint32_t varToLocalIndex(uint32_t var) {

I think "var" isn't quite the right name for the input domain here.  I would say "slotToLocalIndex", which seems clearly implied by the input domain of [0, slotCount()), but that is inconsistent with localIndexToSlot which adds in RESERVED_SLOTS.  Really, we have a pre-existing ambiguity with blocks here that "slot" sometimes means "with RESERVED_SLOTS added in" and sometimes "without".  Engine-wide, "slot" means "with RESERVED_SLOTS" of course, so I'd say that we need a new term for [0, slotCount()).  "Block index" seems appropriate.  If you're up for it, perhaps we could rename s/slotCount/numVariables/ and s/varToLocalIndex/blockIndexToLocalIndex/ and then tidy up the variable names used at callsites of these functions to refer to "blockIndex" instead of "slot"?
Attachment #8374760 - Flags: review?(luke) → review+
Thank you both for the prompt review!

(In reply to Luke Wagner [:luke] from comment #25)
> > ::: js/src/vm/ScopeObject.h
> @@ +448,5 @@
> > +    }
> > +
> > +    // Return the local corresponding to the 'var'th binding where 'var' is in the
> > +    // range [0, slotCount()).
> > +    uint32_t varToLocalIndex(uint32_t var) {
> 
> I think "var" isn't quite the right name for the input domain here.  I would
> say "slotToLocalIndex", which seems clearly implied by the input domain of
> [0, slotCount()), but that is inconsistent with localIndexToSlot which adds
> in RESERVED_SLOTS.  Really, we have a pre-existing ambiguity with blocks
> here that "slot" sometimes means "with RESERVED_SLOTS added in" and
> sometimes "without".  Engine-wide, "slot" means "with RESERVED_SLOTS" of
> course, so I'd say that we need a new term for [0, slotCount()).  "Block
> index" seems appropriate.  If you're up for it, perhaps we could rename
> s/slotCount/numVariables/ and s/varToLocalIndex/blockIndexToLocalIndex/ and
> then tidy up the variable names used at callsites of these functions to
> refer to "blockIndex" instead of "slot"?

I totally agree with you, and I agonized over this ambiguity as well.  Will fix in a followup.

My only issue with numVariables and the like is that what was naturally (though incorrectly)

  uint32_t nslots = blockObj.slotCount()

becomes

  uint32_t nvars = blockObj.numVariables()

which is misleading because they aren't "var" bindings.  I'll think about it some more and default to your suggestions; let me know if another brain wave strikes you :)
(In reply to Jan de Mooij [:jandem] from comment #23)
> Comment on attachment 8374760 [details] [diff] [review]
> @@ +939,5 @@
> >              break;
> >            }
> >  
> >            case JSOP_PUSHBLOCKSCOPE:
> >              localsAliasStack_ = true;
> 
> Nit: this is the only place where we set localsAliasStack_. I think a better
> name for it now is hasLetVars_, hasBlockScope_ or something. Can you rename
> it?

It's unneeded now.  Will fix in a followup, if that's OK; I'm a little wary about what effect this might have on arguments optimization and I'd like to to test separately.
And lots more like it.
Thanks for the sheriffing.  https://tbpl.mozilla.org/?tree=Try&rev=bd7fc0981a22 bumps the xdr version; we'll see if that goes green.
Remember to update the XDR version.
Thanks Benjamin, Jan also pointed in that direction.  Looking green on try with updated Xdr version, so:

https://hg.mozilla.org/integration/mozilla-inbound/rev/c80de8d196af
An update:  the failing crashtest only fails on Mac OS X, and only in release mode.  It is a timeout on one test.  You can run the test like this:

  ./mach crashtest --filter 639737-1

The test is small; http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/crashtests/639737-1.html

Basically it recursively calls a function that is expected to throw an exception, and handles the exception inside a catch-all.  This patch does make the fixed part of the stack one slot larger, for the catch exception identifier, so it does perturb the test a bit, because the frames on the stack will be one slot larger.  But that should make the test take less time, as fewer will fit onto the stack, so it is a bit puzzling.

I'll upload a rebased patch.  I'm having trouble tracking down the reason for this failure, as I don't have a mac machine.
Attachment #8374760 - Attachment is obsolete: true
Blocks: 976047
If I make the default stack size on GNU/Linux match the OS X stack size, I can reproduce this crash.  Finally,
Excellent, I have tracked this one down!

Consider this test:

function sset() { throw new SyntaxError }

function b() { 
  try { sset("u"); } catch(e) { }
  try { [0].map(b); } catch(e) { }
}

b();

This reproduces the error on my machine with the standard js shell.  There is an N² memory usage issue, where N is the stack size, that obviously hits harder on the Mac port.  The issue is that the exception thrown in the first try/catch is bound to "e", which is local 0.  It isn't cleared; see comment 8 and comment 12.  The size of that stack string in the exception is proportional to the stack size, and voila crazy GC overhead and timeouts.

I had tried to just stub out the sset call with "throw 13" but obviously that didn't work as it didn't construct a new stack string.

Possible solutions:

  - Clear block scope stack slots when leaving the block scope via some other opcode.
  - Have the baseline frame marker find the current static scope, and only mark the part of the stack frame that is live.  Dead values should probably be reset to undefined.

Not sure which one to do.  The latter solution is better I think.  Will get some input from other folks.
Attachment #8380594 - Attachment is obsolete: true
Comment on attachment 8381419 [details] [diff] [review]
Store let-bound variables in the fixed part of stack frames

Quick r? on the BaselineFrame and StackFrame mark procedures.  This fixes the issue locally, though this test still takes a little while; the effort is n² in the stack size, as each exception takes time proportional to the total stack height as it builds the stack string.  Alack.
Attachment #8381419 - Flags: review?(jdemooij)
Comment on attachment 8381419 [details] [diff] [review]
Store let-bound variables in the fixed part of stack frames

Review of attachment 8381419 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/BaselineFrame.cpp
@@ +58,5 @@
> +    JSScript *script;
> +    jsbytecode *pc;
> +    NestedScopeObject *staticScope;
> +
> +    frameIterator.baselineScriptAndPc(&script, &pc);

This can be pretty slow, so it'd be good to avoid this call if the script has no block object. r=me with that.
Attachment #8381419 - Flags: review?(jdemooij) → review+
https://hg.mozilla.org/mozilla-central/rev/15869165b0aa
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: