Closed Bug 535912 Opened 15 years ago Closed 14 years ago

remove JSStackFrame::blockChain

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: luke, Assigned: billm)

References

Details

Attachments

(1 file, 5 obsolete files)

In pursuance of a slimmer JSStackFrame, we should remove blockChain.  (This would also slim tracer FrameInfos.)  blockChain is currently used to keep track of which let-blocks surround the current PC and is updated by JSOP_ENTERBLOCK/JSOP_LEAVEBLOCK upon entering and leaving let-blocks.  While blockChain is needed to reify the scope chain, this only occurs in a few situations:
 - JSOP_WITH
 - JSOP_EVAL
 - JSOP_DEFFUN of a non-null closure
 - when the debugger asks for the scope chain

For the first three cases, we can bake the correct block into the bytecode (analogous to the JSOP_LINENO trick) and for the last case, since performance is not an issue, we can simply walk backwards looking for the first un-matched JSOP_ENTERBLOCK.
Well, you really want to walk forwards, otherwise you don't know whether you're looking at a JSOp or an immediate.
(In reply to comment #1)
Oh yeah, I forgot about that :P
Blocks: 536275
Just to record, we talked today about alternatives to the JSOP_LINENO trick.  One is to put the blockChain into the opcode immediate (changing the opcode size, thereby requiring some opcodes to be un-fused, which is what JSOP_LINENO avoids).

Another idea dmandelin had is to set aside some slots in the script->nfixed that are used as a lazily-computed cache of the scopeChain reified from a particular blockChain.  After looking at how js_ScopeChain works, this could have a huge performance benefit!  Consider the following loop:

while (...) {
  let ... {
    foo();  // heavyweight call
  }
  let ... {
    foo();  // heavyweight call
  }
}

On each iteration of the loop, we clone two block objects (sharing the common call object).  The fact that we keep cloning the same two block objects is (AFAICS) not recognized and avoided.

So I guess the scheme would be that the fixed slot started out as a JSVAL_INT, which was really a cast pointer to what fp->blockChain would be and, once the block chain is cloned, the slot becomes a JSVAL_OBJECT which is a pointer to that object.
(In reply to comment #3)
> while (...) {
>   let ... {
>     foo();  // heavyweight call
>   }
>   let ... {
>     foo();  // heavyweight call
>   }
> }
> 
> On each iteration of the loop, we clone two block objects (sharing the common
> call object).  The fact that we keep cloning the same two block objects is
> (AFAICS) not recognized and avoided.

We can't avoid cloning or reuse clones if the inner uses are in escaping functions, of course -- the storage has to be separate per iteration and block.

If nothing escapes, then you're right that we could reuse, since the blocks are proxies for the local stack slots, which remain active and singleton per compile time block slot.

Of course, if nothing escapes, then why is foo heavyweight? Indeed JS has static scope, so you'd have to write

  while (...) {
    let ... {
      function foo() ...
      foo();  // heavyweight call
    }
    let ... {
      function bar() ...
      bar();  // heavyweight call
    }
  }

and foo and bar would have to be heavyweight due to with, eval, an upvar that could not be optimized, or some such. This is not AFAICT a common pattern.

/be
(In reply to comment #4)
Oops, yes, got a bit too excited there.
Blocks: 557378
We had some discussions about block objects in the context of fast closure access (bug 517164).  The problem is that fast closure access wants to emit code to do a static number of hops up the scope chain (possibly checking for scopes containing eval/with) and the lazy cloning of block objects makes the number of hops required dynamic.

A potential solution (that would also allow removing JSStackFrame::blockChain) would be:
  1. Create new opcodes JSOP_ENTER/LEAVEBLOCK_DBG with the same bytecode layout as their non-DBG counterparts.
  2. Soup up the compiler to emit JSOP_ENTER/LEAVEBLOCK only when there is a with/eval/non-null-closure in the let's scope and emit JSOP_ENTER/LEAVEBLOCK_DBG otherwise. 
  3. Rename js_GetScopeChain to js_CloneBlockObjects and call js_CloneBlockObjects exclusively from JSOP_ENTERBLOCK.
  4. use fp->scopeChain directly from with/eval/defun/lambda

Thus, the number of objects on the scopechain for a free variable access at given code location will always be the same.

Now, this ignores all the uses of js_GetScopeChain that aren't implementing with/eval/deffun/lambda.  Ignoring the debugger, it seems that most of these shouldn't care whether or not cloned block objects are on the scope chain.  Case and point, since js_GetScopeChain doesn't create a call object, JS_GetScopeChain already returns not-technically-the-tip for flat/null closures.  So, (if I haven't missed an important case) it may be safe for JS_GetScopeChain to just return fp->scopeChain.

As for debugging, if the scope chain is used for JS_EvalInFrame, it seems that this might make let variables disappear.  So that would be bad, and another reason JS_EvalInFrame should die.

Another concern is that eager cloning might be expensive in cases like:

function foo() {
  let x = 3;  /* scope contains heavyweight closure, clone here */
  if (rarelyTaken)
     return function() { /* heavyweight */ }
}

where lazy cloning would have avoided the clone.  We can measure how often this happens in practice.  If it is a problem, the number of JSOP_ENTERBLOCKs emitted could be further decreased by looking into closure bodies to determine if the block is actually needed.
JS_EvalInFrame (oh how I wish the jsdbgapi function were called that; jband named it JS_EvaluateUCInStackFrame instead, which oddly elides Script and seems to want to be pronounced "Uck" in the middle :-P) is never gonna die. Debuggers want to eval an expression in the middle of a run.

We may have to deoptimize heavily to accomodate this use-case, but it's an important one, from all I hear from Firebug users.

Speaking of names, why would you make the new ops end in _DBG? If they are emitted only when there's no compiler-visible reason to reify the Block objects onto the scope chain, then indeed the debugger could be the only cause. But that is not a good reason to use a _DBG suffix. See existing ops with _DBG near the ends of their names.

/be
Assignee: general → wmccloskey
(In reply to comment #6)
> We had some discussions about block objects in the context of fast closure
> access (bug 517164).  The problem is that fast closure access wants to emit
> code to do a static number of hops up the scope chain (possibly checking for
> scopes containing eval/with) and the lazy cloning of block objects makes the
> number of hops required dynamic.

Perhaps I misunderstand what operations "fast closure access" is talking about improving, but this doesn't seem right to me.

If fast closure access means referring to closed-over variables via a constant number of hops up the scope chain, then lazy block cloning doesn't affect that, because we always unlazy the block chain before creating the closure. A closure's scope chain always refers to a fully reified chain of block/call/with/rutabaga objects.
But as to the original idea in this bug --- sounds great to me! The compiler block chain is a static property of the pc, so ideally it would have no runtime cost at all.
(In reply to comment #8)
> If fast closure access means referring to closed-over variables via a constant
> number of hops up the scope chain

It does

> then lazy block cloning doesn't affect that,
> because we always unlazy the block chain before creating the closure. 

Yes, when a closure is first called, fp->scopeChain points to a Call object which is the head of a fully reified scope chain.  But as soon as execution enters a let block, we start being lazy again.  E.g.

function f(cond) {
  var upvar = 3;
  function g() {
    let (y = 9) {
                    // fp->scopeChain still points to Call obj
      if (cond)
        eval("");   // fp->scopeChain points to cloned Block
      print(upvar); // fp->scopeChain points to Block iff cond
    }
  };
  return g;
}

But comment 9 is right that this isn't the point of this bug; comment 7 has to do with a future overhaul of scope chains and I was just trying to remember an idea.
(In reply to comment #10)
> Yes, when a closure is first called, fp->scopeChain points to a Call object
> which is the head of a fully reified scope chain.  But as soon as execution
> enters a let block, we start being lazy again.

Okay; I follow. If we had a pointer to the closure that had been called, we could start the search from there. We need such a pointer to find atoms and friends anyway. Well, it's a pointer to a script, but it could just as well be a pointer to the function.
Attached patch Block chain elimination patch (obsolete) — Splinter Review
This patch does what Luke proposed in the initial comment. It removes the blockChain field from the stack frame and instead synthesizes it when necessary by referring to the bytecode.

The basic idea is to find the block chain by scanning forward from the start of the script, tracking ENTERBLOCK and LEAVEBLOCK instructions. However, since the scan is linear, there can be problems. Example (using a slightly augmented bytecode):
   ENTERBLOCK { v: 0 }
   if (...) {
     LEAVEBLOCK;
     goto x;
   }
   eval(...);
   LEAVEBLOCK
x: ...

If we try to get the block chain at eval, we'll see an ENTERBLOCK and then a LEAVEBLOCK. Since these seem to match, we return NULL, which is wrong.

Consequently, the patch adds a new instruction, BLOCKCHAIN, that gives the current block chain. This gets emitted after any early exit like the one above. The code would then become:
   ENTERBLOCK { v: 0 }
   if (x) {
     LEAVEBLOCK;
     goto x;
   }
   BLOCKCHAIN { v: 0 }
   eval(...);
   LEAVEBLOCK
x: ...

The BLOCKCHAIN instruction encodes the block object using the same indexing system as ENTERBLOCK does. There is also a NULLBLOCKCHAIN instruction to represent NULL.

In cases where you know you will need the scope chain (like a DEFFUN instruction), the patch emits a BLOCKCHAIN instruction directly afterwards. GetBlockChain has a fast-path that checks for this and returns without doing the bytecode scan.

There were also two places where the tracer needs to know whether fp->blockChain is NULL or not. If so, it chooses not to do tracing of recursive functions. Rather than compute the block chain here, I added a bit to JSScript to track whether any let blocks appear. I replaced the tracer's check with a more conservative one that will not trace recursion in the presence of let blocks.
Attachment #467131 - Flags: review?(cdleary)
(In reply to comment #12)
> The basic idea is to find the block chain by scanning forward from the start of
> the script, tracking ENTERBLOCK and LEAVEBLOCK instructions.

With a big script such search is expensive. What about using separated map similar in spirit to the try/catch/finally map that records block start and end pc? Alternatively code can use, say, some variable slot to hold the block chain.
(In reply to comment #13)
> (In reply to comment #12)
> > The basic idea is to find the block chain by scanning forward from the start of
> > the script, tracking ENTERBLOCK and LEAVEBLOCK instructions.
> 
> With a big script such search is expensive.

Well, to clarify, there is a fast path and a slow path. The fast path fetches the block chain out of the next bytecode. The slow path does the bytecode scan. Currently, this only happens from a debugger or from a "with" or "eval" statement. By emitting extra bytecode, I could put "with" and "eval" on the fast path as well, but it doesn't seem worth it.

In short, although the scan is potentially expensive, it should happen very rarely.
(In reply to comment #12)
> Created attachment 467131 [details] [diff] [review]
>    ENTERBLOCK { v: 0 }
>    if (...) {
>      LEAVEBLOCK;
>      goto x;
>    }
>    eval(...);
>    LEAVEBLOCK
> x: ...
> 
> If we try to get the block chain at eval, we'll see an ENTERBLOCK and then a
> LEAVEBLOCK. Since these seem to match, we return NULL, which is wrong.

See jsopcode.cpp, look for SRC_HIDDEN tests. We already scan bytecode for several reasons and have to cope with early exits. The ancient srcnote machinery was long ago used to "hide" such unbalancing ops from scans.

/be
Attached patch Revised patch using SRC_HIDDEN (obsolete) — Splinter Review
Attachment #467131 - Attachment is obsolete: true
Attachment #467131 - Flags: review?(cdleary)
(In reply to comment #15)
> See jsopcode.cpp, look for SRC_HIDDEN tests. We already scan bytecode for
> several reasons and have to cope with early exits. The ancient srcnote
> machinery was long ago used to "hide" such unbalancing ops from scans.

Thanks, that's definitely simpler. I've posted a revised patch.
With JSStackFrame::blockChain removed, perhaps you can move the long block chain comment from the JSStackFrame somewhere else.  Perhaps js_GetScopeChain?
That's easy enough :-). I didn't obsolete the previous patch in case anyone objects.
Attached patch rebased patch (obsolete) — Splinter Review
I've updated the patch to be coherent with the fp() changes.
Attachment #467217 - Attachment is obsolete: true
Attachment #468088 - Attachment is obsolete: true
Attachment #468559 - Flags: review?(cdleary)
Attachment #467217 - Flags: review?(cdleary)
Comment on attachment 468559 [details] [diff] [review]
rebased patch

It looks like usesLet pushes the bitfield to 9 bits with methodjit enabled -- we should make sure that doesn't cause performance degradation because the size of JSScript has been known to cause weird perf regressions.

If we're going to keep the blockChain member on JSStackFrame around we should at least make a comment saying it's only there for checking against the new scheme. I think we should eliminate it once we're confident the new scheme works because it clutters things up with preprocessor directives, but it's up to you.

A comment on the fast variants that describes the circumstances in which you can call them (like the one you gave above in the bug) seems prudent. Also a comment in the bytecode table documenting the circumstances in which we emit NULLBLOCKCHAIN and BLOCKCHAIN would be highly appreciated!

There are a few places where you break up declaration/definition of local vars, I think to conform to surrounding style, but nowadays we just smush those together wherever we can.

In future diffs, we use unified=8 showfunc=1 git=1 (see https://developer.mozilla.org/en/Mercurial_FAQ#section_21 ).
Attachment #468559 - Flags: review?(cdleary) → review+
Attached patch rebased blockchain patch (obsolete) — Splinter Review
Thanks, Chris. I added some comments and rebased the patch again. The extra bit in JSScript can now be dropped because we don't trace recursion anymore. I also was able to remove the lexicalBlock field in TraceRecorder, since it doesn't seem to do any good.

I also made the style changes. However, I left a few var decls at the top of functions in jsemit.cpp, since it seems awkward to lower them.
Attachment #468559 - Attachment is obsolete: true
Just rebased this on top of TraceMonkey.
Attachment #471316 - Attachment is obsolete: true
Will this bug land relatively soon?  We're still writing blockChain when doing a JM call and I'm wondering whether to lazify it (would be easy).
http://hg.mozilla.org/mozilla-central/rev/427282865362
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
This patch did not update jsxdrapi.h's JSXDR_BYTECODE_VERSION -- could that be the cause of bug 601355 ?

/be
(In reply to comment #26)
> This patch did not update jsxdrapi.h's JSXDR_BYTECODE_VERSION -- could that be
> the cause of bug 601355 ?

So (cdleary take note) whenever you see jsopcode.tbl change, look for a change to JSXDR_BYTECODE_VERSION.

/be
Depends on: 601835
(In reply to comment #27)
> So (cdleary take note) whenever you see jsopcode.tbl change, look for a change
> to JSXDR_BYTECODE_VERSION.

Will do! I had to apologize to my friend for crashing his 64b nightlies. :-/
Depends on: 610026
See bug 927782 for a redux of this issue.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: