Move name information from JSFunction into JSScript

RESOLVED FIXED in mozilla2.0b9

Status

()

defect
RESOLVED FIXED
9 years ago
9 years ago

People

(Reporter: Waldo, Assigned: Waldo)

Tracking

Trunk
mozilla2.0b9
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(4 attachments, 4 obsolete attachments)

Posted patch Patch (obsolete) — Splinter Review
Strict mode eval code will need binding information to give such code a correct lexical environment (distinct from the calling function's lexical environment), so binding information needs to be decoupled from functions.

Non-trivial points worth noting, going from a skim:

  * Some of the trickiness here is due to newborn-ish functions not yet
    having a corresponding script.  For the most part, most places
    interested in this information need it pre-script, so it's just a matter
    of getting the initial bindings from the right location-specific place.
  * The (old?) comment by JSFunction::findDuplicateFormal was wrong -- we
    risked and continue to risk O(n**2) behavior for all strict mode
    functions.  I tried to reproduce quadratic complexity and actually
    couldn't in practice for reasons pointed out by the comment, so it
    doesn't seem too important to worry about (and certainly not now).
  * Transfer munging is necessary for bindings that prematurely (before
    binding transfer to their final location in JSScript) convert to being in
    a dictionary, without first being frozen.  This wasn't necessary before
    for two reasons: 1) u.i.names didn't "move" except in one case, so going
    into dictionary mode didn't matter most of the time, and 2) in the one
    case where it did "move", in NewCallObject, it had already been frozen in
    JSScript::NewScriptFromCG.  (The subtilties of this last point consumed
    a good part of yesterday and today for me, but I grok them fully now.)
  * JSTreeContext::bindings tracing for GC is slightly hackish, randomly
    asking all a parser's trees to trace themselves.  (NB: But they only
    trace their bindings and not anything else they contain!  I assume
    everything interesting in tree contexts was previously duplicated in
    Parser.  Or something.).  If you think you see a cleaner way to introduce
    it, do tell!

This patch introduces no functionality change, and it mostly does things just as they were done before (but with more pass-in-a-JSFunction awkwardness -- I'll ease some of that in subsequent work), but it does point out a few anachronisms and things that would be nice to do (some of which I will almost certainly end up doing shortly).  These come to mind at a quick scan; there may be others:

  * nvar/nargs/nupvar counts should move into Bindings and out of JSFunction.
    (nargs will have to be copied back into the function, but it makes most
    sense to incrementally maintain it in Bindings; possibly ditto for the
    others, but it's impossible to say just now.)
  * We can probably defer JSFunction creation until after we have a script,
    meaning any interpreted function would always have a non-null script. \o/
  * If we're going to implement sharp slots like this, we should just atomize
    "#array" and not unnecessarily atomize fallibly every time.

However, this patch is enough change (if more or less straightforward) for one review, and those changes, among others, can be easily deferred.  So long as those changes are deferrable, I'm deferring.

This patch builds atop the empty-script optimization code removal in bug 614333.
Attachment #492923 - Flags: review?(brendan)
(I've bumped JSXDR_MAGIC_SCRIPT_CURRENT and JSXDR_BYTECODE_VERSION locally, steps clearly necessary for this patch to not break stuff.)
Comment on attachment 492923 [details] [diff] [review]
Patch

Hmm, no, js_XDRScript is "all" wrong with just these changes -- more work necessary.  :-\
Attachment #492923 - Flags: review?(brendan)
> nvar/nargs/nupvar counts should move into Bindings and out of JSFunction

I tried making the xdr-script changes without doing this move, and it's impossible: you have to be able to XDR a script, *without* any corresponding function which might or might not exist, so you need this data in the script, next to the names against which those numbers bind.  So doing that as a true followup is not possible.

My queue still separates this move from the initial changes, however.  It's still likely easier to review the move-names part, then review the move-binding-count part, even if the first is incomplete without the second.  Right now I think I have both parts finished.  But I'm not posting these parts, or some prerequisite work to eliminate the JSScript::*Offset-numbers-must-be-uint8s requirement, just yet because they depend on bug 614333's patches, and there's still active discussion about the exact nature of what those changes should look like.  So work here must remain in stasis until that completes.
Subsequent patches here cause JSScript + possible trailers to overflow the <256 limit that the *Offset values must not overflow (if only in 64-bit, threadsafe, debug builds, alas; sigh, complexity).  So make them relative to the end so that we can add more into JSScript.
Attachment #496036 - Flags: review?(dvander)
This is basically the patch from comment 0 with some unbitrotting.  IT DOES NOT WORK ON ITS OWN.  Everything but XDR works, and XDR doesn't work because I haven't moved fun->u.i.nvars/nupvars into the Bindings class.  That *must* come along with the script, because JS_XDRScript needs it in order to serialize/deserialize bindings information, and that API has no way to permit passing along the relevant function.  Shell tests, and strangely JSAPI tests, all work with just this patch, but the browser isn't going to start because XDR is broken with this.

So why post this?  Well, moving nvars/nupvars is another change to a lot of places, and even if it's a necessary part of a full fix here, it's probably clearer to do it as an immediate followup to ease review of the whole thing -- keep the two different data-location-moves in two patches to keep the conceptual change in each as small as possible.  I'll fold both parts together into one when I push a final, reviewed patch.
Attachment #496040 - Flags: review?(brendan)
This is lots of mechanical change, but the js_XDRFunction/js_XDRScript changes are the real interesting part here -- the part that makes part 1 not crash and die in flames when you start a browser.
Attachment #496041 - Flags: review?(brendan)
Blocks: 514568
Comment on attachment 496036 [details] [diff] [review]
Make JSScript::*Offset encode not-present with 0xFF, and make them offsets from the end of JSScript, not from the start

Good patch, but this change is just a style regression:

>+++ b/js/src/jsscript.h
>@@ -213,19 +213,25 @@ struct JSScript {
>     uint16          version;    /* JS version under which script was compiled */
>     uint16          nfixed;     /* number of slots besides stack operands in
>                                    slot array */
>-    uint8           objectsOffset;  /* offset to the array of nested function,
>-                                       block, scope, xml and one-time regexps
>-                                       objects or 0 if none */
>-    uint8           upvarsOffset;   /* offset of the array of display ("up")
>-                                       closure vars or 0 if none */
>-    uint8           regexpsOffset;  /* offset to the array of to-be-cloned
>-                                       regexps or 0 if none. */
>-    uint8           trynotesOffset; /* offset to the array of try notes or
>-                                       0 if none */
>-    uint8           globalsOffset;  /* offset to the array of global slots or
>-                                       0 if none */
>-    uint8           constOffset;    /* offset to the array of constants or
>-                                       0 if none */
>+
>+    /*
>+     * Offsets to various array structures from the end of this script, or
>+     * JSScript::INVALID_OFFSET if the array has length 0.
>+     */
>+
>+    /* nested function, block, scope, xml and one-time regexps objects */
>+    uint8           objectsOffset;
>+    /* closure upvars */
>+    uint8           upvarsOffset;
>+    /* to-be-cloned regexps */
>+    uint8           regexpsOffset;
>+    /* try notes */
>+    uint8           trynotesOffset;
>+    /* global slots */
>+    uint8           globalsOffset;
>+    /* constants */
>+    uint8           constOffset;

The comments on the right for members in the old lines are overlong and could be shortened by factoring common phrase and hoisting it, but the new lines deviate in several ways (no blank line before whole-or-greater comment line unless preceded by open brace; sentence fragment without full stop but in context where comments do contain sentences in prevailing style).

>@@ -7884,7 +7884,7 @@ TraceRecorder::updateAtoms()
> {
>     JSScript *script = cx->fp()->script();
>     atoms = FrameAtomBase(cx, cx->fp());
>-    consts = cx->fp()->hasImacropc() || script->constOffset == 0
>+    consts = cx->fp()->hasImacropc() || !JSScript::isValidOffset(script->constOffset)
>            ? 0 
>            : script->consts()->vector;
>     strictModeCode_ins = w.name(w.immi(script->strictModeCode), "strict");

Preexisting nits: parenthesize non-member/call/primary left operand of ? and put ? and : under left paren for condition, not under =.

r=me if you and dvander want it to relieve dvander.

/be
Attachment #496036 - Flags: review+
Comment on attachment 496036 [details] [diff] [review]
Make JSScript::*Offset encode not-present with 0xFF, and make them offsets from the end of JSScript, not from the start

Fine by me -- fixed up locally, will land this bit next time I have a series to push.
Attachment #496036 - Flags: review?(dvander)
Attachment #492923 - Attachment is obsolete: true
I landed the offset-changes bit:

http://hg.mozilla.org/tracemonkey/rev/2b92dda5e819

The guts of the work still remain, of course, so this push doesn't resolve the bug as a whole.
Attachment #496040 - Attachment is obsolete: true
Attachment #497679 - Flags: review?(brendan)
Attachment #496040 - Flags: review?(brendan)
Attachment #496041 - Attachment is obsolete: true
Attachment #497680 - Flags: review?(brendan)
Attachment #496041 - Flags: review?(brendan)
Comment on attachment 497679 [details] [diff] [review]
Part 1: Move fun->u.i.names into fun->script()->bindings, unbitrotted

Nice patch. I have a fix for addLocal that will have to rebase, njn is waiting on it with a big footprint win, but I'll do it after you since you've been waiting a while.

>+    for (Shape::Range r(fun->script()->bindings.lastUpvar(fun)); i-- != 0; r.popFront()) {

I counted 14 fun->script()->bindings uses -- how about a shorthand: fun->bindings()?

>@@ -2864,8 +2864,10 @@ Decompile(SprintStack *ss, jsbytecode *p
>                     jp->fun = jp->script->getFunction(0);
>                 }
> 
>-                if (!jp->localNames)
>-                    jp->localNames = jp->fun->getLocalNameArray(cx, &jp->pool);
>+                if (!jp->localNames) {
>+                    jp->localNames =
>+                        jp->fun->script()->bindings.getLocalNameArray(cx, fun, &jp->pool);
>+                }
> 
>                 uintN index = GET_UINT16(pc);
>                 if (index < jp->fun->u.i.nupvars) {

Would help this not wrap, e.g.

>@@ -2715,6 +2722,8 @@ LeaveFunction(JSParseNode *fn, JSTreeCon
>         }
>     }
> 
>+    funbox->bindings.transfer(funtc->parser->context, &funtc->bindings);
>+
>     return true;

Ok per house style to lose the blank line, since there's no control transfer or significant complexity in having the .transfer call followed by return true.

>+Bindings::add(JSContext *cx, JSFunction *fun, JSAtom *name, BindingKind kind)
>+{
>+    JS_ASSERT(lastBinding);
>+
>+    JS_ASSERT_IF(kind == ARGUMENT, fun != NULL);
>+
>+    /*
>+     * We don't track bindings in global code yet, but we will soon: see
>+     * bug 514568.
>+     */
>+    JS_ASSERT(fun);
>+    JS_ASSERT(fun->isInterpreted());

We don't need to assert fun but it's ok with the comment (or remove it, the comment says enough).

>+/*
>+ * We use the simplest possible algorithm here, risking O(n^2) pain.  Believe
>+ * it or not, because the upper bound on arguments is 2**16 and the loop body
>+ * is small, this quadratic behavior isn't really noticeable in practice!

Please restore the comment I wrote from jsfun.cpp, or merge them. What you write here isn't true. The upper bound could be very painful, so the point is not that it's a small-enough bound that we can tolerate O(n^2) growth. Rather, duplicate formals are extremely rare, gone in ES5 strict and Harmony, so we keep code simple accordingly.

>+/*
>+ * Formal parameters, local variables, and upvars are stored in a shape tree
>+ * path encapsulated within this class.  This class represents bindings for
>+ * both function and top-level scripts (the latter is needed to track names in
>+ * strict mode eval code, to give such code its own lexical environment).

Some comments you moved use one space after period, others you've originated use two spaces. Unless touch typing on a typewriter is hardwired in your fingers, please use one space throughout.

I like the makeImmutable renaming of freezeLocalNames for Binding. In this light are you willing to rename js::Shape::{setFrozen,frozen} to ...{setImmutable,immutable} and FROZEN to IMMUTABLE? Would be great to avoid the silly confusion opp'ty w.r.t. ES5 that I added.

r=me with some attention to these small points. Thanks,

/be
Attachment #497679 - Flags: review?(brendan) → review+
Cc'ing folks who may have to rebase through the code motion here in patch 1.

/be
Comment on attachment 497680 [details] [diff] [review]
Part 2: Move fun->u.i.nvars/nupvars into fun->script()->bindings, unbitrotted

JSFunction now has dead uint16 padding in two places from this change. That's no good -- pack Bindings and its interior aligned allocation in JSFunction, or bust.

Also, don't use JS_XDRUint16 (especially with "padding"). It wastes 16 bits on zeroes. Keep packing uint16s into uint32 words that are JS_XDRUint32'd.

/be
Attachment #497680 - Flags: review?(brendan) → review-
(In reply to comment #14)
> Comment on attachment 497680 [details] [diff] [review]
> Part 2: Move fun->u.i.nvars/nupvars into fun->script()->bindings, unbitrotted
> 
> JSFunction now has dead uint16 padding in two places from this change. That's
> no good -- pack Bindings and its interior aligned allocation in JSFunction, or
> bust.

Or encode and decode from packed members of JSFunction at low enough cost. But don't leave alignment padding gaps in a highly populous GC-thing.

/be
(In reply to comment #14)
> JSFunction now has dead uint16 padding in two places from this change. That's
> no good -- pack Bindings and its interior aligned allocation in JSFunction, or
> bust.

You're missing the benefit of the subsequent patches implementing bug 514568.  I've just uploaded what I currently have to that bug; those patches (#2 in particular) should demonstrate that I need binding information to live in scripts to create Call objects for function-less strict mode eval frames.  Do you mean I should repeat myself by preserving Bindings information in both function and script?  Just for starters, isn't that a problem for the backpointer scheme fun->u.i.names currently uses and bindings.lastBinding would use?

Alternately, are you worried about JSFunction's Scripted struct being smaller than its anonymous "n" struct?  Scripted functions simply require less data to encode than native functions.  I'm not sure what I can do to reduce the size of the "n" struct; its members, at this juncture, all seem necessary to me.


> Also, don't use JS_XDRUint16 (especially with "padding"). It wastes 16 bits on
> zeroes. Keep packing uint16s into uint32 words that are JS_XDRUint32'd.

If we choose to put a premium on space down to the two-byte amount, why is JS_XDRUint16 "wasteful" in this manner?  Alternately, why was it provided if it must waste space?  (This isn't relevant to the change itself.  I'm just curious.)

The Zen of Python aphorisms #3 (simple > complex), #5 (flat > nested), and #7 (readability counts) applied here, seemed to me, at the time I made these changes.  (With the exception of "padding", which I'd wrongly thought necessary to address XDR's not using any of its own.)  But if this is what it takes to move the patch forward, I can do it.
(In reply to comment #12)
> >+    for (Shape::Range r(fun->script()->bindings.lastUpvar(fun)); i-- != 0; r.popFront()) {
> 
> I counted 14 fun->script()->bindings uses -- how about a shorthand:
> fun->bindings()?

I originally had such a shorthand.  But since newborn functions don't have a script yet, this shorthand was an easy way for me to footgun myself by careless use of it.  I like the explicit fun->script() making clear that you must have a non-newborn function before you can access bindings.

In the longer run, I think fun->bindings() will make a lot of sense when we never create interpreted functions before their corresponding JSScripts.  With this bindings refactoring we're not horribly far off.  But there are still places in the parser/emitter, and in things like the function box abstraction, where we need null-script interpreted functions, so I prefer fun->script()->bindings until we can make this extra change.


> Please restore the comment I wrote from jsfun.cpp, or merge them. What you
> write here isn't true. The upper bound could be very painful, so the point is
> not that it's a small-enough bound that we can tolerate O(n^2) growth. Rather,
> duplicate formals are extremely rare, gone in ES5 strict and Harmony, so we
> keep code simple accordingly.

I'll merge the two.  And while on a second look what I wrote is looking dodgy (I should retry my asymptotic-behavior testing to double-check that I actually measured this correctly -- I did test, and I wasn't seeing any at the time, but maybe I mis-tested), what was there before is wrong.  The algorithm is O(n**2) regardless whether duplicate formals are, or are not, present -- all that's required is ECMA strict mode or javascript.options.strict.


> >+/*
> >+ * Formal parameters, local variables, and upvars are stored in a shape tree
> >+ * path encapsulated within this class.  This class represents bindings for
> >+ * both function and top-level scripts (the latter is needed to track names in
> >+ * strict mode eval code, to give such code its own lexical environment).
> 
> Some comments you moved use one space after period, others you've originated
> use two spaces. Unless touch typing on a typewriter is hardwired in your
> fingers, please use one space throughout.

Two-spacing is very much reflexive on my part.  I'll fix up the comments in the patch, although it's the sort of thing where I could easily miss instances at a skim, or even a closer read.


> I like the makeImmutable renaming of freezeLocalNames for Binding. In this
> light are you willing to rename js::Shape::{setFrozen,frozen} to
> ...{setImmutable,immutable} and FROZEN to IMMUTABLE? Would be great to avoid
> the silly confusion opp'ty w.r.t. ES5 that I added.

This seems sensible.  I'll do it as a followup patch, separate from this one and the next (which must be folded together to preserve a functional build), to minimize an already-large patch's size.
Quick comments:

0. Why is the Scripted arm of the JSFunction union given that tag? I could not find any uses.

1. I know where you're going, no need to review. I'm looking for packed JSFunction. If there's no net size increase then never mind. For some reason I thought Scripted was bigger than the anonymous Native.

2. JS_XDRUint16 is from old days, we could change it but that's a different bug. This bug just needs to avoid it.

3. Who wants to delay creating JSFunction objects until their scripts? That will simply require intermediaries. What's the win?

4. fun->bindings() used before script is set will crash (footgun). The current patch's overlong expansion of its body will crash too. What's the difference?

/be
(In reply to comment #18)
> Quick comments:
> 
> 0. Why is the Scripted arm of the JSFunction union given that tag? I could not
> find any uses.

$ grep -w Scripted methodjit/*.cpp 
methodjit/MonoIC.cpp:                           offsetof(JSFunction::U::Scripted, script));

/be
(In reply to comment #18)
> 3. Who wants to delay creating JSFunction objects until their scripts? That
> will simply require intermediaries. What's the win?

It's the mental complexity reduction of not having to worry about !fun->script() whenever fun->script() is used.  With the bindings move here, that reduction is easier than you think -- there isn't all that much that relies on an actual function existing in the parser or emitter these days.  But it was a mere offhand comment about possible future directions -- let's cross that bridge when we get to it, and get this done now so I can knock off more blockers.

> 4. fun->bindings() used before script is set will crash (footgun). The current
> patch's overlong expansion of its body will crash too. What's the difference?

Explicit reliance on having a script is better than implicit reliance on a function being fully constructed.  This is particularly so because, before this, bindings information was available from the function during initialization of the function.  I had a fun->bindings() method at one point, but the slightly longer version was clearer about what was required, and less error-prone to me when making changes here, so I removed the shorthand.
Attachment #497680 - Attachment is obsolete: true
Attachment #499975 - Flags: review?(brendan)
Re: comment 20, ok, I see -- so if you can defer function creation at no net cost until the function's script is created, then fun->bindings() as shorthand should make a come-back, because there's no valid null u.i.script state for interpreted functions. Agreed?

The C hangover and GC-thing packing make JSFunction hold a union, but distinct scripted and native C++ subtypes of a more abstract Function base seem better. I want this from Santa next year!

/be
Attachment #499975 - Flags: review?(brendan) → review+
(In reply to comment #22)
> Re: comment 20, ok, I see -- so if you can defer function creation at no net
> cost until the function's script is created, then fun->bindings() as shorthand
> should make a come-back, because there's no valid null u.i.script state for
> interpreted functions. Agreed?

Yes, agreed.

After much rebasing today (easily two-thirds of all my time today :-( ), I think this is good to land tomorrow morning, along with a handful of other things accumulated recently.  'bout time, too, I say.  :-D
http://hg.mozilla.org/tracemonkey/rev/0d9a5752b1cf
Whiteboard: fixed-in-tracemonkey
Target Milestone: --- → mozilla2.0b9
I pushed http://hg.mozilla.org/tracemonkey/rev/e0166849b71d to fix some opt-only warnings the previous change introduced.
(In reply to comment #17)
> The algorithm is O(n**2) regardless whether duplicate formals are, or are
> not, present -- all that's required is ECMA strict mode or
> javascript.options.strict.

I filed bug 622670 on this.
You need to log in before you can comment on or make changes to this bug.