Closed Bug 819509 Opened 12 years ago Closed 12 years ago

reparse strict functions

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla20

People

(Reporter: Benjamin, Assigned: Benjamin)

References

Details

Attachments

(2 files, 11 obsolete files)

83.94 KB, patch
n.nethercote
: review+
Details | Diff | Splinter Review
64.50 KB, patch
n.nethercote
: review+
Details | Diff | Splinter Review
I've been feeling increasingly guilty about bug 769072. It works, but it has to touch a lot of code and is brittle. Also, it wouldn't work if we ever need to change the parse tree based on strict mode. Therefore, I think we should simply reparse a function on discovering it's strict.
The seeking and restoring doesn't understand lookahead. It turns out this can work.. with a bit of handwaving in later patches.
Attachment #689853 - Flags: review?(n.nethercote)
Attachment #689854 - Flags: review?(n.nethercote)
When we reparse, we set things on the function multiple times.
Attachment #689855 - Flags: review?(n.nethercote)
Attached patch reparse strict functions (obsolete) — Splinter Review
The meat of the patch. Many cleanups left for later.
Attachment #689857 - Flags: review?(n.nethercote)
Attachment #689858 - Flags: review?(n.nethercote)
Attached patch make strictness a boolean (obsolete) — Splinter Review
Attachment #689859 - Flags: review?(n.nethercote)
Attachment #689860 - Flags: review?(n.nethercote)
Attachment #689860 - Attachment description: delete some crub we don't need anymore; simplify → delete some crud we don't need anymore; simplify
Cool.  Thanks for doing this.  It's Saturday morning here, I'll look at these patches on my Monday.
Comment on attachment 689853 [details] [diff] [review]
minimal lexer support for saving and restoring state; save it when we see "function"

Review of attachment 689853 [details] [diff] [review]:
-----------------------------------------------------------------

Are you planning to fold these patches together before landing?

::: js/src/frontend/TokenStream.cpp
@@ +389,5 @@
>  
> +void
> +TokenStream::tell(Position &pos)
> +{
> +    JS_ASSERT(lookahead == 0);

Judging from the bugzilla comment you'll lift this restriction later?  An XXX comment would help.

@@ +406,5 @@
> +    lineno = pos.lineno;
> +    linebase = pos.linebase;
> +    prevLinebase = pos.prevLinebase;
> +    lookahead = 0;
> +    cursor = 0;

Hmm, why is cursor always 0 here?  Don't you have to save it in Position as well?

::: js/src/frontend/TokenStream.h
@@ +708,5 @@
>      }
>  
> +    struct Position {
> +        friend class TokenStream;
> +      private:

Make it a class.

@@ +915,5 @@
>      bool                moarXML;        /* see JSOPTION_MOAR_XML */
>      JSContext           *const cx;
>      JSPrincipals        *const originPrincipals;
>      StrictModeGetter    *strictModeGetter; /* used to test for strict mode */
> +    Position            lastFunctionKeyword;

Add a comment to this field, something like "needed to reparse functions with defaults that are found to be in strict mode".
Comment on attachment 689855 [details] [diff] [review]
make some JSFunction setters idempotent

Review of attachment 689855 [details] [diff] [review]:
-----------------------------------------------------------------

A comment above |flags| that says some of the fields can be set twice (due to reparsing of functions with defaults and strict mode) would be nice.

::: js/src/jsfun.h
@@ -118,1 @@
>          this->nargs = nargs;

For this one you could change it to |JS_ASSERT(this->nargs == 0 || this->nargs == nargs)|.
(In reply to Nicholas Nethercote [:njn] from comment #9)
> Comment on attachment 689853 [details] [diff] [review]
> minimal lexer support for saving and restoring state; save it when we see
> "function"
> 
> Review of attachment 689853 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Are you planning to fold these patches together before landing?
> 
> ::: js/src/frontend/TokenStream.cpp
> @@ +389,5 @@
> >  
> > +void
> > +TokenStream::tell(Position &pos)
> > +{
> > +    JS_ASSERT(lookahead == 0);
> 
> Judging from the bugzilla comment you'll lift this restriction later?  An
> XXX comment would help.

What I meant was calls to |tell| will be strategically placed not so as not to trigger this.

> 
> @@ +406,5 @@
> > +    lineno = pos.lineno;
> > +    linebase = pos.linebase;
> > +    prevLinebase = pos.prevLinebase;
> > +    lookahead = 0;
> > +    cursor = 0;
> 
> Hmm, why is cursor always 0 here?  Don't you have to save it in Position as
> well?

cursor indexes into the lookahead buffer. When we seek, there is no lookahead.
> What I meant was calls to |tell| will be strategically placed not so as not
> to trigger this.
> 
> cursor indexes into the lookahead buffer. When we seek, there is no
> lookahead.

Two good candidates for comments, then!
> > Hmm, why is cursor always 0 here?  Don't you have to save it in Position as
> > well?
> 
> cursor indexes into the lookahead buffer.

Hang on, are you sure?  That doesn't match my understanding, which is that cursor points into the circular |tokens| buffer.  Look at TokenStream::newToken() and TokenStream::currentToken(), for example.

Or do you mean that it's ok to effectively reset the circular buffer, because we won't need to look ahead?  If so, I wonder if it's worth NULLing it out (or similar) just to make things clear and to catch any potential errors.

---

Do you have any more patches coming, or are these seven it?  It's not clear from the attachment comments.

---

IIRC you had to extend the max lookahead by 1 for your previous strict mode work;  will that be able to be reduced again?  (Maybe you've done that in one of the patches I haven't read yet.)
(In reply to Nicholas Nethercote [:njn] from comment #9)
> Are you planning to fold these patches together before landing?

I don't know. Do you have an opinion?

(In reply to Nicholas Nethercote [:njn] from comment #13)
> > > Hmm, why is cursor always 0 here?  Don't you have to save it in Position as
> > > well?
> > 
> > cursor indexes into the lookahead buffer.
> 
> Hang on, are you sure?  That doesn't match my understanding, which is that
> cursor points into the circular |tokens| buffer.  Look at
> TokenStream::newToken() and TokenStream::currentToken(), for example.
> 
> Or do you mean that it's ok to effectively reset the circular buffer,
> because we won't need to look ahead?  If so, I wonder if it's worth NULLing
> it out (or similar) just to make things clear and to catch any potential
> errors.

This. However, we need to give the parser correct line data after reseting in case it calls currentToken().pos.begin. I change the way this is done in a revised first patch.

> 
> ---
> 
> Do you have any more patches coming, or are these seven it?  It's not clear
> from the attachment comments.

This is all.

> 
> ---
> 
> IIRC you had to extend the max lookahead by 1 for your previous strict mode
> work;  will that be able to be reduced again?  (Maybe you've done that in
> one of the patches I haven't read yet.)

Good point. We can reduce it again now.
Address of review comments. Replace "cursor" fiddling.
Attachment #689853 - Attachment is obsolete: true
Attachment #689853 - Flags: review?(n.nethercote)
Attachment #690409 - Flags: review?(n.nethercote)
Forgot to qref...
Attachment #690409 - Attachment is obsolete: true
Attachment #690409 - Flags: review?(n.nethercote)
Attachment #690413 - Flags: review?(n.nethercote)
Comments.
Attachment #689855 - Attachment is obsolete: true
Attachment #689855 - Flags: review?(n.nethercote)
Attachment #690414 - Flags: review?(n.nethercote)
Attached patch reduce tokenizer lookahead (obsolete) — Splinter Review
Attachment #690416 - Flags: review?(n.nethercote)
Do these patches remove the things blocking bug 678037 as described in bug 678037 comment c28? Or, if not, could things be changed without too much effort to make it so?
(In reply to Till Schneidereit [:till] from comment #19)
> Do these patches remove the things blocking bug 678037 as described in bug
> 678037 comment c28? Or, if not, could things be changed without too much
> effort to make it so?

Those are already gone.
For next time(In reply to :Benjamin Peterson from comment #14)
> (In reply to Nicholas Nethercote [:njn] from comment #9)
> > Are you planning to fold these patches together before landing?
> 
> I don't know. Do you have an opinion?

I've just found the patch division unintuitive.  E.g. patch two is weird;  I assume the octal stuff will be used in a later patch.  I guess it doesn't matter much for landing so long as all intermediate versions build.

BTW, for next time, it makes things much easier for the reviewer if you number the patches, esp. when you rework some of the patches.
Comment on attachment 690414 [details] [diff] [review]
make some JSFunction setters idempotent

Review of attachment 690414 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsfun.h
@@ +147,5 @@
>      void setIsHeavyweight() {
>          flags |= HEAVYWEIGHT;
>      }
>  
>      void setIsExprClosure() {

Add a "Can be called multiple times..." comment?
Comment on attachment 690416 [details] [diff] [review]
reduce tokenizer lookahead

Review of attachment 690416 [details] [diff] [review]:
-----------------------------------------------------------------

r=me if you introduce a |maxLookahead| constant and rewrite these changes in terms of it :)

::: js/src/frontend/TokenStream.h
@@ +603,5 @@
>      }
>  
>      TokenKind peekToken() {
>          if (lookahead != 0) {
> +            JS_ASSERT(lookahead <= 1);

I assume this is <= 1 because that represents the lookahead already being used, and peekToken() increments that?
Attachment #690416 - Flags: review?(n.nethercote) → review+
Comment on attachment 689859 [details] [diff] [review]
make strictness a boolean

Review of attachment 689859 [details] [diff] [review]:
-----------------------------------------------------------------

ParseContext::queuedStrictModeError and StrictModeGetter:queuedStrictModeError() and both now dead, AFAICT, but this patch doesn't seem to remove them.

With this patch we now have the following:  SharedContext::strictMode, SharedContext::inStrictMode(), JSScript::strictModeCode, TokenStream::strictMode(), as well as various local variables and parameters of different names.  r=me you choose one (probably |strictMode|) and make them all consistent.  Do the ones you've already changed in this patch and the rest in a follow-up.

(Apologies for making you clean up pre-existing messes, but this stuff needs to be cleaned up and this is a good opportunity.)

::: js/src/frontend/SharedContext-inl.h
@@ +42,5 @@
>      JS_ASSERT(isFunction);
>      return static_cast<FunctionBox*>(this);
>  }
>  
> +GlobalSharedContext::GlobalSharedContext(JSContext *cx, JSObject *scopeChain, bool strict)

An example:  this should be |strictMode|.  There are numerous similar cases.
Attachment #689859 - Flags: review?(n.nethercote) → review+
Attachment #690414 - Flags: review?(n.nethercote) → review+
Comment on attachment 689860 [details] [diff] [review]
delete some crud we don't need anymore; simplify

Review of attachment 689860 [details] [diff] [review]:
-----------------------------------------------------------------

Oh, you've removed the queued strict stuff here.
Attachment #689860 - Flags: review?(n.nethercote) → review+
Attachment #689858 - Flags: review?(n.nethercote) → review+
Comment on attachment 690413 [details] [diff] [review]
minimal lexer support for saving and restoring state; save it when we see "function"

Review of attachment 690413 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/TokenStream.cpp
@@ +409,5 @@
> +    prevLinebase = pos.prevLinebase;
> +    lookahead = 0;
> +
> +    // Make the last token look like it it came from here. The parser looks at
> +    // currentTokens() position to calulate line numbers.

s/calulate/calculate/

Also, "the position of currentTokens()" is clearer.
Attachment #690413 - Flags: review?(n.nethercote) → review+
Attachment #689854 - Flags: review?(n.nethercote) → review+
Comment on attachment 689857 [details] [diff] [review]
reparse strict functions

Review of attachment 689857 [details] [diff] [review]:
-----------------------------------------------------------------

This whole change is great, it's clearly much simpler than what we had.

I'm giving r- on this patch, not because there's anything majorly wrong with it, but because I want to see the whole sequence again once you've made my requested changes.  Just fold all the patches up into one for that.  Thanks!

::: js/src/frontend/BytecodeCompiler.cpp
@@ +294,2 @@
>      Rooted<JSScript*> script(cx, JSScript::Create(cx, NullPtr(), false, options,
> +                                                  0, ss, 0, length));

Please write |/* staticLevel = */ 0|.  While you're at it, can you do the same for the pre-existing |sourceStart| arg?

@@ +296,5 @@
>      if (!script)
>          return false;
>  
> +    TokenStream::Position start;
> +    parser.tokenStream.tell(start);

The fact that tell() takes a |Position&| is a bit surprising here.  A |Position*| would be clearer, make it more obvious that |start| is modified.

Also, I wonder if Position is sufficiently different from a simple file offset that the names position() and setPosition() would be clearer than tell() and seek().

@@ +300,5 @@
> +    parser.tokenStream.tell(start);
> +    bool initiallyStrict = StrictModeFromContext(cx) == StrictMode::STRICT;
> +    bool strict = initiallyStrict;
> +    FunctionBox *funbox;
> +    ParseNode *pn = parser.standaloneFunctionBody(fun, formals, script, fn, &funbox, strict);

*Please* make |strict| be passed as a pointer not a reference, its modification is also quite surprising.  (Esp. when funbox is a pointer not a reference.)

Also, |strict| is an in-out-param?  Erk.

@@ +302,5 @@
> +    bool strict = initiallyStrict;
> +    FunctionBox *funbox;
> +    ParseNode *pn = parser.standaloneFunctionBody(fun, formals, script, fn, &funbox, strict);
> +    if (!pn) {
> +        if (initiallyStrict || !strict || parser.tokenStream.hadError())

Please add a comment explaining this.  You have a good comment in 
Parser::functionDef for a similar case.

::: js/src/frontend/BytecodeEmitter.cpp
@@ -4854,5 @@
> -        // functions defined in defaults), inherit it from the parent.
> -        if (funbox->strictModeState == StrictMode::UNKNOWN) {
> -            JS_ASSERT(outersc->strictModeState != StrictMode::UNKNOWN);
> -            funbox->strictModeState = outersc->strictModeState;
> -        }

Hooray :)

::: js/src/frontend/Parser.cpp
@@ +1699,5 @@
>  #else
> +    if (!tokenStream.matchToken(TOK_RC)) {
> +        reportError(NULL, JSMSG_CURLY_BEFORE_BODY);
> +        return false;
> +    }

You changed TOK_LC to TOK_RC here, I think that's a typo.  Have you tested these changes?

@@ +1886,5 @@
>                      return false;
> +                } else {
> +                    // We don't reparse global scopes, so we keep track of the
> +                    // one possible strict violation that could occur in the
> +                    // directive proglogue, octal escapes, and complain now.

s/proglogue/prologue/

Also, it's clearer if you write "-- octal escapes --".

::: js/src/frontend/Parser.h
@@ +199,5 @@
>      // they need to be treated differently.
>      bool            inDeclDestructuring:1;
>  
> +    // True if we just saw a "use strict" directive and weren't strict before.
> +    bool            becameStrict:1;

I'm guessing that this is only used for functions?  If so, the comment could make that clear, and the member name could even be |funBecameStrict|.

@@ +354,5 @@
>      /* Public entry points for parsing. */
>      ParseNode *statement();
> +    bool maybeParseDirective(ParseNode *pn, bool &cont);
> +
> +    ParseNode *standaloneFunctionBody(HandleFunction fun, const AutoNameVector &formals, HandleScript script,

This needs a comment.
Attachment #689857 - Flags: review?(n.nethercote) → review-
(In reply to Nicholas Nethercote [:njn] from comment #28)
> ::: js/src/frontend/Parser.cpp
> @@ +1699,5 @@
> >  #else
> > +    if (!tokenStream.matchToken(TOK_RC)) {
> > +        reportError(NULL, JSMSG_CURLY_BEFORE_BODY);
> > +        return false;
> > +    }
> 
> You changed TOK_LC to TOK_RC here, I think that's a typo.  Have you tested
> these changes?

Good catch. It's in one of those alternate syntax sections that is normally #ifdefed out, so nobody has probably tested that code in years.
Attached patch conglomerationSplinter Review
Attachment #689854 - Attachment is obsolete: true
Attachment #689857 - Attachment is obsolete: true
Attachment #689858 - Attachment is obsolete: true
Attachment #689859 - Attachment is obsolete: true
Attachment #689860 - Attachment is obsolete: true
Attachment #690413 - Attachment is obsolete: true
Attachment #690414 - Attachment is obsolete: true
Attachment #690416 - Attachment is obsolete: true
Attachment #690686 - Flags: review?(n.nethercote)
Attachment #690687 - Flags: review?(n.nethercote)
Comment on attachment 690686 [details] [diff] [review]
conglomeration

Review of attachment 690686 [details] [diff] [review]:
-----------------------------------------------------------------

Lovely!  Just some nits below.

::: js/src/frontend/BytecodeCompiler.cpp
@@ +310,5 @@
> +            return false;
> +
> +        // Reparse in strict mode.
> +        parser.tokenStream.seek(start);
> +        pn = parser.standaloneFunctionBody(fun, formals, script, fn, &funbox, true);

Change |true| to |/* initiallyStrict = */ true|.

::: js/src/frontend/Parser.cpp
@@ +704,5 @@
> +    *funbox = newFunctionBox(fun, /* outerpc = */ NULL, strict);
> +    if (!funbox)
> +        return NULL;
> +
> +    ParseContext funpc(this, *funbox, 0, /* bodyid = */ 0);

Change the first 0 to |/* staticLevel = */ 0|.

@@ +1858,5 @@
> +
> +    ParseNode *string = pn->pn_kid;
> +    if (string->isEscapeFreeStringLiteral()) {
> +        // Mark this statement as being a possibly legitimate part of a
> +        // directive prologue, so the byte code emitter won't warn about it

s/byte code/bytecode/

@@ +1882,5 @@
>                      return false;
> +                } else {
> +                    // We don't reparse global scopes, so we keep track of the
> +                    // one possible strict violation that could occur in the
> +                    // directive prologue--octal escapes--and complain now.

I'd put spaces around the dashes.  Or you could use parens if you like that better.

@@ +1934,5 @@
>              return NULL;
>          }
>  
> +        if (canHaveDirectives) {
> +            if (!maybeParseDirective(next, canHaveDirectives))

Again, I'm not much of a fan of non-const reference parameters.  Can you make it a pointer?  |&canHaveDirectives| makes it so much more obvious that it's modified.

::: js/src/frontend/Parser.h
@@ +193,5 @@
>      // assignment-like and declaration-like destructuring patterns, and why
>      // they need to be treated differently.
>      bool            inDeclDestructuring:1;
>  
> +    // True if we just saw a "use strict" directive and weren't strict before.

Change comment to "True if we were in a function and we just saw..."?

::: js/src/frontend/TokenStream.cpp
@@ +412,5 @@
> +    // Make the last token look like it it came from here. The parser looks at
> +    // the position of currentToken() to calculate line numbers.
> +    Token *cur = &tokens[cursor];
> +    cur->pos.begin.lineno = lineno;
> +    cur->pos.begin.index = pos.buf - linebase;

So the idea here is that after seek'ing, the parser will look at currentToken(), but only for the beginning lineno and index?  That's a bit nasty and requires intimate knowledge of what the parser's doing.

Is there a way to be more defensive here?  For one, could you reset the other fields in the token so that if they are read things will break quickly?

::: js/src/frontend/TokenStream.h
@@ +604,5 @@
>      }
>  
>      TokenKind peekToken() {
>          if (lookahead != 0) {
> +            JS_ASSERT(lookahead <= maxLookahead - 1);

Is |lookahead < maxLookahead| clearer?

@@ +622,5 @@
>          if (!onCurrentLine(currentToken().pos))
>              return TOK_EOL;
>  
>          if (lookahead != 0) {
> +            JS_ASSERT(lookahead <= maxLookahead - 1);

Ditto.
Attachment #690686 - Flags: review?(n.nethercote) → review+
Comment on attachment 690687 [details] [diff] [review]
grand strictness renaming

Review of attachment 690687 [details] [diff] [review]:
-----------------------------------------------------------------

Hmm.  We have both "strict warnings" and "strict mode", and |strict| doesn't distinguish between them sufficiently.  Can you use |strictMode|?  Sorry to be a PITA.  r=me with that.
Attachment #690687 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #33)
> Comment on attachment 690687 [details] [diff] [review]
> grand strictness renaming
> 
> Review of attachment 690687 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Hmm.  We have both "strict warnings" and "strict mode", and |strict| doesn't
> distinguish between them sufficiently.  Can you use |strictMode|?  Sorry to
> be a PITA.  r=me with that.

The thing I like about "strict" is that it's an adjective, so its obviously it is a boolean. Strict warnings is something we always have to ask the context about rather a property of a script or function.
> The thing I like about "strict" is that it's an adjective, so its obviously
> it is a boolean.

If we're discussing that, my preference is for "isStrict".  But I can live with "strict" if you feel strongly about it.
Assignee: general → benjamin
Depends on: 821970
No longer depends on: 821970
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: