Closed Bug 1155472 Opened 9 years ago Closed 9 years ago

Implement ES6's grammar parametrization ideas

Categories

(Core :: JavaScript: Standard Library, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: Waldo, Assigned: Waldo)

References

Details

Attachments

(2 files)

The grammar makes frequent use of [Yield], [In], [Return], and so on as effectively arguments to grammar productions, so that ES6 doesn't have to have ES5's *NoIn grammar monstrosity -- and to reasonably address the problem of where yield is/isn't valid as a name.  We should add all this to our parser to get rid of the fragile parsingForInit stateful tracking.  This is also necessary to fully enforce yield-validity: checkYieldNameValidity does a fair job, and it's necessary so long as we have scumbag legacy generators, but there's a *lot* of messiness to exactly when yield is/isn't allowed.  We're going to have to do this sooner or later.  And conveniently, this also pointed out a few places where we get things wrong -- noted with FIXMEs for now, as this patch isn't intended to fix any bugs.  (And note how I haven't ripped out the parsingForInit code yet -- after this is in place, for simplicity.)

Because most of this is directed from Statement downward, I have a patch reordering the cases in Parser::statements first -- actual parameters come in a followup.

It would be nice if we compared each of these arguments against the spec grammar, but I probably can't quite hold it against anyone for not doing so here.  Presumably most of it's tested already -- certainly I made enough screwups that tests found as I worked on this.
I'm a little dubious that we really must detect top-level catch/finally and so on manually here, rather than letting ::expression complain, but whatever.
Attachment #8593689 - Flags: review?(jorendorff)
As requested, for an opportunity to grok grammar parametrization stuff.  Extra care in the values specified for everything is appreciated!
Attachment #8593695 - Flags: review?(efaustbmo)
Comment on attachment 8593695 [details] [diff] [review]
Add the ES6 grammar parametrization to all the Parser methods

Review of attachment 8593695 [details] [diff] [review]:
-----------------------------------------------------------------

Looks pretty good to me. I will say that I am most confused by the apparent lack of actually using the inHandling anywhere (at least an assert is probably called for, if possible), and that I have never wanted to kill off legacy generators so much. What's our ETA on their death? Have they been formally deprecated? If not, can we file a followup to do so and make that happen "real soon"?

::: js/src/frontend/BytecodeCompiler.cpp
@@ +341,5 @@
>  
>          TokenStream::Position pos(parser.keepAtoms);
>          parser.tokenStream.tell(&pos);
>  
> +        ParseNode* pn = parser.statement(YieldIsName, canHaveDirectives);

Is this true? Can BCE::CompileScript really never see generators? Isn't there a %GeneratorFunction% constructor similar to %Function%? Do we implement it?

@@ +365,5 @@
>                  if (!pc->init(parser.tokenStream))
>                      return nullptr;
>                  MOZ_ASSERT(parser.pc == pc.ptr());
>  
> +                pn = parser.statement(YieldIsName);

Same concern as above.

::: js/src/frontend/Parser.cpp
@@ +1012,5 @@
>  #endif
>  
>      Node pn;
>      if (type == StatementListBody) {
> +        pn = statements(yieldHandling);

How is it that we can be parsing a functionBody with some |in| semantics, and not need to pass that through to the statement list? This seems very odd to me, but also seems to be what the spec says.

@@ +1033,5 @@
>          MOZ_ASSERT(pc->lastYieldOffset == startYieldOffset);
>          break;
>  
>        case LegacyGenerator:
> +        // FIXME: Catch these errors eagerly, in Parser::yieldExpression.

Thanks. This is a nice comment improvement. We should endeavor (mostly I'm chiding myself) not to refer to functions in the "C comment style" with (), but by their C++ names.

@@ +1699,5 @@
>                      return false;
> +                // FIXME: This fails to handle a rest parameter named |yield|
> +                //        correctly outside of generators: that is,
> +                //        |var f = (...yield) => 42;| should be valid code!
> +                //        When this is fixed, make sure to coonsult both

typo: coonsult

@@ +1701,5 @@
> +                //        correctly outside of generators: that is,
> +                //        |var f = (...yield) => 42;| should be valid code!
> +                //        When this is fixed, make sure to coonsult both
> +                //        |yieldHandling| and |checkYieldNameValidity| for
> +                //        correctness when legacy generator syntax is removed.

I am confused by "when legacy generator syntax is removed". Is this meant to be "until"?
Is there a way we can kill checkYieldNameValidity() now? By doing the version checks at the top of Parser::parse, or something and passing that around as the baseline YieldValidity? Or leave it encapsulated, but replace the generatorKind checks in various places? Having both schemes seems rickety, somehow, but maybe it's mandatory until we can kill off legacy generators.

@@ +4462,5 @@
>          // and fall through.
>          tokenStream.ungetToken();
>        case TOK_LET:
>        case TOK_CONST:
> +        kid = lexicalDeclaration(YieldIsName, tt == TOK_CONST);

I'm not convinced I believe this one. I'm not seeing in the spec where exported lexicals are not subject to the early error in 15.2.3.1

@@ +5342,5 @@
>  {
>      MOZ_ASSERT(tokenStream.isCurrentTokenType(TOK_RETURN));
>      uint32_t begin = pos().begin;
>  
> +    MOZ_ASSERT(pc->sc->isFunctionBox());

This is lofty, but definitely good if it's right. Was this change intentional, or the bygone of an exploratory jit-test run?

@@ +5762,5 @@
>                      return null();
>                  break;
>  
>                case TOK_YIELD:
> +                if (yieldHandling == YieldIsKeyword) {

When we get this sorted finally, is there a way we can push this information into the tokenizer? I guess maybe not, with lookahead, though probably we could just always lookahead TOK_YIELD, and maybe get TOK_NAME with the right params? At any rate, it would be nice not to have to do this yieldHandling dance everywhere...

@@ +6081,5 @@
>        case TOK_RETURN:
> +        // The Return parameter is only used here, and the effect is easily
> +        // detected this way, so don't bother passing around an extra parameter
> +        // everywhere.
> +        if (!pc->sc->isFunctionBox()) {

Oh, OK. Disregard the question above, then.

@@ +6125,2 @@
>        case TOK_CONST:
>          if (!abortIfSyntaxParser())

aside: should we push this syntax parse abort into lexicalDeclaration? It's also invalid for lets, right? If not, we should just make consts work in lazy parsing. There's no good reason why they shouldn't, offhand.

@@ +6303,5 @@
>          if (!tokenStream.getToken(&tok))
>              return null();
> +
> +        // FIXME: Change this to use |inHandling == InAllowed|, not
> +        // |pc->parsingForInit|.

What's holding that back now? Followup possible?

@@ +8621,5 @@
>          if (!tokenStream.getToken(&next))
>              return null();
> +        // FIXME: This fails to handle a rest parameter named |yield| correctly
> +        //        outside of generators: |var f = (...yield) => 42;| should be
> +        //        valid code!  When this is fixed, make sure to coonsult both

same typo(s?) copypasted.
Attachment #8593695 - Flags: review?(efaustbmo) → review+
Comment on attachment 8593689 [details] [diff] [review]
Reorder the various statement items in Parser::statement to correspond to the ordering in the Statement grammar production

Review of attachment 8593689 [details] [diff] [review]:
-----------------------------------------------------------------

Stealing review as per IRC request. LGTM with one question addressed.

::: js/src/frontend/Parser.cpp
@@ +6037,5 @@
> +      // WithStatement[?Yield, ?Return]
> +      case TOK_WITH:
> +        return withStatement();
> +
> +      // LabelledStatement[?Yield, ?Return]

Where is this case handled? This feels like there's "code missing" on a read. Can we elaborate on this comment somehow?

@@ +6087,5 @@
> +      case TOK_FINALLY:
> +        report(ParseError, false, null(), JSMSG_FINALLY_WITHOUT_TRY);
> +        return null();
> +
> +        // NOTE: default case handled in the ExpressionStatement section.

Nit: unindent (fixed in patch in part 2, but I'm still gonna gripe. Haters gonna hate)
Attachment #8593689 - Flags: review?(jorendorff) → review+
(In reply to Eric Faust [:efaust] from comment #3)
> I am most confused by the apparent lack of actually using the inHandling
> anywhere (at least an assert is probably called for, if possible)

Haven't added this, haven't added either, in order to land the basic infrastructure, then make use of it after the fact.  Ditto for an assert.

And, actually, the assert is *not* equivalent to current behavior, at the edges.  I really didn't check closely, but really almost every place that passes a specific InHandling should have parsingForInit saving around it.  And from memory, I'm not sure they all do.  At the very least, I remember this case looking screwy -- and indeed it is.  In a pre-patch shell:

  for (x => x in true; false; ) break; // parses, runs without error

In contrast, if I followed the parameters correctly, and passed them correctly, and assuming my first pass on using InHandling *alone* for this is correct:

js> for (x => x in true; false; ) break;
typein:3:4 SyntaxError: invalid for/in left-hand side:
typein:3:4 for (x => x in true; false; ) break;
typein:3:4 ....^

This comports with what I read in ES6.  Bug in ES6?  Maybe.  Right now I don't care.  :-)  Get everything landed, turn it on separately.

> I have never wanted to kill off legacy generators so much. What's our ETA
> on their death? Have they been formally deprecated? If not, can we file a
> followup to do so and make that happen "real soon"?

Unclear.  The syntax that was in ES6 got punted, and who knows if it'll ever happen now.  :-\  Doesn't prevent us from deprecating, but when we can't point to a thing and say it's standard and won't change and it's safe to use, it's harder to do.

> ::: js/src/frontend/BytecodeCompiler.cpp
> @@ +341,5 @@
> >  
> >          TokenStream::Position pos(parser.keepAtoms);
> >          parser.tokenStream.tell(&pos);
> >  
> > +        ParseNode* pn = parser.statement(YieldIsName, canHaveDirectives);
> 
> Is this true? Can BCE::CompileScript really never see generators? Isn't
> there a %GeneratorFunction% constructor similar to %Function%? Do we
> implement it?
> 
> Same concern as above.

My DXRing says this is only reachable for top-level scripts, which fortunately comports with the ES6 sense of "script" as "a full program".

> ::: js/src/frontend/Parser.cpp
> @@ +1012,5 @@
> >  #endif
> >  
> >      Node pn;
> >      if (type == StatementListBody) {
> > +        pn = statements(yieldHandling);
> 
> How is it that we can be parsing a functionBody with some |in| semantics,
> and not need to pass that through to the statement list? This seems very odd
> to me, but also seems to be what the spec says.

It seems reasonable enough -- if we have a list of statements as body, there were braces surrounding it, and things are okay.

js> for (x => { x in true }; false; ) break; // fine

It's only when the |in| is maybe confusable with a for-in that things are screwy.  (Query whether |=>| should preclude that confusability, but whatever.)

> I am confused by "when legacy generator syntax is removed". Is this meant to
> be "until"?

Yeah, think so.  Fixt.

> Is there a way we can kill checkYieldNameValidity() now? By doing the
> version checks at the top of Parser::parse, or something and passing that
> around as the baseline YieldValidity? Or leave it encapsulated, but replace
> the generatorKind checks in various places? Having both schemes seems
> rickety, somehow, but maybe it's mandatory until we can kill off legacy
> generators.

Yeah, I don't see a way around it.  And honestly, I'd be somewhat surprised if there's not a place where (as with InHandling) different conclusions are reached.  We're stuck with both whe^H^H^Huntil legacy generators die.  :-(

> @@ +4462,5 @@
> >          // and fall through.
> >          tokenStream.ungetToken();
> >        case TOK_LET:
> >        case TOK_CONST:
> > +        kid = lexicalDeclaration(YieldIsName, tt == TOK_CONST);
> 
> I'm not convinced I believe this one. I'm not seeing in the spec where
> exported lexicals are not subject to the early error in 15.2.3.1

I talked on #jslang with this, and people agreed with me that 15.2.3.1 applies only to export { ... } and not to export var/const/let ....  But that points out that the TOK_VAR case is wrong.  So I left this as-is and changed the other to use YieldIsName.

(I think we enforce the 15.2.3.1 restrictions via getToken without KeywordIsName, but I haven't verified personally, given this all is short-circuited by import/export not being supported yet.)

> @@ +5762,5 @@
> >                      return null();
> >                  break;
> >  
> >                case TOK_YIELD:
> > +                if (yieldHandling == YieldIsKeyword) {
> 
> When we get this sorted finally, is there a way we can push this information
> into the tokenizer? I guess maybe not, with lookahead, though probably we
> could just always lookahead TOK_YIELD, and maybe get TOK_NAME with the right
> params? At any rate, it would be nice not to have to do this yieldHandling
> dance everywhere...

I'm not convinced there's much better to do than to test the parameter directly, either.  Fortunately I see a grand total of two places that test yieldHandling, so I'm not worried about fugly.

> @@ +6125,2 @@
> >        case TOK_CONST:
> >          if (!abortIfSyntaxParser())
> 
> aside: should we push this syntax parse abort into lexicalDeclaration? It's
> also invalid for lets, right? If not, we should just make consts work in
> lazy parsing. There's no good reason why they shouldn't, offhand.

The two reasons for this were, um, laziness, and early errors for const reassignments historically.  The latter needs to die, and the former is a symptom of benchmarketing.  :-(

> > +        // FIXME: Change this to use |inHandling == InAllowed|, not
> > +        // |pc->parsingForInit|.
> 
> What's holding that back now? Followup possible?

See start.  Will file followup.

(In reply to Eric Faust [:efaust] from comment #4)
> > +      // LabelledStatement[?Yield, ?Return]
> 
> Where is this case handled? This feels like there's "code missing" on a
> read. Can we elaborate on this comment somehow?

It's handled by expressionStatement in the default clause.  |foo| is either a label or a name, no way to know without sniffing the token after.  So expression parsing detects this.  I tweaked the comment a bit.

> > +        // NOTE: default case handled in the ExpressionStatement section.
> 
> Nit: unindent (fixed in patch in part 2, but I'm still gonna gripe. Haters
> gonna hate)

Fixed in the first part, even.
Blocks: 1163851
https://hg.mozilla.org/mozilla-central/rev/03335da9925a
https://hg.mozilla.org/mozilla-central/rev/d7a5e972e003
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: