Closed Bug 1066827 Opened 7 years ago Closed 7 years ago

"Illegal character" error shouldn't be overwritten by other error message.

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla36

People

(Reporter: arai, Assigned: arai)

References

(Blocks 1 open bug)

Details

Attachments

(13 files, 26 obsolete files)

23.17 KB, patch
arai
: review+
Details | Diff | Splinter Review
48.00 KB, patch
arai
: review+
Details | Diff | Splinter Review
8.93 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
43.58 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
4.29 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
4.63 KB, patch
arai
: review+
Details | Diff | Splinter Review
2.80 KB, patch
arai
: review+
Details | Diff | Splinter Review
4.51 KB, patch
arai
: review+
Details | Diff | Splinter Review
26.38 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
8.67 KB, patch
arai
: review+
Details | Diff | Splinter Review
6.33 KB, patch
arai
: review+
Details | Diff | Splinter Review
3.40 KB, patch
arai
: review+
Details | Diff | Splinter Review
1.36 KB, patch
arai
: review+
Details | Diff | Splinter Review
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:33.0) Gecko/20100101 Firefox/33.0
Build ID: 20140908190852

Steps to reproduce:

Derived from bug 1041426.

When TokenStream.peekToken() returns TOK_ERROR, "Illegal character" error is already reported inside TokenStream. Parser should not overwrite it with other error message.
Depends on: 1041426
OS: Mac OS X → All
Hardware: x86 → All
Thank you for landing the patch in bug 1041426.

If we should check TOK_ERROR everytime we call peekToken or similar methods, it will be better to check it inside those methods, and return the result.

I changed following methods:
  * peekToken
  * getToken
  * peekTokenSameLine
  * matchToken
  * matchContextualKeyword
Now they returns boolean value which indicates that TOK_ERROR is not hit.
We can abort parsing when they return false. And there is no need to compare current token with TOK_ERROR separated from those method calls.

Here are some examples:

  peekToken
    before
      TokenKind tt = tokenStream.peekToken();
      if (tt == TOK_ERROR)
        return null();
      ...
    after
      TokenKind tt;
      if (!tokenStream.peekToken(&tt))
        return null();
      ...
  matchToken
    before
      if (tokenStream.matchToken(TOK_FOR))
        ...
      else if (/* check TOK_ERROR */)
        return null();
    after
      bool matched;
      if (!tokenStream.matchToken(&matched, TOK_FOR))
        return null();
      if (matched)
        ...


Prepred 5 parts for each methods, and 1 part for testcase.


Almost green on try run: https://tbpl.mozilla.org/?tree=Try&rev=15a8d8ad2b34
dt3 (browser_storage_values.js) fails once and suceeds twice, I guess it's not related to this patch.
Attachment #8492750 - Flags: review?(jwalden+bmo)
Attachment #8492751 - Flags: review?(jwalden+bmo)
Attachment #8492752 - Flags: review?(jwalden+bmo)
Attachment #8492753 - Flags: review?(jwalden+bmo)
Attachment #8492754 - Flags: review?(jwalden+bmo)
Attachment #8492755 - Flags: review?(jwalden+bmo)
Blocks: jserror
(In reply to Tooru Fujisawa [:arai] from comment #1)
> If we should check TOK_ERROR everytime we call peekToken or similar methods,
> it will be better to check it inside those methods, and return the result.
> 
> I changed following methods:
>   * peekToken
>   * getToken
>   * peekTokenSameLine
>   * matchToken
>   * matchContextualKeyword
> Now they returns boolean value which indicates that TOK_ERROR is not hit.
> We can abort parsing when they return false. And there is no need to compare
> current token with TOK_ERROR separated from those method calls.

Yeah, this is a good change.

I wonder, once these patches are in, how much TOK_ERROR even needs to exist.  (Having not looked at these patches to see just how many uses are removed, just intuiting.)  Guess we can regroup after these are in the tree.
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #7)
> I wonder, once these patches are in, how much TOK_ERROR even needs to exist.
> (Having not looked at these patches to see just how many uses are removed,
> just intuiting.)  Guess we can regroup after these are in the tree.

After those patches are applied, TOK_ERROR is used only in TokenKind.h and TokenStream.* files.
I guess error chould be checked by `flags.hadError` instead of comparing token with TOK_ERROR inside TokenStream class. If TOK_LIMIT can be used as default value for token, TOK_ERROR is no more needed.
Some patches conflict with other commits.
I'll fix them soon.
Rebased, and fixed conflict between bug 1054835 in Part2.
Also, added more tests for 'export' syntax in Part6,
and removed TOK_ERROR as Part7 and 8.

Green on try run: https://tbpl.mozilla.org/?tree=Try&rev=eddfdc0f3812
( https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=eddfdc0f3812 )
Attachment #8492750 - Attachment is obsolete: true
Attachment #8492750 - Flags: review?(jwalden+bmo)
Attachment #8496091 - Flags: review?(jwalden+bmo)
Attachment #8492751 - Attachment is obsolete: true
Attachment #8492751 - Flags: review?(jwalden+bmo)
Attachment #8496092 - Flags: review?(jwalden+bmo)
Attachment #8492752 - Attachment is obsolete: true
Attachment #8492752 - Flags: review?(jwalden+bmo)
Attachment #8496093 - Flags: review?(jwalden+bmo)
Attachment #8492753 - Attachment is obsolete: true
Attachment #8492753 - Flags: review?(jwalden+bmo)
Attachment #8496094 - Flags: review?(jwalden+bmo)
Attachment #8492754 - Attachment is obsolete: true
Attachment #8492754 - Flags: review?(jwalden+bmo)
Attachment #8496095 - Flags: review?(jwalden+bmo)
Attachment #8492755 - Attachment is obsolete: true
Attachment #8492755 - Flags: review?(jwalden+bmo)
Attachment #8496096 - Flags: review?(jwalden+bmo)
Attachment #8496097 - Flags: review?(jwalden+bmo)
Attachment #8496098 - Flags: review?(jwalden+bmo)
Comment on attachment 8496091 [details] [diff] [review]
Part1: Check error in TokenStream.peekToken.

Review of attachment 8496091 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/asmjs/AsmJSValidate.cpp
@@ +357,2 @@
>          ts.consumeKnownToken(TOK_SEMI);
> +    (void) ts.peekToken(&tk, TokenStream::Operand);

I'd rather see

    TokenKind tk;
    while (true) {
        if (!ts.peekToken(&tk, TokenStream::Operand))
            return TOK_ERROR;
        if (tk != TOK_SEMI)
            break;
        ts.consumeKnownToken(TOK_SEMI);
    }
    return tk;

Longer, sure, but you don't have this explicitly-avoiding-error-cases oddness.  And really, comma expressions in a loop condition is pretty frowny anyway.

::: js/src/frontend/Parser.cpp
@@ +2688,2 @@
>              break;
>          }

Braces should go now.  And tt <= TOK_EOF means only TOK_EOF or TOK_ERROR, and we've eliminated the latter, so make this just

  tt == TOK_EOF || tt == TOK_RC

@@ +4180,1 @@
>          return null();

I think these days we're trying to do

    if (tt == TOK_SEMI) {
        if (!report(...))
            return null();
    }

which is to say, have a guard around all the logic, rather than requiring careful parsing of short-circuiting by the reader.

@@ +4828,5 @@
>          Node body = handler.newStatementList(pc->blockid(), pos());
>          if (!body)
>              return null();
>  
> +        for (;;) {

I mildly prefer |while (true)| for this.

::: js/src/frontend/TokenStream.h
@@ +397,5 @@
> +    bool peekToken(TokenKind *ttp, Modifier modifier = None) {
> +        if (lookahead != 0) {
> +            TokenKind tt = tokens[(cursor + 1) & ntokensMask].type;
> +            *ttp = tt;
> +            return tt != TOK_ERROR;

Ultimately, it seems to me that *if* we have lookahead, it *should* be the case that the lookahead is valid, never TOK_ERROR.  Maybe that'll be an easy change to make after this patch series is done.

@@ +403,4 @@
>          TokenKind tt = getTokenInternal(modifier);
>          ungetToken();
> +        *ttp = tt;
> +        return tt != TOK_ERROR;

We should probably common a little:

    if (lookahead > 0) {
        *ttp = tokens[(cursor + 1) & ntokensMask].type;
    } else {
        *ttp = getTokenInternal(modifier);
        ungetToken();
    }
    return *ttp != TOK_ERROR;

@@ +467,5 @@
>  
>      bool nextTokenEndsExpr() {
> +        TokenKind tt;
> +        if (!peekToken(&tt))
> +            return false;

Hm.  This answer is not really right or wrong in this situation.  We should probably change this method signature as well to complain immediately if it encounters an error, too.  Another patch.
Attachment #8496091 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8496092 [details] [diff] [review]
Part2: Check error in TokenStream.getToken.

Review of attachment 8496092 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/asmjs/AsmJSValidate.cpp
@@ +6370,1 @@
>      JS_ASSERT(tk == TOK_FUNCTION);

This can just be

    tokenStream.consumeKnownToken(TOK_FUNCTION);

::: js/src/frontend/Parser.cpp
@@ +1210,5 @@
>          return false;
>      if (tt != TOK_EOF && tt != TOK_EOL && tt != TOK_SEMI && tt != TOK_RC) {
>          /* Advance the scanner for proper error location reporting. */
> +        if (!ts.getToken(&tt, TokenStream::Operand))
> +            return false;

ts.consumeKnownToken(tt, TokenStream::Operand).  We may need to add an optional Modifier argument to that method, to do this.

@@ +1974,2 @@
>          return false;
>      return true;

Just return tokenStream.getToken(...).

|tt| as passed to this method should be a pointer, not a reference -- this looked an awful lot like not checking a TokenKind and accepting something dreckish, at first glance.  Followup patch?

@@ +4387,5 @@
>                  pn1 = variables(tt == TOK_VAR ? PNK_VAR : PNK_CONST);
>              } else if (tt == TOK_LET) {
>                  handler.disableSyntaxParser();
> +                if (!tokenStream.getToken(&tt))
> +                    return null();

tokenStream.consumeKnownToken(tt, TokenStream::Operand);

@@ +4819,5 @@
>      pc->blockNode = caseList;
>  
>      bool seenDefault = false;
>      TokenKind tt;
> +    for (;;) {

while (true) {

@@ +5782,5 @@
>          return null();
>  
> +    TokenKind tt;
> +    if (!tokenStream.getToken(&tt)) /* read one token past the end */
> +        return null();

The comment is more a literal restatement of what's done than helpful commentary.  Also, I'd prefer a different name indicating the token is being (deliberately, temporarily) ignored:

    // Advance to the next token; the caller is responsible for interpreting it.
    TokenKind ignored;
    if (!tokenStream.getToken(&tt))
        return null();

@@ +5924,2 @@
>              return null();
>          tokenStream.ungetToken();

This might as well be

    TokenKind ignored;
    if (!tokenStream.peekToken(&ignored))
        return null();

On the other hand, this looks a lot like code I just reviewed in bug 1054835, so hopefully this isn't even a thing now.

@@ +7122,5 @@
>          if (!lhs)
>              return null();
>      }
>  
> +    for (;;) {

while (true)

@@ +7131,1 @@
>          Node nextMember;

Blank line before this.  Generally it's good to have blank lines between distinct bits of concept.  This one could probably get away without it before, but now I think it's much more clearly desirable.

::: js/src/frontend/TokenStream.h
@@ +372,5 @@
>          TemplateTail,   // Treat next characters as part of a template string
>      };
>  
>      // Get the next token from the stream, make it the current token, and
> +    // store its kind to ttp, return true if no error happens.

Slightly different wording:

    // Advance to the next token.  If the token stream encountered an error, return
    // false.  Otherwise return true and store the token kind in |*ttp|.

I imagine at the end of your patches this storing thing doesn't matter, but just in case, I don't want to imply anything about |*ttp| in the failure case, as usually happens with fallible methods with outparams.

@@ +441,5 @@
>          //   is a newline between the next token and the one after that.
>          // The following test is somewhat expensive but gets these cases (and
>          // all others) right.
> +        TokenKind tmp;
> +        (void) getToken(&tmp, modifier);

I...assume you're changing this method to be fallible in subsequent patch?  Not okay to just ignore error cases here, except if it's a temporary state of affairs.

@@ +461,3 @@
>              return true;
>          ungetToken();
>          return false;

This definition isn't really right.  The meaning of the return value was true if the provided token was encountered and consumed, false otherwise.  I guess maybe we're preserving that here from before, but ugh.  Hopefully a subsequent patch is fixing this.

@@ +466,5 @@
>      void consumeKnownToken(TokenKind tt) {
>          JS_ALWAYS_TRUE(matchToken(tt));
>      }
>  
>      bool matchContextualKeyword(Handle<PropertyName*> keyword) {

Same comment here as on matchToken.

::: js/src/jsfun.cpp
@@ +931,5 @@
>      do {
> +        TokenKind tt;
> +        if (!ts.getToken(&tt))
> +            // Must be memory.
> +            return false;

Braces around multiline if/else-bodies.  But I don't even understand this comment, and this if-false-return-failure idiom is totally ubiquitous, so just remove it.

@@ +1809,1 @@
>                   */

/* * */ like that is for multiline comments.  If this isn't one, either do

    /* Check that it's a name. */

or

    // Check that it's a name.
Attachment #8496092 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8496093 [details] [diff] [review]
Part3: Check error in TokenStream.peekTokenSameLine.

Review of attachment 8496093 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/TokenStream.h
@@ +419,5 @@
>      }
>  
>      // This is like peekToken(), with one exception:  if there is an EOL
>      // between the end of the current token and the start of the next token, it
> +    // stores TOK_EOL to ttp.  In that case, no token with TOK_EOL is actually

*ttp, not ttp.  On the other hand, a previous review comment suggested a rewording, so probably this is passed up by the past.  (Wordplay intended.)

@@ +433,5 @@
>          // to return TOK_EOL.
> +        if (lookahead != 0 && srcCoords.isOnThisLine(curr.pos.end, lineno)) {
> +            TokenKind tt = tokens[(cursor + 1) & ntokensMask].type;
> +            *ttp = tt;
> +            return tt != TOK_ERROR;

No harm in writing to *ttp in the error case, but we don't want to promise this in the API.  It's a momentary API behavior that people shouldn't rely on.
Attachment #8496093 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8496094 [details] [diff] [review]
Part4: Check error in TokenStream.matchToken.

Review of attachment 8496094 [details] [diff] [review]:
-----------------------------------------------------------------

r- for the Parser::variables loop change and the matchForInOrOf thing -- both are a bit messy as far as changes go, and to be honest, I sort of started skimming when I saw the latter was a little off.  :-)  So another look is wanted here.

::: js/src/frontend/Parser.cpp
@@ +1227,1 @@
>      return true;

bool ignored;
return ts.matchToken(&ignored, TOK_SEMI);

@@ +1531,5 @@
>          return false;
>      handler.setFunctionBody(funcpn, argsbody);
>  
> +    bool hasArguments = false;
> +    if (parenFreeArrow) {

Ouch, this code expansion hurts.  But I guess there's no way around it.

@@ +1545,5 @@
>          bool hasDefaults = false;
>          Node duplicatedArg = null();
>          Node list = null();
>  
> +        for (;;) {

while (true)

@@ +3527,5 @@
> +             * Strict mode eliminates a grammar ambiguity with unparenthesized
> +             * LetExpressions in an ExpressionStatement. If followed immediately
> +             * by an arguments list, it's ambiguous whether the let expression
> +             * is the callee or the call is inside the let expression body.
> +             *

Huh, I never knew about this!  Let's add an example to the comment:

             * ...expression body:
             *
             *   function id(x) { return x; }
             *   var x = "outer";
             *   // Does this parse as
             *   //   (let (loc = "inner") id)(loc) // "outer"
             *   // or as
             *   //   let (loc = "inner") (id(loc)) // "inner"
             *   let (loc = "inner") id(loc);

@@ +3673,5 @@
>          data.initVarOrConst(op);
>  
>      bool first = true;
>      Node pn2;
> +    for (;;) {

while (true)

@@ +3678,3 @@
>          if (psimple && !first)
>              *psimple = false;
> +        if (!first) {

Ugh, continue.  This may or may not be an equivalent semantic change (tho it probably is).  Whether it is or not, tho, structuring the code this way requires that a reader of it understand approximately the "full" structure of the loop, to understand when this can be hit.  That's not great for the casual reader.

Instead of having this boolean-conditioned behavior for end-of-loop logic, I think we should do this the way we do it elsewhere: with a do-while (false) loop.  The only exits out of the current loop code are via continue -- which needs the match-comma logic -- and via return -- which doesn't.  If we do instead this:

    while (true) {
        do {
            if (psimple && !first)
                *psimple = false;
            first = false;

            // ...all the rest of the current loop code, with every
            // |continue| replaced with a |break|
        } while (false);

        bool matched;
        if (!tokenStream.matchToken(&matched, TOK_COMMA))
            return null();
        if (!matched)
            break;
    }

Arguably this isn't hugely better, to be sure.  But it keeps end-of-loop behavior at end of loop, and loops broken out of this way are an idiom we use lots of places, so I think it's worth it.

@@ +3702,5 @@
>              bool ignored;
> +            bool parsingForInOrOfInit = false;
> +            if (pc->parsingForInit) {
> +                bool matched;
> +                if (!matchInOrOf(&matched, &ignored))

Move |ignored|'s declaration into this loop.

Now that we have two outparam bools here, I...have no idea what they each mean, when reading.  |ignored| particularly is really bad here.  :-)  I think we need to rejigger matchForInOrOf a little harder than we're doing here.  I propose we give it this signature:

    bool matchForInOrOf(bool *isForIn, bool *isForOf);

where return-false indicates error, return-true indicates no error, and if the next token is in/of you get one outparam set to true, one to false.  Another possibility is a tristate enum outparam, but that feels a little much.  Maybe.  My intuition isn't strong enough to say one thing is definitely the right way; please say something if you think some other option's better.  :-)

@@ +3999,5 @@
>                  return null();
>  
>              handler.addList(importSpecSet, importSpec);
>          } else {
> +            for (;;) {

while (true)

@@ +4121,5 @@
>          if (!kid)
>              return null();
>  
>          if (tt == TOK_LC) {
> +            for (;;) {

while (true)

@@ +4352,4 @@
>          *isForOfp = false;
>          return true;
>      }
>      if (tokenStream.matchContextualKeyword(context->names().of)) {

Hmm, this has fallibility problems.  But, really, I think having two matches in here is a little off.  We should have a getToken, then if that succeeds comparisons for TOK_IN and (TOK_NAME + name is of), with an unget if neither matches.  Basically use of matchToken inside this method isn't warranted, when we're looking for multiple different things.

@@ +5684,4 @@
>          Node seq = handler.newList(PNK_COMMA, pn);
>          if (!seq)
>              return null();
> +        for (;;) {

while (true)

@@ +6533,5 @@
>  
>      JS_ASSERT(pc->staticScope && pc->staticScope == pn->pn_objbox->object);
>      data.initLet(HoistVars, &pc->staticScope->as<StaticBlockObject>(), JSMSG_ARRAY_INIT_TOO_BIG);
>  
> +    for (;;) {

while (true)

@@ +7142,5 @@
>  
>      uint32_t startYieldOffset = pc->lastYieldOffset;
>      bool arg0 = true;
>  
> +    for (;;) {

while (true)

::: js/src/frontend/TokenStream.h
@@ +461,4 @@
>          TokenKind token;
>          if (!getToken(&token, modifier)) {
>              ungetToken();
> +            *matchedp = false;

Don't add this.  Anyone depending on it, after this patch, is doing it wrong.  We want no writes, so that if someone accidentally expects it to have been used, ASAN/Valgrind/whatever will tell us about the error.  But adding a deliberate write like this will fool them.

The same could/should be said about the ungetToken() call here, too.  Please remove that as well.  Callers of this method should, post-patch, all be immediately failing.  If we find any callers expect the unget to happen in this case, we should fix them too.  (I doubt there are any, so not worth an audit -- fuzzers will catch this, I expect.)

@@ +475,5 @@
>  
>      void consumeKnownToken(TokenKind tt) {
> +        bool matched;
> +        (void) matchToken(&matched, tt);
> +        JS_ALWAYS_TRUE(matched);

Switch this to MOZ_ALWAYS_TRUE while you're touching it.
Attachment #8496094 - Flags: review?(jwalden+bmo) → review-
Comment on attachment 8496095 [details] [diff] [review]
Part5: Check error in TokenStream.matchContextualKeyword.

Review of attachment 8496095 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, modulo the previous review obsoleting some of these changes.  :-)  The relevant adjustments should be trivial, tho.

::: js/src/frontend/TokenStream.h
@@ +483,4 @@
>          TokenKind token;
>          if (!getToken(&token)) {
>              ungetToken();
> +            *matchedp = false;

The unget and matched-assignment should be unnecessary, remove them (or in a followup patch, if necessary).  (We should probably land all these patches as one folded-together setup, so we don't have intermediate-state worries.)
Attachment #8496095 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8496096 [details] [diff] [review]
Part6: Add Testcase for Illegal character error.

Review of attachment 8496096 [details] [diff] [review]:
-----------------------------------------------------------------

This test is utterly brilliant.  So much awesome here.  \o/  r++++++++++ would review again.  (But none is necessary, to be perfectly clear -- just a humorous allusion to the over-the-top seller review style found on eBay.  :-) )

::: js/src/jit-test/tests/basic/syntax-error-illegal-character.js
@@ +1,2 @@
> +var JSMSG_ILLEGAL_CHARACTER = "illegal character";
> +var JSMSG_UNTERMINATED_STRING = "unterminated string literal";

Let's hope ES7 never makes @ a valid token/token prefix.  ;-)

@@ +6,5 @@
> +  try {
> +    Reflect.parse(code);
> +  } catch (e) {
> +    caught = true;
> +    assertEq(e.message, JSMSG_ILLEGAL_CHARACTER, code);

For sanity, here and in test_eval please add assertEq(e instanceof SyntaxError, true) too.

@@ +174,5 @@
> +
> +// var
> +
> +test("var @");
> +test("var x @");

We should probably add a few for destructuring declarations as well:

test("var [ @");
test("var [ x @");
test("var [ x, @");
test("var { @");
test("var { x @");
test("var { x: @");
test("var { x: y @");
test("var { x: y, @");
test("var { x: y } @");
test("var { x: y } = @");

@@ +278,5 @@
> +test("function* f() { yield* @");
> +test("function* f() { yield* 1 @");
> +
> +test("function* f() { yield\n@");
> +test("function* f() { yield*\n@");

test("function* f() { yield\n*@");

(I can't remember if there's a [no LineTerminator] restriction between yield and * -- if there is this might be invalid.)

@@ +463,5 @@
> +test_no_fun_no_eval("import 'a'; @");
> +
> +// label
> +
> +test("a: @");

Surely this is somewhere in this file, but a skim didn't find it:

test("a @");

@@ +559,5 @@
> +test("({ get p( @");
> +test("({ get p() @");
> +test("({ get p() { @");
> +test("({ get p() {} @");
> +test("({ get p() {}, @");

Probably should add

test("({ get p() {}, } @");

for trailing-comma-ish covering, tho I suppose this is kinda more parenthesis-land or something.

@@ +668,5 @@
> +test("a[ @");
> +test("a[1 @");
> +test("a[1] @");
> +
> +test("a. @");

This was valid (well, with a name after) in E4X.  Good riddance!  :-D

@@ +698,5 @@
> +test("delete a @");
> +test("delete a[ @");
> +test("delete a[b @");
> +test("delete a[b] @");
> +test("delete a[b]; @");

Add copies of all of these with a '(' after the delete.  Parentheses don't affect delete's behavior, but I think there's been implementation oddity with this in the past, so more testing better.

@@ +704,5 @@
> +// void
> +
> +test("void @");
> +test("void a @");
> +test("void a; @");

Same '(' thing here.

@@ +710,5 @@
> +// typeof
> +
> +test("typeof @");
> +test("typeof a @");
> +test("typeof a; @");

And here.

@@ +716,5 @@
> +// -
> +
> +test("- @");
> +test("- 1 @");
> +test("- 1; @");

Add + versions of these.

@@ +818,5 @@
> +test("([ @");
> +test("([a @");
> +test("([a] @");
> +test("([a]) @");
> +test("([a]) => @");

Add ({a}) versions of these.

@@ +885,5 @@
> +test("let ( x = 1, y = 2 ) @");
> +test("let ( x = 1, y = 2 ) { @");
> +test("let ( x = 1, y = 2 ) { x @");
> +test("let ( x = 1, y = 2 ) { x; @");
> +test("let ( x = 1, y = 2 ) { x; } @");

That weird strict-mode case suggests we should have this:

test("let ( x = 1, y = 2) x @");
Attachment #8496096 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8496097 [details] [diff] [review]
Part7: Return error status from TokenStream.getTokenInternal.

Review of attachment 8496097 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/TokenStream.cpp
@@ +1640,1 @@
>      onError();

I guess the assignments above are potentially consumed by onError() or the caller?  ...no, they aren't, consulting -- at least for the former.  Presumably the callers do, but it would be good to confirm this is the exact set of things expected to be sane on error.

onError() is called only here, so we should just fold it into this location, seems to me.

::: js/src/frontend/TokenStream.h
@@ +381,5 @@
>              cursor = (cursor + 1) & ntokensMask;
>              TokenKind tt = currentToken().type;
>              JS_ASSERT(tt != TOK_EOL);
>              *ttp = tt;
> +            return !flags.hadError;

We shouldn't have lookahead that's an error token, once all these patches are done.  By this point in the patch set, can we MOZ_ASSERT(!flags.hadError) and return true?  Or must that happen at the end, following up?  Either way's fine by me.

@@ +397,5 @@
>  
>      bool peekToken(TokenKind *ttp, Modifier modifier = None) {
>          if (lookahead != 0) {
> +            *ttp = tokens[(cursor + 1) & ntokensMask].type;
> +            return !flags.hadError;

Same shouldn't-be-needed, can-assert thing here too.

@@ +404,2 @@
>          ungetToken();
> +        return result;

if (!getTokenInternal(...))
    return false;
ungetToken();
return true;

...at least after some set of patches in this bug.

@@ +408,5 @@
>      TokenPos peekTokenPos(Modifier modifier = None) {
>          if (lookahead != 0)
>              return tokens[(cursor + 1) & ntokensMask].pos;
> +        TokenKind tt;
> +        (void) getTokenInternal(&tt, modifier);

This function needs to be fallible as well, looks like.  We need a patch to handle this case as well.

@@ +429,5 @@
>          // stronger condition than what we are looking for, and we don't need
>          // to return TOK_EOL.
>          if (lookahead != 0 && srcCoords.isOnThisLine(curr.pos.end, lineno)) {
> +            *ttp = tokens[(cursor + 1) & ntokensMask].type;
> +            return !flags.hadError;

Again asserting comment from above.
Attachment #8496097 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8496098 [details] [diff] [review]
Part8: Remove TOK_ERROR from TokenKind.

Review of attachment 8496098 [details] [diff] [review]:
-----------------------------------------------------------------

We still need more followup patches for the various things flagged in all these reviews.  I think they probably should land at the same time as all patches here, for simplicity/sureness of overall correctness.  But this looks good.  Just more to do before we can land anything here.  :-)  Good work on all this!  So nice to see the parser/token stream get better about handling errors correctly.

::: js/src/frontend/TokenStream.cpp
@@ +522,3 @@
>  TokenStream::advance(size_t position)
>  {
>      const char16_t *end = userbuf.base() + position;

We should have a

  MOZ_ASSERT(position <= mozilla::PointerRangeSize(userbuf.base(), userbuf.limit()));

here.  (I *think* it's supposed to be <=, but please double-check me that it's not supposed to be <.  Tests will probably reveal quickly if it's wrong.)  I'm surprised we don't already.

@@ +526,5 @@
>          getChar();
>  
>      Token *cur = &tokens[cursor];
>      cur->pos.begin = userbuf.addressOfNextRawChar() - userbuf.base();
> +    cur->type = TOK_LIMIT;

We shouldn't need TOK_LIMIT, either, as a sentinel when all's said and done, just same as TOK_ERROR.  This is fine to kill off TOK_ERROR, but we shouldn't need to assign anything here...I think.  Let's work at removing this, followup patch.  (If it's just supposed to be a poison value, we should use mozilla/MemoryChecking.h's MOZ_MAKE_MEM_UNDEFINED(addr, size) for it.)

@@ +1640,2 @@
>      onError();
> +    *ttp = TOK_LIMIT;

MOZ_MAKE_MEM_UNDEFINED(ttp, sizeof(*ttp)) and ditto for tp->type, ideally/I think.  Followup patch.

@@ +1779,2 @@
>      // match what it expected, it will unget the token, and the next getToken()
> +    // call will immediately return the just-gotten TOK_LIMIT token again

This "will either" bit, after this bug, turns into always aborting parsing immediately, so this comment needs narrowing.

::: js/src/frontend/TokenStream.h
@@ +109,2 @@
>          pos(0, 0)
>      {

MOZ_MAKE_MEM_UNDEFINED(type, sizeof(type)) inside the constructor, and don't mention type in the member initializer list at all.

@@ +351,5 @@
>      {
>          bool isEOF:1;           // Hit end of file.
>          bool isDirtyLine:1;     // Non-whitespace since start of line.
>          bool sawOctalEscape:1;  // Saw an octal character escape.
> +        bool hadError:1;        // Hit illegal character.

From earlier reading of patches and testing, I think this isn't the only case where hadError occurs.  Specifically, I think this case would hit it:

  Function("/foo");

from an unterminated RegExp.  Maybe ditto for a string literal?  Not sure off top of head.  I think we probably shouldn't change this this way, but rather to say "Hit a syntax error, at start or during a token."  Or something.
Attachment #8496098 - Flags: review?(jwalden+bmo) → review+
Thank you!

Updated patches are almost ready, now testing :)
  https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=c109b3c035e2

Then, I have some questions.

(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #19)
> Comment on attachment 8496092 [details] [diff] [review]
> Part2: Check error in TokenStream.getToken.
> 
> Review of attachment 8496092 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/frontend/Parser.cpp
> @@ +1210,5 @@
> >          return false;
> >      if (tt != TOK_EOF && tt != TOK_EOL && tt != TOK_SEMI && tt != TOK_RC) {
> >          /* Advance the scanner for proper error location reporting. */
> > +        if (!ts.getToken(&tt, TokenStream::Operand))
> > +            return false;
> 
> ts.consumeKnownToken(tt, TokenStream::Operand).  We may need to add an
> optional Modifier argument to that method, to do this.

Correct me if I'm wrong, I guess Modifier has no meaning for consumeKnownToken, because the token should be already gotten (so `lookahead != 0`, and modifier parameter is not used in getToken), otherwise it's not "known token".
Maybe adding `MOZ_ASSERT(lookahead != 0)` to consumeKnownToken is enough?

> @@ +461,3 @@
> >              return true;
> >          ungetToken();
> >          return false;
> 
> This definition isn't really right.  The meaning of the return value was
> true if the provided token was encountered and consumed, false otherwise.  I
> guess maybe we're preserving that here from before, but ugh.  Hopefully a
> subsequent patch is fixing this.

Sorry, I'm not sure I understand what you mean.
Do you mean the change in Part4, or anything else?
Flags: needinfo?(jwalden+bmo)
(In reply to Tooru Fujisawa [:arai] from comment #26)
> Correct me if I'm wrong, I guess Modifier has no meaning for
> consumeKnownToken, because the token should be already gotten (so `lookahead
> != 0`, and modifier parameter is not used in getToken), otherwise it's not
> "known token".
> Maybe adding `MOZ_ASSERT(lookahead != 0)` to consumeKnownToken is enough?

Good point -- adding that is definitely the right thing to do.

> > @@ +461,3 @@
> > >              return true;
> > >          ungetToken();
> > >          return false;
> > 
> > This definition isn't really right.  The meaning of the return value was
> > true if the provided token was encountered and consumed, false otherwise.  I
> > guess maybe we're preserving that here from before, but ugh.  Hopefully a
> > subsequent patch is fixing this.
> 
> Sorry, I'm not sure I understand what you mean.
> Do you mean the change in Part4, or anything else?

Yes, I meant that the incremental state was bad, and that hopefully some subsequent patch here was fixing it.  It looks like Part 4 *does* make said correction, so you're good here.  :-)
Flags: needinfo?(jwalden+bmo)
Thanks a lot!

Changed following methods, as Part6, Part7 and Part8 (inserted before tests. so Part6,7,8 in previous patch series are renumbered as Part9,10,11).
  * tokenStream.peekTokenPos
  * tokenStream.nextTokenEndsExpr
  * PeekToken (AsmJSValidate.cpp)
Also, fixed following method as Part12:
  * Parser.addExprAndGetNextTemplStrToken


Green on try run: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=334746a31a9a


(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #18)
> Comment on attachment 8496091 [details] [diff] [review]
> Part1: Check error in TokenStream.peekToken.
> 
> Review of attachment 8496091 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/frontend/TokenStream.h
> @@ +397,5 @@
> > +    bool peekToken(TokenKind *ttp, Modifier modifier = None) {
> > +        if (lookahead != 0) {
> > +            TokenKind tt = tokens[(cursor + 1) & ntokensMask].type;
> > +            *ttp = tt;
> > +            return tt != TOK_ERROR;
> 
> Ultimately, it seems to me that *if* we have lookahead, it *should* be the
> case that the lookahead is valid, never TOK_ERROR.  Maybe that'll be an easy
> change to make after this patch series is done.

Fixed in Part10, after removing all unnecessary ungetToken in error case.

> @@ +467,5 @@
> >  
> >      bool nextTokenEndsExpr() {
> > +        TokenKind tt;
> > +        if (!peekToken(&tt))
> > +            return false;
> 
> Hm.  This answer is not really right or wrong in this situation.  We should
> probably change this method signature as well to complain immediately if it
> encounters an error, too.  Another patch.

Fixed as Part7.
Attachment #8496091 - Attachment is obsolete: true
Attachment #8504488 - Flags: review?(jwalden+bmo)
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #19)
> Comment on attachment 8496092 [details] [diff] [review]
> Part2: Check error in TokenStream.getToken.
> 
> @@ +1974,2 @@
> >          return false;
> >      return true;
> 
> Just return tokenStream.getToken(...).
> 
> |tt| as passed to this method should be a pointer, not a reference -- this
> looked an awful lot like not checking a TokenKind and accepting something
> dreckish, at first glance.  Followup patch?

Fixed as Part12.

> @@ +441,5 @@
> >          //   is a newline between the next token and the one after that.
> >          // The following test is somewhat expensive but gets these cases (and
> >          // all others) right.
> > +        TokenKind tmp;
> > +        (void) getToken(&tmp, modifier);
> 
> I...assume you're changing this method to be fallible in subsequent patch? 
> Not okay to just ignore error cases here, except if it's a temporary state
> of affairs.

Changed to return TOK_ERROR immediately.
(Of course it's changed again in Part3)
Attachment #8496092 - Attachment is obsolete: true
Attachment #8504489 - Flags: review?(jwalden+bmo)
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #20)
> Comment on attachment 8496093 [details] [diff] [review]
> Part3: Check error in TokenStream.peekTokenSameLine.
> 
> Review of attachment 8496093 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> @@ +433,5 @@
> >          // to return TOK_EOL.
> > +        if (lookahead != 0 && srcCoords.isOnThisLine(curr.pos.end, lineno)) {
> > +            TokenKind tt = tokens[(cursor + 1) & ntokensMask].type;
> > +            *ttp = tt;
> > +            return tt != TOK_ERROR;
> 
> No harm in writing to *ttp in the error case, but we don't want to promise
> this in the API.  It's a momentary API behavior that people shouldn't rely
> on.

Fixed in Part10, by removing error case in `lookahead != 0`.
Attachment #8496093 - Attachment is obsolete: true
Attachment #8504490 - Flags: review?(jwalden+bmo)
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #21)
> Comment on attachment 8496094 [details] [diff] [review]
> Part4: Check error in TokenStream.matchToken.
> 
> Review of attachment 8496094 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> @@ +3702,5 @@
> >              bool ignored;
> > +            bool parsingForInOrOfInit = false;
> > +            if (pc->parsingForInit) {
> > +                bool matched;
> > +                if (!matchInOrOf(&matched, &ignored))
> 
> Move |ignored|'s declaration into this loop.
> 
> Now that we have two outparam bools here, I...have no idea what they each
> mean, when reading.  |ignored| particularly is really bad here.  :-)  I
> think we need to rejigger matchForInOrOf a little harder than we're doing
> here.  I propose we give it this signature:
> 
>     bool matchForInOrOf(bool *isForIn, bool *isForOf);
> 
> where return-false indicates error, return-true indicates no error, and if
> the next token is in/of you get one outparam set to true, one to false. 
> Another possibility is a tristate enum outparam, but that feels a little
> much.  Maybe.  My intuition isn't strong enough to say one thing is
> definitely the right way; please say something if you think some other
> option's better.  :-)

Yeah, it's better than mine.
I also think 2 bool values are sufficient for this, until another keywords are added to `for...???` statement in future :)
Attachment #8496094 - Attachment is obsolete: true
Attachment #8504491 - Flags: review?(jwalden+bmo)
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #22)
> Comment on attachment 8496095 [details] [diff] [review]
> Part5: Check error in TokenStream.matchContextualKeyword.
> 
> Review of attachment 8496095 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks good, modulo the previous review obsoleting some of these changes. 
> :-)  The relevant adjustments should be trivial, tho.
> 
> ::: js/src/frontend/TokenStream.h
> @@ +483,4 @@
> >          TokenKind token;
> >          if (!getToken(&token)) {
> >              ungetToken();
> > +            *matchedp = false;
> 
> The unget and matched-assignment should be unnecessary, remove them (or in a
> followup patch, if necessary).  (We should probably land all these patches
> as one folded-together setup, so we don't have intermediate-state worries.)

Removed them in this patch.
Attachment #8496095 - Attachment is obsolete: true
Attachment #8504492 - Flags: review?(jwalden+bmo)
Error checking in peekTokenPos for `lookahead != 0` case is fixed in Part10.
Attachment #8504493 - Flags: review?(jwalden+bmo)
Attachment #8504494 - Flags: review?(jwalden+bmo)
Attached patch Part8: Check error in PeekToken. (obsolete) — Splinter Review
Modifed PeekToken function to return boolean value, because we'll remove TOK_ERROR later, and it will be better to improve error handling in asm.js parsing.
Attachment #8504495 - Flags: review?(jwalden+bmo)
Added testcases for asm.js.

(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #23)
> Comment on attachment 8496096 [details] [diff] [review]
> Part6: Add Testcase for Illegal character error.
> 
> Review of attachment 8496096 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> @@ +278,5 @@
> > +test("function* f() { yield* @");
> > +test("function* f() { yield* 1 @");
> > +
> > +test("function* f() { yield\n@");
> > +test("function* f() { yield*\n@");
> 
> test("function* f() { yield\n*@");
> 
> (I can't remember if there's a [no LineTerminator] restriction between yield
> and * -- if there is this might be invalid.)

[no LineTerminator here] exists between them :)
So SyntaxError about `*` is thrown before hitting `@`.

> @@ +885,5 @@
> > +test("let ( x = 1, y = 2 ) @");
> > +test("let ( x = 1, y = 2 ) { @");
> > +test("let ( x = 1, y = 2 ) { x @");
> > +test("let ( x = 1, y = 2 ) { x; @");
> > +test("let ( x = 1, y = 2 ) { x; } @");
> 
> That weird strict-mode case suggests we should have this:
> 
> test("let ( x = 1, y = 2) x @");

In strict mode, SyntaxError is thrown before hitting `@`,
So I added test_no_strict(...).
Attachment #8496096 - Attachment is obsolete: true
Attachment #8504496 - Flags: review?(jwalden+bmo)
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #24)
> Comment on attachment 8496097 [details] [diff] [review]
> Part7: Return error status from TokenStream.getTokenInternal.
> 
> Review of attachment 8496097 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/frontend/TokenStream.h
> @@ +381,5 @@
> >              cursor = (cursor + 1) & ntokensMask;
> >              TokenKind tt = currentToken().type;
> >              JS_ASSERT(tt != TOK_EOL);
> >              *ttp = tt;
> > +            return !flags.hadError;
> 
> We shouldn't have lookahead that's an error token, once all these patches
> are done.  By this point in the patch set, can we
> MOZ_ASSERT(!flags.hadError) and return true?  Or must that happen at the
> end, following up?  Either way's fine by me.

Fixed in this patch.

> @@ +408,5 @@
> >      TokenPos peekTokenPos(Modifier modifier = None) {
> >          if (lookahead != 0)
> >              return tokens[(cursor + 1) & ntokensMask].pos;
> > +        TokenKind tt;
> > +        (void) getTokenInternal(&tt, modifier);
> 
> This function needs to be fallible as well, looks like.  We need a patch to
> handle this case as well.

Fixed as Part6.
Attachment #8496097 - Attachment is obsolete: true
Attachment #8504498 - Flags: review?(jwalden+bmo)
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #25)
> Comment on attachment 8496098 [details] [diff] [review]
> Part8: Remove TOK_ERROR from TokenKind.
> 
> Review of attachment 8496098 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> We still need more followup patches for the various things flagged in all
> these reviews.  I think they probably should land at the same time as all
> patches here, for simplicity/sureness of overall correctness.  But this
> looks good.  Just more to do before we can land anything here.  :-)  Good
> work on all this!  So nice to see the parser/token stream get better about
> handling errors correctly.
> 
> ::: js/src/frontend/TokenStream.cpp
> @@ +522,3 @@
> >  TokenStream::advance(size_t position)
> >  {
> >      const char16_t *end = userbuf.base() + position;
> 
> We should have a
> 
>   MOZ_ASSERT(position <= mozilla::PointerRangeSize(userbuf.base(),
> userbuf.limit()));
> 
> here.  (I *think* it's supposed to be <=, but please double-check me that
> it's not supposed to be <.  Tests will probably reveal quickly if it's
> wrong.)  I'm surprised we don't already.

I also think `<=` is appropriate here.

Assertion fails with `<` in asm.js cache.
  https://tbpl.mozilla.org/?tree=Try&rev=4dcbe866a799

> @@ +526,5 @@
> >          getChar();
> >  
> >      Token *cur = &tokens[cursor];
> >      cur->pos.begin = userbuf.addressOfNextRawChar() - userbuf.base();
> > +    cur->type = TOK_LIMIT;
> 
> We shouldn't need TOK_LIMIT, either, as a sentinel when all's said and done,
> just same as TOK_ERROR.  This is fine to kill off TOK_ERROR, but we
> shouldn't need to assign anything here...I think.  Let's work at removing
> this, followup patch.  (If it's just supposed to be a poison value, we
> should use mozilla/MemoryChecking.h's MOZ_MAKE_MEM_UNDEFINED(addr, size) for
> it.)

Thanks, replaced with MOZ_MAKE_MEM_UNDEFINED.
Attachment #8496098 - Attachment is obsolete: true
Attachment #8504499 - Flags: review?(jwalden+bmo)
Attachment #8504488 - Flags: review?(jwalden+bmo) → review+
Attachment #8504489 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8504490 [details] [diff] [review]
Part3: Check error in TokenStream.peekTokenSameLine.

Review of attachment 8504490 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/TokenStream.h
@@ +421,5 @@
> +    // return true and store TOK_EOL in |*ttp|.  In that case, no token with
> +    // TOK_EOL is actually created, just a TOK_EOL TokenKind is returned, and
> +    // currentToken() shouldn't be consulted.  (This is the only place TOK_EOL
> +    // is produced.)
> +    MOZ_ALWAYS_INLINE bool peekTokenSameLine(TokenKind *ttp, Modifier modifier = None) {

Put a line break after bool here.  Keeps things a little more readable, I think.
Attachment #8504490 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8504492 [details] [diff] [review]
Part5: Check error in TokenStream.matchContextualKeyword.

Review of attachment 8504492 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/TokenStream.h
@@ +483,2 @@
>              return false;
>          }

Remove the braces here, now that this is just a single line.
Attachment #8504492 - Flags: review?(jwalden+bmo) → review+
Attachment #8504493 - Flags: review?(jwalden+bmo) → review+
Attachment #8504494 - Flags: review?(jwalden+bmo) → review+
Attachment #8504495 - Flags: review?(jwalden+bmo) → review+
Attachment #8504499 - Flags: review?(jwalden+bmo) → review+
Attachment #8504500 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8504496 [details] [diff] [review]
Part9: Add Testcase for Illegal character error.

Review of attachment 8504496 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit-test/tests/basic/syntax-error-illegal-character.js
@@ +189,5 @@
> +test("var x = 1 + 2, y, z; @");
> +
> +test("var [ @");
> +test("var [ x @");
> +test("var [ x, @");

Another testcase occurs to me, looking at this:

test("var [ x, ... @");

We don't seem to have spread-as-target tested anywhere without this.  (Although it might have been in the bug that precipitated this.  Still worth doing so readers of this test don't have to wonder if a case slipped through the cracks.)

@@ +214,5 @@
> +test("let x = 1 + 2, y, z; @");
> +
> +test("let [ @");
> +test("let [ x @");
> +test("let [ x, @");

test("let [ x, ... @");

@@ +239,5 @@
> +test("const x = 1 + 2, y, z; @");
> +
> +test("const [ @");
> +test("const [ x @");
> +test("const [ x, @");

test("const [ x, ... @");

@@ +885,5 @@
> +test("({ @");
> +test("({a @");
> +test("({a} @");
> +test("({a}) @");
> +test("({a}) => @");

So {} as target doesn't quite have the same syntax as [] as target.  This only works because of "shorthand" targeting inside object patterns.  We want testing of the other sorts of things that object patterns permit, namely targets that are specified distinct from the property name:

test("({a: @");
test("({a: b @");
test("({a: b, @");
test("({a: b }) @");
test("({a: b }) => @");
Attachment #8504496 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8504491 [details] [diff] [review]
Part4: Check error in TokenStream.matchToken.

Review of attachment 8504491 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/Parser.cpp
@@ +700,5 @@
>      if (pn) {
> +        bool matched;
> +        if (!tokenStream.matchToken(&matched, TOK_EOF))
> +            return null();
> +        if (!matched) {

matchToken plus peekToken seems like a strange way to detect this, but no need to adjust in these patches.

@@ +825,5 @@
>  
> +    bool matched;
> +    if (!tokenStream.matchToken(&matched, TOK_EOF))
> +        return null();
> +    if (!matched) {

Likewise here.

@@ +3788,3 @@
>  
> +            // ...all the rest of the current loop code, with every
> +            // |continue| replaced with a |break|

This isn't necessary.  :-)

@@ +4325,5 @@
>      //   http://bugzilla.mozilla.org/show_bug.cgi?id=238945
>      // ES3 and ES5 disagreed, but ES6 conforms to Web reality:
>      //   https://bugs.ecmascript.org/show_bug.cgi?id=157
> +    bool matched;
> +    if (!tokenStream.matchToken(&matched, TOK_SEMI))

I'd name this |ignored|, to be clear that we don't care whether a match happens -- restating the comment above, in the variable names used.

@@ +4367,5 @@
>      }
> +    tokenStream.ungetToken();
> +    *isForInp = false;
> +    *isForOfp = false;
> +    return true;

Maybe:

    TokenKind tt;
    if (!tokenStream.getToken(&tt))
        return false;
    *isForInp = tt == TOK_IN;
    *isForOfp = tt == TOK_NAME && tokenStream.currentToken().name() == context->names().of;
    if (!*isForInp && !*isForOfp)
        tokenStream.ungetToken();
    return true;

Seems slightly shorter, seems slightly clearer how each value is determined (without having to read through all three arms, separately).

@@ +5690,5 @@
> +
> +    bool matched;
> +    if (!tokenStream.matchToken(&matched, TOK_COMMA))
> +        return null();
> +    if (matched) {

Separate patch, separate bug, we should invert this:

if (!matched)
    return pn;

Node seq = handler.newList(PNK_COMMA, pn);
...

which is slightly more readable because you don't have to consider all the comma stuff to understand how just a single expression is parsed.  Also it keeps code a little less indented, always helpful.

@@ +7458,4 @@
>          // ES6 array comprehension.
>          return arrayComprehension(begin);
>      } else {
> +        tokenStream.ungetToken();

So this is pre-existing, and I don't know what to do about it, but I guess we have a case here of

  getToken(..., non-default modifiers M1);
  ungetToken();
  getToken(..., non-default modifiers M1);

If instead the second instance were with different modifiers M2, we would have a bug, because lookahead-storage would mean the M2 modifiers wouldn't actually be used.  We don't have that, but it's not obvious with the context in Splinter.

I don't see an obvious way to avoid having to potentially get the same token, twice, here.  So let's do this as you have it, tho it's not 100% happymaking.

::: js/src/frontend/TokenStream.h
@@ +462,2 @@
>              return false;
>          }

Remove the extra braces...or did I already say to do this once already?  Dunno, I'm working through jetlag right now.  :-)
Attachment #8504491 - Flags: review?(jwalden+bmo) → review+
Attachment #8504498 - Flags: review?(jwalden+bmo) → review+
Blocks: 1089038
Blocks: 1089045
Rebased all patches.
Carrying over r+ from previous patch series, except Part 3-5 and 9.

Green on try run: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1b5f58c21fa0
Attachment #8504488 - Attachment is obsolete: true
Attachment #8511453 - Flags: review+
Attachment #8504489 - Attachment is obsolete: true
Attachment #8511454 - Flags: review+
Attachment #8504490 - Attachment is obsolete: true
Attachment #8511455 - Flags: review?(jwalden+bmo)
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #43)
> Comment on attachment 8504491 [details] [diff] [review]
> Part4: Check error in TokenStream.matchToken.
> 
> Review of attachment 8504491 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/frontend/Parser.cpp
> @@ +700,5 @@
> >      if (pn) {
> > +        bool matched;
> > +        if (!tokenStream.matchToken(&matched, TOK_EOF))
> > +            return null();
> > +        if (!matched) {
> 
> matchToken plus peekToken seems like a strange way to detect this, but no
> need to adjust in these patches.
> 
> @@ +825,5 @@
> >  
> > +    bool matched;
> > +    if (!tokenStream.matchToken(&matched, TOK_EOF))
> > +        return null();
> > +    if (!matched) {
> 
> Likewise here.

Filed as bug 1089038.

> @@ +4367,5 @@
> >      }
> > +    tokenStream.ungetToken();
> > +    *isForInp = false;
> > +    *isForOfp = false;
> > +    return true;
> 
> Maybe:
> 
>     TokenKind tt;
>     if (!tokenStream.getToken(&tt))
>         return false;
>     *isForInp = tt == TOK_IN;
>     *isForOfp = tt == TOK_NAME && tokenStream.currentToken().name() ==
> context->names().of;
>     if (!*isForInp && !*isForOfp)
>         tokenStream.ungetToken();
>     return true;
> 
> Seems slightly shorter, seems slightly clearer how each value is determined
> (without having to read through all three arms, separately).

Wow, cool :)

> @@ +5690,5 @@
> > +
> > +    bool matched;
> > +    if (!tokenStream.matchToken(&matched, TOK_COMMA))
> > +        return null();
> > +    if (matched) {
> 
> Separate patch, separate bug, we should invert this:
> 
> if (!matched)
>     return pn;
> 
> Node seq = handler.newList(PNK_COMMA, pn);
> ...
> 
> which is slightly more readable because you don't have to consider all the
> comma stuff to understand how just a single expression is parsed.  Also it
> keeps code a little less indented, always helpful.

I'll file this after this patch is landed.

> @@ +7458,4 @@
> >          // ES6 array comprehension.
> >          return arrayComprehension(begin);
> >      } else {
> > +        tokenStream.ungetToken();
> 
> So this is pre-existing, and I don't know what to do about it, but I guess
> we have a case here of
> 
>   getToken(..., non-default modifiers M1);
>   ungetToken();
>   getToken(..., non-default modifiers M1);
> 
> If instead the second instance were with different modifiers M2, we would
> have a bug, because lookahead-storage would mean the M2 modifiers wouldn't
> actually be used.  We don't have that, but it's not obvious with the context
> in Splinter.
> 
> I don't see an obvious way to avoid having to potentially get the same
> token, twice, here.  So let's do this as you have it, tho it's not 100%
> happymaking.

Filed as bug 1089045.
Attachment #8504491 - Attachment is obsolete: true
Attachment #8511456 - Flags: review?(jwalden+bmo)
Attachment #8504492 - Attachment is obsolete: true
Attachment #8511457 - Flags: review?(jwalden+bmo)
Attachment #8504493 - Attachment is obsolete: true
Attachment #8511458 - Flags: review+
Attachment #8504495 - Attachment is obsolete: true
Attachment #8511461 - Flags: review+
added "var [ x, ... @" cases and "({a: b }) => @" cases.
Attachment #8504496 - Attachment is obsolete: true
Attachment #8511462 - Flags: review?(jwalden+bmo)
Attachment #8504499 - Attachment is obsolete: true
Attachment #8511464 - Flags: review+
Attachment #8511455 - Flags: review?(jwalden+bmo) → review+
Attachment #8511456 - Flags: review?(jwalden+bmo) → review+
Attachment #8511457 - Flags: review?(jwalden+bmo) → review+
Attachment #8511462 - Flags: review?(jwalden+bmo) → review+
Thanks again!
Keywords: checkin-needed
Fixed following bustage.
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=3349772&repo=mozilla-inbound
> 02:12:48 ERROR - /builds/slave/l64-br-haz_m-in_dep-0000000000/build/source/js/src/asmjs/AsmJSValidate.cpp:3878:38: error: no matching function for call to 'js::frontend::TokenStream::matchToken(js::frontend::TokenKind)'
> 02:12:48 ERROR - /builds/slave/l64-br-haz_m-in_dep-0000000000/build/source/js/src/asmjs/AsmJSValidate.cpp:3884:36: error: no matching function for call to
Attachment #8512532 - Flags: review+
sorry had to back this out since without part 13 it caused a bustage and we were seeing multiple issues caused by this or other patches
Duplicate of this bug: 1075242
(In reply to Tooru Fujisawa [:arai] from comment #47)
> (In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #43)
> > @@ +5690,5 @@
> > > +
> > > +    bool matched;
> > > +    if (!tokenStream.matchToken(&matched, TOK_COMMA))
> > > +        return null();
> > > +    if (matched) {
> > 
> > Separate patch, separate bug, we should invert this:
> > 
> > if (!matched)
> >     return pn;
> > 
> > Node seq = handler.newList(PNK_COMMA, pn);
> > ...
> > 
> > which is slightly more readable because you don't have to consider all the
> > comma stuff to understand how just a single expression is parsed.  Also it
> > keeps code a little less indented, always helpful.
> 
> I'll file this after this patch is landed.

Filed as bug 1090695.
You need to log in before you can comment on or make changes to this bug.