TokenStream::getToken may return wrong token when passed modifier is different from the modifier used to get the token in lookahead storage.

RESOLVED FIXED in Firefox 43

Status

()

Core
JavaScript Engine
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: arai, Assigned: arai)

Tracking

Trunk
mozilla43
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox43 fixed)

Details

Attachments

(7 attachments, 11 obsolete attachments)

6.02 KB, patch
Waldo
: review-
Details | Diff | Splinter Review
4.41 KB, patch
Details | Diff | Splinter Review
2.80 KB, text/plain
Details
1.98 KB, patch
Details | Diff | Splinter Review
4.43 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
62.05 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
54.46 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
(Assignee)

Description

4 years ago
Derived from bug 1066827 comment #43

TokenStream::getToken returns token from lookahead-storage without checking modifier passed as argument, when TokenStream::ungetToken was called just before it.
So, in following case, second getToken will return wrong token:

    getToken(..., non-default modifiers M1);
    ungetToken();
    getToken(..., non-default modifiers M2);

If that case happens in normal path (in future? not sure), we should invalidate lookahead-storage, move position back, and tokenize again.
If not, just associating modifier to lookahead-storage and assert that they are same, in debug mode, will solve it.
(Assignee)

Updated

4 years ago
Depends on: 1066827
(Assignee)

Comment 1

4 years ago
Created attachment 8514101 [details] [diff] [review]
Check modifier used to get the token when returing it from lookahead-storage in debug mode.

Just for testing, I added modifier property to Token struct, and compare it when returning it from lookahead-strorage, only in debug build,
no assersion failure happens :)
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=07a9fb024f09
I asked about this a while ago on the mailing list: https://groups.google.com/forum/#!topic/mozilla.dev.tech.js-engine.internals/2JLH5jRcr7E

jorendorff and Waldo basically said "it shouldn't actually happen in practice". But assertions are a good idea. Some comments explaining what's going on would be good, too.
(Assignee)

Comment 3

4 years ago
(In reply to Nicholas Nethercote [:njn] from comment #2)
> I asked about this a while ago on the mailing list:
> https://groups.google.com/forum/#!topic/mozilla.dev.tech.js-engine.internals/
> 2JLH5jRcr7E
> 
> jorendorff and Waldo basically said "it shouldn't actually happen in
> practice". But assertions are a good idea. Some comments explaining what's
> going on would be good, too.

Thanks, I'll add comments and post updated patch.
(Assignee)

Comment 4

4 years ago
Created attachment 8519342 [details] [diff] [review]
Check modifier used to get the token when returing it from lookahead-storage in debug mode.

Added comment to previous attachment.

Green on try run: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=d5606c47a981
Attachment #8514101 - Attachment is obsolete: true
Attachment #8519342 - Flags: review?(jwalden+bmo)

Comment 5

4 years ago
Comment on attachment 8519342 [details] [diff] [review]
Check modifier used to get the token when returing it from lookahead-storage in debug mode.

Review of attachment 8519342 [details] [diff] [review]:
-----------------------------------------------------------------

I led you slightly astray here -- you can assert a little more than you're actually asserting here.  Sorry about that!

::: js/src/frontend/TokenStream.cpp
@@ +1628,5 @@
> +#ifdef DEBUG
> +    // Remember modifier used to get this token, to compare it later when this
> +    // token is pushed back to lookahead-storage, and returned from it again.
> +    // When passed modifier is different from it and both of them are not None,
> +    // content of lookahead-storage can be wrong for that call.

A little shorter:

    // Save the modifier used to get this token, so that if an ungetToken()
    // occurs and then the token is re-gotten (or peeked, etc.), we can assert
    // that both gets have used the same modifiers.

This also fixes a couple grammar nits.  Typically it'd be "Remember *the* modifier used...".  And it'd be "When *the* passed modifier...".  And it'd be "*the* content of lookahead-storage".  (Although, I think talking about lookahead storage at all is probably undesirable.  The point of the comment is to explain the API misuse that we want to detect.  How we implement that, is less important.  And the basic idea, to save the modifier from each get and verify it for each reget, is sufficiently obvious enough [to me at least] to not need explanation.)

@@ +1632,5 @@
> +    // content of lookahead-storage can be wrong for that call.
> +    //
> +    //   getToken(..., non-default modifiers M1);
> +    //   ungetToken();
> +    //   getToken(..., non-default modifiers M2);

My previous comment constrained the situation more than it should have been constrained.  The modifiers don't have to be non-default at all -- they simply have to be different.  So this is a problem:

   getToken(...); // implicit None
   ungetToken();
   getToken(..., anything but None);

And so is this:

   getToken(..., anything but None);
   ungetToken();
   getToken(...); // implicit None

So "and both of them are not None" should be removed here.

@@ +1634,5 @@
> +    //   getToken(..., non-default modifiers M1);
> +    //   ungetToken();
> +    //   getToken(..., non-default modifiers M2);
> +    //
> +    // Check this only in debug mode, since it shouldn't happen in practice.

The shorter explanation's mention of "assert" clarifies why this only happens in debug builds, so no need for this trailing sentence.  (Also the field doesn't even exist in non-debug builds, so it's in some sense not even necessary to mention it.)

::: js/src/frontend/TokenStream.h
@@ +91,5 @@
>          RegExpFlag      reflags;        // regexp flags; use tokenbuf to access
>                                          //   regexp chars
>      } u;
> +#ifdef DEBUG
> +    uint8_t modifier;                   // Modifier used to get this token

Make this a Modifier.  I'm guessing this was uint8_t so you could minimize size, but struct packing means this probably gets padded to a full pointer anyway, so Modifier wouldn't change size.

And past that, we don't actually care about trivial space consumption like this in debug builds.  :-)

@@ +381,5 @@
>              cursor = (cursor + 1) & ntokensMask;
>              TokenKind tt = currentToken().type;
>              MOZ_ASSERT(tt != TOK_EOL);
> +            MOZ_ASSERT(modifier == None || currentToken().modifier == None ||
> +                       modifier == currentToken().modifier);

With the adjustments for my over-cabining of the API misuse to detect, this should be (with an error message to make the mistake clearer):

            MOZ_ASSERT(modifier == currentToken().modifier,
                       "this token was previously looked up with a different "
                       "modifier, potentially making tokenization non-"
                       "deterministic");

@@ +400,5 @@
>      bool peekToken(TokenKind *ttp, Modifier modifier = None) {
>          if (lookahead > 0) {
>              MOZ_ASSERT(!flags.hadError);
> +            MOZ_ASSERT(modifier == None || tokens[(cursor + 1) & ntokensMask].modifier == None ||
> +                       modifier == tokens[(cursor + 1) & ntokensMask].modifier);

Same assertion adjustment as above.

@@ +420,5 @@
>              MOZ_ASSERT(lookahead != 0);
>          } else {
>              MOZ_ASSERT(!flags.hadError);
> +            MOZ_ASSERT(modifier == None || tokens[(cursor + 1) & ntokensMask].modifier == None ||
> +                       modifier == tokens[(cursor + 1) & ntokensMask].modifier);

And here.

@@ +444,5 @@
>          // to return TOK_EOL.
>          if (lookahead != 0 && srcCoords.isOnThisLine(curr.pos.end, lineno)) {
>              MOZ_ASSERT(!flags.hadError);
> +            MOZ_ASSERT(modifier == None || tokens[(cursor + 1) & ntokensMask].modifier == None ||
> +                       modifier == tokens[(cursor + 1) & ntokensMask].modifier);

And here.
Attachment #8519342 - Flags: review?(jwalden+bmo) → review-
(Assignee)

Comment 6

4 years ago
Thank you for reviewing!

(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #5)
> Comment on attachment 8519342 [details] [diff] [review]
> Check modifier used to get the token when returing it from lookahead-storage
> in debug mode.
> 
> Review of attachment 8519342 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> A little shorter:
> 
>     // Save the modifier used to get this token, so that if an ungetToken()
>     // occurs and then the token is re-gotten (or peeked, etc.), we can
> assert
>     // that both gets have used the same modifiers.
> 
> This also fixes a couple grammar nits.  Typically it'd be "Remember *the*
> modifier used...".  And it'd be "When *the* passed modifier...".  And it'd
> be "*the* content of lookahead-storage".  (Although, I think talking about
> lookahead storage at all is probably undesirable.  The point of the comment
> is to explain the API misuse that we want to detect.  How we implement that,
> is less important.  And the basic idea, to save the modifier from each get
> and verify it for each reget, is sufficiently obvious enough [to me at
> least] to not need explanation.)

Thanks! I can learn English by contributing :)

Using "ungetToken()" in the description is nice,
and I guess it's clear enough without any sample code in the comment.

> ::: js/src/frontend/TokenStream.h
> @@ +91,5 @@
> >          RegExpFlag      reflags;        // regexp flags; use tokenbuf to access
> >                                          //   regexp chars
> >      } u;
> > +#ifdef DEBUG
> > +    uint8_t modifier;                   // Modifier used to get this token
> 
> Make this a Modifier.  I'm guessing this was uint8_t so you could minimize
> size, but struct packing means this probably gets padded to a full pointer
> anyway, so Modifier wouldn't change size.
> 
> And past that, we don't actually care about trivial space consumption like
> this in debug builds.  :-)

Sorry, I should've mentioned it.
It seems that forward declaration of enum is not allowed in the mozilla code.
  https://developer.mozilla.org/en-US/docs/Using_CXX_in_Mozilla_code

Modifier is declared in TokenStream class and they are referred as TokenStream::Operand, etc,
and TokenStream class uses Token struct in its definition,
I don't know any other way than just using uint8_t for modifier property of Token,
do you have any idea?

> @@ +381,5 @@
> >              cursor = (cursor + 1) & ntokensMask;
> >              TokenKind tt = currentToken().type;
> >              MOZ_ASSERT(tt != TOK_EOL);
> > +            MOZ_ASSERT(modifier == None || currentToken().modifier == None ||
> > +                       modifier == currentToken().modifier);
> 
> With the adjustments for my over-cabining of the API misuse to detect, this
> should be (with an error message to make the mistake clearer):
> 
>             MOZ_ASSERT(modifier == currentToken().modifier,
>                        "this token was previously looked up with a different
> "
>                        "modifier, potentially making tokenization non-"
>                        "deterministic");

Okay, fixed locally.
Flags: needinfo?(jwalden+bmo)

Comment 7

4 years ago
Created attachment 8523305 [details] [diff] [review]
Extra bits to make Token::modifier be a Modifier

Hm, that's unfortunate.  Still, well-typed seems better than not.  So let's move Modifier into Token, then have TokenStream include Modifier as a typedef, and have static constexprs to bring the various enum initializers into scope, like here.
Flags: needinfo?(jwalden+bmo)
(Assignee)

Comment 8

4 years ago
Created attachment 8523377 [details]
assertion failure list

Thank you very much for your advice. yeah, cool solution :)

Then, assertion fails :(
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=78248dc60c6b

Currently, some tokens are got both with None and Operand (see attached text)
Some of them seem to be really wrong (especially, peekTokenSameLine for TOK_DIV, with Operand),
but not sure all of them are.

Anyway, I'll look into them.

Comment 9

4 years ago
Created attachment 8523383 [details] [diff] [review]
Partial start on supplying consistent modifiers

I looked at this some, because I was curious how we were screwing this up.  Looks like, sadly, it's not always the easiest thing in the world to eyeball.  :-(  On the plus side, these failures (and probably most of them) are just hit by starting up the JS shell (presumably in self-hosted code), so they're at least quite prominent.

I did this for a few cases, most of which were just use the right modifier, fairly simple.  Then I hit forStatement(), for the |for (;;)| case.  That one's not actually solvable (I think, didn't test this because I couldn't get past self-hosting!) by replacing MUST_MATCH_TOKEN with the expanded form with a different modifier.  The problem is semicolons in that case (and I'd expect others beyond it, but maybe for-loops are just that crazy) are eaten up with inconsistent modifiers:

for (;;) {}

In this case, the semicolon is where an expression/operand is expected, so we'd use TokenStream::Operand.

for (expr; false; ) {}

But here, the first semicolon is after an expression parsed by expr().  And if we look at expr(), that keeps glomming up AssignmentExpressions in a comma-separated list as long as it can.  So the last match is for a *comma*, with the non-expression None modifier.

So the first (and second) semicolons in for-statements have to be parsed with different modifiers.  The second can be fixed by doing a getToken(, Operand), then having the not-a-TOK_SEMI case ungetToken() and do a full expr() followed by matching a semicolon in None mode.  The first is a bit harder.  Maybe you can hack around that testing |pn1| or |forLetDecl|.  Not sure.

Anyway, that's as far as I got doing this before the rat-hole started, and I decided to get back to other work.  :-)
(Assignee)

Updated

4 years ago
Depends on: 1099956
Summary: TokenStream::getToken may return wrong token when non-default modifier was passed and there is lookahead with different non-default modifier → TokenStream::getToken may return wrong token when passed modifier is different from the modifier used to get the token in lookahead storage.
(Assignee)

Comment 10

4 years ago
Okay, for-head seems to be solved locally.

Then, Automatic Semicolon Insertion interacts badly with this assertion.

Here is an example:

  func1()
  func2()

In this case, "func2" will be got with None modifier while parsing first line as expressionStatement, and semicolon is inserted before "func2".
Then, "func2" should be got with Operand modifier, but it's already got with None.

So, we should inactivate this assertion (or assert more loose condition, just like first patch) when semicolon is inserted just before the token.
(Assignee)

Comment 11

3 years ago
finally it passes the tests :)
  https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=674aaf9d0f2a
maybe need some more brush up though.
Depends on: 1105608, 1101265
(Assignee)

Comment 12

3 years ago
Created attachment 8537182 [details] [diff] [review]
Part 1: Supply consistent modifiers to TokenStream.

Draft patch for supplying consitent modifier and checking it.

To determine appropriate modifier, I added parameters to some methods.
Also, added 2 exceptional things.

(1) `ExprFlags *exprFlagsOut` in expression related methods

ExprFlags stores the information about exceptional case while parsing an expression.
Currently following 1 case exists:

> // Parsed the yield without any operand, and indicates that the next token is
> // alrady gotten with TokenStream::Operand modifier.
> static const ExprFlags EXPR_FLAG_YIELD_WITHOUT_OPERAND = 1;

There was one more case for TOK_RP in empty arrow function parameter, but it was gone in bug 1101265 :D

Generally, the token next to an expression should be gotten with None modifier,
but in above case, it's already gotten with Operand modifier,
so caller function need to know the case and change the modifier to pass to
the next getToken call.


(2) TokenStream::Modifier parameter in function definition related methods

In those case, TOK_LP next to get/set is already gotten with TokenStream::KeywordIsName.

> ({
>   get() {
>   },
>   set() {
>   }
> })

Function definition related methods need to know it to get TOK_LP token.


(3) TokenStream::Modifier parameter in matchInOrOf/MatchOrInsertSemicolon

Those method and function need modifier to get the next token.


(4) TokenStream::Modifier parameter in consumeKnownToken

Now it needs modifier parameter to check assertion in matchToken method.


(5) MUST_MATCH_TOKEN_MOD macro

MUST_MATCH_TOKEN macro with TokenStream::Modifier.
original MUST_MATCH_TOKEN now calls MUST_MATCH_TOKEN_MOD with TokenStream::None.


(6: exception1) TokenStream::insertSemicolon/Token.isSemicolonInserted

As noted in comment #10, it asserts special condition when a semicolon is inserted.
When a semicolon is inserted just before a token, the token is already gotten with None modifier,
but it will be gotten with Operand in next statement.

TokenStream::insertSemicolon methods sets Token.isSemicolonInserted property of
the next token in the lookahead storage.
Those codes are debug-only since it's used only in the assertion code,
and have no effect for actual parsing.


(7: exception2) TokenStream::hasLookahead/TokenStream::getLookaheadModifier

ModuleCompiler::fail method uses those methods to directly access to
the lookahead storage and its modifier to determine what modifier should be
passed to peekTokenPos method, because the method is called from so much places,
and peekTokenPos is not used for parsing, but for reporting error.
Attachment #8537182 - Flags: feedback?(jwalden+bmo)
(Assignee)

Comment 13

3 years ago
Created attachment 8537185 [details] [diff] [review]
Part 2: Tests for modifiers.

This reuses the testcases of bug 1066827.

Green on try run: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=73dadcb657ae
Attachment #8537185 - Flags: feedback?(jwalden+bmo)
(Assignee)

Comment 14

3 years ago
Created attachment 8547024 [details] [diff] [review]
(Plan B) Part 1: Supply consistent modifiers to TokenStream.

Plan B.

This patch removes `ExprFlags` from previous one, treats "yield without any operand" as "insert yield operand virtually", and handles it in TokenStream.
Of course it's not correct per grammar, but we don't need to modify so much code/function parameter just for debug mode.

Which do you think is better?

Green on try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=34ebac585cba
Attachment #8547024 - Flags: feedback?(jwalden+bmo)
(Assignee)

Comment 15

3 years ago
Created attachment 8558956 [details] [diff] [review]
(Plan C) Part 1: Supply consistent modifiers to TokenStream.

Plan C.
The simplest one :)

To solve (1), (2), (6) in comment #12, added TokenStream::addModifierException and Token.modifierExceptions, and removed `modifier` parameters from Parser methods, TokenStream::insertSemicolon, and Token.isSemicolonInserted.

If we unget some token and going to get the token with different modifier, call TokenStream::addModifierException to notify TokenStream about the exceptional case, so we can test the exception is valid and has no ambiguity (assertions in TokenStream::addModifierException), and treat those exceptions later (TokenStream::assertModifier).

Green on try runs:
  https://treeherder.mozilla.org/#/jobs?repo=try&revision=1d8e3133e149 (timeout on ggc)
  https://treeherder.mozilla.org/#/jobs?repo=try&revision=5253f444c347 (separated a test)
Attachment #8547024 - Attachment is obsolete: true
Attachment #8547024 - Flags: feedback?(jwalden+bmo)
Attachment #8558956 - Flags: feedback?(jwalden+bmo)
(Assignee)

Comment 16

3 years ago
Created attachment 8558957 [details] [diff] [review]
Part 2: Tests for modifiers.

rebased
Attachment #8537185 - Attachment is obsolete: true
Attachment #8537185 - Flags: feedback?(jwalden+bmo)
Attachment #8558957 - Flags: feedback?(jwalden+bmo)
(Assignee)

Comment 17

3 years ago
Created attachment 8574332 [details] [diff] [review]
Part 1: Supply consistent modifiers to TokenStream.

Rebased, mainly for ES6 class.
Now I think Plan C is the best in term of maintainability (supporting `static` in class definition can be done by just adding one line), so obsoleting Plan A.

Green on try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7abd9b7d34af
Attachment #8537182 - Attachment is obsolete: true
Attachment #8558956 - Attachment is obsolete: true
Attachment #8537182 - Flags: feedback?(jwalden+bmo)
Attachment #8558956 - Flags: feedback?(jwalden+bmo)
Attachment #8574332 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 18

3 years ago
Created attachment 8574333 [details] [diff] [review]
Part 2: Tests for modifiers.
Attachment #8558957 - Attachment is obsolete: true
Attachment #8558957 - Flags: feedback?(jwalden+bmo)
Attachment #8574333 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 19

3 years ago
Created attachment 8574334 [details] [diff] [review]
Part 3: Add parser test for class syntax.

Added testcases for ES6 class.
Attachment #8574334 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 20

3 years ago
Created attachment 8601028 [details] [diff] [review]
Part 3: Add parser test for ES6 class.

Added the non-Nightly check, class expression, and extend.

Green on try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e88c9a0091c9
Assignee: nobody → arai.unmht
Attachment #8574334 - Attachment is obsolete: true
Attachment #8574334 - Flags: review?(jwalden+bmo)
Attachment #8601028 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 21

3 years ago
Created attachment 8601032 [details] [diff] [review]
Part 1: Supply consistent modifiers to TokenStream.

Might be better to update Part 1 as well, no major change, but style change happened from previous one.
Attachment #8574332 - Attachment is obsolete: true
Attachment #8574332 - Flags: review?(jwalden+bmo)
Attachment #8601032 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 22

3 years ago
Created attachment 8614443 [details] [diff] [review]
Part 1: Supply consistent modifiers to TokenStream.

just rebasing Part 1 and Part 2.

Green on try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a8451f1a8e72
Attachment #8601032 - Attachment is obsolete: true
Attachment #8601032 - Flags: review?(jwalden+bmo)
Attachment #8614443 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 23

3 years ago
Created attachment 8614444 [details] [diff] [review]
Part 2: Tests for modifiers.
Attachment #8574333 - Attachment is obsolete: true
Attachment #8574333 - Flags: review?(jwalden+bmo)
Attachment #8614444 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 24

3 years ago
I'll post rebased patches this weekend
try is running: https://treeherder.mozilla.org/#/jobs?repo=try&revision=69c22753f630
(Assignee)

Comment 25

3 years ago
Created attachment 8644574 [details] [diff] [review]
Part 1: Supply consistent modifiers to TokenStream.

Rebased Part 1.
no changes in Part 2 and 3.
Attachment #8614443 - Attachment is obsolete: true
Attachment #8614443 - Flags: review?(jwalden+bmo)
Attachment #8644574 - Flags: review?(jwalden+bmo)
Comment on attachment 8644574 [details] [diff] [review]
Part 1: Supply consistent modifiers to TokenStream.

Review of attachment 8644574 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the horrifically long delay on this.  Idea is good, execution is good, just a few little things still.

::: js/src/asmjs/AsmJSValidate.cpp
@@ +1605,5 @@
>          // delayed until the compilation is off the stack and more memory can be freed.
>          gc::AutoSuppressGC nogc(cx_);
>          TokenPos pos;
> +        TokenStream::Modifier modifier = tokenStream().hasLookahead() ? tokenStream().getLookaheadModifier() : TokenStream::None;
> +        if (!tokenStream().peekTokenPos(&pos, modifier))

This is sheer garbage.  :-)  We shouldn't be flailing about like a drunk on roller skates to get a position when we're given a null node, we should be erroring out before this if the node is null.

I'll let this slide for now, but I *really* want a followup bug to remove this code.  Skimming through the asm.js code, I think overwhelmingly there's no issue with this, and you could just make the change.  But I see a few spots to hit:

js> function cH(bl) { "use asm"; if (bl & 0xffff) return; }
Hit MOZ_CRASH(hi) at /home/jwalden/moz/slots/js/src/asmjs/AsmJSValidate.cpp:1602

js> function f(b) { 'use asm'; 'use fksjdkfls'; }
Hit MOZ_CRASH(hi) at /home/jwalden/moz/slots/js/src/asmjs/AsmJSValidate.cpp:1602

and CheckFuncPtrTables, CheckModuleReturn, CheckModule, non-exhaustively.  But get rid of this nonsense ASAP.  :-)

::: js/src/frontend/TokenStream.h
@@ +99,5 @@
> +        None,           // Normal operation.
> +        Operand,        // Looking for an operand, not an operator.  In
> +                        //   practice, this means that when '/' is seen,
> +                        //   we look for a regexp instead of just returning
> +                        //   TOK_DIV.

I'd rather we not indented extra for the succeeding lines here.  And, generally, I'd rather see

enum Modifier
{
    // Normal operation.
    None,

    // ...
    Operand,

    // ...
    KeywordIsName,

    // ...
    TemplateTail,
};

for everything, with line breaks between everything and initializers clearly on their own lines.

@@ +101,5 @@
> +                        //   practice, this means that when '/' is seen,
> +                        //   we look for a regexp instead of just returning
> +                        //   TOK_DIV.
> +        KeywordIsName,  // Treat keywords as names by returning TOK_NAME.
> +        TemplateTail,   // Treat next characters as part of a template string

This is a little brief, IMO, looking back at this now.  How about:

"""
Treat subsequent characters as the tail of a template literal, after a template substitution, beginning with a "}", continuing with zero or more template literal characters, and ending with either "${" or the end of the template literal.  For example:

  var entity = "world";
  var s = `Hello ${entity}!`;
  //                     ^ TemplateTail context
"""

@@ +122,5 @@
>                                          //   regexp chars
>      } u;
> +#ifdef DEBUG
> +    // A semicolon is inserted automatically.
> +    static const uint8_t INSERT_SEMICOLON     = 1;

Don't bother aligning, it'll just bitrot the next time someone adds or removes something here.  Also a blank line between them, like in the enum.

@@ +406,5 @@
> +
> +    static MOZ_CONSTEXPR_VAR uint8_t NoException = 0;
> +    // If a semicolon is inserted automatically, the next token is already
> +    // gotten with None, but we expect Operand.
> +    static MOZ_CONSTEXPR_VAR uint8_t None_Is_Operand = 1;

We use InterCaps for enum (or enum-like things), so these should be NoneIsOperand and such as far as style goes.  As this is a bitset, use hexadecimal for the numbers as a mini-hint toward that meaning.

Is there a reason these can't be grouped into an enum?  Seems better than having any numeric literal or so be accepted in place of a use of these, accidentally.

Also, I'd like to see blank lines between each comment/initializer combination for readability.

@@ +416,5 @@
> +    static MOZ_CONSTEXPR_VAR uint8_t None_Is_KeywordIsName = 4;
> +
> +    void addModifierException(uint8_t modifierException) {
> +#ifdef DEBUG
> +        MOZ_ASSERT(lookahead != 0);

MOZ_ASSERT(hasLookahead());

@@ +417,5 @@
> +
> +    void addModifierException(uint8_t modifierException) {
> +#ifdef DEBUG
> +        MOZ_ASSERT(lookahead != 0);
> +        Token& next = tokens[(cursor + 1) & ntokensMask];

Add a private nextToken() helper to return this -- but have it return const Token&, please.  I see this expression a few different places, seems like it should be commoned up (and the hasLookahead() assertion moved inside it).

@@ +422,5 @@
> +        switch (modifierException) {
> +          case None_Is_Operand:
> +            MOZ_ASSERT(next.modifier == Operand);
> +            MOZ_ASSERT(next.type != TOK_DIV &&
> +                       next.type != TOK_REGEXP);

Add "next token requires contextual specifier to be parsed unambiguously" as a reason-string in this second assertion.

@@ +430,5 @@
> +            // None_Is_Operand exception.
> +            MOZ_ASSERT(next.modifier == None ||
> +                       ((next.modifierExceptions & None_Is_Operand) && next.modifier == Operand));
> +            MOZ_ASSERT(next.type != TOK_DIV &&
> +                       next.type != TOK_REGEXP);

Same reason-string here, too.

@@ +444,5 @@
> +
> +    bool hasLookahead() { return lookahead > 0; }
> +
> +    Modifier getLookaheadModifier() {
> +        MOZ_ASSERT(lookahead != 0);

hasLookahead()

@@ +453,5 @@
> +#endif
> +    }
> +
> +    MOZ_ALWAYS_INLINE void
> +    assertModifier(Modifier modifier, Token lookaheadToken) {

No need for the always-inline, this is pure debug code.  And "assert" is somewhat vague, I'd make this "verifyConsistentModifier".

Also, you pass in the exact same expression for the token every place.  Add in:

  const Token& next = nextToken();

and use that everywhere, getting rid of the argument.

@@ +463,5 @@
> +                   ((lookaheadToken.modifierExceptions & None_Is_KeywordIsName) &&
> +                    modifier == None && lookaheadToken.modifier == KeywordIsName),
> +                   "this token was previously looked up with a different "
> +                   "modifier, potentially making tokenization non-"
> +                   "deterministic");

Ugh, this is practically unreadable.  What about the following?

#ifdef DEBUG
        // Easy case: modifiers match.
        if (modifier == lookaheadToken.modifier)
            return;

        if (lookaheadToken.modifierExceptions & OperandIsNone) {
            // getToken(Operand) permissibly following getToken().
            if (modifier == Operand && lookaheadToken.modifier == None)
                return;
        }

        if (lookaheadToken.modifierExceptions & NoneIsOperand) {
            // getToken() permissibly following getToken(Operand).
            if (modifier == None && lookaheadToken.modifier == Operand)
                return;
        }

        if (lookaheadToken.modifierExceptions & NoneIsKeywordIsName) {
            // getToken() permissibly following getToken(KeywordIsName).
            if (modifier == None && lookaheadToken.modifier == KeywordIsName)
                return;
        }

        MOZ_ASSERT_UNREACHABLE("this token was previously looked up with a "
                               "different modifier, potentially making "
                               "tokenization non-deterministic");
#endif

That also gives debuggers more of a place to breakpoint on.
Attachment #8644574 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8614444 [details] [diff] [review]
Part 2: Tests for modifiers.

Review of attachment 8614444 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit-test/tests/parser/modifier-semicolon-insertion.js
@@ +1,1 @@
> +Reflect.parse("function a()f1()\nf2()");

I'd use `` with all these so they're readable, but this works too.
Attachment #8614444 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8601028 [details] [diff] [review]
Part 3: Add parser test for ES6 class.

Review of attachment 8601028 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit-test/lib/syntax.js
@@ +978,5 @@
> +    test("class a { constructor() { } static set p(v) @");
> +    test("class a { constructor() { } static set p(v) { @");
> +    test("class a { constructor() { } static set p(v) {} @");
> +    test("class a { constructor() { } static set p(v) {} } @");
> +

Good set of tests.  Add in some generator methods as well -- and static generator methods, if those are a thing (can't think why they shouldn't be, but offhand I don't remember if they are).
Attachment #8601028 - Flags: review?(jwalden+bmo) → review+
(Assignee)

Comment 29

3 years ago
Thank you for reviewing :D

(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #26)
> @@ +417,5 @@
> > +
> > +    void addModifierException(uint8_t modifierException) {
> > +#ifdef DEBUG
> > +        MOZ_ASSERT(lookahead != 0);
> > +        Token& next = tokens[(cursor + 1) & ntokensMask];
> 
> Add a private nextToken() helper to return this -- but have it return const
> Token&, please.  I see this expression a few different places, seems like it
> should be commoned up (and the hasLookahead() assertion moved inside it).

Most part can be done with `const Token&`, but following line needs non-const:

>        next.modifierExceptions |= modifierException;

So I replaced this line with:

>        tokens[(cursor + 1) & ntokensMask].modifierExceptions |= modifierException;

> @@ +453,5 @@
> > +#endif
> > +    }
> > +
> > +    MOZ_ALWAYS_INLINE void
> > +    assertModifier(Modifier modifier, Token lookaheadToken) {
> 
> No need for the always-inline, this is pure debug code.  And "assert" is
> somewhat vague, I'd make this "verifyConsistentModifier".
> 
> Also, you pass in the exact same expression for the token every place.  Add
> in:
> 
>   const Token& next = nextToken();
> 
> and use that everywhere, getting rid of the argument.

getToken passes currentToken(), so we cannot remove the argument.
(Assignee)

Comment 31

3 years ago
filed AsmJSValidate things as bug 1193777
https://hg.mozilla.org/mozilla-central/rev/05f838caf076
https://hg.mozilla.org/mozilla-central/rev/4f87d8225c7a
https://hg.mozilla.org/mozilla-central/rev/a8493abd3c62
Status: NEW → RESOLVED
Last Resolved: 3 years ago
status-firefox43: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
(Assignee)

Updated

3 years ago
Depends on: 1194022
Depends on: 1195578
Depends on: 1204368
Depends on: 1212719
You need to log in before you can comment on or make changes to this bug.