Closed Bug 883333 Opened 11 years ago Closed 11 years ago

Further Parser/ParseHandler protocol cleanups: statements

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla24

People

(Reporter: jorendorff, Assigned: jorendorff)

Details

Attachments

(15 files, 3 obsolete files)

6.21 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
7.83 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
13.73 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
19.15 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
48.43 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
9.60 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
6.23 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
17.13 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
2.10 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
3.15 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
2.41 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
2.76 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
2.41 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
3.16 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
10.91 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
My last big stack for now. 15 easy patches.
I made this patch first, but it was like wiping the nose of a grubby kid: it made a glaring clean spot. There was nothing for it but to work through the rest.
Assignee: general → jorendorff
Attachment #762872 - Flags: review?(jwalden+bmo)
Attached patch Part 2 - rearrange, v1 (obsolete) — Splinter Review
Put all the existing statement parsers into the order they appear in ES6. (permutations of lines, plus a very few other whitespace-only tweaks)
Attachment #762895 - Flags: review?(jwalden+bmo)
Attachment #762904 - Flags: review?(jwalden+bmo)
Attachment #762905 - Flags: review?(jwalden+bmo)
Attachment #762906 - Flags: review? → review?(jwalden+bmo)
I've made many false starts cleaning up this code. It's hairy. Here's an initial effort at maybe making it better.
Attachment #762909 - Flags: review?(jwalden+bmo)
Attachment #762910 - Flags: review?(jwalden+bmo)
Incidentally, the duplication between break and continue is silly, and some of that code is wondrously gross. I wish I could clean everything up but I have to stop somewhere for the day. :-\
Attachment #762895 - Attachment is obsolete: true
Attachment #762895 - Flags: review?(jwalden+bmo)
Attachment #762919 - Flags: review?(jwalden+bmo)
Attachment #762910 - Attachment is obsolete: true
Attachment #762910 - Flags: review?(jwalden+bmo)
Attachment #762920 - Flags: review?(jwalden+bmo)
Attachment #762922 - Flags: review?(jwalden+bmo)
(v1.1 fixes a tricky bug in SyntaxParseHandler::newExpressionStatement)
Attachment #762872 - Attachment is obsolete: true
Attachment #762872 - Flags: review?(jwalden+bmo)
Attachment #762956 - Flags: review?(jwalden+bmo)
Attachment #762968 - Flags: review?(jwalden+bmo)
Attachment #762969 - Flags: review?(jwalden+bmo)
Attachment #762971 - Flags: review?(jwalden+bmo)
Attachment #762972 - Flags: review?(jwalden+bmo)
Attachment #762974 - Attachment description: Part 13 - Squeeze out leftover whitespace → Part 13 - Squeeze out leftover whitespace, v1
Attachment #762975 - Flags: review?(jwalden+bmo)
This removes a member-function-template specialization from the parser and a use of setBlockId. I would kill off setBlockId the rest of the way but I'm totally sure that would lead to another week of tidying, which I don't want.
Attachment #762980 - Flags: review?(jwalden+bmo)
Comment on attachment 762956 [details] [diff] [review] Part 1 - empty, expression, return, and throw, v1.1 Review of attachment 762956 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/frontend/FullParseHandler.h @@ +196,5 @@ > + return new_<UnaryNode>(PNK_SEMI, JSOP_NOP, pos, (ParseNode *) NULL); > + } > + > + ParseNode *newExprStatement(ParseNode *expr, uint32_t end) { > + JS_ASSERT(expr->pn_pos.end <= end); I can't remember if I've mentioned it elsewhere, but can we have a begin <= end assertion in TokenPos's constructor at some point? @@ +206,5 @@ > + JS_ASSERT_IF(expr, expr->pn_pos.end <= end); > + return new_<UnaryNode>(PNK_RETURN, JSOP_RETURN, TokenPos::make(begin, end), expr); > + } > + > + ParseNode *newThrowStatement(ParseNode *expr, uint32_t begin, uint32_t end) { I'd much rather we be passing const TokenPos& here than begin/end as separate arguments, same above too. ::: js/src/frontend/Parser.cpp @@ +1054,5 @@ > if (!kid) > return null(); > > + TokenPos kidpos = handler.getPosition(kid); > + pn = handler.newReturnStatement(kid, kidpos.begin, kidpos.end); Ah, and you fixed the likely-bug from the previous patches. I still want the test for this. :-) It is really pretty mind-bending to think about this and keep in mind that almost every single "position" observed from a getPosition, getPositionEnd, etc. call is a lie when syntax-parsing. I'm not sure I like that, but I don't have better ideas. This is the sort of thing I expect to prevail in the long run, making passing const TokenPos& the right thing to do. @@ +3429,5 @@ > + handler.setEndPosition(pnlet, pos().end); > + > + if (needExprStmt) { > + if (!MatchOrInsertSemicolon(context, &tokenStream)) > + return null(); Hmm. This MatchOrInsert bit wasn't here before. Did we have a bug? It's unclear to me what causes this to be needed. A test illustrating what changed would be nice. ::: js/src/frontend/TokenStream.cpp @@ -1873,5 @@ > case TOK_THROW: return "TOK_THROW"; > case TOK_INSTANCEOF: return "TOK_INSTANCEOF"; > case TOK_DEBUGGER: return "TOK_DEBUGGER"; > case TOK_YIELD: return "TOK_YIELD"; > - case TOK_LEXICALSCOPE: return "TOK_LEXICALSCOPE"; We really need to do the frontend/TokenKind.h header with a FOR_EACH_TOKEN_KIND macro in it at some point, and I should have done when I originally added this abomination. :-(
Attachment #762956 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 762919 [details] [diff] [review] Part 2 - rearrange, v1.1 Review of attachment 762919 [details] [diff] [review]: ----------------------------------------------------------------- Didn't look at Parser.cpp at all, assuming the copypasta was cooked properly.
Attachment #762919 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 762904 [details] [diff] [review] Part 3 - if statements, v1 Review of attachment 762904 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/frontend/Parser.cpp @@ +3743,5 @@ > + Node cond = condition(); > + if (!cond) > + return null(); > + > + if (tokenStream.peekTokenSameLine() == TOK_SEMI && peekToken (without TSF_OPERAND) metastasized back to same-line (still without TSF_OPERAND). Fix!
Attachment #762904 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 762905 [details] [diff] [review] Part 4 - while statements, v1 Review of attachment 762905 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/frontend/FullParseHandler.h @@ +209,5 @@ > pn->pn_pos.begin = begin; > return pn; > } > > + ParseNode *newDoWhileStatement(uint32_t begin, ParseNode *body, ParseNode *cond, uint32_t end) { Would rather see const TokenPos& again. ::: js/src/frontend/Parser.cpp @@ +3796,5 @@ > + // semicolons in JS. Web compat required this by 2004: > + // http://bugzilla.mozilla.org/show_bug.cgi?id=238945 > + // ES3 and ES5 disagreed, but ES6 conforms to Web reality: > + // https://bugs.ecmascript.org/show_bug.cgi?id=157 > + (void) tokenStream.matchToken(TOK_SEMI); Could you make this change in a separate patch/revision, please, for tracking purposes?
Attachment #762905 - Flags: review?(jwalden+bmo) → review+
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #20) > I can't remember if I've mentioned it elsewhere, but can we have a begin <= > end assertion in TokenPos's constructor at some point? OK, on my todo list. > > + ParseNode *newThrowStatement(ParseNode *expr, uint32_t begin, uint32_t end) { > > I'd much rather we be passing const TokenPos& here than begin/end as > separate arguments, same above too. Done. > It is really pretty mind-bending to think about this and keep in mind that > almost every single "position" observed from a getPosition, getPositionEnd, > etc. call is a lie when syntax-parsing. I'm not sure I like that, but I > don't have better ideas. It seems brittle to me. And it's not just position information. Syntax-parsing mode doesn't even track parse node kinds, except approximately. My idea is to use those as little as possible and have the parser mostly keep whatever information it needs. Less lying. > @@ +3429,5 @@ > > + handler.setEndPosition(pnlet, pos().end); > > + > > + if (needExprStmt) { > > + if (!MatchOrInsertSemicolon(context, &tokenStream)) > > + return null(); > > Hmm. This MatchOrInsert bit wasn't here before. Did we have a bug? It's > unclear to me what causes this to be needed. A test illustrating what > changed would be nice. No bug, the existing code was just a bit subtle. In Parser::statement(), each case indicates "I need a trailing semicolon" by breaking out of the switch, or "no trailing semicolon for me thanks" by returning instead. Not super subtle, but enough to make you go "huh?" and I've had enough of it. As far as I'm concerned each Parser::whatsoeverStatement() method that needs a semicolon can call MatchOrInsertSemicolon directly. As a bonus, statement node end positions include the semicolon, if any. > We really need to do the frontend/TokenKind.h header with a > FOR_EACH_TOKEN_KIND macro in it at some point, and I should have done when I > originally added this abomination. :-( Also added to my list.
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #23) > > + ParseNode *newDoWhileStatement(uint32_t begin, ParseNode *body, ParseNode > Would rather see const TokenPos& again. Done. > > + (void) tokenStream.matchToken(TOK_SEMI); > > Could you make this change in a separate patch/revision, please, for > tracking purposes? Which change exactly? All I did was move it and rearrange an if-statement. Well, and rewrite the comment. No change in behavior.
Comment on attachment 762906 [details] [diff] [review] Part 5 - for and switch statements Review of attachment 762906 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/frontend/FullParseHandler.h @@ +220,5 @@ > } > > + ParseNode *newForStatement(uint32_t begin) { > + return new_<BinaryNode>(PNK_FOR, JSOP_NOP, TokenPos::make(begin, begin + 1), > + (ParseNode *) NULL, (ParseNode *) NULL); Blurgh. I guess we'll get to the point where we can create these things with the right stuff, eventually, and we live with this awful for now. @@ +223,5 @@ > + return new_<BinaryNode>(PNK_FOR, JSOP_NOP, TokenPos::make(begin, begin + 1), > + (ParseNode *) NULL, (ParseNode *) NULL); > + } > + > + ParseNode *newForHead(bool isForInOrOf, ParseNode *pn1, ParseNode *pn2, ParseNode *pn3, Naming the nodes |init|, |step|, |done| would be good. Same for all the other names in all the other patches I've reviewed today, and yesterday and the day before -- don't know why I haven't thought to mention this til now! :-\ @@ +224,5 @@ > + (ParseNode *) NULL, (ParseNode *) NULL); > + } > + > + ParseNode *newForHead(bool isForInOrOf, ParseNode *pn1, ParseNode *pn2, ParseNode *pn3, > + const TokenPos &pos) { I think { goes on the next line here, actually. @@ +229,5 @@ > + ParseNodeKind kind = isForInOrOf ? PNK_FORIN : PNK_FORHEAD; > + return new_<TernaryNode>(kind, JSOP_NOP, pn1, pn2, pn3, pos); > + } > + > + ParseNode *newSwitchStatement(uint32_t begin, ParseNode *discriminant, ParseNode *caseList) { Nice ParseNode* names. :-) ::: js/src/frontend/ParseNode.h @@ +903,5 @@ > pn_kid3 = kid3; > } > > + TernaryNode(ParseNodeKind kind, JSOp op, ParseNode *kid1, ParseNode *kid2, ParseNode *kid3, > + const TokenPos &pos) Nice overload! ;-) ::: js/src/frontend/Parser.cpp @@ +4192,5 @@ > ParseNode *body = statement(); > if (!body) > return null(); > > + /* A FOR node is binary, left is loop control and right is the body. */ Looks like, with begin squirreled away, you could punt creating |pn| down to here. Or maybe that's in the next patch or so (I cheated and looked ahead because this one was getting mind-bendy, then looked back when I realized you'd trapped me into reading this).
Attachment #762906 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 762909 [details] [diff] [review] Part 6 - Further forStatement cleanup, v1 Review of attachment 762909 [details] [diff] [review]: ----------------------------------------------------------------- Ugh, this code is such disaster. But it's getting better! ::: js/src/frontend/Parser.cpp @@ +3983,5 @@ > + ParseNode *forLetImpliedBlock = NULL; > + ParseNode *forLetDecl = NULL; > + > + // If non-null, the node for the decl 'var v = expr1' in the weirdo form > + // 'for (var v = expr1 in expr2) stmt'. Eloquent. I would also accept further use of "mad" here. @@ +4193,5 @@ > PopStatementPC(context, pc); > #endif > PopStatementPC(context, pc); > + > + ParseNode *pn = handler.newForStatement(begin, forHead, body, iflags); forNode or some mildly better name, please.
Attachment #762909 - Flags: review?(jwalden+bmo) → review+
Attachment #762920 - Flags: review?(jwalden+bmo) → review+
(In reply to Jason Orendorff [:jorendorff] from comment #24) > No bug, the existing code was just a bit subtle. > > In Parser::statement(), each case indicates "I need a trailing semicolon" by > breaking out of the switch, or "no trailing semicolon for me thanks" by > returning instead. > > Not super subtle, but enough to make you go "huh?" and I've had enough of > it. Ah. I missed this! ╯°□°)╯︵ ┻━┻ Agreed. And putting the semicolon in the statement method matches the language grammar, too.
(In reply to Jason Orendorff [:jorendorff] from comment #25) > Which change exactly? All I did was move it and rearrange an if-statement. > Well, and rewrite the comment. No change in behavior. The change you didn't make because of the misunderstanding noted in comment 28. How was that not obvious? :-)
Comment on attachment 762922 [details] [diff] [review] Part 8 - return statements, v1 Review of attachment 762922 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/frontend/Parser.cpp @@ +4454,2 @@ > if (isYield) { > if (!abortIfSyntaxParser()) I like having less conditionally-parsed code, but let's assert JS_HAS_GENERATORS if we're going to do this, as long as that macro exists. @@ +4465,5 @@ > pc->yieldOffset = begin; > } > } > + > + // Parse an optional operand. This is a good comment. @@ +4472,5 @@ > + // there is not a mandatory semicolon. > + // > + // ES6 does not permit yield without an operand. We will have to sunset > + // this extension in order to conform to the ES6 syntax, which treats > + // "yield \n expr;" as a single ExpressionStatement. Followup to add a warning to this case? And if you plan on landing any of this before the uplift (you've got to ask yourself one question: do I feel lucky?), landing the warning would be an especially good thing to do. Or do the warning even if none of the rest of this lands, tho you'd have to rebase some for it. @@ +4473,5 @@ > + // > + // ES6 does not permit yield without an operand. We will have to sunset > + // this extension in order to conform to the ES6 syntax, which treats > + // "yield \n expr;" as a single ExpressionStatement. > + Node pn2; exprNode, producedExpr, or something other than pn2. @@ +4480,3 @@ > return null(); > + if (next == TOK_EOF || next == TOK_EOL || next == TOK_SEMI || next == TOK_RC > + || (isYield && (next == TOK_YIELD || next == TOK_RB || next == TOK_RP || || before (isYield ...) should go at the end of the previous line. ::: js/src/frontend/Parser.h @@ +427,5 @@ > Node forStatement(); > Node switchStatement(); > Node continueStatement(); > Node breakStatement(); > + Node returnStatementOrYieldExpression(); Obviously the spec's not going to conflate return and yield, so we should undo this conflation at some point. Given practically every other basic block of this is an |isYield| test, I think this is highly warranted.
Attachment #762922 - Flags: review?(jwalden+bmo) → review+
Attachment #762968 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 762969 [details] [diff] [review] Part 10 - throw statements, v1 Review of attachment 762969 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/frontend/Parser.cpp @@ +4624,5 @@ > + /* ECMA-262 Edition 3 says 'throw [no LineTerminator here] Expr'. */ > + TokenKind tt = tokenStream.peekTokenSameLine(TSF_OPERAND); > + if (tt == TOK_ERROR) > + return null(); > + if (tt == TOK_EOF || tt == TOK_EOL || tt == TOK_SEMI || tt == TOK_RC) { We should create a little helper thingy to encapsulate this list. I bet it shows up more than just here and return/yield, and could definitely use less repetition. @@ +4629,5 @@ > + report(ParseError, false, null(), JSMSG_SYNTAX_ERROR); > + return null(); > + } > + > + Node pnexp = expr(); throwExpr
Attachment #762969 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 762971 [details] [diff] [review] Part 11 - debugger statements, v1 Review of attachment 762971 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/frontend/Parser.cpp @@ +4822,5 @@ > + > + pc->sc->setBindingsAccessedDynamically(); > + pc->sc->setHasDebuggerStatement(); > + > + return handler.newDebuggerStatement(pos()); Erm, don't you mean |p|?
Attachment #762971 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 762972 [details] [diff] [review] Part 12 - Block, v1 Review of attachment 762972 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/frontend/Parser.cpp @@ +3365,5 @@ > template <typename ParseHandler> > typename ParseHandler::Node > +Parser<ParseHandler>::blockStatement() > +{ > + StmtInfoPC stmtInfo(context); Assert the current token is TOK_LC. @@ +3369,5 @@ > + StmtInfoPC stmtInfo(context); > + if (!PushBlocklikeStatement(&stmtInfo, STMT_BLOCK, pc)) > + return null(); > + > + Node pn = statements(); list, or something
Attachment #762972 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 762974 [details] [diff] [review] Part 13 - Squeeze out leftover whitespace, v1 Review of attachment 762974 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/frontend/Parser.cpp @@ +4890,5 @@ > + case TOK_BREAK: return breakStatement(); > + case TOK_RETURN: return returnStatementOrYieldExpression(); > + case TOK_WITH: return withStatement(); > + case TOK_THROW: return throwStatement(); > + case TOK_TRY: return tryStatement(); I don't much like horizontal alignment like this, because it bitrots when longer things get added. I'd just kill the blank lines. Oh well.
Attachment #762974 - Flags: review?(jwalden+bmo) → review+
Attachment #762975 - Flags: review?(jwalden+bmo) → review+
Attachment #762980 - Flags: review?(jwalden+bmo) → review+
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #26) > Blurgh. I guess we'll get to the point where we can create these things > with the right stuff, eventually, and we live with this awful for now. I feel pretty good about this comment, since I fixed it in the next patch. :) > > + ParseNode *newForHead(bool isForInOrOf, ParseNode *pn1, ParseNode *pn2, ParseNode *pn3, > > Naming the nodes |init|, |step|, |done| would be good. Same for all the > other names in all the other patches I've reviewed today, and yesterday and > the day before -- don't know why I haven't thought to mention this til now! > :-\ I think all the statement parts are well named except for this one method, and the reason this one is specially bad is that it serves both kinds of syntax: for (init; cond; update) for (decl/pattern of iterable) So the parameters really have no single meaning. Yeah, I know, there should be two methods. Someday! > I think { goes on the next line here, actually. Done. > Looks like, with begin squirreled away, you could punt creating |pn| down to > here. Or maybe that's in the next patch or so > [...] > Ugh, this code is such disaster. But it's getting better! :-D > > + // If non-null, the node for the decl 'var v = expr1' in the weirdo form > > + // 'for (var v = expr1 in expr2) stmt'. > > Eloquent. I would also accept further use of "mad" here. Ha! Sticking with "weirdo". > > + ParseNode *pn = handler.newForStatement(begin, forHead, body, iflags); > > forNode or some mildly better name, please. Changed it to forLoop, and wow. I hardly recognize this code anymore.
> > if (isYield) { > > [...] let's assert JS_HAS_GENERATORS [...] Done. > > + // ES6 does not permit yield without an operand. We will have to sunset > > + // this extension in order to conform to the ES6 syntax, which treats > > + // "yield \n expr;" as a single ExpressionStatement. > > Followup to add a warning to this case? Filed bug 885463. > Obviously the spec's not going to conflate return and yield, so we should > undo this conflation at some point. Given practically every other basic > block of this is an |isYield| test, I think this is highly warranted. Agreed.
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #31) > > + /* ECMA-262 Edition 3 says 'throw [no LineTerminator here] Expr'. */ > > + TokenKind tt = tokenStream.peekTokenSameLine(TSF_OPERAND); > > + if (tt == TOK_ERROR) > > + return null(); > > + if (tt == TOK_EOF || tt == TOK_EOL || tt == TOK_SEMI || tt == TOK_RC) { > > We should create a little helper thingy to encapsulate this list. I bet it > shows up more than just here and return/yield, and could definitely use less > repetition. Investigating a bit, I found we implement [no LineTerminator here] with custom code everywhere it appears in the grammar. returnStatementOrYieldExpression and throwStatement use this idiom; breakStatement and continueStatement both use MatchLabel which does something different, and postfix expressions are parsed in unaryExpr which does something else. The only common thread is peekTokenSameLine. Not much to common up. :-P
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #34) > I don't much like horizontal alignment like this, because it bitrots when > longer things get added. I'd just kill the blank lines. [...] Done.
I have a pending patch to separate returnOrYield into two pieces and refactor the detection of generators, in preparation for function*. So if there is something in that area that you haven't done but are thinking of doing, please be lazy and let me do it :)
(In reply to Andy Wingo from comment #39) > I have a pending patch to separate returnOrYield into two pieces and > refactor the detection of generators, in preparation for function*. So if > there is something in that area that you haven't done but are thinking of > doing, please be lazy and let me do it :) Sorry for bitrotting you so heavily! Patches in bug 885463 and bug 880447 also touch this code, but not much. I am going to go ahead and land them (the former has to ship, and the latter will not hurt much).
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: