Last Comment Bug 634472 - disable |yield| and |arguments| in generator expressions
: disable |yield| and |arguments| in generator expressions
Status: RESOLVED FIXED
fixed-in-tracemonkey
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: x86 Mac OS X
: -- normal (vote)
: ---
Assigned To: Dave Herman [:dherman]
:
Mentors:
Depends on: 665286 683738
Blocks: 632028
  Show dependency treegraph
 
Reported: 2011-02-15 17:08 PST by Dave Herman [:dherman]
Modified: 2011-08-31 14:50 PDT (History)
5 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
rough draft (no tests yet) (10.89 KB, patch)
2011-02-15 17:16 PST, Dave Herman [:dherman]
no flags Details | Diff | Review
catches 'arguments' in top-level genexp (11.73 KB, patch)
2011-03-25 18:30 PDT, Dave Herman [:dherman]
no flags Details | Diff | Review
improved error reporting (14.75 KB, patch)
2011-03-26 17:59 PDT, Dave Herman [:dherman]
no flags Details | Diff | Review
ready for review (46.88 KB, patch)
2011-04-07 16:07 PDT, Dave Herman [:dherman]
brendan: review+
cdleary: review-
Details | Diff | Review
correct even in unparenthesized sole function argument (49.41 KB, patch)
2011-06-06 18:59 PDT, Dave Herman [:dherman]
cdleary: review+
Details | Diff | Review
final version (48.21 KB, patch)
2011-06-15 11:45 PDT, Dave Herman [:dherman]
cdleary: review+
Details | Diff | Review

Description Dave Herman [:dherman] 2011-02-15 17:08:18 PST
Using |yield| and |arguments| in a generator expression breaks the syntactic abstraction. I have an experimental patch (post FF4, of course) for syntactically rejecting them in the body of a generator expression.

See also: bug 632028.

Dave
Comment 1 Dave Herman [:dherman] 2011-02-15 17:16:42 PST
Created attachment 512666 [details] [diff] [review]
rough draft (no tests yet)

This isn't quite right for |arguments| outside of a function (it allows references but not assignments), but it's a start.

Dave
Comment 2 Dave Herman [:dherman] 2011-02-16 12:35:25 PST
I should also mention: the quality of error reporting is less than ideal, because it reports the error only upon reaching the "for" or ")" token. And in the current patch it actually points to the last token *before* the "for" or ")" token, which is really confusing for the user.

So at the very least, it should advance the token pointer by one before reporting. But if there's some way to actually rewind to the yield/arguments token, that would be ideal. Not sure that's possible or worth the trouble.

Dave
Comment 3 Brendan Eich [:brendan] 2011-02-16 17:48:14 PST
Just save the offending pn and pass that.

Probably you should use uint32 counters, just to avoid some fuzzer wrapping 16-bits and getting around the check.

/be
Comment 4 Dave Herman [:dherman] 2011-03-25 18:30:53 PDT
Created attachment 522047 [details] [diff] [review]
catches 'arguments' in top-level genexp

This now catches the case of a plain reference to 'arguments' in a generator expression at top-level (i.e., not nested inside any function). Also switched to uint32 per Brendan's suggestion. Merged the two error message types to just one type.

Still need to create test cases and improve the error reporting.

Dave
Comment 5 Dave Herman [:dherman] 2011-03-26 17:59:48 PDT
Created attachment 522168 [details] [diff] [review]
improved error reporting

This version fixes the error reporting. It should be almost ready to go; just needs a bunch of tests.

Dave
Comment 6 Dave Herman [:dherman] 2011-04-07 16:07:55 PDT
Created attachment 524523 [details] [diff] [review]
ready for review

Should be ready now. This version fixes all tests and includes a big set of its own tests.

Dave
Comment 7 Brendan Eich [:brendan] 2011-04-10 17:28:32 PDT
Comment on attachment 524523 [details] [diff] [review]
ready for review

>@@ -4826,28 +4829,34 @@ ContainsStmt(JSParseNode *pn, TokenKind 
> 
> JSParseNode *
> Parser::returnOrYield(bool useAssignExpr)
> {
>     TokenKind tt, tt2;
>     JSParseNode *pn, *pn2;
> 
>     tt = tokenStream.currentToken().type;
>-    if (tt == TOK_RETURN && !tc->inFunction()) {
>-        reportErrorNumber(NULL, JSREPORT_ERROR, JSMSG_BAD_RETURN_OR_YIELD, js_return_str);
>+    if (!tc->inFunction()) {
>+        reportErrorNumber(NULL, JSREPORT_ERROR, JSMSG_BAD_RETURN_OR_YIELD, tt == TOK_RETURN ? js_return_str : js_yield_str);

Nits: parenthesize ?: condition if lower precedence than unary; wrap before that ?: expression to avoid violating columng 100.

> #if JS_HAS_GENERATORS
>-    if (tt == TOK_YIELD)
>-        tc->flags |= TCF_FUN_IS_GENERATOR;
>+    if (tt == TOK_YIELD) {
>+        if (tc->parenDepth == 0) {
>+            tc->flags |= TCF_FUN_IS_GENERATOR;
>+        } else {

A comment here about delaying setting FUN_IS_GENERATOR till after ruling out yield in genexp would be nice.

>                 tc->noteArgumentsUse();
>+                tc->argumentsNode = pn2;

tc->noteArgumentsUse should set argumentsNode (this applies another place, primaryExpr, iirc).

>@@ -6944,18 +6960,21 @@ Parser::unaryExpr()
>             }
>             break;
>           case TOK_NAME:
>             if (!ReportStrictModeError(context, &tokenStream, tc, pn,
>                                        JSMSG_DEPRECATED_DELETE_OPERAND)) {
>                 return NULL;
>             }
>             pn2->pn_op = JSOP_DELNAME;
>-            if (pn2->pn_atom == context->runtime->atomState.argumentsAtom)
>+            if (pn2->pn_atom == context->runtime->atomState.argumentsAtom) {
>                 tc->flags |= TCF_FUN_HEAVYWEIGHT;
>+                tc->argumentsCount++;
>+                tc->argumentsNode = pn2;
>+            }

Could use an even smaller helper factored out of noteArgumentsUse, to do the argumentsCount++ and argumentsNode assignment? call it tc->countArgumentsUse?

>                    ) && !(tc->flags & TCF_DECL_DESTRUCTURING)) {
>+            /*
>+             * In case this is a generator expression outside of any function.
>+             */

Nit: one-line comment style works here.

>+            if (!(tc->flags & TCF_IN_FUNCTION) &&

This should test !tc->inFunction() -- check for similar hand-expansions and unexpand.

>+                pn->pn_atom == context->runtime->atomState.argumentsAtom) {
>+                tc->argumentsCount++;
>+                tc->argumentsNode = pn;
>+            }

The atom test could be factored into an inline helper at no cost, too.

Looks good to me otherwise. Chris, you have any thoughts?

/be
Comment 8 Chris Leary [:cdleary] (not checking bugmail) 2011-04-26 22:55:04 PDT
Comment on attachment 524523 [details] [diff] [review]
ready for review

Sorry for the delay! I was finishing up the review, but I think this case is still not producing a syntax error when it should:

  print((yield) for (i in [0,1,2,3]));

Makes:

  main:
  00001:  callgname "print"
  00004:  lambda (function () {for (i in [1, 2, 3, 4]) {yield (yield);}})

It's tricky because a) the genexp doesn't have to be parenthesized in call position and b) the error data gets reset when the paren count dips back down to zero. Since the invoke-arguments production doesn't increase the paren count, it goes back to zero before we encounter the |for|, we reset it in the comprehensionTail production. Most of the other parens-are-implicit productions in the parser use the parenExpr production to avoid this problem.

It'd also be sweet if we could factor out the similar code (parenDepth/token-count accounting) from comprehensionTail and parenExpr into a stacky guard class on the tree context with a start/finish kind of thing.

I hope the fix is as easy as bumping the paren count in the arguments production, but I'm not totally sure it will be, so please r? me with that fixed. Thanks!
Comment 9 Chris Leary [:cdleary] (not checking bugmail) 2011-04-26 22:56:50 PDT
(In reply to comment #8)
>   print((yield) for (i in [0,1,2,3]));

Forgot to mention: this has to be in a function. Also, the numbers don't get magically transposed as my disasm would indicate. :-)
Comment 10 Dave Herman [:dherman] 2011-06-06 18:59:26 PDT
Created attachment 537712 [details] [diff] [review]
correct even in unparenthesized sole function argument

Addresses Brendan's nits and the bug cdleary pointed out (thank you!). Abstracted out the bookkeeping in a helper class called GenexpGuard. Works in all contexts where genexps can appear, including in the unparenthesized case of a genexp as a sole function argument. Added this case to every unit test in the test file.

Brendan: the one nit I didn't address was abstracting out the == argumentsAtom check. Would you want that as an inline method in JSAtom?

Dave
Comment 11 Brendan Eich [:brendan] 2011-06-06 22:18:46 PDT
Comment on attachment 537712 [details] [diff] [review]
correct even in unparenthesized sole function argument

>@@ -305,16 +305,27 @@ struct JSTreeContext {              /* t
>     JSStmtInfo      *topScopeStmt;  /* top lexical scope statement */
>     JSObjectBox     *blockChainBox; /* compile time block scope chain (NB: one
>                                        deeper than the topScopeStmt/downScope
>                                        chain when in head of let block/expr) */
>     JSParseNode     *blockNode;     /* parse node for a block with let declarations
>                                        (block with its own lexical scope)  */
>     JSAtomList      decls;          /* function, const, and var declarations */
>     js::Parser      *parser;        /* ptr to common parsing and lexing data */
>+    uint32          parenDepth;     /* paren-nesting depth */
>+    uint32          yieldCount;     /* number of |yield| tokens encountered at
>+                                       non-zero depth in current paren tree */
>+    uint32          argumentsCount; /* number of |arguments| references encountered
>+                                       at non-zero depth in current paren tree */
>+    JSParseNode     *yieldNode;     /* parse node for a yield expression that might

There are three uint32 members at the front of the struct, followed by pointer members. On 64-bit systems the layout in this patch wastes two 32-bit words. Doesn't matter unless valgrind says so, but I thought I would mention it.

> Parser::returnOrYield(bool useAssignExpr)
> {
>     TokenKind tt, tt2;
>     JSParseNode *pn, *pn2;
> 
>     tt = tokenStream.currentToken().type;
>-    if (tt == TOK_RETURN && !tc->inFunction()) {
>-        reportErrorNumber(NULL, JSREPORT_ERROR, JSMSG_BAD_RETURN_OR_YIELD, js_return_str);
>+    if (!tc->inFunction()) {
>+        reportErrorNumber(NULL, JSREPORT_ERROR, JSMSG_BAD_RETURN_OR_YIELD, (tt == TOK_RETURN) ? js_return_str : js_yield_str);
..1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890

100 column limit wants the ?: expression to wrap (the whole thing, not lumpy trailing parts).

>-NewBindingNode(JSAtom *atom, JSTreeContext *tc, bool let = false)
>+NewBindingNode(JSAtom *atom, JSTreeContext *tc, JSContext *cx, bool let = false)

Fallible functions take leading cx, not trailing or next-to-last.

>+class GenexpGuard {
>+    JSTreeContext   *tc;
>+    uint32          yieldCount;
>+    uint32          argumentsCount;
>+  public:

Nit: blank line before the public: label.

>+            /* In case this is a generator expression outside of any function. */
>+            if (!(tc->inFunction()) &&

The ! operand is overparenthesized here, contrary to house style and unlike the later !tc->inFunction() test.

/be
Comment 12 Brendan Eich [:brendan] 2011-06-06 22:23:40 PDT
(In reply to comment #11)
> >-NewBindingNode(JSAtom *atom, JSTreeContext *tc, bool let = false)
> >+NewBindingNode(JSAtom *atom, JSTreeContext *tc, JSContext *cx, bool let = false)
> 
> Fallible functions take leading cx, not trailing or next-to-last.

But there's no reason to change this signature to take cx. You can get the context from tc->parser->context.

/be
Comment 13 Chris Leary [:cdleary] (not checking bugmail) 2011-06-14 10:29:46 PDT
Comment on attachment 537712 [details] [diff] [review]
correct even in unparenthesized sole function argument

Review of attachment 537712 [details] [diff] [review]:
-----------------------------------------------------------------

Good stuff!

::: js/src/jsemit.h
@@ +477,5 @@
>          if (funbox)
>              funbox->node->pn_dflags |= PND_FUNARG;
>      }
>  
> +    void countArgumentsUse(JSParseNode *pn) {

Worth asserting the atom is the arguments atom?

::: js/src/jsparse.cpp
@@ +6753,5 @@
>  };
>  
>  /*
> + * A helper for lazily checking for the presence of |yield| or |arguments| tokens
> + * inside of generator expressions.

Just from a usage perspective in this comment -- when do I *need* to call checkYield and checkArguments, and when I do that what does this guard ensure?

@@ +6757,5 @@
> + * inside of generator expressions.
> + */
> +class GenexpGuard {
> +    JSTreeContext   *tc;
> +    uint32          yieldCount;

yieldCountBefore? Or startYieldCount? Just makes the relational expressions in the check* methods a little more self-documenting.

@@ +6760,5 @@
> +    JSTreeContext   *tc;
> +    uint32          yieldCount;
> +    uint32          argumentsCount;
> +  public:
> +    GenexpGuard(JSTreeContext *tc)

Please use the |explicit| keyword for single-argument constructors. I'm slightly paranoid about that.

@@ +6763,5 @@
> +  public:
> +    GenexpGuard(JSTreeContext *tc)
> +      : tc(tc)
> +    {
> +        if (tc->parenDepth == 0) {

This should only be the case when no other GenexpGuards exist (otherwise you're changing the counts on the tree contexts behind those GenexpGuards' backs). Not sure if that's worth asserting in debug mode.

@@ +6771,5 @@
> +        yieldCount = tc->yieldCount;
> +        argumentsCount = tc->argumentsCount;
> +    }
> +
> +    bool checkYield(Parser *parser, JSParseNode *pn);

I think you should be able to get the parser here by using tc->parser, just a little neater than passing one in explicitly, since the guard already pertains to that particular tree context.
Comment 14 Dave Herman [:dherman] 2011-06-15 11:45:14 PDT
Created attachment 539608 [details] [diff] [review]
final version
Comment 15 Dave Herman [:dherman] 2011-06-15 11:49:58 PDT
Comment on attachment 539608 [details] [diff] [review]
final version

Oops, should've done an r? instead of a feedback?.

Dave
Comment 16 Chris Leary [:cdleary] (not checking bugmail) 2011-06-15 15:30:04 PDT
Comment on attachment 539608 [details] [diff] [review]
final version

Cool beans.
Comment 17 Dave Herman [:dherman] 2011-06-16 08:23:51 PDT
http://hg.mozilla.org/tracemonkey/rev/81c343a150a4
Comment 18 Chris Leary [:cdleary] (not checking bugmail) 2011-06-20 17:13:35 PDT
cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/81c343a150a4

Note You need to log in before you can comment on or make changes to this bug.