Closed Bug 624199 Opened 14 years ago Closed 14 years ago

Decompilation turns ignored "use strict" into top-of-function "use strict"

Categories

(Core :: JavaScript Engine, defect)

x86
All
defect
Not set
minor

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: jruderman, Assigned: jimb)

References

Details

(Keywords: regression, testcase, Whiteboard: softblocker, fixed-in-tracemonkey)

Attachments

(2 files, 4 obsolete files)

function f() { if (true) { 'use strict'; } var eval; }
eval(uneval(f));

typein:2: SyntaxError: redefining eval is deprecated:
typein:2: function f() {"use strict";var eval;}
typein:2: ...............................^

The first bad revision is:
changeset:   8ea7ed461dc0
user:        Brendan Eich
date:        Mon Dec 27 13:10:25 2010 -0800
summary:     ES5 directives should not trigger 'useless expression' errors (559402, r=jimb).
There's web and extension code that uses decompilation somewhat-extensively, so this could cause compat issues.
blocking2.0: --- → ?
This is true.  On the other hand, for this to manifest requires a useless use-strict expression inside an if block at the start of a function.  That seems unlikely to happen much on the web.  I wouldn't start being worried until I saw that happen somewhere other than as the demented result of fuzzing.  :-)
agreed, but at this point it seems like the "useless expr" warning is the better place to be -- brendan, want to back that out and we can tidy it up later?
I'd have to disagree.  We've long recommended javascript.options.strict for web developers.  I suspect they'll prefer less console spam to mis-decompilation occurring only with fairly weird code.
No, I don't want to back that out. This is a very low priority bug. It's extremely unlikely, as Waldo says, that any code has "use strict"; at sub-statement position with a constant controlling things so that our cheap-o constant folder moves the useless string expression to top level.

A warning is better than a change in semantics if it came to that, except it's likely anyone writing "use strict"; not in prologue position wanted strict mode. It's a toss-up and again, vanishingly unlikely.

This can be fixed during a coffee break with about the same effort as backing out, and greater good.

/be
Assignee: general → brendan
Attached patch fix (obsolete) — Splinter Review
The JSParseNode::isDirective* helpers are not enough, we need to record the parser's verdict when it actually recognized a candidate, based not only on node kind but also on state machine in Compiler::compileScript and Parser::statements.

/be
Attachment #502384 - Flags: review?(jimb)
Gets in fuzzer's way, safe patch -> b9.

/be
OS: Mac OS X → Windows 7
... or at least tm landing soon, b10 is ok too. I hope Jim is around.

/be
OS: Windows 7 → All
Looking at this.
blocking2.0: ? → betaN+
Whiteboard: softblocker
I have a different solution that I think might be better; stealing.
Assignee: brendan → jimb
Is this increase in use-stricts the same bug?

js> (function() { "use strict"; })
(function () {'use strict';"use strict";})
js> (function () {'use strict';"use strict";})
(function () {'use strict';"use strict";"use strict";})
Yes.

/be
Never emit bytecode for expression statements consisting of a single string
literal. Complain about them as useless code only if they are not part of a
Directive Prologue. The comments in recognizeDirectivePrologue explain the
details.

Fix bad names of directive-prologue-related parse node member functions.
Attachment #502680 - Flags: review?(brendan)
That patch needs brendan's tests, of course.
Comment on attachment 502680 [details] [diff] [review]
Correctly skip emitting bytecode for useless string literals and complain, while not flagging directives.

>@@ -5934,15 +5935,22 @@ js_EmitTree(JSContext *cx, JSCodeGenerat
>              * catches the case where we are nesting in js_EmitTree for a
>              * labeled compound statement.
>              */
>-            if (!useful &&
>-                (!cg->topStmt ||
>-                 cg->topStmt->type != STMT_LABEL ||
>-                 cg->topStmt->update < CG_OFFSET(cg))) {
>-                CG_CURRENT_LINE(cg) = pn2->pn_pos.begin.lineno;
>-                if (!ReportCompileErrorNumber(cx, CG_TS(cg), pn2,
>-                                              JSREPORT_WARNING | JSREPORT_STRICT,
>-                                              JSMSG_USELESS_EXPR)) {
>-                    return JS_FALSE;
>+            if (!useful) {
>+                if (cg->topStmt &&
>+                    cg->topStmt->type == STMT_LABEL &&
>+                    cg->topStmt->update >= CG_OFFSET(cg))
>+                    useful = true;

This violates bracing style (brace when any of condition/then/else multiline). But it seems gratuitous anyway, the (&&(||...)) condition was ok, although per common style the { for the consequent could have been put on its own line, underhanging the i in 'if'.

>+            }
>+
>+            if (!useful) {
>+                /* Don't complain about directive prologue members; just omit their code. */

This comment is inaccurate starting at the ';'...

>+                if (!pn->isDirectivePrologueMember()) {
>+                    CG_CURRENT_LINE(cg) = pn2->pn_pos.begin.lineno;
>+                    if (!ReportCompileErrorNumber(cx, CG_TS(cg), pn2,
>+                                                  JSREPORT_WARNING | JSREPORT_STRICT,
>+                                                  JSMSG_USELESS_EXPR)) {
>+                        return JS_FALSE;
>+                    }
>                 }
>             } else {
>                 op = wantval ? JSOP_POPV : JSOP_POP;

... because the else is where we emit code.

>+ * Recognize Directive Prologue members and directives. Assuming |pn| is a candidate for
>+ * membership in a directive prologue, recognize directives and set |tc|'s flags
>+ * accordingly. If |pn| is indeed part of a prologue, set its |pn_prologue| flag.
..12345678901234567890123456789012345678901234567890123456789012345678901234567890

Wrap convention for comments uses margin of 79.

>+ * That is, even though "use\x20loose" can never be a directive, now or in the future
>+ * (because of the hex escape), the Directive Prologue extends through it to the "use
>+ * strict" statement, which is indeed a directive.

Ditto.

>-    *isDirectivePrologueMember = pn->isDirectivePrologueMember();
>+    *isDirectivePrologueMember = pn->isStringExprStmt();

This is a good name cleanup. However the disparity between ExprStmt and DirectivePrologueMember in degree of contraction is kinda bugging me. "Expr" is well used in jsparse.cpp, "Stmt" too but less so. Not sure what you prefer, I could even see pn->isStringExpressionStatement().

>+    if (kid->isEscapelessStringLiteral()) {

"EscapeFree" seems slightly better, YMMV -- your call.

>+

Useless blank line.

>+        /*
>+         * Mark this statement as being a possibly legitimate part of a directive
>+         * prologue, so the byte code emitter won't warn about it being useless code. (We
>+         * mustn't just omit the statement entirely yet, as it could be producing the
>+         * value of an eval or JSScript execution.)
>+         *
>+         * Note that even if the string isn't one we recognize as a directive, the emitter
>+         * still shouldn't flag it as useless, as it could become a directive in the
>+         * future. We don't want to interfere with people taking advantage of
>+         * directive-prologue-enabled features that appear in other browsers first.

Wrap-marging 79 nag.

>+         */
>+        pn->pn_prologue = true;
>+
>+        JSAtom *directive = kid->pn_atom;
>         if (directive == context->runtime->atomState.useStrictAtom) {
>             /*
>              * Unfortunately, Directive Prologue members in general may contain
>diff --git a/js/src/jsparse.h b/js/src/jsparse.h
>--- a/js/src/jsparse.h
>+++ b/js/src/jsparse.h
>@@ -138,6 +138,9 @@ JS_BEGIN_EXTERN_C
>  *                                     pn_right: initializer
>  * TOK_RETURN   unary       pn_kid: return expr or null
>  * TOK_SEMI     unary       pn_kid: expr or null statement
>+ *                          pn_prologue: true if Directive Prologue member in original
>+ *                              source, not introduced via constant folding or other tree
..12345678901234567890123456789012345678901234567890123456789012345678901234567890

ditto.

>+     * True if this statement node could be a member of a Directive Prologue: an
>+     * expression statement consisting of a single string literal.
>+     *
>+     * This considers only the node and its children, not its context. After parsing,
>+     * check the node's pn_prologue flag to see if it is indeed part of a directive
>+     * prologue.
>+     *
>+     * Note that a Directive Prologue can contain statements that cannot themselves be
>+     * directives (string literals that include escape sequences or escaped newlines,
>+     * say). This member function returns true for such nodes; we use it to determine the
>+     * extent of the prologue. isEscapelessStringLiteral, below, checks whether the node
>+     * itself could be a directive.

Again.

>+     * Return true if this node, known to be a string literal, could be the string of a
>+     * directive in a Directive Prologue. Directive strings never contain escape sequences
>+     * or line continuations.

Ditto.

>+         * If the string's length in the source code is its length as a value, accounting
>+         * for the quotes, then it must not contain any escape sequences or line
>+         * continuations.

Again.

Another issue with the patch: it relies on the decompiler to output "use strict"; and that is done by js_DecompileFunction (with single quotes, non-canonical :-P) -- but what about JS_DecompileScript on a top-level script containing "use strict"? More significantly, does dherman's Reflect stuff expose an AST reflection of the directive? By editing it out of bytecode, this patch risks losing it from ASTs, where before it was preserved verbatim.

/be
The Reflect stuff is just AST based, not bytecode based, so no worries there. XDR'ing scripts preserves their flags. So the decompiler could handle the "use strict" preservation but if we add more directives, or pragmas with parameters, we'll want more flag bits if not opcode sequences.

/be
Sorry about misreading "omit" as "emit". Need my Captain Kirk STII:TWOK reading glasses...

/be
(In reply to comment #16)
> This violates bracing style (brace when any of condition/then/else multiline).
> But it seems gratuitous anyway, the (&&(||...)) condition was ok, although per
> common style the { for the consequent could have been put on its own line,
> underhanging the i in 'if'.

Changed to keep original, single if with large condition.

> >+            }
> >+
> >+            if (!useful) {
> >+                /* Don't complain about directive prologue members; just omit their code. */
> 
> This comment is inaccurate starting at the ';'...

Sorted this out on IRC (misread "emit" for "omit"). Rewrote comment to avoid this confusion.

> >+ * Recognize Directive Prologue members and directives. Assuming |pn| is a candidate for
> >+ * membership in a directive prologue, recognize directives and set |tc|'s flags
> >+ * accordingly. If |pn| is indeed part of a prologue, set its |pn_prologue| flag.
> ..12345678901234567890123456789012345678901234567890123456789012345678901234567890
> 
> Wrap convention for comments uses margin of 79.

I'm confused about when the 100-column rule applies and when the 80-column rule applies.

https://wiki.mozilla.org/JavaScript:SpiderMonkey:C_Coding_Style
This suggests 100.

https://developer.mozilla.org/en/SpiderMonkey_coding_conventions
This suggests 80 for everything, which I know is not what we're doing.

I changed all the comments to fit within 80 columns.

> >-    *isDirectivePrologueMember = pn->isDirectivePrologueMember();
> >+    *isDirectivePrologueMember = pn->isStringExprStmt();
> 
> This is a good name cleanup. However the disparity between ExprStmt and
> DirectivePrologueMember in degree of contraction is kinda bugging me. "Expr" is
> well used in jsparse.cpp, "Stmt" too but less so. Not sure what you prefer, I
> could even see pn->isStringExpressionStatement().

Changed to isStringExprStatement.

> 
> >+    if (kid->isEscapelessStringLiteral()) {
> 
> "EscapeFree" seems slightly better, YMMV -- your call.

I agree. Changed to isEscapeFreeStringLiteral.

> >+
> 
> Useless blank line.

Fixed.

> Another issue with the patch: it relies on the decompiler to output "use
> strict"; and that is done by js_DecompileFunction (with single quotes,
> non-canonical :-P) -- but what about JS_DecompileScript on a top-level script
> containing "use strict"?

Patch for these issues upcoming.

> More significantly, does dherman's Reflect stuff
> expose an AST reflection of the directive? By editing it out of bytecode, this
> patch risks losing it from ASTs, where before it was preserved verbatim.

Reflect.parse takes a string as input, calls js::Parser::parse, and then runs over the parse tree. Bytecode isn't involved, so the omit-needless-expressions stuff in js_EmitTree doesn't come into play. Now, it would be utterly awesome to have the decompiler drive one of Dave's ASTSerializers, but in that case, the code walking the bytecode and making calls on the serializer should consult the script's strict mode flag and tell the serializer appropriately. Under no circumstances is leaving a string literal in the byte code the right way to carry this information: the script's flag is its proper home.
Attached patch Regression tests. (obsolete) — Splinter Review
No r? yet, I need to expand on these a bit.
Revised as requested.
Attachment #502680 - Attachment is obsolete: true
Attachment #502680 - Flags: review?(brendan)
Still under test, but comments welcome.
Attachment #502741 - Flags: feedback?(brendan)
Comment on attachment 502741 [details] [diff] [review]
Move code for decompiling strict mode code directives to where it works for JS_DecompileScript, too.

Okay, all tests pass on the decompiler simplification.
Attachment #502741 - Flags: feedback?(brendan) → review?(brendan)
Comment on attachment 502741 [details] [diff] [review]
Move code for decompiling strict mode code directives to where it works for JS_DecompileScript, too.

Is this going to make every function in this expression:

function A() {"use strict"; return function B(){}(function C(){}); }

decompile with an explicit "use strict"; directive? Seems a shame -- compiling something into the script to indicate the presence of the directive would fix, at at least one bit's cost.

/be
(In reply to comment #24)
> Is this going to make every function in this expression:
> 
> function A() {"use strict"; return function B(){}(function C(){}); }
> 
> decompile with an explicit "use strict"; directive? Seems a shame -- compiling
> something into the script to indicate the presence of the directive would fix,
> at at least one bit's cost.

No, not at all. It retains the same logic we've had since the summer, where we emit the minimum number of directives necessary to accurately represent the function:

js> function f() {"use strict"; function B() { }; return [B, (function C() {})]; }
js> f
function f() {"use strict";function B() {};return [B, function C() {}];}
js> f()
[function B() {"use strict";}, (function C() {"use strict";})]
js> 

When we print f, there's only one directive, which covers B's and C's strictness. When we print the list of two functions, we print two directives, because it's two independent function values, which would not be accurately represented without the directive. It's exactly as it must be.
Attachment #502384 - Attachment is obsolete: true
Attachment #502384 - Flags: review?(jimb)
Excellent. Want to qfold into one patch? Else r? me on both.

/be
Comment on attachment 502741 [details] [diff] [review]
Move code for decompiling strict mode code directives to where it works for JS_DecompileScript, too.

>+    if (script->strictModeCode && !jp->strict) {
>+        if (jp->fun && jp->fun->flags & JSFUN_EXPR_CLOSURE) {

Nit: prevailing style always overparenthesizes bitwise and shift ops against all others.

/be
Attachment #502741 - Flags: review?(brendan) → review+
[Tests folded in; all requested revisions made.]

Never emit bytecode for expression statements consisting of a single string
literal. Complain about them as useless code only if they are not part of a
Directive Prologue. The comments in recognizeDirectivePrologue explain the
details.

Fix bad names of directive-prologue-related parse node member functions.
Attachment #502739 - Attachment is obsolete: true
Attachment #502740 - Attachment is obsolete: true
Attachment #502840 - Flags: review?(brendan)
(In reply to comment #27)
> Nit: prevailing style always overparenthesizes bitwise and shift ops against
> all others.

Fixed --- thanks for the review.
Comment on attachment 502840 [details] [diff] [review]
Correctly skip emitting bytecode for useless string literals and complain, while not flagging directives.

>@@ -5935,14 +5936,21 @@ js_EmitTree(JSContext *cx, JSCodeGenerat
>              * labeled compound statement.
>              */
>             if (!useful &&
>-                (!cg->topStmt ||
>-                 cg->topStmt->type != STMT_LABEL ||
>-                 cg->topStmt->update < CG_OFFSET(cg))) {
>-                CG_CURRENT_LINE(cg) = pn2->pn_pos.begin.lineno;
>-                if (!ReportCompileErrorNumber(cx, CG_TS(cg), pn2,
>-                                              JSREPORT_WARNING | JSREPORT_STRICT,
>-                                              JSMSG_USELESS_EXPR)) {
>-                    return JS_FALSE;
>+                (cg->topStmt &&
>+                 cg->topStmt->type == STMT_LABEL &&
>+                 cg->topStmt->update >= CG_OFFSET(cg))) {

Now the (A&&(B&&C&&D)) could lose the inner parens

>@@ -566,12 +572,21 @@ public:
> 
>     /*
>      * True if this statement node could be a member of a Directive
>-     * Prologue.  Note that the prologue may contain strings that
>-     * cannot themselves be directives; that's a stricter test.
>-     * If Statement begins to simplify trees into this form, then
>-     * we'll need additional flags that we can test here.

..12345678901234567890123456789012345678901234567890123456789012345678901234567890

>+     * Prologue: an expression statement consisting of a single string
>+     * literal.

"literal." fits on previous line by my reckoning.

Etc. for the other comments cited below. r=me with these nits picked.

/be

>+     *
>+     * This considers only the node and its children, not its context.
>+     * After parsing, check the node's pn_prologue flag to see if it is
>+     * indeed part of a directive prologue.
>+     *
>+     * Note that a Directive Prologue can contain statements that cannot
>+     * themselves be directives (string literals that include escape
>+     * sequences or escaped newlines, say). This member function returns
>+     * true for such nodes; we use it to determine the extent of the
>+     * prologue. isEscapeFreeStringLiteral, below, checks whether the node
>+     * itself could be a directive.
>      */
>-    bool isDirectivePrologueMember() const {
>+    bool isStringExprStatement() const {
>         if (PN_TYPE(this) == js::TOK_SEMI) {
>             JS_ASSERT(pn_arity == PN_UNARY);
>             JSParseNode *kid = pn_kid;
>@@ -581,23 +596,26 @@ public:
>     }
> 
>     /*
>+     * Return true if this node, known to be a string literal, could be the
>+     * string of a directive in a Directive Prologue. Directive strings
>+     * never contain escape sequences or line continuations.
>      */
>-    bool isDirective() const {
>-        JS_ASSERT(isDirectivePrologueMember());
>-        JSParseNode *kid = pn_kid;
>-        JSString *str = ATOM_TO_STRING(kid->pn_atom);
>+    bool isEscapeFreeStringLiteral() const {
>+        JS_ASSERT(pn_type == js::TOK_STRING && !pn_parens);
>+        JSString *str = ATOM_TO_STRING(pn_atom);
> 
>         /*
>+         * If the string's length in the source code is its length as a
>+         * value, accounting for the quotes, then it must not contain any
>+         * escape sequences or line continuations.
>          */
>         return (pn_pos.begin.lineno == pn_pos.end.lineno &&
>                 pn_pos.begin.index + str->length() + 2 == pn_pos.end.index);
>     }
> 
>+    /* Return true if this node appears in a Directive Prologue. */
>+    bool isDirectivePrologueMember() const { return pn_prologue; }
>+
> #ifdef JS_HAS_GENERATOR_EXPRS
>     /*
>      * True if this node is a desugared generator expression.
Attachment #502840 - Flags: review?(brendan) → review+
http://hg.mozilla.org/tracemonkey/rev/e69de12efa82
Whiteboard: softblocker → softblocker, fixed-in-tracemonkey
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: