Closed Bug 349814 Opened 18 years ago Closed 18 years ago

Decompiler escapes line breaks and backslashes within E4X literals

Tracking

()

Status:

VERIFIED FIXED

Milestone:

mozilla1.8.1

People

(Reporter: jruderman, Assigned: brendan)

Details

(Keywords: testcase, verified1.8.1)

Attachments

(3 files, 2 obsolete files)

testcase 18 years ago Jesse Ruderman 106 bytes, text/html		Details
fix, plus jsstr.c spacing cleanup 18 years ago Brendan Eich [:brendan] 18.50 KB, patch		Details \| Diff \| Splinter Review
diff -w version of last attachment 18 years ago Brendan Eich [:brendan] 13.64 KB, patch		Details \| Diff \| Splinter Review
fix, v2 18 years ago Brendan Eich [:brendan] 18.51 KB, patch	mrbkap : review+ beltzner : approval1.8.1+	Details \| Diff \| Splinter Review
diff -w version of last attachment 18 years ago Brendan Eich [:brendan] 13.65 KB, patch		Details \| Diff \| Splinter Review

Jesse Ruderman

Reporter

Description

•

18 years ago

z = function ()
{ 
  a =
    <x>
      <y/>
    </x>;
}

alert(z);
alert(eval("" + z));


Result: first alert shows \n, second alert shows \\n.

Expected: both alerts should show actual line breaks.

Jesse Ruderman

Reporter

Comment 1

•

18 years ago

Attached file testcase — Details

Brendan Eich [:brendan]

Assignee

Comment 2

•

18 years ago

Attached patch fix, plus jsstr.c spacing cleanup (obsolete) — Details — Splinter Review

diff -w next.

/be

Assignee: general → brendan

Status: NEW → ASSIGNED

Attachment #236587 - Flags: review?(mrbkap)

Brendan Eich [:brendan]

Assignee

Comment 3

•

18 years ago

Attached patch diff -w version of last attachment (obsolete) — Details — Splinter Review

Review this for best results.

/be

Brendan Eich [:brendan]

Assignee

Updated

•

18 years ago

Summary: function toString incorrectly escapes line breaks and backslashes within E4X literals → Decompiler escapes line breaks and backslashes within E4X literals

Brendan Eich [:brendan]

Assignee

Updated

•

18 years ago

OS: Mac OS X 10.4 → All

Priority: -- → P2

Hardware: Macintosh → All

Target Milestone: --- → mozilla1.8.1

Brendan Eich [:brendan]

Assignee

Comment 4

•

18 years ago

Attached patch fix, v2 — Details — Splinter Review

Parser already selects JSOP_QNAMEPART, so the code generator tweak is unnecessary (along with SRC_UNQUOTE).

/be

Attachment #236587 - Attachment is obsolete: true

Attachment #236670 - Flags: review?(mrbkap)

Attachment #236587 - Flags: review?(mrbkap)

Brendan Eich [:brendan]

Assignee

Comment 5

•

18 years ago

Attached patch diff -w version of last attachment — Details — Splinter Review

Attachment #236588 - Attachment is obsolete: true

Blake Kaplan (:mrbkap) (inactive)

Updated

•

18 years ago

Attachment #236670 - Flags: review?(mrbkap) → review+

Brendan Eich [:brendan]

Assignee

Comment 6

•

18 years ago

Fixed on trunk.

/be

Status: ASSIGNED → RESOLVED

Closed: 18 years ago

Resolution: --- → FIXED

Brendan Eich [:brendan]

Assignee

Comment 7

•

18 years ago

Comment on attachment 236670 [details] [diff] [review]
fix, v2

Nominating for 1.8.1.  This patch is bulked up by cleanup, but the cleanup is important to avoid confusing trunk vs. branch differences, some of this cleanup will very likely be needed for the remaining important js1.7 bugfixes, and the remainder is provably safe (especially the cosmetic-only cleanup).  Here's what is going on:

>Index: jsemit.c
...
>     /* The right side of the descendant operator is implicitly quoted. */
>-    if (op == JSOP_DESCENDANTS && right->pn_op == JSOP_STRING &&
>-        js_NewSrcNote(cx, cg, SRC_UNQUOTE) < 0) {
>-        return JS_FALSE;
>-    }
>+    JS_ASSERT(op != JSOP_DESCENDANTS || right->pn_type != TOK_STRING ||
>+              right->pn_op == JSOP_QNAMEPART);

The parser already selected JSOP_QNAMEPART, so we don't have to use SRC_UNQUOTE.  This gets rid of that source note, leaving a gap we may need for destructuring decompilation (bug 346642).

>@@ -6027,21 +6025,21 @@ js_EmitTree(JSContext *cx, JSCodeGenerat
> 
> JS_FRIEND_DATA(JSSrcNoteSpec) js_SrcNoteSpec[] = {
>     {"null",            0,      0,      0},
>     {"if",              0,      0,      0},
>     {"if-else",         1,      0,      1},
>     {"while",           1,      0,      1},
>     {"for",             3,      1,      1},
>     {"continue",        0,      0,      0},
>-    {"decl",            1,      0,      0},
>+    {"decl",            1,      0,      1},

This is actually an important change, see the comment in jsemit.h highlighted below.

>Index: jsemit.h
...
>-/* Constants for the SRC_DECL source note. */
>+/*
>+ * Constants for the SRC_DECL source note.  Note that span-dependent bytecode
>+ * selection means that any SRC_DECL offset greater than SRC_DECL_LET may need
>+ * to be adjusted, but these "offsets" are too small to span a span-dependent
>+ * instruction, so can be used to denote distinct declaration syntaxes to the
>+ * decompiler.
>+ */

Without the change from 0 to 1 for the last member initializer in the "decl" row of js_SrcNoteSpec, when a very long let block has its code generated, and it is long enough to overflow 16-bit signed immediate jump offsets, the span-dependent instruction selection algorithm kicks in and extends enough jumps to 32-bit signed immediate forms for the entire block to compile correctly.  This will require the SRC_DECL source note's first offset (a different kind of immediate operand, similar to a jump offset) to be adjusted upward as jumps are extended.  But without that little 1 at the end of the row, the span-dependent instruction selector would not know to update the SRC_DECL that tells the decompiler how to find the end of the let block.

That means not only broken decompilation, but at the limit, memory safety bugs.  The decompiler, being the hardest case and most red-headed of step-children, tries to defend against bugs in its own understanding of bytecode not matching the code generators, but it may fail.  So we want this fix, even if we don't care about >64K code in let blocks.

>Index: jsopcode.c
...
>+#define QUOTE_IN_XML    0x10000
>+
> static char *
>-QuoteString(Sprinter *sp, JSString *str, jschar quote)
>+QuoteString(Sprinter *sp, JSString *str, uint32 quote)
... 
>         /* Use js_EscapeMap, \u, or \x only if necessary. */
>         if ((u = js_strchr(js_EscapeMap, c)) != NULL) {
>-            ok = Sprint(sp, "\\%c", (char)u[1]) >= 0;
>+            ok = inXML
>+                 ? Sprint(sp, "%c", (char)c) >= 0
>+                 : Sprint(sp, "\\%c", (char)u[1]) >= 0;

This is the main bug fix, to avoid escaping (and then double-escaping, etc. on round-trips through the decompiler) control characters in XML literals.  It's very safe, just communicating the inXML flag from the main decompiler function (Decompile) into the QuoteString helper, without hacking a second parameter all over the place (QuoteString is called a lot).  Hence QUOTE_IN_XML, a flag bit passed in above the valid 16 jschar bits used by the engine to communicate a quote character (which is always 0, ' or ").

> #ifdef JS_C_STRINGS_ARE_UTF8
>             /* If this is a surrogate pair, make sure to print the pair. */
>             if (c >= 0xD800 && c <= 0xDBFF) {
>                 jschar buffer[3];
>                 buffer[0] = c;
>                 buffer[1] = *++t;
>                 buffer[2] = 0;
>                 if (t == z) {
>+                    char numbuf[10];
>+                    JS_snprintf(numbuf, sizeof numbuf, "0x%x", c);
>+                    JS_ReportErrorFlagsAndNumber(cx, JSREPORT_ERROR,
>+                                                 js_GetErrorMessage, NULL,
>+                                                 JSMSG_BAD_SURROGATE_CHAR,
>+                                                 numbuf);

This JS_C_STRINGS_ARE_UTF8 code is off by default still, and the fix here is to supply a missing error report instead of just failing silently via:

>                     ok = JS_FALSE;
>                     break;

as the code would do until this patch, if it were given the first part of a surrogate pair at the end of the string to quote.

>Index: jsstr.c

The JS_C_STRINGS_ARE_UTF8 missing error case made me look at jsstr.c, where I found an error to reuse for this (arguably internal, but it shouldn't be a silent failure) case.  That revealed a bunch of style violence done when JS_C_STRINGS_ARE_UTF8 went in under shaver's r= :-P.  The fixes here are all cosmetic, in on the trunk, good for 1.8 branch -- see the diff -w attachment for proof.

/be

Attachment #236670 - Flags: approval1.8.1?

Bob Clary [:bc] (inactive)

Comment 8

•

18 years ago

Checking in regress-349814.js;
/cvsroot/mozilla/js/tests/e4x/Regress/regress-349814.js,v  <--  regress-349814.js
initial revision: 1.1

Flags: in-testsuite+

Mike Beltzner [:beltzner, not reading bugmail]

Comment 9

•

18 years ago

Comment on attachment 236670 [details] [diff] [review]
fix, v2

a=beltzner on behalf of 181drivers

Attachment #236670 - Flags: approval1.8.1? → approval1.8.1+

Bob Clary [:bc] (inactive)

Comment 10

•

18 years ago

verified fixed 1.9 20060906 windows/mac*/linux

Status: RESOLVED → VERIFIED

Brendan Eich [:brendan]

Assignee

Comment 11

•

18 years ago

Fixed on the 1.8 branch.

/be

Keywords: fixed1.8.1

Bob Clary [:bc] (inactive)

Comment 12

•

18 years ago

verified fixed 1.8 1.9 20060909 windows/mac*/linux

Keywords: fixed1.8.1 → verified1.8.1

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Quick Search

Decompiler escapes line breaks and backslashes within E4X literals

Categories

(Core :: JavaScript Engine, defect, P2)

Tracking

()

People

(Reporter: jruderman, Assigned: brendan)

References

Details

(Keywords: testcase, verified1.8.1)

Crash Data

Security

(public)

User Story

Attachments

(3 files, 2 obsolete files)

Description

Comment 1

Comment 2

Comment 3

Updated

Updated

Comment 4

Comment 5

Updated

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Comment 12

Attachment

General

Description

File Name

Content Type