Closed
Bug 349814
Opened 18 years ago
Closed 18 years ago
Decompiler escapes line breaks and backslashes within E4X literals
Categories
(Core :: JavaScript Engine, defect, P2)
Core
JavaScript Engine
Tracking
()
VERIFIED
FIXED
mozilla1.8.1
People
(Reporter: jruderman, Assigned: brendan)
Details
(Keywords: testcase, verified1.8.1)
Attachments
(3 files, 2 obsolete files)
106 bytes,
text/html
|
Details | |
18.51 KB,
patch
|
mrbkap
:
review+
beltzner
:
approval1.8.1+
|
Details | Diff | Splinter Review |
13.65 KB,
patch
|
Details | Diff | Splinter Review |
z = function () { a = <x> <y/> </x>; } alert(z); alert(eval("" + z)); Result: first alert shows \n, second alert shows \\n. Expected: both alerts should show actual line breaks.
Reporter | ||
Comment 1•18 years ago
|
||
Assignee | ||
Comment 2•18 years ago
|
||
diff -w next. /be
Assignee | ||
Comment 3•18 years ago
|
||
Review this for best results. /be
Assignee | ||
Updated•18 years ago
|
Summary: function toString incorrectly escapes line breaks and backslashes within E4X literals → Decompiler escapes line breaks and backslashes within E4X literals
Assignee | ||
Updated•18 years ago
|
OS: Mac OS X 10.4 → All
Priority: -- → P2
Hardware: Macintosh → All
Target Milestone: --- → mozilla1.8.1
Assignee | ||
Comment 4•18 years ago
|
||
Parser already selects JSOP_QNAMEPART, so the code generator tweak is unnecessary (along with SRC_UNQUOTE). /be
Attachment #236587 -
Attachment is obsolete: true
Attachment #236670 -
Flags: review?(mrbkap)
Attachment #236587 -
Flags: review?(mrbkap)
Assignee | ||
Comment 5•18 years ago
|
||
Attachment #236588 -
Attachment is obsolete: true
Updated•18 years ago
|
Attachment #236670 -
Flags: review?(mrbkap) → review+
Assignee | ||
Comment 6•18 years ago
|
||
Fixed on trunk. /be
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 7•18 years ago
|
||
Comment on attachment 236670 [details] [diff] [review] fix, v2 Nominating for 1.8.1. This patch is bulked up by cleanup, but the cleanup is important to avoid confusing trunk vs. branch differences, some of this cleanup will very likely be needed for the remaining important js1.7 bugfixes, and the remainder is provably safe (especially the cosmetic-only cleanup). Here's what is going on: >Index: jsemit.c ... > /* The right side of the descendant operator is implicitly quoted. */ >- if (op == JSOP_DESCENDANTS && right->pn_op == JSOP_STRING && >- js_NewSrcNote(cx, cg, SRC_UNQUOTE) < 0) { >- return JS_FALSE; >- } >+ JS_ASSERT(op != JSOP_DESCENDANTS || right->pn_type != TOK_STRING || >+ right->pn_op == JSOP_QNAMEPART); The parser already selected JSOP_QNAMEPART, so we don't have to use SRC_UNQUOTE. This gets rid of that source note, leaving a gap we may need for destructuring decompilation (bug 346642). >@@ -6027,21 +6025,21 @@ js_EmitTree(JSContext *cx, JSCodeGenerat > > JS_FRIEND_DATA(JSSrcNoteSpec) js_SrcNoteSpec[] = { > {"null", 0, 0, 0}, > {"if", 0, 0, 0}, > {"if-else", 1, 0, 1}, > {"while", 1, 0, 1}, > {"for", 3, 1, 1}, > {"continue", 0, 0, 0}, >- {"decl", 1, 0, 0}, >+ {"decl", 1, 0, 1}, This is actually an important change, see the comment in jsemit.h highlighted below. >Index: jsemit.h ... >-/* Constants for the SRC_DECL source note. */ >+/* >+ * Constants for the SRC_DECL source note. Note that span-dependent bytecode >+ * selection means that any SRC_DECL offset greater than SRC_DECL_LET may need >+ * to be adjusted, but these "offsets" are too small to span a span-dependent >+ * instruction, so can be used to denote distinct declaration syntaxes to the >+ * decompiler. >+ */ Without the change from 0 to 1 for the last member initializer in the "decl" row of js_SrcNoteSpec, when a very long let block has its code generated, and it is long enough to overflow 16-bit signed immediate jump offsets, the span-dependent instruction selection algorithm kicks in and extends enough jumps to 32-bit signed immediate forms for the entire block to compile correctly. This will require the SRC_DECL source note's first offset (a different kind of immediate operand, similar to a jump offset) to be adjusted upward as jumps are extended. But without that little 1 at the end of the row, the span-dependent instruction selector would not know to update the SRC_DECL that tells the decompiler how to find the end of the let block. That means not only broken decompilation, but at the limit, memory safety bugs. The decompiler, being the hardest case and most red-headed of step-children, tries to defend against bugs in its own understanding of bytecode not matching the code generators, but it may fail. So we want this fix, even if we don't care about >64K code in let blocks. >Index: jsopcode.c ... >+#define QUOTE_IN_XML 0x10000 >+ > static char * >-QuoteString(Sprinter *sp, JSString *str, jschar quote) >+QuoteString(Sprinter *sp, JSString *str, uint32 quote) ... > /* Use js_EscapeMap, \u, or \x only if necessary. */ > if ((u = js_strchr(js_EscapeMap, c)) != NULL) { >- ok = Sprint(sp, "\\%c", (char)u[1]) >= 0; >+ ok = inXML >+ ? Sprint(sp, "%c", (char)c) >= 0 >+ : Sprint(sp, "\\%c", (char)u[1]) >= 0; This is the main bug fix, to avoid escaping (and then double-escaping, etc. on round-trips through the decompiler) control characters in XML literals. It's very safe, just communicating the inXML flag from the main decompiler function (Decompile) into the QuoteString helper, without hacking a second parameter all over the place (QuoteString is called a lot). Hence QUOTE_IN_XML, a flag bit passed in above the valid 16 jschar bits used by the engine to communicate a quote character (which is always 0, ' or "). > #ifdef JS_C_STRINGS_ARE_UTF8 > /* If this is a surrogate pair, make sure to print the pair. */ > if (c >= 0xD800 && c <= 0xDBFF) { > jschar buffer[3]; > buffer[0] = c; > buffer[1] = *++t; > buffer[2] = 0; > if (t == z) { >+ char numbuf[10]; >+ JS_snprintf(numbuf, sizeof numbuf, "0x%x", c); >+ JS_ReportErrorFlagsAndNumber(cx, JSREPORT_ERROR, >+ js_GetErrorMessage, NULL, >+ JSMSG_BAD_SURROGATE_CHAR, >+ numbuf); This JS_C_STRINGS_ARE_UTF8 code is off by default still, and the fix here is to supply a missing error report instead of just failing silently via: > ok = JS_FALSE; > break; as the code would do until this patch, if it were given the first part of a surrogate pair at the end of the string to quote. >Index: jsstr.c The JS_C_STRINGS_ARE_UTF8 missing error case made me look at jsstr.c, where I found an error to reuse for this (arguably internal, but it shouldn't be a silent failure) case. That revealed a bunch of style violence done when JS_C_STRINGS_ARE_UTF8 went in under shaver's r= :-P. The fixes here are all cosmetic, in on the trunk, good for 1.8 branch -- see the diff -w attachment for proof. /be
Attachment #236670 -
Flags: approval1.8.1?
Comment 8•18 years ago
|
||
Checking in regress-349814.js; /cvsroot/mozilla/js/tests/e4x/Regress/regress-349814.js,v <-- regress-349814.js initial revision: 1.1
Flags: in-testsuite+
Comment 9•18 years ago
|
||
Comment on attachment 236670 [details] [diff] [review] fix, v2 a=beltzner on behalf of 181drivers
Attachment #236670 -
Flags: approval1.8.1? → approval1.8.1+
Comment 12•18 years ago
|
||
verified fixed 1.8 1.9 20060909 windows/mac*/linux
Keywords: fixed1.8.1 → verified1.8.1
You need to log in
before you can comment on or make changes to this bug.
Description
•