Closed Bug 512266 Opened 15 years ago Closed 13 years ago

JSON.stringify serializes \n to \u000a

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla6

People

(Reporter: lsmith, Assigned: Waldo)

References

()

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 2 obsolete files)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2

According to the latest draft version of ES 5 spec (though this has probably been in there for a while), newlines should stringify to "\n".

This is defined in step 2 of the abstract function Quote in section 15.12.3:
2. For each character C in value 
    a. If C is the double quote character or the backslash character 
        i. Let product be the concatenation of product and the backslash character. 
        ii. Let product be the concatenation of product and C.
    b. Else If C is backspace, formfeed, newline, carriage return, or tab 
        i. Let product be the concatenation of product and the backslash character.
        ii. Let abbrev be the character corresponding to the value of C as follows: 
            backspace "b" 
            formfeed "f" 
            newline "n" 
            carriage return "r" 
            tab "t" 
        iii. Let product be the concatenation of product and abbrev. 
    c. Else If C is a control character having a code unit value less than the space character 
        i. Let product be the concatenation of product and the backslash character. 
        ii. Let product be the concatenation of product and "u". 
        iii. Let hex be the result of converting the numeric code unit value of C to a String of 4 base 16 digits. 
        iv. Let product be the concatenation of product and hex. 
    d. Else 
        i. Let product be the concatenation of product and C.

Reproducible: Always

Steps to Reproduce:
JSON.stringify({x:"has\nnewline"});

Actual Results:  
{"x":"has\u000anewline"}

Expected Results:  
{"x":"has\nnewline"}
Assignee: nobody → general
Status: UNCONFIRMED → NEW
Component: General → JavaScript Engine
Ever confirmed: true
OS: Mac OS X → All
Product: Firefox → Core
QA Contact: general → general
Hardware: x86 → All
Version: unspecified → Trunk
Whiteboard: DUPEME
Patch anon.  It seems this is what causes us to fail, inter alia (so it appears), http://hg.ecmascript.org/tests/test262/file/ee75c6254247/test/suite/ietestcenter/chapter15/15.12/15.12.3/15.12.3-11-19.js on the test262 site.
Assignee: general → jwalden+bmo
Status: NEW → ASSIGNED
Whiteboard: DUPEME
Attached patch Patch and test (obsolete) — Splinter Review
This applies atop the other two JSON patches I've already pointed at you.
Attachment #520105 - Flags: review?(pbiggar)
Comment on attachment 520105 [details] [diff] [review]
Patch and test

It would be nice if we could more clearly see the spec in the code. So we see /* Steps 3-4 */ for example, but not a great deal more than that.


+            JS_ASSERT((c >> 4) < 10);
+            uint8 x = c >> 4, y = c % 16;
+            if (!sb.append('0' + x) || !sb.append(y < 10 ? '0' + y : 'a' + (y - 10)))
+                return false;

I can't follow this, though it roughly looks like what the spec says. What does 10 do? Can we have better names for x, c and y (like the exact names the spec uses)?

Ideally, I'd like to see the lines from the spec added as comments, but since the rest of the project doesn't do this, it's probably not helpful to do it only here.

I know it was there before this bug, but a comment explaining exactly what |mark| is doing would be nice.

I could be wrong, but the test looks a little weak. They test exactly this, but not a lot around it. What about things like "\\u000D", "\u000d", "\u00D"? There seems to be only success tested, not failure.


r+ with these addressed (better names required, the rest may not be needed).
Attachment #520105 - Flags: review?(pbiggar) → review+
In rereading I noticed a stupid bug concerning our append-maximal-nonspecial-sequences optimization -- we weren't appending that sequence if if was followed by [\b\f\n\r\t].  This refactoring fixes that, arguably even better as the maximal-sequence control flow is completely disentangled from the likely-less-predictable escaping stuff.  That problem would have caused things like |JSON.stringify("a\b") === '"\\b"'|, omitting the characters before the special character.  I figure these changes should probably get a re-review, and it won't hurt to see if I addressed your comments as desired at the same time.

(In reply to comment #3)
> +            JS_ASSERT((c >> 4) < 10);
> +            uint8 x = c >> 4, y = c % 16;
> +            if (!sb.append('0' + x) || !sb.append(y < 10 ? '0' + y : 'a' +
> (y - 10)))
> +                return false;
> 
> I can't follow this, though it roughly looks like what the spec says. What
> does 10 do? Can we have better names for x, c and y (like the exact names the
> spec uses)?

|c| is the spec name.  |x| and |y| aren't named in the spec -- "Let hex be the result of converting the numeric code unit value of C to a String of four hexadecimal digits" specifies behavior but not process.  I don't like or dislike the names, but I suspect longer names would hurt readability more than help.  The ternary picks out 0-9 or a-f as appropriate for the hex digit.

> Ideally, I'd like to see the lines from the spec added as comments, but since
> the rest of the project doesn't do this, it's probably not helpful to do it
> only here.

I've been moving that way as I make changes to existing methods, actually, and generally request the same in patches -- search for "/* Step" for examples.  dmandelin too has said he finds the pattern nice to read.  There's nothing wrong with requesting it, although optimizations can funkify step-numbering (jsnum.cpp:num_parseInt is one such case).

> I could be wrong, but the test looks a little weak. They test exactly this,
> but not a lot around it. What about things like "\\u000D", "\u000d", "\u00D"?
> There seems to be only success tested, not failure.

I added tests for JSON.stringify("\\u000D") and the like, which I believe addresses your first concern.

Assuming I understand you correctly, the second and third are things the JS scanner (jsscan.cpp) would handle when parsing the overall program, so they should be tested separately, in tests specifically targeted at the scanner/parser.
Attachment #520105 - Attachment is obsolete: true
Attachment #521081 - Flags: review?(pbiggar)
Attachment #521081 - Attachment is obsolete: true
Attachment #525820 - Flags: review?(pbiggar)
Attachment #521081 - Flags: review?(pbiggar)
Attachment #525820 - Flags: review?(pbiggar) → review+
http://hg.mozilla.org/tracemonkey/rev/f1751a93f665
Whiteboard: fixed-in-tracemonkey
Target Milestone: --- → mozilla6
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: