Closed Bug 507998 Opened 15 years ago Closed 10 years ago

JSON.parse() gives uninformative errors on malformed input

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla29

People

(Reporter: jhiatt, Assigned: isk)

References

(Blocks 1 open bug)

Details

(Keywords: dev-doc-complete, feature, Whiteboard: [mentor=Waldo][lang=c++][DocArea=JS][qa-])

Attachments

(3 files, 12 obsolete files)

11.75 KB, patch
n.nethercote
: review+
Details | Diff | Splinter Review
15.80 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
6.16 KB, patch
Details | Diff | Splinter Review
Passing a malformed JSON string to JSON.parse() yields a generic SyntaxError and doesn't give any specific information about what caused the error. The error report should contain the index of the token causing the parse error and a brief description of why it caused an error, e.g. "Unexpected '}' at index 0" when calling JSON.parse("}").
I believe this was done for speed; maintaining all the indexing state can be expensive.  It would seem reasonable to me to include the passed-in string in the error message, however; it seems feasible to include that without having to maintain state at every step of the algorithm.
(In reply to comment #1)
> I believe this was done for speed; maintaining all the indexing state can be
> expensive.

Lexers have to maintain a current input cursor of some kind. That's all that error-in-context pointing requires.

/be
I made an attempt to add some better error reporting. To be honest I don't really know what I'm doing, so if I'm doing something stupid or there's a better way to do this please tell me!
Seems you could show the char in context without much effort. Cc'ing sayrer.

/be
The new parser added bug 589664 includes marginally more informative error messages that explain the nature of the error that was encountered.  They don't do anything like print out context within the string being parsed, or report the index of the offending character, or anything like that, tho, which seems a necessary component of a full fix here, so I'm not going to say this bug's fixed yet.
If there is still interest in this, and someone is willing to give me some guidance, I think I could come up with something.
I'm happy to give guidance here.  Though, first I think we need to decide what better errors would look like.  Right now we have this:

js> JSON.parse("{ x: 7 }")
typein:1: SyntaxError: JSON.parse: expected property name or '}'

What would we rather see instead?  This is a "UI" issue, and I don't know what a nicer thing to do instead would be.
Whiteboard: [mentor=Waldo][lang=c++]
The very best error messages are actionable. Imagine:

  js> JSON.parse("{ x: 7 }")
  SyntaxError: Quotes are required around the 'x' in '{ x: 7 }'.

But anything that includes an offset and enough of the source string would be a nice improvement.
I would like to work on this bug if nobody is working on it. But how many error cases do i need to check?
Bhimsen, it's up to you. Try and figure out what the common errors are. Decide what error messages you'd like to see.

I think these might be common mistakes. I'm sure I missed some.

- Accidentally passing undefined. Whoops.
  js> var x; JSON.parse(x)
  SyntaxError: JSON.parse: unexpected character
  js> JSON.parse("undefined")
  SyntaxError: JSON.parse: unexpected character

- Passing an empty string
  js> JSON.parse("")
  SyntaxError: JSON.parse: unexpected end of data
  js> JSON.parse("    ")
  SyntaxError: JSON.parse: unexpected end of data

- Accidental parentheses
  js> JSON.parse("({})")
  SyntaxError: JSON.parse: unexpected character

- Forgetting the quotes:
  js> JSON.parse("{a: 1, b: 2}")
  SyntaxError: JSON.parse: expected property name or '}'

- Using single quotes instead of double quotes:
  js> JSON.parse("{'a': 1}")
  SyntaxError: JSON.parse: expected property name or '}'

- Including multiple values in the string without wrapping them in []
  js> JSON.parse("1, 2, 3")
  SyntaxError: JSON.parse: unexpected non-whitespace character after JSON data

- An extra comma after the last item in a list or object:
  js> JSON.parse("[1, 2, 3, ]")
  SyntaxError: JSON.parse: unexpected character
  js> JSON.parse('{"a": 1, "b": 2, }')
  SyntaxError: JSON.parse: expected double-quoted property name
Okay. i will be working on the issues that you have mentioned and in the course if i come up with new error cases, i will work on them too.
Bhimsen, it's great that you're working on this -- I was recently debugging some JSON syntax errors in 500KB files for bug 768470, and I had to use binary search to zero in on the error and it was painful.  In my case, one of the problems was failing to escape newlines within strings.
Note especially that in my case just giving the line number would have been a huge help.
Haven't seen any update on this.  I implemented line number parsing for JSON in SpiderMonkey 1.8.5 around a year ago, let me dig up the patch.
Note that I'd bet for perf reasons we're going to want to do this better-error reconstruction only once we're sure there's an actual error, not in the first attempt to parse.  JSON.parse matters in various benchmarks, and I kinda doubt we can eat the perf hit of tracking line/column/whatever info when we're not using it.
Oh, and your 1.8.5 patch is kind of hopelessly out of date, as we've since rewritten the JSON parser.  :-)
Sounds like the proper thing to do would be to template in an informative error mode and expose JSON.parse as is today, additionally exposing JSON.parseErrorInfo with the extended class.  There are a couple choices, a new function returning the error information (seems weird as the return value is now different, and null for valid JSON), or throwing an exception (probably the best choice).
Blocks: 836284
Diogo said he was interested in this.  :-)
Assignee: general → diogopontual
My thoughts:

   1. Create a field on JSONParser to store a const pointer to the first position of the json message (more 8 bytes );  
   2. Change the messages passed to JSONParser::error from char* to a new enum;
   3. Create a method such as JSONParser::getParseErrorMessage() that will discover the offset of the error (doing current.get() minus ourNewPointerToTheFirstPosition) and construct more informative message errors... 

I think that it is enough to create contextualized messages without a lot of overhead. Almost all the overhead will take place only when parse errors occur, then no problem. Remember that it is a very small overhead.


Suggestions?
That sounds broadly reasonable.  The devil will be in the details of how it all works, of course.  To start, it may be enough just to include, say, twenty characters' context each direction or so in the string.  We have code that does this sort of thing for JS syntax errors -- see TokenStream::reportCompileErrorNumberVA in js/src/frontend/TokenStream.cpp.  It might provide inspiration of sorts, even if it's not fully usable on its own.
Attached patch WIP-bug507998.patch (obsolete) — Splinter Review
I'm interseted in this bug.

I make WIP-patch.
This patch  is to be able to display the number of column.
1) add begin in JSONParser to calculate column.
2) modify JS_ReportErrorNumberVA and JS_ReportErrorNumber,JS_ReportErrorNumberVA to change "JSErrorReport::column".

Output changes as follows by this patch.
-    js> JSON.parse('{"a": 1, b: 2}')
     typein:3:9 SyntaxError: JSON.parse: expected double-quoted property name

I have few question.
1) Should I make *.msg file to get error message?
2) To display pointer as follows is difficult.
-   js> JSON.parse("{"a": 1, b: 2}")
    typein:4:14 SyntaxError: missing ) after argument list: 
    typein:4:14 JSON.parse("{"a": 1, b: 2}")
    typein:4:14 ..............^
Because JSONParser don't have linebase and userbuf.
Do you have any plan?
Attachment #821027 - Flags: review?(jwalden+bmo)
> 2) To display pointer as follows is difficult.
> -   js> JSON.parse("{"a": 1, b: 2}")
>     typein:4:14 SyntaxError: missing ) after argument list: 
>     typein:4:14 JSON.parse("{"a": 1, b: 2}")
>     typein:4:14 ..............^
> Because JSONParser don't have linebase and userbuf.
> Do you have any plan?

Speaking as someone who has wrestled with the poor error messages... the pointer would be nice but even just having the line and column number is *so* much better than what we currently have, I wouldn't worry about it, or perhaps would do it as a follow-up.
Attached patch WIP-bug507998.patch (obsolete) — Splinter Review
I make Macro to organize error message.
This patch has yet to improve error message still.

I have one idea.
Now we get column number. So we omit the word which indicate the position as if follow.
"missing digits after exponent sign"
"expected ',' or '}' after property-value pair in object literal"

My English skill is very poor.
So I can't find best word to understand error.
Please help me.
Attachment #821027 - Attachment is obsolete: true
Attachment #821027 - Flags: review?(jwalden+bmo)
Attachment #821537 - Flags: review?(n.nethercote)
> Output changes as follows by this patch.
> -    js> JSON.parse('{"a": 1, b: 2}')
>      typein:3:9 SyntaxError: JSON.parse: expected double-quoted property name

The 9 is the column number.  Where does the 3 come from?  What are you doing with line numbers?  You should test with JSON files as well as snippets in the shell.  Here's a good test case:

- Visit about:memory.

- Click on "Measure and save..." to save some JSON-format memory reports to a file.

- Modify the file to introduce a syntax error. E.g. omit a comma at the end of one of the memory report records. (Note that the JSON file is gzipped, so you might need to gunzip it first if your editor cannot handle gzipped files.)

- Then load the file in about:memory with the "Load..." button.  You should get a syntax error.  Are the line number and column number correct?

As for using the macro to decouple the error message strings, it's an admirable change, but one that should be done in a separate patch (and probably a separate bug report) from the error message changes.  Can you split them, please?

> I have one idea.
> Now we get column number. So we omit the word which indicate the position as
> if follow.
> "missing digits after exponent sign"
> "expected ',' or '}' after property-value pair in object literal"

Sorry, I don't understand.  Is this something you put in the patch you posted, or is it an idea that you are thinking about adding to the patch?
Attachment #821537 - Flags: review?(n.nethercote)
I think the line number and column number in the SyntaxError should point to the JSON.parse call site. These numbers are only meaningful in combination with the .fileName property.

Sp information about the location of the syntax error within the given string must be stored in the error object's .message, I guess.

    js> JSON.parse('{"x": 1, "y": 1,35}')
    typein:1:0 SyntaxError: JSON.parse: Number uses a comma for a decimal point (at line 1, column 7
    of the data, '..."y": 1,35...')
Comment on attachment 821537 [details] [diff] [review]
WIP-bug507998.patch

Review of attachment 821537 [details] [diff] [review]:
-----------------------------------------------------------------

In this bug, let's focus on getting good position information into the error message. Also, we should include a portion of the JSON data leading up to the error.

We can try to improve the English strings in a future bug.

::: js/src/jsonparser.h
@@ +182,5 @@
>      bool parse(MutableHandleValue vp);
>  
> +    uint32_t getColumn() {
> +        return current - begin;
> +    }

njn asked for this to compute the line number and column number. I think that's a good idea. Don't worry about speed, even if it requires a second scan over the input. Users won't mind waiting an extra microsecond for a much better error message.
(In reply to Nicholas Nethercote [:njn] from comment #25)
> > Output changes as follows by this patch.
> > -    js> JSON.parse('{"a": 1, b: 2}')
> >      typein:3:9 SyntaxError: JSON.parse: expected double-quoted property name
> 
> The 9 is the column number.  Where does the 3 come from?  What are you doing
> with line numbers?  You should test with JSON files as well as snippets in
> the shell.  Here's a good test case:
> 
> - Visit about:memory.
> 
> - Click on "Measure and save..." to save some JSON-format memory reports to
> a file.
> 
> - Modify the file to introduce a syntax error. E.g. omit a comma at the end
> of one of the memory report records. (Note that the JSON file is gzipped, so
> you might need to gunzip it first if your editor cannot handle gzipped
> files.)
> 
> - Then load the file in about:memory with the "Load..." button.  You should
> get a syntax error.  Are the line number and column number correct?
> 

I command make in /js/src.
So I couldn't use Nightly.

Now I build Nightly.
But I don't know how to run jsshell.
I write "export MOZ_PACKAGE_JSSHELL=1" in mozconfig.
Building the browser also builds a js shell. It's $OBJDIR/dist/bin/js .

But I build the browser and the shell separately, in separate build directories, because building the shell alone is so much faster than building the whole browser.
(In reply to Jason Orendorff [:jorendorff] from comment #29)
> Building the browser also builds a js shell. It's $OBJDIR/dist/bin/js .
> 
> But I build the browser and the shell separately, in separate build
> directories, because building the shell alone is so much faster than
> building the whole browser.

Thanks :)
(In reply to Jason Orendorff [:jorendorff] from comment #29)
> But I build the browser and the shell separately, in separate build
> directories, because building the shell alone is so much faster than
> building the whole browser.

How to build a js shell separately?
I can't find the option in mozconfig.
(In reply to Josh Matthews [:jdm] from comment #32)
> https://wiki.mozilla.org/JavaScript:New_to_SpiderMonkey#Build_the_js_shell
Thanks.
I tried to advanced build but I can't build due to error message as follow
../jslock.h:12:11: fatal error: 'pratom.h' file not found

I didn't easy build.
(In reply to iseki.m.aa from comment #33)
> I tried to advanced build but I can't build due to error message as follow
> ../jslock.h:12:11: fatal error: 'pratom.h' file not found
This is due to bug 924986, and should eventually be solved by bug 927915. Until then, either configure with --disable-threadsafe or look up how to make a threadsafe build [0]. Note that --enable-threadsafe is no longer required since it's the default now, but the rest is still needed.

[0] https://developer.mozilla.org/en-US/docs/SpiderMonkey/Build_Documentation#Building
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #34)
> (In reply to iseki.m.aa from comment #33)
> > I tried to advanced build but I can't build due to error message as follow
> > ../jslock.h:12:11: fatal error: 'pratom.h' file not found
> This is due to bug 924986, and should eventually be solved by bug 927915.
> Until then, either configure with --disable-threadsafe or look up how to
> make a threadsafe build [0]. Note that --enable-threadsafe is no longer
> required since it's the default now, but the rest is still needed.
> 
> [0]
> https://developer.mozilla.org/en-US/docs/SpiderMonkey/
> Build_Documentation#Building

Thanks.
I can build:)
Attached patch bug507998.patch (obsolete) — Splinter Review
I separate previous patch to focus on getting good position information into the error message.
Attachment #821537 - Attachment is obsolete: true
Attachment #822883 - Flags: review?(jorendorff)
Would you assign me on this bug?
Would you assign me on this bug?
Assigned.
Assignee: diogopontual → iseki.m.aa
Status: NEW → ASSIGNED
can the information be also available as properties on the error object? It would help in cases where we throw custom exceptions based on this error.
Comment on attachment 822883 [details] [diff] [review]
bug507998.patch

Review of attachment 822883 [details] [diff] [review]:
-----------------------------------------------------------------

I don't think this is the right way to do this. See comment 26 and comment 27.

1. We don't want to set errorReport.column. It would be very strange for error messages to contain a line number and a column number that refer to two totally different files.

2. The file and line number within the JSON data should be part of the error message. This means changing the error message like this:

>-MSG_DEF(JSMSG_JSON_BAD_PARSE,         228, 1, JSEXN_SYNTAXERR, "JSON.parse: {0}")
>+MSG_DEF(JSMSG_JSON_BAD_PARSE,         228, 3, JSEXN_SYNTAXERR, "JSON.parse: {0} at line {1} column {2} of the JSON data")

3. This needs tests.
Attachment #822883 - Flags: review?(jorendorff)
Attached patch bug507998.patch (obsolete) — Splinter Review
(In reply to Jason Orendorff [:jorendorff] from comment #41)
> Comment on attachment 822883 [details] [diff] [review]
> bug507998.patch
> 
> Review of attachment 822883 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I don't think this is the right way to do this. See comment 26 and comment
> 27.
Sorry,I hadn't understand well.
I fix patch.

> 3. This needs tests.
Sorry, I don't have idea to test this patch.
I will change error message.
So it is not good to assert error message.
Attachment #822883 - Attachment is obsolete: true
Attachment #827199 - Flags: review?(jorendorff)
Dropping in my two cents :

IMHO, it is pretty fragile to hope that each and every "current++" (and every "current++" to come ...) in the code will always be accompanied by a correct "begin = current; ++currentLineNumber;".

I think a safer way is to have a "getTextPosition(StableCharPtr ptr)" method, which returns a "{line,col}" struct, and call it inside "JSONParser::error(...)".
This would leave "current++" loops unmodified, allow to separately test that "getTextPosition" returns correct text coordinates in a stream.

This would add an extra linear pass on the data to compute the line/col coordinates, but as Jason Orendorff said, [comment 27](https://bugzilla.mozilla.org/show_bug.cgi?id=507998#c27) :

> Don't worry about speed, even if it requires a second scan over the input. Users won't mind waiting an extra microsecond for a much better error message.

What do you think ?
(In reply to Georges Varouchas from comment #43)
> Dropping in my two cents :
> 
> IMHO, it is pretty fragile to hope that each and every "current++" (and
> every "current++" to come ...) in the code will always be accompanied by a
> correct "begin = current; ++currentLineNumber;".
> 
> I think a safer way is to have a "getTextPosition(StableCharPtr ptr)"
> method, which returns a "{line,col}" struct, and call it inside
> "JSONParser::error(...)".
> This would leave "current++" loops unmodified, allow to separately test that
> "getTextPosition" returns correct text coordinates in a stream.
> 
> This would add an extra linear pass on the data to compute the line/col
> coordinates, but as Jason Orendorff said, [comment
> 27](https://bugzilla.mozilla.org/show_bug.cgi?id=507998#c27) :
> 
> > Don't worry about speed, even if it requires a second scan over the input. Users won't mind waiting an extra microsecond for a much better error message.
> 
> What do you think ?

Thanks Georges :)
I will agree with you. My patch is pretty fragile.
Attached patch bug507998.patch (obsolete) — Splinter Review
I fixed patch and add testcase.
I found new bug939410.

So now added test are not passed.
Attachment #827199 - Attachment is obsolete: true
Attachment #827199 - Flags: review?(jorendorff)
Attachment #833320 - Flags: review?(jorendorff)
Attachment #833320 - Flags: review?(jorendorff) → review?(n.nethercote)
Comment on attachment 833320 [details] [diff] [review]
bug507998.patch

Review of attachment 833320 [details] [diff] [review]:
-----------------------------------------------------------------

> So now added test are not passed.

Perhaps they should be fixed?

::: js/src/jsapi-tests/testParseJSON.cpp
@@ +35,5 @@
>      size_t length() const { return length_; }
>  };
>  
> +template<size_t N> bool
> +isSame(const char (&rsv)[N], char *lsv) {

Couldn't you just use strcmp() instead?

@@ +215,5 @@
>  
>      CHECK(!ok);
>      CHECK(!p.unexpectedErrorCount);
>      CHECK(p.expectedErrorCount == 1);
> +    CHECK(!expectedColumn||isSame(expectedColumn,p.column));

Whitespace around '||' and ',', please.  Likewise in various places below.
Attachment #833320 - Flags: review?(n.nethercote)
Attached patch bug507998.patch (obsolete) — Splinter Review
Thanks for reviewing.

I fixed a point that has been pointed out.

> > So now added test are not passed.
> Perhaps they should be fixed? 
I fixed this bug in bug939410.
Attachment #833320 - Attachment is obsolete: true
Attachment #8335104 - Flags: review?(n.nethercote)
Attached patch bug507998.patch (obsolete) — Splinter Review
I add bug939401's patch.
This patch will pass all test.
Attachment #8335104 - Attachment is obsolete: true
Attachment #8335104 - Flags: review?(n.nethercote)
Attachment #8335697 - Flags: review?(n.nethercote)
Comment on attachment 8335697 [details] [diff] [review]
bug507998.patch

Review of attachment 8335697 [details] [diff] [review]:
-----------------------------------------------------------------

This is progressing well, though it's not quite ready yet, so I've giving f+.  Thanks for your patience.

I tried it with some sample JSON output from about:memory and the error locations all looked to be correct, which is great!  It'll be *really* nice to have this land -- finding a syntax error in thousands of lines of JSON output is no fun.

::: js/src/jsapi-tests/testParseJSON.cpp
@@ +228,5 @@
>  reportJSONEror(JSContext *cx, const char *message, JSErrorReport *report)
>  {
>      ContextPrivate *p = static_cast<ContextPrivate *>(JS_GetContextPrivate(cx));
> +    strcpy(p->line,(char *)report->messageArgs[1]);
> +    strcpy(p->column,(char *)report->messageArgs[2]);

Add a space after the commas, please.

Also, the types are weird here.  |messageArgs| is an array of jschar*, but you're using char*.  I think this works because the JS_ReportErrorNumber() call passes in char*.  This is unpleasant, but there is precedent, e.g. the jsopcode.cpp snippet I have quoted below.  So a comment explaining what is going on would be very helpful.

::: js/src/jsonparser.cpp
@@ +56,5 @@
>      }
>  }
>  
>  void
> +JSONParser::getTextPosition(uint32_t &column,uint32_t &line)

Space after the comma.

@@ +60,5 @@
> +JSONParser::getTextPosition(uint32_t &column,uint32_t &line)
> +{
> +    StableCharPtr ptr = begin;
> +    for (; ptr != current; ptr++) {
> +        if (*ptr == '\n') {

This assumes that '\n' is the only way to encode a newline, which I think is false.

The JSON spec (http://www.ietf.org/rfc/rfc4627.txt) doesn't appear to explain how to count line numbers, but it says that '\n' and '\r' can both appear between tokens.  Combine that with how JavaScript tracks line numbers, and I think you need to look for any of the following sequences:
- "\n"
- "\r"
- "\r\n" (yes, the two chars are treated as a single newline)

In particular, the latter sequence is used on Windows, so if you don't handle it I think you'll get incorrect line numbers on Windows.

@@ +75,5 @@
>  {
> +    if (errorHandling == RaiseError) {
> +        uint32_t column = 1, line = 1;
> +        getTextPosition(column, line);
> +        char columnNumber[10],lineNumber[10];

Space after the comma.

@@ +78,5 @@
> +        getTextPosition(column, line);
> +        char columnNumber[10],lineNumber[10];
> +        sprintf(columnNumber, "%d", column);
> +        sprintf(lineNumber, "%d", line);
> +        JS_ReportErrorNumber(cx, js_GetErrorMessage, nullptr, JSMSG_JSON_BAD_PARSE, msg, lineNumber, columnNumber);

In js/src/jsopcode.cpp we have this existing code:

        char numBuf1[12], numBuf2[12];
        JS_snprintf(numBuf1, sizeof numBuf1, "%d", op);
        JS_snprintf(numBuf2, sizeof numBuf2, "%d", JSOP_LIMIT);
        JS_ReportErrorNumber(cx, js_GetErrorMessage, nullptr,
                             JSMSG_BYTECODE_TOO_BIG, numBuf1, numBuf2);

Yours is similar, but I'd change it to be even more similar, for consistency.  E.g. use JS_snprintf().

@@ +759,5 @@
>                case ArrayClose:
>                case ObjectClose:
>                case Colon:
>                case Comma:
> +                // advance() advance current pointer in spite of the error.

"advance() advance" reads badly.  And aren't you moving the pointer backwards, rather than advancing it?

::: js/src/jsonparser.h
@@ +23,5 @@
>      /* Data members */
>  
>      JSContext * const cx;
>      StableCharPtr current;
> +    const StableCharPtr begin,end;

Space after comma.

@@ +200,5 @@
>      JSObject *createFinishedObject(PropertyVector &properties);
>      bool finishObject(MutableHandleValue vp, PropertyVector &properties);
>      bool finishArray(MutableHandleValue vp, ElementVector &elements);
>  
> +    void getTextPosition(uint32_t &column,uint32_t &line);

Space after comma.
Attachment #8335697 - Flags: review?(n.nethercote) → feedback+
Waldo, there were two things about my review (comment 49) that I was unsure about -- the "types are weird" issue, and the "encode a newline" issue.  Do you know more about either of those?
Flags: needinfo?(jwalden+bmo)
> I think you need to look for any of the following sequences:
> - "\n"
> - "\r"
> - "\r\n" (yes, the two chars are treated as a single newline)

Assuming this is correct (hopefully Waldo can tell us), you should change some of your test cases to include "\r" and "\r\n" sequences.
Attached patch bug507998.patch (obsolete) — Splinter Review
Thank you for reviewing.

I fix the point which pointed out and add new test case.

> ::: js/src/jsapi-tests/testParseJSON.cpp
> @@ +228,5 @@
> >  reportJSONEror(JSContext *cx, const char *message, JSErrorReport *report)
> >  {
> >      ContextPrivate *p = static_cast<ContextPrivate *>(JS_GetContextPrivate(cx));
> > +    strcpy(p->line,(char *)report->messageArgs[1]);
> > +    strcpy(p->column,(char *)report->messageArgs[2]);
> 
> Add a space after the commas, please.
Sorry , Previous patch have wrong  style in some place.
How do you confirm the correctness of code style ? by your eyes ?
Attachment #8335697 - Attachment is obsolete: true
Attachment #8337523 - Flags: review?(n.nethercote)
> How do you confirm the correctness of code style ? by your eyes ?

Yes.
Comment on attachment 8337523 [details] [diff] [review]
bug507998.patch

Review of attachment 8337523 [details] [diff] [review]:
-----------------------------------------------------------------

This is better, but you still haven't addressed everything I mentioned in my last review.

::: js/src/jsapi-tests/testParseJSON.cpp
@@ +242,5 @@
>  reportJSONEror(JSContext *cx, const char *message, JSErrorReport *report)
>  {
>      ContextPrivate *p = static_cast<ContextPrivate *>(JS_GetContextPrivate(cx));
> +    strcpy(p->line, (char *)report->messageArgs[1]);
> +    strcpy(p->column, (char *)report->messageArgs[2]);

As I said last time:  I'd like a comment explaining the |(char *)| cast.  Something like "Although messageArgs[1] and messageArgs[2] are jschar*, we cast them to char* here because JSONParser::error() stores char* strings in them."

::: js/src/jsonparser.cpp
@@ +61,5 @@
> +JSONParser::getTextPosition(uint32_t &column, uint32_t &line)
> +{
> +    StableCharPtr ptr = begin;
> +    for (; ptr != current; ptr++) {
> +        if (*ptr == '\n' || *ptr == '\r') {

As I said in my last review:  if you see a '\r' followed immediately by '\n', you should treat that as a single line-break.  (This is how line-breaks are typically encoded on Windows.)  You'll need to make this loop a little more complex to detect this case.

Once you've done that, please modify some of your test cases above to include '\r\n' sequences.

@@ +763,5 @@
>                case Colon:
>                case Comma:
> +                // move current pointer backward because advance()
> +                // move current pointer forward in spite of the error.
> +                --current;

This comment reads poorly.  I suggest something like "Move the current pointer backwards so that the position reported in the error message is correct."

::: js/src/jsonparser.h
@@ +23,5 @@
>      /* Data members */
>  
>      JSContext * const cx;
>      StableCharPtr current;
> +    const StableCharPtr begin,end;

As I said last time:  space after the comma.
Attachment #8337523 - Flags: review?(n.nethercote) → feedback+
Attachment #394417 - Attachment is obsolete: true
Attachment #688367 - Attachment is obsolete: true
Attached patch bug507998.patchSplinter Review
Thank you for reviewing.
> This is better, but you still haven't addressed everything I mentioned in my last review.
I'm sorry, I didn't conceive good English and I forgot it on comment.

> Assuming this is correct (hopefully Waldo can tell us), you should change some of 
> your test cases to include "\r" and "\r\n" sequences.
I fix it and add test case.
Attachment #8337523 - Attachment is obsolete: true
Attachment #8338300 - Flags: review?(n.nethercote)
Comment on attachment 8338300 [details] [diff] [review]
bug507998.patch

Review of attachment 8338300 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!  Thanks for persevering.

Do you have commit access?  If not, I can land the patch for you.

::: js/src/jsonparser.cpp
@@ +66,5 @@
> +            ++line;
> +            column = 1;
> +            // \n\r are treated as a single newline
> +            if (*ptr == '\r' && *(ptr+1) == '\n')
> +                ++ptr;

This can cause ptr to go past current.  If you change the condition to:

 if (*ptr == '\r' && (ptr+1) != current && *(ptr+1) == '\n')

it will be safer.
Attachment #8338300 - Flags: review?(n.nethercote) → review+
Comment on attachment 8338300 [details] [diff] [review]
bug507998.patch

Review of attachment 8338300 [details] [diff] [review]:
-----------------------------------------------------------------

I didn't really look through all the coordinate details in testParseJSON.cpp to ensure they all made sense.  Did you, njn?  Someone reviewing should do so, certainly, as a sanity double-check, but I'm also holding up people right now in real life, so I'm going to punt on it myself.

::: js/src/js.msg
@@ +277,5 @@
>  MSG_DEF(JSMSG_BAD_OBJECT_INIT,        224, 0, JSEXN_SYNTAXERR, "invalid object initializer")
>  MSG_DEF(JSMSG_CANT_SET_ARRAY_ATTRS,   225, 0, JSEXN_INTERNALERR, "can't set attributes on indexed array properties")
>  MSG_DEF(JSMSG_EVAL_ARITY,             226, 0, JSEXN_TYPEERR, "eval accepts only one parameter")
>  MSG_DEF(JSMSG_MISSING_FUN_ARG,        227, 2, JSEXN_TYPEERR, "missing argument {0} when calling function {1}")
> +MSG_DEF(JSMSG_JSON_BAD_PARSE,         228, 3, JSEXN_SYNTAXERR, "JSON.parse: {0} at line {1} column {2} of the JSON data")

Now's about the time I wish our error-reporting mechanism had the ability to expose truly structured data, like clang's error messages do -- capable of exposing some of the surrounding text, a cursor to point at the offending character(s), etc.

Shoehorning everything into a string like this is going to bite us at some point, when some site decides to start parsing this stuff out of the string and relying on its formatting.  :-(  In the short run, tho, that's not something we can change.  Until we have such a structured reporting mechanism, perhaps we should tweak this error message every uplift cycle in some sort of way, to ensure that nobody starts relying on it.  (Me, paranoid?  Nah, quite possibly realistic, unfortunately.  :-\ )

::: js/src/jsapi-tests/testParseJSON.cpp
@@ +163,5 @@
>  END_TEST(testParseJSON_success)
>  
>  BEGIN_TEST(testParseJSON_error)
>  {
> +    CHECK(Error(cx, "["                         , "1", "2"));

Hmm, column "2" is debatably the correct thing to show.  I dunno.

@@ +181,5 @@
> +    CHECK(Error(cx, "\n[1,]"                    , "2", "4"));
> +    CHECK(Error(cx, "\n{a:2}"                   , "2", "2"));
> +    CHECK(Error(cx, "\n{\"a\":2,}"              , "2", "8"));
> +    CHECK(Error(cx, "\n]"                       , "2", "1"));
> +    CHECK(Error(cx, "\n'bad string'"            , "2", "1"));

I definitely want tests for "\"bad string\n", "\"bad string\r", "\"bad string\r\n".  Generally, the case of newlines at the end of the input feel like they want a lot of careful testing, to me.  Something for that followup to do a bunch of tests for, specifically.

::: js/src/jsonparser.cpp
@@ +60,5 @@
>  void
> +JSONParser::getTextPosition(uint32_t &column, uint32_t &line)
> +{
> +    StableCharPtr ptr = begin;
> +    for (; ptr != current; ptr++) {

Change != to <, just for a slight bit more comfort if something were to go off the rails.  (Yeah, it can't here, I think -- still a good habit to get into.)

@@ +64,5 @@
> +    for (; ptr != current; ptr++) {
> +        if (*ptr == '\n' || *ptr == '\r') {
> +            ++line;
> +            column = 1;
> +            // \n\r are treated as a single newline

Rather, that should be "\r\n is treated as a single newline." with trailing period.

@@ +65,5 @@
> +        if (*ptr == '\n' || *ptr == '\r') {
> +            ++line;
> +            column = 1;
> +            // \n\r are treated as a single newline
> +            if (*ptr == '\r' && *(ptr+1) == '\n')

Spaces around binary operators like +.  Or, you could do ptr[0] and ptr[1], which I *maybe* prefer, but I'm not sure.

@@ +66,5 @@
> +            ++line;
> +            column = 1;
> +            // \n\r are treated as a single newline
> +            if (*ptr == '\r' && *(ptr+1) == '\n')
> +                ++ptr;

Yeah, this could make ptr go past current.  I am *sort* of thinking it doesn't make stuff go past end, but I am not especially confident in that belief.  :-\  Either way, robustification greatly desired -- just put spaces around the binary operators in the above.  :-)

@@ +79,5 @@
>  {
> +    if (errorHandling == RaiseError) {
> +        uint32_t column = 1, line = 1;
> +        getTextPosition(column, line);
> +        char columnNumber[12], lineNumber[12];

It doesn't matter crazily, but this magic number 12 is 1) unexplained, and 2) off by one, I believe.  Better instead to do it this way:

  const size_t MaxWidth = sizeof("4294967295");
  char lineNumber[MaxWidth], columnNumber[MaxWidth];

@@ +81,5 @@
> +        uint32_t column = 1, line = 1;
> +        getTextPosition(column, line);
> +        char columnNumber[12], lineNumber[12];
> +        JS_snprintf(columnNumber, sizeof columnNumber, "%d", column);
> +        JS_snprintf(lineNumber, sizeof lineNumber, "%d", line);

%d is the wrong format specifier for uint32_t.  (Yes, I tested.)  I think %lu is what you want, instead.

@@ +766,5 @@
>                case Colon:
>                case Comma:
> +                // Move the current pointer backwards so that the position
> +                // reported in the error message is correct.
> +                --current;

I'm not at all sure this is the *only* place that current needs to be decremented.  Probably (followup patch) we should write tests, based on case-wise enumeration of all the JSON grammar, that exercises every possible error case, particularly near the end of the input string, to be confident we've hit all of these cases.

::: js/src/jsonparser.h
@@ +200,5 @@
>      JSObject *createFinishedObject(PropertyVector &properties);
>      bool finishObject(MutableHandleValue vp, PropertyVector &properties);
>      bool finishArray(MutableHandleValue vp, ElementVector &elements);
>  
> +    void getTextPosition(uint32_t &column, uint32_t &line);

SpiderMonkey style is biased against using references as outparams, because it's not obvious at the call site that the parameter might be modified.  We should trivially change these to pointers.
Flags: needinfo?(jwalden+bmo)
(In reply to Nicholas Nethercote [:njn] from comment #56)
> Comment on attachment 8338300 [details] [diff] [review]
> bug507998.patch
> 
> Review of attachment 8338300 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks good!  Thanks for persevering.
> 
> Do you have commit access?  If not, I can land the patch for you.

Thank you for your reviewing.
I don't have commit access.
Attached patch bug507998.patch (obsolete) — Splinter Review
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #57)
> Comment on attachment 8338300 [details] [diff] [review]
> bug507998.patch
> 
> Review of attachment 8338300 [details] [diff] [review]:
> -----------------------------------------------------------------

Thank you for your review.

::: js/src/jsapi-tests/testParseJSON.cpp
@@ +163,5 @@
> >  END_TEST(testParseJSON_success)
> >  
> >  BEGIN_TEST(testParseJSON_error)
> >  {
> > +    CHECK(Error(cx, "["                         , "1", "2"));
>  Hmm, column "2" is debatably the correct thing to show.  I dunno.
I think column "2" is good, If our error-reporting mechanism in JSON have a cursor to point at the offending character(s).

@@ +766,5 @@
> >                case Colon:
> >                case Comma:
> > +                // Move the current pointer backwards so that the position
> > +                // reported in the error message is correct.
> > +                --current;
> I'm not at all sure this is the *only* place that current needs to
> be decremented.  Probably (followup patch) we should write 
> tests, based on case-wise enumeration of all the JSON grammar, 
> that exercises every possible error case, particularly near the end 
> of the input string, to be confident we've hit all of these cases.

This reason is JSONParser::advance() advance the current pointer in spite of error.
Attachment #8339779 - Flags: review?(jwalden+bmo)
Blocks: 931475
Comment on attachment 8339779 [details] [diff] [review]
bug507998.patch

Review of attachment 8339779 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsapi-tests/testParseJSON.cpp
@@ +168,5 @@
> +    CHECK(Error(cx, "[,]"                       , "1", "2"));
> +    CHECK(Error(cx, "[1,]"                      , "1", "4"));
> +    CHECK(Error(cx, "{a:2}"                     , "1", "2"));
> +    CHECK(Error(cx, "{\"a\":2,}"                , "1", "8"));
> +    CHECK(Error(cx, "]"                         , "1", "1"));

Do we have a test for "" being 1/1?  (And for "\n" being 2/1, and "\r", and "\r\n".)  Add one if not.

@@ +169,5 @@
> +    CHECK(Error(cx, "[1,]"                      , "1", "4"));
> +    CHECK(Error(cx, "{a:2}"                     , "1", "2"));
> +    CHECK(Error(cx, "{\"a\":2,}"                , "1", "8"));
> +    CHECK(Error(cx, "]"                         , "1", "1"));
> +    CHECK(Error(cx, "'bad string'"              , "1", "1"));

wrongly-quoted, not bad (just to be clear)

@@ +181,5 @@
> +    CHECK(Error(cx, "\n[1,]"                    , "2", "4"));
> +    CHECK(Error(cx, "\n{a:2}"                   , "2", "2"));
> +    CHECK(Error(cx, "\n{\"a\":2,}"              , "2", "8"));
> +    CHECK(Error(cx, "\n]"                       , "2", "1"));
> +    CHECK(Error(cx, "\n'bad string'"            , "2", "1"));

wrongly-quoted

@@ +182,5 @@
> +    CHECK(Error(cx, "\n{a:2}"                   , "2", "2"));
> +    CHECK(Error(cx, "\n{\"a\":2,}"              , "2", "8"));
> +    CHECK(Error(cx, "\n]"                       , "2", "1"));
> +    CHECK(Error(cx, "\n'bad string'"            , "2", "1"));
> +    CHECK(Error(cx, "\"bad string\n\""          , "1", "1"));

Erm, why is this correct?  Should be 1/12.  Have you run this test in a debug build?  If you're not testing with a debug build, you're making a big mistake, we heavily depend on assertions to detect our mistakes before they ship!

@@ +195,5 @@
> +    CHECK(Error(cx, "\r[1,]"                    , "2", "4"));
> +    CHECK(Error(cx, "\r{a:2}"                   , "2", "2"));
> +    CHECK(Error(cx, "\r{\"a\":2,}"              , "2", "8"));
> +    CHECK(Error(cx, "\r]"                       , "2", "1"));
> +    CHECK(Error(cx, "\r'bad string'"            , "2", "1"));

wrongly-quoted

@@ +196,5 @@
> +    CHECK(Error(cx, "\r{a:2}"                   , "2", "2"));
> +    CHECK(Error(cx, "\r{\"a\":2,}"              , "2", "8"));
> +    CHECK(Error(cx, "\r]"                       , "2", "1"));
> +    CHECK(Error(cx, "\r'bad string'"            , "2", "1"));
> +    CHECK(Error(cx, "\"bad string\r\""          , "1", "1"));

1/12 again?

@@ +209,5 @@
> +    CHECK(Error(cx, "\r\n[1,]"                    , "2", "4"));
> +    CHECK(Error(cx, "\r\n{a:2}"                   , "2", "2"));
> +    CHECK(Error(cx, "\r\n{\"a\":2,}"              , "2", "8"));
> +    CHECK(Error(cx, "\r\n]"                       , "2", "1"));
> +    CHECK(Error(cx, "\r\n'bad string'"            , "2", "1"));

wrongly-quoted

@@ +210,5 @@
> +    CHECK(Error(cx, "\r\n{a:2}"                   , "2", "2"));
> +    CHECK(Error(cx, "\r\n{\"a\":2,}"              , "2", "8"));
> +    CHECK(Error(cx, "\r\n]"                       , "2", "1"));
> +    CHECK(Error(cx, "\r\n'bad string'"            , "2", "1"));
> +    CHECK(Error(cx, "\"bad string\r\n\""          , "1", "1"));

1/12 again.

@@ +241,5 @@
>      CHECK(!ok);
>      CHECK(!p.unexpectedErrorCount);
>      CHECK(p.expectedErrorCount == 1);
> +    CHECK(!expectedColumn || strcmp(expectedColumn, p.column) == 0);
> +    CHECK(!expectedLine || strcmp(expectedLine, p.line) == 0);

Remove the !expectedFoo halves here.  It's impossible for a reference to a const char array to be falsy.

@@ +258,4 @@
>  };
>  
>  static void
>  reportJSONEror(JSContext *cx, const char *message, JSErrorReport *report)

Please fix the name here to reportJSONError ("Error", not "Eror") while you're here.

@@ +263,5 @@
>      ContextPrivate *p = static_cast<ContextPrivate *>(JS_GetContextPrivate(cx));
> +    // Although messageArgs[1] and messageArgs[2] are jschar*, we cast them to char*
> +    // here because JSONParser::error() stores char* strings in them.
> +    strcpy(p->line, (char *)report->messageArgs[1]);
> +    strcpy(p->column, (char *)report->messageArgs[2]);

I'm not sure I believe this.  In a debugger, for the 1/12 case above that I commented on:

(gdb) p report->messageArgs[2]
$21 = (const jschar *) 0x1bf4b50 u"12"
(gdb) p report->messageArgs[1]
$22 = (const jschar *) 0x1bf4b30 u"1"

I don't have enough knowledge to say definitely for sure these actually are jschar* as their type implies.  But why do you think they aren't, here?  My quick stepping-through suggests that the char* passed in, get converted to jschar* under the hood for storage, so maybe you really should be trusting the types here.

::: js/src/jsonparser.cpp
@@ +62,5 @@
> +{
> +    StableCharPtr ptr = begin;
> +    for (; ptr < current; ptr++) {
> +        if (*ptr == '\n' || *ptr == '\r') {
> +            ++*line;

This is almost nitpick, but I'd much rather see

    uint32_t col = 0;
    uint32_t row = 0;

and then increments/sets of those, then a final

    *column = col;
    *line = row;

When you do it this way, correctness doesn't depend on the caller, and it's easier to verify correctness because you only need read this local bit of code.  (This code on its own is not obviously correct or incorrect -- it all depends on the initial values the caller passed in.)

@@ +66,5 @@
> +            ++*line;
> +            *column = 1;
> +            // \r\n is treated as a single newline.
> +            if (*ptr == '\r' && *(ptr + 1) == '\n')
> +                ++ptr;

For a case like

  JSON.parse('{\r');

this will dereference a |current| that's at the end of the input range.  It's not valid to do this per RangedPtr's contract.  (I'm astonished to find that this doesn't assert right now, but I'm going to fix that ASAP -- bug 946382.)  So add in a |ptr + 1 < current| condition here before the '\n' test.

@@ +78,3 @@
>  JSONParser::error(const char *msg)
>  {
> +    if (errorHandling == RaiseError) {

Hmm, this block of lines is pretty mashed together.  Please add a blank line after the getTextPosition line and after the second JS_snprintf.

@@ +767,5 @@
>                case Colon:
>                case Comma:
> +                // Move the current pointer backwards so that the position
> +                // reported in the error message is correct.
> +                --current;

So, this is fine enough.  But you haven't convinced me this is the *only* such place that needs adjustment.  :-)  Just skimming start of file to end, I see a |*current++| inside string parsing (occurring for parsing string contents after an embedded escape) that looks similarly unideal:

js> JSON.parse('["\\t\u0000') 
typein:4:0 SyntaxError: JSON.parse: bad character in string literal at line 1 column 6 of the JSON data

By the logic of the change here, shouldn't that be line 1, column 5?

Looking through the rest of jsonparser.cpp, in order, I also see:

js> JSON.parse('["\\t\\q')
typein:6:0 SyntaxError: JSON.parse: bad escaped character at line 1 column 7 of the JSON data

I...think this is the only other case where we need to back up.  But honestly I think another person doing a fine-toothed comb through jsonparser.cpp is advised here, to be sure we have it all.
Attachment #8339779 - Flags: review?(jwalden+bmo) → review-
Thanks for your reviewing,Jeff.

>@@ +182,5 @@
> +    CHECK(Error(cx, "\n{a:2}"                   , "2", "2"));
> +    CHECK(Error(cx, "\n{\"a\":2,}"              , "2", "8"));
> +    CHECK(Error(cx, "\n]"                       , "2", "1"));
> +    CHECK(Error(cx, "\n'bad string'"            , "2", "1"));
> +    CHECK(Error(cx, "\"bad string\n\""          , "1", "1"));
>Erm, why is this correct?  Should be 1/12.  Have you run this test in a debug build?  If you're not testing with a debug build, you're making a big mistake, we >heavily depend on assertions to detect our mistakes before they ship!

This reason is JSON prohibits single quotes.
I think this test case is confusing.
So I change single quotes into double quotes.
> So, this is fine enough.  But you haven't convinced me this is the *only* such place that needs adjustment.  :-)  Just skimming start of file to end, I see a 
> |*current++| inside string parsing (occurring for parsing string contents after an embedded escape) that looks similarly unideal:
> js> JSON.parse('["\\t\u0000') 
> typein:4:0 SyntaxError: JSON.parse: bad character in string literal at line 1 column 6 of the JSON data
> By the logic of the change here, shouldn't that be line 1, column 5?

No. line 1, column 6 is right.
The needs of "--current" is that JSONParser::advance() advance current pointer in spite of the current pointer has error.
So If erase "--current" the result of the following code is line 1 column 3.

js> JSON.parse("[,]")
typein:2:0 SyntaxError: JSON.parse: unexpected character at line 1 column 3 of the JSON data
Attached patch bug507998.patch (obsolete) — Splinter Review
I fix the code and test case which pointed out.
Attachment #8339779 - Attachment is obsolete: true
Attachment #8342895 - Flags: review?(jwalden+bmo)
Comment on attachment 8342895 [details] [diff] [review]
bug507998.patch

Review of attachment 8342895 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to iseki.m.aa from comment #62)
> No. line 1, column 6 is right.
> The needs of "--current" is that JSONParser::advance() advance current
> pointer in spite of the current pointer has error.
> So If erase "--current" the result of the following code is line 1 column 3.
> 
> js> JSON.parse("[,]")
> typein:2:0 SyntaxError: JSON.parse: unexpected character at line 1 column 3
> of the JSON data

I think we may be talking about different increments that need to be rolled back here.  The rollback you added is fine.  The increments at https://hg.mozilla.org/integration/mozilla-inbound/file/f15bd9ef9b7e/js/src/jsonparser.cpp#l128 and https://hg.mozilla.org/integration/mozilla-inbound/file/f15bd9ef9b7e/js/src/jsonparser.cpp#l146 are what need to be rolled back, for proper column-selection.

js> JSON.parse('["\\t\u0000') 
typein:4:0 SyntaxError: JSON.parse: bad character in string literal at line 1 column 6 of the JSON data

01  [
02  "
03  <\>
04  t
05  \u0000

The first bad character is in column 5.

And likewise:

js> JSON.parse('["\\t\\q')
typein:6:0 SyntaxError: JSON.parse: bad escaped character at line 1 column 7 of the JSON data

01  [
02  "
03  <\>
04  5
05  <\>
06  q

The first bad character is in column 6 (an unexpected character is occurring as part of an escape).

::: js/src/jsapi-tests/testParseJSON.cpp
@@ +185,5 @@
> +    CHECK(Error(cx, "\n[1,]"                    , "2", "4"));
> +    CHECK(Error(cx, "\n{a:2}"                   , "2", "2"));
> +    CHECK(Error(cx, "\n{\"a\":2,}"              , "2", "8"));
> +    CHECK(Error(cx, "\n]"                       , "2", "1"));
> +    CHECK(Error(cx, "\"bad string\n\""          , "1", "12"));

Wait.  This line is perfectly fine, now that it's fixed from 1/1 to 1/12.  But there's nothing wrong with the previous

+    CHECK(Error(cx, "\r\n'bad string'"            , "2", "1"));

except that it's difficult to read.  I meant for you to change that line to

+    CHECK(Error(cx, "\r\n'wrongly-quoted string'" , "2", "1"));

so that you'd preserve the test (which is perfectly valid), just make clearer to the reader what's wrong with it, and why there would be an error at 2/1.

Similarly, I'd like you to add back

+    CHECK(Error(cx, "'bad string'"              , "1", "1"));

as it was in the previous patch, but changed to

+    CHECK(Error(cx, "'wrongly-quoted string'"   , "1", "1"));

and also

+    CHECK(Error(cx, "\r'bad string'"            , "2", "1"));

should be changed to

+    CHECK(Error(cx, "\r'wrongly-quoted string'" , "2", "1"));

and also

+    CHECK(Error(cx, "\r\n'bad string'"            , "2", "1"));

should be changed to

+    CHECK(Error(cx, "\r\n'wrongly-quoted string'" , "2", "1"));

@@ +258,5 @@
>  struct ContextPrivate {
>      unsigned unexpectedErrorCount;
>      unsigned expectedErrorCount;
> +    char column[10];
> +    char line[10];

Add a |static const size_t MaxSize = sizeof("4294967295");| and use MaxSize instead of 10s here.  (Sure, we don't hit those sizes in the test.  It's still bad code.)

@@ +263,4 @@
>  };
>  
>  static void
> +reportJSONError(JSContext *cx, const char *message, JSErrorReport *report)

I should have mentioned this last time, but could you capitalize this while you're here?  SpiderMonkey style capitalizes standalone, non-class methods:

static void
ReportJSONError(JSContext *cx, const char *message, JSErrorReport *report)

@@ +268,5 @@
>      ContextPrivate *p = static_cast<ContextPrivate *>(JS_GetContextPrivate(cx));
> +    // Although messageArgs[1] and messageArgs[2] are jschar*, we cast them to char*
> +    // here because JSONParser::error() stores char* strings in them.
> +    strcpy(p->line, (char *)report->messageArgs[1]);
> +    strcpy(p->column, (char *)report->messageArgs[2]);

You didn't change these, they're still wrong:

../../jsapi-tests/testParseJSON.cpp:189:CHECK failed: Error(cx, "\"bad string\n\"" , "1", "12")
TEST-UNEXPECTED-FAIL | testParseJSON_error | CHECK failed: strcmp(expectedColumn, p.column) == 0CHECK failed: Error(cx, "\"bad string\n\"" , "1", "12")

You want js_strncpy, without casts, and you want js_strlen(report->messageArgs[1]) as the n to pass in.
Attachment #8342895 - Flags: review?(jwalden+bmo) → review-
Attached patch bug507998.patchSplinter Review
Thank you for your reviewing.

I fix the code which pointed out and make js_strcmp to compare jschar.
Attachment #8342895 - Attachment is obsolete: true
Attachment #8345746 - Flags: review?(jwalden+bmo)
Comment on attachment 8345746 [details] [diff] [review]
bug507998.patch

Review of attachment 8345746 [details] [diff] [review]:
-----------------------------------------------------------------

This is close enough I ended up making the changes noted below myself, and pushing them combined with your patch as one.  :-)  Hope you don't mind that.  Thanks for the patch (and for working through all the various issues encountered along the way in it)!

::: js/src/jsapi-tests/testParseJSON.cpp
@@ +163,5 @@
>  END_TEST(testParseJSON_success)
>  
>  BEGIN_TEST(testParseJSON_error)
>  {
> +    CHECK(Error(cx, ""                                  , "1", "1"));

Ugh, all these lines changed the number of spaces in them for alignment purposes, so Bugzilla's interdiff thinks the entire thing changed.  For future reference, you might consider avoiding such changes when posting revised patches -- maybe make the alignment bits in a separate patch or something.  Or not, it's not the end of the world, but it does make for more reviewer effort.  It's a messy problem, sadly, for which there's not really a good solution.  :-\

(Personally, because of the above issue, I tend to think trying to align stuff is overrated, and I don't align the code I write.  But it's a personal choice, I don't complain about it when I see it, generally.)

@@ +228,5 @@
> +
> +    CHECK(Error(cx, "{\n\"a\":[2,3],\r\"b\":,5,6}"      , "3", "5"));
> +    CHECK(Error(cx, "{\r\"a\":[2,3],\n\"b\":,5,6}"      , "3", "5"));
> +    CHECK(Error(cx, "[\"\\t\\q"                         , "1", "6"));
> +    CHECK(Error(cx, "[\"\\t\u0000"                      , "1", "5"));

It occurs to me we don't have a test here for an input string like "[\"\\t\\uZ", where there's a malformed Unicode escape.  The current code will point out the error at the <\> that's the start of the escape -- not the individual non-hex number that follows.  So "[\"\\t\\uZ", "[\"\\t\\uAZ", "[\"\\t\\uAAZ", and "[\"\\t\\uAAAZ" will all point at the backslash, not at the first Z.  That's "good enough" for a first patch.  But we might want to change to be more precise, in a followup.  I'll add these as tests here when landing.

...and actually, after making that change.  These tests all test the same thing!  If you look, you can see that the |end - current < 4| case is handled separately from the case where one of the four subsequent characters is non-hexadecimal.  Those cases should probably be handled similarly.

And, um, after adding a bunch more tests, I realized that we were forgetting to roll back |current| in the case where a truncated \uF000 was encountered in a string.  That exposed that the column number was weirdly different between "\uF" and "\uFZZZZ".  So in the end I just ended up rewriting the Unicode escape stuff to all be exact-character-precise, and I added all the tests to exercise all the code paths for all the different ways things could fail.  I'll post the delta (run past jorendorff on IRC for rubberstamp) after this.

::: js/src/jsonparser.cpp
@@ +85,5 @@
> +        uint32_t column = 1, line = 1;
> +        getTextPosition(&column, &line);
> +
> +        const size_t MaxWidth = sizeof("4294967295");
> +        char columnNumber[MaxWidth], lineNumber[MaxWidth];

char columnNumber[MaxWidth];
char lineNumber[MaxWidth];

We're sort of trying to move away from multiple declarations that depend on C/C++'s ability to define differently-typed things all at once -- most often |char c, *ptr;|, but this is a somewhat similar case, because it's only by choice that both arrays have the same width.  So multiple definitions here seems slightly better.

@@ +775,5 @@
>                case Colon:
>                case Comma:
> +                // Move the current pointer backwards so that the position
> +                // reported in the error message is correct.
> +                --current;

It occurs to me that these "random"-looking decrements, rather than commenting all of them (or only this one), should probably be encapsulated in a single method whose name indicates its purpose.  I'll do that in a separate patch and/or bug, certainly not necessary to hold up this for that.  :-)

::: js/src/jsstr.cpp
@@ +4177,5 @@
> +js_strcmp(const jschar *lhs, const jschar *rhs)
> +{
> +    while (true) {
> +        if (*lhs == 0 && *rhs == 0)
> +            break;

SpiderMonkey style is to not have |else| after a control-flow statement that causes the |else| to not be necessary.  Here, if you break, you'll never hit the |else|.  Ditto for the |else| after the |return|.

A few minor tweaks past that.  First, instead of checking ordering we can check for inequality, returning the difference in that case.  This exposes more information than +1/-1 would, and it's slightly more compact to boot.  Then after that we only need one if-zero check to handle the end-of-string case.  And if we do it that way, we don't need the final |return 0;| at the end, either.

So this method body might as well be this, at the end of it all:

    while (true) {
        if (*lhs != *rhs)
            return int32_t(*lhs) - int32_t(*rhs);
        if (*lhs == 0)
            return 0;
        ++lhs, ++rhs;
    }

::: js/src/jsstr.h
@@ +225,5 @@
>  
>  extern size_t
>  js_strlen(const jschar *s);
>  
> +extern size_t

Use int32_t here -- suitable to hold the difference between two uint16_t characters, so that this can do the usual thing of returning the difference between the first different characters.
Attachment #8345746 - Flags: review?(jwalden+bmo) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/3c86dfbca278

Also at one point I found that StableCharPtr and the similar smart-character-pointer classes didn't have a useful operator=, so I fixed that in a separate push, made at the same time as this, needed for the delta to the patch here to work.  Or at least needed for one version of it, come to think of it, that might have been lost during tweaks at some point.  Still valuable even if not used here, for sure.

https://hg.mozilla.org/integration/mozilla-inbound/rev/2d2e77e82179
Target Milestone: --- → mozilla29
Followup -- seems MSVC at least (no idea why apparently no one else!) doesn't like "\u0000" as a C++ string literal:

https://hg.mozilla.org/integration/mozilla-inbound/rev/c0f0a2299dd0
And another, to remove an error-message-check from a test (error messages aren't a stable interface, tests shouldn't depend on them, else you get time bombs like this):

https://hg.mozilla.org/integration/mozilla-inbound/rev/03d86cd05b7e
Keywords: feature
Hardware: x86 → All
Keywords: dev-doc-needed
Whiteboard: [mentor=Waldo][lang=c++] → [mentor=Waldo][lang=c++][DocArea=JS]
This seems pretty well covered automatically. Is there any need for manual QA here?
Flags: needinfo?(iseki.m.aa)
(In reply to Ioana Budnar, QA [:ioana] from comment #73)
> This seems pretty well covered automatically. Is there any need for manual
> QA here?

Sorry, I don't know well about QA.
What is the manual QA and covered automatically?
Flags: needinfo?(iseki.m.aa)
(In reply to Masaya Iseki[:isk](UTC+9) from comment #74)
> (In reply to Ioana Budnar, QA [:ioana] from comment #73)
> > This seems pretty well covered automatically. Is there any need for manual
> > QA here?
> 
> Sorry, I don't know well about QA.
> What is the manual QA and covered automatically?

By covered automatically I was referring to the automated tests already testing this feature, like https://hg.mozilla.org/mozilla-central/diff/3c86dfbca278/js/src/jsapi-tests/testParseJSON.cpp.

Manual QA would mostly involve someone testing this feature manually. For example: if a part of this feature can't be tested automatically, we can verify that it works manually (exercise that part of the feature).
Thank you for your answer.

There are no part which can't be tested automatically.So it doesn't need manual QA.
Whiteboard: [mentor=Waldo][lang=c++][DocArea=JS] → [mentor=Waldo][lang=c++][DocArea=JS][qa-]
You need to log in before you can comment on or make changes to this bug.