Closed Bug 489089 Opened 11 years ago Closed 11 years ago

JSON.parse is way slower than it needs to be

Categories

(Core :: JavaScript Engine, defect, P2)

Tracking

RESOLVED FIXED
mozilla1.9.1b4

People

(Reporter: sayrer, Assigned: brendan)

Details

(Keywords: fixed1.9.1, Whiteboard: fixed-in-tracemonkey)

Attachments

(3 files, 5 obsolete files)

For some reason, I foolishly dropped an optimization the old implementation had, and also JSStringBuffers shouldn't be grown one byte at a time.

The attached patch is much faster. With the patch, the test located at

http://hg.mozilla.org/users/rsayre_mozilla.com/benchmarker/file/8a4d519f109c/SunSpider/tests/json/parse-financial.js

is about 3x faster, and now spends 1/3rd of its time in JS_DefineUCProperty. Here's the allocation avoidance:

before:
malloc/free: 4,647,202 allocs, 4,645,811 frees, 76,819,059 bytes allocated.

after:
malloc/free: 456,202 allocs, 454,811 frees, 19,633,059 bytes allocated.
w191, pls.
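The allocation counts above come from growing the scanner's string buffer one jschar at a time; the fix is geometric growth. A minimal standalone sketch of that idea (a toy `CharBuffer`, not the actual JSStringBuffer code; the 64-element minimum capacity is made up for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy sketch (not the real JSStringBuffer): grow geometrically instead of
// realloc'ing one character at a time, so appending n characters costs
// O(log n) grow operations instead of O(n).
class CharBuffer {
  public:
    void append(char16_t c) {
        if (len_ == cap_)
            grow(cap_ ? cap_ * 2 : 64);  // double; never grow by one
        data_[len_++] = c;
    }
    std::size_t length() const { return len_; }
    std::size_t growOps() const { return grows_; }  // instrumentation only

  private:
    void grow(std::size_t newCap) {
        data_.resize(newCap);
        cap_ = newCap;
        ++grows_;
    }
    std::vector<char16_t> data_;
    std::size_t len_ = 0, cap_ = 0, grows_ = 0;
};
```

Appending ten thousand characters then takes a handful of grow operations rather than ten thousand reallocations, which is the shape of the malloc/free improvement quoted above.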
Flags: wanted1.9.1+
Attachment #373580 - Flags: review?(igor)
Attached patch patch for jsscan.cpp (obsolete) — Splinter Review
The underlying bug is in jsscan.cpp:GrowStringBuffer, and it bites E4X too. Sorry for the bug; it doesn't bite E4X as hard (evidence: no bug filed with XML-in-JS test input), and really, E4X is not as important as JSON.

/be
Attachment #373614 - Attachment is obsolete: true
Should convert JSStringBuffer struct/functions to idiomatic C++ class/methods, but not for this fix.

/be
Assignee: general → brendan
Attachment #373616 - Attachment is obsolete: true
Status: NEW → ASSIGNED
What produced the

malloc/free: 456,202 allocs, 454,811 frees, 19,633,059 bytes allocated.

lines? I played with malloc_history on Mac OS X but didn't find anything like that.

/be
OS: Mac OS X → All
Priority: -- → P2
Hardware: x86 → All
Target Milestone: --- → mozilla1.9.1b4
I see no benefit to adding another layer of buffering (stack-allocated, but that is not material) above JSStringBuffer in js_ConsumeJSONText, which requires the override-jp->buffer extra arguments and logic, just to work around the 1994-era and so-far E4X-only GrowStringBuffer bug. The minimal changes are:

1. Fix jsscan.cpp:GrowStringBuffer to avoid realloc'ing by one jschar worst case.

2. Fix json.cpp to reuse objectKey and buffer rather than thrashing via malloc.

If something like PUSHCHAR is justified by profiling data, js_AppendChar or an inline method of a class JSStringBuffer can be provided, using the one underlying realloc'ed buffer.

/be
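Item 1 amounts to rounding a requested capacity up rather than realloc'ing to the exact new size. A hypothetical sketch of that rounding (the helper name and the 64-unit minimum are assumptions for illustration, not the patch's actual code):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch of fix (1): round a requested capacity up to the
// next power of two, so a worst-case "grow by one jschar" request still
// doubles the allocation instead of realloc'ing for a single character.
std::size_t roundUpCapacity(std::size_t needed) {
    const std::size_t MIN_CAP = 64;  // assumed minimum, not the patch's value
    std::size_t cap = MIN_CAP;
    while (cap < needed)
        cap *= 2;
    return cap;
}
```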
(In reply to comment #5)
> What produced the
> 
> malloc/free: 456,202 allocs, 454,811 frees, 19,633,059 bytes allocated.

Valgrind
(In reply to comment #6)
> 
> 1. Fix jsscan.cpp:GrowStringBuffer to avoid realloc'ing by one jschar worst
> case.
> 
> 2. Fix json.cpp to reuse objectKey and buffer rather than thrashing via malloc.

I thought it would be better to free the buffer if it had gotten a bit big.

> If something like PUSHCHAR is justified by profiling data, js_AppendChar or an
> inline method of a class JSStringBuffer can be provided, using the one
> underlying realloc'ed buffer.

Without the PUSHCHAR stuff, the initial capacity of at least jp->buffer needs to be set. That stack-allocated thing was a big win in the old XPCOM implementation.
Attachment #373580 - Flags: review?(igor)
Attachment 373617 [details] [diff] is slower than my initial cut:

277ms vs 296ms. Will stick it in the profiler.
Attached file test case
> 
> If something like PUSHCHAR is justified by profiling data, js_AppendChar or an
> inline method of a class JSStringBuffer can be provided, using the one
> underlying realloc'ed buffer.

The difference between the two implementations is now that js_AppendChar is slower than PUSHCHAR. It is short and not inlined.
(In reply to comment #8)
> (In reply to comment #6)
> > 
> > 1. Fix jsscan.cpp:GrowStringBuffer to avoid realloc'ing by one jschar worst
> > case.
> > 
> > 2. Fix json.cpp to reuse objectKey and buffer rather than thrashing via malloc.
> 
> I thought it would be better to free the buffer if it had gotten a bit big.

The parser is freed soon enough. Dynamic memory is for problems requiring dynamic range; let's wait for this to become a real problem before freeing eagerly.

(In reply to comment #11)
> > 
> > If something like PUSHCHAR is justified by profiling data, js_AppendChar or an
> > inline method of a class JSStringBuffer can be provided, using the one
> > underlying realloc'ed buffer.
> 
> The difference between the two implementations is now that js_AppendChar is
> slower than PUSHCHAR. It is short and not inlined.

Comment 6 last paragraph!

/be
> > > 
> > > If something like PUSHCHAR is justified by profiling data, js_AppendChar or an
> > > inline method of a class JSStringBuffer can be provided, using the one
> > > underlying realloc'ed buffer.
> > 
> > The difference between the two implementations is now that js_AppendChar is
> > slower than PUSHCHAR. It is short and not inlined.
> 
> Comment 6 last paragraph!

...is what I just quoted above? ;)

Shark shows the parser loop is a bit slower, and that js_AppendChar spends a lot of time in the prologue and epilogue. So, inlining should be an improvement, but I don't know if it will make up 20ms on the benchmark.
Attached patch js_FastAppendChar (obsolete) — Splinter Review
js_AppendChar stores two characters, not one: the appended character plus a trailing NUL. This is by design, but it makes js_AppendChar the wrong tool for the inner-scanner-loop job (jsscan.cpp doesn't use it in the main JS lexer). This patch exposes js_FastAppendChar to avoid the unnecessary zero-termination cost.

/be
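The distinction can be sketched in a few lines (a toy `Buf` type, not the real JSStringBuffer; capacity checks are omitted, and room for the terminator is assumed):

```cpp
#include <cassert>
#include <cstddef>

// Toy sketch of the two append styles: appendChar keeps the buffer
// NUL-terminated after every call (two stores), while fastAppendChar does
// a single store and defers termination to the end of the loop.
struct Buf {
    char16_t data[16];
    std::size_t len = 0;

    void appendChar(char16_t c) {      // two stores per call, by design:
        data[len++] = c;               // the character...
        data[len] = 0;                 // ...plus a trailing NUL every time
    }
    void fastAppendChar(char16_t c) {  // one store; terminate once, later
        data[len++] = c;
    }
    void terminate() { data[len] = 0; }
};
```

In an inner scanner loop the deferred-termination version does half the stores, which is the cost the patch is shaving.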
Attachment #373617 - Attachment is obsolete: true
Sorry, bugmail-reading tunnel vision!

Profile the latest patch; it should have well-predicted, near-target branches and run about as fast.

PUSHCHAR has fewer branches so it might win slightly, but apart from making js_FastAppendChar faster without losing something important (STRING_BUFFER_OK?), we shouldn't throw an extra layer of buffering and code size at the last few ms.

/be
(In reply to comment #15)
> Sorry, bugmail-reading tunnel vision!
> 
> Profile the latest patch, it should have well-predicted near-target branches
> and run about as fast.

It makes up about half of the difference. The results are more stable if you stick

    jp->buffer->grow(jp->buffer, 256);

in js_BeginJSONParse.
Will solicit review after sayrer profiles and stuff.

/be
Attachment #373653 - Attachment is obsolete: true
Nice. A touch faster, and passes regression tests. This profile is now dominated by property definition and object creation.
Attachment #373671 - Flags: review?(sayrer)
Attachment #373671 - Flags: review?(igor)
Attachment #373671 - Flags: review?(sayrer) → review+
Comment on attachment 373671 [details] [diff] [review]
hoist STRING_BUFFER_OK from js_FastAppendChar, fix OOM handling, don't dynamically allocate what should be inline struct members, etc.

>diff --git a/js/src/json.cpp b/js/src/json.cpp
>--- a/js/src/json.cpp
>+++ b/js/src/json.cpp
>@@ -1,4 +1,7 @@
> 
>+    jp->buffer.grow(&jp->buffer, JSON_INITIAL_BUFSIZE);
>     return jp;

Shouldn't this check for OOM in grow?


>@@ -832,40 +821,35 @@ HandleKeyword(JSContext *cx, JSONParser 
> static JSBool
> HandleData(JSContext *cx, JSONParser *jp, JSONDataType type)
> {
>-  JSBool ok = JS_FALSE;
>+    JSBool ok = JS_FALSE;
> 
>-  if (!STRING_BUFFER_OK(jp->buffer)) {
>-      // what to call this error?
>-      JS_ReportErrorNumber(cx, js_GetErrorMessage, NULL, JSMSG_JSON_BAD_PARSE);
>-      return JS_FALSE;
>-  }
>+    switch (type) {
>+      case JSON_DATA_STRING:
>+        ok = HandleString(cx, jp, jp->buffer.base, STRING_BUFFER_OFFSET(&jp->buffer));
>+        break;
> 
>-  switch (type) {
>-    case JSON_DATA_STRING:
>-      ok = HandleString(cx, jp, jp->buffer->base, STRING_BUFFER_OFFSET(jp->buffer));
>-      break;
>+      case JSON_DATA_KEYSTRING:
>+        js_AppendUCString(&jp->objectKey, jp->buffer.base, STRING_BUFFER_OFFSET(&jp->buffer));
>+        ok = STRING_BUFFER_OK(&jp->objectKey);
>+        if (!ok) {
>+            JS_ReportOutOfMemory(cx);
>+        }

Nit: no braces around JS_ReportOutOfMemory.

>+        break;
> 
>-    case JSON_DATA_KEYSTRING:
>-      js_AppendUCString(jp->objectKey, jp->buffer->base, STRING_BUFFER_OFFSET(jp->buffer));
>-      ok = STRING_BUFFER_OK(jp->objectKey);
>-      break;
>+      case JSON_DATA_NUMBER:
>+        ok = HandleNumber(cx, jp, jp->buffer.base, STRING_BUFFER_OFFSET(&jp->buffer));
>+        break;
> 
>-    case JSON_DATA_NUMBER:
>-      ok = HandleNumber(cx, jp, jp->buffer->base, STRING_BUFFER_OFFSET(jp->buffer));
>-      break;
>+      case JSON_DATA_KEYWORD:
>+        ok = HandleKeyword(cx, jp, jp->buffer.base, STRING_BUFFER_OFFSET(&jp->buffer));
>+        break;
> 
>-    case JSON_DATA_KEYWORD:
>-      ok = HandleKeyword(cx, jp, jp->buffer->base, STRING_BUFFER_OFFSET(jp->buffer));
>-      break;
>+      default:
>+        JS_NOT_REACHED("Should have a JSON data type");

Pre-existing nit: replace the last JSON_DATA_KEYWORD case and the default case with:

      default:
        JS_ASSERT(type == JSON_DATA_KEYWORD);

This way it would be possible to avoid useless JSBool ok = JS_TRUE initialization.

>+    js_RewindStringBuffer(&jp->buffer);

This must not be called when the buffer is not OK, as the function does JS_ASSERT(STRING_BUFFER_OK(sb));
Attachment #373671 - Flags: review?(igor) → review-
Indeed there is no error-checking silver bullet, and certainly not the errno-like STRING_BUFFER_OK non-solution. I knew from long Unix experience that errno was a botch. Exceptions or return-value checks are the only reliable ways.

/be
Attachment #373671 - Attachment is obsolete: true
Attachment #373871 - Flags: review?(igor)
Bugzilla interdiff works. It won't show it, but I did check that HandleData's failure return value is always checked and immediately propagated, avoiding a double test of STRING_BUFFER_OK and a double OOM report.

/be
Comment on attachment 373871 [details] [diff] [review]
fixes for review issues, plus more error checking

>+static inline void
>+js_FastAppendChar(JSStringBuffer *sb, jschar c)
>+{
>+    JS_ASSERT(STRING_BUFFER_OK(sb));
>+    if (!ENSURE_STRING_BUFFER(sb, 1))
>+        return;

I suspect that the ultimate solution would be to change js_FastAppendChar to return false on failure. With a modern compiler it may even generate better code than those delayed STRING_BUFFER_OK(sb) checks. For example, given the following code,

struct S {
    char *buffer;
    char *end;

    bool slow_add(char c);

    bool add(char c)
    {
        if (buffer == end)
            return slow_add(c);
        *buffer++ = c;
        return true;
    }
};

bool test(S *s)
{
    return s->add('X') && s->add('Y') && s->add('Z');
}

GCC 4.2.4 at -O3 does the right thing for s->add('X') && s->add('Y') && s->add('Z') and generates code corresponding to the following (I was rather impressed by that):

    if (s->buffer == s->end) goto slow_1;
    *s->buffer++ = 'X';
  fast_2:
    if (s->buffer == s->end) goto slow_2;
    *s->buffer++ = 'Y';
  fast_3:
    if (s->buffer == s->end) goto slow_3;
    *s->buffer++ = 'Z';
    return true;

  slow_1:
    if (!s->slow_add('X')) return false;
    goto fast_2;

  slow_2:
    if (!s->slow_add('Y')) return false;
    goto fast_3;

  slow_3:
    return s->slow_add('Z');

Note that there are no extra checks if the result of the inlined add is true. The fast path just checks for buffer boundaries. But this is for another bug.
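For completeness, the reviewer's sketch can be made runnable by giving slow_add a concrete body. Everything in slow_add here is an assumption made for illustration: it doubles a capacity limit inside a small fixed backing array and fails once that array is exhausted, just to exercise the bool-returning contract.

```cpp
#include <cassert>
#include <cstddef>

// Runnable version of the bool-returning add() sketch. slow_add's growth
// policy and the 8-byte backing store are illustrative assumptions.
struct S {
    char storage[8] = {};
    char *buffer = storage;  // next free slot
    char *end = storage;     // start with zero capacity to force slow_add

    bool slow_add(char c) {
        char *limit = storage + sizeof storage;
        if (end == limit)
            return false;                        // truly out of space
        std::size_t cap = std::size_t(end - storage);
        std::size_t newCap = cap ? cap * 2 : 2;  // double the capacity
        if (newCap > sizeof storage)
            newCap = sizeof storage;
        end = storage + newCap;
        return add(c);                           // retry on the fast path
    }

    bool add(char c) {
        if (buffer == end)
            return slow_add(c);                  // out-of-line slow path
        *buffer++ = c;
        return true;
    }
};
```

The fast path is the two-instruction compare-and-store; every failure is reported through the return value rather than a deferred buffer-OK flag.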
Attachment #373871 - Flags: review?(igor) → review+
Nice, the prefetched path is the fast path, even without PGO. Would GCC do that if you wrote out if (!s->add('X')) return false; if (!s->add('Y')) return false; etc.?

Landed in tm:

http://hg.mozilla.org/tracemonkey/rev/89d209a23a3a

I filed bug 489419 on JSStringBuffer cleanup.

/be
Whiteboard: fixed-in-tracemonkey
Ahem:

http://hg.mozilla.org/tracemonkey/rev/2ee1b85ffe90

Necessary followup patch.

/be
Still failing. Backing out?
Whiteboard: fixed-in-tracemonkey
One more followup. I am a crispy critter.

http://hg.mozilla.org/tracemonkey/rev/eb2406a72609

/be
Whiteboard: fixed-in-tracemonkey
http://hg.mozilla.org/mozilla-central/rev/89d209a23a3a
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
(In reply to comment #27)
> http://hg.mozilla.org/mozilla-central/rev/89d209a23a3a

The followups were merged as well.