Bug 410890 - nsJSON speedups
Status: RESOLVED FIXED
Keywords: perf
Product: Core
Classification: Components
Component: DOM
Version: Trunk
Hardware: All
OS: All
Priority: --
Severity: normal
Target Milestone: mozilla1.9beta4
Assigned To: Robert Sayre
: Andrew Overholt [:overholt]
Depends on: 633934
Reported: 2008-01-05 00:18 PST by Robert Sayre
Modified: 2011-02-14 05:01 PST

Attachments
- simple benchmarking test using xpcshell script (3.59 KB, text/plain), 2008-01-07 11:03 PST, Mark Finkle (:mfinkle)
- use a PRUnichar[] buffer in the Encode routine (4.68 KB, patch), 2008-01-07 17:08 PST, Robert Sayre
- another simple benchmark using a large array (217.68 KB, text/plain), 2008-01-10 13:42 PST, Mark Finkle (:mfinkle)
- WIP, getting faster (18.76 KB, patch), 2008-01-12 17:20 PST, Robert Sayre
- WIP update (28.29 KB, patch), 2008-01-22 16:18 PST, Robert Sayre
- nit fix (28.34 KB, patch), 2008-01-22 16:34 PST, Robert Sayre [crowderbt: review+, jst: superreview+]
- address crowder and jst comments (9.57 KB, patch), 2008-02-05 13:33 PST, Robert Sayre [sayrer: review+, sayrer: superreview+]
- address crowder and jst comments (29.62 KB, patch), 2008-02-05 13:35 PST, Robert Sayre [sayrer: review+, sayrer: superreview+, mtschrep: approval1.9+]

Description Robert Sayre 2008-01-05 00:18:20 PST
Shark says a good amount of decoding time is spent in string manipulation (appending one char at a time in the loop) and creating about:blank for the channel, so we'll route around that.
Comment 1 Mark Finkle (:mfinkle) 2008-01-07 11:03:00 PST
Created attachment 295803 [details]
simple benchmarking test using xpcshell script

This simple JS script tests the native JSON implementation against the JSM version and the json.org (json2.js) version.

I tested like this from xpcshell:
js> version(170);
js> load("path/to/json-benchmark.js");
js> load("path/to/json2.js");
js> startEncode();
js> startDecode();

json2.js is in the tree, or it can be downloaded from json.org.
Comment 2 Robert Sayre 2008-01-07 13:53:36 PST
Encode test case at 1000 iterations:

Shark Bottom-Up summary.

6.8%  szone_free  
5.3%  nsAString_internal::Replace(unsigned, unsigned, unsigned short const*, unsigned)  
4.2%  pthread_mutex_lock  
3.5%  nsAString_internal::ReplacePrep(unsigned, unsigned, unsigned) 
3.2%  pthread_mutex_unlock  
2.5%  js_Enumerate  
2.2%  JS_dtostr 
2.2%  szone_malloc  
2.1%  nsJSONWriter::WriteString(unsigned short const*, unsigned)  
2.0%  nsJSON::EncodeObject(JSContext*, long*, nsJSONWriter*, JSObject*, unsigned) 
2.0%  tiny_malloc_from_free_list  
1.8%  js_LookupPropertyWithFlags  
1.8%  szone_size  
1.6%  js_NewGCThing 

Three main issues here. 
  1.) String manipulation from nsJSONWriter::Write is expensive
  2.) js_AtomizeString calls to PR_Lock are expensive
  3.) JS_Enumerate allocations are expensive
Comment 3 Robert Sayre 2008-01-07 17:08:28 PST
Created attachment 295872 [details] [diff] [review]
use a PRUnichar[] buffer in the Encode routine

This patch pretty much removes string manipulation from the profile. We do
spend some extra time in nsJSON::Write, but it's a win overall.

Before

iterations  nsJSON  json.jsm  json2.js
20          5       22        38
50          9       52        95
100         17      103       196
200         34      198       379
500         88      496       955
750         130     994       1371
1000        166     958       2141
                    
After

iterations  nsJSON  json.jsm  json2.js
20          3       20        39
50          7       51        97
100         15      101       192
200         30      206       386
500         77      500       964
750         117     1047      1394
1000        145     972       2156
                              
Shark After:
     
6.7% szone_free  
3.8% pthread_mutex_lock  
3.7% pthread_mutex_unlock  
3.6% nsJSONWriter::Write()
2.8% js_Enumerate  
2.6% nsJSONWriter::WriteString()
2.5% JS_dtostr 
2.3% szone_malloc  
2.1% nsJSON::EncodeObject() 
2.1% js_NewGCThing 
2.0% szone_size  
2.0% js_LookupPropertyWithFlags  
1.8% tiny_malloc_from_free_list  
1.7% js_CallIteratorNext 
1.6% js_SearchScope
1.4% XUL SearchTable
Comment 4 Robert Sayre 2008-01-08 00:17:22 PST
Bottom-up view shows lots of allocator thrashing from string mutation.

Decode top-down view:

 96.2%  start 
 96.2%   _start 
 96.2%    main  
 88.1%     Process() 
 88.1%      ProcessFile()  
 88.0%       JS_ExecuteScript 
 88.0%        js_Execute  
 88.0%         js_Interpret 
 87.9%          js_Invoke 
 79.1%           Load()  
 78.9%            JS_ExecuteScript  
 78.9%             js_Execute 
 78.9%              js_Interpret  
 78.7%               js_Invoke  
 78.3%                XPC_WN_CallMethod()  
 78.1%                 XPCWrappedNative::CallMethod()  
 77.8%                  NS_InvokeByIndex_P  
 77.3%                   nsJSON::Decode()  
 77.2%                    nsJSON::DecodeInternal() 
 72.0%                     nsJSONListener::OnDataAvailable()  
 71.8%                      nsJSONListener::ProcessBytes() 
 71.4%                       nsJSONListener::Consume() 
 33.8%                        nsAString_internal::Replace() 
 19.8%                        nsJSONListener::HandleString()  
 8.6%                          nsJSONListener::PushValue() 
 4.0%                          nsAString_internal::Assign()  
 3.3%                          nsAString_internal::SetLength()  
 2.8%                          js_NewStringCopyN  
 0.4%                          nsStringBuffer::AddRef() 
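
For context on why nsAString_internal::Replace dominates nsJSONListener::Consume above: appending one character at a time re-checks capacity and may reallocate on every call, while reserving space and appending in chunks does that work once. A minimal standalone illustration follows, using std::u16string as a stand-in for the Mozilla string classes; nothing below is from the patch.

#include <cstddef>
#include <string>

// Each += re-checks capacity and may reallocate and copy the whole buffer;
// this is the pattern that shows up as Replace()/ReplacePrep() and
// allocator churn in the profile above.
void AppendPerChar(std::u16string& out, const char16_t* chars, std::size_t len)
{
  for (std::size_t i = 0; i < len; ++i)
    out += chars[i];
}

// Growing once and copying in a single call amortizes the bookkeeping.
void AppendChunked(std::u16string& out, const char16_t* chars, std::size_t len)
{
  out.reserve(out.size() + len);
  out.append(chars, len);
}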
Comment 5 Mark Finkle (:mfinkle) 2008-01-10 13:42:31 PST
Created attachment 296405 [details]
another simple benchmark using a large array

This is another xpcshell benchmark script. It uses a 1600-item array to create a variable-length chunk of JSON data. It benchmarks with 25, 50, 100, 200, 400, 800 and 1600-item chunks to see how the parsers handle different data sizes (rather than iterating over a fixed-size chunk like the other benchmark script).

Load xpcshell and initialize the same as before:
js> version(170);
js> load("path/to/json2.js");
js> load("path/to/json-array.js");
js> startEncode();
...
js> startDecode();
Comment 6 Robert Sayre 2008-01-12 17:20:50 PST
Created attachment 296770 [details] [diff] [review]
WIP, getting faster

encode attachment 295803 [details]

rounds  nsJSON-patch  nsJSON  JSON.jsm  json2.js
    20             3       3        21        43
    50             7       9        52        95
   100            15      16       108       189
   200            29      34       214       380
   500            76      87       504       943
   750           113     128      1077      1353
  1000           143     161       977      2134

decode attachment 295803 [details]
rounds  nsJSON-patch  nsJSON  JSON.jsm  json2.js
    20             3       6        13        10
    50             6      14        31        25
   100            12      26        67        53
   200            25      52       121       103
   500            63     128       303       260
   750            94     192       460       390
  1000           124     255       843       514

encode attachment 296405 [details]
rounds  nsJSON-patch  nsJSON  JSON.jsm  json2.js
    24             0       1         2         6
    49             0       1         5        11
    99             1       2        12        25
   199             3       3        23        48
   399             5       6        48        89
   799            10      11        94       177
  1599            20      23       191       356

decode attachment 296405 [details]
rounds  nsJSON-patch  nsJSON  JSON.jsm  json2.js
    24             0       1         2         2
    49             1       2         2         3
    99             1       3         4         5
   199             3       5         8        10
   399             4       9        17        20
   799             8      18        36        39
  1599            18      41        65        79
Comment 7 Robert Sayre 2008-01-22 16:18:52 PST
Created attachment 298586 [details] [diff] [review]
WIP update
Comment 8 Robert Sayre 2008-01-22 16:34:00 PST
Created attachment 298589 [details] [diff] [review]
nit fix
Comment 9 Robert Sayre 2008-01-24 07:12:45 PST
Comment on attachment 298589 [details] [diff] [review]
nit fix

switching review to crowder, brendan has bigger bacon to fry
Comment 10 Brian Crowder 2008-01-24 11:12:01 PST
Comment on attachment 298589 [details] [diff] [review]
nit fix

+    mBufferCount = 0;
+  }
+
+  memcpy(&mBuffer[mBufferCount], aBuffer, aLength * sizeof(PRUnichar));
+  mBufferCount += aLength;

What's the purpose of initially setting mBufferCount to 0 here?

+#define PUSHCHAR(_c)                         \

Ever undefine this when you're done with it? Is that a SpiderMonkey-only style thing?

Otherwise looks good to me, pending jst's happiness.
Comment 11 Johnny Stenback (:jst, jst@mozilla.com) 2008-01-25 15:32:28 PST
Comment on attachment 298589 [details] [diff] [review]
nit fix

+static const nsString kJSONNumChars = NS_LITERAL_STRING("-+0123456789eE.");

Last I heard, which was a long time ago now so this could've changed, we're not supposed to have statics of types that have constructors and/or destructors.

Either way, that could be a nsDependentString to save on a malloc, but I also wonder if what this is used for couldn't be written faster with a little helper here that checks if c < '9' && c > '0' and a switch statement for the rest?

sr=jst either way.
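
As an illustration of the helper jst describes, here is a sketch only, with a hypothetical name and inclusive digit bounds; this is not the code that landed:

// Hypothetical helper, not from the patch: classify JSON number characters
// with a range check for digits plus a switch for the sign, exponent, and
// decimal-point characters, instead of searching a static string.
static inline bool IsJSONNumChar(char16_t c)   // PRUnichar in the tree
{
  if (c >= '0' && c <= '9')
    return true;

  switch (c) {
    case '-':
    case '+':
    case '.':
    case 'e':
    case 'E':
      return true;
    default:
      return false;
  }
}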
Comment 12 Robert Sayre 2008-02-05 13:33:26 PST
Created attachment 301572 [details] [diff] [review]
address crowder and jst comments
Comment 13 Robert Sayre 2008-02-05 13:35:13 PST
Created attachment 301573 [details] [diff] [review]
address crowder and jst comments

missed -u8pN
Comment 14 Robert Sayre 2008-02-05 13:43:16 PST
(In reply to comment #10)
> (From update of attachment 298589 [details] [diff] [review])
> +    mBufferCount = 0;
> +  }
> +
> +  memcpy(&mBuffer[mBufferCount], aBuffer, aLength * sizeof(PRUnichar));
> +  mBufferCount += aLength;
> 
> What's the purpose of initially setting mBufferCount to 0 here?

If we hit that case, appending aBuffer to mBuffer would be bigger than the buffer can handle, so we flush the contents of mBuffer into the output string. We can then reuse the buffer for more content, and the count starts over from zero.
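
Putting the quoted lines and this explanation together, a minimal standalone sketch of the buffering scheme (std::u16string and the class name stand in for the Mozilla types; only the mBuffer/mBufferCount handling mirrors the quoted snippet, the rest is assumed):

#include <cstring>
#include <string>

// Sketch only, not the actual nsJSONWriter: characters are batched into a
// fixed-size buffer and flushed to the growable output string only when the
// next write would overflow it, which is where mBufferCount resets to zero.
class BufferedWriter
{
public:
  void Write(const char16_t* aBuffer, std::size_t aLength)
  {
    if (mBufferCount + aLength > kBufferSize) {
      mOutput.append(mBuffer, mBufferCount);  // flush what we have buffered
      mBufferCount = 0;                       // start over from zero
    }
    if (aLength > kBufferSize) {
      mOutput.append(aBuffer, aLength);       // too large to buffer at all
      return;
    }
    std::memcpy(&mBuffer[mBufferCount], aBuffer,
                aLength * sizeof(char16_t));
    mBufferCount += aLength;
  }

  std::u16string Finish()
  {
    mOutput.append(mBuffer, mBufferCount);    // flush the tail
    mBufferCount = 0;
    return mOutput;
  }

private:
  static constexpr std::size_t kBufferSize = 2048;  // size here is a guess
  char16_t mBuffer[kBufferSize];
  std::size_t mBufferCount = 0;
  std::u16string mOutput;
};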
