Closed Bug 410890 Opened 17 years ago Closed 16 years ago

nsJSON speedups

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.9beta4

People

(Reporter: sayrer, Assigned: sayrer)

References

Details

(Keywords: perf)

Attachments

(3 files, 5 obsolete files)

Shark says a good amount of decoding time is spent in string manipulation (appending one char at a time in the loop) and creating about:blank for the channel, so we'll route around that.
This simple JS script can test the native JSON imp against the JSM version and the json.org (json2.js) version.

I tested like this from xpcshell:
js> version(170);
js> load("path/to/json-benchmark.js");
js> load("path/to/json2.js");
js> startEncode();
js> startDecode();

json2.js is in the tree or can be downloaded from json.org
Encode test case at 1000 iterations:

Shark Bottom-Up summary.

6.8%  szone_free  
5.3%  nsAString_internal::Replace(unsigned, unsigned, unsigned short const*, unsigned)  
4.2%  pthread_mutex_lock  
3.5%  nsAString_internal::ReplacePrep(unsigned, unsigned, unsigned) 
3.2%  pthread_mutex_unlock  
2.5%  js_Enumerate  
2.2%  JS_dtostr 
2.2%  szone_malloc  
2.1%  nsJSONWriter::WriteString(unsigned short const*, unsigned)  
2.0%  nsJSON::EncodeObject(JSContext*, long*, nsJSONWriter*, JSObject*, unsigned) 
2.0%  tiny_malloc_from_free_list  
1.8%  js_LookupPropertyWithFlags  
1.8%  szone_size  
1.6%  js_NewGCThing 

Three main issues here. 
  1.) String manipulation from nsJSONWriter::Write is expensive
  2.) js_AtomizeString calls to PR_Lock are expensive
  3.) JS_Enumerate allocations are expensive
Version: unspecified → Trunk
This patch pretty much removes string manipulation from the profile. We do
spend some extra time in nsJSON::Write, but it's a win overall.

Before

iterations  nsJSON  json.jsm  json2.js
20          5       22        38
50          9       52        95
100         17      103       196
200         34      198       379
500         88      496       955
750         130     994       1371
1000        166     958       2141
                    
After

iterations  nsJSON  json.jsm  json2.js
20          3       20        39
50          7       51        97
100         15      101       192
200         30      206       386
500         77      500       964
750         117     1047      1394
1000        145     972       2156
                              
Shark After:
     
6.7% szone_free  
3.8% pthread_mutex_lock  
3.7% pthread_mutex_unlock  
3.6% nsJSONWriter::Write()
2.8% js_Enumerate  
2.6% nsJSONWriter::WriteString()
2.5% JS_dtostr 
2.3% szone_malloc  
2.1% nsJSON::EncodeObject() 
2.1% js_NewGCThing 
2.0% szone_size  
2.0% js_LookupPropertyWithFlags  
1.8% tiny_malloc_from_free_list  
1.7% js_CallIteratorNext 
1.6% js_SearchScope
1.4% XUL SearchTable
Bottom-up view shows lots of allocator thrashing from string mutation.

Decode top-down view:

 96.2%  start 
 96.2%   _start 
 96.2%    main  
 88.1%     Process() 
 88.1%      ProcessFile()  
 88.0%       JS_ExecuteScript 
 88.0%        js_Execute  
 88.0%         js_Interpret 
 87.9%          js_Invoke 
 79.1%           Load()  
 78.9%            JS_ExecuteScript  
 78.9%             js_Execute 
 78.9%              js_Interpret  
 78.7%               js_Invoke  
 78.3%                XPC_WN_CallMethod()  
 78.1%                 XPCWrappedNative::CallMethod()  
 77.8%                  NS_InvokeByIndex_P  
 77.3%                   nsJSON::Decode()  
 77.2%                    nsJSON::DecodeInternal() 
 72.0%                     nsJSONListener::OnDataAvailable()  
 71.8%                      nsJSONListener::ProcessBytes() 
 71.4%                       nsJSONListener::Consume() 
 33.8%                        nsAString_internal::Replace() 
 19.8%                        nsJSONListener::HandleString()  
 8.6%                          nsJSONListener::PushValue() 
 4.0%                          nsAString_internal::Assign()  
 3.3%                          nsAString_internal::SetLength()  
 2.8%                          js_NewStringCopyN  
 0.4%                          nsStringBuffer::AddRef() 
This is another xpcshell benchmark script. It uses a 1600 item array to create a variable length chunk of JSON data. It will use 25, 50, 100, 200, 400, 800 & 1600 item chunks to benchmark how the parsers handle different data sizes (rather than iterating over a fixed size chunk like the other benchmark script).

load xpcshell and init the same as before:
js> version(170);
js> load("path/to/json2.js");
js> load("path/to/json-array.js");
js> startEncode();
...
js> startDecode();
Attached patch WIP, getting faster (obsolete) — Splinter Review
encode attachment 295803 [details]

rounds  nsJSON-patch  nsJSON  JSON.jsm  json2.js
    20             3       3        21        43
    50             7       9        52        95
   100            15      16       108       189
   200            29      34       214       380
   500            76      87       504       943
   750           113     128      1077      1353
  1000           143     161       977      2134

decode attachment 295803 [details]
rounds  nsJSON-patch  nsJSON  JSON.jsm  json2.js
    20             3       6        13        10
    50             6      14        31        25
   100            12      26        67        53
   200            25      52       121       103
   500            63     128       303       260
   750            94     192       460       390
  1000           124     255       843       514

encode attachment 296405 [details]
rounds  nsJSON-patch  nsJSON  JSON.jsm  json2.js
    24             0       1         2         6
    49             0       1         5        11
    99             1       2        12        25
   199             3       3        23        48
   399             5       6        48        89
   799            10      11        94       177
  1599            20      23       191       356

decode attachment 296405 [details]
rounds  nsJSON-patch  nsJSON  JSON.jsm  json2.js
    24             0       1         2         2
    49             1       2         2         3
    99             1       3         4         5
   199             3       5         8        10
   399             4       9        17        20
   799             8      18        36        39
  1599            18      41        65        79
Attachment #295872 - Attachment is obsolete: true
Attached patch WIP update (obsolete) — Splinter Review
Attachment #296770 - Attachment is obsolete: true
Attached patch nit fix (obsolete) — Splinter Review
Attachment #298586 - Attachment is obsolete: true
Attachment #298589 - Flags: review?(brendan)
Comment on attachment 298589 [details] [diff] [review]
nit fix

switching review to crowder, brendan has bigger bacon to fry
Attachment #298589 - Flags: superreview?(jst)
Attachment #298589 - Flags: review?(crowder)
Attachment #298589 - Flags: review?(brendan)
Comment on attachment 298589 [details] [diff] [review]
nit fix

+    mBufferCount = 0;
+  }
+
+  memcpy(&mBuffer[mBufferCount], aBuffer, aLength * sizeof(PRUnichar));
+  mBufferCount += aLength;

What's the purpose of initially setting mBufferCount to 0 here?

+#define PUSHCHAR(_c)                         \

Ever undefine this when you're done with it?  Is that a spidermonkey-only style thing?

Otherwise looks good to me, pending jst's happiness.
Attachment #298589 - Flags: review?(crowder) → review+
Comment on attachment 298589 [details] [diff] [review]
nit fix

+static const nsString kJSONNumChars = NS_LITERAL_STRING("-+0123456789eE.");

Last I heard, which was a long time ago now so this could've changed, we're not supposed to have statics of types that have constructors and/or destructors.

Either way, that could be a nsDependentString to save on a malloc, but I also wonder if what this is used for couldn't be written faster with a little helper here that checks if c < '9' && c > '0' and a switch statement for the rest?

sr=jst either way.
Attachment #298589 - Flags: superreview?(jst) → superreview+
Attached patch address crowder and jst comments (obsolete) — Splinter Review
Attachment #298589 - Attachment is obsolete: true
Attachment #301572 - Flags: superreview+
Attachment #301572 - Flags: review+
Attachment #301572 - Flags: approval1.9?
Attachment #301572 - Attachment is patch: true
Attachment #301572 - Attachment mime type: application/octet-stream → text/plain
missed -u8pN
Attachment #301572 - Attachment is obsolete: true
Attachment #301573 - Flags: superreview+
Attachment #301573 - Flags: review+
Attachment #301573 - Flags: approval1.9?
Attachment #301572 - Flags: approval1.9?
(In reply to comment #10)
> (From update of attachment 298589 [details] [diff] [review])
> +    mBufferCount = 0;
> +  }
> +
> +  memcpy(&mBuffer[mBufferCount], aBuffer, aLength * sizeof(PRUnichar));
> +  mBufferCount += aLength;
> 
> What's the purpose of initially setting mBufferCount to 0 here?

If we hit that case, the result of adding aBuffer to mBuffer would be bigger than we can handle, so we put the contents of mBuffer in a string. We can then reuse the buffer for more content, and we start over from zero.
Attachment #301573 - Flags: approval1.9? → approval1.9+
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
OS: Mac OS X → All
Hardware: PC → All
Target Milestone: --- → mozilla1.9beta4
Depends on: 633934
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: