Closed Bug 578216 Opened 14 years ago Closed 14 years ago

Make eval(json-like string) fast

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: dmandelin, Assigned: n.nethercote)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(2 files, 3 obsolete files)

string-tagcloud calls eval once on a JSON-like string. On AWFY, this benchmark runs in 90 ms, of which 18 ms (20%) is in that one call to eval. JSC does it in 3 ms. We should find out how and do that or something better. This will help interp/JM/TM.

JM wanted ss win: 15 ms
The string that eval is called on looks like this:

([
  {
    "tag": "titillation",
    "popularity": 4294967296
  },
  {
    "tag": "foamless",
    "popularity": 1257718401
  },
  ...,
  {
    "tag": "mesoventral",
    "popularity": 4096
  }])

There are 2,500 entries in the array, so there are 10,000 lines, and 146,674 chars to parse.  Judging from a combination of Shark and Cachegrind results, I'm pretty sure the cost of the parsing is most of it, and property addition is a small amount.
Here's the relevant part of the top-down Shark results:

14.3%     obj_eval(JSContext*, unsigned int, js::Value*)
11.5%      js::Compiler::compileScript(JSContext*, JSObject*, JSStackFrame*, JSPrincipals*, unsigned int, unsigned short const*, unsigned long, __sFILE*, char const*, unsigned int, JSString*, unsigned int)
8.5%        js::Parser::statement()
8.5%         js::Parser::expr()
8.5%          js::Parser::assignExpr()
8.5%           js::Parser::condExpr()
8.5%            js::Parser::andExpr()
8.5%             js::Parser::bitXorExpr()
8.5%              js::Parser::relExpr()
8.5%               js::Parser::addExpr()
8.5%                js::Parser::mulExpr()
8.5%                 js::Parser::unaryExpr()
8.5%                  js::Parser::memberExpr(int)
8.5%                   js::Parser::primaryExpr(js::TokenKind, int)
8.5%                    js::Parser::parenExpr(JSParseNode*, int*)
8.5%                     js::Parser::expr()
8.5%                      js::Parser::assignExpr()
8.5%                       js::Parser::condExpr()
8.5%                        js::Parser::andExpr()
8.5%                         js::Parser::bitXorExpr()
8.5%                          js::Parser::relExpr()
8.5%                           js::Parser::addExpr()
8.5%                            js::Parser::mulExpr()
8.5%                             js::Parser::unaryExpr()
8.5%                              js::Parser::memberExpr(int)
8.5%                               js::Parser::primaryExpr(js::TokenKind, int)
8.2%                                js::Parser::assignExpr()
8.1%                                 js::Parser::condExpr()
8.1%                                  js::Parser::andExpr()
8.0%                                   js::Parser::bitXorExpr()
8.0%                                    js::Parser::relExpr()
7.8%                                     js::Parser::addExpr()
7.7%                                      js::Parser::mulExpr()
7.6%                                       js::Parser::unaryExpr()
7.5%                                        js::Parser::memberExpr(int)
7.3%                                         js::Parser::primaryExpr(js::TokenKind, int)
3.7%                                          js::Parser::assignExpr()

Is this level of call nesting expected just for an expression like the one in comment 1?  Maybe JSC's parser is much faster than ours.

The bottom up profile indicates that 5.2% of total time is within getTokenInternal().  The soon-to-be-landed patch in bug 584595 speeds up scanning, but only slightly.
The nesting you see through *Expr methods corresponds to the expression grammar, and it's necessary for correct operator precedence. A recursive descent parser will do that. Alternatives for parsing most of JS's expression grammar include operator precedence parsing -- worth a look, but perhaps there is more tuning to do in the current parser.

What JSC parser is being benchmarked in comment 0, the Bison-generated one or Ollie's new recursive descent one? I wonder if there's a special JSON parser that kicks in...

/be
Additional thoughts about optimizing scanning and parsing:

- During scanning most chars are copied twice.  I've opened bug 588648 about
  this.

- getTokenInternal() features highly but it's hard to optimize further.
  There are just lots of cases.

- w.r.t. the deep nesting in comment 2, there is a sequence of six functions
  that look like this:

    JSParseNode *
    Parser::bitOrExpr()
    {
        JSParseNode *pn = bitXorExpr();
        while (pn && tokenStream.matchToken(TOK_BITOR))
            pn = JSParseNode::newBinaryOrAppend(TOK_BITOR, JSOP_BITOR, pn,
                                                bitXorExpr(), tc);
        return pn;
    }

  If bitOrExpr() only called bitXorExpr() once, then bitXorExpr() could be
  inlined beautifully because this is the only place that calls it.  And
  likewise for all the functions in the chain.  I haven't worked out a
  reasonable way of re-expressing this function to achieve one call to
  bitXorExpr(), though;  a gold star to anyone who does.
(In reply to comment #4)
I haven't worked out a
>   reasonable way of re-expressing this function to achieve one call to
>   bitXorExpr(), though;  a gold star to anyone who does.

Something like this could work:

JSParseNode *pn = NULL;
do {
  JSParseNode *thisPn = bitXorExpr();
  if(pn)
    pn = JSParseNode::newBinaryOrAppend(TOK_BITOR, JSOP_BITOR, pn, thisPn, tc);
  else
    pn = thisPn;
} while(pn && tokenStream.matchToken(TOK_BITOR));

return pn;
(In reply to comment #4)
>   If bitOrExpr() only called bitXorExpr() once, then bitXorExpr() could be
>   inlined beautifully because this is the only place that calls it.

If it's a static function and there's only a single invocation site, any compiler worth its salt will inline it for you -- there's no benefit to keeping it out-of-line in that case.
(In reply to comment #5)
> 
> Something like this could work:

I tried doing that and forcing inlining.  The instruction count went down, but not by enough for the obfuscation to be worth it.  Thanks for the suggestion, though!
Attached patch Use JSON parser in eval (obsolete) — Splinter Review
This patch attempts to run the JSON-parser if the string is enclosed in parentheses. This is a 30% speedup for string-tagcloud on my pc -- about 13 ms. on awfy.
Comment on attachment 468020 [details] [diff] [review]
Use JSON parser in eval

Isn't this going to report spurious errors? It's also likely to ding perf when the expression is not JSON, although I don't know whether the benchmarketeers test that.

/be
(In reply to comment #9)
> Isn't this going to report spurious errors? 
Yes, that's a good point. We could add a suppressMessages parameter, or return the error and let the caller decide. Which one do you prefer?

> It's also likely to ding perf when
> the expression is not JSON, although I don't know whether the benchmarketeers
> test that.
This patch only runs the JSON-parser when the string is enclosed in parentheses, and at least on SS I saw no regressions. 

JSC does something similar (callEval in Interpreter.cpp always tries their LiteralParser).
(In reply to comment #8)
> Created attachment 468020 [details] [diff] [review]
> Use JSON parser in eval
> 
> This patch attempts to run the JSON-parser if the string is enclosed in
> parentheses. This is a 30% speedup for string-tagcloud on my pc -- about 13 ms.
> on awfy.

That's juicy. The JSON parser could be way faster too. Time to fix that, I guess.
We could (just like JSC) always try the JSON-parser. For most non-JSON strings the JSON-parser will complain after a few chars.

Anyway, I won't be working on this for now, until we know this is worthwhile and what further changes are needed here.
Jan: thanks for contributing this -- should be easy to carry forward, looks like a win, and the competition (tuning for their favorite benchmark) does it. It's even plausible that an expression passed to eval that starts with '(' and ends with ')' is highly likely to be JSON -- another thing we could study (and may, with MSR and Purdue colleagues). I'm willing to take it on faith for now ;-).

/be
(In reply to comment #13)

> It's even plausible that an expression passed to eval that starts with '(' and
> ends with ')' is highly likely to be JSON

Why is that?  Parentheses aren't valid JSON, right?
(In reply to comment #14)
> (In reply to comment #13)
> 
> > It's even plausible that an expression passed to eval that starts with '(' and
> > ends with ')' is highly likely to be JSON
> 
> Why is that?  Parentheses aren't valid JSON, right?

No, but they are required for eval of any object initialiser or JSON object, since otherwisei { begins a block statement.

IOW I'm supposing, along with Jan and presumably JSC folks (unless they were *only* tuning for SS :-P) that eval of an expression was used as a JSON decoder for want of the ES5-standard one, only now making its appearance.

/be
Blocks: 589664
(In reply to comment #10)
> > Isn't this going to report spurious errors? 
> Yes, that's a good point. We could add a suppressMessages parameter, or return
> the error and let the caller decide. Which one do you prefer?

I tested the patch and got errors in trace-tests.  Suppressing the messages is the only workable solution, AFAICT, because the errors are printed from within the JSON parser so by the time you examine the return value it's too late.

Having got that far, I couldn't help myself but implement the suppression :)  Patch is attached.

Instruction counts dropped from 185.2M to 165.7M under TM, that's a 1.18x improvement.  I see a 1.12x speed-up on my MacBook Pro, from 59.4ms to 53.1ms.

Well done to dmandelin for spotting this one!


(In reply to comment #11)
> 
> The JSON parser could be way faster too. Time to fix that, I guess.

I filed bug 589664 for this.
Attachment #468020 - Attachment is obsolete: true
Attachment #468201 - Flags: review?(sayrer)
Comment on attachment 468201 [details] [diff] [review]
patch that suppresses errors (against TM 51034:14a90a53ceeb)

>+        JSONParser *jp = js_BeginJSONParse(cx, vp, /* suppressErrors */true);

I like the style you used elsewhere, with " =" after the parameter name in the comment:

jsbuiltins.cpp:    return AddPropertyHelper(cx, obj, shape, /* isDefinitelyAtom = */false);
jsbuiltins.cpp:    return AddPropertyHelper(cx, obj, shape, /* isDefinitelyAtom = */true);
jstracer.cpp:        x = d2i(f, /* resultCanBeImpreciseIfFractional = */true);

>+        JSBool ok = jp != NULL;
>+        if (ok) {
>+            /* Run JSON-parser on string inside ( and ). */
>+            ok = js_ConsumeJSONText(cx, jp, chars+1, length-2);
>+            ok &= !!js_FinishJSONParse(cx, jp, NullValue());
>+            if (ok)
>+                return JS_TRUE;
>+        }

Looks like ok should be bool and you wouldn't need the !!.

We need to de-JSBool json.cpp and other newer files some time soon.

/be
Attached patch patch with Brendan's nits picked (obsolete) — Splinter Review
Attachment #468201 - Attachment is obsolete: true
Attachment #468229 - Flags: review?(sayrer)
Attachment #468201 - Flags: review?(sayrer)
(In reply to comment #17)
> 
> We need to de-JSBool json.cpp and other newer files some time soon.

Bug 589664 would be a good opportunity for json.cpp;  I added a comment to that effect.
Attachment #468229 - Flags: review?(sayrer) → review+
http://hg.mozilla.org/tracemonkey/rev/74ce80e8782e

Jan, thanks for the original patch!
Whiteboard: fixed-in-tracemonkey
Blocks: 590036
Backed out due to jsreftest browser failures on Linux:

http://hg.mozilla.org/tracemonkey/rev/71acb4ab388c
I was able to reproduce test failures on a Linux64 browser build.  For posterity:

REFTEST TEST-UNEXPECTED-FAIL | file:///home/njn/moz/ws9/js/src/tests/jsreftest.html?test=ecma_3/extensions/10.1.3-2.js | Unknown file:///home/njn/moz/ws9/js/src/tests/ecma_3/extensions/10.1.3-2.js:123: JSON.parse item 1
REFTEST TEST-UNEXPECTED-FAIL | file:///home/njn/moz/ws9/js/src/tests/jsreftest.html?test=ecma_3/extensions/regress-430740.js | Unknown file:///home/njn/moz/ws9/js/src/tests/ecma_3/extensions/regress-430740.js:61: JSON.parse item 2
REFTEST TEST-UNEXPECTED-FAIL | file:///home/njn/moz/ws9/js/src/tests/jsreftest.html?test=ecma_5/Expressions/11.1.5-01.js | Unknown file:///home/njn/moz/ws9/js/src/tests/ecma_5/Expressions/11.1.5-01.js:32: JSON.parse item 2
REFTEST TEST-UNEXPECTED-FAIL | file:///home/njn/moz/ws9/js/src/tests/jsreftest.html?test=js1_5/extensions/regress-231518.js | Unknown file:///home/njn/moz/ws9/js/src/tests/js1_5/extensions/regress-231518.js:62: JSON.parse item 2
REFTEST TEST-UNEXPECTED-FAIL | file:///home/njn/moz/ws9/js/src/tests/jsreftest.html?test=js1_5/extensions/regress-254375.js | Unknown file:///home/njn/moz/ws9/js/src/tests/js1_5/extensions/regress-254375.js:53: JSON.parse item 2
REFTEST TEST-UNEXPECTED-FAIL | file:///home/njn/moz/ws9/js/src/tests/jsreftest.html?test=js1_5/extensions/regress-96284-001.js | Unknown file:///home/njn/moz/ws9/js/src/tests/js1_5/extensions/regress-96284-001.js:127: JSON.parse item 1
REFTEST TEST-UNEXPECTED-FAIL | file:///home/njn/moz/ws9/js/src/tests/jsreftest.html?test=js1_5/extensions/regress-96284-002.js | Unknown file:///home/njn/moz/ws9/js/src/tests/js1_5/extensions/regress-96284-002.js:127: JSON.parse item 1
REFTEST TEST-UNEXPECTED-FAIL | file:///home/njn/moz/ws9/js/src/tests/jsreftest.html?test=js1_7/extensions/regress-379566.js | Unknown file:///home/njn/moz/ws9/js/src/tests/js1_7/extensions/regress-379566.js:62: JSON.parse item 1
TEST-UNEXPECTED-FAIL | file:///home/njn/moz/ws9/js/src/tests/jsreftest.html?test=js1_8/extensions/regress-385729.js | Exited with code 1 during test run
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | missing output line for total leaks!


Turns out the problem is that the 'jp' struct was destroyed in js_FinishJSONParse() and then it the 'suppressErrors' field was inspected subsequently.  I should have run under Valgrind earlier :(  The failure didn't show up in shell reftests because those don't run under Valgrind, and only a handful of trace-tests do (none which used eval in this way).  I guess it failed in the browser due to differences in the malloc/free implementation;  that also explains (with some hand-waving) why it failed on Linux but not on Mac -- it's probably a jemalloc vs. malloc thing.

Patch coming shortly.
Assignee: general → nnethercote
Status: NEW → ASSIGNED
Attachment #468229 - Attachment is obsolete: true
Attachment #468641 - Flags: review?(sayrer)
Attachment #468692 - Attachment is patch: true
Attachment #468692 - Attachment mime type: application/octet-stream → text/plain
Comment on attachment 468641 [details] [diff] [review]
patch, v3 (against TM 51193:fa851447f195)

This is fine. I realized that the parameters to JSONParseError should be ordered the other way, because the function cannot succeed. It's also not necessary to assign to |ok| in the js_FinishJSONParse. That switch is not house style, either so I changed that.

These are all nits, so I made those changes and I've attached them.
Attachment #468641 - Flags: review?(sayrer) → review+
There is a minor typo in one of the comments in http://hg.mozilla.org/tracemonkey/rev/5685f8de41fa:

    2.40 +js_BeginJSONParse(JSContext *cx, Value *rootVal, bool suppressErrors /*= true*/)
...
     3.8 +js_BeginJSONParse(JSContext *cx, js::Value *rootVal, bool suppressErrors = false);

------

/*= true*/ should be /*= false*/ to match the declaration.
(In reply to comment #27)
> 
> /*= true*/ should be /*= false*/ to match the declaration.

Good catch, thanks!  Fixed in:

http://hg.mozilla.org/tracemonkey/rev/25278e8a90c5
(In reply to comment #25)
> Comment on attachment 468641 [details] [diff] [review]
> patch, v3 (against TM 51193:fa851447f195)
> 
> This is fine. I realized that the parameters to JSONParseError should be
> ordered the other way, because the function cannot succeed.

It can't succeed but it may not fail with an error-as-exception, right?

It's a maybe-fallible, so cx as first param is plausible, but either way is ok.

/be
http://hg.mozilla.org/mozilla-central/rev/5685f8de41fa
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Depends on: 657367
Depends on: 657713
Depends on: 1337564
You need to log in before you can comment on or make changes to this bug.