Closed Bug 891209 Opened 7 years ago Closed 7 years ago

Parse "detectably simple" expressions quickly.

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla25

People

(Reporter: njn, Assigned: njn)

Details

Attachments

(1 file)

The full expression-parsing code is pretty heavyweight.  Turns out a lot of the
time we can quickly check early on if it's a "detectably simple" expression,
and fast-path that.
The patch is simple.  I removed the tokenStream.hadError() check, which you
added in bug 846406, because it didn't appear necessary (the tests agree).

The performance results are a bit inconsistent, but overall good enough that I
think this is worth landing.  All following results are 64-bit clang builds on
Linux.

asm.js
------
On an old version of the Unreal Citadel demo, parse time drops from ~2.95s to
~2.65s, which is 1.11x faster.

Kraken
------
Instructions:
---------------------------------------------------------------
| millions of instructions executed                           |
| total                        | compiled (may overestimate)  |
---------------------------------------------------------------
|  1086.804  1079.038 (1.007x) |   826.337   826.337 (------) | ai-astar
|   633.841   608.047 (1.042x) |   473.744   457.799 (1.035x) | audio-beat-det
|   655.546   652.876 (1.004x) |   340.174   340.174 (------) | audio-dft
|   570.226   567.551 (1.005x) |   436.925   436.925 (------) | audio-fft
|   691.652   688.909 (1.004x) |   514.330   514.330 (------) | audio-oscillat
|  1977.526  1831.983 (1.079x) |  1227.158  1227.158 (------) | imaging-gaussi
|  1409.038  1263.428 (1.115x) |   314.280   314.280 (------) | imaging-darkro
|  1525.926  1380.387 (1.105x) |   776.011   776.011 (------) | imaging-desatu
|   330.757   330.065 (1.002x) |     1.427     1.427 (------) | json-parse-fin
|   413.053   409.847 (1.008x) |     1.250     1.250 (------) | json-stringify
|  1542.971  1529.900 (1.009x) |   427.488   427.488 (------) | stanford-crypt
|   610.179   606.367 (1.006x) |   234.119   234.119 (------) | stanford-crypt
|   694.905   694.307 (1.001x) |   371.791   371.791 (------) | stanford-crypt
|   354.363   353.949 (1.001x) |   141.848   141.848 (------) | stanford-crypt
-------
| 12496.791 11996.659 (1.042x) |  6086.889  6070.944 (1.003x) | all

Big wins on the imaging tests, which define huge arrays-of-ints.
(BTW, audio-beat-detection appears to be non-deterministic, so don't believe
its numbers.)

Time: negligible difference.  Not sure why.

Parsemark
---------
Instructions
---------------------------------------------------------------
| millions of instructions executed                           |
| total                        | compiled (may overestimate)  |
---------------------------------------------------------------
|    52.511    51.531 (1.019x) |    983115    983115 (------) | 280slides-comb
|    77.036    75.458 (1.021x) |    981334    981334 (------) | ball-pool-comb
|    73.350    71.839 (1.021x) |    995185    995185 (------) | dojo
|    76.185    74.123 (1.028x) |    980998    980998 (------) | effectgames-ga
|    44.892    43.814 (1.025x) |    978495    978495 (------) | ga
|   254.378   240.424 (1.058x) |     1.024     1.024 (------) | gmail-combined
|    97.074    93.897 (1.034x) |    982975    982975 (------) | gravity-combin
|    49.310    48.146 (1.024x) |    978680    978680 (------) | jquery-min
|    69.498    67.014 (1.037x) |     1.000     1.000 (------) | jsgb-combined
|    77.264    75.353 (1.025x) |    981855    981855 (------) | mochikit
|   308.418   299.455 (1.030x) |     1.067     1.067 (------) | pipio-combined
|    59.940    58.528 (1.024x) |    983586    983586 (------) | processing-twi
|    85.976    84.374 (1.019x) |    981051    981051 (------) | sunspider-comb
|    47.601    46.390 (1.026x) |    978741    978741 (------) | tetris-combine
|    82.165    79.728 (1.031x) |    982220    982220 (------) | twitter-combin
|   100.282    98.021 (1.023x) |    994296    994296 (------) | v8-combined
|    42.290    41.415 (1.021x) |    978258    978258 (------) | yui-min
|   577.244   565.434 (1.021x) |     1.205     1.205 (------) | zimbra-combine
-------
|  2175.422  2114.954 (1.029x) |    18.058    18.058 (------) | all

Time: 1.029x faster.

Octane
------
(I'm excluding gbemu which causes problems I haven't yet debugged.)

Instructions:
---------------------------------------------------------------
| millions of instructions executed                           |
| total                        | compiled (may overestimate)  |
---------------------------------------------------------------
|   109.009   107.478 (1.014x) |     1.072     1.072 (------) | oc-box2d
|    50.458    49.721 (1.015x) |    978563    978563 (------) | oc-code-load
|  1020.589   966.159 (1.056x) |     1.641     1.641 (------) | oc-mandreel
|   568.758   514.443 (1.106x) |     1.179     1.179 (------) | oc-pdfjs
|    53.925    52.897 (1.019x) |     1.238     1.238 (------) | v8-crypto
|    44.684    43.904 (1.018x) |    979155    979155 (------) | v8-deltablue
|    80.193    78.724 (1.019x) |     1.345     1.345 (------) | v8-earley-boye
|    42.570    41.776 (1.019x) |    978544    978544 (------) | v8-navier-stok
|    45.693    44.878 (1.018x) |    980620    980620 (------) | v8-raytrace
|    55.211    53.826 (1.026x) |    979585    979585 (------) | v8-regexp
|    42.889    42.118 (1.018x) |    978764    978764 (------) | v8-richards
|    41.952    41.207 (1.018x) |    978567    978567 (------) | v8-splay
-------
|  2155.938  2037.138 (1.058x) |    13.331    13.331 (------) | all

Time:
===============================================================================
** TOTAL **:       1.110x as fast    184.3ms +/- 0.3%    166.0ms +/- 0.2%
===============================================================================
  oc:              1.118x as fast    170.9ms +/- 0.3%    152.8ms +/- 0.2%
    box2d:         1.016x as fast      9.6ms +/- 0.6%      9.5ms +/- 0.4%
    code-load:     ??                  1.0ms +/- 1.1%      1.0ms +/- 2.0%
    mandreel:      1.123x as fast     96.3ms +/- 0.3%     85.7ms +/- 0.2%
    pdfjs:         1.130x as fast     64.1ms +/- 0.5%     56.7ms +/- 0.3%

  v8:              1.015x as fast     13.4ms +/- 0.5%     13.2ms +/- 0.8%
    crypto:        1.021x as fast      2.5ms +/- 1.3%      2.4ms +/- 0.7%
    deltablue:     ??                  0.9ms +/- 0.5%      0.9ms +/- 1.8%
    earley-boyer:  -                   5.4ms +/- 0.8%      5.3ms +/- 1.9%
    navier-stokes: -                   0.6ms +/- 0.7%      0.6ms +/- 1.1%
    raytrace:      -                   1.2ms +/- 1.5%      1.2ms +/- 2.5%
    regexp:        1.058x as fast      1.6ms +/- 0.7%      1.5ms +/- 1.2%
    richards:      *1.026x as slow*    0.7ms +/- 0.6%      0.7ms +/- 2.1%
    splay:         -                   0.5ms +/- 2.7%      0.5ms +/- 1.5%

Score: maybe increased from ~14,700 to ~14,870;  it's so noisy it's hard to
tell.

The disparity between the time improvement and the score improvement is
interesting.  When measuring straight times the longer-running benchmarks like
mandreel and pdfjs are grossly overweighted, but also I think the Octane
harness loads each test just once and then runs them multiple times, and so
will underplay the benefit of any parsing wins (except maybe for code-load).
Chalk that up as reason #63 to dislike the V8/Octane scoring system.
Attachment #772438 - Flags: review?(jorendorff)
Comment on attachment 772438 [details] [diff] [review]
Parse "detectably simple" expressions quickly.

Review of attachment 772438 [details] [diff] [review]:
-----------------------------------------------------------------

Nice and simple.

::: js/src/frontend/Parser.cpp
@@ -5151,3 @@
>          return returnStatementOrYieldExpression();
> -    if (tokenStream.hadError())
> -        return null();

You're right, this isn't necessary; ungetting and re-getting TOK_ERROR works fine.

::: js/src/frontend/TokenStream.h
@@ +639,5 @@
>      }
>  
> +    bool nextTokenEndsExpr() {
> +        return exprEndingTokens[peekToken()];
> +    }

peekToken() can return TOK_ERROR, which is -1.

(Maybe you could just change it to 0 though.)

@@ +923,5 @@
>      jschar              *sourceMap;     /* source map's filename or null */
>      void                *listenerTSData;/* listener data for this TokenStream */
>      CharBuffer          tokenbuf;       /* current token string buffer */
>      int8_t              oneCharTokens[128];  /* table of one-char tokens */
> +    uint8_t             exprEndingTokens[128]; /* table of tokens that terminate exprs */

TOK_LIMIT rather than 128.

Note that the rest of these arrays are indexed by char values, not TokenType, so it's a bit misleading to copy oneCharTokens so closely.
Attachment #772438 - Flags: review?(jorendorff) → review+
> peekToken() can return TOK_ERROR, which is -1.
> 
> (Maybe you could just change it to 0 though.)
> 
> TOK_LIMIT rather than 128.

Yikes.  Thanks for catching those.

I'll change TOK_ERROR to 0, and do that in the first patch, and then do the actual change in a second patch, just to be careful.
https://hg.mozilla.org/mozilla-central/rev/4e2943f5b63d
https://hg.mozilla.org/mozilla-central/rev/f61d03dffc57
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
You need to log in before you can comment on or make changes to this bug.