Switch JavaScript parser to use a "pump-primed" token stream for increased perf

Status

RESOLVED WONTFIX

Product: Core
Component: JavaScript Engine
Type: enhancement
Opened: 8 years ago
Last modified: 7 years ago

People

(Reporter: cdleary, Assigned: cdleary)

Tracking

Firefox Tracking Flags

(Not tracked)

Attachments

(3 attachments)

The parser currently employs a push-back token stream using up to 2 tokens of lookahead.

A typical idiom is to:

- Pop a token off of the stream
- Try to dispatch to a substitution at the new position in the token stream by looking at the popped token's kind
- Failing that, push it back into the token stream and try something else.

This is analogous to a lookahead-based dispatch.
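The idiom can be sketched against a toy stream. The types and names below (`TokenStream`, `TokenKind`, the `TOK_*` kinds) are illustrative stand-ins, not SpiderMonkey's actual declarations:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy token kinds; illustrative only, not SpiderMonkey's real TokenKind.
enum TokenKind { TOK_FUNCTION, TOK_NAME, TOK_NUMBER, TOK_EOF };

// Minimal push-back token stream with one slot of lookahead.
class TokenStream {
    std::vector<TokenKind> tokens_;
    size_t pos_ = 0;
    bool haveLookahead_ = false;
    TokenKind lookahead_ = TOK_EOF;
    TokenKind current_ = TOK_EOF;
  public:
    explicit TokenStream(std::vector<TokenKind> toks) : tokens_(std::move(toks)) {}

    TokenKind getToken() {
        if (haveLookahead_) {          // use the pushed-back token if present
            haveLookahead_ = false;
            current_ = lookahead_;
        } else {                        // otherwise "lex" one on the fly
            current_ = pos_ < tokens_.size() ? tokens_[pos_++] : TOK_EOF;
        }
        return current_;
    }
    void ungetToken() {                 // push the current token back
        haveLookahead_ = true;
        lookahead_ = current_;
    }
};

// Statement-style dispatch: pop a token, branch on its kind, and push it
// back if no substitution matched so Expression can re-read it.
enum Production { PROD_FUNCTION_STMT, PROD_EXPRESSION };

Production statement(TokenStream& ts) {
    switch (ts.getToken()) {
      case TOK_FUNCTION:
        return PROD_FUNCTION_STMT;   // stream is now positioned *after* "function"
      default:
        ts.ungetToken();             // dispatch failed; restore the position
        return PROD_EXPRESSION;      // Expression re-pops the same token
    }
}
```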


Example
-------

The Statement production pops a token off the stream, then analyzes its kind in an attempt to dispatch immediately to a number of substitutions, like the FunctionStatement production.

Note that the FunctionStatement production, like other nonterminals in this dispatch idiom, expects the token stream's next available token to be *after* the "function" terminal in the stream, which historically justified this pop-the-token-before-you-really-know-you-can-use-it approach.

When lookahead-based dispatch doesn't apply, the token is pushed back into the stream, and the Statement production delegates to the Expression production at the original position in the token stream.


Existing js::TokenStream API
----------------------------

The current API consists of the following methods:

- currentToken(): the token last "gotten" from the token stream.
- getToken(): makes the next token in the stream the current token. Note that this uses a lookahead token if it is available; otherwise, a token is lexed from the token stream on the fly.
- ungetToken(): pushes the current token back into the token stream as a (already-lexed) lookahead token. As a result, the current token is the previous current token.
- peekToken(): gets a token, notes the value, then ungets that token.
- matchToken(targetKind) -> bool: gets a token from the stream if its kind matches the target kind; implemented as a peekToken(), a branch, and potentially a getToken().
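A sketch of how these operations compose, using toy types rather than the real js::TokenStream. Note how peekToken() and matchToken() both bottom out in getToken() and its lookahead branch:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

enum TokenKind { TOK_SEMI, TOK_NAME, TOK_EOF };

class TokenStream {
    std::vector<TokenKind> tokens_;
    size_t pos_ = 0;
    bool haveLookahead_ = false;
    TokenKind lookahead_ = TOK_EOF;
    TokenKind current_ = TOK_EOF;
  public:
    explicit TokenStream(std::vector<TokenKind> t) : tokens_(std::move(t)) {}

    TokenKind currentToken() const { return current_; }

    TokenKind getToken() {
        if (haveLookahead_) {            // the branch discussed below
            haveLookahead_ = false;
            current_ = lookahead_;
        } else {
            current_ = pos_ < tokens_.size() ? tokens_[pos_++] : TOK_EOF;
        }
        return current_;
    }
    void ungetToken() { haveLookahead_ = true; lookahead_ = current_; }

    // peekToken: get a token, note its value, then unget it.
    TokenKind peekToken() {
        TokenKind k = getToken();
        ungetToken();
        return k;
    }
    // matchToken: consume the next token only if it has the target kind.
    bool matchToken(TokenKind target) {
        if (peekToken() != target)
            return false;
        getToken();
        return true;
    }
};
```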

Because every method in the API ultimately calls into getToken(), there is an unpredictable branch that is executed at every meaningful point in the parser in the form of:

    if (lookahead) {
        lookahead--;
        return alreadyLexed[0];
    } else {
        return goAheadAndLex();
    }

Note that this branch is most likely inlined at all getToken() invocation sites in the current binaries.


Proposed enhancement: "pump priming"
------------------------------------

The term "pump-priming" indicates that the parser should always have a token pre-lexed when getToken() is called. The idea is to eliminate the ``if (lookahead)`` branch, or at least make it more predictable, through the specification of lookahead pre- and post-conditions for each production.

As an example of strengthening preconditions: imagine a production rule that currently permits either 0, 1, or 2 tokens of lookahead on its inbound edge. The 0 lookahead callers may be able to make non-branching pump-priming additions so that we can make a stronger precondition claim in the callee that there are always 1 or 2 tokens of lookahead.

Ideally, the new API would be reduced to the following operations:

- currentToken(): same as above.
- getToken(): as a precondition, at least one token is already lexed; therefore, this call is reduced to a simple circular-buffer cursor increment.
- peekToken(): as a precondition, at least one token is already lexed; return the value of that pre-lexed token.
- matchToken(targetKind) -> bool: same as above.
- pumpToken(): lex a token from the character stream.
- pumpTokenSlowly(): if there is no lookahead, then lex a token from the character stream. The branch makes this less preferable than using pumpToken() where possible.
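A minimal sketch of this discipline, assuming a toy `PumpedStream` (hypothetical names, not a proposed SpiderMonkey API): the pump operations fill a small circular buffer ahead of a cursor, so getToken() becomes a plain cursor increment guarded only by an asserted precondition rather than a runtime branch:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

enum TokenKind { TOK_NAME, TOK_SEMI, TOK_EOF };

// Sketch of the pump-primed discipline: pumpToken() lexes ahead into a
// circular buffer, so getToken()/peekToken() are branch-free.
class PumpedStream {
    static const size_t N = 4;            // circular token buffer
    TokenKind buf_[N];
    size_t cursor_ = 0;                   // next token to hand out
    size_t fill_ = 0;                     // next free slot
    std::vector<TokenKind> src_;          // stands in for the character stream
    size_t srcPos_ = 0;
  public:
    explicit PumpedStream(std::vector<TokenKind> src) : src_(std::move(src)) {}

    size_t lookahead() const { return fill_ - cursor_; }

    // pumpToken: unconditionally lex one token into the buffer.
    void pumpToken() {
        buf_[fill_++ % N] = srcPos_ < src_.size() ? src_[srcPos_++] : TOK_EOF;
    }
    // pumpTokenSlowly: lex only if no lookahead is available (has the branch).
    void pumpTokenSlowly() {
        if (lookahead() == 0)
            pumpToken();
    }
    // Precondition: lookahead() >= 1. A plain cursor increment; no
    // "have I lexed already?" branch.
    TokenKind getToken() {
        assert(lookahead() >= 1);
        return buf_[cursor_++ % N];
    }
    // Precondition: lookahead() >= 1.
    TokenKind peekToken() const {
        assert(lookahead() >= 1);
        return buf_[cursor_ % N];
    }
};
```

The cost of the eliminated branch does not vanish; it moves to the callers, who must uphold the "at least one token is pre-lexed" precondition, ideally with unconditional pumpToken() calls at points where the lookahead count is statically known.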


Complications
-------------

In some scenarios we require two tokens of lookahead. For example, ``eval("getter")`` and ``function::`` end up creating an aberrant *two* tokens of lookahead within the Statement production before deferring to the Expression production. As expected in a top-down parser, this aberration at the top production rule ripples down into all substitutions, making it more difficult to pin down an exact number of lookahead tokens as a precondition.

I'm looking into whether these strange cases can be avoided by other methods, such as breaking the token stream abstraction and looking directly at the character stream.

Before I finish the patch, I'm going to gather some data on the predictability of the above branches.
Assignee: general → cdleary
Created attachment 437521 [details] [diff] [review]
WIP folded patch with no perf improvement.

Sadly, not showing any perf improvement on my desktop (Intel Allendale @ 2GHz). I foolishly went on implementing past the reftests going green, which accounts for some of the delay in getting a patch out. This patch passes the ecma jstests with the exception of ecma_5/strict. Note that the patch doesn't apply to trunk, but changeset bec74236a5c8, which I user-repo branched.

Before:

{
  "gravity_combined.js": {"average_ms":  95.70, "stddev_ms":   1.86},
   "tetris_combined.js": {"average_ms":  15.16, "stddev_ms":   0.76},
                "ga.js": {"average_ms":  12.20, "stddev_ms":   1.25},
     "jsgb_combined.js": {"average_ms":  36.74, "stddev_ms":   1.76},
        "jquery.min.js": {"average_ms":  18.70, "stddev_ms":   0.61},
   "zimbra_combined.js": {"average_ms": 464.96, "stddev_ms":  12.41},
  "twitter_combined.js": {"average_ms":  71.90, "stddev_ms":   2.15},
"280slides_combined.js": {"average_ms":  12.08, "stddev_ms":   0.66},
       "v8_combined.js": {"average_ms":  74.00, "stddev_ms":   1.28},
              "dojo.js": {"average_ms":  26.52, "stddev_ms":   0.64},
 "processing_twitch.js": {"average_ms":   8.36, "stddev_ms":   0.77},
    "gmail_combined.js": {"average_ms": 362.74, "stddev_ms":  11.03},
"effectgames_galaxy.js": {"average_ms":  56.38, "stddev_ms":   1.81},
"ball_pool_combined.js": {"average_ms":  54.10, "stddev_ms":   0.92},
           "yui-min.js": {"average_ms":   6.80, "stddev_ms":   2.72},
          "MochiKit.js": {"average_ms":  56.32, "stddev_ms":   2.15},
    "pipio_combined.js": {"average_ms": 316.48, "stddev_ms":  14.29},
"sunspider_combined.js": {"average_ms":  39.94, "stddev_ms":   3.51}
}

After:

{
  "gravity_combined.js": {"average_ms":  98.50, "stddev_ms":   2.11},
   "tetris_combined.js": {"average_ms":  16.54, "stddev_ms":   2.25},
                "ga.js": {"average_ms":  11.18, "stddev_ms":   0.62},
     "jsgb_combined.js": {"average_ms":  36.10, "stddev_ms":   0.50},
        "jquery.min.js": {"average_ms":  19.44, "stddev_ms":   0.64},
   "zimbra_combined.js": {"average_ms": 482.38, "stddev_ms":  23.87},
  "twitter_combined.js": {"average_ms":  72.02, "stddev_ms":   0.76},
"280slides_combined.js": {"average_ms":  11.96, "stddev_ms":   0.40},
       "v8_combined.js": {"average_ms":  74.04, "stddev_ms":   0.69},
              "dojo.js": {"average_ms":  27.32, "stddev_ms":   0.51},
 "processing_twitch.js": {"average_ms":   8.26, "stddev_ms":   0.44},
    "gmail_combined.js": {"average_ms": 372.00, "stddev_ms":  11.45},
"effectgames_galaxy.js": {"average_ms":  56.98, "stddev_ms":   1.27},
"ball_pool_combined.js": {"average_ms":  57.80, "stddev_ms":   3.94},
           "yui-min.js": {"average_ms":   5.54, "stddev_ms":   0.54},
          "MochiKit.js": {"average_ms":  55.56, "stddev_ms":   0.80},
    "pipio_combined.js": {"average_ms": 322.38, "stddev_ms":  13.86},
"sunspider_combined.js": {"average_ms":  38.64, "stddev_ms":   1.29}
}

Parsemark says there's nothing statistically significant about it. Will give it a run through valgrind tomorrow to see what's up.
Comparing kcachegrind results side by side does show the nonterminal methods sink in the branch miss profile, but it seems like it's just not enough to matter. There are a few reasons that I'm thinking we should forgo this change:

- Programming model: It's easier to |stream.getToken()| and have that method figure out what to do than keep track of the pump state. In order to avoid making the "should I lex now?" decision with a branch, "have I lexed already?" pre- and post-conditions are required between a nonterminal and its substitution nonterminals. As a result, future changes have an unfortunate propensity to ripple throughout the system.
- Weird states: I haven't accounted for the two-tokens-of-lookahead parse paths yet, which may require a whole new set of nonterminals -- a set that understands how to cope with two tokens of lookahead -- in order to avoid the same sorts of should-I-lex-now branches in nonterminals that currently expect a single token of lookahead. Because the nonterminals mainly "head call" into each other, nearly all of them have to be updated or replicated to handle the two-tokens-of-lookahead case.
- No perceived perf benefit ATM. The branches that we eliminated must have been predictable enough.
Created attachment 437685 [details]
OProfile before patch
Created attachment 437686 [details]
OProfile after patch
Any objections to closing this as WNF?
Status: NEW → ASSIGNED
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → WONTFIX
> Because every method in the API ultimately calls into getToken(), there is a
> unpredictable branch that is executed at every meaningful point in the parser
> in the form of:
> 
>     if (lookahead) {
>         lookahead--;
>         return alreadyLexed[0];
>     } else {
>         return goAheadAndLex();
>     }

FWIW, Cachegrind's branch predictor simulation says this branch is less unpredictable than you'd expect. For an augmented version of parsemark, I see a misprediction rate a bit over 1%.

Bug 637549 takes a different tack; with a slight tweak to the parsing of expressions, the number of calls to matchToken() (and thus the number of calls to getToken()) can be reduced dramatically.