Closed Bug 548621 Opened 14 years ago Closed 14 years ago

JavaScript tests - create suite of tests for parsing perf

Categories

(Core :: JavaScript Engine, defect)

Priority: Not set
Severity: normal

Tracking

Status: RESOLVED FIXED

People

(Reporter: cdleary, Assigned: cdleary)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(2 files, 5 obsolete files)

~20 large files used in real-world apps.
Assignee: general → cdleary
I think I have a usable corpus now -- here are the optimized build results on my 1.5GHz Core 2 under Linux.

$ ./parse_runner 
Computing averages over 10 runs with 5 warm-up runs
{ /* script: average parse seconds */
          "tetris_combined.js":    0.017,
                  "yui-min.js":    0.006,
       "effectgames_galaxy.js":    0.061,
          "zimbra_combined.js":    0.515,
                 "MochiKit.js":    0.062,
       "280slides_combined.js":    0.014,
               "jquery.min.js":    0.021,
         "twitter_combined.js":    0.085,
                       "ga.js":    0.012,
           "pipio_combined.js":    0.353,
            "jsgb_combined.js":    0.042,
                     "dojo.js":    0.030,
           "gmail_combined.js":    0.396,
         "gravity_combined.js":    0.118,
        "processing_twitch.js":    0.009,
       "ball_pool_combined.js":    0.063
}
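
For reference, a minimal sketch of the measurement loop (warm-up runs followed by timed runs whose wall-clock times are averaged); the function names and the parse_file callable are illustrative stand-ins, not the tool's actual code:

    import time

    WARMUP_RUNS = 5
    TIMED_RUNS = 10

    def average_parse_seconds(parse_file, filepath):
        # Warm-up runs prime caches and the allocator; their times are discarded.
        for _ in range(WARMUP_RUNS):
            parse_file(filepath)
        total = 0.0
        for _ in range(TIMED_RUNS):
            t0 = time.time()
            parse_file(filepath)
            total += time.time() - t0
        return total / TIMED_RUNS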


OProfile lists the following as the top culprits in the optimized build:

$ sudo opcontrol --reset && sudo opcontrol --start && ./parse_runner && sudo opcontrol --shutdown && opreport -l -t1
[...]
samples  %        app name                 symbol name
68131    16.2507  parse_runner             js_GetToken
28997     6.9164  parse_runner             GetChar(JSTokenStream*)
22671     5.4075  no-vmlinux               /no-vmlinux
19316     4.6073  parse_runner             UnlinkFunctionBoxes(JSParseNode*, JSTreeContext*)
14850     3.5420  parse_runner             js_MatchToken
14820     3.5349  parse_runner             js_EmitTree
(In reply to comment #1)
>
> 22671     5.4075  no-vmlinux               /no-vmlinux

Hmm, that's a lot of time in there. We should probably get malloc stacks for this stuff (easy to do with dtrace/Instruments); we might find some malloc storms.
POSIX-y parsing performance evaluation command-line tool.
Attachment #429908 - Flags: review?(jorendorff)
Suite of big JS payloads from across the web.
Forgot to close the file descriptors.
Attachment #429908 - Attachment is obsolete: true
Attachment #429918 - Flags: review?
Attachment #429908 - Flags: review?(jorendorff)
Attachment #429918 - Flags: review? → review?(jorendorff)
Good code. I think it would be better to expose a compile() function to the JS shell, sort of a poor man's Script object, and then run this like SunSpider.

  x = snarf(path);

  if (this.startShark) { connectShark(); startShark(); }
  if (this.startCallgrind) startCallgrind();
  var t0 = new Date;

  compile(x);

  var t1 = new Date;
  if (this.stopShark) { stopShark(); disconnectShark(); }
  if (this.stopCallgrind) { stopCallgrind(); dumpCallgrind(); }

I assume the Date object has good enough resolution. (If not, some stuff's broken...)
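
Assuming the proposed compile() builtin lands, a driver could be roughly the sketch below (the -e flag usage and the harness string are illustrative; the real plumbing would differ):

    import subprocess

    HARNESS = ('var src = snarf(%(path)r); '
               'var t0 = new Date; '
               'compile(src); '
               'var t1 = new Date; '
               'print(t1 - t0);')

    def bench_shell(shellpath, jspath):
        # Run the harness in the JS shell and read back the elapsed ms.
        proc = subprocess.Popen([shellpath, '-e', HARNESS % {'path': jspath}],
                                stdout=subprocess.PIPE)
        out, _ = proc.communicate()
        return int(out.strip())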
Attachment #429918 - Flags: review?(jorendorff)
Depends on: 549971
Attachment #429918 - Attachment is obsolete: true
Attachment #430367 - Flags: review?
Attachment #430367 - Flags: review? → review?(jorendorff)
>    shellpath = find_shell()
>    if not shellpath:
>        print >> sys.stderr, 'Could not find shell'
>        return -1

This logic ignores the "-s" option (options.shell).

>    for filepath in gen_filepaths(dirpath):
>        print '{0:>30}: {1:>6} ms'.format(os.path.split(filepath)[-1],
>                benchfile(filepath))

The original C++ parse-runner you wrote generated valid JSON, which I was using
to display benchmark results in a web page. Easy enough to recover, e.g.

    print '{0:>30}: {1:>6}'.format('"' + os.path.split(filepath)[-1] + '"',
            benchfile(filepath))

Nice tests, BTW-- they've decisively proved my refactoring to be consistently
slower. :/

Dave
This program should be less sensitive to noise in its determinations, and it lets you provide a baseline file. If it tells you something is faster, it's because t_avg + t_stddev < baseline_t_avg - baseline_t_stddev. (It somewhat incorrectly terms these the "worst" and "best" times from the respective runs.)
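
Concretely, the decision rule is (in sketch form; names mirror the script's output, not its exact code):

    def classify(t_avg, t_stddev, base_t_avg, base_t_stddev):
        # "Worst" and "best" are the one-stddev pessimistic/optimistic times.
        t_worst, t_best = t_avg + t_stddev, t_avg - t_stddev
        base_t_worst = base_t_avg + base_t_stddev
        base_t_best = base_t_avg - base_t_stddev
        if t_worst < base_t_best:
            return 'FASTER'   # even the worst run beats the baseline's best
        if t_best > base_t_worst:
            return 'SLOWER'   # even the best run loses to the baseline's worst
        return 'Meh.'         # the intervals overlap; call it a wash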

For example, basic function inlining found the following speedups:

            tetris_combined.js: Meh.
           gravity_combined.js: FASTER: worst time 134.92 > baseline best time 139.37
                         ga.js: Meh.
              jsgb_combined.js: FASTER: worst time 52.87 > baseline best time 54.33
                 jquery.min.js: FASTER: worst time 26.95 > baseline best time 27.68
           twitter_combined.js: FASTER: worst time 101.05 > baseline best time 103.23
         280slides_combined.js: Meh.
             gmail_combined.js: FASTER: worst time 509.58 > baseline best time 510.78
                       dojo.js: FASTER: worst time 37.95 > baseline best time 38.23
                v8_combined.js: FASTER: worst time 108.39 > baseline best time 111.45
          processing_twitch.js: Meh.
            zimbra_combined.js: Meh.
         effectgames_galaxy.js: FASTER: worst time 79.77 > baseline best time 81.41
         ball_pool_combined.js: FASTER: worst time 80.92 > baseline best time 82.26
                    yui-min.js: Meh.
                   MochiKit.js: FASTER: worst time 79.22 > baseline best time 81.50
             pipio_combined.js: Meh.
         sunspider_combined.js: FASTER: worst time 61.71 > baseline best time 62.78
Attachment #430367 - Attachment is obsolete: true
Attachment #430367 - Flags: review?(jorendorff)
(In reply to comment #8)

Whoops, will fix that. I also had to remove the `str.format` invocations because they're 2.6+ and it's supposed to be 2.4 compatible.
Fixes the explicit shell path bug. Now I think this is ready for review.
Attachment #430456 - Attachment is obsolete: true
Attachment #430459 - Flags: review?
Attachment #430459 - Flags: review? → review?(jorendorff)
(In reply to comment #9)
> Created an attachment (id=430456) [details]
> Better parser benchmarking script.
> 
> This program should be less sensitive to noise in its determinations and lets
> you provide a baseline file. If it tells you something is faster, it's because
> t_avg + t_stddev < baseline_t_avg - baseline_t_stddev. (It somewhat incorrectly
> terms these the "worst" and "best" times from the respective runs.)

That's a pretty good method. The t test is the standard statistical test for whether two normal distributions have different means; your method is probably not far from flagging things that the t test says are different at the 95% confidence level. And your underlying data may not even be normal (benchmark scores aren't, in my experience), so doing an actual t test would probably not add much value.
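
(For comparison, a sketch of what the actual test would look like using SciPy; Welch's variant doesn't assume equal variances, and scipy would be an extra dependency for the script.)

    from scipy import stats

    def means_differ(times, baseline_times, alpha=0.05):
        # Welch's two-sample t test on the raw per-run timings.
        t_stat, p_value = stats.ttest_ind(times, baseline_times,
                                          equal_var=False)
        return p_value < alpha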
FYI baseline results on my Core 2 machine with CPU freq pinned at 1GHz.

{
      "tetris_combined.js": {"average_ms":   22, "stddev_ms":   0.81},
     "gravity_combined.js": {"average_ms":  140, "stddev_ms":   0.63},
                   "ga.js": {"average_ms":   16, "stddev_ms":   0.71},
        "jsgb_combined.js": {"average_ms":   55, "stddev_ms":   0.67},
           "jquery.min.js": {"average_ms":   28, "stddev_ms":   0.32},
     "twitter_combined.js": {"average_ms":  104, "stddev_ms":   0.77},
   "280slides_combined.js": {"average_ms":   18, "stddev_ms":   0.39},
       "gmail_combined.js": {"average_ms":  521, "stddev_ms":  10.22},
                 "dojo.js": {"average_ms":   39, "stddev_ms":   0.77},
          "v8_combined.js": {"average_ms":  112, "stddev_ms":   0.55},
    "processing_twitch.js": {"average_ms":   12, "stddev_ms":   0.55},
      "zimbra_combined.js": {"average_ms":  672, "stddev_ms":  12.37},
   "effectgames_galaxy.js": {"average_ms":   82, "stddev_ms":   0.59},
   "ball_pool_combined.js": {"average_ms":   83, "stddev_ms":   0.74},
              "yui-min.js": {"average_ms":    9, "stddev_ms":   3.19},
             "MochiKit.js": {"average_ms":   82, "stddev_ms":   0.50},
       "pipio_combined.js": {"average_ms":  459, "stddev_ms":  11.48},
   "sunspider_combined.js": {"average_ms":   63, "stddev_ms":   0.22}
}
There's an extra "import json" at the top which probably defeats the careful try_import_json stuff.

>        if t_worst < base_t_best: # Worst takes less time (better) tha...
>            result = 'FASTER: worst time %.2f > baseline best time %.2...
>        elif t_best > base_t_worst: # Best takes more time (worse) tha...
>            result = 'SLOWER: best time %.2f < baseline worst time %.2...

The code is correct, but the greater-than and less-than signs in the output are reversed.

>        if isinstance(relpath, list):
>            path_parts += relpath
>        else: path_parts.append(relpath)

Line break and indent after the colon, please.

>def gen_filepaths(dirpath, target_ext='.js'):
>    for filename in os.listdir(dirpath):
>        if not filename.endswith('.js'):
>            continue
>        yield os.path.join(dirpath, filename)

The parameter target_ext is not used. Also, please rewrite it to avoid continue:

         if filename.endswith('.js'):
             yield ...

>def avg(seq): return sum(seq) / len(seq)

Again, please splurge and buy us an extra line for that.

bench() doesn't respect --quiet.

This is great stuff. r=me with the changes.
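
Taken together, the requested cleanups come out to roughly this (a sketch; the json fallback assumes the try_import_json approach means falling back to simplejson on 2.4/2.5):

    import os

    try:
        import json                    # stdlib on 2.6+
    except ImportError:
        import simplejson as json      # assumed 2.4/2.5 fallback

    def gen_filepaths(dirpath, target_ext='.js'):
        # Yield paths of files in dirpath that have the target extension.
        for filename in os.listdir(dirpath):
            if filename.endswith(target_ext):
                yield os.path.join(dirpath, filename)

    def avg(seq):
        return sum(seq) / len(seq)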
Attachment #430459 - Flags: review?(jorendorff) → review+
One more thing -- it would be kind to op.print_help() instead of the less helpful op.error() when the user runs the script without arguments.
> bench() doesn't respect --quiet.

Also, it wouldn't hurt to print its output to stderr.

Dave
(In reply to comment #15)
> One more thing -- it would be kind to op.print_help() instead of the less
> helpful op.error() when the user runs the script without arguments.

OptionParser.error prints the help out as well:

$ ./js/src/tests/parsemark.py
Usage: parsemark.py [options] dirpath

Pulls performance data on parsing via the js shell.
[...]

parsemark.py: error: dirpath required
Attachment #430459 - Attachment is obsolete: true
Attachment #430727 - Flags: review?
Attachment #430727 - Flags: review? → review?(jorendorff)
Attachment #430727 - Flags: review?(jorendorff) → review+
(In reply to comment #17)
> OptionParser.error prints the help out as well:

It doesn't show the options.
FYI: the mean across our JS corpus for pushed-back tokens encountered in the first GetToken loop is 0.84.
http://hg.mozilla.org/mozilla-central/rev/f32520cac623
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED