Closed Bug 643646 Opened 13 years ago Closed 2 years ago

Analyze v8-regexp performance

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

RESOLVED WORKSFORME

People

(Reporter: dmandelin, Unassigned)

The usual. 2x difference here.
The test is complicated and seems hard to simplify. It really just runs a lot of regexes against a lot of texts. There is a hierarchy: there are 12 "blocks" run by functions with names runBlock0 ... runBlock11. The earlier blocks have longer run time than the later ones. It seems we are rather slower on the first few blocks, but equal or faster on most of the later blocks.

Each block runs a bunch of loops, where each loop usually runs just one regex. I examined block 0 in detail. It runs 19 loops, the first for 6511 iterations, decreasing to 85 iterations for the last. Again, the earlier loops generally run longer. Loops 1-5 take up a good portion of the total. And again, v8/nocs (V8 run with --nocrankshaft) is faster on the earlier, long-running loops, but the results vary more toward the end.

It looks like a fair chunk of our slowdown is from these two regexes:

    /^ba/
    /(((\w+):\/\/)([^\/:]*)(:(\d+))?)?([^#?]*)(\?([^#]*))?(#(.*))?/

If I take off the "^" from the first one, we are faster. So they might have some special handling for that. I'm not sure about the second regex. We don't seem to be falling back to PCRE.
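To isolate these two regexes outside the benchmark, a small timing harness along the following lines could be used. This is only a sketch: the `timeRegex` helper, the subject strings, and the iteration count are all hypothetical placeholders, not the benchmark's real data, so the absolute numbers will not match v8-regexp.

```javascript
// Sketch only: the subject strings and iteration count are placeholders,
// not the benchmark's real input data or loop counts.
function timeRegex(re, subject, iterations) {
  let matches = 0;
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    if (re.exec(subject) !== null) matches++;
  }
  return { source: re.source, ms: Date.now() - start, matches: matches };
}

const iterations = 100000;
const plainText = "x".repeat(200) + "ba";                 // hypothetical input
const urlText = "http://example.com:8080/path?q=1#frag";  // hypothetical input

const results = [
  // Anchored: can only match at position 0.
  timeRegex(/^ba/, plainText, iterations),
  // Same pattern without the anchor, for comparison.
  timeRegex(/ba/, plainText, iterations),
  // The URL-splitting regex from the benchmark.
  timeRegex(/(((\w+):\/\/)([^\/:]*)(:(\d+))?)?([^#?]*)(\?([^#]*))?(#(.*))?/,
            urlText, iterations),
];

for (const r of results) {
  console.log(r.source + ": " + r.ms + " ms, " + r.matches + " matches");
}
```

Running the same file in both shells and comparing the anchored and unanchored timings would show whether the "^" case alone reproduces the gap.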

If I run just the first loop (re0 and s0) with 'perf stat', I get:

 Performance counter stats for '/home/dmandelin/sources/v8/shell --nocrankshaft /home/dmandelin/things/vbench/re/re1.js':

     137.956482  task-clock-msecs         #      0.995 CPUs
              1  context-switches         #      0.000 M/sec
              2  CPU-migrations           #      0.000 M/sec
           1422  page-faults              #      0.010 M/sec
      442418798  cycles                   #   3206.945 M/sec
      984708882  instructions             #      2.226 IPC
         339112  cache-references         #      2.458 M/sec
          56490  cache-misses             #      0.409 M/sec

    0.138706128  seconds time elapsed

 Performance counter stats for '/home/dmandelin/sources/tracemonkey/js/src/o32/shell/js -a -m /home/dmandelin/things/vbench/re/re1.js':

     201.616454  task-clock-msecs         #      0.995 CPUs
              1  context-switches         #      0.000 M/sec
              0  CPU-migrations           #      0.000 M/sec
           1089  page-faults              #      0.005 M/sec
      646635540  cycles                   #   3207.256 M/sec
     1341038081  instructions             #      2.074 IPC
          95372  cache-references         #      0.473 M/sec
          38913  cache-misses             #      0.193 M/sec

    0.202561741  seconds time elapsed

It looks like v8 just executes proportionally fewer instructions.
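As a quick check on that reading, the ratios can be computed directly from the two 'perf stat' outputs above. The snippet below is just arithmetic on the reported counters; the variable names (`v8`, `sm`, `ratio`, `ipc`) are made up for illustration.

```javascript
// Counters copied from the two 'perf stat' runs above.
const v8 = { ms: 137.956482, cycles: 442418798, instructions: 984708882 };
const sm = { ms: 201.616454, cycles: 646635540, instructions: 1341038081 };

// Ratio of the tracemonkey shell to the v8 shell for a given counter.
const ratio = (key) => sm[key] / v8[key];

// Instructions per cycle for one run.
const ipc = (run) => run.instructions / run.cycles;

console.log("time ratio:         " + ratio("ms").toFixed(3));
console.log("cycle ratio:        " + ratio("cycles").toFixed(3));
console.log("instruction ratio:  " + ratio("instructions").toFixed(3));
console.log("IPC (v8 / ours):    " + ipc(v8).toFixed(3) + " / " + ipc(sm).toFixed(3));
```

On this loop we take about 1.46x the time and cycles while executing about 1.36x the instructions, so most of the gap is instruction count, with the remainder coming from the slightly lower IPC.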

I'm not inclined to dig much deeper here before refreshing the Yarr import and seeing how things look there. Beyond that, this looks like detail work: to improve our score, we have to examine our compiled code for each regex that takes a lot of time and figure out how to improve it.

The bug assignee hasn't logged in to Bugzilla in the last 7 months, so the assignee is being reset.

Assignee: dmandelin → nobody

This bug was opened several regexp engines ago, and targets a benchmark we no longer care about. Our current regexp engine is shared with V8, so neither browser should have much of an advantage here. I think it's safe to close this.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME