optimize RegExp.prototype.test with leading .*

RESOLVED WORKSFORME

Status

()

Core
JavaScript Engine
RESOLVED WORKSFORME
7 years ago
3 years ago

People

(Reporter: luke, Unassigned)

Tracking

(Blocks: 1 bug)

unspecified
mozilla11
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox10-)

Details

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

7 years ago
For this code:

var re = /.*star.*/i;
var str = "The Shawshank Redemption (1994)";
for (var k = 0; k < 1000000; k++)
  re.test(str);

we are 21x slower than v8.  However if you change that to "str.search(re)", then we are actually 13% faster.  What gives?  v8 has a special case optimization that strips the leading .* off regexps when called for 'test' (when the third char is not a ?).

Sub-optimally-written regexps keep popping up and it seems like it would be beneficial to have a general algebraic simplification pass on regexps.  I seem to recall research papers on this that we could leverage.  Regardless, I think we should do this .* optimization first :)  The example is taken from the Peacekeeper stringFilter sub-test, where we are 13x worse than Chrome.  I don't know how the sub-test scores are weighted, but this may be significantly hurting the overall score.
> I don't know how the sub-test scores are weighted

Neither does anyone else, short of reading their code: they don't disclose that information.
Actually, bug 499198 comment 5 says how they do the scoring.

Given that, a speedup by a factor of N on stringFilter would mean a score increase by a factor of N^(1/5) on the "string" part of peacekeeper, and hence a score increase by a factor of N^(1/25) for the overall benchmark.

If we can get a 10x speedup here, that's 10% increase in our overall benchmark score.  A 20x speedup (which sounds plausible given comment 0) is a 13% increase in overall benchmark score.

For just the "string" subsection, 10x and 20x correspond to 58% and 82% speedups respectively.
tracking-firefox10: --- → +
(In reply to Luke Wagner [:luke] from comment #0)
> For this code:
> 
> var re = /.*star.*/i;
> var str = "The Shawshank Redemption (1994)";
> for (var k = 0; k < 1000000; k++)
>   re.test(str);
> 
> we are 21x slower than v8.  However if you change that to "str.search(re)",
> then we are actually 13% faster.  What gives?  v8 has a special case
> optimization that strips the leading .* off regexps when called for 'test'
> (when the third char is not a ?).

I don't get this. Consider

 var greedy = /.*(ab.)/;
 var nongreedy = /.*?(ab.)/;
 var m1 = "ababc".match(greedy);
 var m2 = "ababc".match(nongreedy);

m1[1] will be "abc". m[2] will be "aba". The difference doesn't matter when using re.test, but really, /.*?/ is "closer" to re.search than /.*/ is. So if anything, I'd expect it to be the other way around. But really, I'd think it should strip off either a leading .*? or .*

What am I missing? (And it bothers me that this matters at all. Shouldn't re.test be converting to a DFA anyway, at least if you're not using whatever JS regexp features are not DFA-able? Looks like JS supports backreferences, so that'd be one. But re.test makes capturing groups not matter, which is the usual big one.)
(In reply to Steve Fink [:sfink] from comment #4)

You're right, it's for lazy star.

Currently the same native code is emitted for exec and test operations. Currently, native code is not emitted for patterns with backreferences.

With the regexp cache this should be an easy hack -- we'll just compile a different RegExPrivate with we peephole in |test| and purge it from the RegExpObject before we return. Next iteration through the loop it'll hit in the cache.
(In reply to Steve Fink [:sfink] from comment #4)

Actually, I spoke too soon.

http://code.google.com/p/v8/source/browse/branches/bleeding_edge/src/regexp.js?r=9401#260

So it's greedy star. Presumably the reasoning was:

    a) .* starts by eating as many characters as possible and gives them up one by one
    b) the regular test (without .*) eats as few characters as possible before attempting
       to match
    c) whether a match exists or not somewhere in the input string is independent of
       how many characters were *originally* eaten before/after it. The greedy star
       will give the characters up, index by index, until it finds-or-never-finds a match,
       same as the regular test but approaching from the opposite direction.

HOWEVER! :-) This doesn't quite work because we update the RegExpStatics on a successful |test|, in which case you could observe the difference in capturing groups. So, they actually run an |exec| on the original *after* seeing that the |test| with the greedy removed is true. Makes the negative case a lot faster in some instances, apparently.
Hmmm... I actually think in order to do this we may have to do a single entry cache ({RegExpObject => test-only RegExpPrivate} on the compartment) as well. I hate single entry caches.

The basic issue is that our existing cache maps {source atom => existing RegExpPrivate}, but taking a substring doesn't produce an atom.

The original regexp has a LinearString as its source, which we need to take a substring (DependentString) of and atomize, which is O(n). Instead, we *could* put a N entry cache for identity-based JSString=>JSAtom conversions per compartment (where N is most easily N=1), which could have other benefits for lock elision in the absence of single threaded runtime.

Luke, feedback?
Assignee: general → cdleary
Status: NEW → ASSIGNED
Created attachment 566724 [details] [diff] [review]
WIP: dot star performance hack.

This reduces the time on the microbenchmark by an order of magnitude (1670ms => 130ms). On my machine v8 clocks in at about 80ms. It's dependent on the RegExpPrivate cache from bug 634654.

We can file a followup bug if people want further investigation on the remaining performance difference (I've got to get back to IonMonkey stuff.)
Attachment #566724 - Flags: review?
Attachment #566724 - Flags: review? → review?(mrbkap)
I wouldn't worry about the remaining difference for now; followup bug makes total sense for that!
Oh yeah, forgot to attribute! Luke had a much nicer idea than the single element cache, which was to change the cache into a mapping from "original source" to a tagged union of ExecCapableRegExpPrivate|TestOptimizedRegExpPrivate. Bam.
Comment on attachment 566724 [details] [diff] [review]
WIP: dot star performance hack.

Review of attachment 566724 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/RegExpObject.h
@@ +407,5 @@
> +
> +    RegExpPrivateCacheKind kind() const {
> +        return (bits & 0x1)
> +                 ? RegExpPrivateCache_TestOptimized
> +                 : RegExpPrivateCache_ExecCapable;

Nit, the ? and : portions of this are slightly overindented.
Attachment #566724 - Flags: review?(mrbkap) → review+
We should get this crash sig sorted out before landing this RegExpPrivate-cache dependent patch.
Depends on: 700915
Created attachment 575334 [details] [diff] [review]
Dot star performance hack.

Sorry Blake, I ended up rotting the old patch with refactoring. Since it's changed quite a bit, do you mind looking this rebased one over before I commit it? Thankfully the refactoring made this a bit cleaner. Thanks!
Attachment #566724 - Attachment is obsolete: true
Attachment #575334 - Flags: review?(mrbkap)

Updated

7 years ago
Attachment #575334 - Flags: review?(mrbkap) → review+
I think something got messed up in the rebase, I see a slowdown on the test case! Backing out and investigating.
It's being removed from the cache because the refcount for the test optimized RegExpPrivate is dropping to zero when the matcher is complete. In the prior revision of the patch I created a dummy RegExpObject to persist until the next GC and hold a strong ref to the TestOptimized RegExpPrivate, which I didn't rebase.
Created attachment 577450 [details] [diff] [review]
Small fix that creates the dummy RegExp object again.

Talked this over with luke -- we only do this optimization with RegExps that are prefixed with .*, so creating a dummy object in that case really isn't such a big deal.
Attachment #577450 - Flags: review?(luke)
(Reporter)

Comment 19

7 years ago
Comment on attachment 577450 [details] [diff] [review]
Small fix that creates the dummy RegExp object again.

I don't even think the N.B. comment is necessary... the pathological case doesn't exactly leap to mind :)
Attachment #577450 - Flags: review?(luke) → review+
(In reply to Steve Fink [:sfink] from comment #4)
> But re.test makes capturing groups not matter, which is the usual big one.)

No, even ignoring the RegExp statics, you have to check for a reference to a captured group in the pattern. So re.test can't in general make capturing groups not matter. In particular and common patterns, it could, though.

Henry Spencer long ago had some utopian-optimal DFA implementation. Not sure anyone has tried using it for DFA-able JS regexps.

/be
https://hg.mozilla.org/mozilla-central/rev/eacdec27e5d3
Status: ASSIGNED → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
Dave - you tracked this for FF10. Is this still a concern for beta 10?
(In reply to Alex Keybl [:akeybl] from comment #23)
> Dave - you tracked this for FF10. Is this still a concern for beta 10?

Nah, I marked it because it was deemed a pretty important optimization. It's fine to ship it in 11.
tracking-firefox10: + → -
(Reporter)

Updated

6 years ago
Duplicate of this bug: 503107
(Reporter)

Comment 26

6 years ago
So it turns out the optimization was never actually turned on in the original patch.  See what happens to 'stripped' in:
  https://hg.mozilla.org/mozilla-central/rev/eacdec27e5d3#l6.88
The original patch did however land a test that would have failed if the optimization had been enabled though: the issue in comment 6 was never addressed and thus the RegExp statics can be incorrect.

In bug 724748 I accidentally fixed this bug (thereby enabling the .* optimization) but then I immediately hit the failing test.  So I explicitly disabled it again: http://hg.mozilla.org/mozilla-central/file/cb01e23f83cf/js/src/builtin/RegExp.cpp#l530.

Thus, this bug remains open pending a fix to jit-test/tests/basic/bug691797-regexp-1.js.
I did a quick test building with the .* optimization enabled in an optimized non-debug build, and the score went from 2670.5 (latest nightly) to 21459.5. In the latest Chrome I get 32311, so we're getting a lot closer.

(In reply to Luke Wagner [:luke] from comment #26) 
> Thus, this bug remains open pending a fix to
> jit-test/tests/basic/bug691797-regexp-1.js.

Perhaps we should mark it as REOPENED then, to avoid confusion?
(Reporter)

Comment 28

6 years ago
Yes indeed.  (I actually tried to mark it REOPENED but this was lost when I mid-aired myself with comment 25 ;)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Mass-reassigning cdleary's bugs to default. He won't work on any of them, anymore. I guess, at least.

@cdleary: shout if you take issue with this.
Assignee: cdleary → general
Status: REOPENED → NEW
Luke: Is this StartsWithGreedyStar() optimization for still relevant? In comment 26, you said the optimization is disabled, pending a fix for a broken test. Since then, however, the StartsWithGreedyStar() code was removed by YARR MatchOnly bug 808245.
Flags: needinfo?(luke)
(Reporter)

Comment 31

5 years ago
It would probably still help peacekeeper unless we've optimized this a different way.  Jan: do you know?
Flags: needinfo?(luke)

Updated

5 years ago
Flags: needinfo?(jdemooij)
(In reply to Luke Wagner [:luke] from comment #31)
> It would probably still help peacekeeper unless we've optimized this a
> different way.  Jan: do you know?

We're now about as fast as V8 on PK stringFilter. Comment 0 mentions 13x slower, so apparently something fixed this. Maybe something in YARR?
Flags: needinfo?(jdemooij)
(Assignee)

Updated

4 years ago
Assignee: general → nobody

Updated

3 years ago
Status: NEW → RESOLVED
Last Resolved: 7 years ago3 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.