Closed Bug 511777 Opened 16 years ago Closed 16 years ago

Avoid RegExp engine for String regex ops taking flat strings

Categories

(Core :: JavaScript Engine, defect)

Product:

Component:

Type:

defect

Priority:

Not set

Severity:

normal

Tracking

()

Status:

RESOLVED FIXED

Tracking Flags:

Tracking

Status

status1.9.2

---

beta1-fixed

People

(Reporter: luke, Assigned: luke)

References

Details

Attachments

(1 file, 3 obsolete files)

optimize for flat patterns 16 years ago Luke Wagner [:luke] 11.16 KB, patch		Details \| Diff \| Splinter Review
v.2 16 years ago Luke Wagner [:luke] 17.69 KB, patch		Details \| Diff \| Splinter Review
v.3, rebased and with comments addressed 16 years ago Luke Wagner [:luke] 16.87 KB, patch		Details \| Diff \| Splinter Review
v.4, nicer 16 years ago Luke Wagner [:luke] 21.26 KB, patch	Waldo : review+	Details \| Diff \| Splinter Review

Luke Wagner [:luke]

Assignee

Description

•

16 years ago

For operations like: "abcd".search("c") "abcd".match("c") "abcd".replace("c","C") we create a whole RegExp, even for 'replace', where the spec says that, when not a RegExp, the pattern argument is interpreted as a flat string. It seems like there is a fair amount of boilerplate to wade through to get to the underlying, essentially-strstr(), algorithm. I propose that we just implement said loop directly for flat strings. In addition to flat Strings in String.replace, I propose that we try to do this for 'search' and 'match' strings by searching the [length < fixed upper bound] pattern string for regexp meta-characters. To avoid fragmenting regexp logic, the functionality could be put in jsregexp.cpp and exported as functions: bool js_CharsRequireRegExpMatching(const char *, size_t); bool js_ExecuteFlatRegeExp(const char *, size_t, const char *, size_t); Does this sound like a good idea? It seems like it would show up in the benchmarks we care about.

Luke Wagner [:luke]

Assignee

Comment 1

•

16 years ago

(Depending on bug 460904 because it de-obfuscates String RegExp methods.)

David Mandelin [:dmandelin]

Comment 2

•

16 years ago

For the most part this seems OK, but one thing to check is whether |strstr| is faster than a regexp compiled to native code. It may not be, because the compilation achieves loop unrolling (specialized on the flat string contents). You could certainly simplify the compilation path a lot in any case.

Luke Wagner [:luke]

Assignee

Comment 3

•

16 years ago

That is a valid point, but: - I recall trace-lookup being somewhat expensive. Its possible that with that and all the other boilerplate surrounding the underlying loop, strstr could complete before the the regexp engine even starts. - We could do something like Duff's device to unroll for ourselves. Since the effectiveness of unrolling dies off a certain number of iterations, viz., when the loop body is longer than an i-cache line or when memory loads are the bottleneck, this should be able to achieve equivalent performance.

David Mandelin [:dmandelin]

Comment 4

•

16 years ago

(In reply to comment #3) > That is a valid point, but: > - I recall trace-lookup being somewhat expensive. Its possible that with that > and all the other boilerplate surrounding the underlying loop, strstr could > complete before the the regexp engine even starts. That is definitely true for small text strings, which makes these things hard to tune. > - We could do something like Duff's device to unroll for ourselves. Since the > effectiveness of unrolling dies off a certain number of iterations, viz., when > the loop body is longer than an i-cache line or when memory loads are the > bottleneck, this should be able to achieve equivalent performance. ++ on the 'viz.' usage. ;-) But Duff's doesn't specialize for the pattern string. I'm pretty sure memory loads *are* the bottleneck, but without specializing you have more loads for the pattern text. And I assume you mean bigger (ish) than the i-cache, not just one cache line. Which reminds me, I wanted to dig out my architecture textbooks to refresh myself on that stuff. Thanks!

Luke Wagner [:luke]

Assignee

Comment 5

•

16 years ago

(In reply to comment #4) > I'm pretty sure memory loads *are* the bottleneck, but without > specializing you have more loads for the pattern text. I expect the bottleneck would be with the memory-controller, and since the pattern text would be kept in the d-cache by constant usage, it shouldn't be sapping the memory-controller bandwidth. But my argument is growing further dependent on the micro-architecture; it is definitely worth doing an experiment before going any further, though.

Brendan Eich [:brendan]

Comment 6

•

16 years ago

We use Boyer-Moore-Horspool from String.prototype.indexOf, with thresholding that badly needs tuning. The BMH helper is factored out. /be

David Mandelin [:dmandelin]

Comment 7

•

16 years ago

(In reply to comment #5) > (In reply to comment #4) > > I'm pretty sure memory loads *are* the bottleneck, but without > > specializing you have more loads for the pattern text. > > I expect the bottleneck would be with the memory-controller, and since the > pattern text would be kept in the d-cache by constant usage, it shouldn't be > sapping the memory-controller bandwidth. But my argument is growing further > dependent on the micro-architecture; it is definitely worth doing an experiment > before going any further, though. I just so happen to have a directory full of regexp design files. I tried it on a sunspider-y flat regexp and the perf was not detectably different between the ->native approach and the wcsstr approach. BMH makes sense.

Luke Wagner [:luke]

Assignee

Comment 8

•

16 years ago

So I did a conservative first stab: only String.search, no loop unrolling, no BMH. On the microbenchmark: s = 0; for (var i = 0; i < 1000000; ++i) { s += "asdfasdfasdfasdfasdfasdfasdfasdfasdfilyhb".search("ilyhb"); } I get about 5x speedup without JIT and 10x with. I can't find any uses of flat String.search in sunspider, v8, or peacekeeper, but this portends good speedup for match and replace, of which there are many with flat string arguments in peackeeper/string.

Mike Shaver (:shaver emeritus)

Comment 9

•

16 years ago

Luke Wagner [:luke]

Assignee

Comment 10

•

16 years ago

Dup indeed.

Luke Wagner [:luke]

Assignee

Comment 11

•

16 years ago

Attached patch optimize for flat patterns (obsolete) — Details — Splinter Review

This patch does the flat-pattern optimization for String.replace, match, and search. Synthetic benchmarks (like the one above) show ~2x speedup for match and replace. On the stringChat, stringReplace portions of peacekeeper (calling run() on the raw .js file a few thousand times), the speedup is ~2x and for stringWeighted 20%.

Luke Wagner [:luke]

Assignee

Comment 12

•

16 years ago

Attached patch v.2 (obsolete) — Details — Splinter Review

So this patch gives a 40% higher score on Peacekeeper Text Parsing. (For reference, before: 1469, after: 2073) A couple of notes: - I did manual unrolling (a la Duff's device) and got a pretty good speedup (~40% doing indexOf in a loop, ~6% on peacekeeper Text Parsing). Yeah, its gross-lookin', but so is js_BoyerMooreHorspool, and now they are both factored into StringMatch so that the "which algorithm" logic is in one place. I bet if I root around jsstr.cpp I can find another place to use it... - I didn't end up tweaking the BMH threshhold because it seems that BMH wins not for string lengths > k, but for strings with many partial matches. Hence, even relatively short searches ("abcdabcdabcdabcdabcde".search("abcde")) show speedup, and conversely long inputs with no partial matches can show a tiny slowdown. The 512 lower bound makes sense by saying "we'll only try when the winning input wins big and the losing input loses small", so it looks as good as any other big number to me. - I noticed the types used in the BMH algo are too small for 64-bit systems. Changing BMH's types involved changing str_indexOf, so I factored the flat-matching algorithm so that indexOf and match/search/replace could all share. While I was in indexOf I added a path so that integer indices (e.g., as used in peacekeeper) don't involve a function calls and conversion to doubles and back. That's part of the peacekeeper speedup. This patch still doesn't look into regexps to see if they are flat or do anything special for global matches.

Attachment #397415 - Attachment is obsolete: true

Brendan Eich [:brendan]

Comment 13

•

16 years ago

Comment on attachment 397964 [details] [diff] [review] v.2 >diff --git a/js/src/jsapi.h b/js/src/jsapi.h >--- a/js/src/jsapi.h >+++ b/js/src/jsapi.h >@@ -258,16 +258,26 @@ UWORD_TO_ROOTED_JSVAL(JSContext *cx, siz > { > extern JSBool js_NewNumberInRootedValue(JSContext *cx, jsdouble d, jsval *vp); > if (i >= (size_t)JSVAL_INT_MAX) > return js_NewNumberInRootedValue(cx, i, vp); > *vp = INT_TO_JSVAL(i); > return JS_TRUE; > } > >+static JS_ALWAYS_INLINE JSBool >+WORD_TO_ROOTED_JSVAL(JSContext *cx, ptrdiff_t i, jsval *vp) >+{ >+ extern JSBool js_NewNumberInRootedValue(JSContext *cx, jsdouble d, jsval *vp); >+ if (!INT_FITS_IN_JSVAL(i)) >+ return js_NewNumberInRootedValue(cx, i, vp); >+ *vp = INT_TO_JSVAL(i); >+ return JS_TRUE; >+} >+ Not in jsapi.h! For one, js_NewNumberInRootedValue is not exported with JS_FRIEND_API let alone JS_PUBLIC_API. For another we do not worry about jsapi.h layering overhead. If the engine's internals need this inline, it should go in a non-public header. >diff --git a/js/src/jsregexp.cpp b/js/src/jsregexp.cpp >--- a/js/src/jsregexp.cpp >+++ b/js/src/jsregexp.cpp >@@ -5764,8 +5764,24 @@ JSBool > js_SetLastIndex(JSContext *cx, JSObject *obj, jsdouble lastIndex) > { > jsval v; > > return JS_NewNumberValue(cx, lastIndex, &v) && > JS_SetReservedSlot(cx, obj, 0, v); > } > >+bool >+js_FlatString(const jschar *chars, size_t length) >+{ >+ for (size_t i = 0; i < length; ++i) { >+ char c = chars[i]; >+ switch (c) { >+ /* Taken from the PatternCharacter production in 15.10.1. */ >+ case '^': case '$': case '\\': case '.': case '*': case '+': >+ case '?': case '(': case ')': case '[': case ']': >+ case '{': case '}': case '|': Half-indent (two spaces) case and goto labels, style guide sez... >+ return false; ... so you don't have to overindent this one c-basic-offset. >diff --git a/js/src/jsstr.cpp b/js/src/jsstr.cpp >--- a/js/src/jsstr.cpp >+++ b/js/src/jsstr.cpp >@@ -70,16 +70,18 @@ > #include "jsopcode.h" > #include "jsregexp.h" > #include "jsscope.h" > #include "jsstaticcheck.h" > #include "jsstr.h" > #include "jsbit.h" > #include "jsvector.h" > >+#include <wchar.h> Standard headers first, no need for blank line. >+#if 0 /* If someone decide's Duff's device is too gnarly. */ Duff's device rocks, no need for this #if 0 hedge. >+ switch ((textend - text) & 7) { >+ do { >+ case 0: if (*t++ == p0) { fixup = 8; goto match; } >+ case 7: if (*t++ == p0) { fixup = 7; goto match; } >+ case 6: if (*t++ == p0) { fixup = 6; goto match; } >+ case 5: if (*t++ == p0) { fixup = 5; goto match; } >+ case 4: if (*t++ == p0) { fixup = 4; goto match; } >+ case 3: if (*t++ == p0) { fixup = 3; goto match; } >+ case 2: if (*t++ == p0) { fixup = 2; goto match; } >+ case 1: if (*t++ == p0) { fixup = 1; goto match; } Half-indent case again, even when not directly within switch's body. I couldn't spot the <wchar> dependencies at a glance -- what were they? Nice speedup work! Gratifying to know my 1997-era SWAG of 512 survives. /be

Luke Wagner [:luke]

Assignee

Comment 14

•

16 years ago

(In reply to comment #13) > I couldn't spot the <wchar> dependencies at a glance -- what were they? I guess technically they are in my vim buffer history, which is to say, I was playing with wcsstr() and forgot to cut out the header :)

Luke Wagner [:luke]

Assignee

Comment 15

•

16 years ago

Attached patch v.3, rebased and with comments addressed (obsolete) — Details — Splinter Review

Attachment #397964 - Attachment is obsolete: true

Attachment #398052 - Flags: review?(jwalden+bmo)

Luke Wagner [:luke]

Assignee

Updated

•

16 years ago

Attachment #398052 - Flags: review?(jwalden+bmo)

Luke Wagner [:luke]

Assignee

Comment 16

•

16 years ago

Attached patch v.4, nicer — Details — Splinter Review

I thought I'd extend the optimization to flat RegExps, and in doing so I refactored/cleaned up and found bugs. Also, this patch assumes dvander's patch in bug 513348 which limits string lengths to 2^30-1 on x64 systems, so that I can take out the extra logic I added to handle too-big indices stored in jsvals. Then I realized (i.e., js/tests informed me) there is a lot of extra junk to do with RegExps, like setting all the static fields, so I turned that off. But the resulting patch is still better than v.3.

Attachment #398052 - Attachment is obsolete: true

Attachment #398218 - Flags: review?(jwalden+bmo)

Luke Wagner [:luke]

Assignee

Updated

•

16 years ago

Depends on: 513348
No longer depends on: 460904

Jeff Walden [:Waldo]

Comment 17

•

16 years ago

Comment on attachment 398218 [details] [diff] [review] v.4, nicer >diff --git a/js/src/jsregexp.cpp b/js/src/jsregexp.cpp >+bool >+js_FlatString(const jschar *chars, size_t length) >+{ >+ for (size_t i = 0; i < length; ++i) { >+ char c = chars[i]; >+ switch (c) { >+ /* Taken from the PatternCharacter production in 15.10.1. */ >+ case '^': case '$': case '\\': case '.': case '*': case '+': >+ case '?': case '(': case ')': case '[': case ']': case '{': >+ case '}': case '|': >+ return false; >+ default:; >+ } >+ } >+ return true; >+} This naming overload of "flat" is horrible. Invert its meaning and make that js_ContainsRegExpSpecialChars perhaps? Also, given that this is being exposed multiple places, it seems to me that it would be better as a method on JSString, with the usual naming-style changes. >diff --git a/js/src/jsstr.cpp b/js/src/jsstr.cpp >+js_BoyerMooreHorspool(const jschar *text, jsuint textlen, >+ const jschar *pat, jsuint patlen) >+{ > uint8 skip[BMH_CHARSET_SIZE]; >- jschar c; >- >- JS_ASSERT(0 < patlen && patlen <= BMH_PATLEN_MAX); >- for (i = 0; i < BMH_CHARSET_SIZE; i++) >+ >+ JS_ASSERT(patlen <= BMH_PATLEN_MAX); Why no > 0 assertion here, indeed why specifically remove it? >+ for (uint i = 0; i < BMH_CHARSET_SIZE; i++) > skip[i] = (uint8)patlen; uint as far as I can tell is an obsolescent type used in only one other location in SpiderMonkey; make this jsuint or some other unsigned type than uint. >+static JS_ALWAYS_INLINE jsint >+StringMatch(const jschar *text, jsuint textlen, >+ const jschar *pat, jsuint patlen) >+{ >+ if (patlen == 0) >+ return 0; >+ if (textlen < patlen) >+ return -1; >+ >+ /* XXX tune the BMH threshold (512) */ >+ if (textlen >= 512 && patlen <= BMH_PATLEN_MAX) { While you're here move 512 into a static const. >+ JSString *str; >+ NORMALIZE_THIS(cx, vp, str); >+ >+ JSString *str2 = ArgToRootedString(cx, argc, vp, 0); >+ if (!str2) >+ return JS_FALSE; patstr would be a better name for this variable. >+static JS_ALWAYS_INLINE bool >+DefProp(JSContext *cx, JSObject *obj, jsid id, jsval val) >+{ >+ return js_DefineProperty(cx, obj, id, val, JS_PropertyStub, JS_PropertyStub, >+ JSPROP_ENUMERATE); >+} This mini-typing aid doesn't seem necessary; just use obj->defineProperty(...).

Attachment #398218 - Flags: review?(jwalden+bmo) → review+

Luke Wagner [:luke]

Assignee

Comment 18

•

16 years ago

(In reply to comment #17) Thanks! > This naming overload of "flat" is horrible. Invert its meaning and make that > js_ContainsRegExpSpecialChars perhaps? Also, given that this is being exposed > multiple places, it seems to me that it would be better as a method on > JSString, with the usual naming-style changes. Roger on the renaming, but I don't agree on making it a member of JSString; this is regexp-specific logic and I don't think belongs in JSString. That's how one gets 2000-member classes. > >+js_BoyerMooreHorspool(const jschar *text, jsuint textlen, > >+ const jschar *pat, jsuint patlen) > >+{ > > uint8 skip[BMH_CHARSET_SIZE]; > >- jschar c; > >- > >- JS_ASSERT(0 < patlen && patlen <= BMH_PATLEN_MAX); > >- for (i = 0; i < BMH_CHARSET_SIZE; i++) > >+ > >+ JS_ASSERT(patlen <= BMH_PATLEN_MAX); > > Why no > 0 assertion here, indeed why specifically remove it? Oops, I got a bit overzealous when I switched to from jsint to jsuint. > >+static JS_ALWAYS_INLINE bool > >+DefProp(JSContext *cx, JSObject *obj, jsid id, jsval val) > >+{ > >+ return js_DefineProperty(cx, obj, id, val, JS_PropertyStub, JS_PropertyStub, > >+ JSPROP_ENUMERATE); > >+} > > This mini-typing aid doesn't seem necessary; just use obj->defineProperty(...). I added that so its easy to see what (id,value)s are being added, and not get lost in the river of text. Its much grosser looking without.

Jeff Walden [:Waldo]

Comment 19

•

16 years ago

We're quite far from JSString being a 2000-member class. Two JS_PropertyStubs are pretty ignorable, I think.

Luke Wagner [:luke]

Assignee

Comment 20

•

16 years ago

(In reply to comment #19) > We're quite far from JSString being a 2000-member class. Even if JSString had two members, the argument for not putting regular-expression logic in JSString would remain. Growing classes with giant interfaces happens 1 member at a time. > Two JS_PropertyStubs are pretty ignorable, I think. They might be if they were textually aligned, which, with reasonable column widths, they aren't. This is just factoring out redundancy, and I see it done in other places with as little or littler benefit as here.

Jeff Walden [:Waldo]

Comment 21

•

16 years ago

This seems like it would be preferable in JSObject, then, to avoid having everyone define their own helpers to do this (or not, as the case more often would be: inline definePropertyValue(JSContext* cx, jsid id, jsval v) { return defineProperty(cx, id, v, JS_PropertyStub, JS_PropertyStub, JSPROP_ENUMERATE); }

Luke Wagner [:luke]

Assignee

Comment 22

•

16 years ago

That's a nice idea. Alternatively, we could give the last 3 parameters of JSObject::defineProperty default values.

Jeff Walden [:Waldo]

Comment 23

•

16 years ago

(In reply to comment #22) > Alternatively, we could give the last 3 parameters of > JSObject::defineProperty default values. I'm in favor!

Brendan Eich [:brendan]

Comment 24

•

16 years ago

Dude, default parameters rule! /be

Luke Wagner [:luke]

Assignee

Comment 26

•

16 years ago

http://hg.mozilla.org/tracemonkey/rev/20ee84099e38

Luke Wagner [:luke]

Assignee

Comment 27

•

16 years ago

Hmm, mochitest/everythingelse failures...

Luke Wagner [:luke]

Assignee

Comment 28

•

16 years ago

backed out yesterday, found unthinkably stupid mistake made while addressing comment 17, http://hg.mozilla.org/tracemonkey/rev/f569d59461d4

Comment 29

•

16 years ago

http://hg.mozilla.org/mozilla-central/rev/f569d59461d4

Status: NEW → RESOLVED

Closed: 16 years ago

Resolution: --- → FIXED

Comment 30

•

16 years ago

http://hg.mozilla.org/releases/mozilla-1.9.2/rev/ad06044b2f3e

status1.9.2: --- → beta1-fixed

Flags: wanted1.9.2+

Boris Zbarsky [:bzbarsky]

Updated

•

16 years ago

Depends on: 526173

Updated

•

16 years ago

Depends on: 565608

Jan de Mooij [:jandem]

Updated

•

15 years ago

Depends on: 612838

You need to log in before you can comment on or make changes to this bug.