Closed Bug 1371215 Opened 7 years ago Closed 7 years ago

Some string matching improvements

Categories

(Core :: JavaScript Engine, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: jandem, Assigned: jandem)

References

Details

Attachments

(3 files)

      No description provided.
Apparently memchr used to be very slow on OS X, but nowadays it's fast and SIMD-optimized. I think this was fixed with OS X 12.10 but I'm not sure.

Also note that the code was checking for __clang__ which seems wrong.

The micro-benchmark below improves from 653 ms to 268 ms on OS X.

function f() {
    var arr = [];
    for (var i = 0; i < 128; i++)
        arr.push("foo".repeat(i) + "z");
    var t = new Date;
    for (var i = 0; i < 10000000; i++)
        res = arr[i % 128].indexOf("z");
    print(new Date - t);
}
f();
Attachment #8875642 - Flags: review?(luke)
FirstCharMatcher16bit is used when both the text and pattern are 16-bit. Not surprisingly, this case is pretty uncommon since most websites/benchmarks use Latin1 strings.

More importantly, on Linux (well non-Mac/Windows) it tries to be smart by using memchr to look for the low byte of the char16_t. This has a terrible performance cliff when we're searching for a char like \u1000 because usually most characters in a char16_t* string have 0 bytes.

If I make that code work on Mac too, the testcase below takes 19-20 seconds (!). Removing FirstCharMatcher16bit and using FirstCharMatcherUnrolled improves it to 844 ms.

So this code was never used on Mac/Windows, is not very hot since Latin1 strings, but has a bad perf cliff, so I think we should just remove it.

function f() {
    var s = "foo".repeat(100) + "\u1234";
    var t = new Date;
    for (var i = 0; i < 10000000; i++)
	res = s.indexOf("\u1000");
    print(new Date - t);
}
f();
Attachment #8875654 - Flags: review?(luke)
A few more things we can simplify and optimize:

* If we have an 8-bit string and the first pattern char > 0xff, we can return -1 immediately. This improves the test case below from 785 ms to 175 ms.

* After that, we know the first pattern char must be <= 0xff, so we can always use FirstCharMatcher8bit, even if the pattern is 16-bit. If I change the pattern below from "\u1234" to "z\u1234", it improves from 935 to 243 ms.

function f() {
    var s = "foo".repeat(100) + "z";
    var t = new Date;
    for (var i = 0; i < 10000000; i++)
	res = s.indexOf("\u1234");
    print(new Date - t);
}
f();
Attachment #8875678 - Flags: review?(luke)
Attachment #8875642 - Flags: review?(luke) → review+
Comment on attachment 8875654 [details] [diff] [review]
Part 2 - Remove FirstCharMatcher16bit

Review of attachment 8875654 [details] [diff] [review]:
-----------------------------------------------------------------

sgtm!
Attachment #8875654 - Flags: review?(luke) → review+
Comment on attachment 8875678 [details] [diff] [review]
Part 3 - More optimizations

Review of attachment 8875678 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!
Attachment #8875678 - Flags: review?(luke) → review+
Pushed by jandemooij@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f4048987a5e3
part 1 - Use the memchr optimization on OS X too, as it's fast now. r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/fce84022f5e3
part 2 - Remove FirstCharMatcher16bit as it's not used much and has some perf issues. r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/35b9efa24066
part 3 - Optimize and simplify Matcher a bit more. r=luke
https://hg.mozilla.org/mozilla-central/rev/f4048987a5e3
https://hg.mozilla.org/mozilla-central/rev/fce84022f5e3
https://hg.mozilla.org/mozilla-central/rev/35b9efa24066
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: