Closed
Bug 1371215
Opened 7 years ago
Closed 7 years ago
Some string matching improvements
Categories
(Core :: JavaScript Engine, enhancement)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla55
Tracking | Status | |
---|---|---|
firefox55 | --- | fixed |
People
(Reporter: jandem, Assigned: jandem)
References
Details
Attachments
(3 files)
772 bytes,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
2.69 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
1.41 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
No description provided.
Assignee | ||
Comment 1•7 years ago
|
||
Apparently memchr used to be very slow on OS X, but nowadays it's fast and SIMD-optimized. I think this was fixed with OS X 12.10 but I'm not sure. Also note that the code was checking for __clang__ which seems wrong. The micro-benchmark below improves from 653 ms to 268 ms on OS X. function f() { var arr = []; for (var i = 0; i < 128; i++) arr.push("foo".repeat(i) + "z"); var t = new Date; for (var i = 0; i < 10000000; i++) res = arr[i % 128].indexOf("z"); print(new Date - t); } f();
Attachment #8875642 -
Flags: review?(luke)
Assignee | ||
Comment 2•7 years ago
|
||
FirstCharMatcher16bit is used when both the text and pattern are 16-bit. Not surprisingly, this case is pretty uncommon since most websites/benchmarks use Latin1 strings. More importantly, on Linux (well non-Mac/Windows) it tries to be smart by using memchr to look for the low byte of the char16_t. This has a terrible performance cliff when we're searching for a char like \u1000 because usually most characters in a char16_t* string have 0 bytes. If I make that code work on Mac too, the testcase below takes 19-20 seconds (!). Removing FirstCharMatcher16bit and using FirstCharMatcherUnrolled improves it to 844 ms. So this code was never used on Mac/Windows, is not very hot since Latin1 strings, but has a bad perf cliff, so I think we should just remove it. function f() { var s = "foo".repeat(100) + "\u1234"; var t = new Date; for (var i = 0; i < 10000000; i++) res = s.indexOf("\u1000"); print(new Date - t); } f();
Attachment #8875654 -
Flags: review?(luke)
Assignee | ||
Comment 3•7 years ago
|
||
A few more things we can simplify and optimize: * If we have an 8-bit string and the first pattern char > 0xff, we can return -1 immediately. This improves the test case below from 785 ms to 175 ms. * After that, we know the first pattern char must be <= 0xff, so we can always use FirstCharMatcher8bit, even if the pattern is 16-bit. If I change the pattern below from "\u1234" to "z\u1234", it improves from 935 to 243 ms. function f() { var s = "foo".repeat(100) + "z"; var t = new Date; for (var i = 0; i < 10000000; i++) res = s.indexOf("\u1234"); print(new Date - t); } f();
Attachment #8875678 -
Flags: review?(luke)
Updated•7 years ago
|
Attachment #8875642 -
Flags: review?(luke) → review+
Comment 4•7 years ago
|
||
Comment on attachment 8875654 [details] [diff] [review] Part 2 - Remove FirstCharMatcher16bit Review of attachment 8875654 [details] [diff] [review]: ----------------------------------------------------------------- sgtm!
Attachment #8875654 -
Flags: review?(luke) → review+
Comment 5•7 years ago
|
||
Comment on attachment 8875678 [details] [diff] [review] Part 3 - More optimizations Review of attachment 8875678 [details] [diff] [review]: ----------------------------------------------------------------- Nice!
Attachment #8875678 -
Flags: review?(luke) → review+
Pushed by jandemooij@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/f4048987a5e3 part 1 - Use the memchr optimization on OS X too, as it's fast now. r=luke https://hg.mozilla.org/integration/mozilla-inbound/rev/fce84022f5e3 part 2 - Remove FirstCharMatcher16bit as it's not used much and has some perf issues. r=luke https://hg.mozilla.org/integration/mozilla-inbound/rev/35b9efa24066 part 3 - Optimize and simplify Matcher a bit more. r=luke
Comment 7•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/f4048987a5e3 https://hg.mozilla.org/mozilla-central/rev/fce84022f5e3 https://hg.mozilla.org/mozilla-central/rev/35b9efa24066
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in
before you can comment on or make changes to this bug.
Description
•