Closed
Bug 1371215
Opened 8 years ago
Closed 8 years ago
Some string matching improvements
Categories
(Core :: JavaScript Engine, enhancement)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla55
Tracking | Status | |
---|---|---|
firefox55 | --- | fixed |
People
(Reporter: jandem, Assigned: jandem)
References
Details
Attachments
(3 files)
772 bytes,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
2.69 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
1.41 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
No description provided.
Assignee | ||
Comment 1•8 years ago
|
||
Apparently memchr used to be very slow on OS X, but nowadays it's fast and SIMD-optimized. I think this was fixed with OS X 12.10 but I'm not sure.
Also note that the code was checking for __clang__ which seems wrong.
The micro-benchmark below improves from 653 ms to 268 ms on OS X.
function f() {
var arr = [];
for (var i = 0; i < 128; i++)
arr.push("foo".repeat(i) + "z");
var t = new Date;
for (var i = 0; i < 10000000; i++)
res = arr[i % 128].indexOf("z");
print(new Date - t);
}
f();
Attachment #8875642 -
Flags: review?(luke)
Assignee | ||
Comment 2•8 years ago
|
||
FirstCharMatcher16bit is used when both the text and pattern are 16-bit. Not surprisingly, this case is pretty uncommon since most websites/benchmarks use Latin1 strings.
More importantly, on Linux (well non-Mac/Windows) it tries to be smart by using memchr to look for the low byte of the char16_t. This has a terrible performance cliff when we're searching for a char like \u1000 because usually most characters in a char16_t* string have 0 bytes.
If I make that code work on Mac too, the testcase below takes 19-20 seconds (!). Removing FirstCharMatcher16bit and using FirstCharMatcherUnrolled improves it to 844 ms.
So this code was never used on Mac/Windows, is not very hot since Latin1 strings, but has a bad perf cliff, so I think we should just remove it.
function f() {
var s = "foo".repeat(100) + "\u1234";
var t = new Date;
for (var i = 0; i < 10000000; i++)
res = s.indexOf("\u1000");
print(new Date - t);
}
f();
Attachment #8875654 -
Flags: review?(luke)
Assignee | ||
Comment 3•8 years ago
|
||
A few more things we can simplify and optimize:
* If we have an 8-bit string and the first pattern char > 0xff, we can return -1 immediately. This improves the test case below from 785 ms to 175 ms.
* After that, we know the first pattern char must be <= 0xff, so we can always use FirstCharMatcher8bit, even if the pattern is 16-bit. If I change the pattern below from "\u1234" to "z\u1234", it improves from 935 to 243 ms.
function f() {
var s = "foo".repeat(100) + "z";
var t = new Date;
for (var i = 0; i < 10000000; i++)
res = s.indexOf("\u1234");
print(new Date - t);
}
f();
Attachment #8875678 -
Flags: review?(luke)
Updated•8 years ago
|
Attachment #8875642 -
Flags: review?(luke) → review+
Comment 4•8 years ago
|
||
Comment on attachment 8875654 [details] [diff] [review]
Part 2 - Remove FirstCharMatcher16bit
Review of attachment 8875654 [details] [diff] [review]:
-----------------------------------------------------------------
sgtm!
Attachment #8875654 -
Flags: review?(luke) → review+
Comment 5•8 years ago
|
||
Comment on attachment 8875678 [details] [diff] [review]
Part 3 - More optimizations
Review of attachment 8875678 [details] [diff] [review]:
-----------------------------------------------------------------
Nice!
Attachment #8875678 -
Flags: review?(luke) → review+
Pushed by jandemooij@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f4048987a5e3
part 1 - Use the memchr optimization on OS X too, as it's fast now. r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/fce84022f5e3
part 2 - Remove FirstCharMatcher16bit as it's not used much and has some perf issues. r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/35b9efa24066
part 3 - Optimize and simplify Matcher a bit more. r=luke
Comment 7•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/f4048987a5e3
https://hg.mozilla.org/mozilla-central/rev/fce84022f5e3
https://hg.mozilla.org/mozilla-central/rev/35b9efa24066
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in
before you can comment on or make changes to this bug.
Description
•