Closed
Bug 1383647
Opened 4 years ago
Closed 4 years ago
See if there's more to optimize in str_toLowerCase for small, Latin-1 strings which are already lower-cased
Categories
(Core :: JavaScript: Standard Library, defect)
Core
JavaScript: Standard Library
Tracking
()
RESOLVED
FIXED
mozilla57
| Tracking | Status | |
|---|---|---|
| firefox57 | --- | fixed |
People
(Reporter: anba, Assigned: anba)
References
(Blocks 1 open bug)
Details
Attachments
(2 files, 2 obsolete files)
|
21.42 KB,
patch
|
jandem
:
review+
|
Details | Diff | Splinter Review |
|
5.34 KB,
patch
|
jandem
:
review+
|
Details | Diff | Splinter Review |
Speedometer call counts per gdb breakpoints (*) (filtered with |if (str->d.u1.flags&63)==32| because I wanted to check for the presence of external strings).
Name Non-External External
Vanilla: 800 200
React: 3184 2011
React-Redux: 2724 1710
Ember: 4641 10
Backbone: 1135 302
AngularJS: 43485 1370
Angular2: 117 2
Vue: 1277 2
jQuery: 1027 200
Preact: 18 0
Inferno: 32407 0
Elm: 0 0
Flight: 2928 1301
So most calls are from AngularJS and Inferno
AngularJS:
Num Type Disp Enb Address What
15 breakpoint keep y 0x00007fffe8d37729 in js::str_toLowerCase(JSContext*, unsigned int, JS::Value*) at /home/andre/hg/mozilla-inbound/js/src/jsstr.cpp:1013
stop only if result==linear.ptr
breakpoint already hit 41960 times
16 breakpoint keep y 0x00007fffe8d37729 in js::str_toLowerCase(JSContext*, unsigned int, JS::Value*) at /home/andre/hg/mozilla-inbound/js/src/jsstr.cpp:1013
breakpoint already hit 43746 times
Inferno:
Num Type Disp Enb Address What
15 breakpoint keep y 0x00007fffe8d37729 in js::str_toLowerCase(JSContext*, unsigned int, JS::Value*) at /home/andre/hg/mozilla-inbound/js/src/jsstr.cpp:1013
stop only if result==linear.ptr
breakpoint already hit 21124 times
16 breakpoint keep y 0x00007fffe8d37729 in js::str_toLowerCase(JSContext*, unsigned int, JS::Value*) at /home/andre/hg/mozilla-inbound/js/src/jsstr.cpp:1013
breakpoint already hit 32590 times
Most AngularJS calls come from here [1] with |"false".toLowerCase()| and |"true".toLowerCase()|, so we'd need to optimize traversing a small, Latin-1 string which is already lower-cased. :-/
It still could be useful to search for optimization possibilities given that str_toLowerCase if the fourth most called built-in per bug 1365361.
(*) All frameworks run once from InteractiveRunner.html
[1] https://github.com/WebKit/webkit/blob/80b1e93d42aac34fddd652bb6f7749d26a265d36/PerformanceTests/Speedometer/resources/todomvc/architecture-examples/angularjs/node_modules/angular/angular.js#L18303-L18305
Comment 1•4 years ago
|
||
For the angular case, maybe we can fold toLowerCase in Ion?
| Assignee | ||
Comment 2•4 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #1) > For the angular case, maybe we can fold toLowerCase in Ion? Unfortunately the input strings aren't constant, so I don't think we can just constant-fold the calls. (Or did I misunderstand "fold toLowerCase in Ion"?) The AngularJS code looks like: actual = lowercase('' + actual); expected = lowercase('' + expected); return actual.indexOf(expected) !== -1; where the "lowercase" function is defined as |var lowercase = function(string) {return isString(string) ? string.toLowerCase() : string;};|. I was able to improve the performance it a bit by doing more or less useful things like: - Manually inlining unicode::CanLowerCase and then adding MOZ_LIKELY around the ASCII case. - And removing the two roots in str_toLowerCase. (Rooting analysis even seems to be okay with that change: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2d1821b2be1c116181b2667c25ec93c3a6a2a91f) These two changes improved this test case from ~170ms to ~140ms for me. For comparison: V8 needs ~115ms, Chakra ~120ms, and JSC 55ms (https://bugs.webkit.org/show_bug.cgi?id=162887). function f() { var ss = ["true", "false"]; var q = 0; var t = dateNow(); for (var i = 0; i < 5000000; ++i) { q += ss[i & 1].toLowerCase().length; } return [dateNow() - t, q]; } for (var i = 0; i < 10; ++i) print(f());
Comment 3•4 years ago
|
||
(In reply to André Bargull from comment #2) > Unfortunately the input strings aren't constant, so I don't think we can > just constant-fold the calls. (Or did I misunderstand "fold toLowerCase in > Ion"?) I was thinking maybe we know more about the types/strings if Ion is able to inline the "comparator" function and then inlines the "lowercase" function as well, but we probably won't have good enough type information anyway. I wonder what perf would look like with a MIR instruction that just calls into the VM. In your micro-benchmark it would also make it easier for Ion to optimize the array accesses because then it knows the call cannot affect the array length.
| Assignee | ||
Comment 4•4 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #3) > I wonder what perf would look like with a MIR instruction that just calls > into the VM. In your micro-benchmark it would also make it easier for Ion to > optimize the array accesses because then it knows the call cannot affect the > array length. Paired with MOZ_LIKELY for the ASCII case, it improves the µ-benchmark to ~95ms!
| Assignee | ||
Updated•4 years ago
|
Assignee: nobody → andrebargull
Status: NEW → ASSIGNED
| Assignee | ||
Comment 5•4 years ago
|
||
I've changed StringToLowerCase/StringToUpperCase to take a HandleString instead of a HandleLinearString, so we can directly call the function from the generated code. Apart from that, the patch simply adds a vm-call for str_toLowerCase/str_toUpperCase.
Attachment #8892433 -
Flags: review?(jdemooij)
| Assignee | ||
Comment 6•4 years ago
|
||
If we want to squeeze out a few more ms, we could add specialised CanUpperCase/CanLowerCase functions for Latin-1 characters: If we assume most Latin-1 strings are actually ASCII strings, we can add MOZ_LIKELY around the ASCII case. And to make non-ASCII, Latin-1 strings a bit faster, we can avoid the CharInfo lookup by using some mild bit-twiddling. (Is this still acceptable or is already "too clever"?)
Attachment #8892437 -
Flags: review?(jdemooij)
| Assignee | ||
Comment 7•4 years ago
|
||
(In reply to André Bargull from comment #5) > Created attachment 8892433 [details] [diff] [review] > bug1383647-part1-vmcall-from-ion.patch Note to self: This patch doesn't apply cleanly on inbound.
| Assignee | ||
Comment 8•4 years ago
|
||
Comment on attachment 8892437 [details] [diff] [review] bug1383647-part2-latin1-specialization.patch Review of attachment 8892437 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/vm/Unicode.h @@ +299,5 @@ > } > > +// Returns true iff ToUpperCase(ch) != ch. > +inline bool > +CanUpperCase(Latin1Char ch) JS::Latin1Char here and below
| Assignee | ||
Comment 9•4 years ago
|
||
Comment on attachment 8892433 [details] [diff] [review] bug1383647-part1-vmcall-from-ion.patch Review of attachment 8892433 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/MIR.h @@ +7501,5 @@ > + TRIVIAL_NEW_WRAPPERS > + NAMED_OPERANDS((0, string)) > + > + bool congruentTo(const MDefinition* ins) const override { > + return congruentIfOperandsEqual(ins); Also needs to test that |mode_| matches.
| Assignee | ||
Comment 10•4 years ago
|
||
Try-servering found two small issues, noted above. I'll probably post fixed patches later today.
| Assignee | ||
Comment 11•4 years ago
|
||
Updated part 1 so it applies cleanly on inbound and to fix the issue noted in comment #9.
Attachment #8892433 -
Attachment is obsolete: true
Attachment #8892433 -
Flags: review?(jdemooij)
Attachment #8892576 -
Flags: review?(jdemooij)
| Assignee | ||
Comment 12•4 years ago
|
||
Updated part two to fix the comment described in comment #8.
Attachment #8892437 -
Attachment is obsolete: true
Attachment #8892437 -
Flags: review?(jdemooij)
Attachment #8892577 -
Flags: review?(jdemooij)
| Assignee | ||
Comment 13•4 years ago
|
||
Try looks good: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8997b7d045fca1be174dfedb8d14aa563e09a8bf
Comment 14•4 years ago
|
||
Comment on attachment 8892576 [details] [diff] [review] bug1383647-part1-vmcall-from-ion.patch Review of attachment 8892576 [details] [diff] [review]: ----------------------------------------------------------------- Thanks, LGTM. Sorry for the delay.
Attachment #8892576 -
Flags: review?(jdemooij) → review+
Comment 15•4 years ago
|
||
Comment on attachment 8892577 [details] [diff] [review] bug1383647-part2-latin1-specialization.patch Review of attachment 8892577 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/vm/Unicode.h @@ +307,5 @@ > + > + // U+00B5 and U+00E0 to U+00FF, except U+00F7, have an uppercase form. > + bool canUpper = ch == MICRO_SIGN || > + (((ch & ~0x1F) == LATIN_SMALL_LETTER_A_WITH_GRAVE) && ch != DIVISION_SIGN); > + MOZ_ASSERT(canUpper == CanUpperCase(char16_t(ch))); Yeah this is almost too clever :) but there are only 128 char possibilities and this assert verifies all cases are handled correctly, so IMO this is fine.
Attachment #8892577 -
Flags: review?(jdemooij) → review+
| Assignee | ||
Comment 16•4 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #15) > ::: js/src/vm/Unicode.h > @@ +307,5 @@ > > + > > + // U+00B5 and U+00E0 to U+00FF, except U+00F7, have an uppercase form. > > + bool canUpper = ch == MICRO_SIGN || > > + (((ch & ~0x1F) == LATIN_SMALL_LETTER_A_WITH_GRAVE) && ch != DIVISION_SIGN); > > + MOZ_ASSERT(canUpper == CanUpperCase(char16_t(ch))); > > Yeah this is almost too clever :) but there are only 128 char possibilities > and this assert verifies all cases are handled correctly, so IMO this is > fine. TBH I think the test in CanLowerCase is even worse: and'ing |ch & 0xD7| to exclude U+00D7 and U+00DF. :-)
| Assignee | ||
Comment 17•4 years ago
|
||
Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8997b7d045fca1be174dfedb8d14aa563e09a8bf
Keywords: checkin-needed
Comment 18•4 years ago
|
||
Pushed by ryanvm@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/a0e69aaf7f47 Part 1: Use direct vm calls for String.prototype.toLower/UpperCase. r=jandem https://hg.mozilla.org/integration/mozilla-inbound/rev/68ba6a81d15d Part 2: Add unicode::Can{Lower,Upper}Case specialized for Latin1 characters. r=jandem
Keywords: checkin-needed
Comment 19•4 years ago
|
||
| bugherder | ||
https://hg.mozilla.org/mozilla-central/rev/a0e69aaf7f47 https://hg.mozilla.org/mozilla-central/rev/68ba6a81d15d
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
status-firefox57:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
You need to log in
before you can comment on or make changes to this bug.
Description
•