See if there's more to optimize in str_toLowerCase for small, Latin-1 strings which are already lower-cased

RESOLVED FIXED in Firefox 57

Status

()

defect
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: anba, Assigned: anba)

Tracking

(Blocks 1 bug)

Trunk
mozilla57
Points:
---

Firefox Tracking Flags

(firefox57 fixed)

Details

Attachments

(2 attachments, 2 obsolete attachments)

(Assignee)

Description

2 years ago
Speedometer call counts per gdb breakpoints (*) (filtered with |if (str->d.u1.flags&63)==32| because I wanted to check for the presence of external strings).

Name     Non-External   External
Vanilla:          800        200
React:           3184       2011
React-Redux:     2724       1710
Ember:           4641         10
Backbone:        1135        302
AngularJS:      43485       1370
Angular2:         117          2
Vue:             1277          2
jQuery:          1027        200
Preact:            18          0
Inferno:        32407          0
Elm:                0          0
Flight:          2928       1301


So most calls are from AngularJS and Inferno 

AngularJS:

Num     Type           Disp Enb Address            What
15      breakpoint     keep y   0x00007fffe8d37729 in js::str_toLowerCase(JSContext*, unsigned int, JS::Value*) at /home/andre/hg/mozilla-inbound/js/src/jsstr.cpp:1013
        stop only if result==linear.ptr
        breakpoint already hit 41960 times
16      breakpoint     keep y   0x00007fffe8d37729 in js::str_toLowerCase(JSContext*, unsigned int, JS::Value*) at /home/andre/hg/mozilla-inbound/js/src/jsstr.cpp:1013
        breakpoint already hit 43746 times



Inferno:

Num     Type           Disp Enb Address            What
15      breakpoint     keep y   0x00007fffe8d37729 in js::str_toLowerCase(JSContext*, unsigned int, JS::Value*) at /home/andre/hg/mozilla-inbound/js/src/jsstr.cpp:1013
        stop only if result==linear.ptr
        breakpoint already hit 21124 times
16      breakpoint     keep y   0x00007fffe8d37729 in js::str_toLowerCase(JSContext*, unsigned int, JS::Value*) at /home/andre/hg/mozilla-inbound/js/src/jsstr.cpp:1013
        breakpoint already hit 32590 times


Most AngularJS calls come from here [1] with |"false".toLowerCase()| and |"true".toLowerCase()|, so we'd need to optimize traversing a small, Latin-1 string which is already lower-cased. :-/
It still could be useful to search for optimization possibilities given that str_toLowerCase if the fourth most called built-in per bug 1365361.

(*) All frameworks run once from InteractiveRunner.html
[1] https://github.com/WebKit/webkit/blob/80b1e93d42aac34fddd652bb6f7749d26a265d36/PerformanceTests/Speedometer/resources/todomvc/architecture-examples/angularjs/node_modules/angular/angular.js#L18303-L18305
For the angular case, maybe we can fold toLowerCase in Ion?
(Assignee)

Comment 2

2 years ago
(In reply to Jan de Mooij [:jandem] from comment #1)
> For the angular case, maybe we can fold toLowerCase in Ion?

Unfortunately the input strings aren't constant, so I don't think we can just constant-fold the calls. (Or did I misunderstand "fold toLowerCase in Ion"?)

The AngularJS code looks like:

    actual = lowercase('' + actual);
    expected = lowercase('' + expected);
    return actual.indexOf(expected) !== -1;

where the "lowercase" function is defined as |var lowercase = function(string) {return isString(string) ? string.toLowerCase() : string;};|.


I was able to improve the performance it a bit by doing more or less useful things like:
- Manually inlining unicode::CanLowerCase and then adding MOZ_LIKELY around the ASCII case.
- And removing the two roots in str_toLowerCase. (Rooting analysis even seems to be okay with that change: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2d1821b2be1c116181b2667c25ec93c3a6a2a91f)

These two changes improved this test case from ~170ms to ~140ms for me. For comparison: V8 needs ~115ms, Chakra ~120ms, and JSC 55ms (https://bugs.webkit.org/show_bug.cgi?id=162887).

  function f() {
    var ss = ["true", "false"];
    var q = 0;
    var t = dateNow();
    for (var i = 0; i < 5000000; ++i) {
      q += ss[i & 1].toLowerCase().length;
    }
    return [dateNow() - t, q];
  }
  for (var i = 0; i < 10; ++i) print(f());
(In reply to André Bargull from comment #2)
> Unfortunately the input strings aren't constant, so I don't think we can
> just constant-fold the calls. (Or did I misunderstand "fold toLowerCase in
> Ion"?)

I was thinking maybe we know more about the types/strings if Ion is able to inline the "comparator" function and then inlines the "lowercase" function as well, but we probably won't have good enough type information anyway.

I wonder what perf would look like with a MIR instruction that just calls into the VM. In your micro-benchmark it would also make it easier for Ion to optimize the array accesses because then it knows the call cannot affect the array length.
(Assignee)

Comment 4

2 years ago
(In reply to Jan de Mooij [:jandem] from comment #3)
> I wonder what perf would look like with a MIR instruction that just calls
> into the VM. In your micro-benchmark it would also make it easier for Ion to
> optimize the array accesses because then it knows the call cannot affect the
> array length.

Paired with MOZ_LIKELY for the ASCII case, it improves the µ-benchmark to ~95ms!
(Assignee)

Updated

2 years ago
Assignee: nobody → andrebargull
Status: NEW → ASSIGNED
(Assignee)

Comment 5

2 years ago
I've changed StringToLowerCase/StringToUpperCase to take a HandleString instead of a HandleLinearString, so we can directly call the function from the generated code. Apart from that, the patch simply adds a vm-call for str_toLowerCase/str_toUpperCase.
Attachment #8892433 - Flags: review?(jdemooij)
(Assignee)

Comment 6

2 years ago
If we want to squeeze out a few more ms, we could add specialised CanUpperCase/CanLowerCase functions for Latin-1 characters:

If we assume most Latin-1 strings are actually ASCII strings, we can add MOZ_LIKELY around the ASCII case. And to make non-ASCII, Latin-1 strings a bit faster, we can avoid the CharInfo lookup by using some mild bit-twiddling. (Is this still acceptable or is already "too clever"?)
Attachment #8892437 - Flags: review?(jdemooij)
(Assignee)

Comment 7

2 years ago
(In reply to André Bargull from comment #5)
> Created attachment 8892433 [details] [diff] [review]
> bug1383647-part1-vmcall-from-ion.patch

Note to self: This patch doesn't apply cleanly on inbound.
(Assignee)

Comment 8

2 years ago
Comment on attachment 8892437 [details] [diff] [review]
bug1383647-part2-latin1-specialization.patch

Review of attachment 8892437 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/Unicode.h
@@ +299,5 @@
>  }
>  
> +// Returns true iff ToUpperCase(ch) != ch.
> +inline bool
> +CanUpperCase(Latin1Char ch)

JS::Latin1Char here and below
(Assignee)

Comment 9

2 years ago
Comment on attachment 8892433 [details] [diff] [review]
bug1383647-part1-vmcall-from-ion.patch

Review of attachment 8892433 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/MIR.h
@@ +7501,5 @@
> +    TRIVIAL_NEW_WRAPPERS
> +    NAMED_OPERANDS((0, string))
> +
> +    bool congruentTo(const MDefinition* ins) const override {
> +        return congruentIfOperandsEqual(ins);

Also needs to test that |mode_| matches.
(Assignee)

Comment 10

2 years ago
Try-servering found two small issues, noted above. I'll probably post fixed patches later today.
(Assignee)

Comment 11

2 years ago
Updated part 1 so it applies cleanly on inbound and to fix the issue noted in comment #9.
Attachment #8892433 - Attachment is obsolete: true
Attachment #8892433 - Flags: review?(jdemooij)
Attachment #8892576 - Flags: review?(jdemooij)
(Assignee)

Comment 12

2 years ago
Updated part two to fix the comment described in comment #8.
Attachment #8892437 - Attachment is obsolete: true
Attachment #8892437 - Flags: review?(jdemooij)
Attachment #8892577 - Flags: review?(jdemooij)
Comment on attachment 8892576 [details] [diff] [review]
bug1383647-part1-vmcall-from-ion.patch

Review of attachment 8892576 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, LGTM. Sorry for the delay.
Attachment #8892576 - Flags: review?(jdemooij) → review+
Comment on attachment 8892577 [details] [diff] [review]
bug1383647-part2-latin1-specialization.patch

Review of attachment 8892577 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/Unicode.h
@@ +307,5 @@
> +
> +    // U+00B5 and U+00E0 to U+00FF, except U+00F7, have an uppercase form.
> +    bool canUpper = ch == MICRO_SIGN ||
> +                    (((ch & ~0x1F) == LATIN_SMALL_LETTER_A_WITH_GRAVE) && ch != DIVISION_SIGN);
> +    MOZ_ASSERT(canUpper == CanUpperCase(char16_t(ch)));

Yeah this is almost too clever :) but there are only 128 char possibilities and this assert verifies all cases are handled correctly, so IMO this is fine.
Attachment #8892577 - Flags: review?(jdemooij) → review+
(Assignee)

Comment 16

2 years ago
(In reply to Jan de Mooij [:jandem] from comment #15)
> ::: js/src/vm/Unicode.h
> @@ +307,5 @@
> > +
> > +    // U+00B5 and U+00E0 to U+00FF, except U+00F7, have an uppercase form.
> > +    bool canUpper = ch == MICRO_SIGN ||
> > +                    (((ch & ~0x1F) == LATIN_SMALL_LETTER_A_WITH_GRAVE) && ch != DIVISION_SIGN);
> > +    MOZ_ASSERT(canUpper == CanUpperCase(char16_t(ch)));
> 
> Yeah this is almost too clever :) but there are only 128 char possibilities
> and this assert verifies all cases are handled correctly, so IMO this is
> fine.

TBH I think the test in CanLowerCase is even worse: and'ing |ch & 0xD7| to exclude U+00D7 and U+00DF. :-)

Comment 18

2 years ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a0e69aaf7f47
Part 1: Use direct vm calls for String.prototype.toLower/UpperCase. r=jandem
https://hg.mozilla.org/integration/mozilla-inbound/rev/68ba6a81d15d
Part 2: Add unicode::Can{Lower,Upper}Case specialized for Latin1 characters. r=jandem
Keywords: checkin-needed

Comment 19

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/a0e69aaf7f47
https://hg.mozilla.org/mozilla-central/rev/68ba6a81d15d
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
You need to log in before you can comment on or make changes to this bug.