Closed Bug 1019543 Opened 10 years ago Closed 10 years ago

Improve toLowerCase/toUpperCase

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla34

People

(Reporter: jandem, Assigned: jandem)

References

Details

Attachments

(1 file)

Bug 1019512 makes these functions work with Latin1 strings, but there are some other issues:

* If the string is small, we should use a (fat) inline string.
* If no characters are converted, we should just return the input string.
* ToLowerCase(c) and ToUpperCase(c) could probably be faster, at least if the argument is Latin1Char.
* toUpperCase currently returns a TwoByte string, because ToLowerCase(\u00FF) is a non-Latin1 char. We should only return a TwoByte string if the input contains this character.
(In reply to Jan de Mooij [:jandem] from comment #0)
> * toUpperCase currently returns a TwoByte string, because
> ToLowerCase(\u00FF) is a non-Latin1 char.

Er, s/ToLowerCase/ToUpperCase.
Depends on: 1019512
Attached patch PatchSplinter Review
This patch changes toLowerCase and toUpperCase to return the input string if possible.

Also, instead of relying on NewString to deflate the string in most cases, toUpperCase now first scans the string for Latin1 characters that are outside the Latin1 range when converted to upper case, to avoid an extra malloc/free.
Assignee: nobody → jdemooij
Status: NEW → ASSIGNED
Attachment #8463326 - Flags: review?(luke)
Comment on attachment 8463326 [details] [diff] [review]
Patch

Review of attachment 8463326 [details] [diff] [review]:
-----------------------------------------------------------------

Nice.

::: js/src/jsstr.cpp
@@ +819,5 @@
>  static JSString *
>  ToUpperCase(JSContext *cx, JSLinearString *str)
>  {
> +    typedef ScopedJSFreePtr<Latin1Char> Latin1CharPtr;
> +    typedef ScopedJSFreePtr<jschar> TwoByteCharPtr;

I think nowadays, we're supposed to be using UniquePtr (which is almost the same).  (Don't forget JS::FreeTraits!)

@@ +845,5 @@
> +        // upper case characters are not in the Latin1 range.
> +        bool resultIsLatin1;
> +        if (IsSame<CharT, Latin1Char>::value) {
> +            resultIsLatin1 = true;
> +            for (size_t j = i; j < length; j++) {

Could this loop be fused with the above?
Attachment #8463326 - Flags: review?(luke) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/56861f2edc5d

(In reply to Luke Wagner [:luke] from comment #3)
> I think nowadays, we're supposed to be using UniquePtr (which is almost the
> same).  (Don't forget JS::FreeTraits!)

Done.

> Could this loop be fused with the above?

It could but it'd be slower: right now the second loop starts where the first ends, and if we fuse them there'd be some unnecessary/redundant branches inside the loop.
https://hg.mozilla.org/mozilla-central/rev/56861f2edc5d
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: