Closed Bug 1380227 Opened 2 years ago Closed 2 years ago

Avoid many UTF16toUTF8 and UTF8toUTF16 conversions in nsStringBundle

Categories

(Core :: Internationalization, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox56 --- fixed

People

(Reporter: njn, Assigned: njn)

References

Details

Attachments

(1 file, 2 obsolete files)

nsStringBundle use entails a lot of unnecessary UTF8toUTF16 and UTF16oUTF8
conversions. We can get rid of some of them.
Most of the names passed to nsIStringBundle::{Get,Format}StringFromUTF8Name
have one of the two following forms:

- a 16-bit C string literal, which is then converted to an 8-bit string in
  order for the lookup to occur;

- an 8-bit C string literal converted to a 16-bit string, which is then
  converted back to an 8-bit string in order for the lookup to occur.

This patch introduces and uses alternative methods that can take an 8-bit C
string literal, which requires changing some signatures in other methods and
functions. It replaces almost all C++ uses of the old methods. It doesn't touch
JS uses of the old methods, because it's less clear exactly what conversions
happen across the JS/C++ boundary.

The change reduces the number of NS_ConvertUTF8toUTF16 and
NS_ConvertUTF16toUTF8 conversions while running Speedometer v2 from ~270,000 to
~160,000. (Most of these conversions involved the string
"deprecatedReferrerDirective" in nsCSPParser.cpp.)

It also makes the code slightly nicer in a lot of places.
Attachment #8885572 - Flags: review?(VYV03354)
Priority: -- → P3
Rebased patch.
Attachment #8885581 - Flags: review?(VYV03354)
Attachment #8885572 - Attachment is obsolete: true
Attachment #8885572 - Flags: review?(VYV03354)
Comment on attachment 8885581 [details] [diff] [review]
Introduce and use nsIStringBundle::{Get,Format}StringFromUTF8Name

Review of attachment 8885581 [details] [diff] [review]:
-----------------------------------------------------------------

::: intl/strres/nsIStringBundle.idl
@@ +29,5 @@
>  interface nsIStringBundle : nsISupports
>  {
>    wstring GetStringFromID(in long aID);
> +  // XXX: in practice, all or almost all names are ASCII string literals, so
> +  // the wstring form can be removed at some point.

The commit message says that you're not sure about js callers, so I'd not make that claim in the IDL until we're sure about the js callers.

Also, I'd prefer to not have XXX comments, if we actually can make that statement for js call sites, let's deprecate this API instead.
> The commit message says that you're not sure about js callers, so I'd not
> make that claim in the IDL until we're sure about the js callers.
> 
> Also, I'd prefer to not have XXX comments, if we actually can make that
> statement for js call sites, let's deprecate this API instead.

I'm happy to remove the XXX markers and change the comments to indicate that the UTF8Name methods are preferred because they don't require a 16-to-8-bit conversion.

Official deprecation via the [deprecated] attribute might be too much, given there are still two C++ uses of these methods, lots of JS uses, and lots of C++ uses in Thunderbird (alas).
Comment on attachment 8885581 [details] [diff] [review]
Introduce and use nsIStringBundle::{Get,Format}StringFromUTF8Name

Review of attachment 8885581 [details] [diff] [review]:
-----------------------------------------------------------------

::: intl/strres/nsIStringBundle.idl
@@ +34,1 @@
>    wstring GetStringFromName(in wstring aName);

How about changing this to AUTF8String for JS callers?
If you care about C++ signature compatibility for Thunderbird, you can do something like:
 [noscript, binaryName(GetStringFromName)] wstring GetStringFromUTF16Name(in wstring aName);
 [binaryName(GetStringFromNameJS)] wstring GetStringFromName(in AUTF8String aName);

@@ +34,2 @@
>    wstring GetStringFromName(in wstring aName);
> +  wstring GetStringFromUTF8Name(in string aName);

Please mark this as [noscript] if you employ the above suggestion.
Here's a slightly different version. I've kept the old methods for JS, but
changed the argument to AUTF8String as you suggested. I've introduced the new
methods for C++, but done it in such a way that the same names are used in both
languages, even though the functions are slightly different.
  
For Thunderbird, my plan is to just write a patch updating its C++ code for the
change.
Attachment #8886040 - Flags: review?(VYV03354)
This might be breaking change for comm-central.
Thanks for the heads-up.
Attachment #8886040 - Flags: review?(VYV03354) → review+
Attachment #8885581 - Attachment is obsolete: true
Attachment #8885581 - Flags: review?(VYV03354)
https://hg.mozilla.org/integration/mozilla-inbound/rev/129793760f2d718ed61ba17d9aacd57c7ec6785b
Bug 1380227 - Avoid many UTF16toUTF8 and UTF8toUTF16 conversions in nsStringBundle. r=emk.
Pushed by nnethercote@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/129793760f2d
Avoid many UTF16toUTF8 and UTF8toUTF16 conversions in nsStringBundle. r=emk.
https://hg.mozilla.org/mozilla-central/rev/129793760f2d
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Summary: Introduce and use nsIStringBundle::{Get,Format}StringFromUTF8Name → Avoid many UTF16toUTF8 and UTF8toUTF16 conversions in nsStringBundle
Duplicate of this bug: 199527
You need to log in before you can comment on or make changes to this bug.