nsIPrefBranch should have methods to get/set unicode strings

RESOLVED FIXED in Firefox 55

Status

()

enhancement
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: florian, Assigned: florian)

Tracking

54 Branch
mozilla55
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox55 fixed)

Details

Attachments

(6 attachments, 1 obsolete attachment)

As discussed in bug 1338306 comment 5 and bug 1338306 comment 6, getCharPref does not handle non-ASCII characters and we should provide a JS-counterpart of Preferences::GetString.

The

  Services.prefs.getComplexValue(pref, Ci.nsISupportsString).data;

and

  let str = Cc["@mozilla.org/supports-string;1"]
              .createInstance(Ci.nsISupportsString);
  str.data = val;
  Services.prefs.setComplexValue(aPrefName, Ci.nsISupportsString, str);

are patterns we should eliminate.
Posted patch Patch (obsolete) — Splinter Review
Attachment #8844696 - Flags: review?(benjamin)
Comment on attachment 8844696 [details] [diff] [review]
Patch

Review of attachment 8844696 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/libpref/nsIPrefBranch.idl
@@ +139,5 @@
> +  [optional_argc,binaryname(GetStringPrefWithDefault)]
> +  AUTF8String getStringPref(in string aPrefName,
> +                            [optional] in AUTF8String aDefaultValue);
> +  [noscript,binaryname(GetStringPref)]
> +  AUTF8String getStringPrefXPCOM(in string aPrefName);

Do we need this? C++ callers can use Preferences::GetString(). We had to add methods for other methods due to compatibility, but new methods have no compatibility problem.
"this" means the noscript getStringPrefXPCOM method.
Posted patch Patch v2Splinter Review
Addressed comment 2.
Attachment #8844832 - Flags: review?(benjamin)
Attachment #8844696 - Attachment is obsolete: true
Attachment #8844696 - Flags: review?(benjamin)
Attachment #8844832 - Flags: review?(benjamin) → review+
Posted file xpcshell script
Attachment #8847605 - Flags: review?(jaws)
Attached separately for easier review, but I will fold this into the script generated patch before landing.
Attachment #8847607 - Flags: review?(jaws)
Attachment #8847608 - Attachment description: add an eslint rule to reject usage of {get,set}ComplexValue for nsISupportsString and suggest {get,set}StringPref instead, → add an eslint rule
Attachment #8847608 - Flags: review?(jaws)
This can be folded into the patch adding the eslint rule when landing.
Attachment #8847609 - Flags: review?(jaws)
Attachment #8847605 - Flags: review?(jaws) → review+
Attachment #8847607 - Flags: review?(jaws) → review+
Attachment #8847608 - Flags: review?(jaws) → review+
Attachment #8847609 - Flags: review?(jaws) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/6e8fab91c7b5330502facd1317d1ddcb824c96b6
Bug 1345294 - nsIPrefBranch should have methods to get/set unicode strings, r=bsmedberg.

https://hg.mozilla.org/integration/mozilla-inbound/rev/5a8192a650e92565aa2e85721569dad58cc1922c
Bug 1345294 - script generated patch (+ some hand cleanup) to replace {get,set}ComplexValue for nsISupportsString by {get,set}StringPref, r=Mossop.

https://hg.mozilla.org/integration/mozilla-inbound/rev/79bacf0ea46664a09a1f09996e70d0d3a4af042d
Bug 1345294 - add an eslint rule to reject usage of {get,set}ComplexValue for nsISupportsString and suggest {get,set}StringPref instead, and make it pass, r=Mossop.
As far as I can tell, libpref actually stores strings internally as Latin1. E.g. modules/libpref/test/unit/data/testPref.js has this test pref:

> user_pref("testPref.char2", "älskar");

Furthermore, nsIPrefBranch::{get,set}CharPref() work fine with Latin1 strings.

If you call nsIPrefBranch::getCharPref() from JS on that test pref you'll get an XPConnect error, because it'll try to interpret the non-ASCII Latin1 string "älskar" as UTF-8, but it's not valid UTF-8.

Maybe things are ok so long as nsIPrefBranch::{get,set}StringPref() are used in tandem. Though I'm wondering how UTF-8 string values work when written to and read from file.

I haven't worked out all the details here yet, but it's a bit worrying.
I don't follow. getCharPref() will never try to interpret non-ASCII values as UTF-8. getStringPref() does.
(In reply to Masatoshi Kimura [:emk] from comment #15)
> I don't follow. getCharPref() will never try to interpret non-ASCII values
> as UTF-8. getStringPref() does.

A problem occurs if a pref is written with a Latin1 string value (either via setCharPref() or from reading a prefs file) and then you call getStringPref() on it, which tries to interpret it as UTF-8.
Oh, I see the problem: where I wrote "If you call nsIPrefBranch::getCharPref() from JS on that test pref" in comment 14, I should have written "If you call nsIPrefBranch::getStringPref() from JS on that test pref". Apologies for the error.
(In reply to Nicholas Nethercote [:njn] from comment #17)
> Oh, I see the problem: where I wrote "If you call
> nsIPrefBranch::getCharPref() from JS on that test pref" in comment 14, I
> should have written "If you call nsIPrefBranch::getStringPref() from JS on
> that test pref". Apologies for the error.

There's a problem if someone uses non-ascii characters with {get,set}CharPref (this worked for the latin1 case but has never been supported, and that's the reason why we needed getComplexValue(Ci.nsISupportsString)), or if someone mixes the {get,set}CharPref methods with the {get,set}StringPref methods.
Right. So internally it uses 8-bit chars, and the encoding used is determined by how the pref is set: prefs set by reading from file or by setCharPref() are effectively Latin1, and prefs set by setStringPref() are UTF-8. And as long as you use the appropriate getter things work out.
(In reply to Nicholas Nethercote [:njn] from comment #19)
> prefs set by reading from file or by setCharPref() are effectively Latin1,

Is it always latin1? I was afraid it would be the operating system's current default encoding.
(In reply to Florian Quèze [:florian] [:flo] from comment #20)
> (In reply to Nicholas Nethercote [:njn] from comment #19)
> > prefs set by reading from file or by setCharPref() are effectively Latin1,
> 
> Is it always latin1? I was afraid it would be the operating system's current
> default encoding.

It is always latin1 because we don't depend on operating system's feature to convert getCharPref() results and setCharPref() params.
Depends on: 1415567
Duplicate of this bug: 353812
You need to log in before you can comment on or make changes to this bug.