Closed Bug 999003 Opened 6 years ago Closed 3 years ago

Number.toLocaleString() without locale returns localized digits specified by the OS

Categories

(Core :: JavaScript: Internationalization API, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox29 - wontfix
firefox30 + wontfix
firefox31 - wontfix
firefox32 + wontfix
firefox33 + wontfix
firefox34 - wontfix
firefox-esr52 --- wontfix
firefox55 --- fixed

People

(Reporter: guigs, Assigned: zbraniecki)

References

Details

(4 keywords)

Attachments

(2 files)

Attached image pageinfo.png
Description: References: [https://support.mozilla.org/en-US/questions/993357]
In the interface of a new installation of Firefox, some numbers and encoding in the page information are displaying in Arabic. 

Screenshot: http://img809.imageshack.us/img809/4203/s0z1.jpg or attachments

Troubleshooting taken: 
*Started in 29 beta 1 -5(noticed first in 5)

*i think the problem in the new interface not with my language, when i tried all Firefox versions i tried a clean install with no addons and no other programs installed in the system

* is this encoding turned on in the operating system?
Edit: the Arabic language is installed, and the vm tested was hosted by the win 7 machine, it may have been detected. 
  For Windows settings:  Under Supplemental Language Support, tick the box that says Install Files for Complex Scripts and Right-to-Left Languages (including Thai).
@smontagu have you seen something like this before?
Flags: needinfo?(smontagu)
As far as I know we don't honour the numbering system set in the operating system, though probably we should (bug 151374).

Does bidi.numeral in about:config have a non-default setting?
Flags: needinfo?(smontagu)
the problem is fixed when i changed Format from: Arabic (Egypt) to English (United States)

Screenshots: with the problem http://img838.imageshack.us/img838/8786/x2c5.png

Problem fixed: http://img841.imageshack.us/img841/4336/5g1k.png

settings i didn't change: http://img843.imageshack.us/img843/8582/lm4a.png http://img845.imageshack.us/img845/5056/a9oh.png

but i can't keep it English (United States) i need to use it as Arabic (Egypt)

note: i use Firefox since v4 with Arabic (Egypt) and no problem like this before 

the problem not in page info only it with download dialog too look to this pic again

https://support.cdn.mozilla.net/media/uploads/images/2014-04-19-21-09-55-911af7.png

some numbers English and some Arabic not all of them with same language

and yes bidi.numeral in about:config have a non-default setting
what do you mean?
We need to find out exactly when this started happening in nightly builds to understand what caused it. I hope to have time to look at it more this week, but I'm not sure I have enough information to reproduce the bug. What is the your setting of bidi.numeral?
bidi.numeral is default Value 0 but when i change it to 2 it fix the problem

this problem started with the new Firefox interface
Regression range: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=56b0a41985f3&tochange=c22969eec61d

Very probably caused by bug 853301.

(In reply to BALTAGY from comment #7)
> bidi.numeral is default Value 0 but when i change it to 2 it fix the problem

That is a good work-around until the bug is fixed (though 3 might be better in some cases)
(In reply to BALTAGY from comment #4)
> the problem not in page info only it with download dialog too look to this
> pic again
> 
> https://support.cdn.mozilla.net/media/uploads/images/2014-04-19-21-09-55-
> 911af7.png
> 
> some numbers English and some Arabic not all of them with same language

That looks like a worse bug than the page info bug: it's using "٥" (U+0665 ARABIC-INDIC DIGIT FIVE) as the decimal point symbol!
Component: Internationalization → JavaScript Engine
OS: Mac OS X → All
Hardware: x86 → All
Summary: Page info numbers in arabic → ToLocaleString with undefined locale uses localized digits specified by the OS
My assumption is that the js i18n API shouldn't have changed the behaviour of legacy toLocaleString() with no locale or options: apparently it is using the localized digits specified in the Windows control panel. Chrome on the same system doesn't do that.

A knock-on effect from this causes the bug described in comment 9, because other code assumes that the result of toLocaleString() will be using ASCII digits:

 return this.gDecimalSymbol = Number(5.4).toLocaleString().match(/\D/);

at http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/downloads/DownloadUtils.jsm#49
Summary: ToLocaleString with undefined locale uses localized digits specified by the OS → toLocaleString with undefined locale uses localized digits specified by the OS
Attached file Simple testcase
Simple testcase
Attachment #8414364 - Attachment description: toLocaleString.html → Simple testcase
Keywords: intl
yes i have chrome too and no problem
Waldo, do you agree that the intl API should not have changed the result of Number.toLocaleString() with no locale or option arguments?
Flags: needinfo?(jwalden+bmo)
We'll track this for a forward fix in upcoming versions (not 29, this is not a driver for another dot release).
(In reply to Simon Montagu :smontagu from comment #11)
> Created attachment 8414364 [details]
> Simple testcase
> 
> Simple testcase

I do not wish to take this off topic and this observation may be irrelevant or wrong. When I try the test case above I get ١٢٣٬٤٥٦٫٧٨٩

The thousands separator is an apostrophe or similar, whereas maybe I would have expected a comma. Based on Reading this article I may be seeing an incorrect separator 
< http://msdn.microsoft.com/en-gb/goglobal/bb688127.aspx#eqg
> With this function, formatting 123456.789 would become:
>  For English (United States)	123,456.789
>  For Arabic (Egypt)	١٢٣,٤٥٦.٧٨٩

(Tested using Using en-GB or en-US Firefox on Ubuntu )
Blocks: 1009795
Duplicate of this bug: 1010535
Bug 1009795 says (new Date()).toLocaleString() is also affected.
Summary: toLocaleString with undefined locale uses localized digits specified by the OS → toLocaleString with undefined locale returns a localized value specified by the OS
Duplicate of this bug: 1010219
Updated the compat doc based on the current reports and my quick test. Corrections welcome.
https://developer.mozilla.org/en-US/Firefox/Releases/29/Site_Compatibility
Summary: toLocaleString with undefined locale returns a localized value specified by the OS → Number.toLocaleString() without locale returns localized digits specified by the OS
Duplicate of this bug: 998746
Duplicate of this bug: 1010136
Duplicate of this bug: 1009795
No longer blocks: 1009795
Duplicate of this bug: 1005464
Flags: needinfo?(yfdyh000)
(In reply to John Hesling [:John99] from comment #17)
> I do not wish to take this off topic and this observation may be irrelevant
> or wrong. When I try the test case above I get ١٢٣٬٤٥٦٫٧٨٩
> 
> The thousands separator is an apostrophe or similar, whereas maybe I would
> have expected a comma.

٬ is the Unicode character U+066C ARABIC THOUSANDS SEPARATOR. I'm not sure why the MSDN article uses a comma as separator, but maybe there are some Arabic-speaking countries that do use the comma (with the "arabic-indic" digits, i.e. not the Magreb countries that use the ASCII digits and the European style "." as thousands separator and "," as decimal separator)
Flags: needinfo?(yfdyh000)
Duplicate of this bug: 1013091
Firefox 30 will be shipped soon. How can we solve the bug? I'm not sure about spec but at least the Firefox UI defect should be fixed.
Is the decimal separator available through the JS Internationalization API?
A quick solution might be specifying the en-US locale for toLocaleString?
http://dxr.mozilla.org/mozilla-central/search?q=toLocaleString+ext%3Ajs
Naïve question:
The OP specified in the OS that all apps should use arabic format for dates (and numbers). Why are they expecting to see en-US? Doesn't sound like a bug to me..
And isn't it easiest to just override the associated env variable(s) for the Firefox process?
(In reply to Kohei Yoshino [:kohei] from comment #30)
> A quick solution might be specifying the en-US locale for toLocaleString?
> http://dxr.mozilla.org/mozilla-central/search?q=toLocaleString+ext%3Ajs

The purpose of this code was to use period as the decimal separator for some locales. It's easier to hard-code "," than specifying the en-US locale.
I guess toLocaleString previously returned a number string in English regardless of the browser locale. So toLocaleString('en-US') would match the former implementation.

Or we can probably use toLocaleString(navigator.language) to localize the number while avoiding the mismatch issue.
If the legacy toLocaleString() always formats a number in en-US, why the fix for bug 597852 worked?
Ah, then my guess was wrong.
We're way past the point for taking on speculative fixes in FF30 so this will have to target FF31 for a proper fix.
Sorry for massive delay here, distracted by a monstrous bit of security work that's largely eaten the last two months for me...

(In reply to Simon Montagu :smontagu from comment #10)
> My assumption is that the js i18n API shouldn't have changed the behaviour
> of legacy toLocaleString() with no locale or options

This isn't the case, as Kohei notes through comment 15's second link.

The Internationalization API retrofits the various toLocaleString methods (and Date-specific methods for date/time separately, which would be bug 1009795) in ES5 to be performed in terms of calls to the Intl.* APIs.  So for example:

13.2.1 Number.prototype.toLocaleString ([locales [, options]])

This definition supersedes the definition provided in ES5, 15.7.4.3.

When the toLocaleString method is called with optional arguments locales and options, the following steps are taken:

1. Let x be this Number value (as defined in ES5, 15.7.4).
2. If locales is not provided, then let locales be undefined.
3. If options is not provided, then let options be undefined.
4. Let numberFormat be the result of creating a new object as if by the expression new Intl.NumberFormat(locales, options) where Intl.NumberFormat is the standard built-in constructor defined in 11.1.3.
5. Return the result of calling the FormatNumber abstract operation (defined in 11.3.2) with arguments numberFormat and x.

(The FormatNumber operation referred to here is, I think, equivalent to numberFormatObject.format(x) here.)

The old JSAPI locale conversion hooks, set using XPCLocale and JSLocaleCallbacks, basically go completely unused with Internationalization built.

> A knock-on effect from this causes the bug described in comment 9, because
> other code assumes that the result of toLocaleString() will be using ASCII
> digits:
> 
>  return this.gDecimalSymbol = Number(5.4).toLocaleString().match(/\D/);
> 
> at
> http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/downloads/
> DownloadUtils.jsm#49

...yeah, that's not what it should be doing.  It should create a NumberFormat object, then use that to format the number in the desired way.  I haven't taken the time to read them yet to evaluate understandability from a developer perspective, but I understand we have MDN documentation for all the Intl stuff, that explains how to do locale-sensitive number formatting:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/NumberFormat

(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #29)

...and you think you're lagging...

> Is the decimal separator available through the JS Internationalization API?

No.  The primitives that it exposes (I have fairly high on my stack of things to do writing a blog post introducing Internationalization) are largely limited to get-some-values-and-process them.  The API is designed to abstract around these sorts of low-level details about how things are or are not formatted, decimal separators being such a language implementation detail (to put it one way).  The download manager wants to format a number in a certain number of digits, pretty much; that appears pretty easily doable using Intl.NumberFormat, without needing to assume the existence of, and feature-sniff, the decimal separator.
Flags: needinfo?(jwalden+bmo)
Simon, we are going to release beta 5 next Friday. If we want to fix for this, the fix will have to land soon.
Flags: needinfo?(smontagu)
Just like 29 and 30, we won't block the release because of this bug. However, we want to keep the tracking flags for the next releases.
any hope this problem will be fixed soon? it's 3 months now and i try every new beta but same problem!!
Jeff, could you help here? Thanks
Flags: needinfo?(jwalden+bmo)
Sadly, I need to won't fix this again for 32 as we've made no progress. ni on Naveed to see if we can make progress even if that's to be explicit in that we're not going to fix this bug.
Flags: needinfo?(nihsanullah)
Fixing the Page Info use should be pretty simple.  browser/base/content/pageinfo/pageInfo.js, change the formatNumber function currently on line 1188 to do something different than it does now:

function formatNumber(number)
{
  return (+number).toLocaleString();  // coerce number to a numeric value before calling toLocaleString()
}

But having reread this bug, I don't actually know *what* people want here.  Apparently using the user's locale to format the number doesn't make people happy (why, I don't know).  Comment 33 makes the "obvious" (to me, as not particularly an i18n expert) solution of adding providing an "en-US" argument to toLocaleString.  Why isn't that acceptable?  Is it simply that nobody actually is in charge enough here, to affirmatively say how Page Info should behave?

I'm going to assume, given that lack of attention, that the problem is this is miscategorized as a JS engine bug, and that front end people can say what they actually want here.  Because I sure don't know...
Component: JavaScript Engine → Page Info Window
Flags: needinfo?(jwalden+bmo)
Product: Core → Firefox
Version: 29 Branch → Trunk
For completeness, I should note that as far as toLocaleString behavior itself goes, near-trunk v8 gives me this:

[jwalden@find-waldo-now src]$ LANG=ar-EG ~/google/v8/out/x64.debug/d8 
V8 version 3.28.70 (candidate) [console: readline]
d8> Number(123456.789).toLocaleString()
"١٢٣٬٤٥٦٫٧٨٩"
d8> Number(123456.789).toLocaleString("ar-EG")
"١٢٣٬٤٥٦٫٧٨٩"
d8> 
[jwalden@find-waldo-now src]$ LANG=en-US ~/google/v8/out/x64.debug/d8 
V8 version 3.28.70 (candidate) [console: readline]
d8> Number(123456.789).toLocaleString()
"123,456.789"
d8> Number(123456.789).toLocaleString("ar-EG")
"١٢٣٬٤٥٦٫٧٨٩"

v8 just (July ~18) updated to the same ICU and CLDR we're using, so it's probable that their behavior with Number.prototype.toLocaleString will become consistent with ours very soon.  I don't think there's an underlying JS engine behavior to change here.
Putting my UI hat on (no, it's not a dunce cap, what makes you say that?), there's a threshold question to answer.  What's the purpose of displaying sizes (cache sizes, image dimensions, scaling, etc.), anyway?

If it's a web developer display, then normal English decimal numbers and separators make sense, because the language of web development (to the extent of the names in JS, and in JS only treating English numbers as numbers when tokenizing, etc.) is English.  One might imagine a developer putting an image in a page, then using Page Info to determine the size to put in the height/width attributes on the <img> tag, for example.  In such a scenario numbers that can be copied make a lot of sense.  And even if not literally copied, the formatting matches the mental mode of thinking about how to write code.  On the other hand, if it's a display for users to know a little bit about the page they're viewing, then locale-specific formatting makes sense.

Given Page Info isn't in the Web Developer submenu, I'd have thought it was designed for users, and therefore locale-specific formatting would be right.  But comments here say otherwise, so what do I know.  Maybe it's in some uncanny valley of trying not to be too webdev or something.
I don't think we should just use en-US here. While the decimal separator might make sense to all web devs, the thousands separator won't.

I think we should use the same language as used for the rest of the UI. Is that what toLocaleString() used to do?

I also think toLocaleString() should default to navigator.language. This is usually the same as the UI language but not necessarily, as the user can change this in our language settings (Options/Preferences -> Content -> Languages).
The locale used by toLocaleString() by default is as determined by this:

    char *locale, *lang, *p;
#ifdef HAVE_SETLOCALE
    locale = setlocale(LC_ALL, nullptr);
#else
    locale = getenv("LANG");
#endif
    // convert to a well-formed BCP 47 language tag
    if (!locale || !strcmp(locale, "C"))
        locale = const_cast<char*>("und");
    lang = JS_strdup(this, locale);
    if (!lang)
        return nullptr;
    if ((p = strchr(lang, '.')))
        *p = '\0';
    while ((p = strchr(lang, '_')))
        *p = '-';
    // use lang as locale

JS_SetDefaultLocale will override this.  We call it in xpc_LocalizeRuntime, which is called on every startup, so the above possibly doesn't matter too much except for the JS shell.  (I set it out in case I'm misreading anything.)

The value passed to JS_SetDefaultLocale is as determined by getting the locale service, asking for the application locale, then querying that for the "NISLOCALE_TIME" category.  This category stuff is managed by hashtable, so is not easily determined except by the particular code paths that constructed the locale.  The application locale is constructed in an OS-specific manner from quite variable sources of information, so there's really no good way to describe exactly what feeds in here.

The algorithm used by navigator.language is to select the first element from intl.accept_languages, which typically is taken from chrome://global/locale/intl.properties.  No idea what our policies are about what goes in there.

Changing xpc_LocalizeRuntime to check intl.accept_languages seems like the simple fix, providing predictable behavior, if just using what navigator.language uses is desired.  But I don't know the ramifications of changing the locale provided to JSAPI beyond this single place, or of having the locale service provide different information from what JSAPI sees.

All of which is to say, why in the world do we have so many different ways to do all this.  :-(
Here's my personal POV. I'm also flagging gandalf with a needinfo, 'cause he's been sitting in the i18n spec talks longer than I did.

We should use the document language, followed by the UI language, followed by the OS language. If we'd like to insert accept-lang, I'd put it last or second to last.

This is different to what we used to do in the past, I think. IIRC, we just used what libc returned, so a French user on a Japanese OS visiting a German page would get Japanese number formats. Or was that just for Date? One of those things that support toLocaleString behaved like that.

I've never been an intimate friend of that approach UI/UX wise.
Flags: needinfo?(gandalf)
Intl says that the default locale is determined by a DefaultLocale() method: "The DefaultLocale abstract operation returns a String value representing the structurally valid (6.2.2) and canonicalized (6.2.3) BCP 47 language tag for the host environment’s current locale."

I can sort of squint really hard and read "host environment" as accommodating the language of the document (which JS would view in terms of the language of the "realm", which is to say the global object, the window containing that document).  But it's an awkward fit because elsewhere "host environment" is used in concert with "current time zone" in a possessive sense.  It's pretty unnatural to think of individual realms having a different current time zone.
Hah, and I'd go for: UI language, followed by the OS language, because that's how I read http://ecma-international.org/ecma-402/1.0/#sec-6.2.4 - Borrowing UI/UX hat I'd not want to see document's locale leaking onto browser's UI.
Flags: needinfo?(gandalf)
(In reply to Zibi Braniecki [:gandalf] from comment #50)
> Hah, and I'd go for: UI language, followed by the OS language, because
> that's how I read http://ecma-international.org/ecma-402/1.0/#sec-6.2.4 -
> Borrowing UI/UX hat I'd not want to see document's locale leaking onto
> browser's UI.

This concern seems misguided. This API is primarily for web content, so you'll possibly see the browser UI's language leak into web content, not the other way around. That's why navigator.language as the default makes more sense to me. In our UI code we could explictly pass the UI locale to toLocaleString, so the default behavior for web content doesn't necessarily matter.
(In reply to Dão Gottwald [:dao] from comment #51)
> This concern seems misguided. This API is primarily for web content, so
> you'll possibly see the browser UI's language leak into web content, not the
> other way around. That's why navigator.language as the default makes more
> sense to me. In our UI code we could explictly pass the UI locale to
> toLocaleString, so the default behavior for web content doesn't necessarily
> matter.

Oh, I got confused by the pageinfo example given above.

In that case I agree with comment 48 - document's language, UI language, followed by OS language.
It appears that there are two issues here: What UI wants and what the API wants. Both aspects are getting actively discussed here.

When we have mostly reached agreement Jeff will implement the change (if any).
Assignee: nobody → jwalden+bmo
Flags: needinfo?(nihsanullah)
Since this bug is super quiet, and just like for the last few releases, I am going to wontfix this one for 33 and untrack it for 34...

Please resubmit with a patch if any.
6 months no fix, i don't know how much a problem will  take to solve it, anyway thanks
Reproducible
Version 	47.0.1
Build ID 	20160623154057
User Agent 	Mozilla/5.0 (Windows NT 10.0; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0
I hope :waldo will forgive me, but I'm taking this. I think I have a patch.
Assignee: jwalden+bmo → gandalf
Component: Page Info Window → JavaScript: Internationalization API
Product: Firefox → Core
Flags: needinfo?(smontagu)
I think this has been fixed by me in bug 1346674 by setting the default locale for JS environment to be the app locale, instead of OS locale.
Marking as fixed. If you encounter it again, please, reopen.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Depends on: 1346674
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.