Closed Bug 165200 Opened 22 years ago Closed 20 years ago

Number.toLocaleString() returns incorrect value

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
major

Tracking

()

RESOLVED FIXED

People

(Reporter: pavils, Assigned: tor)

References

()

Details

(Keywords: helpwanted, intl, l12y)

Attachments

(1 file, 6 obsolete files)

The toLocaleString method of Number object doesn't return correct value on systems where the Decimal symbol setting of Regional settings is set to other character than "." (dot).
What does it return? What would you expect it to return? Note from ECMA-262: 15.7.4.3 Number.prototype.toLocaleString() Produces a string value that represents the value of the Number formatted according to the conventions of the host environment’s current locale. This function is implementation-dependent, and it is permissible, but not encouraged, for it to return the same thing as toString. I suspect we may be going down the 'not encouraged' route.
Note this is similar to bug 153586, "Date.prototype.toLocaleString ( ) not implemented correctly" Pavils: can you give an example of what you expect, and what you are getting instead? Thanks -
Assignee: rogerl → khanson
Status: UNCONFIRMED → NEW
Ever confirmed: true
Pavels is doing something like: foo = 5 / 2; foo.toLocaleString(); and getting "5.2" instead of the European "5,2"
That's right, I should have been explaining this more precisely. In my country, the decimal separator is "," (comma), therefore number 1.23 would be known as "1,23". At the same time, many people start switching to "." (dot). That means, I can never be sure whether my intranet application client would have "." or "," for decimal separator. I need to format numbers according to users locale because my application generates tab-delimited datasheet which can be copied in Excel for further analysis. Excel would not recognize numbers as numbers, and hence the problem. I see that here Mozilla goes by "not encouraged" way, yet is should be dead simple to read the locale value and use that instead of ".". Unfortunately I am not C-savvy, so I can't really produce a patch by myself.
Raising severity to major - this is an important issue for our international users.
Severity: normal → major
Summary: Number.toLocaleString returns incorrect value → Number.toLocaleString() returns incorrect value
Attached patch Proposed solution (obsolete) — Splinter Review
Here's a strawman patch - if the locale specifies a non-'.' decimal, substitute it in the result string. Trouble is, I can't get it to work except by twiddling in the debugger. The locale struct always contains the default "C" locale, regardless of rebooting my machine etc.
Any comments, reviews for this patch?
I'd expect it to honor all of the windows locale settings, so 100000.1 for me should be: 100,000.1 and for biesi it'd probably be 100.000,1 as such it isn't sufficient to just twiddle the decimal. I belive that the thousand sep can actually be a hundreds or tens of thousands sep if it likes (it can also be nothing). there's an api for this (since win16) but i can't find it right now
Doing locale stuff requires total platform specific code (similar to what is in nsLocale right now) You can't rely on the C locale for all platforms. Maybe we should pull some of the nslocale code into JS? We need date and number formatting to be locale specific.
Attachment #101605 - Attachment is obsolete: true
Keywords: helpwanted, intl, l12y
OS: Windows 98 → All
Hardware: PC → All
I have corrected a typo in the back-up decimal separator. Also, I have corrected the back-up thousand separator to apostrophe. (Looking through Windows' Regional Settings, I found that the official languages of Switzerland (French, German, Italian) all use ' as thousands and . as decimal separators.) I think, this is the most unambiguous and clear behaviour one would expect of Mozilla in the case where the locale does not have information about separators. Once again, this is not the default behaviour, it is a back-up one. P.S. We should use either apostrophe or white-space for the thousand separator, not commas.
Attachment #134712 - Attachment is obsolete: true
Using ' as a fallback seperator is something that nobody will expect. I'd think using a C locale fallback of no thousands seperator and a period as the decimal seperator would lessen the element of surprise.
(In reply to comment #12) > Using ' as a fallback seperator is something that nobody will expect. I'd > think using a C locale fallback of no thousands seperator and a period as > the decimal seperator would lessen the element of surprise. This is intended to be readable by a human, not a machine. The idea is to actually use a separator, so we must use some good separator. Apostrophe (or space) is the best possible back-up separator. (I agree, it might look strange to some people from the first time, but it is not ambiguous.)
Attachment #145222 - Attachment is obsolete: true
Attachment #147766 - Flags: review?(brendan)
We should not use commas for grouping as a fall-back, as it would be very confusing for some peoples. We should use something that could be accepted by everyone, and comma as a group separator is not going to fit. Here are the pluses of apostrophe for thousand and period for decimal separators: 1. Regional Settings of Windows XP claim that this notation is used at the home of the ISO -- Switzerland. They have several official languages in the country, and in the original countries of those languages (Germany, France, Italy), the opposite of the British notation is used. Therefore, Swiss had to come up with some unambiguous notation (and they did). 2. This might be funny to you, but most calculators (like CASIO) use exactly this notation --- apostrophes for thousand and periods for decimal separators. Go to the US site of CASIO and see it yourself. Have you ever wondered why? Because most people are guaranteed to understand it, or at least not to be mislead by such a notation.
Comment on attachment 147766 [details] [diff] [review] use british fallback (comma/thousands, period/decimal) >+ /* stuff we only need to do once */ >+ if (!locale) { >+ locale = localeconv(); >+ if (locale->thousands_sep) >+ thousands = strdup(locale->thousands_sep); >+ else >+ thousands = ","; >+ if (locale->decimal_point) >+ decimal = strdup(locale->decimal_point); >+ else >+ decimal = "."; >+ >+ thousandsLength = strlen(thousands); >+ decimalLength = strlen(decimal); >+ } This is not thread-safe -- put it in a function and #ifdef JS_THREADSAFE, use PR_CallOnce? Or perhaps better (we don't use PR_CallOnce anywhere yet -- people do use JS_THREADSAFE but not NSPR, so we'd be making work for those folks), put this state in the JSRuntime and init it in js_InitRuntimeNumberState. And free the strdup'd stuff in js_FinishRuntimeNumberState. Nit: how about "Separator" or "Sep" in the names of these things? Dumb question: FC1 man 7 locale shows char *grouping; in struct lconv -- how is that supposed to be used in conjunction with decimal_point and thousands_sep? >+ >+ /* create the string, move back to bytes to make string >+ twiddling a bit easier and so we can insert platform >+ charset seperators. */ >+ if (!num_toString(cx, obj, 0, argv, rval)) >+ return JS_FALSE; >+ JS_ASSERT(JSVAL_IS_STRING(*rval)); >+ numStr = JSVAL_TO_STRING(*rval); >+ num = js_GetStringBytes(numStr); This is enough code that it seems better to me to inline the guts of js_NumberToString, except for the js_NewStringCopyZ at the end: jsint i; char buf[DTOSTR_STANDARD_BUFFER_SIZE]; char *numStr; if (JSDOUBLE_IS_INT(d, i)) { numStr = IntToString(i, buf, sizeof buf); } else { numStr = JS_dtostr(buf, sizeof buf, DTOSTR_STANDARD, 0, d); if (!numStr) { JS_ReportOutOfMemory(cx); return JS_FALSE } } That way you get non-const access to buf, which is on the stack. >+ >+ /* find bit before the decimal */ >+ if ((dec = strchr(num, '.')) != NULL) >+ digits = dec - num; Nit, here and below: expand tabs, don't insert them into js* files. >+ else >+ digits = strlen(num); >+ end = num+digits; Nit: space around binary operators is general prevailing style. >+ >+ /* figure out how long resulting string will be */ >+ size = digits + >+ (dec ? decimalLength + strlen(dec+1) : 0) + >+ thousandsLength * ((digits - 1) / 3); >+ if ((digits > 3) && ((digits % 3) == 1) && (*num == '-')) Nit: overparenthesized relation/equality expressions. >+ size -= thousandsLength; >+ >+ buf = (char *) JS_malloc(cx, size); Unchecked malloc. There's no need for this if we can treat non-null decimal_point and thousands_sep members of struct lconv as pointing to exactly one non-NUL char. Then you could add the worst case to the local buf's dimension: char buf[((DTOSTR_STANDARD_BUFFER_SIZE * 4) / 3) + 1]; (or something like that), and of course use DTOSTR_STANDARD_BUFFER_SIZE, not sizeof buf, when calling JS_dtostr. >+ >+ tmpDest = buf; >+ tmpSrc = num; >+ >+ while (tmpSrc < num + (digits % 3)) >+ *(tmpDest++) = *(tmpSrc++); Nit: overparenthesized * operand here and below, although not consistently. >+ while (tmpSrc < end) { >+ if ((tmpDest != buf) && (tmpDest[-1] != '-')) { >+ strcpy(tmpDest, thousands); >+ tmpDest += thousandsLength; >+ } >+ *(tmpDest++) = *(tmpSrc++); >+ *(tmpDest++) = *(tmpSrc++); >+ *(tmpDest++) = *(tmpSrc++); >+ } >+ >+ if (dec) { >+ strcpy(tmpDest, decimal); >+ tmpDest += decimalLength; >+ strcpy(tmpDest, dec+1); Is there no need for thousands_sep in long fractional part strings? >+ } else >+ *tmpDest++ = '\0'; >+ >+ str = JS_NewStringCopyZ(cx, buf); Unchecked JS_NewStringCopyZ -- if (!str), return JS_FALSE (but don't leak buf). >+ *rval = STRING_TO_JSVAL(str); >+ JS_free(cx, buf); If you have to malloc buf, use js_NewString to hand off its memory to the deflated string cache, freeing buf only if (!str). >+ >+ return JS_TRUE; > } Hope this helps, /be
(In reply to comment #16) > Dumb question: FC1 man 7 locale shows char *grouping; in struct lconv -- how is > that supposed to be used in conjunction with decimal_point and thousands_sep? It's a small pandora's box that allows grouping for something other than thousands. For instance "320" would have the first seperator at the thousand point, then at the hundreds thereafter. What we're used to is "30" (330 for some reason in the fc1 locale).
Attachment #147766 - Attachment is obsolete: true
Comment on attachment 147766 [details] [diff] [review] use british fallback (comma/thousands, period/decimal) tor: can you request review from me on the latest patch? Thanks, /be
Attachment #147766 - Flags: review?(brendan)
Attachment #148116 - Flags: review?(brendan)
Comment on attachment 148116 [details] [diff] [review] address most of brendan's comments, handle grouping >Index: jscntxt.h >=================================================================== >RCS file: /cvsroot/mozilla/js/src/jscntxt.h,v >retrieving revision 3.63 >diff -u -9 -p -r3.63 jscntxt.h >--- jscntxt.h 13 Apr 2004 01:25:17 -0000 3.63 >+++ jscntxt.h 10 May 2004 16:51:28 -0000 >@@ -209,18 +209,23 @@ struct JSRuntime { > JSScopeProperty *propertyFreeList; > JSArenaPool propertyArenaPool; > > /* Script filename table. */ > struct JSHashTable *scriptFilenameTable; > #ifdef JS_THREADSAFE > PRLock *scriptFilenameTableLock; > #endif > >+ /* Number localization, used by jsnum.c */ >+ char *thousandsSeperator; >+ char *decimalSeperator; Please spell these properly (separator, not seperator). >+ char *numGrouping; const char * for all of these. > static JSBool > num_toLocaleString(JSContext *cx, JSObject *obj, uintN argc, > jsval *argv, jsval *rval) > { >-/* >- * For now, forcibly ignore the first (or any) argument and return toString(). >- * ECMA allows this, although it doesn't 'encourage it'. >- * [The first argument is being reserved by ECMA and we don't want it confused >- * with a radix] >- */ >- return num_toString(cx, obj, 0, argv, rval); >+ char *thousandsSeperator, thousandsLength; >+ char *decimalSeperator, decimalLength; Ditto for the pointers (tranpose decl matrix!). >+ char *numGrouping; Constipate here too. >+ JSRuntime *rt = cx->runtime; >+ JSString *numStr, *str; >+ char *num, *buf, *dec, *end, *tmpSrc, *tmpDest, *tmpGroup; >+ int digits, size, remainder, nrepeat = 0; >+ >+ if (rt->thousandsSeperator) >+ thousandsSeperator = rt->thousandsSeperator; >+ else >+ thousandsSeperator = "'"; >+ if (rt->decimalSeperator) >+ decimalSeperator = rt->decimalSeperator; >+ else >+ decimalSeperator = "."; >+ if (rt->numGrouping) >+ numGrouping = rt->numGrouping; >+ else >+ numGrouping = "\3\0"; Use ?: rather than if/else. >+ /* create the string, move back to bytes to make string >+ twiddling a bit easier and so we can insert platform >+ charset seperators. */ Nit: major comment style should follow "when in Rome". >+ if (!num_toString(cx, obj, 0, argv, rval)) >+ return JS_FALSE; >+ JS_ASSERT(JSVAL_IS_STRING(*rval)); >+ numStr = JSVAL_TO_STRING(*rval); >+ num = js_GetStringBytes(numStr); >+ >+ /* find bit before the decimal */ >+ if ((dec = strchr(num, '.')) != NULL) >+ digits = dec - num; Tab alert, eek! Please expand all tabs. >+ else >+ digits = strlen(num); Ditto. >+ if (dec) { >+ strcpy(tmpDest, decimalSeperator); >+ tmpDest += decimalLength; >+ strcpy(tmpDest, dec + 1); >+ } else >+ *tmpDest++ = '\0'; When-in-Rome says to fully brace else clause if then clause is braced. >+ str = JS_NewString(cx, buf, size); >+ if (!str) { >+ JS_free(cx, buf); >+ return JS_FALSE; >+ } >+ >+ *rval = STRING_TO_JSVAL(str); >+ >+ return JS_TRUE; > } [snip] >+ locale = localeconv(); I wonder what ports will burn over this, or related. Don't let that stop you! >+ if (locale->thousands_sep) >+ rt->thousandsSeperator = strdup(locale->thousands_sep); >+ if (locale->decimal_point) >+ rt->decimalSeperator = strdup(locale->decimal_point); >+ if (locale->grouping) >+ rt->numGrouping = strdup(locale->grouping); >+ > return JS_TRUE; How about using JS_strdup and using return rt->thousandsSeparator && rt->decimalSeparator && rt->numGrouping; To propagate OOM? > } > > void > js_FinishRuntimeNumberState(JSContext *cx) > { > JSRuntime *rt = cx->runtime; > > js_UnlockGCThingRT(rt, rt->jsNaN); > js_UnlockGCThingRT(rt, rt->jsNegativeInfinity); > js_UnlockGCThingRT(rt, rt->jsPositiveInfinity); > > rt->jsNaN = NULL; > rt->jsNegativeInfinity = NULL; > rt->jsPositiveInfinity = NULL; >+ >+ if (rt->thousandsSeperator) { >+ free(rt->thousandsSeperator); >+ rt->thousandsSeperator = NULL; >+ } >+ if (rt->decimalSeperator) { >+ free(rt->decimalSeperator); >+ rt->decimalSeperator = NULL; >+ } >+ if (rt->numGrouping) { >+ free(rt->numGrouping); >+ rt->numGrouping = NULL; >+ } Use JS_free (it's null-safe) and just set all three rt->* to NULL after. Code size win, and it coheres with early failure return from js_InitRuntimeNumberState (failure there causes js_NewContext to js_DestroyContext, which will js_FinishRuntimeNumberState). One more patch and I'll gladly r=. Thanks, /be
Attachment #148116 - Flags: review?(brendan) → review-
Assignee: khanson → tor
Attachment #148116 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Comment on attachment 153421 [details] [diff] [review] Address all of brendan's comments except for "[snip]" (?) >+ numGrouping = rt->numGrouping ? rt->numGrouping : "\3\0"; The \0 isn't necessary, but doesn't hurt -- take it or leave it (but see below if you leave it). >+ /* Find bit before the decimal. */ >+ if ((dec = strchr(num, '.')) != NULL) >+ digits = dec - num; >+ else >+ digits = strlen(num); Sorry I missed this, it too could use ?: (same left-hand side of assignment), and pulling the nested assignment out of the if condition into a preceding statement would match local style. >+ while (*tmpGroup != CHAR_MAX && *tmpGroup != 0) { >+ if (*tmpGroup >= remainder) >+ break; >+ size += thousandsLength; >+ remainder -= *tmpGroup; >+ tmpGroup++; >+ } >+ if (*tmpGroup == 0 && *numGrouping != 0) { Use '\0' instead of 0 when comparing *tmpGroup, to match "\3\0"? I'd leave the explicit \0 in the string literal, and use the matching char escape here. >+ nrepeat = (remainder - 1) / *(tmpGroup - 1); >+ size += thousandsLength * nrepeat; >+ remainder -= nrepeat * *(tmpGroup - 1); Twice above, tmpGroup[-1] might be more readable, especially after the * operator in the last line above. >- return JS_TRUE; >+ locale = localeconv(); >+ if (locale->thousands_sep) >+ rt->thousandsSeparator = JS_strdup(cx, locale->thousands_sep); >+ if (locale->decimal_point) >+ rt->decimalSeparator = JS_strdup(cx, locale->decimal_point); >+ if (locale->grouping) >+ rt->numGrouping = JS_strdup(cx, locale->grouping); >+ >+ return rt->thousandsSeparator && rt->decimalSeparator && rt->numGrouping; Argh, I suggested that return without noticing that any of locale->* could be null. Here's a thought: instead of doing the defaulting in num_toLocaleString on every call, quote: + thousandsSeparator = rt->thousandsSeparator ? rt->thousandsSeparator : "'"; + decimalSeparator = rt->decimalSeparator ? rt->decimalSeparator : "."; + numGrouping = rt->numGrouping ? rt->numGrouping : "\3\0"; Do it here, once, at runtime initialization time, and always strdup so you can always free: + rt->thousandsSeparator = JS_strdup(cx, locale->thousands_sep ? locale->thousands_sep : "'"); + rt->decimalSeparator = JS_strdup(cx, locale->decimal_point ? locale->decimal_point : "."); + rt->numGrouping = JS_strdup(cx, locale->grouping ? locale->grouping : "\3\0"); + + return rt->thousandsSeparator && rt->decimalSeparator && rt->numGrouping; Saves runtime assuming num_toLocaleString is called more than once, should be same code size. Sorry if I missed this, but can you remind me why the default thousands separator should be "'", not ","? r=me with these changes. /be
Attachment #153421 - Flags: review+
(In reply to comment #22) > Sorry if I missed this, but can you remind me why the default thousands > separator should be "'", not ","? Brendan, see comment 15 of this bug. Thanks.
Attached patch round seven...Splinter Review
Attachment #153421 - Attachment is obsolete: true
Comment on attachment 153440 [details] [diff] [review] round seven... Hey, I r+'d last time with the changes listed. But since you patched, I must review again! >@@ -287,23 +289,99 @@ num_toString(JSContext *cx, JSObject *ob > *rval = STRING_TO_JSVAL(str); > return JS_TRUE; > } > > static JSBool > num_toLocaleString(JSContext *cx, JSObject *obj, uintN argc, > jsval *argv, jsval *rval) > { >-/* >- * For now, forcibly ignore the first (or any) argument and return toString(). >- * ECMA allows this, although it doesn't 'encourage it'. >- * [The first argument is being reserved by ECMA and we don't want it confused >- * with a radix] >- */ >- return num_toString(cx, obj, 0, argv, rval); >+ char thousandsLength, decimalLength; >+ const char *numGrouping, *tmpGroup; >+ JSRuntime *rt = cx->runtime; >+ JSString *numStr, *str; >+ char *num, *buf, *dec, *end, *tmpSrc, *tmpDest; >+ int digits, size, remainder, nrepeat = 0; Nit: initialize rt below, after early returns whose predecessor basic blocks don't need it. Ditto nrepeat. Thanks again for fixing this. I wonder whether any ports will break. Let's find out! /be
Attachment #153440 - Flags: review+
Checked in (phew). Checking in jscntxt.h; /cvsroot/mozilla/js/src/jscntxt.h,v <-- jscntxt.h new revision: 3.65; previous revision: 3.64 done Checking in jsnum.c; /cvsroot/mozilla/js/src/jsnum.c,v <-- jsnum.c new revision: 3.58; previous revision: 3.57 done
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
From OS2GCC Depend build on ports tinderbox: jsnum.c gcc -o jsnum.o -c -DOSTYPE=\"OS22\" -DOSARCH=\"OS2\" -DEXPORT_JS_API -DJS_USE_SAFE_ARENA -I../../dist/include/js -I../../dist/include -IE:/OS2_2.45_Clobber/mozilla/obj/dist/include/nspr -IE:/OS2_2.45_Clobber/mozilla/js/src -Wall -W -Wno-unused -Wpointer-arith -Wcast-align -Wno-long-long -pedantic -Zomf -pipe -DNDEBUG -DTRIMMED -O2 -s -include ../../mozilla-config.h -DMOZILLA_CLIENT -Uunix -U__unix -U__unix__ -Wp,-MD,.deps/jsnum.pp E:/OS2_2.45_Clobber/mozilla/js/src/jsnum.c In file included from E:/OS2_2.45_Clobber/mozilla/js/src/jscntxt.h:50, from E:/OS2_2.45_Clobber/mozilla/js/src/jsnum.c:57: E:/OS2_2.45_Clobber/mozilla/js/src/jsdhash.h:510: warning: `regparm' attribute is incompatible with earlier defined `stdcall' attribute E:/OS2_2.45_Clobber/mozilla/js/src/jsnum.c: In function `js_InitRuntimeNumberState': E:/OS2_2.45_Clobber/mozilla/js/src/jsnum.c:568: dereferencing pointer to incomplete type E:/OS2_2.45_Clobber/mozilla/js/src/jsnum.c:568: dereferencing pointer to incomplete type E:/OS2_2.45_Clobber/mozilla/js/src/jsnum.c:570: dereferencing pointer to incomplete type E:/OS2_2.45_Clobber/mozilla/js/src/jsnum.c:570: dereferencing pointer to incomplete type E:/OS2_2.45_Clobber/mozilla/js/src/jsnum.c:572: dereferencing pointer to incomplete type E:/OS2_2.45_Clobber/mozilla/js/src/jsnum.c:572: dereferencing pointer to incomplete type Looks like there's no struct lconv on OS/2 GCC in locale.h
Blocks: numbers
For OS/2 the struct declaration should be added to \gcc32\include\locale.h struct lconv { char *decimal_point; char *thousands_sep; char *grouping; char *int_curr_symbol; char *currency_symbol; char *mon_decimal_point; char *mon_thousands_sep; char *mon_grouping; char *positive_sign; char *negative_sign; char int_frac_digits; char frac_digits; char p_cs_precedes; char p_sep_by_space; char n_cs_precedes; char n_sep_by_space; char p_sign_posn; char n_sign_posn; }; Would be nice to mention this on OS/2 builds instruction page...
The very latest GCC fixes this. We shouldn't add lconv, we should require people to get the latest CSD for GCC.
This is marked as Fixed, but I can still reproduce in Firefox 0.9.3. Is it supposed to be fixed?
It was checked into the trunk after firefox branched.
Is anyone working on moving the Number.toLocaleString() code to firefox? It still isn't in Firefox version 1.0.4.
you say that as if ff 1.0.4 was new. it isn't.
(In reply to comment #32) > Is anyone working on moving the Number.toLocaleString() code to firefox? It > still isn't in Firefox version 1.0.4. Firefox is not a fork of the Mozilla source tree, you know -- the branch on which 1.0.x is being maintained will not have all the great new features in the trunk. Try Deer Park Alpha 1 (http://www.mozilla.org/projects/firefox/). /be
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: