Inconsistent default locale checks for cached Intl instances

RESOLVED FIXED in Firefox 59

Status

()

RESOLVED FIXED
2 years ago
2 months ago

People

(Reporter: anba, Assigned: anba)

Tracking

Trunk
mozilla59
Points:
---

Firefox Tracking Flags

(firefox59 fixed)

Details

Attachments

(2 attachments, 1 obsolete attachment)

(Assignee)

Description

2 years ago
Date.prototype.toLocale[Date,Time]String() validates that the current locale matches the cached locale [1], but no equivalent checks are performed for Number.prototype.toLocaleString() [2] and String.prototype.localeCompare() [3].

Question: Do we support changing the default locale at runtime?


[1] https://dxr.mozilla.org/mozilla-central/rev/0ddfec7126ec503b54df9c4b7c3b988906f6c882/js/src/builtin/Date.js#59
[2] https://dxr.mozilla.org/mozilla-central/rev/0ddfec7126ec503b54df9c4b7c3b988906f6c882/js/src/builtin/Number.js#28
[3] https://dxr.mozilla.org/mozilla-central/rev/0ddfec7126ec503b54df9c4b7c3b988906f6c882/js/src/builtin/String.js#583
(Assignee)

Comment 1

a year ago
@gandalf: Do we need to fix this for bug 1349401?
Flags: needinfo?(gandalf)
Nope.

If I understand the flow correctly, fixing bug 1349401 will actually make this more meaningful because once we fix bug 1349401 JS context's locale will actually be representing browser's UI locale (and maybe, in the future, website's locale).

This bug on the other hand, is all about invalidating the cache when this happens.
What we could do, once we have bug 1349401 is to invalidate the cache there instead of testing it on each call here.
Flags: needinfo?(gandalf)
(Assignee)

Comment 3

a year ago
Ah okay, I was under the impression that bug 1349401 allows to change the JS context's locale at runtime, because that won't work correctly with the current Number.prototype.toLocaleString() and String.prototype.localeCompare() implementation.
> Ah okay, I was under the impression that bug 1349401 allows to change the JS context's locale at runtime,

It is!

> because that won't work correctly with the current Number.prototype.toLocaleString() and String.prototype.localeCompare() implementation.

And we'll need to fix this bug eventually as well, but it doesn't block :)
(Assignee)

Comment 5

a year ago
Created attachment 8855343 [details] [diff] [review]
bug1319843.patch

Patch to check current RuntimeDefaultLocale() matches the locale of the saved NumberFormat/Collator. 

Calling RuntimeDefaultLocale() leads to measurable performance regression (~30% slower) in µ-benchmarks calling Number.prototype.toLocaleString and String.prototype.localeCompare, so we probably shouldn't apply the patch and instead try to use the approach suggested in comment #2.
Any plans to land this? :)
Flags: needinfo?(andrebargull)
(Assignee)

Comment 7

9 months ago
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #6)
> Any plans to land this? :)

I think we first want to fix the performance regression noted in comment #5 before we can land the change.
Flags: needinfo?(andrebargull)
(Assignee)

Comment 8

9 months ago
I took another look at this patch and it seems most of the performance regressions come from allocating the string in RuntimeDefaultLocale(). So if we add a new |IsRuntimeDefaultLocale(locale)| function to test if a given string matches the default locale, we should be able to avoid or at least reduce these performance regressions.
(Assignee)

Comment 9

9 months ago
Created attachment 8931709 [details] [diff] [review]
bug1319843-part1-avoid-string-alloc.patch

Adds `IsRuntimeDefaultLocale(...)` and `intl_isDefaultTimeZone(...)` as the counterparts to `RuntimeDefaultLocale()` and `intl_defaultTimeZone()` to check if a given value is the default locale resp. time zone. That way we can avoid the extra string allocations in `RuntimeDefaultLocale()` and `intl_defaultTimeZone()` when we just need to check if the cached locale (or time zone) is still the default one.

This improves the following Date.prototype.toLocaleString() µ-benchmark from 740ms to 700ms:
---
function f() {
    var date = new Date();
    var t = dateNow();
    for (var i = 0; i < 100000; ++i) { date.toLocaleString(); }
    return dateNow() - t;
}
for (var i = 0; i < 10; ++i) print(f());
---

And this String.prototype.toLocaleUpperCase() µ-benchmark from 105ms to 50ms:
---
function f() {
    var t = dateNow();
    for (var i = 0; i < 1000000; ++i) { "aaa".toLocaleLowerCase(); }
    return dateNow() - t;
}
for (var i = 0; i < 10; ++i) print(f());
---
Assignee: nobody → andrebargull
Status: NEW → ASSIGNED
Attachment #8931709 - Flags: review?(gandalf)
(Assignee)

Updated

9 months ago
Attachment #8855343 - Attachment is obsolete: true
(Assignee)

Comment 10

9 months ago
Created attachment 8931710 [details] [diff] [review]
bug1319843-part2-check-current-locale.patch

Check the current locale matches the locale of the cached Collator (for String.prototype.localeCompare) resp. the cached NumberFormat (for Number.prototype.toLocaleString).


The performance for this Number.prototype.toLocaleString µ-benchmark decreases from 225ms to 250ms:
---
function f() {
    var t = dateNow();
    for (var i = 0; i < 1000000; ++i) { NaN.toLocaleString(); }
    return dateNow() - t;
}
for (var i = 0; i < 10; ++i) print(f());
---

And the performance for this String.prototype.localeCompare µ-benchmark decreases from 90ms to 185ms (higher iteration count compared to the Number.prototype.toLocaleString benchmark from above!):
---
function f() {
    var t = dateNow();
    for (var i = 0; i < 5000000; ++i) { "".localeCompare(""); }
    return dateNow() - t;
}
for (var i = 0; i < 10; ++i) print(f());
---
Attachment #8931710 - Flags: review?(gandalf)
(Assignee)

Comment 11

9 months ago
The µ-benchmarks in comment #10 use `NaN.toLocaleString()` resp. `"".localeCompare("")` to measure only the effect of the added default locale check and to minimize the cost of the actual Collator or NumberFormat operation. For example in this µ-benchmark  the relative cost of the locale check is only ~14% instead of 100% slower when compared to comment #10.

---
// 145 -> 165 per each to |f|.
function f() {
    var t = dateNow();
    for (var i = 0; i < 1000000; ++i) "aaaaa".localeCompare("aaaaA");
    return dateNow() - t;
}
for (var i = 0; i < 10; ++i) print(f());
---
Attachment #8931710 - Flags: review?(gandalf) → review+
Comment on attachment 8931709 [details] [diff] [review]
bug1319843-part1-avoid-string-alloc.patch

Review of attachment 8931709 [details] [diff] [review]:
-----------------------------------------------------------------

That looks great! Thanks for the patch. It's a subtle improvement that we may have more of to do in the Intl.cpp/js.

Few notes left up to your discretion.

::: js/src/builtin/Intl.cpp
@@ +953,5 @@
>  static const size_t INITIAL_CHAR_BUFFER_SIZE = 32;
>  
> +template <typename ICUStringFunction, size_t InlineCapacity>
> +static int32_t
> +Call(JSContext* cx, const ICUStringFunction& strFn, Vector<char16_t, InlineCapacity>& chars)

What's the value of extracting it out of `Call(JSContext* cx, const ICUStringFunction& strFn)` if the only time you call the extracted function is from `js::intl_isDefaultTimeZone` where you allocate the Vector anyway?

Also, should this be inlined?

@@ +3151,5 @@
> +    if (str->length() == size_t(size)) {
> +        JS::AutoCheckCannotGC nogc;
> +        equals = str->hasLatin1Chars()
> +                 ? EqualChars(str->latin1Chars(nogc), chars.begin(), str->length())
> +                 : EqualChars(str->twoByteChars(nogc), chars.begin(), str->length());

I did some shallow research around timezone IDs, and it seems to me that we may get away with just latin1Chars here (and if not return an early false)?

::: js/src/vm/SelfHosting.cpp
@@ +1836,5 @@
> +        JS::AutoCheckCannotGC nogc;
> +        const Latin1Char* latin1Locale = reinterpret_cast<const Latin1Char*>(locale);
> +        equals = str->hasLatin1Chars()
> +                 ? EqualChars(str->latin1Chars(nogc), latin1Locale, str->length())
> +                 : EqualChars(str->twoByteChars(nogc), latin1Locale, str->length());

Similarly to the timezone - I'm pretty sure that a well-formed BCP47 code can only be latin1. Escape early otherwise?
Attachment #8931709 - Flags: review?(gandalf) → review+
(Assignee)

Comment 13

9 months ago
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #12)
> ::: js/src/builtin/Intl.cpp
> @@ +953,5 @@
> >  static const size_t INITIAL_CHAR_BUFFER_SIZE = 32;
> >  
> > +template <typename ICUStringFunction, size_t InlineCapacity>
> > +static int32_t
> > +Call(JSContext* cx, const ICUStringFunction& strFn, Vector<char16_t, InlineCapacity>& chars)
> 
> What's the value of extracting it out of `Call(JSContext* cx, const
> ICUStringFunction& strFn)` if the only time you call the extracted function
> is from `js::intl_isDefaultTimeZone` where you allocate the Vector anyway?
> 
> Also, should this be inlined?

The existing `Call(JSContext*, const ICUStringFunction&)` [1] function always creates a JSString [2] at the end, but for `intl_isDefaultTimeZone` we only want to call the ICU function without creating the JSString. That's the only reason the additional `Call(JSContext*, const ICUStringFunction&, Vector<char16_t, InlineCapacity>&)` was added. I should have made that more clear in the patch description. Sorry about that!

[1] https://searchfox.org/mozilla-central/rev/8839daefd69087d7ac2655b72790d3a25b6a815c/js/src/builtin/Intl.cpp#957
[2] https://searchfox.org/mozilla-central/rev/8839daefd69087d7ac2655b72790d3a25b6a815c/js/src/builtin/Intl.cpp#977


> @@ +3151,5 @@
> > +    if (str->length() == size_t(size)) {
> > +        JS::AutoCheckCannotGC nogc;
> > +        equals = str->hasLatin1Chars()
> > +                 ? EqualChars(str->latin1Chars(nogc), chars.begin(), str->length())
> > +                 : EqualChars(str->twoByteChars(nogc), chars.begin(), str->length());
> 
> I did some shallow research around timezone IDs, and it seems to me that we
> may get away with just latin1Chars here (and if not return an early false)?

Hmm, yes and no. The way this function is currently used, we always end up with Latin-1 strings, so we could remove the two-byte string paths. But that means we add certain assumptions about how ICU [3,4] (`TimeZone::detectHostTimeZone()` always creates a time zone whose ID consists of `char` and not `char16_t` characters; the same applies for the TZ detection for Windows), other Intl methods [5,6] (`intl_defaultTimeZone` directly returns the result of `ucal_getDefaultTimeZone`; `intl_isDefaultTimeZone` is only called with string previously returned by `intl_defaultTimeZone`), and JSString [7] (`NewStringCopyN` creates a Latin-1 string when the string contents are all representable as Latin-1 characters) works. That's why I prefer to use the canonical way to handle both string types, even if one of the string types should never appear here in practice. 

[3] https://searchfox.org/mozilla-central/rev/8839daefd69087d7ac2655b72790d3a25b6a815c/intl/icu/source/i18n/timezone.cpp#454
[4] https://searchfox.org/mozilla-central/rev/8839daefd69087d7ac2655b72790d3a25b6a815c/js/src/vm/DateTime.cpp#411-418
[5] https://searchfox.org/mozilla-central/rev/8839daefd69087d7ac2655b72790d3a25b6a815c/js/src/builtin/Intl.cpp#3078
[6] https://searchfox.org/mozilla-central/rev/8839daefd69087d7ac2655b72790d3a25b6a815c/js/src/builtin/Intl.js#764
[7] https://searchfox.org/mozilla-central/rev/8839daefd69087d7ac2655b72790d3a25b6a815c/js/src/vm/String.cpp#1374-1375
> but for `intl_isDefaultTimeZone` we only want to call the ICU function without creating the JSString.

Ahh, that makes sense! Thank you! I knew you added it to avoid allocation, but somehow got myself into thinking that you still allocated that new string at the end. My bad!

> That's why I prefer to use the canonical way to handle both string types, even if one of the string types should never appear here in practice. 

Makes sense. I doubt we'll ever get non-ascii chars in timezone ID, but there's no harm in supporting them in case we do :)

Comment 16

9 months ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/32d665c12706
Part 1: Avoid string allocations when checking for default locales and time zones. r=gandalf
https://hg.mozilla.org/integration/mozilla-inbound/rev/fbf50b919f3c
Part 2: Validate current locale matches cached locale for Number.toLocaleString() and String.localeCompare(). r=gandalf
Keywords: checkin-needed

Comment 17

9 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/32d665c12706
https://hg.mozilla.org/mozilla-central/rev/fbf50b919f3c
Status: ASSIGNED → RESOLVED
Last Resolved: 9 months ago
status-firefox59: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
status-firefox53: affected → ---
You need to log in before you can comment on or make changes to this bug.