Implement caseFirst option in Intl.Collator

RESOLVED FIXED in Firefox 55

Status

()

defect
RESOLVED FIXED
6 years ago
2 years ago

People

(Reporter: mozillabugs, Assigned: anba)

Tracking

(Blocks 1 bug, {dev-doc-complete})

Trunk
mozilla55
All
macOS
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox55 fixed)

Details

(Whiteboard: [parity-chrome] [parity-opera])

Attachments

(2 attachments, 3 obsolete attachments)

(Reporter)

Description

6 years ago
The implementation of Intl.Collator currently does not support the caseFirst option and the associated kf key in the Unicode extension of language tags. caseFirst/kf is an optional feature in the ECMAScript Internationalization API Specification.

The feature is somewhat expensive to implement because ICU does not provide an API to directly determine the default setting of this option per locale. A workaround is to create a UCollator for the locale and examine its attributes.
Mass-moving existing Intl-related bugs to the new Core :: JavaScript: Internationalization API component.

If you think this bug has been moved in error, feel free to move it back to Core :: JavaScript Engine.

[Mass change filter: core-js-intl-api-move]
Component: JavaScript Engine → JavaScript: Internationalization API
FWIW Chrome and Opera already support this, so [parity-chrome] and [parity-opera] should be added to the whiteboard info.

Sebastian
(In reply to Sebastian Zartner from comment #2)
> FWIW Chrome and Opera already support this, so [parity-chrome] and
> [parity-opera] should be added to the whiteboard info.
> 
> Sebastian

Done.

In the future, you're welcome to add that info yourself.
Whiteboard: [parity-chrome] [parity-opera]
(In reply to Gordon P. Hemsley [:GPHemsley] from comment #3)
> (In reply to Sebastian Zartner from comment #2)
> > FWIW Chrome and Opera already support this, so [parity-chrome] and
> > [parity-opera] should be added to the whiteboard info.
> > 
> > Sebastian
> 
> Done.
> 
> In the future, you're welcome to add that info yourself.

I don't have the rights for that, otherwise I would have done it.

Sebastian
Assignee: general → nobody
(Assignee)

Comment 5

3 years ago
Posted patch caseFirst.patch (obsolete) — Splinter Review
Instead of querying ICU to determine which locales use caseFirst=upper, I've simply hard-coded the three (well, actually only two because Old Church Slavonic isn't supported by UCollator *shock*) locales in Intl.js. In a follow-up bug we can extend make_intl_data.py to read the collation files from CLDR.
Assignee: nobody → andrebargull
Status: NEW → ASSIGNED
Attachment #8792577 - Flags: review?(jwalden+bmo)

Comment 6

3 years ago
> ICU does not provide an API to directly determine the default setting of this option per locale 

I'm not sure what was tried, but this is easy to determine. I'll attach some sample code. Please do NOT hard code this.

Comment 7

3 years ago
Posted file caseQuery.cpp
Attachment #8794911 - Attachment mime type: text/x-c++src → text/plain
(Assignee)

Comment 8

2 years ago
Posted patch bug866473.patch (obsolete) — Splinter Review
Updated patch to apply cleanly on inbound and also incorporated the suggestion from comment #6.
Attachment #8792577 - Attachment is obsolete: true
Attachment #8792577 - Flags: review?(jwalden+bmo)
Attachment #8822696 - Flags: review?(jwalden+bmo)
Comment on attachment 8822696 [details] [diff] [review]
bug866473.patch

Review of attachment 8822696 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/builtin/Intl.cpp
@@ +15,1 @@
>  #include "mozilla/Casting.h"

Not alphabetical, won't build on tbpl.

::: js/src/builtin/Intl.js
@@ +1637,5 @@
>  };
>  
>  
> +/**
> + * Returns the default caseFirst values for the given locale and usage.

Maybe note the first element is the default?

::: js/src/tests/Intl/Collator/caseFirst.js
@@ +7,5 @@
> +// Locales which use caseFirst=off for the standard (sort) collation type.
> +const defaultLocales = Intl.Collator.supportedLocalesOf(["en", "de", "es", "sv", "ar", "zh", "ja"]);
> +
> +// Locales which use caseFirst=upper for the standard (sort) collation type.
> +const upperFirstLocales = Intl.Collator.supportedLocalesOf(["cu", "da", "mt"]);

Hmm.  So no locales affirmatively have caseFirst=lower?  That's consistent with the patch, just I guess a little odd.

@@ +135,5 @@
> +    assertEq(colKfLower.resolvedOptions().caseFirst, "lower");
> +    assertEq(colKfLower.compare("A", "a"), 1);
> +
> +    assertEq(colKfUpper.resolvedOptions().caseFirst, "upper");
> +    assertEq(colKfUpper.compare("A", "a"), -1);

Why not check the ("a", "A") behavior as well?

@@ +139,5 @@
> +    assertEq(colKfUpper.compare("A", "a"), -1);
> +}
> +
> +
> +// caseFirst set through options value.

Why don't you have a test of caseFirst (attempted to be) set *both* through a Unicode extension tag and through an options value?  I'm out of practice reading ECMA-402, but https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Collator suggests that the "caseFirst" property supersedes the Unicode locale extension if both are present, and I'd expect that hasn't changed since that was written.  We definitely should have testing of that.

@@ +152,5 @@
> +    assertEq(colKfLower.resolvedOptions().caseFirst, "lower");
> +    assertEq(colKfLower.compare("A", "a"), 1);
> +
> +    assertEq(colKfUpper.resolvedOptions().caseFirst, "upper");
> +    assertEq(colKfUpper.compare("A", "a"), -1);

And ("a", "A") here as well.
Attachment #8822696 - Flags: review?(jwalden+bmo) → review+
(Assignee)

Comment 10

2 years ago
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #9)
> ::: js/src/tests/Intl/Collator/caseFirst.js
> @@ +7,5 @@
> > +// Locales which use caseFirst=off for the standard (sort) collation type.
> > +const defaultLocales = Intl.Collator.supportedLocalesOf(["en", "de", "es", "sv", "ar", "zh", "ja"]);
> > +
> > +// Locales which use caseFirst=upper for the standard (sort) collation type.
> > +const upperFirstLocales = Intl.Collator.supportedLocalesOf(["cu", "da", "mt"]);
> 
> Hmm.  So no locales affirmatively have caseFirst=lower?  That's consistent
> with the patch, just I guess a little odd.

Yes, caseFirst=lower isn't used by any locale. According to the ICU user guide [1], caseFirst=lower isn't generally used:
> There is almost no difference between the Off and Lowercase_First options in terms of results, so
> typically users will not use Lowercase_First: only Off or Uppercase_First.

[1] http://userguide.icu-project.org/collation/concepts#TOC-Main-References
(Assignee)

Comment 11

2 years ago
Posted patch bug866473.patch (obsolete) — Splinter Review
Updated per review comments, carrying r+.
Attachment #8822696 - Attachment is obsolete: true
Attachment #8842863 - Flags: review+

Comment 13

2 years ago
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c92fa71c097e
Implement caseFirst option in Intl.Collator. r=Waldo
Keywords: checkin-needed
Backed out for hazard failures in Intl.cpp:

https://hg.mozilla.org/integration/mozilla-inbound/rev/5e87b990f4b08e32c23e1377a4393d34ec121553

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=08df0e07f0ba02a27e532b25a0d3415166a6bba1&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=81170612&repo=mozilla-inbound

/home/worker/checkouts/gecko/js/src/builtin/Intl.cpp:1471:64: error: 'ucol_openAvailableLocales' was not declared in this scope
/home/worker/checkouts/gecko/js/src/builtin/Intl.cpp:1497:92: error: 'ucol_getAttribute' was not declared in this scope
Flags: needinfo?(andrebargull)
(Assignee)

Comment 15

2 years ago
Forgot to add the placeholder method definitions when the Intl API is disabled. Carrying r+ from Waldo.
Attachment #8842863 - Attachment is obsolete: true
Flags: needinfo?(andrebargull)
Attachment #8843344 - Flags: review+

Comment 17

2 years ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e818ea427a08
Implement caseFirst option in Intl.Collator. r=Waldo
Keywords: checkin-needed
Backed out for failing testGCGrayMarking:

https://hg.mozilla.org/integration/mozilla-inbound/rev/693e0c6a44e0b221805405eb9058ac461db622a2

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=e818ea427a08740381fc55276b83cd1d4aebe87d&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable&filter-resultStatus=success&filter-searchStr=cgc
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=81525363&repo=mozilla-inbound

[task 2017-03-03T19:18:47.159845Z] TEST-PASS | testGCHeapPostBarriers | ok
[task 2017-03-03T19:18:47.159859Z] testGCGrayMarking
[task 2017-03-03T19:18:47.159885Z] TEST-UNEXPECTED-FAIL | testGCGrayMarking | CHECK failed: IsMarkedBlack(target)
[task 2017-03-03T19:18:47.159902Z] testGCFinalizeCallback
Flags: needinfo?(andrebargull)
(Assignee)

Comment 19

2 years ago
Hi jonco, can you take a look at the failure in testGCGrayMarking? It seems to happen because I'm exporting one additional function to the self-hosting global. For example when I removed the export in SelfHosting.cpp:

---
diff --git a/js/src/builtin/Intl.js b/js/src/builtin/Intl.js
--- a/js/src/builtin/Intl.js
+++ b/js/src/builtin/Intl.js
@@ -1623,6 +1623,6 @@ function collatorCaseFirst(locale, usage
         var actualLocale = BestAvailableLocaleIgnoringDefault(availableLocales, locale);
 
-        if (intl_isUpperCaseFirst(actualLocale))
-            return ["upper", "false", "lower"];
+        // if (intl_isUpperCaseFirst(actualLocale))
+        //     return ["upper", "false", "lower"];
     }
 
diff --git a/js/src/vm/SelfHosting.cpp b/js/src/vm/SelfHosting.cpp
--- a/js/src/vm/SelfHosting.cpp
+++ b/js/src/vm/SelfHosting.cpp
@@ -2618,5 +2618,5 @@ static const JSFunctionSpec intrinsic_fu
     JS_FN("intl_GetLocaleInfo", intl_GetLocaleInfo, 1,0),
     JS_FN("intl_ComputeDisplayNames", intl_ComputeDisplayNames, 3,0),
-    JS_FN("intl_isUpperCaseFirst", intl_isUpperCaseFirst, 1,0),
+    // JS_FN("intl_isUpperCaseFirst", intl_isUpperCaseFirst, 1,0),
     JS_FN("intl_IsValidTimeZoneName", intl_IsValidTimeZoneName, 1,0),
     JS_FN("intl_NumberFormat", intl_NumberFormat, 2,0),
---

the failure in testGCGrayMarking went away. I'm a bit lost how to fix this issue. :-(
Flags: needinfo?(andrebargull) → needinfo?(jcoppeard)
Depends on: 1344454
Sorry about that anba, that was my fault.  Patch coming up in bug 1344454.
Flags: needinfo?(jcoppeard)
(Assignee)

Comment 21

2 years ago
(In reply to Jon Coppeard (:jonco) from comment #20)
> Sorry about that anba, that was my fault.  Patch coming up in bug 1344454.

Thanks for the quick response! :-)
(Assignee)

Comment 22

2 years ago
Let's try to land this again now that bug 1344454 has been fixed.

New try to validate cgc builds succeed: https://treeherder.mozilla.org/#/jobs?repo=try&revision=67d7bb777f77dd2b46af8460da46fc0310770b56
Keywords: checkin-needed

Comment 23

2 years ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/124b0ef2d816
Implement caseFirst option in Intl.Collator. r=Waldo
Keywords: checkin-needed

Comment 24

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/124b0ef2d816
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.