Closed Bug 866473 Opened 11 years ago Closed 7 years ago

Implement caseFirst option in Intl.Collator

Categories

(Core :: JavaScript: Internationalization API, defect)

All
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: mozillabugs, Assigned: anba)

References

Details

(Keywords: dev-doc-complete, Whiteboard: [parity-chrome] [parity-opera])

Attachments

(2 files, 3 obsolete files)

The implementation of Intl.Collator currently does not support the caseFirst option and the associated kf key in the Unicode extension of language tags. caseFirst/kf is an optional feature in the ECMAScript Internationalization API Specification.

The feature is somewhat expensive to implement because ICU does not provide an API to directly determine the default setting of this option per locale. A workaround is to create a UCollator for the locale and examine its attributes.
Mass-moving existing Intl-related bugs to the new Core :: JavaScript: Internationalization API component.

If you think this bug has been moved in error, feel free to move it back to Core :: JavaScript Engine.

[Mass change filter: core-js-intl-api-move]
Component: JavaScript Engine → JavaScript: Internationalization API
FWIW Chrome and Opera already support this, so [parity-chrome] and [parity-opera] should be added to the whiteboard info.

Sebastian
(In reply to Sebastian Zartner from comment #2)
> FWIW Chrome and Opera already support this, so [parity-chrome] and
> [parity-opera] should be added to the whiteboard info.
> 
> Sebastian

Done.

In the future, you're welcome to add that info yourself.
Whiteboard: [parity-chrome] [parity-opera]
(In reply to Gordon P. Hemsley [:GPHemsley] from comment #3)
> (In reply to Sebastian Zartner from comment #2)
> > FWIW Chrome and Opera already support this, so [parity-chrome] and
> > [parity-opera] should be added to the whiteboard info.
> > 
> > Sebastian
> 
> Done.
> 
> In the future, you're welcome to add that info yourself.

I don't have the rights for that, otherwise I would have done it.

Sebastian
Assignee: general → nobody
Attached patch caseFirst.patch (obsolete) — Splinter Review
Instead of querying ICU to determine which locales use caseFirst=upper, I've simply hard-coded the three (well, actually only two because Old Church Slavonic isn't supported by UCollator *shock*) locales in Intl.js. In a follow-up bug we can extend make_intl_data.py to read the collation files from CLDR.
Assignee: nobody → andrebargull
Status: NEW → ASSIGNED
Attachment #8792577 - Flags: review?(jwalden+bmo)
> ICU does not provide an API to directly determine the default setting of this option per locale 

I'm not sure what was tried, but this is easy to determine. I'll attach some sample code. Please do NOT hard code this.
Attached file caseQuery.cpp
Attachment #8794911 - Attachment mime type: text/x-c++src → text/plain
Attached patch bug866473.patch (obsolete) — Splinter Review
Updated patch to apply cleanly on inbound and also incorporated the suggestion from comment #6.
Attachment #8792577 - Attachment is obsolete: true
Attachment #8792577 - Flags: review?(jwalden+bmo)
Attachment #8822696 - Flags: review?(jwalden+bmo)
Comment on attachment 8822696 [details] [diff] [review]
bug866473.patch

Review of attachment 8822696 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/builtin/Intl.cpp
@@ +15,1 @@
>  #include "mozilla/Casting.h"

Not alphabetical, won't build on tbpl.

::: js/src/builtin/Intl.js
@@ +1637,5 @@
>  };
>  
>  
> +/**
> + * Returns the default caseFirst values for the given locale and usage.

Maybe note the first element is the default?

::: js/src/tests/Intl/Collator/caseFirst.js
@@ +7,5 @@
> +// Locales which use caseFirst=off for the standard (sort) collation type.
> +const defaultLocales = Intl.Collator.supportedLocalesOf(["en", "de", "es", "sv", "ar", "zh", "ja"]);
> +
> +// Locales which use caseFirst=upper for the standard (sort) collation type.
> +const upperFirstLocales = Intl.Collator.supportedLocalesOf(["cu", "da", "mt"]);

Hmm.  So no locales affirmatively have caseFirst=lower?  That's consistent with the patch, just I guess a little odd.

@@ +135,5 @@
> +    assertEq(colKfLower.resolvedOptions().caseFirst, "lower");
> +    assertEq(colKfLower.compare("A", "a"), 1);
> +
> +    assertEq(colKfUpper.resolvedOptions().caseFirst, "upper");
> +    assertEq(colKfUpper.compare("A", "a"), -1);

Why not check the ("a", "A") behavior as well?

@@ +139,5 @@
> +    assertEq(colKfUpper.compare("A", "a"), -1);
> +}
> +
> +
> +// caseFirst set through options value.

Why don't you have a test of caseFirst (attempted to be) set *both* through a Unicode extension tag and through an options value?  I'm out of practice reading ECMA-402, but https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Collator suggests that the "caseFirst" property supersedes the Unicode locale extension if both are present, and I'd expect that hasn't changed since that was written.  We definitely should have testing of that.

@@ +152,5 @@
> +    assertEq(colKfLower.resolvedOptions().caseFirst, "lower");
> +    assertEq(colKfLower.compare("A", "a"), 1);
> +
> +    assertEq(colKfUpper.resolvedOptions().caseFirst, "upper");
> +    assertEq(colKfUpper.compare("A", "a"), -1);

And ("a", "A") here as well.
Attachment #8822696 - Flags: review?(jwalden+bmo) → review+
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #9)
> ::: js/src/tests/Intl/Collator/caseFirst.js
> @@ +7,5 @@
> > +// Locales which use caseFirst=off for the standard (sort) collation type.
> > +const defaultLocales = Intl.Collator.supportedLocalesOf(["en", "de", "es", "sv", "ar", "zh", "ja"]);
> > +
> > +// Locales which use caseFirst=upper for the standard (sort) collation type.
> > +const upperFirstLocales = Intl.Collator.supportedLocalesOf(["cu", "da", "mt"]);
> 
> Hmm.  So no locales affirmatively have caseFirst=lower?  That's consistent
> with the patch, just I guess a little odd.

Yes, caseFirst=lower isn't used by any locale. According to the ICU user guide [1], caseFirst=lower isn't generally used:
> There is almost no difference between the Off and Lowercase_First options in terms of results, so
> typically users will not use Lowercase_First: only Off or Uppercase_First.

[1] http://userguide.icu-project.org/collation/concepts#TOC-Main-References
Attached patch bug866473.patch (obsolete) — Splinter Review
Updated per review comments, carrying r+.
Attachment #8822696 - Attachment is obsolete: true
Attachment #8842863 - Flags: review+
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c92fa71c097e
Implement caseFirst option in Intl.Collator. r=Waldo
Keywords: checkin-needed
Backed out for hazard failures in Intl.cpp:

https://hg.mozilla.org/integration/mozilla-inbound/rev/5e87b990f4b08e32c23e1377a4393d34ec121553

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=08df0e07f0ba02a27e532b25a0d3415166a6bba1&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=81170612&repo=mozilla-inbound

/home/worker/checkouts/gecko/js/src/builtin/Intl.cpp:1471:64: error: 'ucol_openAvailableLocales' was not declared in this scope
/home/worker/checkouts/gecko/js/src/builtin/Intl.cpp:1497:92: error: 'ucol_getAttribute' was not declared in this scope
Flags: needinfo?(andrebargull)
Attached patch bug866473.patchSplinter Review
Forgot to add the placeholder method definitions when the Intl API is disabled. Carrying r+ from Waldo.
Attachment #8842863 - Attachment is obsolete: true
Flags: needinfo?(andrebargull)
Attachment #8843344 - Flags: review+
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e818ea427a08
Implement caseFirst option in Intl.Collator. r=Waldo
Keywords: checkin-needed
Backed out for failing testGCGrayMarking:

https://hg.mozilla.org/integration/mozilla-inbound/rev/693e0c6a44e0b221805405eb9058ac461db622a2

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=e818ea427a08740381fc55276b83cd1d4aebe87d&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable&filter-resultStatus=success&filter-searchStr=cgc
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=81525363&repo=mozilla-inbound

[task 2017-03-03T19:18:47.159845Z] TEST-PASS | testGCHeapPostBarriers | ok
[task 2017-03-03T19:18:47.159859Z] testGCGrayMarking
[task 2017-03-03T19:18:47.159885Z] TEST-UNEXPECTED-FAIL | testGCGrayMarking | CHECK failed: IsMarkedBlack(target)
[task 2017-03-03T19:18:47.159902Z] testGCFinalizeCallback
Flags: needinfo?(andrebargull)
Hi jonco, can you take a look at the failure in testGCGrayMarking? It seems to happen because I'm exporting one additional function to the self-hosting global. For example when I removed the export in SelfHosting.cpp:

---
diff --git a/js/src/builtin/Intl.js b/js/src/builtin/Intl.js
--- a/js/src/builtin/Intl.js
+++ b/js/src/builtin/Intl.js
@@ -1623,6 +1623,6 @@ function collatorCaseFirst(locale, usage
         var actualLocale = BestAvailableLocaleIgnoringDefault(availableLocales, locale);
 
-        if (intl_isUpperCaseFirst(actualLocale))
-            return ["upper", "false", "lower"];
+        // if (intl_isUpperCaseFirst(actualLocale))
+        //     return ["upper", "false", "lower"];
     }
 
diff --git a/js/src/vm/SelfHosting.cpp b/js/src/vm/SelfHosting.cpp
--- a/js/src/vm/SelfHosting.cpp
+++ b/js/src/vm/SelfHosting.cpp
@@ -2618,5 +2618,5 @@ static const JSFunctionSpec intrinsic_fu
     JS_FN("intl_GetLocaleInfo", intl_GetLocaleInfo, 1,0),
     JS_FN("intl_ComputeDisplayNames", intl_ComputeDisplayNames, 3,0),
-    JS_FN("intl_isUpperCaseFirst", intl_isUpperCaseFirst, 1,0),
+    // JS_FN("intl_isUpperCaseFirst", intl_isUpperCaseFirst, 1,0),
     JS_FN("intl_IsValidTimeZoneName", intl_IsValidTimeZoneName, 1,0),
     JS_FN("intl_NumberFormat", intl_NumberFormat, 2,0),
---

the failure in testGCGrayMarking went away. I'm a bit lost how to fix this issue. :-(
Flags: needinfo?(andrebargull) → needinfo?(jcoppeard)
Depends on: 1344454
Sorry about that anba, that was my fault.  Patch coming up in bug 1344454.
Flags: needinfo?(jcoppeard)
(In reply to Jon Coppeard (:jonco) from comment #20)
> Sorry about that anba, that was my fault.  Patch coming up in bug 1344454.

Thanks for the quick response! :-)
Let's try to land this again now that bug 1344454 has been fixed.

New try to validate cgc builds succeed: https://treeherder.mozilla.org/#/jobs?repo=try&revision=67d7bb777f77dd2b46af8460da46fc0310770b56
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/124b0ef2d816
Implement caseFirst option in Intl.Collator. r=Waldo
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/124b0ef2d816
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.