Closed Bug 1686052 Opened 2 years ago Closed 2 years ago

Update our in-tree ICU to 68.2

Categories

(Core :: JavaScript: Internationalization API, enhancement, P1)

Tracking

RESOLVED FIXED
91 Branch
Tracking Status
firefox86 --- wontfix
firefox91 --- fixed

People

(Reporter: anba, Assigned: anba)

Attachments

(12 files, 1 obsolete file)

Updating to ICU 68 is needed for bug 1653024. It also updates CLDR to version 38 and brings various bug fixes and improvements.

Release notes: http://site.icu-project.org/download/68

I'll upload the changes as draft patches, because I'm not sure who is available for reviewing.

Update the patch from bug 1614941, because CLDR 38 adds its own patterns for
"MMMMd" and "yMMMM", which both conflict with our customised patterns. The
customised patterns were updated per the discussion in
https://bugzilla.mozilla.org/show_bug.cgi?id=1614941#c32.

Remove the patches from bug 1433303 and bug 1534160. Both are no longer needed,
because they've been integrated into upstream.

https://unicode-org.atlassian.net/browse/ICU-10879 added a separate
"sources.txt" file which lists all source files per directory.

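As a rough illustration of how a build script could consume such a file, here is a minimal Python sketch. It assumes one file name per line relative to the directory; the helper name parse_sources_list and the sample input are hypothetical and are not taken from the actual build script.

```python
from pathlib import PurePosixPath


def parse_sources_list(text, directory):
    """Return the source paths listed in a sources.txt-style file.

    Assumption: one file name per line, relative to `directory`.
    Blank lines are skipped.
    """
    names = (line.strip() for line in text.splitlines())
    return [str(PurePosixPath(directory) / name) for name in names if name]


# Illustrative input only; the real file contents may differ.
example = "appendable.cpp\nbmpset.cpp\n\nbrkeng.cpp\n"
print(parse_sources_list(example, "common"))
```

Reading the list from a file like this means the build script no longer has to glob each directory itself.
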
Depends on D101365

Update to ICU 68.2 by running "update-icu.sh" with "maint/maint-68" as the target.

ICU 68.2 ships with tzdata 2020d, so update again to 2020f.

Updates the numbering and measuring unit systems data to CLDR 38 by running
make_intl_data.py numbering and make_intl_data.py units.

Depends on D101391

Updating to CLDR 38 means a couple of format strings have changed, so update the
expected results accordingly.

Depends on D101392

Change the system requirement to ICU 68 in order to remove some conditional
code in the next patch.

Depends on D101393

builtin/TestingFunctions.cpp:
ucal_getHostTimeZone() has been promoted to "stable" in ICU 68, so we can
remove the ifndef U_HIDE_DRAFT_API guard.

intl/ListFormat:
UListFormatter{Type,Width} were also promoted to "stable" in ICU 68. This
allows us to enable the "type" and "style" options for Intl.ListFormat by default.

Depends on D101394

The two ICU bugs (ICU-10220 and ICU-12345) have been resolved, so it's no
longer necessary to call uloc_addLikelySubtags before calling
uloc_minimizeSubtags.

Drive-by change:

  • Replaced strlen with std::char_traits<char>::length.

Depends on D101395

UTS #35 (version 38) overhauled the language tag canonicalisation algorithm
and CLDR 38 also added new alias entries to replace some deprecated language
subtags with a new, preferred form.

Overview of canonicalisation changes:

  • Script aliases must now be processed. (Script alias data was already present
    in CLDR, but the previous canonicalisation algorithm never processed it.)
  • Sign language canonicalisation was added.
  • Grandfathered tags are now handled like any other tag. It is no longer
    required to perform exact matches, but instead individual subtags are
    compared. For example, "art-lojban" and "art-ZZ-lojban" are now
    canonicalised to "jbo" and "jbo-ZZ", respectively.
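
The subtag-wise handling of legacy tags can be sketched in Python as below. This is a toy model with a single hard-coded rule, not the UTS #35 algorithm and not the code from this patch. The function name canonicalize_legacy and the rule table layout are assumptions made for illustration.

```python
def canonicalize_legacy(tag):
    """Toy subtag-wise alias replacement for the single rule art-lojban -> jbo."""
    subtags = tag.lower().split("-")
    language, rest = subtags[0], subtags[1:]

    # Hypothetical rule table: (language, variant) -> replacement language.
    rules = {("art", "lojban"): "jbo"}

    for (rule_lang, rule_variant), replacement in rules.items():
        # Compare individual subtags instead of the whole tag, so other
        # subtags (like a region) may appear between language and variant.
        if language == rule_lang and rule_variant in rest:
            language = replacement
            rest = [s for s in rest if s != rule_variant]

    # Simplification: treat any remaining two-letter subtag as a region
    # and restore its conventional upper-case form.
    rest = [s.upper() if len(s) == 2 and s.isalpha() else s for s in rest]
    return "-".join([language] + rest)


print(canonicalize_legacy("art-lojban"))
print(canonicalize_legacy("art-ZZ-lojban"))
```

An exact-match lookup would only catch the first input; comparing subtags individually also handles the second one.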

Changes in make_intl_data.py:

  • Split writeMappingsBinarySearchBody from writeMappingsBinarySearch so it
    can be used in the new writeSignLanguageMappingsFunction function.
  • writeMappingsBinarySearchBody splits the name parameter into two distinct
    parameters, source_name and target_name. This is also needed for the new
    writeSignLanguageMappingsFunction function.
  • Change writeVariantTagMappings to allow no replacements.
    • For example the alias entry <languageAlias type="und_bokmal" replacement="und">
      removes the "bokmal" variant subtags, but doesn't add any replacement tags.
  • Replace "grandfathered" with "legacy", because UTS and CLDR no longer use
    that term.
  • readSupplementalData was changed to collect all alias rules into a single
    dict. This matches how the new UTS #35 canonicalisation algorithm is
    specified. Later they are split into individual dictionaries for each subtag.

Now that we no longer need to support old Python versions, we can use newer
features like format strings. I've used them exclusively in new code and also
replaced other str.format() calls in functions which were modified in this
patch.
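
For readers unfamiliar with the two styles, a small Python comparison follows. The variable values and the generated string are made up for illustration and do not come from make_intl_data.py.

```python
# Illustrative values only.
source_name = "language"
target_name = "script"

# Older style using str.format().
old_style = "bool mapping_{}_to_{}();".format(source_name, target_name)

# Equivalent format string (f-string), available since Python 3.6.
new_style = f"bool mapping_{source_name}_to_{target_name}();"

print(old_style == new_style)
print(new_style)
```

Both forms produce the same string; the f-string keeps each name next to the place where it is substituted.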

Depends on D101396

Attached file: Bug 1686052 - Part 12: Update test262. (obsolete)

Depends on D101399

I've been distracted, but poke me on these when they're ready for a review and I'll hop on them.

Attachment #9196397 - Attachment description: Bug 1686052 - Part 1: Update or remove ICU patches. → Bug 1686052 - Part 1: Update or remove ICU patches. r=jwalden!
Attachment #9196398 - Attachment description: Bug 1686052 - Part 2: Update ICU build script to use new "sources.txt" file. → Bug 1686052 - Part 2: Update ICU build script to use new "sources.txt" file. r=jwalden!
Attachment #9196456 - Attachment description: Bug 1686052 - Part 4: Update tzdata in ICU data files to 2020f. → Bug 1686052 - Part 4: Update tzdata in ICU data files to 2020f. r=jwalden!
Attachment #9196457 - Attachment description: Bug 1686052 - Part 5: Update numbering and measuring unit systems. → Bug 1686052 - Part 5: Update numbering and measuring unit systems. r=jwalden!

Sure. Here you are! :-)

Attachment #9196458 - Attachment description: Bug 1686052 - Part 6: Update expected tests results. → Bug 1686052 - Part 6: Update expected tests results. r=jwalden!
Attachment #9196459 - Attachment description: Bug 1686052 - Part 7: Bump minimum required ICU version to 68.2. → Bug 1686052 - Part 7: Bump minimum required ICU version to 68.2. r=jwalden!
Attachment #9196460 - Attachment description: Bug 1686052 - Part 8: Remove guards around previous ICU draft APIs. → Bug 1686052 - Part 8: Remove guards around previous ICU draft APIs. r=jwalden!
Attachment #9196461 - Attachment description: Bug 1686052 - Part 9: Remove no longer needed locale maximization when removing likely subtags. → Bug 1686052 - Part 9: Remove no longer needed locale maximization when removing likely subtags. r=jwalden!
Attachment #9196463 - Attachment description: Bug 1686052 - Part 10: Use new language tag canonicalisation algorithm. → Bug 1686052 - Part 10: Use new language tag canonicalisation algorithm. r=jwalden!
Attachment #9196464 - Attachment description: Bug 1686052 - Part 11: Updating ICU requires a clobber. → Bug 1686052 - Part 11: Updating ICU requires a clobber. r=jwalden!
Attachment #9196466 - Attachment description: Bug 1686052 - Part 13: Update test262 exclusions. → Bug 1686052 - Part 12: Update test262 exclusions. r=jwalden!
Severity: -- → N/A
Priority: -- → P1
Summary: Update our in-tree ICU to 68 → Update our in-tree ICU to 68.2

Does ICU 68.2 include a fix for ICU-21465? If not, please cherry-pick the fix. Otherwise we will repeat Chromium issue 1168528.

ICU-21465 was backported after the 68.2 release, and the 68.2 fix list does not contain ICU-21465 either.

Flags: needinfo?(andrebargull)

I think generally we run this from the maint/maint-xx branch. The patches here run up to the December 16 change. So, part 3 will need an update for this. (And perhaps a subsequent part or two will need additional tweaking too, not 100% certain.) I'm sure we can get to that before landing all this, once I review the last part...

Blocks: 1697729

Still going through the last patch here. It is, of course, the most complicated of them -- and I want to be sure I'm not missing anything in the logic -- so it's slow going. Hoping for this week, tho.

Blocks: 1678437
Attachment #9196465 - Attachment is obsolete: true
Attachment #9196453 - Attachment description: Bug 1686052 - Part 3: Update in-tree ICU to release 68.2. → Bug 1686052 - Part 3: Update in-tree ICU to release 68.2. r=jwalden!
Attachment #9196456 - Attachment description: Bug 1686052 - Part 4: Update tzdata in ICU data files to 2020f. r=jwalden! → Bug 1686052 - Part 4: Update tzdata in ICU data files to 2021a. r=jwalden!
Attachment #9196463 - Attachment description: Bug 1686052 - Part 10: Use new language tag canonicalisation algorithm. r=jwalden! → Bug 1686052 - Part 10: Use new language tag canonicalisation algorithm. r=tcampbell!
Pushed by andre.bargull@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/a0b54333e664
Part 1: Update or remove ICU patches. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/dfcf9f05fd75
Part 2: Update ICU build script to use new "sources.txt" file. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/be466ae533b6
Part 3: Update in-tree ICU to release 68.2. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/dfddac0a5a30
Part 4: Update tzdata in ICU data files to 2021a. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/b7afcd23f704
Part 5: Update numbering and measuring unit systems. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/59f05c0546d8
Part 6: Update expected tests results. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/f44ca702dcf3
Part 7: Bump minimum required ICU version to 68.2. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/2ff57c5c3cce
Part 8: Remove guards around previous ICU draft APIs. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/bb0fccdb6e48
Part 9: Remove no longer needed locale maximization when removing likely subtags. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/cd3cae9feddb
Part 10: Use new language tag canonicalisation algorithm. r=tcampbell
https://hg.mozilla.org/integration/autoland/rev/5acae134bbb8
Part 11: Updating ICU requires a clobber. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/61484a56d30a
Part 12: Update test262 exclusions. r=jwalden

(In reply to Masatoshi Kimura [:emk] from comment #16)

> Does ICU 68.2 include a fix for ICU-21465? If not, please cherry-pick the fix. Otherwise we will repeat Chromium issue 1168528.

The patches were all updated to the current ICU 68 maintenance release. Thanks for the heads-up!

Flags: needinfo?(andrebargull)