Update our in-tree ICU to 64 and update to Unicode 12
Categories
(Core :: JavaScript: Internationalization API, enhancement)
Tracking
()
People
(Reporter: anba, Assigned: anba)
References
Details
Attachments
(11 files)
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
10.00 MB,
application/octet-stream
|
anba
:
review+
Waldo
:
feedback+
|
Details |
5.60 MB,
application/octet-stream
|
anba
:
review+
Waldo
:
feedback+
|
Details |
From http://site.icu-project.org/download/64:
ICU 64 updates to Unicode 12 and to CLDR 35 locale data with many additions and corrections, and some new languages. ICU adds a data filtering/subsetting mechanism, improved formatting API, and a C++ LocaleBuilder.
That means we need to combine the ICU update with the Unicode update.
Try with RC2:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d93a45589082049ec140fd5201d1c918125c7342
Mostly green except one WPT test which needs to be updated.
Comment 1•6 years ago
|
||
Thank you Andre! This update will also give us new APIs for RelativeTimeFormat.formatToParts IIRC. Anything else for ECMA402?
Assignee | ||
Comment 2•6 years ago
|
||
Anything else for ECMA402?
From http://site.icu-project.org/download/64:
- Unicode and CLDR updates
ulistfmt_formatStringsToResult
,ulistfmt_resultAsValue
, andufmtval_nextPosition
are maybe useful forIntl.ListFormat.formatToParts
? (But I haven't yet checked what these functions actually do, currently I'm simply guessing what they might do based on their names...)- And as you said,
ureldatefmt_formatToResult
,ureldatefmt_resultAsValue
, andufmtval_nextPosition
are possibly the missing functions needed forIntl.RelativeTimeFormat.formatToParts
.
Apart from that, the update also includes fixes for multiple OOM issues (bug 1495244, bug 1431142). And the new data filter builder could be used to further trim icudtxxl.dat
by about 500KB (unzipped) resp. ~50KB (zipped), when we remove support for "confusables", "stringprep", "unames", and "cnvalias" (the last one requires UCONFIG_NO_CONVERSION
to be set). The data for these features isn't used for Intl
and I guess it's also not necessary for other ICU uses in the tree, but I still need to check that with :m_kato (probably?).
Assignee | ||
Comment 3•6 years ago
|
||
New Try-build testing all platforms/tests: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5a679ea6cffafc33226939c0844047aad8a5d258
Seems to look pretty good!
Assignee | ||
Comment 4•6 years ago
|
||
We're setting LC_ALL to "en_US.UTF-8" in js/src/tests/lib/tests.py, but somehow
this doesn't always work, for example on the Try-servers the default locale is
set to "und", but this effect wasn't reproducible locally. Switch to ICU's
default locale function which contains multiple fallbacks to ensure LC_ALL is
honoured as expected.
Assignee | ||
Comment 5•6 years ago
|
||
Depends on D25261
Assignee | ||
Comment 6•6 years ago
|
||
- Local patches for bug 1504656 and bug 1513934 are now present upstream.
- Patch for bug 1172609 needed to be rebased to apply cleanly.
- Update patch for bug 1172609 to follow new static mutex handling in ICU.
Depends on D25262
Assignee | ||
Comment 7•6 years ago
|
||
- ICU no longer supports removing resource files manually, instead the new data
filter builder needs to be used. - This includes test data, which also must be preserved in ICU 64.
- umutex.h contains an implicit call to
new
in ICU64 which needs to be allowed
in check_vanilla_allocations.py. - Drive-by: Make ICU with multiple jobs for faster rebuilds.
Depends on D25263
Assignee | ||
Comment 8•6 years ago
|
||
Assignee | ||
Comment 9•6 years ago
|
||
Depends on D25265
Assignee | ||
Comment 10•6 years ago
|
||
Unicode 12 changes the character property of U+166D (CANADIAN SYLLABICS CHI SIGN)
from Po (Punctuation, other) to So (Symbol, other), therefore replace U+166D with
U+166E (CANADIAN SYLLABICS FULL STOP).
Depends on D25267
Assignee | ||
Comment 11•6 years ago
|
||
- Add markers to generated files to discourage manual edits.
- Add generated file to clang-format-ignore.
- Handle currently not supported Georgian characters in the Perl script.
Depends on D25268
Assignee | ||
Comment 12•6 years ago
|
||
Depends on D25269
Assignee | ||
Comment 13•6 years ago
|
||
The actual patch file is too large (84.831.406 bytes unzipped, 16.352.610 bytes zipped), so I can neither upload it through moz-phab
, nor arc
(even when --less-context
is used), nor uploading it to Bugzilla directly (Bugzilla has a 10MB limit). So I had to split it into two zip-files. Alternatively it's also uploaded to https://depot.tu-dortmund.de/yxrd4 (valid for 28 days).
And the feedback request is actually a review request, but without being able to set r? we need to get creative. :-)
Assignee | ||
Comment 14•6 years ago
|
||
Part two of the split zip file.
Assignee | ||
Comment 15•6 years ago
|
||
FWIW, here's the moz-phab
error output:
Creating new revision:
467741:ed5f39250c5e Bug 1533481 - Part 5: Update in-tree ICU to release 64.1. r=jwalden!
Linting...
No lint engine configured for this project.
Running unit tests...
No unit test engine is configured for this project.
Diff for 'intl/icu/source/data/brkitr/dictionaries/cjdict.txt' with
context is 4,504,160 bytes in length. Generally, source changes should
not be this large. If this file is a huge text file, try using the
'--less-context' flag. If the file is not a text file, you can mark it
Exception
The program is attempting to read user input, but stdin is being piped from some other source (not a TTY).
(Run with `--trace` for a full exception trace.)
'binary'. Mark this file as 'binary' and continue? [y/N]
CalledProcessError: Command '['/home/andre/bin/arc', 'diff', '--base', 'arc:this', '--allow-untracked', '--no-amend', '--no-ansi', '--message-file', '/tmp/tmpu_HTeO', '--create']' returned non-zero exit status 1
After that the local repository needed to be fixed manually with hg evolve
to move the changesets back into the correct order.
And the arc diff
output:
andre@VBdev:~/hg/mozilla-inbound$ arc diff --less-context --create
You have a saved revision message in '.hg/arc/create-message'.
You can use this message, or discard it.
Do you want to use this message? [Y/n] y
Linting...
No lint engine configured for this project.
Running unit tests...
No unit test engine is configured for this project.
Diff for 'intl/icu/source/data/brkitr/dictionaries/cjdict.txt' with
context is 4,504,160 bytes in length. Generally, source changes should
not be this large. If the file is not a text file, you can mark it
'binary'. Mark this file as 'binary' and continue? [y/N] y
Diff for 'intl/icu/source/test/testdata/BidiCharacterTest.txt' with
context is 6,968,550 bytes in length. Generally, source changes should
not be this large. If the file is not a text file, you can mark it
'binary'. Mark this file as 'binary' and continue? [y/N] y
Diff for 'intl/icu/source/test/testdata/BidiTest.txt' with context is
8,457,590 bytes in length. Generally, source changes should not be this
large. If the file is not a text file, you can mark it 'binary'. Mark
this file as 'binary' and continue? [y/N] y
mmap() failed: [12] Cannot allocate memory
mmap() failed: [12] Cannot allocate memory
mmap() failed: [12] Cannot allocate memory
Comment 16•6 years ago
|
||
Comment on attachment 9054286 [details]
bug1533481-part5-icu-update.zip.001
rs=me
Comment 17•6 years ago
|
||
Comment on attachment 9054287 [details]
bug1533481-part5-icu-update.zip.002
rs=me
Assignee | ||
Comment 18•6 years ago
|
||
Comment on attachment 9054286 [details]
bug1533481-part5-icu-update.zip.001
Add r+ in addition to f+ per comment #16 for tracking purposes.
Assignee | ||
Comment 19•6 years ago
|
||
Comment on attachment 9054287 [details]
bug1533481-part5-icu-update.zip.002
Add r+ in addition to f+ per comment #17 for tracking purposes.
Updated•6 years ago
|
Assignee | ||
Comment 20•6 years ago
|
||
Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=22228ca898566f7306f415f43f08492a9274ddb4
Parts 1-4 and 6-10 are Phabricator requests, but Part 5 isn't (due to size restrictions in Phabricator). Instead Part 5 had to be uploaded as a split zip-file (bug1533481-part5-icu-update.zip.001
and bug1533481-part5-icu-update.zip.002
) directly to Bugzilla (the zipped patch had to split because of size restrictions in Bugzilla). It is still necessary to apply all parts in the correct order to avoid compiler errors resp. test failures. Sorry for the inconvenience!
Comment 21•6 years ago
|
||
Pushed by nerli@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/eaddff053db4
Part 1: Use ICU to retrieve the system default locale. r=jwalden!
https://hg.mozilla.org/integration/autoland/rev/6acec9d8b4a7
Part 2: Update SpiderMonkey to Unicode 12. r=jwalden!
https://hg.mozilla.org/integration/autoland/rev/de8c3ab271f7
Part 3: Remove no longer needed ICU patches or update to apply cleanly. r=jwalden!
https://hg.mozilla.org/integration/autoland/rev/912fb96c21b3
Part 4: Prepare update to ICU 64. r=jwalden!
https://hg.mozilla.org/integration/autoland/rev/cedfd8518726
Part 5: Update in-tree ICU to release 64.1. r=jwalden!
https://hg.mozilla.org/integration/autoland/rev/347e9b869ceb
Part 6: Update switch to handle new UNumberFormatFields entries. r=jwalden!
https://hg.mozilla.org/integration/autoland/rev/635920ce336f
Part 7: Update expected results after ICU update. r=jwalden!
https://hg.mozilla.org/integration/autoland/rev/cf0206013ba0
Part 8: Remove test for U+166D after update to Unicode 12. r=jfkthame!
https://hg.mozilla.org/integration/autoland/rev/8955c8b9e8b3
Part 9: Update Gecko to Unicode 12. r=jfkthame!
https://hg.mozilla.org/integration/autoland/rev/d10b57a89a73
Part 10: Updating ICU requires a clobber. r=jwalden!
Comment 22•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/eaddff053db4
https://hg.mozilla.org/mozilla-central/rev/6acec9d8b4a7
https://hg.mozilla.org/mozilla-central/rev/de8c3ab271f7
https://hg.mozilla.org/mozilla-central/rev/912fb96c21b3
https://hg.mozilla.org/mozilla-central/rev/cedfd8518726
https://hg.mozilla.org/mozilla-central/rev/347e9b869ceb
https://hg.mozilla.org/mozilla-central/rev/635920ce336f
https://hg.mozilla.org/mozilla-central/rev/cf0206013ba0
https://hg.mozilla.org/mozilla-central/rev/8955c8b9e8b3
https://hg.mozilla.org/mozilla-central/rev/d10b57a89a73
Comment 23•6 years ago
|
||
Coverity found 147 new issues in that release.
Some issues are marked as high.
Andi or myself can provide access.
Comment 24•6 years ago
|
||
Actually, I was mislead by the automation email. Coverity could not correlate the new defects with the old.
We are good, we didn't regress (even if ICU would benefit from some cleanup)
(oh, I didn't realized that they moved to github and that they accept PR \o/)
Updated•6 years ago
|
Assignee | ||
Comment 25•6 years ago
|
||
There also seem to be plans to add Coverity builds directly to ICU, maybe that'll help to reduce the warnings: https://unicode-org.atlassian.net/browse/ICU-20495
Description
•