Closed Bug 1353650 Opened 7 years ago Closed 7 years ago

Update our in-tree ICU to 59

Categories

(Core :: JavaScript: Internationalization API, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox55 --- wontfix
firefox56 --- fixed

People

(Reporter: anba, Assigned: anba)

References

Details

Attachments

(13 files, 5 obsolete files)

892 bytes, patch
Waldo
: review+
Details | Diff | Splinter Review
1.14 KB, patch
jfkthame
: review+
jbeich
: feedback-
Details | Diff | Splinter Review
2.01 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
997 bytes, patch
Waldo
: review+
Details | Diff | Splinter Review
6.30 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
6.10 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
2.19 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
16.39 KB, patch
anba
: review+
Details | Diff | Splinter Review
4.08 KB, patch
anba
: review+
Details | Diff | Splinter Review
15.53 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
4.93 MB, application/x-zip-compressed
Waldo
: review+
Details
775 bytes, patch
glandium
: review+
Details | Diff | Splinter Review
835 bytes, patch
anba
: review+
Details | Diff | Splinter Review
ICU 59 is scheduled for release this month.

Important changes:
- Windows XP and Windows Vista are no longer supported.
- char16_t is the default type for UChar
- Includes bidi data files from Unicode 10 beta.
- Includes segmentation data files from Unicode 10 beta

The Unicode 10 changes may require coordination with the rest of the platform.
Four patches are no longer needed and one needed to be updated.

intl/icu-patches/bug-1228227-bug-1263325-libc++-gcc_hidden.diff
- Fixed in http://bugs.icu-project.org/trac/ticket/12023

intl/icu-patches/bug-1325858-close-key.diff
- Fixed in http://bugs.icu-project.org/trac/ticket/12908

intl/icu-patches/ucol_getKeywordValuesForLocale-ulist_resetList.diff
- Fixed in https://ssl.icu-project.org/trac/ticket/12827

intl/icu-patches/unum_formatDoubleForFields.diff
- Fixed in https://ssl.icu-project.org/trac/ticket/12684
Attachment #8854902 - Flags: review?(jwalden+bmo)
ICU renamed all *.c files to *.cpp in ICU59.
Attachment #8854903 - Flags: review?(jwalden+bmo)
Applies https://github.com/googlei18n/sfntly/pull/73 to our local copy of sfntly.
Attachment #8854904 - Flags: review?(jfkthame)
Attached file bug1353650-part4-icu59.zip (obsolete) —
Updates to the ICU59 release candidate (https://sourceforge.net/p/icu/mailman/message/35762202/).
UTC and GMT are now two separate time zones, adjust the test case accordingly. 

http://blog.unicode.org/2017/03/cldr-version-31-released.html
Attachment #8854910 - Flags: review?(jwalden+bmo)
Set the minimum version for the system-ICU option to ICU59.

This is needed, because we're going to use ICU59 API (see next patches) and because of the UChar changes in ICU59.
Attachment #8854912 - Flags: review?(jwalden+bmo)
UChar now defaults to char16_t on all platforms, so I think we can remove the reinterpret_cast when converting between both types. UCHAR_TYPE is currently still configurable on the command-line, but ICU plans to remove this switch in the future.
Attachment #8854913 - Flags: review?(jwalden+bmo)
unum_formatDoubleForFields is now always present. :-)
Attachment #8854914 - Flags: review?(jwalden+bmo)
uplrules_getKeywords was added to the C API, so we can remove the C++ API calls.
Attachment #8854916 - Flags: review?(jwalden+bmo)
Replace uplrules_select with the new uplrules_selectWithFormat, so we can use the correct number formatting for Intl.PluralRules.

Also added a test case which fails with the previous implementation and now passes.
Attachment #8854917 - Flags: review?(jwalden+bmo)
UDAT_AM_PM_MIDNIGHT_NOON_FIELD and UDAT_FLEXIBLE_DAY_PERIOD_FIELD are no longer draft APIs.

(UNUM_FIELD_COUNT and UDAT_FIELD_COUNT were already deprecated in ICU58, but I forgot to update the switch-cases back then.)
Attachment #8854918 - Flags: review?(jwalden+bmo)
Attachment #8854904 - Flags: review?(jfkthame) → review+
Attachment #8854902 - Flags: review?(jwalden+bmo) → review+
Attachment #8854903 - Flags: review?(jwalden+bmo) → review+
Attachment #8854910 - Flags: review?(jwalden+bmo) → review+
Attachment #8854912 - Flags: review?(jwalden+bmo) → review+
Attachment #8854913 - Flags: review?(jwalden+bmo) → review+
Attachment #8854914 - Flags: review?(jwalden+bmo) → review+
Attachment #8854916 - Flags: review?(jwalden+bmo) → review+
Attachment #8854917 - Flags: review?(jwalden+bmo) → review+
Attachment #8854918 - Flags: review?(jwalden+bmo) → review+
This is a (semi-)automated bug making you aware that there is an available upgrade for an embedded third-party library. You can leave this bug open, and it will be updated if a newer version of the library becomes available. If you close it as WONTFIX, please indicate if you do not wish to receive any future bugs upon new releases of the library.

icu is currently at version 58.2 in mozilla-central, and the latest version of the library released is 59.1.

I fetched the latest version of the library from http://site.icu-project.org/download/.
@m_kato and jfkthame:
ICU 59 updates some Unicode data (bidi data files and segmentation data files) to Unicode 10 beta, but otherwise continues to use Unicode 9. Do we need to coordinate this partial update to Unicode 10 with the other Unicode data in Gecko? (Just asking so I don't repeat the issues with the ICU 58 update - bug 1299615, comment #30.)
Flags: needinfo?(m_kato)
Flags: needinfo?(jfkthame)
Update part 7 to apply cleanly on inbound, carrying r+.
Attachment #8854913 - Attachment is obsolete: true
Attachment #8859673 - Flags: review+
Update part 9 to apply cleanly on inbound, carrying r+.
Attachment #8854916 - Attachment is obsolete: true
Attachment #8859674 - Flags: review+
Part 1 needed to be updated to add a workaround for https://ssl.icu-project.org/trac/ticket/13030:
Added intl/icu-patches/u_setMemoryFunctions-callconvention-anachronism-msvc.diff and updated intl/update-icu.sh to apply the new patch.
Attachment #8854902 - Attachment is obsolete: true
Attachment #8859675 - Flags: review?(jwalden+bmo)
The actual update to ICU 59.1.
Attachment #8854909 - Attachment is obsolete: true
Attachment #8859676 - Flags: review?(jwalden+bmo)
Attachment #8859675 - Flags: review?(jwalden+bmo) → review+
Attachment #8859676 - Flags: review?(jwalden+bmo) → review+
(In reply to André Bargull from comment #14)
> @m_kato and jfkthame:
> ICU 59 updates some Unicode data (bidi data files and segmentation data
> files) to Unicode 10 beta, but otherwise continues to use Unicode 9. Do we
> need to coordinate this partial update to Unicode 10 with the other Unicode
> data in Gecko? (Just asking so I don't repeat the issues with the ICU 58
> update - bug 1299615, comment #30.)

Although Android don't still use ICU (Fennec/arm Nightly turns on ICU, but others such as beta and release turn off it now), we might have to keep it since it isn't 10 stable.  If all ICU's data changes to 10 beta, we should use same data on our table.
Flags: needinfo?(m_kato)
Comment on attachment 8854904 [details] [diff] [review]
bug1353650-part3-includes-sfntly.patch

Review of attachment 8854904 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/sfntly/cpp/src/sample/chromium/subsetter_impl.cc
@@ +22,5 @@
>  #include <iterator>
>  #include <map>
>  #include <set>
>  
> +#include <unicode/unistr.h>

Can you add the file to config/system-headers? Otherwise, it breaks --with-system-icu build.

../../gfx/sfntly/cpp/src/Unified_cpp_gfx_sfntly_cpp_src0.o: In function `sfntly::SubsetterImpl::LoadFont(char const*, unsigned char const*, unsigned long)':
objdir/gfx/sfntly/cpp/src/Unified_cpp_gfx_sfntly_cpp_src0.cpp:(.text._ZN6sfntly13SubsetterImpl8LoadFontEPKcPKhm+0x218): undefined reference to `icu::UnicodeString::~UnicodeString()'
/usr/bin/ld: ../../gfx/sfntly/cpp/src/Unified_cpp_gfx_sfntly_cpp_src0.o: relocation R_X86_64_PC32 against `_ZN3icu13UnicodeStringD1Ev' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
Attachment #8854904 - Flags: feedback-
Blocks: 1360131
(In reply to Makoto Kato [:m_kato] (PTO: 4/30-5/5) from comment #19)
> (In reply to André Bargull from comment #14)
> > @m_kato and jfkthame:
> > ICU 59 updates some Unicode data (bidi data files and segmentation data
> > files) to Unicode 10 beta, but otherwise continues to use Unicode 9. Do we
> > need to coordinate this partial update to Unicode 10 with the other Unicode
> > data in Gecko? (Just asking so I don't repeat the issues with the ICU 58
> > update - bug 1299615, comment #30.)
> 
> Although Android don't still use ICU (Fennec/arm Nightly turns on ICU, but
> others such as beta and release turn off it now), we might have to keep it
> since it isn't 10 stable.  If all ICU's data changes to 10 beta, we should
> use same data on our table.

Hmm, I don't understand the part about "[...] we might have to keep it [...]". What exactly do you mean by "it"? Do you mean we have to keep ICU 58 because ICU 59 contains Unicode 9 and Unicode 10 beta files?

I did a bit of digging in the ICU sources and based on this file (http://source.icu-project.org/repos/icu/tags/release-59-1/icu4c/source/data/unidata/changes.txt), ICU 59 uses the following Unicode 10 beta files:
  BidiBrackets.txt
  BidiCharacterTest.txt
  BidiMirroring.txt
  BidiTest.txt
  extracted/DerivedBidiClass.txt
  LineBreak.txt
  auxiliary/*

Based on http://searchfox.org/mozilla-central/source/intl/unicharutil/tools/genUnicodePropertyData.pl, we also seem to read
  BidiBrackets.txt
  BidiMirroring.txt
  LineBreak.txt


So there could be some conflict when we use our tables, but also ICU methods, because these three files are from different Unicode versions. For example in bug 1281448, comment #2, :jfkthame mentioned we use ICU constants for line breaks, so having different LineBreak.txt files could be problematic. Can anyone confirm this?
NI :m_kato for questions raised in comment #21.
Flags: needinfo?(m_kato)
My concern is that Fennec/Android doesn't still use ICU on beta and release channel.  If BIDI rendering is different on w/ ICU vs w/o ICU, we should pay attention to update it.  As long as I check Rev 36 of UAX #9, it has a update of Section 2.7.  If :jfktheme agrees updating ICU (including Rev 36 of UAX #9), I have no objection since he is owner of text rendering.

And we don't use ICU's LineBreaker.  So we can ignore LineBreak.txt on ICU.
Flags: needinfo?(m_kato)
(In reply to Makoto Kato [:m_kato] from comment #23)
> My concern is that Fennec/Android doesn't still use ICU on beta and release
> channel.  If BIDI rendering is different on w/ ICU vs w/o ICU, we should pay
> attention to update it.

Ah okay, understand. Thanks for the explanation! :-)

> If :jfktheme agrees updating ICU (including Rev 36
> of UAX #9), I have no objection since he is owner of text rendering.

Then I'll wait for :jfktheme's answer.
(In reply to Makoto Kato [:m_kato] from comment #23)
> My concern is that Fennec/Android doesn't still use ICU on beta and release
> channel.  If BIDI rendering is different on w/ ICU vs w/o ICU, we should pay
> attention to update it.  As long as I check Rev 36 of UAX #9, it has a
> update of Section 2.7.  If :jfktheme agrees updating ICU (including Rev 36
> of UAX #9), I have no objection since he is owner of text rendering.
> 
> And we don't use ICU's LineBreaker.  So we can ignore LineBreak.txt on ICU.

We don't use the ICU LineBreaker code, but we do rely on the line-break properties and corresponding enum values. So we need to make sure that any changes in LineBreak.txt -- which could result in changes to ICU's ULineBreak enum -- are properly accounted for.

(In this case, though, it doesn't look like anything changed there. Just adding LineBreak properties for new characters is generally fine, it would only be if a new LineBreak class appears -- as happened in ICU58 -- that we'd need to adjust code in nsJISx4051LineBreaker.cpp to match.)

AFAICS, the UAX#9 (bidi-related) updates look harmless for us.
Flags: needinfo?(jfkthame)
Adds unicode/unistr.h to config/system-headers per comment #20.
Attachment #8866766 - Flags: review?(mh+mozilla)
Attached patch bug1353650-part12-clobber.patch (obsolete) — Splinter Review
Updating ICU requires a clobber (bug 1315397).
Attachment #8866768 - Flags: review+
Attachment #8866766 - Flags: review?(mh+mozilla) → review+
Update part12 to apply cleanly
Attachment #8866768 - Attachment is obsolete: true
Attachment #8881896 - Flags: review+
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e7bbd7e18b1a
Part 1: Adjust custom ICU patches for ICU 59 update. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/ca83110136c9
Part 2: Update ICU stubdata build file for ICU 59. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/605d00e70409
Part 3: Add missing includes in subsetter_impl.cc. r=jfkthame
https://hg.mozilla.org/integration/mozilla-inbound/rev/2e5271ad53af
Part 3.1: Add unicode/unistr.h to system-headers to unbreak system-ICU builds. r=glandium
https://hg.mozilla.org/integration/mozilla-inbound/rev/3b59d9b42f78
Part 4: Update to ICU 59. rs=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/1dcae90ce0b7
Part 5: Update expected time zone string after ICU update. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/e69d4b4152c8
Part 6: Change minimum system ICU version to ICU 59. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/14fbfaa855fe
Part 7: Remove reinterpret_cast for UChar now that UChar defaults to char16_t. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/9fba9d8fc530
Part 8: Remove conditional code to check for presence of unum_formatDoubleForFields. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/22d85759ecf8
Part 9: Replace PluralRules::getKeywords in favour of uplrules_getKeywords. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/63c35f0dc789
Part 10: Use uplrules_selectWithFormat to select the correct plural rule. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/c6ae60267402
Part 11: Update versioned switch cases to match current ICU version. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/c4de7d62221d
Part 12: Updating ICU requires a clobber. r=clobber
Keywords: checkin-needed
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: