Update our in-tree ICU to 59

RESOLVED FIXED in Firefox 56

Status

()

RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: anba, Assigned: anba)

Tracking

Trunk
mozilla56
Points:
---

Firefox Tracking Flags

(firefox55 wontfix, firefox56 fixed)

Details

Attachments

(13 attachments, 5 obsolete attachments)

892 bytes, patch
Waldo
: review+
Details | Diff | Splinter Review
1.14 KB, patch
jfkthame
: review+
jbeich
: feedback-
Details | Diff | Splinter Review
2.01 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
997 bytes, patch
Waldo
: review+
Details | Diff | Splinter Review
6.30 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
6.10 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
2.19 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
16.39 KB, patch
anba
: review+
Details | Diff | Splinter Review
4.08 KB, patch
anba
: review+
Details | Diff | Splinter Review
15.53 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
4.93 MB, application/x-zip-compressed
Waldo
: review+
Details
775 bytes, patch
glandium
: review+
Details | Diff | Splinter Review
835 bytes, patch
anba
: review+
Details | Diff | Splinter Review
(Assignee)

Description

2 years ago
ICU 59 is scheduled for release this month.

Important changes:
- Windows XP and Windows Vista are no longer supported.
- char16_t is the default type for UChar
- Includes bidi data files from Unicode 10 beta.
- Includes segmentation data files from Unicode 10 beta

The Unicode 10 changes may require coordination with the rest of the platform.
(Assignee)

Comment 1

2 years ago
Created attachment 8854902 [details] [diff] [review]
bug1353650-part1-update-icu-patches.patch

Four patches are no longer needed and one needed to be updated.

intl/icu-patches/bug-1228227-bug-1263325-libc++-gcc_hidden.diff
- Fixed in http://bugs.icu-project.org/trac/ticket/12023

intl/icu-patches/bug-1325858-close-key.diff
- Fixed in http://bugs.icu-project.org/trac/ticket/12908

intl/icu-patches/ucol_getKeywordValuesForLocale-ulist_resetList.diff
- Fixed in https://ssl.icu-project.org/trac/ticket/12827

intl/icu-patches/unum_formatDoubleForFields.diff
- Fixed in https://ssl.icu-project.org/trac/ticket/12684
Attachment #8854902 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 2

2 years ago
Created attachment 8854903 [details] [diff] [review]
bug1353650-part2-update-stubfile-name.patch

ICU renamed all *.c files to *.cpp in ICU59.
Attachment #8854903 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 3

2 years ago
Created attachment 8854904 [details] [diff] [review]
bug1353650-part3-includes-sfntly.patch

Applies https://github.com/googlei18n/sfntly/pull/73 to our local copy of sfntly.
Attachment #8854904 - Flags: review?(jfkthame)
(Assignee)

Comment 4

2 years ago
Created attachment 8854909 [details]
bug1353650-part4-icu59.zip

Updates to the ICU59 release candidate (https://sourceforge.net/p/icu/mailman/message/35762202/).
(Assignee)

Comment 5

2 years ago
Created attachment 8854910 [details] [diff] [review]
bug1353650-part5-test-timezone.patch

UTC and GMT are now two separate time zones, adjust the test case accordingly. 

http://blog.unicode.org/2017/03/cldr-version-31-released.html
Attachment #8854910 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 6

2 years ago
Created attachment 8854912 [details] [diff] [review]
bug1353650-part6-minimum-system-icu.patch

Set the minimum version for the system-ICU option to ICU59.

This is needed, because we're going to use ICU59 API (see next patches) and because of the UChar changes in ICU59.
Attachment #8854912 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 7

2 years ago
Created attachment 8854913 [details] [diff] [review]
bug1353650-part7-uchar-is-char16.patch

UChar now defaults to char16_t on all platforms, so I think we can remove the reinterpret_cast when converting between both types. UCHAR_TYPE is currently still configurable on the command-line, but ICU plans to remove this switch in the future.
Attachment #8854913 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 8

2 years ago
Created attachment 8854914 [details] [diff] [review]
bug1353650-part8-numberformat-formattofields.patch

unum_formatDoubleForFields is now always present. :-)
Attachment #8854914 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 9

2 years ago
Created attachment 8854916 [details] [diff] [review]
bug1353650-part9-pluralrules-keywords.patch

uplrules_getKeywords was added to the C API, so we can remove the C++ API calls.
Attachment #8854916 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 10

2 years ago
Created attachment 8854917 [details] [diff] [review]
bug1353650-part10-pluralrules-select-numberformat.patch

Replace uplrules_select with the new uplrules_selectWithFormat, so we can use the correct number formatting for Intl.PluralRules.

Also added a test case which fails with the previous implementation and now passes.
Attachment #8854917 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 11

2 years ago
Created attachment 8854918 [details] [diff] [review]
bug1353650-part11-switch-cases.patch

UDAT_AM_PM_MIDNIGHT_NOON_FIELD and UDAT_FLEXIBLE_DAY_PERIOD_FIELD are no longer draft APIs.

(UNUM_FIELD_COUNT and UDAT_FIELD_COUNT were already deprecated in ICU58, but I forgot to update the switch-cases back then.)
Attachment #8854918 - Flags: review?(jwalden+bmo)
Attachment #8854904 - Flags: review?(jfkthame) → review+
Attachment #8854902 - Flags: review?(jwalden+bmo) → review+
Attachment #8854903 - Flags: review?(jwalden+bmo) → review+
Attachment #8854910 - Flags: review?(jwalden+bmo) → review+
Attachment #8854912 - Flags: review?(jwalden+bmo) → review+
Attachment #8854913 - Flags: review?(jwalden+bmo) → review+
Attachment #8854914 - Flags: review?(jwalden+bmo) → review+
Attachment #8854916 - Flags: review?(jwalden+bmo) → review+
Attachment #8854917 - Flags: review?(jwalden+bmo) → review+
Attachment #8854918 - Flags: review?(jwalden+bmo) → review+
This is a (semi-)automated bug making you aware that there is an available upgrade for an embedded third-party library. You can leave this bug open, and it will be updated if a newer version of the library becomes available. If you close it as WONTFIX, please indicate if you do not wish to receive any future bugs upon new releases of the library.

icu is currently at version 58.2 in mozilla-central, and the latest version of the library released is 59.1.

I fetched the latest version of the library from http://site.icu-project.org/download/.
(Assignee)

Comment 14

2 years ago
@m_kato and jfkthame:
ICU 59 updates some Unicode data (bidi data files and segmentation data files) to Unicode 10 beta, but otherwise continues to use Unicode 9. Do we need to coordinate this partial update to Unicode 10 with the other Unicode data in Gecko? (Just asking so I don't repeat the issues with the ICU 58 update - bug 1299615, comment #30.)
Flags: needinfo?(m_kato)
Flags: needinfo?(jfkthame)
(Assignee)

Comment 15

2 years ago
Created attachment 8859673 [details] [diff] [review]
bug1353650-part7-uchar-is-char16.patch

Update part 7 to apply cleanly on inbound, carrying r+.
Attachment #8854913 - Attachment is obsolete: true
Attachment #8859673 - Flags: review+
(Assignee)

Comment 16

2 years ago
Created attachment 8859674 [details] [diff] [review]
bug1353650-part9-pluralrules-keywords.patch

Update part 9 to apply cleanly on inbound, carrying r+.
Attachment #8854916 - Attachment is obsolete: true
Attachment #8859674 - Flags: review+
(Assignee)

Comment 17

2 years ago
Created attachment 8859675 [details] [diff] [review]
bug1353650-part1-update-icu-patches.patch

Part 1 needed to be updated to add a workaround for https://ssl.icu-project.org/trac/ticket/13030:
Added intl/icu-patches/u_setMemoryFunctions-callconvention-anachronism-msvc.diff and updated intl/update-icu.sh to apply the new patch.
Attachment #8854902 - Attachment is obsolete: true
Attachment #8859675 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 18

2 years ago
Created attachment 8859676 [details]
bug1353650-part4-icu59.zip

The actual update to ICU 59.1.
Attachment #8854909 - Attachment is obsolete: true
Attachment #8859676 - Flags: review?(jwalden+bmo)
Attachment #8859675 - Flags: review?(jwalden+bmo) → review+
Attachment #8859676 - Flags: review?(jwalden+bmo) → review+
(In reply to André Bargull from comment #14)
> @m_kato and jfkthame:
> ICU 59 updates some Unicode data (bidi data files and segmentation data
> files) to Unicode 10 beta, but otherwise continues to use Unicode 9. Do we
> need to coordinate this partial update to Unicode 10 with the other Unicode
> data in Gecko? (Just asking so I don't repeat the issues with the ICU 58
> update - bug 1299615, comment #30.)

Although Android don't still use ICU (Fennec/arm Nightly turns on ICU, but others such as beta and release turn off it now), we might have to keep it since it isn't 10 stable.  If all ICU's data changes to 10 beta, we should use same data on our table.
Flags: needinfo?(m_kato)

Comment 20

2 years ago
Comment on attachment 8854904 [details] [diff] [review]
bug1353650-part3-includes-sfntly.patch

Review of attachment 8854904 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/sfntly/cpp/src/sample/chromium/subsetter_impl.cc
@@ +22,5 @@
>  #include <iterator>
>  #include <map>
>  #include <set>
>  
> +#include <unicode/unistr.h>

Can you add the file to config/system-headers? Otherwise, it breaks --with-system-icu build.

../../gfx/sfntly/cpp/src/Unified_cpp_gfx_sfntly_cpp_src0.o: In function `sfntly::SubsetterImpl::LoadFont(char const*, unsigned char const*, unsigned long)':
objdir/gfx/sfntly/cpp/src/Unified_cpp_gfx_sfntly_cpp_src0.cpp:(.text._ZN6sfntly13SubsetterImpl8LoadFontEPKcPKhm+0x218): undefined reference to `icu::UnicodeString::~UnicodeString()'
/usr/bin/ld: ../../gfx/sfntly/cpp/src/Unified_cpp_gfx_sfntly_cpp_src0.o: relocation R_X86_64_PC32 against `_ZN3icu13UnicodeStringD1Ev' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
Attachment #8854904 - Flags: feedback-
(Assignee)

Updated

2 years ago
Blocks: 1360131
(Assignee)

Comment 21

2 years ago
(In reply to Makoto Kato [:m_kato] (PTO: 4/30-5/5) from comment #19)
> (In reply to André Bargull from comment #14)
> > @m_kato and jfkthame:
> > ICU 59 updates some Unicode data (bidi data files and segmentation data
> > files) to Unicode 10 beta, but otherwise continues to use Unicode 9. Do we
> > need to coordinate this partial update to Unicode 10 with the other Unicode
> > data in Gecko? (Just asking so I don't repeat the issues with the ICU 58
> > update - bug 1299615, comment #30.)
> 
> Although Android don't still use ICU (Fennec/arm Nightly turns on ICU, but
> others such as beta and release turn off it now), we might have to keep it
> since it isn't 10 stable.  If all ICU's data changes to 10 beta, we should
> use same data on our table.

Hmm, I don't understand the part about "[...] we might have to keep it [...]". What exactly do you mean by "it"? Do you mean we have to keep ICU 58 because ICU 59 contains Unicode 9 and Unicode 10 beta files?

I did a bit of digging in the ICU sources and based on this file (http://source.icu-project.org/repos/icu/tags/release-59-1/icu4c/source/data/unidata/changes.txt), ICU 59 uses the following Unicode 10 beta files:
  BidiBrackets.txt
  BidiCharacterTest.txt
  BidiMirroring.txt
  BidiTest.txt
  extracted/DerivedBidiClass.txt
  LineBreak.txt
  auxiliary/*

Based on http://searchfox.org/mozilla-central/source/intl/unicharutil/tools/genUnicodePropertyData.pl, we also seem to read
  BidiBrackets.txt
  BidiMirroring.txt
  LineBreak.txt


So there could be some conflict when we use our tables, but also ICU methods, because these three files are from different Unicode versions. For example in bug 1281448, comment #2, :jfkthame mentioned we use ICU constants for line breaks, so having different LineBreak.txt files could be problematic. Can anyone confirm this?
(Assignee)

Comment 22

2 years ago
NI :m_kato for questions raised in comment #21.
Flags: needinfo?(m_kato)
My concern is that Fennec/Android doesn't still use ICU on beta and release channel.  If BIDI rendering is different on w/ ICU vs w/o ICU, we should pay attention to update it.  As long as I check Rev 36 of UAX #9, it has a update of Section 2.7.  If :jfktheme agrees updating ICU (including Rev 36 of UAX #9), I have no objection since he is owner of text rendering.

And we don't use ICU's LineBreaker.  So we can ignore LineBreak.txt on ICU.
Flags: needinfo?(m_kato)
(Assignee)

Comment 24

2 years ago
(In reply to Makoto Kato [:m_kato] from comment #23)
> My concern is that Fennec/Android doesn't still use ICU on beta and release
> channel.  If BIDI rendering is different on w/ ICU vs w/o ICU, we should pay
> attention to update it.

Ah okay, understand. Thanks for the explanation! :-)

> If :jfktheme agrees updating ICU (including Rev 36
> of UAX #9), I have no objection since he is owner of text rendering.

Then I'll wait for :jfktheme's answer.
(In reply to Makoto Kato [:m_kato] from comment #23)
> My concern is that Fennec/Android doesn't still use ICU on beta and release
> channel.  If BIDI rendering is different on w/ ICU vs w/o ICU, we should pay
> attention to update it.  As long as I check Rev 36 of UAX #9, it has a
> update of Section 2.7.  If :jfktheme agrees updating ICU (including Rev 36
> of UAX #9), I have no objection since he is owner of text rendering.
> 
> And we don't use ICU's LineBreaker.  So we can ignore LineBreak.txt on ICU.

We don't use the ICU LineBreaker code, but we do rely on the line-break properties and corresponding enum values. So we need to make sure that any changes in LineBreak.txt -- which could result in changes to ICU's ULineBreak enum -- are properly accounted for.

(In this case, though, it doesn't look like anything changed there. Just adding LineBreak properties for new characters is generally fine, it would only be if a new LineBreak class appears -- as happened in ICU58 -- that we'd need to adjust code in nsJISx4051LineBreaker.cpp to match.)

AFAICS, the UAX#9 (bidi-related) updates look harmless for us.
Flags: needinfo?(jfkthame)
(Assignee)

Comment 26

2 years ago
Created attachment 8866766 [details] [diff] [review]
bug1353650-part3.1-unistr-to-systemheaders.patch

Adds unicode/unistr.h to config/system-headers per comment #20.
Attachment #8866766 - Flags: review?(mh+mozilla)
(Assignee)

Comment 27

2 years ago
Created attachment 8866768 [details] [diff] [review]
bug1353650-part12-clobber.patch

Updating ICU requires a clobber (bug 1315397).
Attachment #8866768 - Flags: review+
Attachment #8866766 - Flags: review?(mh+mozilla) → review+
(Assignee)

Comment 28

2 years ago
Created attachment 8881896 [details] [diff] [review]
bug1353650-part12-clobber.patch

Update part12 to apply cleanly
Attachment #8866768 - Attachment is obsolete: true
Attachment #8881896 - Flags: review+

Comment 30

2 years ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e7bbd7e18b1a
Part 1: Adjust custom ICU patches for ICU 59 update. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/ca83110136c9
Part 2: Update ICU stubdata build file for ICU 59. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/605d00e70409
Part 3: Add missing includes in subsetter_impl.cc. r=jfkthame
https://hg.mozilla.org/integration/mozilla-inbound/rev/2e5271ad53af
Part 3.1: Add unicode/unistr.h to system-headers to unbreak system-ICU builds. r=glandium
https://hg.mozilla.org/integration/mozilla-inbound/rev/3b59d9b42f78
Part 4: Update to ICU 59. rs=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/1dcae90ce0b7
Part 5: Update expected time zone string after ICU update. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/e69d4b4152c8
Part 6: Change minimum system ICU version to ICU 59. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/14fbfaa855fe
Part 7: Remove reinterpret_cast for UChar now that UChar defaults to char16_t. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/9fba9d8fc530
Part 8: Remove conditional code to check for presence of unum_formatDoubleForFields. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/22d85759ecf8
Part 9: Replace PluralRules::getKeywords in favour of uplrules_getKeywords. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/63c35f0dc789
Part 10: Use uplrules_selectWithFormat to select the correct plural rule. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/c6ae60267402
Part 11: Update versioned switch cases to match current ICU version. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/c4de7d62221d
Part 12: Updating ICU requires a clobber. r=clobber
Keywords: checkin-needed
Duplicate of this bug: 1360131
status-firefox55: affected → wontfix
You need to log in before you can comment on or make changes to this bug.