Closed Bug 1225401 Opened 9 years ago Closed 7 years ago

Remove cities timezone data from ICU

Categories

(Core :: JavaScript: Internationalization API, defect)

Unspecified
All
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox54 --- fixed

People

(Reporter: m_kato, Assigned: m_kato)

References

Details

Attachments

(3 files, 3 obsolete files)

My understand is that DateTimeFormat doesn't allow city data.  But our ICU dat still has this data.

If removing this, it reduces 400KB and Chrome already removes it.
Blocks: 1215247
I should wait the review until landing ICU 56.1 update.
Assignee: nobody → m_kato
Attachment #8688759 - Attachment is obsolete: true
Comment on attachment 8691695 [details] [diff] [review]
Part 1. Update update-icu.sh to remove cities from zone data

Our Intl API implementation doesn't allow that time zone is city name.  So city names are unnecessary.

Also Chromium already removes cities from ICU data to reduce data size.
Attachment #8691695 - Flags: review?(jwalden+bmo)
Comment on attachment 8691696 [details] [diff] [review]
Part 2. Remove cities from ICU data

This is diff by remove script.   I can reduce ~400KB binary size.
Attachment #8691696 - Flags: review?(jwalden+bmo)
Comment on attachment 8691695 [details] [diff] [review]
Part 1. Update update-icu.sh to remove cities from zone data

Review of attachment 8691695 [details] [diff] [review]:
-----------------------------------------------------------------

::: intl/update-icu.sh
@@ +48,5 @@
> +# Remove cities in timezone data.
> +for i in ${icu_dir}/source/data/zone/*.txt
> +do
> +  grep -E '^        "meta:' $i > /dev/null
> +  if [ $? -eq 0 ]; then

IRC kibitzes that this would be better written as:

if grep -q -E '^        "meta:' $i; then

Please do so.  :-)

@@ +49,5 @@
> +for i in ${icu_dir}/source/data/zone/*.txt
> +do
> +  grep -E '^        "meta:' $i > /dev/null
> +  if [ $? -eq 0 ]; then
> +    sed -i '/^    zoneStrings/, /^        "meta:/ {

IRC alleges: "I believe that's: between lines matching zoneStrings and meta: if the line matches zoneStrings, print; if the line matches meta, print; otherwise delete".  That would correspond with the patch changes, at least based one examining one or two files.  Is that description correct?  If so add a comment explaining this.  I propose the following:

"""
Between lines matching the given zoneStrings and meta patterns, inclusive: if the line matches the given zoneStrings pattern, include (print) it; if the line matches the given meta pattern, include (print) it; otherwise remove (delete) it.
"""
Attachment #8691695 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8691696 [details] [diff] [review]
Part 2. Remove cities from ICU data

Review of attachment 8691696 [details] [diff] [review]:
-----------------------------------------------------------------

rs=me assuming the previous patch's comments from me understood the patch and its effects correctly.
Attachment #8691696 - Flags: review?(jwalden+bmo) → review+
Also, it'd be nice to sync this up with what v8 does, so the two can be compared by the interested reader.  Did the sed command do something like what v8 does?  Could you include a link to what v8 does in the patch, so the interested reader can easily compare the approaches?
OK, so please file something upstream in ICU. You might add comments to http://bugs.icu-project.org/trac/ticket/10922  - we need to get requirements like this. I said the same to Chrome people.

The internals of ICU's data should NOT be considered stable.
FYI, these patches are going to be obsoleted by bug 1239083.
Depends on: 1239083
Currently our implementation of ECMA-402 supports Etc/UTC only.

But, ECMA-402 2.0 (http://www.ecma-international.org/ecma-402/2.0/#sec-time-zone-names) allows IANA Time Zone Database too.  So for 2.0, I might not have to remove these localization data...  So I keep open state.
Comment on attachment 8840355 [details] [diff] [review]
Part 1. Remove cities when syncing ICU source code

Update script for ICU 58.2.  Cities data seems to be is still unused.
Attachment #8840355 - Flags: review?(jwalden+bmo)
Comment on attachment 8840357 [details] [diff] [review]
Part 2. Update ICU source in tree after removing cities data

Update ICU code into our tree after running Part 1's script.
Attachment #8840357 - Flags: review?(jwalden+bmo)
Comment on attachment 8840358 [details] [diff] [review]
Part 3. Update icudt58l.dat

Update ICU data into our tree after running Part 1's script.
Attachment #8840358 - Flags: review?(jwalden+bmo)
Attachment #8691695 - Attachment is obsolete: true
Attachment #8691696 - Attachment is obsolete: true
Can this stuff still be removed, now that we support the timeZone property on DateTimeFormat options?  It looks like all time zone information is recorded in intl/icu/source/data/misc/zoneinfo64.txt now, and so touching these files shouldn't affect our semantics.  Am I correct about that?  Would like to know that before I try too hard to review any of this.
Flags: needinfo?(m_kato)
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #20)
> Can this stuff still be removed, now that we support the timeZone property
> on DateTimeFormat options?  It looks like all time zone information is
> recorded in intl/icu/source/data/misc/zoneinfo64.txt now, and so touching
> these files shouldn't affect our semantics.  Am I correct about that?  Would
> like to know that before I try too hard to review any of this.

Yes, it works timeZone property like new Date().toLocaleTimeString("en", {timeZone: "America/New_York"}) even if removing localized city name.
Flags: needinfo?(m_kato)
Attachment #8840355 - Flags: review?(jwalden+bmo) → review+
Attachment #8840357 - Flags: review?(jwalden+bmo) → review+
Attachment #8840358 - Flags: review?(jwalden+bmo) → review+
Pushed by m_kato@ga2.so-net.ne.jp:
https://hg.mozilla.org/integration/mozilla-inbound/rev/292d9e369264
Part 1. Remove cities when syncing ICU source code. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/6696016a0198
Part 2. Update ICU source in tree after removing cities data. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/8c51fd25a389
Part 3. Update icudt58l.dat. r=Waldo
do we know how much this shaves off of the apk size hit of ICU?
Pushed by m_kato@ga2.so-net.ne.jp:
https://hg.mozilla.org/integration/mozilla-inbound/rev/45e867a1c89e
Part 1. Remove cities when syncing ICU source code. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/c056f2674719
Part 2. Update ICU source in tree after removing cities data. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/cc40078f0ac6
Part 3. Update icudt58l.dat. r=Waldo
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #24)
> do we know how much this shaves off of the apk size hit of ICU?

around 140KB.
Some apk size wins from this:

== Change summary for alert #5297 (as of March 03 2017 04:57 UTC) ==

Improvements:

  0%  installer size summary android-4-0-armv7-api15 opt        34966457.08 -> 34834684.25
  0%  installer size summary android-api-15-gradle opt          35323165.33 -> 35193592.33
  0%  installer size summary android-4-0-armv7-api15 debug      36108531.42 -> 35978278.67
  0%  installer size summary android-4-2-x86 opt                37471828.38 -> 37337097.58

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=5297
Depends on: 1345336
You need to log in before you can comment on or make changes to this bug.