Big5-HKSCS 2004 <==> Unicode Table Update

RESOLVED FIXED in mozilla1.8.1beta2

Status

defect
RESOLVED FIXED
13 years ago
13 years ago

People

(Reporter: hfwong1, Assigned: smontagu)

Tracking

({fixed1.8.1})

Trunk
mozilla1.8.1beta2
Bug Flags:
blocking1.8.1 +

Details

Attachments

(2 attachments, 1 obsolete attachment)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4

After the release of Big5-HKSCS 2001, the Hong Kong government updated the Big5-HKSCS table in 2004 and added many new Chinese characters. The new table is publicly available for download on the official website of the Hong Kong government.

So the Big5-HKSCS table that Mozilla is using is outdated, and this is causing trouble for Chinese communities because many characters cannot be displayed properly...

I hope Mozilla can update this table ASAP so that Chinese users can view web pages written in Big5-HKSCS 2004 correctly.

Here is the new BIG5-HKSCS table released by the Hong Kong Government:
http://www.info.gov.hk/digital21/chi/hkscs/download/hkscs-2004-big5-iso.txt

For more information about the update, please go to
http://www.info.gov.hk/digital21/eng/hkscs/mapping_table.html

Reproducible: Always
I hope Mozilla can update this table.
update, update, update.
I wonder if Mozilla is gonna do anything about this issue?
This bug is causing a lot of trouble for Hong Kong people...
I wish the Big5-HKSCS 2004 Unicode table could be updated ASAP.
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
http://www.microsoft.com/typography/unicode/950.txt, which is used by intl/uconv/tools/gen-big5hkscs-2001-mozilla.pl, doesn't seem to exist any more. There is http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP950.TXT, but I don't know if it has the same format. I may have to do some reverse engineering.
It should be the same CP950 table used by Microsoft.
And it doesn't matter if it's the same or not...
Why don't we just use the new big5-hkscs table released by hk government?
http://www.info.gov.hk/digital21/chi/hkscs/download/hkscs-2004-big5-iso.txt 
The last version of the Big5-HKSCS conversion tables was generated from three files:
http://www.microsoft.com/typography/unicode/950.txt
http://www.info.gov.hk/digital21/chi/hkscs/download/big5-iso.txt
http://www.info.gov.hk/digital21/chi/hkscs/download/big5cmp.txt

If the Hong Kong government files are sufficient, I'll adjust the generation script to use them.
(In reply to comment #6)
> The last version of the Big5-HKSCS conversion tables was generated from three
> files:
> http://www.microsoft.com/typography/unicode/950.txt
> http://www.info.gov.hk/digital21/chi/hkscs/download/big5-iso.txt
> http://www.info.gov.hk/digital21/chi/hkscs/download/big5cmp.txt
> 
> If the Hong Kong government files are sufficient, I'll adjust the generation
> script to use them.
> 

"hkscs-2004-big5-iso.txt" plays the same role as "big5-iso.txt".
That means generating the whole table still requires "CP950.TXT" or "950.txt".
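To illustrate why a base CP950 table is still needed alongside the HKSCS file, here is a minimal sketch in Python (not the Perl generation script mentioned earlier). It assumes mapping lines of the form `0xA140<tab>0x3000<tab>#comment`, which has not been confirmed for either file in this bug. The names `parse_mapping` and `layer_tables` are hypothetical, and the extension entry is a made-up placeholder rather than a value taken from the real HKSCS table.

```python
import re

# Assumed line shape: "0xA140<whitespace>0x3000<whitespace>#comment".
# The real source files may use a different layout.
_LINE = re.compile(r"^\s*0x([0-9A-Fa-f]+)\s+0x([0-9A-Fa-f]+)")


def parse_mapping(text):
    """Return {legacy_code: unicode_code_point} for lines matching _LINE."""
    table = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:  # comment lines and lines without a target are skipped
            table[int(match.group(1), 16)] = int(match.group(2), 16)
    return table


def layer_tables(base, extension):
    """Copy the base table, then let extension entries add to or override it."""
    merged = dict(base)
    merged.update(extension)
    return merged


# Small sample in the assumed format; it is not a copy of the real file.
base_text = """\
# sample lines only
0xA140\t0x3000\t#IDEOGRAPHIC SPACE
0xA4A4\t0x4E2D\t#<CJK>
"""

base = parse_mapping(base_text)
extension = {0x8840: 0xF303}  # placeholder entry, not from the real table
merged = layer_tables(base, extension)
```

The extension alone would cover only the supplementary characters, so the complete table exists only after the layering step.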
A page in BIG5-HKSCS:
http://input.foruto.com/jptxt/arti003.htm
A site with the latest BIG5-HKSCS characters
http://code.web.idv.hk/h2u/h2u.php
It's probably more informative to see a diff of the files from which hkscs.ut and hkscs.uf are generated.

Things to notice: there are no new entries in the .ut file (from Big5 to Unicode). All the new characters were already mapped to the PUA. These mappings have been changed to the mappings in the new BIG5-HKSCS table, except in the case of mappings to Unicode Plane 2, which still use the old PUA mappings (we can't change that until bug 162431 is fixed).

In the .uf file (Unicode to Big5), I've removed the additional mappings from the "Kangxi Radicals" area mentioned in bug 182089 comment 23, since they don't seem to be in the HKSCS-2004 table.
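The update rule described above for the Big5-to-Unicode direction can be sketched roughly as follows. This is an illustration in Python, not the code that produced the patch. The function names are hypothetical, the sample code points are invented for the example, and "PUA" is assumed to mean the BMP Private Use Area U+E000..U+F8FF.

```python
def is_pua(code_point):
    """True for the BMP Private Use Area, U+E000..U+F8FF (assumed range)."""
    return 0xE000 <= code_point <= 0xF8FF


def is_plane2(code_point):
    """True for Unicode Plane 2, U+20000..U+2FFFF."""
    return 0x20000 <= code_point <= 0x2FFFF


def update_mapping(old, new):
    """Return a copy of `old` where PUA targets are replaced by the target
    from `new`, unless the new target lies in Plane 2."""
    result = dict(old)
    for code, new_cp in new.items():
        old_cp = old.get(code)
        if old_cp is None:
            continue  # modelling choice: no new entries are added
        if is_pua(old_cp) and not is_plane2(new_cp):
            result[code] = new_cp
    return result


# Invented sample values, not taken from hkscs.ut or the HKSCS-2004 table.
old_table = {0x8840: 0xE000, 0x8841: 0xE001, 0xA4A4: 0x4E2D}
new_table = {0x8840: 0x31C0, 0x8841: 0x20021, 0x9999: 0x4E00}
updated = update_mapping(old_table, new_table)
```

In this sample, the first entry moves from the PUA to its new code point, the second keeps its PUA mapping because the new target is in Plane 2, and the entry that is absent from the old table is not added.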
Attachment #232418 - Attachment is obsolete: true
Thx for the patch!
So when can this patch be checked in?
Attachment #232712 - Flags: review?(jshin1987)
Comment on attachment 232712 [details] [diff] [review]
diff of hkscs.uf and hkscs.ut for checkin

r=jshin
Attachment #232712 - Flags: review?(jshin1987) → review+
Checked in.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
(In reply to comment #16)
> Checked in.

Actually not, I'm having problems with CVS.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Really checked in.
Status: REOPENED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Just thought I'd put on the radar, though it may be too late (though maybe not, as it's only a data file update). Awfully long wait for Firefox 3 so people in Hong Kong (and others around the world) can read their own language properly :-/

(if it's too late for Firefox 2, perhaps it can be considered for the first point release afterwards)
Flags: blocking1.8.1?
Marking blocking the final release. Didn't we take a big Unicode 5.0 update? Does this add on to that?
Flags: blocking1.8.1? → blocking1.8.1+
Target Milestone: --- → mozilla1.8.1
(In reply to comment #20)
> Marking blocking the final release. Didn't we take a big Unicode 5.0 update?
> Does this add on to that?

This is orthogonal to that. These data tables are for conversion between the Big5-HKSCS legacy code page and Unicode.
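As a small illustration of what a legacy-code-page conversion does, here is a sketch using Python's built-in `big5hkscs` codec. That codec is independent of Mozilla's hkscs.ut and hkscs.uf tables, so it only shows the kind of round trip such tables make possible and says nothing about Mozilla's actual mappings.

```python
# Encode Unicode text to Big5-HKSCS bytes, then decode the bytes back.
text = "中文"
encoded = text.encode("big5hkscs")     # Unicode -> legacy bytes
decoded = encoded.decode("big5hkscs")  # legacy bytes -> Unicode

# Each of these characters occupies two bytes in the legacy encoding.
byte_count = len(encoded)
```

The decoding direction corresponds to a Big5-to-Unicode table and the encoding direction to a Unicode-to-Big5 table.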
Summary: Big5-HKSCS 2004 Unicode Table Update → Big5-HKSCS 2004 <==> Unicode Table Update
Comment on attachment 232712 [details] [diff] [review]
diff of hkscs.uf and hkscs.ut for checkin

Asking approval for this data-file only patch.
Attachment #232712 - Flags: approval1.8.1?
Comment on attachment 232712 [details] [diff] [review]
diff of hkscs.uf and hkscs.ut for checkin

a=schrep for drivers
Attachment #232712 - Flags: approval1.8.1? → approval1.8.1+
I checked this in on the branch so that it could make the b2 candidate builds.
mozilla/intl/uconv/ucvtw/hkscs.ut 	1.3.92.1
mozilla/intl/uconv/ucvtw/hkscs.uf 	1.3.92.1
Keywords: fixed1.8.1
Target Milestone: mozilla1.8.1 → mozilla1.8.1beta2