Closed
Bug 74670
Opened 24 years ago
Closed 24 years ago
Need some entries in unixcharset.properties for Solaris new Asian locales.
Categories
(Core :: Internationalization, defect, P2)
VERIFIED
FIXED
mozilla0.9.1
People
(Reporter: eyan, Assigned: ftang)
Attachments
(1 file)
576 bytes,
patch
Solaris 9 integrates some new Asian locales, so unixcharset.properties
needs entries for them.
After checking the unixcharset.properties file, we think entries are needed for
the following Solaris 9 locales:
1. zh_HK.BIG5HK
2. zh_HK.UTF-8
3. zh_CN.GB18030
4. th_TH.ISO8859-11
5. hi_IN.UTF-8
6. zh_TW.UTF-8
The entries in unixcharset.properties should look like this:
locale.all.zh_HK.BIG5HK=Big5-HKSCS
locale.all.zh_CN.GB18030=x-gb18030 or gb18030
locale.all.th_TH.ISO8859-11=ISO-8859-11
locale.all.hi_IN.UTF-8=UTF-8
locale.all.zh_HK.UTF-8=UTF-8
locale.all.zh_TW.UTF-8=UTF-8
Notes: neither x-gb18030 nor gb18030 is defined yet because bug 72525 is still open.
Big5-HKSCS may not be defined either.
Updated•24 years ago
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 2•24 years ago
The fix for bug 54000 covers most cases for our new locales.
We don't have to change unixcharset.properties for
*.UTF-8
*.BIG5HK
But th_TH.ISO8859-11 should be defined:
locale.all.th_TH.ISO8859-11=TIS-620
I could not find any entry for ISO-8859-11. Is this correct?
For GB18030:
Brian@Sun, Ervin, could you try nl_langinfo(CODESET) in the GB18030 locale
and tell me the returned string?
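One way to run the check requested above is a small helper that switches LC_CTYPE to a named locale and reads back nl_langinfo(CODESET). This is only a sketch: the function name codeset_for_locale is made up for illustration, and it assumes a POSIX system that provides <langinfo.h>.

```cpp
#include <clocale>
#include <langinfo.h>
#include <string>

// Hypothetical helper: returns nl_langinfo(CODESET) for the named locale,
// or an empty string if setlocale() rejects the locale name.
std::string codeset_for_locale(const char* locale_name) {
    // Remember the current LC_CTYPE setting so it can be restored afterwards.
    const char* current = std::setlocale(LC_CTYPE, nullptr);
    std::string saved = current ? current : "C";

    std::string result;
    if (std::setlocale(LC_CTYPE, locale_name) != nullptr) {
        const char* codeset = nl_langinfo(CODESET);
        if (codeset != nullptr) {
            result = codeset;
        }
    }

    // Restore the previous locale before returning.
    std::setlocale(LC_CTYPE, saved.c_str());
    return result;
}
```

Calling codeset_for_locale("zh_CN.GB18030") on a machine with that locale installed would return whatever string the platform reports; the value differs between systems, so none is assumed here.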
Comment 3•24 years ago
The following are the nl_langinfo(CODESET) values for the locales in Solaris:
ko --- 5601
zh(zh_CN.EUC) --- gb2312
zh_TW(zh_TW.EUC) --- cns11643
zh.GBK(zh_CN.GBK) --- GBK
zh_CN.GB18030 --- GB18030
zh_TW.BIG5 --- BIG5
zh_HK.BIG5HK --- Big5-HKSCS
th(th_TH,th_TH.TIS610,th_TH.ISO-8859-11) --- TIS620.2533
Thanks.
Brian.
Comment 4•24 years ago
Brian and Frank,
I have checked the current implementation and have some questions.
I don't understand which file needs changes: charsetalias.properties,
unixcharset.properties, or both. I'd like to know the policy.
The current lookup is:
1. get nl_langinfo(CODESET)
2. check the entry in charsetalias.properties; if found, return the charset
3. check the entry in unixcharset.properties; if found, return the charset
4. fall back to ISO8859-1
For example, unixcharset.properties has no zh_HK.BIG5HK entry now, but
steps 1 and 2 return Big5-HKSCS. That seems OK on Solaris; however,
what will happen on a system that does not have nl_langinfo()?
If we consider such systems, should we define the entry in unixcharset.properties?
Also, for th_TH.ISO8859-11, nl_langinfo(CODESET) returns TIS620.2533.
There is no entry for it in charsetalias.properties or in unixcharset.properties.
If we add the following to charsetalias.properties, step 2 returns the
charset.
tis620.2533=TIS-620
However, if we consider systems that do not have nl_langinfo(),
I think we will also need to add the following to unixcharset.properties.
locale.all.th_TH.ISO8859-11=TIS-620
Or it may work with the entry above in unixcharset.properties only.
So my question is: which file should we modify,
unixcharset.properties
or charsetalias.properties
or both?
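The four lookup steps listed in this comment can be sketched roughly as follows. This is an illustration only, not the code in nsUNIXCharset.cpp: resolve_charset and PropertyTable are hypothetical names, the two .properties files are modelled as in-memory maps, and the fallback is spelled "ISO-8859-1" here as an assumption.

```cpp
#include <map>
#include <string>

// Stand-in for a loaded .properties file (key=value pairs).
using PropertyTable = std::map<std::string, std::string>;

// Hypothetical resolver mirroring steps 1-4. `codeset` is the string from
// nl_langinfo(CODESET) (step 1), or empty on a system without nl_langinfo().
std::string resolve_charset(const std::string& codeset,
                            const std::string& locale,
                            const PropertyTable& charset_alias,
                            const PropertyTable& unix_charset) {
    // Step 2: look the codeset name up in charsetalias.properties.
    if (!codeset.empty()) {
        auto alias = charset_alias.find(codeset);
        if (alias != charset_alias.end()) {
            return alias->second;
        }
    }

    // Step 3: look the locale name up in unixcharset.properties.
    auto entry = unix_charset.find("locale.all." + locale);
    if (entry != unix_charset.end()) {
        return entry->second;
    }

    // Step 4: nothing matched, so fall back to Latin-1.
    return "ISO-8859-1";
}
```

Under these assumptions, a th_TH.ISO8859-11 locale whose codeset has no alias entry still resolves through the unixcharset.properties entry, which is the case the question is about.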
Comment 5•24 years ago
Katakai,
In your debug build can you add a printf to the locale file to output the
charset? This way you can see if the correct encoding is being used.
http://lxr.mozilla.org/seamonkey/source/intl/uconv/src/nsUNIXCharset.cpp#239
238 #if HAVE_NL_LANGINFO && defined(CODESET)
239 nl_langinfo_codeset = nl_langinfo(CODESET);
240 NS_ASSERTION(nl_langinfo_codeset, "cannot get nl_langinfo(CODESET)");
+ if (nl_langinfo_codeset)
+ printf("nl_langinfo(CODESET) = %s\n", nl_langinfo_codeset);
+ else
+ printf("nl_langinfo(CODESET) returned NULL\n");
Comment 6•24 years ago
re: what will happen when it runs on system which does not have nl_langinfo()?
==============================================================================
The decision to use nl_langinfo is made at compile time on a per-OS basis. I
know that Linux, Solaris, HPUX, and AIX will all use it.
If nl_langinfo returns an alternate name for the encoding and it is
reasonable to add it to charsetalias.properties, we should.
If it is not reasonable to add it to charsetalias.properties, then we will need to
create a unixcharset.<OSARCH>.properties file to remap it to a usable value.
Only if a system's nl_langinfo is incomplete and does not return any value for a
locale would we put it in the deprecated unixcharset.properties.
Comment 7•24 years ago
Thanks Brian,
So, I understand that the GB18030 and TIS620.2533 definitions should go into
charsetalias.properties. Any problem?
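Note that the proposed alias key (tis620.2533) is lower case while nl_langinfo(CODESET) reports TIS620.2533. The sketch below assumes the alias lookup lower-cases the codeset name before matching; that behaviour is an assumption for illustration, not something confirmed in this bug, and lowercase_key and lookup_alias are hypothetical names.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Lower-cases an ASCII charset name so "TIS620.2533" matches "tis620.2533".
std::string lowercase_key(std::string name) {
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return name;
}

// Hypothetical alias lookup: returns the canonical charset name, or an
// empty string when the alias table has no entry for the codeset.
std::string lookup_alias(const std::map<std::string, std::string>& aliases,
                         const std::string& codeset) {
    auto it = aliases.find(lowercase_key(codeset));
    return it == aliases.end() ? std::string() : it->second;
}
```

With a table holding only tis620.2533=TIS-620, looking up "TIS620.2533" yields "TIS-620", and a codeset with no entry yields an empty string so the caller can continue to the next lookup step.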
Comment 8•24 years ago (Assignee)
Mark this as moz0.9.2 P2
Priority: -- → P2
Target Milestone: --- → mozilla0.9.2
Comment 9•24 years ago
Comment 10•24 years ago
Attached the patch for charsetalias.properties, not unixcharset.properties.
Frank, Brian, can you take a look at the patch?
Comment 11•24 years ago
Changing QA contact to katakai@japan.sun.com.
QA Contact: andreasb → katakai
Comment 12•24 years ago (Assignee)
Please do not add
gb18030=GB18030
We should use lower case here, since I have already added some other code for that.
Reassign this bug to me and I will land both.
The TIS one is OK.
Assignee: bstell → ftang
Comment 14•24 years ago
QA contact to Ervin. I believe the latest nightly has the fix for TIS.
Can you try?
QA Contact: katakai → eyan
Comment 15•24 years ago (Assignee)
mark it as fixed
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Target Milestone: mozilla0.9.2 → mozilla0.9.1
Comment 16•24 years ago (Reporter)
The TIS charset now displays correctly in Mozilla nightly build 2001051310.
Status: RESOLVED → VERIFIED