Closed Bug 157602 Opened 23 years ago Closed 23 years ago

Non-ascii (in ISO-8859-1) title not displayed correctly on JA linux

Categories

(MailNews Core :: Internationalization, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 150131

People

(Reporter: jeesun, Assigned: nhottanscp)

Details

Attachments

(7 files)

Non-ASCII title not displayed correctly on JA Linux

Platform: JA Linux 7.1

Steps:
1. Send yourself a mail which has accented characters (I used êtes-vous sû) in the subject and body. Choose ISO-8859-1 encoding, then receive the message. (Or subscribe to news.mozilla.org/netscape.public.mozilla.qa.i18n and view any message with accented letters.)
2. Notice that the title is not displayed correctly.
QA Contact: marina → jeesun
8-bit Latin-1 characters are not displayed under the JA locale; this is a known problem. I'll search for a dup.
dup of bug 150131?
On linux, address book card has the same problem. For entries which have latin-1 chars, the window title doesn't display the name correctly.
Jungshik, do you think this is a dup of bug 150131?
The first case (a French mail subject not rendered correctly in the window title bar under a non-Latin-1 locale) is for sure a dup of bug 150131. As for the second screenshot (attachment 91854 [details]), it seems like there's another issue. If this were also a dup of bug 150131, letters with accents should be rendered as ?'s instead of Kanji. However, they're not, which means that a translation between encodings is missing somewhere in the implementation of the address book code. To be sure, I have to know how the French name in the screenshot was entered in the first place. Jeesun, could you tell me how you entered it?
Hmm. Something is wrong with either my copies of Mozilla (my own build and 1.1alpha) or my window manager. An address book card window does not have a window title bar, so I can't test it on my machine. I'll reboot to Win2k and test it there. In the meantime, I tried to make some sense out of the screenshot. The name used in the screenshot (décelées Après) is represented by

  64 e9 63 65 6c e9 65 73 20 41 70 72 e8 73

in ISO-8859-1 and

  64 c3 a9 63 65 6c c3 a9 65 73 20 41 70 72 c3 a8 73

in UTF-8. I've just checked the abook.mab file and found that it stores strings in 'a kind of UTF-8'. Therefore, a translation step is missing somewhere between abook.mab and the window title bar. The three Kanji in the shot are (set your encoding to UTF-8 to read the following):

  Char  Unicode  EUC-JP  SJIS
  宴    U+5BB4   0xB1E3  0x8983
  怨    U+6028   0xB1E5  0x8985
  回    U+56DE   0xB2F3  0x89F1

Since I don't know how the EUC-JP (or JIS X 0208) encoder is implemented in Mozilla, I can't make much sense out of this. Just in case, Jeesun, could you show me the values of $LC_* and $LANG?

  $ env | egrep '(LC_|LANG)'

would do it in a Bourne-shell-like shell. I'd also like to know what window manager you're using and what you have in /etc/sysconfig/i18n (if it's RedHat or Mandrake) and ~/.i18n.
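As a quick cross-check of the byte values quoted in the comment above, the following Python sketch encodes the same name and the three Kanji with Python's standard codecs. It is an illustration only: it assumes the built-in `iso-8859-1`, `utf-8`, `euc_jp` and `shift_jis` codecs and says nothing about how Mozilla's own converters behave.

```python
# Illustration only: Python's standard codecs, not Mozilla's converters.
name = "décelées Après"


def hex_bytes(data):
    """Format a bytes object as space-separated lowercase hex pairs."""
    return " ".join(f"{b:02x}" for b in data)


latin1_hex = hex_bytes(name.encode("iso-8859-1"))
utf8_hex = hex_bytes(name.encode("utf-8"))
print("ISO-8859-1:", latin1_hex)
print("UTF-8:     ", utf8_hex)

# The three Kanji seen in the title bar: code point, EUC-JP and Shift_JIS bytes.
kanji_rows = [
    (
        ch,
        f"U+{ord(ch):04X}",
        ch.encode("euc_jp").hex().upper(),
        ch.encode("shift_jis").hex().upper(),
    )
    for ch in "宴怨回"
]
for row in kanji_rows:
    print(*row)
```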
I conducted some experiments under Win2k (both KO and EN locale) and Linux (both ko_KR.UTF-8 and ko_KR.eucKR locale). Judging from the result of these experiments, this bug is almost certainly a dup. of bug 150131. I'm not yet certain why some Kanji characters appeared in attachment 91854 [details] instead of ?'s. However, my test results under both Win2k and Linux (in case of the latter, I used 'xprop' to confirm that WM_NAME and _NET_WM_NAME properties are set as they should be for addressbook window and mail/news display window) clearly indicate that two cases I thought of as separate in comment #6 and comment #7 are not separate.

Before resolving this bug as a dup, I'd like to see the result of 'xprop' on the window in attachment 91854 [details]. Jeesun, after opening up that window in Mozilla, you can run 'xprop' from an xterm, move the cross-mouse-pointer over the window and press the left button. In the xterm, you'll get a screenful of output like the following:

-----------
snip...
_NET_WM_NAME(UTF8_STRING) = 0x70, 0xc3, 0xbc, 0x6b, 0xc3, 0xbc, 0x20, 0x2d, 0x20, 0x6e, 0x65, 0x74, 0x73, 0x63, 0x61, 0x70, 0x65, 0x2e, 0x70, 0x75, 0x62, 0x6c, 0x69, 0x63, 0x2e, 0x6d, 0x6f, 0x7a, 0x69, 0x6c, 0x6c, 0x61, 0x2e, 0x71, 0x61, 0x2e, 0x69, 0x31, 0x38, 0x6e, 0x20, 0x6f, 0x6e, 0x20, 0x6e, 0x65, 0x77, 0x73, 0x2e, 0x6d, 0x6f, 0x7a, 0x69, 0x6c, 0x6c, 0x61, 0x2e, 0x6f, 0x72, 0x67, 0x20, 0x2d, 0x20, 0x4d, 0x6f, 0x7a, 0x69, 0x6c, 0x6c, 0x61
snip....
WM_LOCALE_NAME(STRING) = "ko_KR.eucKR"
WM_NAME(STRING) = "p?k? - netscape.public.mozilla.qa.i18n on news.mozilla.org - Mozilla"
snip...
--------

I'm interested in three of them - _NET_WM_NAME, WM_LOCALE_NAME and WM_NAME. BTW, you may also try http://jshin.net/moztest/frenchname.html and http://jshin.net/moztest/frenchname.utf8.html
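To read the xprop output quoted above, a minimal Python sketch can turn the hex list printed for _NET_WM_NAME back into text and show roughly why WM_NAME ends up with ?'s. Only the first eight bytes of the quoted value are used. The helper name `parse_xprop_bytes` is hypothetical, and using Python's `euc_kr` codec as a stand-in for the ko_KR.eucKR locale conversion is an assumption, not a description of what Xlib or Mozilla does.

```python
# First eight bytes of the _NET_WM_NAME value quoted above (truncated here).
xprop_hex = "0x70, 0xc3, 0xbc, 0x6b, 0xc3, 0xbc, 0x20, 0x2d"


def parse_xprop_bytes(text):
    """Turn xprop's comma-separated hex list into a bytes object."""
    return bytes(int(tok.strip(), 16) for tok in text.split(","))


# _NET_WM_NAME is typed UTF8_STRING, so decode it as UTF-8.
title = parse_xprop_bytes(xprop_hex).decode("utf-8")
print(title)

# Assumption: Python's euc_kr codec stands in for the locale conversion.
# Characters it cannot represent are replaced with '?'.
legacy = title.encode("euc_kr", errors="replace").decode("euc_kr")
print(legacy)
```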
Jungshik, please notice that even the first case is showing Kanji instead of ?'s on the window's title bar
> To be sure, I have to know how the French name in the screenshot was
> entered in the first place.
I copied the letters from http://home.netscape.com/fr site and pasted them into mail compose window.

> Just in case, Jeesun, could you show me the values of
> $LC_* and $LANG?
> $ env | egrep '(LC_|LANG)'
LANG=ja_JP.eucJP
GDM_LANG=ja_JP
There are no $LC env variables

> I also like to know what window manager
I don't know. How can I tell?

> you're using and what you have in /etc/sysconfig/i18n
LANG="ja_JP.eucJP"
SUPPORTED="zh_TW.euctw:zh_TW:zh:en_US:en:ja_JP.eucJP:ja_JP:ja:ko_KR.euckr:ko_KR:ko"
SYSFONT="lat0-16"
SYSFONTACM="iso01"
Attached file The result of xprop
> Before resolving this bug as a dup, I'd like to see the result of
> 'xprop' on the window in attachment 91854 [details].
See the attached text file.
Jeesun, thank you for testing. Can you also try http://jshin.net/moztest/frenchname.html (ISO-8859-1) and http://jshin.net/moztest/frenchname.utf8.html (UTF-8) to see whether the window title bar has the same problem? I think it does. As for the result of xprop, I'm sorry; what I wrote in comment #8 may have been confusing. I'd like you to run 'xprop' over a _real_ *Mozilla* window with Kanji in the title bar (instead of an image display program showing the screenshot).
Sorry I misunderstood your comment. Here's a new xprop result
> I'd like you run 'xprop' over
> a _real_ *Mozilla* window with Kanji's in the titlebar

I ran 'xprop WM_NAME 8x' and 'xprop _NET_WM_NAME 8x' over a Mozilla window displaying http://jshin.net/moztest/frenchname.html under both the ja_JP.eucJP locale and the ko_KR.eucKR locale. The result is interesting, although I don't think there's anything Mozilla is doing wrong here. Under both locales, Mozilla sets _NET_WM_NAME (UTF8_STRING) correctly (i.e. to the UTF-8 string of the title of the page above with the French name). This is thanks to a patch committed last July for bug 9449.

In the case of WM_NAME, Mozilla sets WM_NAME(STRING) to 'd?cel?es Apr?s' under the ko_KR.eucKR locale. I expected the same under ja_JP.eucJP. However, under the ja_JP.eucJP locale, for some unknown reason, it sets WM_NAME(COMPOUND_TEXT) to the following:

  WM_NAME(COMPOUND_TEXT) = "d\033$(B\017+1c\033(Bel\033$(B\017+1e\033(Bs Apr\033$(B\017+2s

where '\033' denotes ESC (0x1b) and '\017' denotes Shift-In (0x0f). "ESC $ ( B" is not a sequence defined in ISO-2022, although it could well be used synonymously with "ESC $ B" for designating JIS C 6226:1983 as G0. More mysterious is why 'Shift-In' is there. This bug may have exposed a bug in XFree86's compound_text/string handling code, or a bug in gtk. Anyway, it seems like there's very little Mozilla can do here other than what we already know about this (see bug 9449 and bug 150131). As I wrote in bug 9449 and bug 150131, we've already taken care of the window title bar issue under Linux with the _NET_WM_NAME patch.

Nonetheless, it still bothers me that Mozilla's behavior under ko_KR.eucKR is different from that under ja_JP.eucJP. Naoki, can you think of any place where Japanese and Korean are treated differently? Although not likely, there's a possibility that for ja_JP.eucJP a translation step is missing somewhere while it's not for ko_KR.eucKR. I'm adding Katakai-san to CC so that he can take a look at this.
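For readers decoding the escaped WM_NAME value quoted above by hand, here is a small Python sketch that turns backslash-plus-three-octal-digit escapes into the bytes they denote. The helper `unescape_octal` is hypothetical, and the sample is only a prefix of the quoted value. It shows that octal 033 is byte 0x1b (ESC) and octal 017 is byte 0x0f.

```python
import re


def unescape_octal(text):
    """Replace each backslash + three octal digits with the byte it denotes."""
    raw = text.encode("latin-1")
    return re.sub(
        rb"\\([0-3][0-7]{2})",
        lambda m: bytes([int(m.group(1), 8)]),
        raw,
    )


# Prefix of the WM_NAME(COMPOUND_TEXT) value quoted above.
sample = r"d\033$(B\017+1c\033(Bel"
decoded = unescape_octal(sample)
print(" ".join(f"{b:02x}" for b in decoded))
```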
Jeesun, thank you for the screenshots. It turns out that U+00E8 and U+00E9 are representable in EUC-JP because they're covered by JIS X 0212. That's why I got a sequence like '0x8f 0xab 0xb1' in the hexdump below (note that ix86 is little-endian, so the two octets in each 16-bit word are reversed). '0x8f' is used to declare that the following two octets represent a character in JIS X 0212, and '0xab 0xb1' is indeed U+00E9 in JIS X 0212 (invoked on GR). (My understanding of the X11 Compound Text encoding was wrong; there must be something more in it than ISO 2022. I have the X11 C_T spec somewhere, but I didn't bother to dig it up.)

(xprop was run under ja_JP.eucJP)

  $ xprop | egrep '^WM_NAME' | sed 's/WM_NAME(COMPOUND_TEXT) = "//' \
      | hexdump
  0000000 8f64 b1ab 335c 3334 6c65 ab8f 5cb1 3433
  0000010 7335 4120 7270 ab8f 5cb2 3633 2033 202d
  0000020 6f4d 697a 6c6c 2061 427b 6975 646c 4920
  0000030 3a44 3220 3030 3032 3136 3031 7d38 0a22

Then, why the stray Kanji in the title bar? I think that's because RH JA 7.1 is misconfigured. /etc/gtk/gtkrc.ja on my RH (En) 7.1 does not have any JIS X 0212 fonts. Adding a JIS X 0212 font there may or may not solve the problem. Some window managers require a separate fontset specification. Anyway, this is a duplicate of bug 150131 (and 9449), and I suggest this be marked as such.
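The byte reversal mentioned in the comment above can be undone mechanically. The following Python sketch takes the first two rows of the quoted hexdump output, restores the original byte order, and decodes the three-byte 0x8f sequences. The helper `words_to_bytes` is hypothetical, and the decode step assumes that Python's `euc_jp` codec handles JIS X 0212 through the 0x8f prefix.

```python
# First two rows of the hexdump output quoted above:
# an offset column followed by 16-bit little-endian words.
dump = """\
0000000 8f64 b1ab 335c 3334 6c65 ab8f 5cb1 3433
0000010 7335 4120 7270 ab8f 5cb2 3633 2033 202d
"""


def words_to_bytes(text):
    """Undo hexdump's 16-bit little-endian grouping."""
    out = bytearray()
    for line in text.splitlines():
        for word in line.split()[1:]:  # skip the offset column
            out += int(word, 16).to_bytes(2, "little")
    return bytes(out)


raw = words_to_bytes(dump)
print(raw)

# Assumption: Python's euc_jp codec maps 0x8f-prefixed sequences via JIS X 0212.
e_acute = raw[1:4].decode("euc_jp")    # bytes 8f ab b1
e_grave = raw[22:25].decode("euc_jp")  # bytes 8f ab b2
print(e_acute, e_grave)
```

The backslash-digit runs left in `raw` (for example the four bytes after the first 0x8f sequence) are xprop's own octal escapes and are not decoded by this sketch.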
Marking this as a dup of bug 150131, as Jungshik suggested.

*** This bug has been marked as a duplicate of 150131 ***
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → DUPLICATE
Product: MailNews → Core
Product: Core → MailNews Core