Open Bug 66326 Opened 24 years ago Updated 2 years ago

Japanese yen is displayed differently between the Subject field and body text in Composer

Categories

(MailNews Core :: Composition, defect, P4)

x86
Windows NT

Tracking

(Not tracked)

Future

People

(Reporter: momoi, Unassigned)

References

Details

(Keywords: intl)

Attachments

(2 files)

** Observed with 1/22/2001 Win32 build **
1. Bring up Mail Composer under Japanese Windows. Have the Japanese keyboard selected.
2. Set the encoding to Western (ISO-8859-1).
3. Now input the backslash/yen character in the Subject field. You should see the *yen* character.
4. Now move the cursor to the body text field, and input the backslash/yen character. This time you will see the *backslash* character.
Thus, under the same Japanese IME, the same input shows up differently.
QA Contact: esther → ji
So the desired behavior is for the subject field to display a backslash instead of the yen sign?
I believe that is the correct behavior. If the encoding of the mail msg is in ISO-8859-1, it should be the backslash. If it is in ISO-2022-JP, then the yen character.
I don't think we can specify a charset to non HTML widgets (e.g. url bar, address book fields).
Keywords: intl
There is a bug about passing a charset for XML: bug 41981 - need to pass XML encoding to nsIDocument. But even if we could pass a charset to XUL, we cannot change the charset of a particular text field (e.g. the subject field). And we do not want to change the charset of XUL to the mail charset (e.g. ISO-2022-JP), which would make localized strings (in UTF-8) unreadable.
Not possible for mozilla0.8 and mozilla0.9, marking as Future for now. This is a side effect of using Unicode, and it is caused by Shift_JIS mapping the character at 0x5C differently. So users with a Japanese OS always see the yen sign instead of a backslash. This is not desirable for a multilingual application, but it is the default behavior for many applications (e.g. file explorer, notepad, DOS prompt, etc.). If the bug were about not showing the yen sign on a Japanese OS, the priority would be higher, but this bug is the other way around.
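For readers unfamiliar with the 0x5C conflict, here is a minimal Python sketch of it. It does not use any Mozilla code. It assumes only that the single-byte half of Shift_JIS follows JIS X 0201, which assigns YEN SIGN to position 0x5C (and OVERLINE to 0x7E) where ASCII has backslash and tilde. The names `JIS_X_0201_ROMAN` and `decode_7bit` are made up for this illustration.

```python
from typing import Mapping

# Positions where the JIS X 0201 Roman set differs from ASCII.
# (Hypothetical table name, written out by hand for illustration.)
JIS_X_0201_ROMAN: Mapping[int, str] = {
    0x5C: "\u00A5",  # YEN SIGN instead of backslash
    0x7E: "\u203E",  # OVERLINE instead of tilde
}


def decode_7bit(data: bytes, overrides: Mapping[int, str]) -> str:
    """Decode 7-bit bytes as ASCII, except at positions the charset overrides."""
    chars = []
    for byte in data:
        if byte > 0x7F:
            raise ValueError(f"not a 7-bit byte: 0x{byte:02X}")
        chars.append(overrides.get(byte, chr(byte)))
    return "".join(chars)


path = b"C:\\temp"  # the byte after "C:" is 0x5C

as_ascii = decode_7bit(path, {})                    # backslash kept
as_jis_roman = decode_7bit(path, JIS_X_0201_ROMAN)  # 0x5C becomes YEN SIGN
```

The same byte yields two different characters depending on which table is applied, which is the root of the behavior described in this comment.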
Status: NEW → ASSIGNED
Priority: -- → P4
Target Milestone: --- → Future
Can you possibly make it so that it will be also backslash in the body text? At least that way there is consistency. Currently, the 2 inputs produce different results. I don't think that is desirable at all.
> Can you possibly make it so that it will be also backslash > in the body text? I meant the "yen" character. The above should read: Can you possibly make it so that it will be also the yen character in the body text?
As long as the user sets the message send default to Japanese, the body also shows yen instead of backslash. I think most Japanese users set the default to Japanese (that's the default setting for JA versions).
Can somebody with a Japanese Windows version test the validity of this bug?
Product: MailNews → Core
Assignee: nhottanscp → nobody
Status: ASSIGNED → NEW
QA Contact: ji → composition
Product: Core → MailNews Core
masayuki, can you test this and update bug? thanks
Here's another, like bug 6463
See Also: → 64363
(In reply to Wayne Mery (:wsmwk) from comment #11)
> Here's another, like bug 6463
You meant 64363.
OK, now I see it: I think I can replicate the issue in 64363. This is with the Japanese locale version of TB, under Windows 10 (with the Japanese locale chosen). If I follow the steps below, the subject input field shows the Japanese YEN mark whereas the main body text shows BACKSLASH when I hit the key for backslash. (It is interesting to note that in this comment field of Bugzilla, I see BACKSLASH when I type the "YEN"/"Backslash" key. Hmm. This is under Windows 10. That is why I used "YEN" and "Backslash" instead of typing in a single symbol.)
1. I open a message compose window for a new mail.
2. I change the encoding of the main text message to Western Europe (that is how the choice for ISO-8859-1 shows up under Japanese Windows). The default was UTF-8 for the message body.
3. For the subject, when I hit the key for backslash, I see the Japanese Yen currency mark.
4. For the main body, when I hit the key for backslash, I see BACKSLASH.
Screenshot attached. (Both steps 3 and 4 were performed with NO Japanese input conversion enabled, that is, the key press to text input is more or less straightforward.)
I probably have not typed BACKSLASH into the SUBJECT field very often before and did not realize the discrepancy between the subject and the main body text. This is indeed a problem if I want to include a character string (as in a string bundle) that contains a character escaped with a backslash. Or is it a problem? What does the recipient see? That is also a crucial part of the problem. So I sent the e-mail to myself. I will post the result in another comment.
While composing, I am not sure if "Western Europe" for encoding picks up ISO-8859-1 as charset since I see "Windows-1252" in the windows pane of the message compose windows in the attached image: https://bug66326.bmoattachments.org/attachment.cgi?id=8955941 But this is the only way under Windows (or for that matter under linux) with which we can obtain Western European encoding. Maybe, under linux, we get ISO-8859-1 (or is it an equivalent encoding? Only under different names?) Anyway, when I sent the message and receive the result, the superficial outlook is the same as the image during composition. I get Japanese YEN currency mark in the subject line, and BACKSLASH in the message body. Now I got curious. I saved the received message and opened it using MS Windows memo file (I think it simply tried to open it as UTF-8 plain text file.) Then I noticed the following relevant lines. It seems that it is a display issue of forcing the "localization" over the character code value of "\" (Backslash) in the SUBJECT line only (???). This is what I see after opening the saved message in memo (or memopad). 
--- begin quote ---
From: "ISHIKAWA,chiaki" <ishikawa@yk.rim.or.jp>
Subject: \\\\===abcdefg
To: "ishikawa, chiaki" <ishikawa@yk.rim.or.jp>
Message-ID: <44afee6e-69f5-3a22-9468-21691c7d06e8@yk.rim.or.jp>
Date: Mon, 5 Mar 2018 05:46:30 +0900
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Brightmail-Tracker: AAAABDDXuTIw2RSGMNe5MDDa9Fg=
X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFlrPqMTGxcIABLru0XOiDN6+ErfYd+A7iwOjx/EX75kCGKM4/PJLFIILEnMTODJ6dl9gK2CoAFENjAyrGIVzS8sNizMSi0sSE4uKC4ryUwwMNzFCDdzBuP2izClGSSlxXo/IOVFCAsVAQzJK80pSi+KLSnNSi18xinMwKgnzGkUBZXky80qKM9NhMhIcTEoivL5vZ0QJ8QJNR0hJNTBu8Lx3Kobrk6VjY/xix7nr+/el6UzJCtAQ3nqQW0Re4xnbd50bMy68OKxnk9yk/zfdPYpl+eHrZV1Nq4xSLtq+Z7l4/mDtle+S7Za1U68urG9JSyypP5hiKPBg12+ef3NYHr5avavn1X3vt/yz4yxOSsa5Rs6+devEFVZDj0MSm3NeWz4S1PmpxAL0sqEWc1FxIgB1duTKQgEAAA==
X-Text-Classification: personal
X-POPFile-Link: http://127.0.0.1:9090/jump_to_message?view=99530

\\\\====abcdefg
What happens when I sent this e-mail to myself, and see it under default setting.
--- end quote ---
I am a little perplexed. I have no idea what the correct fix to what parts would be.
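The saved message is declared as `charset=windows-1252` with `Content-Transfer-Encoding: 7bit`. A small Python sketch (the helper name `is_seven_bit` is made up here) shows why that is consistent with the subject containing BACKSLASH (U+005C) and not YEN SIGN (U+00A5): in windows-1252 the backslash is the 7-bit byte 0x5C, while a real yen sign would be the 8-bit byte 0xA5.

```python
def is_seven_bit(text: str, charset: str) -> bool:
    """Return True if every byte of text, encoded in charset, is below 0x80."""
    return all(byte < 0x80 for byte in text.encode(charset))


# The Subject value as it appears in the saved message (four backslashes).
subject_as_sent = r"\\\\===abcdefg"

# A string that really contains U+00A5 YEN SIGN, for comparison.
price_with_yen = "\u00A5980"

backslash_is_7bit = is_seven_bit(subject_as_sent, "windows-1252")  # True
yen_is_7bit = is_seven_bit(price_with_yen, "windows-1252")         # False
```

So a yen sign shown for this subject would have to come from how U+005C is rendered, not from the stored character.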
UTF-8 should allow us to input the Japanese YEN currency mark and BACKSLASH in the same body text and show them as different symbols. Now I am not sure how I can input these two symbols differently. When I try to use an English keyboard with the Japanese input method, I cannot seem to type BACKSLASH into the main text area of the default UTF-8 composition window. No matter which mode (transparent vs. Japanese input) I use, the symbol which appears in the main text area (not the subject) of the compose window is the Japanese YEN currency mark. This is certainly a different behavior (different from the case when I chose Western Europe encoding for the main mail body text).
Probably Thunderbird is using a Western font (such as Courier New) for message body and a Japanese font (such as MS UI Gothic) for subject. Japanese Windows fonts have a YEN SIGN glyph at the BACKSLASH code position (U+005C). Windows will always input a BACKSLASH code regardless of the key top symbol, but it will be displayed as YEN SIGN when a Japanese font is selected.
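Since the glyph on screen depends on the font, one way to see which character is actually stored is to look at code points instead of glyphs. A minimal Python sketch using the standard `unicodedata` module (the helper name `describe` is made up for this example):

```python
import unicodedata


def describe(text: str) -> list:
    """List each character as 'U+XXXX NAME', independent of any font."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
    ]


# One backslash followed by one yen sign.
result = describe("\\\u00A5")
# result == ['U+005C REVERSE SOLIDUS', 'U+00A5 YEN SIGN']
```

Text typed with the backslash/yen key on Windows would report U+005C here even when a Japanese font draws it as a yen sign.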
Yeah, this must be the yen sign issue. https://en.wikipedia.org/wiki/Yen_sign#Shift_JIS_and_Code_page_932 Prior to Gecko 1.9 (Firefox 3), we had a hack around this: U+005C was always replaced with U+00A5, only in nsTextFrame and only if the page's lang was ja-JP. However, this hack was removed for performance and to keep nsTextFrame as simple as possible. Even if we brought the hack back, we would have no way to show both "backslash" and "yen sign" with one font, since Japanese fonts usually have a yen sign glyph for U+005C on Windows.
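The idea of the removed hack can be sketched in a few lines of Python. This is not the Gecko code, only an illustration of the substitution described above. The function name `render_text_legacy` and the exact `lang` comparison are assumptions.

```python
BACKSLASH = "\u005C"
YEN_SIGN = "\u00A5"


def render_text_legacy(text: str, lang: str) -> str:
    """Return the text as it would be drawn under the old substitution.

    Hypothetical sketch: swap U+005C for U+00A5 at display time,
    and only when the document language is ja-JP.
    """
    if lang == "ja-JP":
        return text.replace(BACKSLASH, YEN_SIGN)
    return text


shown_ja = render_text_legacy("C:\\temp", "ja-JP")  # "C:¥temp"
shown_en = render_text_legacy("C:\\temp", "en-US")  # unchanged

# After substitution, a typed backslash and a real yen sign look identical,
# so the reader can no longer tell which one the sender stored.
ambiguous = (
    render_text_legacy("\\980", "ja-JP")
    == render_text_legacy("\u00A5980", "ja-JP")
)  # True
```

The substitution is applied to every text run in a ja-JP page, and it maps two distinct characters to one displayed form.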
(In reply to Masatoshi Kimura [:emk] from comment #15)
> Probably Thunderbird is using a Western font (such as Courier New) for message body and a Japanese font (such as MS UI Gothic) for subject. Japanese Windows fonts have a YEN SIGN glyph at the BACKSLASH code position (U+005C). Windows will always input a BACKSLASH code regardless of the key top symbol, but it will be displayed as YEN SIGN when a Japanese font is selected.
I agree with the above, since when I sent the problematic e-mail message to a different account and viewed it under linux, BOTH the subject and the main body text showed BACKSLASHes (!) instead of the Japanese YEN mark. I think the comment from Masayuki-san in comment 16 explains the reason behind the discrepancy. Without the hack that has performance implications, we can't show the proper glyph under Windows, then. (Probably under linux, it is a different story?) But now I have a question. Firefox 3 was a long time ago. Between then and now, CPU power has increased so much that the performance penalty mentioned in comment 16 may not be an issue any more. I mean, today a lower-end smartphone has a multi-core CPU clocked at over 1GHz and a few gigabytes of memory. This is basically the PC of yesteryear. Desktop PCs where TB is used are probably also more powerful (esp. CPU clock-wise). I suspect that there are Windows users who would go for correct display of glyphs because they need a "correct" display of the Japanese YEN currency mark vs. BACKSLASHes. Accounting-type workers come to mind. TB is not a speed demon as far as display speed goes, is it? (I think we have more performance issues at a higher level, like trying to show the subject lines of the same subject hierarchy at the same time, which is unnecessary to my eyes: when there are a thousand messages for testing purposes, it kills TB, I mean the display refresh slows to a crawl. This was my experience up to last summer. I got tired of testing some patches due to this bad behavior.)
A performance penalty spread over many parts in small chunks won't be noticed as badly as the performance slowdown which I mentioned in the previous sentence. Of course, I have no idea what type of performance penalty was observed back in the days prior to Firefox 3. Anyway, if the perceived performance penalty is no longer there, we may want to restore the hack for Japanese Windows, presumably the host OS with the largest number of TB users. I use at least one Windows OS and one linux OS constantly. I also have a couple of Windows PCs where I run TB occasionally (on the road, or for testing purposes). It is true that the Japanese font situation is a mess under linux, too. But at least this YEN mark and BACKSLASH discrepancy does not seem to happen there.
This is not only a performance issue but also a correctness issue. We can't tell if a sender intended to represent YEN SIGN or BACKSLASH with U+005C. Moreover, we have no way to display BACKSLASH because Microsoft Japanese fonts have the YEN SIGN glyph for *both* U+005C and U+00A5 and have no BACKSLASH glyph. This is not a problem that we should (and can) work around.
Yeah, Japanese fonts for Windows usually don't have a backslash glyph anyway. Even if we could forcibly show a backslash glyph for U+005C, most Japanese users would probably find backslashes odd, because this historical issue isn't a big deal for them. E.g., they understand the yen sign character to be the path delimiter on Windows. Additionally, most users type U+005C as the yen sign unless they use macOS (e.g., a lot of web sites still use backslashes as the prefix of price labels, so you can see "\980-" as a price on non-Windows platforms even in 2018!). So displaying it as a backslash may confuse them.
Severity: normal → S3