Bug 19081
Opened 25 years ago
Closed 15 years ago
ISO-2022-JP conversion bugs for single byte katakana
(MailNews Core :: Internationalization, defect, P4)
MailNews Core
(Not tracked)
(Reporter: ji, Assigned: m_kato)
(Keywords: intl)
(4 files)
Build: 1999111709-M12
OS: RH6.0
Send out a mail with half-width katakana in the subject and message body.
When receiving it, the half-width katakana disappears in the thread pane
and in the message pane the string following the half-width katakana
disappears as well. The same message displays alright using 4.7.
Steps to reproduce:
1. Open mail compose window and select menu "View | Charset | Japanese (
2. Enter "カタカナカタカナ漢字" in the subject and message body, send the mail
to the testing account itself.
3. When receiving the mail, you'll see in the thread pane the subject displays
and in the message pane, both in the subject and message body only
カタカナ displays.
Below is the "view source" with 4.7:
Subject: =?ISO-2022-JP?B?GyRCJSslPyUrJUobKEq2wLbFGyRCNEE7ehsoQg==?=
Content-Type: text/html; charset=ISO-2022-JP
Content-Transfer-Encoding: 7bit
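To see which escape sequences the 4.7 Subject header actually contains, the B-encoded payload can be decoded and split at each escape sequence. The following is only a rough sketch: `split_segments` and the charset labels are names made up for illustration, and only the escape sequences discussed in this bug are recognised.

```python
import base64
import re

# Encoded-word payload copied from the 4.7 "view source" Subject header above.
PAYLOAD = "GyRCJSslPyUrJUobKEq2wLbFGyRCNEE7ehsoQg=="

# Escape sequences relevant to this bug; the labels are informal.
ESCAPES = {
    b"\x1b(B": "ASCII",
    b"\x1b(J": "JIS X 0201 Roman",
    b"\x1b(I": "JIS X 0201 Katakana (7-bit)",
    b"\x1b$@": "JIS X 0208-1978",
    b"\x1b$B": "JIS X 0208-1983",
}


def split_segments(raw: bytes):
    """Return (charset label, payload hex) pairs, one per non-empty segment."""
    segments = []
    label = "ASCII"  # ISO-2022-JP text starts in ASCII
    pos = 0
    for match in re.finditer(rb"\x1b[($][@BIJ]", raw):
        if match.start() > pos:
            segments.append((label, raw[pos:match.start()].hex()))
        label = ESCAPES.get(match.group(), "unknown")
        pos = match.end()
    if pos < len(raw):
        segments.append((label, raw[pos:].hex()))
    return segments


for label, payload_hex in split_segments(base64.b64decode(PAYLOAD)):
    print(f"{label}: {payload_hex}")
```

Any segment whose hex dump shows bytes of 0x80 or above is 8-bit data, which is worth checking for in a header that is declared as 7bit.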
Comment 1•25 years ago
Set 5894 as depend bug. It should eventually be converted to full-width.
But currently they should be turned into question marks (unmapped chars) instead of
Comment 2•25 years ago
I think we have ISO-2022-JP encoder/decoder problems for Hankaku Katakana.
Here's what I got on sending the string (view this under Japanese (Auto-Detect)):
in the Subject header and body.
1. Sent from 11/15/99 Linux M11 build:
Subject: =?ISO-2022-JP?B?GyRCJU8lcyUrJS8bKErK3ba4GyRCSD4zURsoQg==?=
Content-Type: text/plain; charset=ISO-2022-JP; format=flowed
Content-Transfer-Encoding: quoted-printable
X-MIME-Autoconverted: from 8bit to quoted-printable by id UAA15287
2. Sent from 4.71 -- 11/15/99 Win32 build:
Subject: =?iso-2022-jp?B?GyRCJU8lcyUrJS8bKElKXTY4GyRCSD4zURsoQg==?=
Content-Type: text/html; charset=iso-2022-jp
Content-Transfer-Encoding: 7bit
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
(Note: You can use a B64 decoder found here: http://kaze:8000/tools/base64decode.html)
The important fact is that 5.0 is not removing Hankaku data, it is just encoding it wrong.
Note that 4.71 is generating "<ESC>(I" for 7-bit Hankaku but 5.0 is incorrectly
generating "<ESC>(B" and then at the same time generating 8-bit JIS (correctly).
(Note that Messaging server QP'ed it because 8-bit data were in the mail.)
This is why Mozilla cannot display some part of the string sent from Mozilla.
So we have a few problems.
1. 5.0 cannot display its own Hankaku generation but can display the one sent from 4.71.
On the other hand, 4.71 can display both types correctly.
--> This indicates we don't have a good JIS decoder which can tolerate these
variations in Hankaku encodings.
In particular we need to be able to decode, "<ESC>(I", JIS7 Hankaku, and JIS8 Hankaku.
Also tolerate a mistake such as "<ESC>(B" followed by JIS7 Hankaku code points just as
4.x does.
2. We need to be able to encode properly using "<ESC>(I" in case we offer Hankaku-send option
via prefs.js. This is the only encoding we should use in case we allow Hankaku in send.
I think we should fix these problems pretty soon, particularly the decoding problem and
improvement and tolerance in decoding.
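As a rough illustration of the tolerance asked for in item 1, here is a small decoder sketch that accepts both "<ESC>(I" followed by 7-bit Hankaku and raw 8-bit Hankaku bytes in any mode. This is not the Mozilla decoder. `decode_tolerant` and `MODES` are hypothetical names, JIS X 0201 Roman is treated as ASCII for simplicity, two-byte characters are decoded by borrowing Python's `euc_jp` codec, and the "<ESC>(B followed by JIS7 Hankaku" mistake is not handled because it cannot be told apart from ASCII without extra heuristics.

```python
import base64

# Maps the two bytes after ESC to a decoder mode. Treating JIS X 0201 Roman
# as ASCII is a simplification (0x5C and 0x7E differ between the two).
MODES = {
    b"(B": "ascii",
    b"(J": "ascii",
    b"(I": "kana",
    b"$@": "jis0208",
    b"$B": "jis0208",
}


def decode_tolerant(raw: bytes) -> str:
    """Decode ISO-2022-JP, accepting 7-bit and 8-bit half-width katakana."""
    out = []
    mode = "ascii"
    i = 0
    while i < len(raw):
        b = raw[i]
        if b == 0x1B and raw[i + 1:i + 3] in MODES:
            mode = MODES[raw[i + 1:i + 3]]
            i += 3
        elif 0xA1 <= b <= 0xDF:
            # 8-bit (JIS8) half-width katakana, tolerated in any mode.
            out.append(chr(0xFF61 + b - 0xA1))
            i += 1
        elif mode == "kana" and 0x21 <= b <= 0x5F:
            # 7-bit half-width katakana after ESC ( I.
            out.append(chr(0xFF61 + b - 0x21))
            i += 1
        elif mode == "jis0208" and 0x21 <= b <= 0x7E and i + 1 < len(raw):
            # JIS X 0208 pair: set the high bits and reuse the EUC-JP codec.
            pair = bytes([b | 0x80, raw[i + 1] | 0x80])
            out.append(pair.decode("euc_jp", errors="replace"))
            i += 2
        else:
            out.append(chr(b) if b < 0x80 else "\ufffd")
            i += 1
    return "".join(out)


# Subject payloads quoted above: 4.71 (ESC ( I, 7-bit) and the M11 build (8-bit).
FROM_471 = base64.b64decode("GyRCJU8lcyUrJS8bKElKXTY4GyRCSD4zURsoQg==")
FROM_M11 = base64.b64decode("GyRCJU8lcyUrJS8bKErK3ba4GyRCSD4zURsoQg==")
print(decode_tolerant(FROM_471) == decode_tolerant(FROM_M11))
```

With this kind of tolerance both variants of the Subject should come out as the same text, which is the behavior 4.71 shows.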
Comment 3•25 years ago
I need to make clear that the data string sent from 4.71:
should be represented more accurately as:
where I use <ESC> in place of the escape character, which may not be displayed
distinctly enough under Japanese encoding.
By the way the B-encoded Subject header by Mozilla is identical to the text data
when it is decoded.
Comment 4•25 years ago
I probably should clarify my position on display. I think we should display Hankaku in text body as the data
contains Hankaku. 4.x does display Zenkaku in the headers even though the real data are in Hankaku, but it displays
Hankaku in the text body. I'm not sure why the headers don't use Hankaku but we might want to copy
this behavior.
On send, Hankaku -> Zenkaku should be the default, but with a prefs.js option (no UI) to send Hankaku if needed.
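For reference, the Hankaku -> Zenkaku default on send could look roughly like the sketch below. It assumes Unicode NFKC normalization gives an acceptable mapping and applies it only to runs in the half-width katakana block (U+FF61 to U+FF9F), so other characters are left alone. `hankaku_to_zenkaku` is a made-up name, not an existing Mozilla function.

```python
import re
import unicodedata

# Runs of half-width katakana, including half-width punctuation and the
# half-width voiced / semi-voiced sound marks.
HANKAKU_RUN = re.compile("[\uff61-\uff9f]+")


def hankaku_to_zenkaku(text: str) -> str:
    """Convert half-width katakana runs to full-width; leave the rest as-is."""
    return HANKAKU_RUN.sub(
        lambda m: unicodedata.normalize("NFKC", m.group()), text
    )


# Half-width "katakana" (4 characters), written with escapes to avoid ambiguity.
print(hankaku_to_zenkaku("\uff76\uff80\uff76\uff85"))
```

Normalizing whole runs rather than single characters lets a base character followed by a half-width voiced sound mark combine into one full-width character.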
Comment 5•25 years ago
I notice that there are a few other bugs filed earlier for Hankaku problems. Are the
problems 1 & 2 covered in these other bugs? It was my understanding that decoding
with some tolerance should be working now.
Updated•25 years ago
Assignee: nhotta → cata
Comment 6•25 years ago
I think the decoder problem should be a separate bug.
We need to provide an easy reproducible case to cata.
Momoi san, could you create ISO-2022-JP string and attach to the bug?
Use hankaku "aiueo" (\uFF71\uFF72\uFF73\uFF74\uFF75), I couldn't do that because
4.x editor converts hankaku to zenkaku when saving.
Comment 7•25 years ago
Attached a .txt file which contains "a-i-u-e-o" as described
by nhotta encoded in ISO-2022-JP with the escape sequence
ESC(I followed by 7-bit JIS. This is how we should encode
Hankaku Katakana.
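For anyone who wants to regenerate such a test file, here is a minimal sketch of the encoding described above. It handles only ASCII and half-width katakana, and `encode_hankaku_7bit` is a hypothetical helper, not the Mozilla encoder.

```python
def encode_hankaku_7bit(text: str) -> bytes:
    """Encode ASCII and half-width katakana using ESC ( I and 7-bit bytes."""
    out = bytearray()
    in_kana = False
    for ch in text:
        cp = ord(ch)
        if 0xFF61 <= cp <= 0xFF9F:
            if not in_kana:
                out += b"\x1b(I"  # designate JIS X 0201 Katakana
                in_kana = True
            out.append(cp - 0xFF61 + 0x21)
        elif cp < 0x80:
            if in_kana:
                out += b"\x1b(B"  # back to ASCII
                in_kana = False
            out.append(cp)
        else:
            raise ValueError(f"not handled by this sketch: U+{cp:04X}")
    if in_kana:
        out += b"\x1b(B"  # always end in ASCII
    return bytes(out)


# Hankaku "a-i-u-e-o" as given by nhotta.
data = encode_hankaku_7bit("\uff71\uff72\uff73\uff74\uff75")
print(data)
```

Every byte produced this way stays below 0x80, so the result is safe for 7bit transfer.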
Comment 8•25 years ago
Comment 9•25 years ago
I guess I messed up the MIME type. Please change the extension
from .cgi to .txt when saving it.
Comment 10•25 years ago
Should this have [BETA] in the summary?
Comment 12•25 years ago
I think I should own this, I rewrote the ISO-2022-JP decoder. Probably a bug in
Assignee: cata → ftang
Updated•25 years ago
Comment 13•25 years ago
The attached test cases render correctly under ISO-2022-JP now. Marking this fixed.
Comment 14•25 years ago
Hi, ji, can you check this out on Linux, Windows, and Mac?
QA Contact: momoi → ji
Reporter | ||
Comment 15•25 years ago
I checked today's builds. On linux and windows, the half-width char
is converted to corresponding full-width char both in thread pane and
message view pane when sent out using ISO-2022-JP. And on mac, the half-width
char is not converted to full-width and is displayed just as it is both in
thread pane and message view pane. It looks like there are still some problems
Comment 16•25 years ago
Is this a display problem on Mac or a sending problem? Will you see the half-width
kana in the Mac browser when your page is in full-width kana?
Reporter | ||
Comment 17•25 years ago
It seems a sending problem.
When sending a mail containing half-width katakana from mac, the half-width
katakana is not converted to full-width when received on mac or windows.
I also can see this with 4.x on mac.
Reporter | ||
Comment 18•25 years ago
Retested again on mac. The half-width katakana is converted to full-width.
I might have done something wrong last time when I tested mac version.
So now all three platforms can convert the half-width katakanas to full-width.
There are two issues left related to half-width katakana:
1. Change the pref file to be able to send out half-width katakana, as we do with 4.X.
Are we going to keep this feature for 5.0?
2. We may need to be able to display half-width katakana sent from other mail clients.
Comment 19•25 years ago
The second issue, I agree that we want to display them correctly.
The pref option, we can use it to test hankaku mail display.
Comment 20•25 years ago
Marking it fixed. The original problem is now gone. IQA now needs to develop more
test cases for different pref settings.
Closed: 25 years ago
Resolution: --- → FIXED
Comment 21•25 years ago
ftang, what do you want to do about Mozilla not being
able to display the Hankaku string it sends out?
What Hankaku encoding should we use? 4.7x can display what
Mozilla sends out but Mozilla cannot.
By the way, Mozilla browser also cannot display ISO-2022-JP
saved Hankaku characters created by its own composer.
Can we do something better? Hankaku may not be used in
Mail all that often but we should be able to display
them in Browser.
We need to deal with this, I think, either in this bug or
in another bug.
I'm re-opening this bug and will attach a test message,
which displays OK under 4.72 but not under Mozilla even though
it was created by Mozilla.
Resolution: FIXED → ---
Comment 22•25 years ago
Comment 23•25 years ago
Comment 24•25 years ago
Comment 25•25 years ago
The bug was marked as FIXED because the attachment (posted on 11/19) displayed
correctly. It has ESC(I for hankaku.
The latest attachment (posted on 3/8) has ESC(B for hankaku. I don't think we
should support displaying hankaku with ESC(B.
So this is a problem of sending (converting to ISO-2022-JP from unicode). We
should generate ESC(I instead of ESC(B.
This is not a mail specific problem but usually ISO-2022-JP is used for mail and
in mozilla it's only used when the pref (no UI) is on explicitly.
Comment 26•25 years ago
It looks like nsIUnicodeEncoder has a problem. It generates 8-bit data in the
ESC ( B sequence, which is bad behavior.
Is the remaining problem a beta1 stopper? If so, please add beta1 to the keywords.
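A quick way to confirm this kind of encoder output is to scan the generated bytes for anything at 0x80 or above, since ISO-2022-JP is supposed to be 7-bit clean. The sketch below uses the two Subject payloads quoted in comment 2 as sample data; `find_8bit_bytes` is a made-up helper name.

```python
import base64


def find_8bit_bytes(raw: bytes):
    """Return (offset, byte value) for every byte that is not 7-bit."""
    return [(i, b) for i, b in enumerate(raw) if b >= 0x80]


# Subject payloads quoted in comment 2.
FROM_M11 = base64.b64decode("GyRCJU8lcyUrJS8bKErK3ba4GyRCSD4zURsoQg==")
FROM_471 = base64.b64decode("GyRCJU8lcyUrJS8bKElKXTY4GyRCSD4zURsoQg==")

for name, raw in (("M11", FROM_M11), ("4.71", FROM_471)):
    hits = find_8bit_bytes(raw)
    print(name, [(offset, hex(value)) for offset, value in hits])
```

An empty list means the data is 7-bit clean. Any hit means a mail server may re-encode the message, as happened with the quoted-printable conversion noted earlier.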
Comment 27•25 years ago
After discussing this with ftang, we decided to do the following:
1. There should be a bug about correctly generating "<ESC>(I" for
7-bit Hankaku.
2. There should be a bug about interpreting different Hankaku
escape sequences. momoi will investigate and file a
separate bug on it.
About #1, this bug was from the beginning about Mozilla not
generating the correct escape sequence. ji said it and I said it.
I provided the example from 4.71 only to show you what should
be generated, not to use it as a test case to see if Mozilla
can display it. Mozilla can display "<ESC>(I" Hankaku even
before this bug was filed.
So, I think we should keep this bug for item #1 above.
I'll file a new one for #2.
Comment 29•25 years ago
change the summary from "Wrong handling about Japanese half-width katakana in
the subject and message body" to "ISO-2022-JP conversion bugs for single byte
Summary: Wrong handling about Japanese half-width katakana in the subject and message body → ISO-2022-JP conversion bugs for single byte katan
Comment 31•25 years ago
Reassigned to nhotta. Tentatively set TM to M17.
Assignee: bobj → nhotta
Target Milestone: M16 → M17
Updated•25 years ago
Comment 32•25 years ago
For mail/news, single byte katakana is only sent when a backend only pref is
set. So I don't think this is critical for beta2 for mail/news.
Is this bug for mail/news only or reproducible for form submission?
Summary: ISO-2022-JP conversion bugs for single byte katan → ISO-2022-JP conversion bugs for single byte katakana
Comment 33•25 years ago
Based on the assumption that we won't provide pref UI to turn that pref on, we
should not mark this beta2.
Keywords: nsbeta2
Whiteboard: [nsbeta2+]
Updated•24 years ago
Target Milestone: M17 → M28
Updated•24 years ago
Target Milestone: --- → Future
Updated•20 years ago
Product: MailNews → Core
Updated•18 years ago
Attachment #2971 -
Attachment mime type: text/plain → text/plain; charset=iso-2022-jp
Updated•17 years ago
Product: Core → MailNews Core
Updated•16 years ago
QA Contact: ji → i18n
Assignee | ||
Comment 34•15 years ago
Not reproducible on 3.1. Half-width katakana is converted to full-width.
So I am resolving this as WORKSFORME.
Assignee: nhottanscp → m_kato
Closed: 25 years ago → 15 years ago
Resolution: --- → WORKSFORME