Closed Bug 1075436 Opened 10 years ago Closed 10 years ago

Trying to forward a Japanese e-mail (ISO-2022-JP encoding for mail main body) with certain attachment files cause a garbled main text in mail composite window

Categories

(Thunderbird :: Message Compose Window, defect)

31 Branch
defect
Not set
major

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1026989

People

(Reporter: ishikawa, Unassigned)

Details

Attachments

(8 files, 2 obsolete files)

23.50 KB, application/msword
12.36 KB, application/vnd.ms-word
8.57 KB, text/plain
8.58 KB, text/plain
3.73 KB, text/plain
421.61 KB, image/png
324.59 KB, image/png
13.36 KB, message/rfc822
In the past, I have experienced garbled main body text when I tried to forward a message that came in from someone. I am using the Japanese locale under Linux. (I specified x86 in this initial report. I will check whether it also happens on x86_64; I suspect it does. I am not sure about Windows 7. I will check that as well, but it may be unlikely.)

Over the years, I have figured out that this happens when there are attachment files in the incoming e-mail. Luckily, this garbling of text (often called Mojibake in Japanese) happened again last week. Due to the nature of the correspondence, I cannot post the original e-mail, but based on my suspicion about attached files, I created a simpler e-mail and could reproduce the issue.

I am going to upload
- a set of five files that were attached to an e-mail that I sent to myself. I think the particular order of the attachments may be very important.

Also, I am going to upload screen images to explain what it looks like when the garbling of the main text occurs.
- The list of attached files is shown correctly.
- When I checked the display character encoding of the received mail, it shows "Shift JIS", which is patently wrong, because the main mail body text is in ISO-2022-JP (!)

I think a plain text attachment that contains Shift_JIS encoded characters might have confused TB.

Now that I can confuse my own TB by trying to forward the problematic message myself, I am sure that I have caused the same issue for others (TB users) when I sent them an e-mail with such a combination of attachments. (I am not exactly sure what combination is required, other than that I think we need a Shift_JIS encoded plain text file somewhere in the list at least.)

TIA. I hope the information is good enough.
After uploading the plain text files buggy-attachment-sjis-{1,2,3}.txt, I noticed that nothing says what character encoding they are in. The first two attachments are MS-WORD files (.doc and .docx, the newer format). The last three files contain Japanese text in Shift_JIS encoding. You have to make sure that they are still in Shift_JIS encoding after you download them if you want to test the scenario on your computer. On top of that, buggy-attachment-sjis-1.txt has UNIX line terminators (LF only), whereas buggy-attachment-sjis-{2,3}.txt have DOS line terminators (CR LF). I don't know if this is relevant or not.

Now I am going to upload the screen images. Hmm... I have trouble capturing the screen image while I am pulling down a menu to show the encoding used, so let me try this later on.

Anyway, to reproduce the issue, I create a Japanese message (the main mail text is Japanese and is sent out as ISO-2022-JP text) with these attachment files.

Fact 1: When I receive this e-mail, I can read the e-mail and the attachment files (the file names are also shown correctly).

Fact 2: BUT, at this moment, strangely, TB shows that the encoding in use is Shift_JIS instead of ISO-2022-JP. (pulldown menu Display -> character encoding.)

Fact 3: When I try to forward this e-mail, the main mail text (the Japanese portion) gets garbled in the composition window. Somehow, the listing of attachment files is shown correctly: the Japanese filenames are not mangled.

Fact 3b: I noticed today, while tinkering to reproduce the issue, that if I try to edit the received message as new, the mail composition window again shows the main mail body text (the Japanese portion) garbled. The listing of attached files is shown correctly. So it is not necessarily the forwarding: just trying to re-edit the main text body causes the problem to manifest.

Fact 4: BTW, if I simply reply to the original message, somehow the main mail text is quoted correctly. It ends up in the mail composition window without getting garbled (with ">" at the front). The only difference from Fact 3 and Fact 3b above is that the files are not attached in this case.

NOTE: I have noticed that the message goes out as one mixed MIME message

Content-Type: multipart/mixed; boundary="------------060907060509000107020603"

and the main mail body text is created as a part encoded in ISO-2022-JP:

This is a multi-part message in MIME format.
--------------060907060509000107020603
Content-Type: text/plain; charset=iso-2022-jp
Content-Transfer-Encoding: 7bit

これはテストです。
This is a test.

The subsequent attachment parts follow. The plain text file (Shift_JIS encoded) one shows this. Note the charset parameter:

--------------060907060509000107020603
Content-Type: text/plain; charset=Shift_JIS; name="buggy-attachment-sjis-1.txt"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="buggy-attachment-sjis-1.txt"

I suspect that TB gets confused after inspecting the charset of the attached file and uses leftover information from this Shift_JIS attachment when showing the display character encoding (as Shift JIS instead of ISO-2022-JP). Or the encoder/decoder is left in a strange mode that applies Shift_JIS encoding/decoding to ISO-2022-JP data when the main text body in the MIME part is decoded and put into the composition window.

What does this mean? Well, I have found that just attaching a single file, buggy-attachment-sjis-1.txt, to a Japanese (ISO-2022-JP) outgoing mail causes the same misbehavior: the main mail body text is garbled when the recipient (TB) tries to forward the said mail or re-edit it as new (in the composition window). So testing this should be easy now.

Screen shots have to wait for now. TIA
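The MIME structure quoted above can be inspected outside TB. The following is a minimal sketch in Python (it is not Thunderbird code), assuming a message shaped like the excerpt: an ISO-2022-JP body part followed by one base64 Shift_JIS text attachment. The attachment text and the helper names `body_part` and `last_part_charset` are made up for illustration. It only shows that the two parts carry different charset labels, and that taking the label of the last text part yields Shift_JIS, which is what the display menu reported. Whether TB actually selects the charset this way is a guess.

```python
import base64
from email import message_from_bytes

BOUNDARY = b"------------060907060509000107020603"
BODY = "これはテストです。\r\nThis is a test."
ATTACHMENT = "添付ファイルのテストです。\r\n"  # hypothetical attachment content

# Raw message built by hand so each part keeps exactly the charset label
# shown in the bug report.
raw = b"\r\n".join([
    b"MIME-Version: 1.0",
    b'Content-Type: multipart/mixed; boundary="' + BOUNDARY + b'"',
    b"",
    b"This is a multi-part message in MIME format.",
    b"--" + BOUNDARY,
    b"Content-Type: text/plain; charset=iso-2022-jp",
    b"Content-Transfer-Encoding: 7bit",
    b"",
    BODY.encode("iso-2022-jp"),
    b"--" + BOUNDARY,
    b'Content-Type: text/plain; charset=Shift_JIS; name="buggy-attachment-sjis-1.txt"',
    b"Content-Transfer-Encoding: base64",
    b'Content-Disposition: attachment; filename="buggy-attachment-sjis-1.txt"',
    b"",
    base64.b64encode(ATTACHMENT.encode("shift_jis")),
    b"--" + BOUNDARY + b"--",
    b"",
])

msg = message_from_bytes(raw)


def text_parts(message):
    """All text/plain parts, in the order they appear in the message."""
    return [p for p in message.walk() if p.get_content_type() == "text/plain"]


def body_part(message):
    """First text/plain part that is not marked as an attachment."""
    for part in text_parts(message):
        if part.get_content_disposition() != "attachment":
            return part
    return None


def last_part_charset(message):
    """Charset label of the last text/plain part (the suspected wrong choice)."""
    parts = text_parts(message)
    return parts[-1].get_content_charset() if parts else None


body = body_part(msg)
body_charset = body.get_content_charset()
body_text = body.get_payload(decode=True).decode(body_charset)

print("charset of the body part:", body_charset)
print("charset of the last part:", last_part_charset(msg))
print(body_text)
```

Decoding the body part with its own label gives back the readable Japanese text, so the message itself is well formed; the problem would lie only in which label is applied to which part.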
OS: Linux → All
Hardware: x86 → All
Version: unspecified → 31
OK, in terms of the bug's ubiquity, I found out that the same issue occurs with the latest TB (31.1.2) under Windows 7 64-bit, so I think this is a platform-agnostic bug. I have therefore changed the platform to All and the version of TB to 31.

Here I am going to upload a screen shot that I captured under Windows 7 64-bit. (There are more tools to capture screen images easily under Windows. I used a tool called Wink. I found that, under Linux for example, showing the encoding via [display] -> [character encoding] by pressing the mouse button interferes with the screen capture technique.)

Here it goes. I sent out a message with ISO-2022-JP as the main mail body text encoding, and attached a plain text file (with Shift_JIS encoded text inside). (Done with TB under Windows 7 64-bit.) I received the mail myself using the same TB. The message is shown correctly. The main text is readable. The attached file is one of the uploaded files. But note that TB shows an incorrect encoding for the mail message. It says Shift_JIS, but the main body is in ISO-2022-JP. (I think TB somehow picked up the encoding of the attached file and remembers it as the main body's.)
Attachment #8498194 - Attachment description: The problematic message received. Visible OK, but shown encoding (Shift_JIS) does not match the main body's (ISO-2022-JP). → SCREEN: The problematic message received. Visible OK, but shown encoding (Shift_JIS) does not match the main body's (ISO-2022-JP).
This is the screen shot of the composition window when the problematic mail was forwarded. Please note that the main body text (the Japanese portion) is garbled, but the subject line remains readable. I hope these help someone to look into this garbled message (Mojibake) issue. TIA
This sounds like the phenomenon of bug 715823, but I'm not sure, because you have not provided the message headers of the problematic mail that was forwarded...
(In reply to WADA from comment #8)
> Sounds phenomenon of bug 715823, but I'm not sure because data about
> "message headers of the problematic mail which was forwarded" is not
> provided by you...

Sounds like a similar case. I don't know why a patch has not been produced yet.

FYI, I am uploading the message that causes the garbling of the mail body text in the mail composition window when I try to forward it. The main body is in ISO-2022-JP, and the attachment is a Shift_JIS encoded Japanese text file.

TIA

PS: I deleted the Received and Authentication headers.
Oops, deleting a few header lines and leaving a blank line behind made it impossible to load the message into TB directly. Hopefully fixed.
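For anyone preparing a similar test message by hand: in the Internet message format, the first blank line ends the header block, so a blank line left among the headers turns the remaining headers into body text. A minimal sketch in Python, assuming (the exact position in the obsoleted attachment is not known) that the stray blank line sat before the Content-Type header; the addresses and the subject are placeholders:

```python
from email import message_from_string

GOOD = (
    "From: sender@example.com\n"
    "To: recipient@example.com\n"
    "Subject: test\n"
    "Content-Type: text/plain; charset=iso-2022-jp\n"
    "\n"
    "body text\n"
)

# Same message, but a blank line was left behind where header lines
# were deleted (hypothetical position, for illustration only).
BROKEN = (
    "From: sender@example.com\n"
    "\n"
    "To: recipient@example.com\n"
    "Subject: test\n"
    "Content-Type: text/plain; charset=iso-2022-jp\n"
    "\n"
    "body text\n"
)

good = message_from_string(GOOD)
broken = message_from_string(BROKEN)

# Headers after the stray blank line are no longer seen as headers.
print("good subject:  ", good["Subject"])
print("broken subject:", broken["Subject"])
print("good charset:  ", good.get_content_charset())
print("broken charset:", broken.get_content_charset())

# They end up at the start of the body instead.
print(broken.get_payload())
```

A mail client reading the broken variant has no Content-Type and no charset to work with, which is consistent with the file failing to load as a proper message.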
Attachment #8499382 - Attachment is obsolete: true
Whew. I did not want everybody who tries this to send an e-mail to me, so I changed the From: and To: data.
Attachment #8499385 - Attachment is obsolete: true
The phenomenon you saw is the one described by the bug summary of bug 715823, i.e. this is a dup of bug 715823.
> if different charset is used in part under multipart/mixed,
> charset of last part is used in forward/edit as new, without converting data to the charset)
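The effect described in that summary can be reproduced at the byte level. The following is a minimal sketch in Python, not the Thunderbird implementation: `forward_buggy` and `forward_expected` are hypothetical names that model "use the charset of the last part without converting the data" versus "decode the body with its own charset".

```python
ORIGINAL = "これはテストです。"


def forward_buggy(body_bytes, last_part_charset):
    """Model of the reported behaviour: the body bytes are left as they are
    and simply interpreted with the charset of the last part."""
    return body_bytes.decode(last_part_charset, errors="replace")


def forward_expected(body_bytes, body_charset):
    """Expected behaviour: the body is decoded with its own charset."""
    return body_bytes.decode(body_charset)


# The body travels as ISO-2022-JP bytes (7-bit, with escape sequences).
body_bytes = ORIGINAL.encode("iso-2022-jp")

garbled = forward_buggy(body_bytes, "shift_jis")
correct = forward_expected(body_bytes, "iso-2022-jp")

print("decoded with the body charset:     ", correct)
print("decoded with the last part charset:", repr(garbled))
```

Because ISO-2022-JP data consists of 7-bit bytes plus escape sequences, a Shift_JIS decoder accepts it without any error and produces ASCII-looking noise instead of Japanese, which is why the failure shows up as silent Mojibake rather than as a decoding error.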
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
Summary: Trying to forward a Japanese e-mail with certain attachment files cause a garbled main text in mail composite window → Trying to forward a Japanese e-mail (ISO-2022-JP encoding for mail main body) with certain attachment files cause a garbled main text in mail composite window
While bug 715823 was being processed, only the "garbled display in Forward Inline case" was resolved, by an unknown patch in an unknown bug. However, the essential issue, "charset is picked up from last attachment" in Thunderbird, was not resolved by anyone at that time. It was resolved recently by bug 1026989. Duping to bug 1026989 for ease of tracking and of understanding the problems involved.