Incoming messages with windows-874 are readable but the selected encoding is incorrect(ISO-8859-1) (AVG inserts 2nd multipart/alternative with text/plain;charset=us-ascii only in it to multipart/mixed mail, then Tb uses ISO-8859-1 as mail body charset)

RESOLVED DUPLICATE of bug 1026989

Status

MailNews Core
MIME
--
minor
8 years ago
2 years ago

People

(Reporter: Preecha Wara, Unassigned)

Tracking

(Blocks: 1 bug)

Trunk
x86
Windows XP

Details

(Whiteboard: [See comment #13 for workaround, disable 'Certify e-mail' of AVG])

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

8 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8
Build Identifier: 3.1.4

Incoming messages with TIS-620 (Thai) encoding (I think other encodings would have the same problem) are readable but the selected encoding for the message being read is Western (ISO-8859-1) on the View -> Character Encoding menu. If I change from ISO-8859-1 to TIS-620, the message will be still readable. If I change back to ISO-8859-1, the message will become unreadable.

This has a few strange effects.

If I change the encoding from ISO-8859-1 to TIS-620 every time I read a TIS-620 message, everything will be fine (except that it's quite awkward to do that every time).
This can be used as a workaround.

If I leave the selected encoding as ISO-8859-1 while the actual encoding is TIS-620 and I forward the message, the body of the forwarded message in the compose window will be unreadable. (The subject is still readable.) After I forward that message once, the reply function will have this problem too, but the replied message will be unreadable in both the body and the subject.

Reproducible: Always

Steps to Reproduce:
1. Open a message with Thai TIS-620 encoding.
2. Check in the View -> Character Encoding menu whether the selected encoding is Western (ISO-8859-1). If it's TIS-620, the problem can't be reproduced.
3. Reply to this message. The body and the subject will be readable. (The bug does not occur yet.)
4. Close the compose window of the replied message.
5. Forward this message. The body of the forwarded message will be unreadable while the subject is still readable.
6. Close the compose window of the forwarded message.
7. Reply to the message. Both the body and the subject will be unreadable.

If we didn't close the compose window but really sent the message, the recipient will not be able to read it.
Actual Results:  
Forwarded message will be unreadable in the body only.
Replied message will be unreadable in both the body and subject.

Expected Results:  
The expected behavior is that the selected encoding on the View -> Character Encoding menu should be the same as the actual encoding of the message.

Select the correct encoding on the View -> Character Encoding menu. Then everything will go fine.
As seen in bug 597369, the charset of the composition window is taken from the currently selected mail on "Forward" and "Edit As New". Because ISO-8859-1 is shown as checked at View/Character Encoding in your case, all of your actual results can be explained.

Why is ISO-8859-1 shown at View/Character Encoding in your environment, even though the mail is TIS-620 and the TIS-620 glyphs are displayed as you expect?
TIS-620 is an 8-bit code, so if the charset is not properly specified in the mail, it is impossible to know that the mail is written in TIS-620 unless it contains byte values that occur only in TIS-620, even when auto-detect is enabled.
> http://www.langbox.com/codeset/tis620.html
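
The ambiguity can be illustrated with a minimal Python sketch. It assumes Python's built-in `tis_620` and `latin_1` codecs and an arbitrary sample string; it says nothing about how Tb itself detects or selects charsets.

```python
# Arbitrary sample Thai text (an assumption for illustration only).
text = "สวัสดี"

# Store it the way a TIS-620 mail body would be: one byte per character.
raw = text.encode("tis_620")

# Every one of these byte values is also valid ISO-8859-1, so decoding with
# the wrong charset never raises an error. It just yields other characters.
as_thai = raw.decode("tis_620")
as_western = raw.decode("latin_1")

print(as_thai)      # the original Thai text
print(as_western)   # Western accented letters instead of Thai
```

Because both decodes succeed, the bytes alone do not say which charset was intended, which is why the declared charset matters when the message is forwarded or replied to.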

Does the problem occur every time with this mail?
For example, click another mail with a proper charset=TIS-620 and view it, then click the mail again and view it, forward it, or edit it as new.

When did the problem start to occur? After you upgraded from Tb 3.1.3 to Tb 3.1.4?

Can you attach the mail data (saved as .eml) and a screenshot?
(Reporter)

Comment 2

8 years ago
If I remember correctly, I started to notice this problem in Tb 3.0.x.
Tb 3.1.3 and Tb 3.1.4 still have this problem.

I just discovered that it happens with my corporate email only.
Gmail and Windows Live emails are OK.
So the problem may be related to mail server configurations too.

However, Tb is smart enough to auto-detect the encoding correctly. But is it possible to set the selected encoding accordingly?

It's hard to capture screen since it has many steps.
(Reporter)

Comment 3

8 years ago
Created attachment 476831
EML file saved from Gmail Inbox
(Reporter)

Comment 4

8 years ago
Created attachment 476832
EML file saved from corporate Inbox
(Reporter)

Comment 5

8 years ago
Created attachment 476833
EML file saved from Gmail Inbox
(Reporter)

Updated

8 years ago
Attachment #476831 - Attachment is obsolete: true
(Reporter)

Comment 6

8 years ago
attached wrong file
(Reporter)

Comment 7

8 years ago
EML file saved from corporate Inbox   (6.85 KB, message/rfc822) 
(filename: SavedFromCorporateInbox.eml) can be opened and viewed correctly in Tb but the selected encoding is Western while the actual encoding is Windows-874 (Thai).

EML file saved from Gmail Inbox   (6.91 KB, message/rfc822) 
(filename: SavedFromGmailInbox.eml) can be opened and viewed correctly and the selected encoding is also correct (Windows-874 (Thai)).

I think these two files should be enough for you to tell what's wrong with the different selected encodings.
(Reporter)

Comment 8

8 years ago
Sorry, I can't make the Windows Live web mail interface send in TIS-620. But Windows-874 has the same effect.
Mail structure.
> Content-Type: multipart/mixed; boundary="_2a8d20b7-96f0-48d0-ad3e-09280d782eaf_"
>
>   --_2a8d20b7-96f0-48d0-ad3e-09280d782eaf_
>   Content-Type: multipart/alternative; boundary="_c056b4c1-2dae-4cba-9c12-6a87f7abf68f_"
>
>     --_c056b4c1-2dae-4cba-9c12-6a87f7abf68f_
>     Content-Type: text/plain; charset="windows-874"
>
>     --_c056b4c1-2dae-4cba-9c12-6a87f7abf68f_
>     Content-Type: text/html; charset="windows-874"
>     --_c056b4c1-2dae-4cba-9c12-6a87f7abf68f_--
>
>     --_2a8d20b7-96f0-48d0-ad3e-09280d782eaf_
>     Content-Type: image/jpeg
>
>     --_2a8d20b7-96f0-48d0-ad3e-09280d782eaf_
>     Content-Type: multipart/alternative; boundary="=======AVGMAIL-48910406======="
>
>     --=======AVGMAIL-48910406=======
>     Content-Type: text/plain; x-avg=cert; charset=us-ascii
>
>     --=======AVGMAIL-48910406=======--
>
>   --_2a8d20b7-96f0-48d0-ad3e-09280d782eaf_--
>

The phenomenon you described is observed with Tb 3.1.4 and a Tb trunk nightly build.
If the multipart/alternative part in the multipart/alternative part in the multipart/mixed mail is removed, the problem doesn't occur; windows-874 is shown/used as expected.
If the charset of the text/plain part in the multipart/alternative part in the multipart/alternative part in the multipart/mixed mail is changed to TIS-620, the charset shown at View/Character Encoding becomes TIS-620.

In multipart/alternative, the bottom part is the most preferable part, so the "text/plain part with charset=us-ascii in multipart/alternative" in the multipart/alternative part in the multipart/mixed mail is the most preferable part and the mail body of the mail.

Tb's bug is:
"Can you read this message? ..." in the text/plain or text/html part with charset=windows-874 under multipart/alternative is not correctly ignored, even though the most preferable part exists in multipart/alternative.

This looks like a bad AVGMAIL setup. Why should a multipart/alternative part be added to a mail that has a multipart/alternative part in multipart/mixed?
Correct mail structure interpretation. Sorry for my misunderstanding.
> Content-Type: multipart/mixed; boundary="_2a8d20b7-96f0-48d0-ad3e-09280d782eaf_"
>
>   --_2a8d20b7-96f0-48d0-ad3e-09280d782eaf_
>   Content-Type: multipart/alternative; boundary="_c056b4c1-2dae-4cba-9c12-6a87f7abf68f_"
>
>     --_c056b4c1-2dae-4cba-9c12-6a87f7abf68f_
>     Content-Type: text/plain; charset="windows-874"
>
>     --_c056b4c1-2dae-4cba-9c12-6a87f7abf68f_
>     Content-Type: text/html; charset="windows-874"
>     --_c056b4c1-2dae-4cba-9c12-6a87f7abf68f_--
>
>   --_2a8d20b7-96f0-48d0-ad3e-09280d782eaf_
>   Content-Type: image/jpeg
>
>   --_2a8d20b7-96f0-48d0-ad3e-09280d782eaf_
>   Content-Type: multipart/alternative; boundary="=======AVGMAIL-48910406======="
>
>     --=======AVGMAIL-48910406=======
>     Content-Type: text/plain; x-avg=cert; charset=us-ascii
>
>     --=======AVGMAIL-48910406=======--
>
>   --_2a8d20b7-96f0-48d0-ad3e-09280d782eaf_--

AVGMAIL added a part to multipart/mixed (== an attachment), although that part is multipart/alternative.
Charset shown at View/Character Encoding after each change:
(1) remove part added by AVGMAIL => windows-874
(2) remove charset=us-ascii(text/plain; x-avg=cert) => windows-874
(3) change charset(text/plain; x-avg=cert; charset=TIS-620) => TIS-620
(4-A) change structure of attachment
    {multipart/mixed ... {multipart/alternative {text/plain;charset=...} } }
 -> {multipart/mixed ... {text/plain;charset=...} }
    ==> windows-874
(4-B) change structure of attachment
{multipart/mixed ... {text/.;charset=TIS-620} {text.;charset=windows-1252} } }
    ==> windows-874
Confirming.

The charset of text/plain in the "second multipart/alternative" (== attachment) is used as the charset of the mail body. It should be the charset in the first multipart/alternative (== mail body in the multipart/mixed mail).

If charset=us-ascii can be removed via an AVGMAIL setting, that is a possible workaround for your case. If the structure of the part added by AVGMAIL can be changed to a simple text/plain part, that is also a workaround.
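
The difference between the expected and the observed charset can be reproduced with a small Python sketch using the standard `email` package. The boundaries are shortened, the body texts are placeholders rather than the contents of the attached .eml files, and the helper `text_part_charsets` is a hypothetical name. This only models the observation in this bug; it is not Tb's MIME code.

```python
import email

# Skeleton of the message structure quoted above (placeholder bodies).
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="outer"

--outer
Content-Type: multipart/alternative; boundary="body"

--body
Content-Type: text/plain; charset="windows-874"

(placeholder plain-text body)
--body
Content-Type: text/html; charset="windows-874"

<p>(placeholder HTML body)</p>
--body--

--outer
Content-Type: image/jpeg

--outer
Content-Type: multipart/alternative; boundary="avg"

--avg
Content-Type: text/plain; x-avg=cert; charset=us-ascii

(placeholder AVG footer)
--avg--

--outer--
"""

def text_part_charsets(msg):
    """Return the declared charset of every text/* part, in document order."""
    return [part.get_content_charset()
            for part in msg.walk()
            if part.get_content_maintype() == "text"]

msg = email.message_from_string(RAW)
charsets = text_part_charsets(msg)

expected_body_charset = charsets[0]   # first multipart/alternative == mail body
observed_charset = charsets[-1]       # charset of the part added by AVG
print(charsets, expected_body_charset, observed_charset)
```

Taking the first text part as "the body" is a simplification that happens to fit this message; a real client has to choose among the alternatives by content type as well.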
Status: UNCONFIRMED → NEW
Component: General → MIME
Ever confirmed: true
Product: Thunderbird → MailNews Core
QA Contact: general → mime
Version: unspecified → Trunk
(Reporter)

Comment 11

8 years ago
So Tb displays each part of the message separately and treats the last encoding as the most preferable encoding. Is that correct?

Most non-English speakers all over the world should have their native languages mixed with English in their messages. Their native encodings should be able to display US-ASCII or ISO-8859-1. Is it possible to make US-ASCII and ISO-8859-1 less preferable if another encoding was found previously in the message?

Or to make it more controllable, can you provide an option to set encoding priority? If I set TIS-620 as #1, Windows-874 as #2, ISO-8859-1 as #3, US-ASCII as #4, and a message has one or more of these encodings, let Tb choose the one with the highest priority.

One strange thing is why AVG adds a footer to my incoming messages in my corporate inbox but not in my Gmail inbox. AVG email scanning is turned on without specifying mail domains to scan. Why does AVG choose my corporate inbox but not Gmail? Does Tb have anything to do with this?
(In reply to comment #11)
> So Tb displays each part of the message separately

Yes. That is why each part, with its own charset, is shown as expected.

> and treats the last encoding as the most preferable encoding.

No. The problem is in Tb's handling of multiple multipart/alternative parts.
As I wrote, if AVGMAIL attached a plain text/plain;charset=us-ascii part to the multipart/mixed mail instead of a "text/plain;charset=us-ascii in a multipart/alternative part", no problem would occur with Tb.

> Is that correct?

Needless to say, it is Tb's bug.
However, the cause is the bad " 'only text/plain;charset=us-ascii part in multipart/alternative' in multipart/mixed " which is generated by AVGMAIL. Syntactically it's right, but semantically it's wrong, because of multipart/alternative. Because the data is text/plain only, the text/plain;charset=us-ascii part shouldn't be encapsulated in multipart/alternative. See the dependency tree of meta bug 505172 for other multipart/alternative related issues.

> One strange thing is why AVG adds a footer to my incoming messages in my
> corporate inbox but not in my Gmail inbox. AVG email scanning is turned on without
> specifying mail domains to scan. Why does AVG choose my corporate inbox but not
> Gmail? Does Tb have anything to do with this?

I don't know, and Tb is irrelevant to it. It depends on the AVGMAIL setup on your PC or the AVGMAIL setup on your mail server.
If AVGMAIL runs on your PC rather than on the server, SSL (Gmail) versus non-SSL (your corporate server) may be the reason. AFAIK, "mail scan of SSL connection" is currently supported only by avast! 5 (among free AV software).
Blocks: 505172
(Reporter)

Comment 13

8 years ago
Thanks for your help. 
Now I can remove virus scan certification footer of AVG (which always uses us-ascii encoding).

It might be useful for other users, so I list how to disable AVG footer here:
- Open AVG interface (version 9 on the client)
- Select 'Advanced Settings' from the 'Tools' menu
- Locate 'E-mail Scanner' on the left pane
- On the right pane, uncheck 'Certify e-mail'
- Click 'OK'
Summary: Incoming messages with TIS-620 (Thai) encoding are readable but the selected encoding is incorrect (Western = ISO-8859-1) → Incoming messages with windows-874 are readable but the selected encoding is incorrect(ISO-8859-1) (AVG inserts 2nd multipart/alternative with text/plain;charset=us-ascii only in it to multipart/mixed mail, then Tb uses ISO-8859-1 as mail body charset)
(In reply to comment #13)
> Now I can remove virus scan certification footer of AVG (which always uses
> us-ascii encoding).

And it always encapsulates " only 'text/plain;charset=us-ascii' in multipart/alternative " too.
A Google search for "AVG multipart alternative" turns up the following bug in older AVG versions.
  Old AVG generated the following structure.
  multipart/mixed;boundary="boundary-of-original-mail"
    --boundary-of-original-mail
    part-1 of original mail
    --boundary-of-original-mail
    part-2 of original mail
    --boundary-of-original-mail
    part-3 of original mail
   (Following is added by AVG, then mail structure is corrupted)
    --boundary-of-original-mail
    multipart/alternative;boundary="boundary-of-original-mail" <=same boundary!
    --boundary-of-original-mail                                <=same boundary!
    text/plain; x-avg=cert; charset=us-ascii
  --boundary-of-original-mail--
Sigh...
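
A crude way to spot this kind of corruption is to look for a boundary string that is declared more than once in the raw message. The following Python sketch does only that; the function name `reused_boundaries` and the sample text are hypothetical, and the regular expression is a simple scan that could also match `boundary=` inside body text.

```python
import re
from collections import Counter

# Matches boundary declarations such as boundary="abc" or boundary=abc.
BOUNDARY_RE = re.compile(r'boundary="?([^";\r\n]+)"?', re.IGNORECASE)

def reused_boundaries(raw_message):
    """Return boundary strings that are declared more than once."""
    counts = Counter(BOUNDARY_RE.findall(raw_message))
    return sorted(b for b, n in counts.items() if n > 1)

# Shortened version of the corrupted structure shown above.
corrupted = """\
Content-Type: multipart/mixed; boundary="boundary-of-original-mail"

--boundary-of-original-mail
Content-Type: text/plain

part-1 of original mail
--boundary-of-original-mail
Content-Type: multipart/alternative; boundary="boundary-of-original-mail"

--boundary-of-original-mail
Content-Type: text/plain; x-avg=cert; charset=us-ascii

--boundary-of-original-mail--
"""

print(reused_boundaries(corrupted))
```

A nested multipart that reuses its parent's boundary cannot be delimited unambiguously, which is why the structure above is described as corrupted.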
Whiteboard: [See comment #13 for workaround, disable 'Certify e-mail' of AVG]
"Garbled mail data in the composition window, caused by the wrong charset produced by this bug" is a new problem as of Tb 3.1. Please see bug 597369 comment #5.
Preecha, this might have been fixed by bug 351224. Could you take a few minutes to download the latest nightly (http://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-comm-central/), back up your profile, test, and let us know whether this is fixed?
Duplicate of this bug: 604628
Depends on: 597369
Duplicate of this bug: 633217

Comment 19

7 years ago
This issue is now getting worse. Previously, it was merely text. Now no pictures are carried forward...

When is this going to be fixed? I have been patient, but my patience is running very short. I need a fix now or I will explore products other than Mozilla's.

mark.

Comment 20

7 years ago
Please respond ASAP
Depends on: 716983
Removing myself from all the bugs I'm CC'd on. Please NI me if you need something from me on MailNews Core bugs.

Updated

3 years ago
Blocks: 618465
Duplicate of this bug: 618465
The phenomenon on Forward is the same as bug 701818 (and bug 716983), and these two bugs were closed as dups of bug 715823.
The phenomenon on Reply/Edit As New is bug 1026989.
Changing dependencies for ease of tracking.
Depends on: 715823, 1026989
No longer depends on: 716983
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1026989
No longer depends on: 1026989
The root cause of "charset is picked up from last attachment" was finally resolved by bug 1026989.

Some valuable bug reports from users were hidden/lost/ignored by "duping to bug 716983, then duping bug 716983 to another bug".
Unfortunately, or fortunately, only the "garbled display in Forward Inline case" was resolved, by an unknown bug, while the other bug was being processed, so nothing was resolved by the other bug itself.

Duping this bug to bug 1026989 for ease of tracking, searching, and understanding the current status of the problem.
No longer blocks: 618465
FYI.
The patch of bug 1026989, a one-line patch, was originally proposed in the other bug to which bug 716983 (and bug 701818 too) had been duped. The patch was dug out of that bug by bug 1026989, and the long-lived "charset is picked up from last attachment" issue has been resolved by bug 1026989.

Why should such "digging out a hidden important patch from another bug" be needed in B.M.O?