Closed Bug 537114 Opened 14 years ago Closed 14 years ago

Attached message's encoding not preserved

Categories

(Thunderbird :: Message Compose Window, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 532779

People

(Reporter: mi+mozilla, Unassigned)

Details

When another e-mail is attached to the one being composed, its charset is not preserved in the attachment's headers. For example, here are the headers of the attached part; note that the charset=... parameter is missing after text/html;

--------------090102070209030401040507
Content-Type: text/html;
 name="=?UTF-8?B?0KfQsNGB0YLQuNC90LAg0LLQutC70LDQtNC10L3QvtCz0L4g0L/QvtCy0ZbQtNC+?=
 =?UTF-8?B?0LzQu9C10L3QvdGP?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename*0*=UTF-8''%D0%A7%D0%B0%D1%81%D1%82%D0%B8%D0%BD%D0%B0%20%D0%B2%D0;
 filename*1*=%BA%D0%BB%D0%B0%D0%B4%D0%B5%D0%BD%D0%BE%D0%B3%D0%BE%20%D0%BF;
 filename*2*=%D0%BE%D0%B2%D1%96%D0%B4%D0%BE%D0%BC%D0%BB%D0%B5%D0%BD%D0%BD;
 filename*3*=%D1%8F

When the resulting message is later viewed by the recipient (or even by the sender in the Sent folder), the attachment is displayed in the charset of the main message, which can be quite different from the attachment's own.
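A minimal sketch for checking the part headers quoted above, using Python's standard `email` package rather than anything from Thunderbird. The helper name `describe_part` is made up for illustration. It shows that the part carries no charset parameter, while the RFC 2231 `filename*N*` continuations still decode fine:

```python
from email import message_from_string

# Headers of the attachment part, copied from the report above.
PART_HEADERS = (
    "Content-Type: text/html;\n"
    ' name="=?UTF-8?B?0KfQsNGB0YLQuNC90LAg0LLQutC70LDQtNC10L3QvtCz0L4g0L/QvtCy0ZbQtNC+?=\n'
    ' =?UTF-8?B?0LzQu9C10L3QvdGP?="\n'
    "Content-Transfer-Encoding: base64\n"
    "Content-Disposition: attachment;\n"
    " filename*0*=UTF-8''%D0%A7%D0%B0%D1%81%D1%82%D0%B8%D0%BD%D0%B0%20%D0%B2%D0;\n"
    " filename*1*=%BA%D0%BB%D0%B0%D0%B4%D0%B5%D0%BD%D0%BE%D0%B3%D0%BE%20%D0%BF;\n"
    " filename*2*=%D0%BE%D0%B2%D1%96%D0%B4%D0%BE%D0%BC%D0%BB%D0%B5%D0%BD%D0%BD;\n"
    " filename*3*=%D1%8F\n"
    "\n"
)


def describe_part(raw):
    """Hypothetical helper: return (content type, charset, filename) of a part."""
    part = message_from_string(raw)
    # get_content_charset() returns None when Content-Type has no charset=...
    # get_filename() reassembles and decodes the filename*N* continuations.
    return part.get_content_type(), part.get_content_charset(), part.get_filename()


content_type, charset, filename = describe_part(PART_HEADERS)
print(content_type)  # text/html
print(charset)       # None
print(filename)
```

With no charset on the part, a viewer has nothing to go on except a guess or the charset of the enclosing message, which matches the behavior described above.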
> bug summary : Attached message's encoding not preserved

"Attached text file" instead of "Attached message", isn't it?

AFAIK, there is no way to know the charset actually used for an HTML file (at least on Win/Linux; Mac OS may have metadata for the charset). Software can only do the following:
 - guess the charset, like View/Character encoding/Auto-detect
 - if the HTML file is guaranteed to be written in Unicode,
   use the BOM to determine the proper value of the charset parameter.
How can Tb know the charset actually used for the HTML file?
Please note that many mailers (including Tb) use <meta http-equiv="Content-Type" content="text/html; charset=xxx"> in the HTML source when rendering the HTML data.
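The two heuristics mentioned above (BOM check, then the META declaration) can be sketched roughly as follows. This is an illustration in Python, not what Tb actually does; the function name `sniff_charset` and the 1024-byte scan window are assumptions:

```python
import codecs
import re

# UTF-32 LE must be tested before UTF-16 LE: both start with FF FE.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Matches charset=... inside a <meta ...> tag, quoted or not.
_META_CHARSET = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def sniff_charset(data: bytes):
    """Return a charset name from a BOM or META tag, or None if neither is found."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _META_CHARSET.search(data[:1024])  # scan window is an arbitrary choice
    if match:
        return match.group(1).decode("ascii").lower()
    return None  # caller has to fall back to guessing


print(sniff_charset(codecs.BOM_UTF8 + b"<html></html>"))
print(sniff_charset(
    b'<meta http-equiv="Content-Type" content="text/html; charset=KOI8-R">'
))
print(sniff_charset(b"<html><body>no hints</body></html>"))
```

When both checks fail, as with the HTML attachment in this report, only statistical guessing is left.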

AFAIK, there is no way to specify a user-requested charset for an attached file.
Are you requesting "user requested charset for attached file"?
If so, AFAIR, that enhancement has already been requested.

If it is an attached mail (message/rfc822 part), the charset of the attached mail is written in the attached mail's own header data, unless the attached mail is malformed. You say "Attached message" in bug summary, but your example is text/html part and Version:3.0.
Are you talking about problem of Bug 532779?
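To illustrate the message/rfc822 point: when a mail is attached as message/rfc822, its own Content-Type header travels with it, so its charset survives regardless of the outer message's charset. A small sketch with Python's `email` package (the subjects and bodies are made-up sample data, not taken from the report):

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# The mail to be attached: UTF-8 (sample text).
inner = EmailMessage()
inner["Subject"] = "inner"
inner.set_content("Привіт", charset="utf-8")

# The mail being composed: KOI8-R (sample text).
outer = EmailMessage()
outer["Subject"] = "outer"
outer.set_content("Привет", charset="koi8-r")
# Attaching a message object produces a message/rfc822 part.
outer.add_attachment(inner)

# Round-trip through the wire format, then inspect the result.
parsed = message_from_bytes(outer.as_bytes(), policy=policy.default)
body = parsed.get_body(preferencelist=("plain",))
attached = next(parsed.iter_attachments())
inner_again = attached.get_content()  # the embedded message object

print(body.get_content_charset())
print(attached.get_content_type())
print(inner_again.get_content_charset())
```

The outer body reports koi8-r and the embedded message still reports utf-8, because each one is described by its own headers.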
(In reply to comment #1)
> > bug summary : Attached message's encoding not preserved
> 
> "Attached text file" instead of "Attached message", isn't it?

I produced the attachment by dragging an e-mail message into the attachment area of the newly composed one... The one I was composing was in KOI8-R. The attached one was in UTF-8. Here is the entire MIME structure, with boundaries:

.... e-mail headers ........
Content-Type: multipart/mixed;
 boundary="------------090102070209030401040507"

This is a multi-part message in MIME format.
--------------090102070209030401040507
Content-Type: multipart/alternative;
 boundary="------------020707090401060709090400"

--------------020707090401060709090400
Content-Type: text/plain; charset=KOI8-R; format=flowed
Content-Transfer-Encoding: 8bit

....... text version of the new e-mail ............

--------------020707090401060709090400
Content-Type: text/html; charset=KOI8-R
Content-Transfer-Encoding: 8bit

........ HTML version of the new e-mail ...........

--------------020707090401060709090400--

--------------090102070209030401040507
Content-Type: text/html;
 name="=?UTF-8?B?0KfQsNGB0YLQuNC90LAg0LLQutC70LDQtNC10L3QvtCz0L4g0L/QvtCy0ZbQtNC+?=
 =?UTF-8?B?0LzQu9C10L3QvdGP?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename*0*=UTF-8''%D0%A7%D0%B0%D1%81%D1%82%D0%B8%D0%BD%D0%B0%20%D0%B2%D0;
 filename*1*=%BA%D0%BB%D0%B0%D0%B4%D0%B5%D0%BD%D0%BE%D0%B3%D0%BE%20%D0%BF;
 filename*2*=%D0%BE%D0%B2%D1%96%D0%B4%D0%BE%D0%BC%D0%BB%D0%B5%D0%BD%D0%BD;
 filename*3*=%D1%8F

........ The attached e-mail message ..............

--------------090102070209030401040507--

> You say "Attached message" in bug summary, but your example is text/html
> part and Version:3.0.

Now that I saved the attachment into a file, I do see, that it is simply an HTML-rendering of my earlier e-mail (without the charset specified even in META), rather than message/rfc822.

> Are you talking about problem of Bug 532779?

Yes, so it seems... Except I am seeing it on RedHat, rather than Windows, and I have an additional complaint: the attached HTML has no charset information at all... If, indeed, rendering a message into HTML is good for anything (it should not be used in this case, but somebody did implement it for SOMETHING), the resulting HTML ought to contain the correct charset inside its META-tag...
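As a rough illustration of the last point (an HTML rendering that declares its own charset), here is a Python sketch. It is not how Thunderbird generates the HTML; the helper name `ensure_meta_charset` is hypothetical:

```python
import re

_HEAD_OPEN = re.compile(r"<head(\s[^>]*)?>", re.IGNORECASE)
_HAS_CHARSET = re.compile(r"<meta[^>]+charset\s*=", re.IGNORECASE)


def ensure_meta_charset(html: str, charset: str) -> str:
    """Return html with a META charset declaration, adding one if absent."""
    if _HAS_CHARSET.search(html):
        return html  # already declared; leave it alone
    meta = f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">'
    head = _HEAD_OPEN.search(html)
    if head:
        # Insert right after the opening <head> tag.
        return html[:head.end()] + meta + html[head.end():]
    # No <head> at all: prepend the declaration as a fallback.
    return meta + html


page = "<html><head><title>t</title></head><body>x</body></html>"
print(ensure_meta_charset(page, "UTF-8"))
```

A declaration like this lets the attachment be displayed correctly even when the MIME part itself carries no charset parameter.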
(In reply to comment #2)
> (In reply to comment #1)
> 
> Now that I saved the attachment into a file, I do see, that it is simply an
> HTML-rendering of my earlier e-mail (without the charset specified even in
> META), rather than message/rfc822.
> 
> > Are you talking about problem of Bug 532779?
> 
> Yes, so it seems... Except I am seeing it on RedHat, rather than Windows, and
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE