Closed Bug 391010 Opened 17 years ago Closed 14 years ago

Thunderbird does not recognize UTF-16 text file attachments with BOM.

Categories

(Thunderbird :: Mail Window Front End, defect)

x86
Windows XP
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: V.Haisman, Unassigned)

Details

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
Build Identifier: Thunderbird version 2.0.0.6 (20070728)

Hi, the message pane does not display UTF-16 text attachments with a BOM properly. It looks like it interprets the text as ASCII/Latin-1, or maybe UTF-8, and thus shows only the bytes up to the first NUL.

Reproducible: Always

Steps to Reproduce:
1. Send yourself an email with a UTF-16 text file with BOM attached.
2. Open the email.
3. See for yourself :)
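The truncation described in the report can be reproduced outside Thunderbird. The following is a minimal Python sketch, assuming (the report does not confirm this) that the display path decodes the bytes with a single-byte charset and treats NUL as a string terminator. The sample text is made up.

```python
import codecs

# UTF-16 little-endian text preceded by a BOM (sample text is hypothetical).
data = codecs.BOM_UTF16_LE + "Hi there".encode("utf-16-le")

# Decoding as Latin-1 never raises, but every second byte becomes NUL.
as_latin1 = data.decode("latin-1")

# Assumption: the viewer stops at the first NUL, like a C string.
visible = as_latin1.split("\x00", 1)[0]

# Decoding with the right charset consumes the BOM and recovers the text.
correct = data.decode("utf-16")
```

Under that assumption only the two BOM bytes and the first letter survive, which matches the "bytes up to first NUL" symptom.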
WORKSFORME for display when the mail header specifies the charset appropriately. (Thunderbird trunk 2007/8/01 build, MS Win-XP)

(1) Composing
There is currently no way to specify a charset for an attachment when composing, so the following headers are generated. Because of the BOM, base64 is used automatically.
(first part)
> Content-Type: text/plain; charset=ISO-2022-JP
> Content-Transfer-Encoding: 7bit
(second part)
> Content-Type: text/plain; name="UTF-16-LE-BOM.TXT"
> Content-Transfer-Encoding: base64
> Content-Disposition: inline; filename="UTF-16-LE-BOM.TXT"

(2) Display
The default is charset=us-ascii, but Tb probably uses the charset of the mail body when it displays a text attachment that has no charset parameter. However, if the charset is properly specified, Thunderbird trunk correctly displays a text attachment with BOM inline.
> Content-Type: text/plain; charset=UTF-16; name="UTF-16-LE-BOM.TXT"

Workaround currently possible:
(1) Delete all mails in Unsent of "Local Folder", and compact it.
(2) Compose a mail, and Send Later.
(3) If the Unsent folder is open, click another mail folder (this closes the Unsent file).
(4) Edit the Unsent file (not Unsent.msf) with a text editor, and add charset=UTF-16; to the Content-Type: header.
(5) Click (open) the Unsent folder, and run Properties/Rebuild Index.
(6) Check the mail, and send it immediately.
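For illustration, a part carrying the charset parameter shown above can be built with Python's standard email package. This is only a sketch of what the hand-edited header amounts to, not what Thunderbird's composer does. The sample text is made up; the file name is the one from the headers above.

```python
import codecs
from email import encoders
from email.mime.base import MIMEBase

text = "Hello, UTF-16"  # hypothetical attachment content
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Declare the charset explicitly on the attachment's Content-Type.
part = MIMEBase("text", "plain", charset="UTF-16", name="UTF-16-LE-BOM.TXT")
part.set_payload(raw)
encoders.encode_base64(part)  # also sets Content-Transfer-Encoding: base64
part.add_header("Content-Disposition", "inline", filename="UTF-16-LE-BOM.TXT")

# A receiver that honours the declared charset recovers the text.
decoded = part.get_payload(decode=True).decode(part.get_content_charset())
```

With the charset declared, the receiver does not have to guess, which is the situation comment #1 reports as working.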
Reporter, what do you think about comment #1?
Whiteboard: closeme 2009-10-09
(In reply to comment #2)
> Reporter, what do you think about comment #1?

I think that it works as a workaround if there is no other way. But it is not something I would consider usable for everyday use.
Whiteboard: closeme 2009-10-09
A few general/practical workarounds for this kind of issue:
(i) Attach the file as a ZIP archive.
(ii) Change the file extension (e.g. UTF16-TXT) and attach it (it is probably sent as application/octet-stream by Tb).
In both cases, the recipient can view the file after saving it to local disk.
(iii) "Display Attachments Inline" + View/Character Encoding/UTF-16 by the recipient. (I don't know whether this kind of operation is available in other mailers.)

For text type files, some content checking is already done. For example, Tb checks whether CR/LF/CRLF line endings are mixed and, if they are, sends the file in base64 for safe mail sending. I think a BOM check should also be done for text type file attachments.
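The BOM check suggested above could look roughly like the following Python sketch. The function name and the mapping to charset labels are assumptions made for illustration; this is not Thunderbird code.

```python
import codecs

# Longest BOMs first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
_BOMS = (
    (codecs.BOM_UTF32_LE, "UTF-32"),
    (codecs.BOM_UTF32_BE, "UTF-32"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16"),
    (codecs.BOM_UTF16_BE, "UTF-16"),
)


def charset_from_bom(data: bytes):
    """Return a charset label if data starts with a known BOM, else None."""
    for bom, label in _BOMS:
        if data.startswith(bom):
            return label
    return None
```

A composer could call such a helper on a text attachment and, when it returns a label, add the matching charset parameter to the Content-Type header instead of leaving it out.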
I think Thunderbird doesn't try very **** figuring out the right encoding of an attachment. I get e-mails with attached UTF-8 .tex files. Thunderbird displays none of them correctly.

Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101208 Thunderbird/3.1.7
$ locale
LANG=de_DE.utf8

I iterated through the following cases.

No.1: Receiving the mail as:
--------------020407000905060307070100
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 7bit
--------------020407000905060307070100
Content-Type: text/x-tex; name="woche11.tex"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="woche11.tex"
Result: Special characters are displayed as ÃŒ for ü, and the like.

No.2: Receiving the mail as:
--------------040801020701010601080201
Content-Type: text/plain; charset="ISO-8859-15"; format=flowed
Content-Transfer-Encoding: 7bit
--------------040801020701010601080201
Content-Type: text/x-tex; name="woche9.tex"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="woche9.tex"

No.3: Switching Thunderbird's default encoding for received e-mails back and forth between ISO-8859-15 and UTF-8 (and restarting Thunderbird).
Result: This doesn't matter, neither for the UTF-8 mail nor for the Latin-9 (ISO-8859-15) one.

Workaround: Click "View" -> "Encoding" -> "UTF-8" in the menu. Now the attachment text is displayed correctly in the message window. (This is forgotten the next time you open the e-mail.)

Conclusion: Thunderbird uses neither its default encoding, nor the system's locale, nor the e-mail's encoding for displaying the attachment in the message window.

Greetings
Stefan
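The garbled output reported for No.1 is what UTF-8 bytes look like when they are decoded as ISO-8859-15. A short Python sketch of that effect (it only shows the decoding mismatch and says nothing about where Thunderbird takes the charset from):

```python
# "ü" is two bytes in UTF-8: 0xC3 0xBC.
raw = "ü".encode("utf-8")

# Decoded as ISO-8859-15, each byte becomes its own character.
wrong = raw.decode("iso-8859-15")

# Decoded as UTF-8, the original character comes back.
right = raw.decode("utf-8")

print(repr(wrong), repr(right))
```

Here `wrong` is the two-character sequence seen in the message pane, because 0xBC maps to Œ in ISO-8859-15.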
What Thunderbird really uses is the folder-specific default encoding. Setting this to "UTF-8" in the folder's options displays the attachments correctly. This makes sense. So I guess I just misjudged how Thunderbird tries to guess the encoding of an attachment. So basically there is no bug here for me, but personally I'd use the e-mail's encoding for displaying an attachment with unknown encoding.

Greetings
Stefan
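The two fallback orders discussed here can be compared in a small Python sketch using the standard email package. The function `display_charset`, its parameters, and the message body text are hypothetical; the headers are modelled on case No.1 above, and the "folder default" branch reflects the behaviour described in this comment, not Thunderbird's actual code.

```python
from email import message_from_string

RAW = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="b1"

--b1
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 7bit

See attachment.
--b1
Content-Type: text/x-tex; name="woche11.tex"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="woche11.tex"

Uebung 11
--b1--
"""


def display_charset(part, body_charset, folder_default, prefer_body=False):
    """Pick a charset for a part: declared first, then a fallback."""
    declared = part.get_content_charset()
    if declared:
        return declared
    if prefer_body and body_charset:
        return body_charset  # the order the commenter would prefer
    return folder_default    # the order described in this comment


msg = message_from_string(RAW)
body, attachment = msg.get_payload()
body_charset = body.get_content_charset()

as_described = display_charset(attachment, body_charset, "iso-8859-15")
as_preferred = display_charset(attachment, body_charset, "iso-8859-15",
                               prefer_body=True)
```

With a folder default of ISO-8859-15, the first call returns the folder default and the second returns the body's UTF-8, which is the difference between the reported display and the one the commenter expected.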
I think this can be closed?
(In reply to comment #6)
> What Thunderbird really uses is the folder specific default encoding.
> Setting this to "UTF-8" in the folder's options displays the attachments
> correct.

Invalid, because this isn't a bug (see comment #6).
Status: UNCONFIRMED → RESOLVED
Closed: 14 years ago
Resolution: --- → INVALID