Open Bug 247483 Opened 20 years ago Updated 2 years ago

save message as text: ISO high chars converted to UTF-8, saved as MacRoman

Categories

(Thunderbird :: Message Reader UI, defect)

PowerPC
macOS
defect

Tracking

(Not tracked)

People

(Reporter: katjana, Unassigned)

Details

User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7) Gecko/20040517 Camino/0.8b
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7) Gecko/20040517 Camino/0.8b

'Save message' called by cmd+s or from menu bar.
Dialogue opens; ok
Default extension: eml
Tab to next field: no (inert)
Mouse to next field and pick html or text: 
  Name field does not change extension (retains eml)

Result: saved file >
as html: ok
as text: national characters are not rendered correctly. (NB: 8859-1 encoding
works well for Norwegian everywhere else in Thunderbird.)

Reproducible: Always
Steps to Reproduce:
1. 'Save message' called by cmd+s or from menu bar.
2.
3.
Default theme (from installation)
No extensions installed except dictionaries (Norwegian, spell-nb.xpi) Previous
version of programme had been deleted (also profile)
Actual Results:  
Cf "details"


Expected Results:  
1) tab shifts focus between fields in dialog.
2) default extension changes to htm or txt - depending on choice made by user
3) text encoding should be commensurate with the Apple standard (OSX), i.e. the
font and keyboard layout picked by the Apple user in system preferences OR by
the user in Thunderbird preferences.
thunderbird 0.7.1

The bug is still there. 
i.e.
symptoms: 1) tab does not switch between fields in dialog "save message as"
and 2) encoding is not right for text (txt) "all files" and "eml"
(In reply to comment #0)
> Expected Results:  
> 1) tab shifts focus between fields in dialog.

This WorksForMe in Win2K, may be a Mac-only problem.

> 2) default extension changes to htm or txt - depending on choice made by user

Do you in fact get   MailSubject.eml.htm   or   MailSubject.eml.txt  ?  
That's bug 238271.

> 3) text encoding should be commensurate with Apple standard (OSX) (font and
> keyboard layout picked by Apple user in system preferences OR by user in
> thunderbird preferences.

Can you attach a broken .txt file so we can see exactly what is wrong with it?
2004, 2 Sept. 
Yes, when saving to html /text I get MailSubject.eml.htm /MailSubject.eml.txt
---------------
Files saved as 'mail' or 'all files' are identical: 
[from/date/to etc]
...
MIME-version: 1.0
Content-type: text/plain; format=flowed; charset=ISO-8859-1
Content-transfer-encoding: quoted-printable
X-Accept-Language: en-us, en
User-Agent: Mozilla Thunderbird 0.7.3 (Macintosh/20040803)
X-T2-Posting-ID: ajCqVMelykQIHdXQx8hqiw==
Original-recipient: rfc822;katjana@mac.com
[Message text:]
The default char. encoding should be iso-8859-1
For instance characters =E6 =F8 =E5
-----------
Mail saved as text looks like this: 
Subject: saving to text
[from/date/to]
The default char. encoding should be iso-8859-1
For instance characters æ ø å
----------
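
For reference, the two decoding steps implied by the headers above (quoted-printable transfer encoding, then charset ISO-8859-1) can be sketched in Python. This is only an illustration of what the stored bytes mean, not Thunderbird's code; the variable names are made up for the example.

```python
import quopri

# Body line as stored in the .eml file (quoted-printable, charset ISO-8859-1).
raw_line = b"For instance characters =E6 =F8 =E5"

# Step 1: undo the quoted-printable transfer encoding to recover the raw bytes.
decoded_bytes = quopri.decodestring(raw_line)

# Step 2: interpret those bytes with the charset named in the Content-type header.
text = decoded_bytes.decode("iso-8859-1")

print(decoded_bytes[-5:])  # the three high bytes E6, F8, E5 separated by spaces
print(text)
```

The last line prints the readable form of the message body, with the three Norwegian letters restored.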
Thunderbird 1.0, OSX 10.3.7

1) tab key works in the Save dialog, identically to other Mac apps. Note that
OSX tab order does not match the visual layout, and the default blue highlight
is hard to see on some blue buttons.

2) file extension doubling is bug 238271.

3) My Tbird does not have any text/plain mail with non-ascii characters, so I
was unable to test this.

Therefore, rewriting Summary to focus on 3.
Keywords: qawanted
Summary: save message as: 1) inert tab-key 2) save as text/html: extention remains 3) save as text: non-national encoding → save message as: text: non-national encoding fails
Thanks to Mike's testcase suggestion, I can confirm this bug and his theory. Mac
Thunderbird converts ISO-8859-1 (possibly others) into UTF-8, but does not set
the text encoding on the file, so they are translated as pairs of MacRoman
characters:

 æ ø å  (saved as C3 A6, C3 B8, C3 A5)
C3 = square root (√)
A6 = paragraph mark (¶)
B8 = PI (∏)
A5 = bullet (•)

OS X handles Unicode nicely, but if you don't specify, it assumes MacRoman.
Compare to the similar Firefox bug 271454. I am sure that 
http://developer.apple.com will explain the solution, which I leave as an
exercise for the reader.

From Mike:

"For instance characters æ ø å 
In ISO-8859-1, the character codes are:   E6  F8  E5
In UTF-8, they are:    C3 A6   C3 B8   C3 A5

"What I suspect is happening is: TB saves-as-text and converts to UTF-8 
(why, I don't know); then when the file is loaded into a Mac editor, it 
decodes per the Mac's default (and nonstandard) encoding; so "C3" is the 
character which, in Unicode, has a code of 8730 -- the Unicode character 
for the Square Root symbol -- and the codes A6, B8, and A5 similarly are 
encoded to other characters that are completely different from the 
intended characters.  (8719, btw, is the Unicode character for the 
mathematical Product symbol, an upper-case pi.)"
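
The misreading described above can be reproduced with a short Python sketch. It assumes Python's built-in `mac_roman` codec is a fair stand-in for what a MacRoman-assuming editor does; it says nothing about how Thunderbird itself writes the file.

```python
original = "æ ø å"

# What the thread says ends up on disk: the text re-encoded as UTF-8.
utf8_bytes = original.encode("utf-8")

# What a MacRoman-assuming editor displays: every byte read as one character.
misread = utf8_bytes.decode("mac_roman")
print(misread)

# Decimal code points of the substituted characters, as cited in the quote.
for ch in misread.replace(" ", ""):
    print(f"U+{ord(ch):04X} ({ord(ch)})")

# The damage is reversible as long as the bytes were not altered on the way.
recovered = misread.encode("mac_roman").decode("utf-8")
```

Because the mapping is one byte to one character, encoding the garbled text back to MacRoman and decoding it as UTF-8 restores the original letters.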
Status: UNCONFIRMED → NEW
Component: Mail Window Front End → General
Ever confirmed: true
Keywords: qawanted
Summary: save message as: text: non-national encoding fails → save message as text: ISO high chars converted to UTF-8, saved as MacRoman
QA Contact: general
Assignee: mscott → nobody
Frankie, still see this?
Component: General → Message Reader UI
QA Contact: general → message-reader
Whiteboard: closeme 2009-06-25
Yes. According to at least one person, this bug is actually in Core -> Serializers, which would mean it is a dupe of bug 271454 (or vice versa).
Whiteboard: closeme 2009-06-25
The behaviour here is actually the opposite of bug 271454. As bug 271454 comment 3 says, saving an HTML file as text from the browser preserves the original encoding, unlike saving from TB which converts to UTF-8.

The saved files from TB are recognized as UTF-8 by Aquamacs and OpenOffice, FWIW, and also by TextEdit if one sets the "Plain text file encoding | Opening files" preference to UTF-8 instead of the default "Automatic".
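
One common way an editor can recognize such a file is to try a strict UTF-8 decode first and fall back to a legacy encoding only if that fails. The sketch below shows that heuristic; `guess_decode` is a hypothetical helper, and this is not a claim about how Aquamacs, OpenOffice, or TextEdit actually detect encodings.

```python
from typing import Tuple


def guess_decode(data: bytes, fallback: str = "mac_roman") -> Tuple[str, str]:
    """Return (text, encoding_used), preferring strict UTF-8."""
    try:
        # Strict decoding rejects byte sequences that are not well-formed UTF-8.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Single-byte legacy encodings accept any byte, so this cannot fail.
        return data.decode(fallback), fallback


# UTF-8 bytes like the ones in the saved .txt file are recognized as UTF-8.
utf8_result = guess_decode("æ ø å".encode("utf-8"))

# Raw ISO-8859-1 bytes are not valid UTF-8, so the fallback is used instead.
legacy_result = guess_decode(b"\xe6 \xf8 \xe5")
```

The heuristic works because text with high characters in a single-byte encoding is very rarely well-formed UTF-8 by accident.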

IMHO this is TextEdit's bug, not ours. In theory we could work around by prepending a BOM to the saved file, but in general it's my belief that UTF-8 BOMs cause more problems than they solve.
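
To make the BOM trade-off concrete, here is a small Python sketch of what prepending a UTF-8 BOM would mean at the byte level. It uses Python's `utf-8-sig` codec purely for illustration and is not a proposed patch.

```python
import codecs

text = "æ ø å"

plain = text.encode("utf-8")         # what is saved today, per this thread
with_bom = text.encode("utf-8-sig")  # same bytes with EF BB BF prepended

# The BOM is just three extra bytes in front of the unchanged payload.
has_bom_prefix = with_bom == codecs.BOM_UTF8 + plain

# A BOM-aware reader strips the marker and returns the original text...
aware = with_bom.decode("utf-8-sig")

# ...while a reader that decodes as plain UTF-8 keeps it as a leading U+FEFF.
unaware = with_bom.decode("utf-8")
```

The stray U+FEFF at the start of `unaware` is the kind of side effect that makes BOMs troublesome for tools that do not expect them.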
No longer depends on: 271454
Severity: normal → S3