content destroyed without warning when accepting option to automatically send as UNICODE (UTF-8)

RESOLVED FIXED in mozilla1.7beta

Status

MailNews Core
Composition
--
critical
RESOLVED FIXED
15 years ago
6 years ago

People

(Reporter: Leif Halvard Silli, Assigned: Jungshik Shin)

Tracking

({intl})

Trunk
mozilla1.7beta

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

8.42 KB, patch
Scott MacGregor
: review+
Bienvenu
: superreview+
Details | Diff | Splinter Review
(Reporter)

Description

15 years ago
User-Agent:       
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7a) Gecko/20040207 Firebird/0.8.0+

Your mail sending preferences are set to 'ISO-8859-1' and you are editing a
message where the encoding inside View->Charset Encoding is set to the same. You
type in some characters that are outside the character set of 'ISO-8859-1'. When
you want to send you do the normal thing, and up pops a warning that tells me
that I have used characters outside the encoding. It also tells that it is
usually OK to send as UNICODE(UTF-8) and that if I simply hit «OK», the message
will be stored or sent in exactly that format. I hit 'OK'. But when the message
reaches the receiver all characters outside ISO-8859-1 have been «converted»
into question marks.

Reproducible: Always
Steps to Reproduce:
1. Compos a letter containg e.g. the russian letter 'Я' and norwegian 'Å'.
   (it should be safe to copye an daste the above line into your mail)
2. Make sure the encoding is set to ISO-8859-1 before you hit 'Send'
3. Hit 'Send' and notice the warning that pops up, which you read.
4. Hit 'OK' to get the letter sent using the promised UTF-8 encoding.


Actual Results:  
When you received the mail the russian letter will be a question mark. I.e.
instead of being sent using UTF-8 --as promised-- the letter was sent using
ISO-8859-1 and the letters outside the scope of 8859-1 were destroyed.
Insidently, if you start with a russian encoding then it is the norwegian letter
that gets destroyed as the mail is sent using the russian character set. Note
that the loss of data is visible both in the stored version in the 'Sent' folder
as wel as in the received version.

Expected Results:  
The sofwtare should have done what it promises to do. If it is not able to do
what it currently promises then it should not give us the option to hit 'OK'.
Because one can of course fix the problem by choosing UTF-8 before hitting 'Send'.

Tested in: released version of Mozilla 1.6 for OSX
and in Mozilla Thunderbird 0.5b (20040204)
(Reporter)

Comment 1

15 years ago
*** Bug 233362 has been marked as a duplicate of this bug. ***

Comment 2

15 years ago
Can confirm this on Win98 with 1.6 and 1.7a (20040206).
Doesn't happen if user_pref("intl.fallbackCharsetList.ISO-8859-1", "UTF-8"); set
so wil automatically fall to UTF-8 if needed.
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: MacOS X → All
Hardware: Macintosh → All

Comment 3

15 years ago
This is regression from of bug 194862.
The alert text was changed and some JS code added. But the main code at
~nsMsgCompose.cpp#1019 still behaves like before: the message is sent without
changing the character coding.

Jungshik, I CC'ed you here because you did the patch.
I can't see when !gMsgCompose.checkCharsetConversion(getCurrentIdentity(),
fallbackCharset) at MsgComposeCommands.js#1651 is true and so your manual set
fallbackCharset.value = "UTF-8" won't be executed at all.

So I think we'll at least need something like 
  m_compFields->SetCharacterSet("UTF-8");
after nsMsgCompose.cpp#1024.
(Assignee)

Comment 4

15 years ago
Works for me on 1.6 with the default fallback of ISO-8859-1 to Windows-1252. 
 
> I can't see when !gMsgCompose.checkCharsetConversion(getCurrentIdentity(),
> fallbackCharset) at MsgComposeCommands.js#1651 is true and so your manual set
> fallbackCharset.value = "UTF-8" won't be executed at all.

 No, that part is not to blame.  That statement doesn't have to be reached.
|fallbackCharset| is set in
|gMsgCompose.checkCharsetConversion(getCurrentIdentity(),fallbackCharset)| if
|gMsgCompose.checkCharsetConversion(get.....)| succeeds. 


Why don't you post characters that turned into question marks in your test?
Please, set View | Character Coding to UTF-8 _before_ posting non-ASCII
characters in bugzilla.



Comment 5

15 years ago
I can reproduce this on WinXP.

Composition encoding was manually set ISO-8859-1, and I pressed OK on all
Unicode warning popups.
          
Msg Subject   Msg Body      UI                           Received msg
---------------------------------------------------------------------
Test          Test          No popup                     ISO-8859-1
Test          native char   Sending Mail... popup        ISO-8859-1
                            & Unicode warning popup
native char   Test          Unicode warning popup        UTF-8
native char   native char   Unicode warning popup        UTF-8

I think row 2 is this bug.
Keywords: intl

Comment 6

15 years ago
Jungshik, the combination of Å and Я is what I used according to Leifs report to
get the alert pop up.

I know that the JS code is not to blame because it's not reached (though I don't
know when it will ever be reached). The code not changed in SendMsg() (around
line 1024) should be the one to blame.
(Reporter)

Comment 7

15 years ago
Allthough cyrillic Я and Norwegian Ø are just an example. Just set the encoding
of the message you send  to something that you know are different from the
characters contained in the message body. (Btw, 'Ø' is contained in both
ISO-8859-1 and in Windows-1252 but not in for instance Koi-8, which you could try.)
(Assignee)

Comment 8

15 years ago
Issac and Christian, thank you for for finding that out. And, thanks, Leif, for
reporting it.

I confirmed the problem in case 2 in comment #2 with characters outside
Windows-1252, let alone ISO-8859-1.  With characters covered by Windows-1252 but
not by ISO-8859-1, it's fine.

I was totally misled by the following sentence to believe that we're checking
the message body as well as the message header.

> The message you composed contains characters not found in the selected 
> Character Coding, so your message may become unreadable after you send or 
> save it.

It turned out that in JS code, we only checked the message header  while the
message body is checked in C++ code (nsMsgCompose.cpp) as Christian wrote. 

I should've paid attention to the following comment :

// Check if the headers of composing mail can be converted to a mail charse

at 
http://lxr.mozilla.org/mozilla/source/mailnews/compose/resources/content/MsgComposeCommands.js#1642
Assignee: sspitzer → jshin
(Assignee)

Comment 9

15 years ago
Created attachment 140866 [details] [diff] [review]
patch

This is effectively a one-liner, but I added comments and changed the variable
name to make clear what's being done.
(Assignee)

Comment 10

15 years ago
Comment on attachment 140866 [details] [diff] [review]
patch

>Index: mailnews/compose/src/nsMsgCompose.cpp
>
>+          m_compFields->SetCharacterSet("UTF-8");

This is not enough. I've gotta convert the content of the body to UTF-8, here.
Somehow, it doesn't work...

Comment 11

15 years ago
Ah, JS only checks the header, yes. I overread that comment too ...

Yeah, the conversion is done (or resp. tried) in nsMsgI18NSaveAsCharset() and
fallbackCharset only contains the destination charset. So setting
m_compFields->SetCharacterSet to UTF-8 allone changes nothing.

What works is to call nsMsgI18NSaveAsCharset() as we now do and do the same
again before setting the charset to UTF-8.

With one modification, replace parameter m_compFields->GetCharacterSet() with a
constant "UTF-8".

@@ -1023,6 +1023,12 @@ NS_IMETHODIMP nsMsgCompose::SendMsg(MSG_
         PR_FREEIF(outCString);
         return NS_ERROR_MSG_MULTILINGUAL_SEND;
       }
+
+      PR_Free(outCString);
+      rv = nsMsgI18NSaveAsCharset(contentType, "UTF-8", 
+                              msgBody.get(), &outCString, 
+                              getter_Copies(fallbackCharset), &isAsciiOnly);
+      m_compFields->SetCharacterSet("UTF-8");
     }
     // re-label to the fallback charset
     else if (fallbackCharset)


I don't know if we should test the result again and alert the user in case of
failure. There might still be characters not convertable to UTF-8, yes?

Comment 12

15 years ago
Ah, not to forget GetBodyFromEditor() at nsMsgSend.cpp#1758, there is a similar
situation we should take care of.
(Assignee)

Comment 13

15 years ago
Created attachment 140935 [details] [diff] [review]
updated patch

fixes both files.
Attachment #140866 - Attachment is obsolete: true
(Assignee)

Comment 14

15 years ago
Comment on attachment 140935 [details] [diff] [review]
updated patch

asking for r/sr.
Attachment #140935 - Flags: superreview?(bienvenu)
Attachment #140935 - Flags: review?(sspitzer)
(Reporter)

Comment 15

15 years ago
As the patch merely make a feature currently in 1.6 work as it should, can we
expect to see it in 1.7? And in that context: are the Status & Resolution fields
correct? Perhaps a review won't happen before one claim it resolved? At least we
are past the 'New' status... Best regards, Leif.

Comment 16

15 years ago
Leif, see please http://bugzilla.mozilla.org/bug_status.html for the meaning of
Status and Resolution. But I change the status to Assigned.

I bet it will go into 1.7, but not 1.7a because that's to short.
Status: NEW → ASSIGNED
(Assignee)

Comment 17

15 years ago
ping, sspitzer. can you review? 
(Assignee)

Comment 18

15 years ago
Comment on attachment 140935 [details] [diff] [review]
updated patch

asking mscott for review
Attachment #140935 - Flags: review?(sspitzer) → review?(mscott)
(Assignee)

Comment 19

15 years ago
nominating as a blocker to 1.7beta and setting the target milestone to 1.7b
Flags: blocking1.7b?
Target Milestone: --- → mozilla1.7beta

Updated

15 years ago
Attachment #140935 - Flags: superreview?(bienvenu) → superreview+

Comment 20

15 years ago
Comment on attachment 140935 [details] [diff] [review]
updated patch

> nsCAutoString attachment1 [details] [diff] [review]_body;

I think that should be a nsCString. body text is likely to be longer than the
default auto string size.
Attachment #140935 - Flags: review?(mscott) → review+
(Assignee)

Comment 21

15 years ago
thanks all (especially Christian for the help), patch got landed.
Status: ASSIGNED → RESOLVED
Last Resolved: 15 years ago
Keywords: relnote
Resolution: --- → FIXED

Updated

14 years ago
Flags: blocking1.7b?
Product: MailNews → Core
Product: Core → MailNews Core
Keywords: relnote
You need to log in before you can comment on or make changes to this bug.