Closed Bug 561568 Opened 10 years ago Closed 4 years ago

Mails auto sent by "Reply with template" encode non-ASCII characters (in mail body) incorrectly.

Categories

(MailNews Core :: Filters, defect, major)

defect
Not set
major

Tracking

(thunderbird45 fixed, thunderbird46 fixed, thunderbird47 fixed)

RESOLVED FIXED
Thunderbird 47.0
Tracking Status
thunderbird45 --- fixed
thunderbird46 --- fixed
thunderbird47 --- fixed

People

(Reporter: quark, Assigned: mkmelin)

Details

Attachments

(3 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.3) Gecko/20100402 Firefox/3.6
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100330 Lightning/1.0b1 Shredder/3.0.4

Mails auto sent by "Reply with template" filter encode non-ASCII characters (in mail body) incorrectly no matter which charset is used.

These mails can be seen in "Sent" folder, where theire body text are already bad encoded.

Receivers will observe bad encoding of body text as well.

non-ASCII characters in mail title are correctly encoded.

Reproducible: Always

Steps to Reproduce:
1. Create a template, contains non-ASCII characters, such as "一只可爱的小花猫" (Chinese characters). No matter which charset is set, GBK, UTF-8 ... Save it.
2. Create a filter, has an action: "Reply with template", select the template created in step 1.
3. Trigger the filter, any method.
4. Wait a while when Thunderbird is sending in background. ("deliver mode: 0" is printed in terminal when Thunderbird starts to reply automatically)

Actual Results:  
5. Check "Sent" folder, find the auto reply mail just sent.
6. In mail body, "一只可爱的小花猫" is messed up, can't restore.
7. Receiver can observe same massed up non-ASCII text in mail body.


Expected Results:  
"一只可爱的小花猫" in mail body should be correctly encoded. It can be observed from "Sent" folder and the receiver.

It happens in mail body, not mail title, nor template name, etc.

So it should have no relationship with Bug 303954.

I can't find a similar bug in both closed and open issues.
Version: unspecified → 3.0
can you attach a saved template and a saved borked sent email to this bug using the add attachment link above ?

Have you tried safe mode? (see https://support.mozillamessaging.com/en-US/kb/Safe+Mode for more information)
I start thunderbird in safe mode using "thunderbird -safe-mode" from terminal.

Following steps are done in thunderbird 3 safe mode:

1. Create a template(auto-reply-template.eml) and a filter (screenshot attached).
2. Write a trigger mail(auto-reply-trigger.eml).
auto-reply-trigger.eml and auto-reply-template.eml are encoded correctly.

3. Send the mail to trigger the filter.
auto-reply-sent.eml is the mail sent by the filter, auto-reply-received.eml is the mail received.
These two mails are bad encoded in message body, while message title seems to be ok.

There is no problem with the mail server, I have tried on various mail services, same result.

Hope these pieces of information are helpful.
Attached image filter screenshot
Same behaviour here with TB ver 3.1.15. I found no way to send auto replies with non-Ascii text.

Tried the following with no success:
   - Changing the default character encoding for outgoing messages to either UTF-8 or ISO-8859-7 (was trying to send Greek text and the later is an 8-bit encoding for Greek text). 
   - Changing the default for composition to be plain text instead of html
   - send an image with my text instead of text 
   - place the non-ascii text to my signature (the signature was not added to the template)

I was recreating the template after every change but had no luck

I'm sorry that I have no time to document the failures more thoroughly but please note that it's extremely easy to reproduce and test (at least for a non-English-speaking-only developer)
Bug confirmed, same behavior persistent in Windows 7, Thunderbird 12.0.1!
Create message filter and add reply with template, if the template contains non ASCII text, such as Cyrillic (russian, bulgarian), UTF 8 encoding is selected and forced to use plain text only. I've tried with html and it is the same. It appears all messed up right in the sent folder, although correct encoding was selected.
Confirm this issue with Thunderburd 14 and cyrillic codepage
It seems that 8-bit encoding is causing the problem.
Reproduced the bug with UTF-8 but not with ISO-2022-JP.
Confirm this issue with Thunderbird 16.0.2 (GNU/Linux Ubuntu 12.04) with UTF-8 encoding

Can someone change the status of this bug ? (still appears unconfirmed to this date)
Confirm this issue with Thunderburd 17 and cyrillic codepage or UTF-8 symbols in template both under Windows and under Linux.
for example: i have a original Cyrillic string:
>
сис.админ ОАО "ПФ "КМТ"
тел.
<
Then TB sends a letter with template, 
In the sent letter source code this Cyrillic string look like a:
>
&ntilde;&egrave;&ntilde;.&agrave;&auml;&igrave;&egrave;&iacute; &Icirc;&Agrave;&Icirc; "&Iuml;&Ocirc; "&Ecirc;&Igrave;&Ograve;"
&ograve;&aring;&euml;.
<

To get back the initial string i need to decode sent letter with QP >>> win1251 encodings
I use the Shtirlitz 4 tool for this
part of this letter source: (from 'sent' mail folder)
>
Subject: auto repl (was: r me nowwwwwwwwwwww)
Content-Type: text/html; charset=windows-1251
Content-Transfer-Encoding: 7bit

<html>
  <head>
    <meta http-equiv="content-type" content="text/html;
      charset=windows-1251">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    &iuml;&eth;&egrave;&acirc;&aring;&ograve;<br>
    &lt;helo&gt;<br>
    <br>
    <pre class="moz-signature" cols="72">-- 
&ntilde;&egrave;&ntilde;.&agrave;&auml;&igrave;&egrave;&iacute; &Icirc;&Agrave;&Icirc; "&Iuml;&Ocirc; "&Ecirc;&Igrave;&Ograve;"
&ograve;&aring;&euml;.</pre>
  </body>
</html>
<
Confirming. Non-latin1 templates get messed up.
Severity: normal → major
Status: UNCONFIRMED → NEW
Component: General → Filters
Ever confirmed: true
Product: Thunderbird → MailNews Core
Version: 3.0 → Trunk
Assignee: nobody → mkmelin+mozilla
Wow. Using CopyASCIItoUTF16 on the body is just not a good idea.
Attachment #8716002 - Flags: review?(rkent)
OS: Linux → All
Hardware: x86 → All
Comment on attachment 8716002 [details] [diff] [review]
bug561568_auto_reply_charset.patch

Review of attachment 8716002 [details] [diff] [review]:
-----------------------------------------------------------------

I mostly have nits and questions, overall this seems fine. You can either answer my questions and ask for a re-review of the same patch, or provide a new patch. BDS discouraged feedback+ so I'll follow his advice and give r- since I'd like another interaction with you before landing.

::: mailnews/compose/src/nsMsgComposeService.cpp
@@ +960,5 @@
>    compFields->SetRawHeader("Auto-Submitted", NS_LITERAL_CSTRING("auto-replied"), nullptr);
>  
> +  nsCString charset;
> +  rv = mTemplateHdr->GetCharset(getter_Copies(charset));
> +  NS_ENSURE_SUCCESS(rv, rv);

I realize that there are returns of this sort throughout this code, but actually this is not correct. The caller of OnStopRunningURL knows nothing about how you want to respond to errors, and typically would take no action if you return a failure code. But this indicates that a send failed, and there should be some user notification of this.

Unfortunately there is no good solution for this. There is no handling of async errors like this in the filter system, so I'll have to let it pass :(

@@ +963,5 @@
> +  rv = mTemplateHdr->GetCharset(getter_Copies(charset));
> +  NS_ENSURE_SUCCESS(rv, rv);
> +  compFields->SetCharacterSet(charset.get());
> +  rv = nsMsgI18NConvertToUnicode(charset.get(), mTemplateBody, body);
> +  NS_ENSURE_SUCCESS(rv, rv);

Now though we have a choice. nsMsgI18NConvertToUnicode I suppose could reasonably fail if there were invalid characters in the template. How should we respond to that? I think that the previous behavior (revert to convert from ascii) is preferable behavior to just silently failing. So on error, I would add an NS_WARNING then do CopyASCIItoUTF16.

::: mailnews/compose/test/unit/test_autoReply.js
@@ +86,2 @@
>  /// Test that a reply is NOT sent when the message is not addressed to "me".
>  add_task(function testReplyingToUnadressedFails() {

Feel free to fix all of the misspellings of "adressed" -> "addressed" as part of this revised patch.

@@ +99,5 @@
>  
>  /// Test that a reply is sent when the message is addressed to "me".
>  add_task(function testReplyingToAdressedWorks() {
>    try {
> +    testReply(1); // mail 1 is adressed to us, using template-latin1

Since you are changing this line, please fix the misspelling of adressed (addressed).

@@ +109,5 @@
> +
> +/// Test that a reply is sent when the message is addressed to "me".
> +add_task(function testReplyingToAdressedWorks() {
> +  try {
> +    testReply(1, 1); // mail 1 is adressed to us, template-utf8

nit: misspelled addressed

@@ +154,5 @@
> +  // XXX: raw body string is latin1/ascii. Is that reasonable, or should the
> +  // conversion happen in... say, maild.js nsMailReader?
> +  var utf8Converter = Components.classes["@mozilla.org/intl/utf8converterservice;1"]
> +    .getService(Components.interfaces.nsIUTF8ConverterService);
> +  gServer._daemon.post = utf8Converter.convertStringToUTF8(gServer._daemon.post, "UTF-8", true);

I don't understand the intent of this. As written, it converts from UTF-8 to UTF-8, that is does nothing (except check for failures). If the intent is to check for failures, that is unusual enough to require a comment. But why don't you convert from latin1/ascii instead as your comment implies?

@@ +162,4 @@
>    do_check_true(headers.get("Subject").startsWith("Auto: "));
>    do_check_eq(headers.get("Auto-submitted"), "auto-replied");
> +  if (templateHdr.Charset == "windows-1252") {
> +    if (!body.includes("åäö latin1")) { // template-latin1 contains this

Isn't there any way to actually test for "åäö" which is what was in your original file? I assume this is related to my earlier comment about utf-8 conversion. If not, then you need some comment explaining why this does not match what is actually in the file.
Attachment #8716002 - Flags: review?(rkent) → review-
(In reply to Kent James (:rkent) from comment #16)
> Now though we have a choice. nsMsgI18NConvertToUnicode I suppose could
> reasonably fail if there were invalid characters in the template. How should
> we respond to that? I think that the previous behavior (revert to convert
> from ascii) is preferable behavior to just silently failing. So on error, I
> would add an NS_WARNING then do CopyASCIItoUTF16.

I don't think so, that would send garbled text. I don't see how anyone would prefer that.
There was a subtle bug in my original patch, but it was hard to notice as the header said one charset and the inline content <meta> said another.

There's still something fishy going on with the conversions in the fake server so the test is now... eeh:/

Anyway, the code does work in real life and I've been unable to track down where the test server stream handling does something wrong :( Unless someone has good ideas what might be wrong I'd just go with this for now.
Attachment #8716002 - Attachment is obsolete: true
Attachment #8724940 - Flags: review?(rkent)
Status: NEW → ASSIGNED
(In reply to Magnus Melin from comment #17)
> (In reply to Kent James (:rkent) from comment #16)
> > Now though we have a choice. nsMsgI18NConvertToUnicode I suppose could
> > reasonably fail if there were invalid characters in the template. How should
> > we respond to that? I think that the previous behavior (revert to convert
> > from ascii) is preferable behavior to just silently failing. So on error, I
> > would add an NS_WARNING then do CopyASCIItoUTF16.
> 
> I don't think so, that would send garbled text. I don't see how anyone would
> prefer that.

Then you still need to decide how to respond. It's a common error throughout our code to NS_ENSURE_SUCCESS in one of the listener objects, when the parent is not going to do anything with that error. I'm dealing with one of those at this moment actually in another bug.

This is the place where you need to decide how to respond to the error. If you don't like my idea, you need to propose your own. Just returning with an error is not the correct approach if the caller is going to ignore it.
Like you said in comment 16, there's no good solution for this - and we don't do anything special for the other error cases either. You do get "evidence it failed" in that the message doesn't get marked as replied. 

It's also very unlikely this particular call would fail. It's the charset of a message we created ourselves so we're quite sure we can cope with that encoding.
(In reply to Magnus Melin from comment #20)
> Like you said in comment 16, there's no good solution for this - and we
> don't do anything special for the other error cases either. You do get
> "evidence it failed" in that the message doesn't get marked as replied. 
>

Actually that is not true. The reply flag is set in the initial call, in "folder->AddMessageDispositionState" which would persist regardless of a failure of the send in OnStopRunningURL. The reply flag should be set much later to prevent exactly the kind of false reply indicator that will occur in case of error.

But that is not really your issue, so this is not the place to fix that.
Comment on attachment 8724940 [details] [diff] [review]
bug561568_auto_reply_charset.patch

Review of attachment 8724940 [details] [diff] [review]:
-----------------------------------------------------------------

Although I continued to comment on the error handling, I'll let you decide what to do, it is not worth further discussion. Just don't add NS_ENSURE_SUCCESS in the OnStopRunningURL handler. r+ with any other approach.

Also in looking this over, I notice in our last patch to ComposeService that we introduced a possible problem. After the Release() in the handler, if that is the last reference then the member variables, like mIdentity, are undefined. I don't think that is the last reference (which means the AddRef/Release is probably not needed) but if it we introduced a possible error.

::: mailnews/compose/src/nsMsgComposeService.cpp
@@ +963,5 @@
> +  rv = mTemplateHdr->GetCharset(getter_Copies(charset));
> +  NS_ENSURE_SUCCESS(rv, rv);
> +  compFields->SetCharacterSet(charset.get());
> +  rv = nsMsgI18NConvertToUnicode(charset.get(), mTemplateBody, body);
> +  NS_ENSURE_SUCCESS(rv, rv);

I'm going to continue to harp on the error handling here. You are trying to do the "right thing" by testing for errors in things that should not fail, but unfortunately it is not the "right thing" the way this method works. By adding in code that handles errors incorrectly, you are encouraging the next person that comes along to follow your example.

The method is self-owned with a release at the end, so any errors should result in the release being called. That would be the "right thing". The original author handled this by not checking for errors, so that ultimately the Release() would get called. You are trying to introduce a mostly new approach, which is not really a correct approach. The existing design is to continue in spite of errors. I think that you need to follow that approach.

Returning an error code to the caller of OnStopRunningURL is pointless. Also by simply returning, you are introducing a silent failure to reply.

I would be happy with one of these approaches:

1) Don't check for errors, as in the original code. A failure in most cases should result in a blank body, which IMHO is preferable to silently failing with a leak.

2) Check for errors, but don't return, simply add NS_WARNING. I don't think that is worth it myself, but if important to you that would be acceptable.

3) Add a block for your code, if you fail then warn then return NS_OK after doing a Release().

In my own code (where I have a lot of these little C++ listener helpers) I put everything in a do {} while (false) block, with a break on error to the end-of-method processing. That is similar to try {} catch {} finally {} in JS.
Attachment #8724940 - Flags: review?(rkent) → review+
https://hg.mozilla.org/comm-central/rev/afc38db4e0ef -> FIXED

Went for the warning. There are already 7-8 NS_ENSURE_SUCCESS is that method so error handling that way was how things were done around there... which might be a slight problem wrt the release(). Oh well.
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 47.0
Comment on attachment 8724940 [details] [diff] [review]
bug561568_auto_reply_charset.patch

[Approval Request Comment]
User impact if declined: non-latin1 chars in templates sent garbled
Testing completed (on c-c, etc.): xpcshell-test, landed on trunk
Risk to taking this patch (and alternatives if risky): not risky
Attachment #8724940 - Flags: approval-comm-beta?
Attachment #8724940 - Flags: approval-comm-aurora?
Attachment #8724940 - Flags: approval-comm-aurora? → approval-comm-aurora+
Aurora (TB 46):
https://hg.mozilla.org/releases/comm-aurora/rev/47ade0574ee0

Note to the beta lander:
Trivial merge conflict was resolved in mailnews/compose/src/nsMsgCompose.cpp, so please use Aurora changeset.
Comment on attachment 8724940 [details] [diff] [review]
bug561568_auto_reply_charset.patch

http://hg.mozilla.org/releases/comm-esr45/rev/6632b699c150
Attachment #8724940 - Flags: approval-comm-beta? → approval-comm-beta+
Retested with Thunderbird 45.0
The auto sent email body was encoded in Quoted-printable but with "Content-Transfer-Encoding: 7bit" set in the header it became unreadable.
(In reply to Masanori Miyoshi from comment #27)
> Retested with Thunderbird 45.0
> The auto sent email body was encoded in Quoted-printable but with
> "Content-Transfer-Encoding: 7bit" set in the header it became unreadable.

It seems that the above only happens when mail.strictly_mime is set to "true".
Comment on attachment 8724940 [details] [diff] [review]
bug561568_auto_reply_charset.patch

Changed approval from beta to esr45 since it was never landed on beta.
Attachment #8724940 - Flags: approval-comm-beta+ → approval-comm-esr45+
You need to log in before you can comment on or make changes to this bug.