Closed Bug 1026989 Opened 10 years ago Closed 8 years ago

Reply/"Edit as new"/"Forward As->Attachment"/Display message/Text Encoding of View, Options etc. : charset picked up from last attachment

Categories

(Thunderbird :: Message Compose Window, defect)

24 Branch
defect
Not set
normal

Tracking

(thunderbird_esr45? affected, thunderbird51 fixed, thunderbird52 fixed, thunderbird53 fixed)

VERIFIED FIXED
Thunderbird 53.0
Tracking Status
thunderbird_esr45 ? affected
thunderbird51 --- fixed
thunderbird52 --- fixed
thunderbird53 --- fixed

People

(Reporter: darkranger_red, Assigned: jorgk-bmo)

References

Details

(Whiteboard: [Only "Forward Inline" case was already resolved by unknown bug/patch] [see bug 961983 for UTF16-LE specific issue])

User Story

Why should fixing of "charset is picked up from last part" issue be postponed to this bug, despite that the issue was found by bug 597821 on 2010-09-19 and many subsequent bugs, bug 604628, bug 618465, bug 633217, bug 701818,  bug 716983, were reported and analyzed?
What is cause of the so-long-delay in problem resolution despite that many clear/crisp bugs were reported?

Attachments

(4 files, 5 obsolete files)

Attached file test.abc
User Agent: Mozilla/5.0 (Windows NT 5.1; rv:30.0) Gecko/20100101 Firefox/30.0 (Beta/Release)
Build ID: 20140605174243

Steps to reproduce:

1. Send a message by Thunderbird which attached test.abc. test.abc is a UTF-16LE text file with unrecognized file extension (*.abc)
2. Receive that message in Thunderbird.
3. Reply that message



Actual results:

Composer detected the wrong encoding (UTF-16LE)


Expected results:

Composer should detected the correct encoding.
We just had a duplicate of this in bug 1314910.

What is the status of UTF-16 encoding is in TB, Magnus, do you know? UTF-7 is discontinued, right, and I've never seen UTF-16 being used other than internally.

Of course another aspect of the bug is that the attachment's encoding shouldn't be used for the reply.
Flags: needinfo?(mkmelin+mozilla)
OK, here we have a message encoded in EUC-KR (Korean) with a text/plain attachment in KOI8-R (Cyrillic). When answering the message, the charset is set to KOI8-R. IMHO that's wrong, the answer should be in the original charset of the message or the user's default charset.

Quoting the relevant headers:

Body:
Content-Type: text/plain; charset=EUC-KR; format=flowed
Content-Transfer-Encoding: 7bit

Attachment:
Content-Type: text/plain; charset=KOI8-R;
 name="test.tmx"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="test.tmx"

I could bet there is already a bug for picking up the charset from the attachment.
Status: UNCONFIRMED → NEW
Ever confirmed: true
For the record: Reply and "edit as new" pick the attachment's charset, forward uses the charset of the message.
Summary: Encoding corrupted when replying the message which has a attachment in UTF-16LE with unrecognized file extension → Encoding corrupted when replying the message which has a attachment in UTF-16LE with unrecognized file extension - reply charset picked from the attachment
(In reply to Jorg K (GMT+1) from comment #2)
> What is the status of UTF-16 encoding is in TB, Magnus, do you know? UTF-7
> is discontinued, right, and I've never seen UTF-16 being used other than

UTF-16 is not working, and would apparently not be a good idea to support (see bug 275023). 

UTF-7 is used heavily internally (due to the IMAP spec). Likely working for external use too, but for that it would be very rare.
Flags: needinfo?(mkmelin+mozilla)
The special case of picking up UTF-16 from an attachment will be fixed in bug 961983 which will put a stop to using UTF-16 in compositions in general.

The problem of this bug persists, that is picking up the charset from an attachment.
Summary: Encoding corrupted when replying the message which has a attachment in UTF-16LE with unrecognized file extension - reply charset picked from the attachment → Reply charset picked up from attachment (picking up UTF-16LE is particularly bad, bug fixed in bug 961983)
Summary: Reply charset picked up from attachment (picking up UTF-16LE is particularly bad, bug fixed in bug 961983) → Reply charset picked up from attachment (picking up UTF-16LE is particularly bad, but fixed in bug 961983)
I thought charset of message body was always used instead of charset of last attachment part in multipart/mixed after bug 715823 was resolved, because bug 715823 was closed as FIXED.
In bug 715823 comment #60, "worksfome in Tb 46", reference to bug 1239658, bug 597369 are seen.

Jorg K.
"charset picked up from last attachment" part in forward/"edit as new" case(Bug summary of bug 715823) was resolved but problem still remains in Reply case?
Or problem in bug 715823 was in "no conversion" part instead of "charset from wrong subpart" part and it was fixed, but problem remails in special UTF-16LE case?
Or Problem of that "Reply to mail composed with UTF-16LE" uses UTF-16LE?
(so, "picking up UTF-16LE is particularly bad". bug 1193497 looks this case.)
Wada, as I wrote in bug 715823, I couldn't reproduce the problem with the given test cases, so I closed it. I've just looked at the test cases again and it looks like I made a mistake.

On test case had this attachment
Content-Type: text/plain;
 name="buggy-attachment-sjis-1.txt"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="buggy-attachment-sjis-1.txt"
so no charset was specified, so obviously none was picked up.

The other example has
Content-Type: text/plain; charset=windows-1252;
 name="sjis.txt"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="sjis.txt"
and the windows-1252 does get picked up.

Too bad, I won't reopen bug 715823 at comment #60, so I'll look into it here.
Oh boy, this is so confusing. Bug 715823 was about "Forward" and "Edit as new", this bug here is about "Reply" and we already know that these are very different code paths.

Looking at the test cases in bug 715823 yet again the one which has
Content-Type: text/plain; charset=windows-1252;
 name="sjis.txt"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="sjis.txt"
does NOT pick up the "windows-1252" on forward, but it does pick it up on "edit as new" and "reply".

Anyway, I'll change the summary and look at "reply" and "edit as new" as "forward" is already working.
Summary: Reply charset picked up from attachment (picking up UTF-16LE is particularly bad, but fixed in bug 961983) → Reply/"Edit as new" charset picked up from attachment (picking up UTF-16LE is particularly bad, but fixed in bug 961983)
I read bug 715823 carefully again. In bug 715823 comment #50 I found a patch (attachment 8625732 [details] [diff] [review]) by Philippe Martinak. I think this patch is correct, it fixes an obvious bug in the mime code.

I lightly tested with the various messages I have and this seems to work just fine.

I don't know why Philippe didn't pursue this further.

Here's a try run and then we'll see:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=c12a0e28c556aebb86fb598179e34be4564b164c

Philippe, if you're still around somewhere, I'm looking at your patch with the aim to use it.

Thank you!
Flags: needinfo?(philippe.martinak)
(In reply to Jorg K (GMT+1) from comment #12)
> 1026989-charset-body.patch from bug 715823 by Philippe Martinak

Only one liner change? (only 6 bytes on only one line. in binary code, only 1 bit change!)
> -    bool is_body = false;
> +    bool is_body = true;
Thanks for quick re-checking bug 715823 and resolving long lived "pick up from attachment" problem which produced excess bug opens.
FYI.
Following test mail for bug 701819 can be used for automated test.
> https://bug701818.bmoattachments.org/attachment.cgi?id=574216
Please change From: or To: to Yatter King <yatter.king@gmail.com> :-)
(In reply to Jorg K (GMT+1) from comment #11)
> Oh boy, this is so confusing. Bug 715823 was about "Forward" and "Edit as new", (snip)

A cause of confusion is following.

1. bug 597821 was opened for "pick up from attachment" + "garbled display" + Forward.
   bug 597821 was caused by wrongly inserted and badly formed part by AVG.
   bug 604628, bug 618465, bug 633217 were closed as dup of bug 597821.
   These are for Forward case. These duped reports looked for "Forward of normal mail".
2. bug 701818 and bug 716983 was opened by same person. This was for "pick up from attachment" + "garbled display" + Forward too.
   Because bug 597821 was caused by wrongly inserted and badly formed part by AVG,
   "pick up from attachment" part may be masked by "wrongly inserted and badly formed part by AVG".
   So I kept these two bugs separately for phenomenon on normal mail.
3. bug 701818 and bug 716983 was closed as dup of bug 715823.
   bug 715823 was initially reported for Forward case by bug opener of bug 701818 and bug 716983.
   And, bug 597821(and its dups) were ignored in bug 715823, or was ignored upon duping by someone.
   Report/comments on Forward case and "Edit New" case were mixed in this bug, so I added "Edit As New" in bug summary.
4. Something were done in bug 715823. It seems that important one bit only patch somehow masked by many comments.
5. And, suddenly, you closed bug 715823 :-)

As you wrote in bug 715823(and in this bug), problem in Forward case was not observed when I checked for Forward bugs which I knew.
In bug 701818 and bug 716983, I knew that "pick up from attachment" occurs in Reply case too, but "garbled display" never occurred in Reply case. So I didn't mixed Reply case in these two bugs to avoid confusion or misunderstanding.
Bug 715823 was for both Forward and "Edit As New" after many comments, and many developers commented in this bug. So I thought "pick up from attachment" issue was resolved in any case when you closed bug 715823 as FIXED.

I believe that one of causes of confusions is "inappropriate duping" and "problem resolution work in inappropriate bug report".
If "problem resolution work" was done in "crisp bug report for Forward case only", or if "duping was done to crisp bug report for Forward case only", I think excess confusion wouldn't have happened.
Please see 3 test mails I attached to bug 701818. These are not minimum test cases, but I believe it's sufficient to understand issues. I don't think many test mails like Bug 715823 are needed for problem analysis and resolving.
And, please note that all of bug 597821, bug 604628, bug 618465, bug 633217, bug 701818, bug 716983 are for Forward case only.
Only Bug 715823 is bug where reports/comments on Forward and "Edit As New" are mixed. I feel that "Duping" was done to one of worst bug in bug reports for "pick up from attachment" issue.

I think "problem resolution work" is better done in different/separated bug from bug for problem report.
If "problem resolution work" was done on each Forward, Edit As New, Reply, or was done on "pick up from attachment" only, or was done on "garbled display" only, or was done on appropriate combination of elements, I think such useless/excess confusion was not produced.
Comment on attachment 8817507 [details] [diff] [review]
1026989-charset-body.patch from bug 715823 by Philippe Martinak

I believe this patch is correct.

It doesn't cause any test failures, see try here:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=c12a0e28c556aebb86fb598179e34be4564b164c

I will add a test and get this landed.
Attachment #8817507 - Flags: review+
(In reply to WADA from comment #14)
> Following test mail for bug 701818 can be used for automated test.
> https://bug701818.bmoattachments.org/attachment.cgi?id=574216
Thanks, my test case in attachment 8807459 [details] is all I need.
I've just tested:
1) Reply 2) Edit as new 3) Forward inline 4) Forward as attachment.
*All* work with the patch, only 3) works without the patch in TB 45 and current trunk. In TB 38 none work.
How about that ;-)
Summary: Reply/"Edit as new" charset picked up from attachment (picking up UTF-16LE is particularly bad, but fixed in bug 961983) → Reply/"Edit as new"/"Forward As->Attachment" charset picked up from last attachment (picking up UTF-16LE is particularly bad, but fixed in bug 961983)
Fixed the reviewer.
Attachment #8817507 - Attachment is obsolete: true
Attachment #8817615 - Flags: review+
Assignee: nobody → jorgk
Status: NEW → ASSIGNED
Attachment #8817622 - Flags: review?(acelists)
(In reply to Jorg K (GMT+1) from comment #17)
> https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=c12a0e28c556aebb86fb598179e34be4564b164c

Where/How can I get try-server build for the changeset/pushid?
It was only compiled for Linux. Do you want Linux? Click on the "B" and then on
  Build: x86_64 linux64 linux
at the bottom left.
I have Windows only. I believe there is no need of testing by many peoples, because "1 bit only patch" and already fully aged :-)
Thanks.
Although bug summary of this bug has Reply/Edit As New/Forward As Attachment only because Forward Inline case was already WORKSFORME by unknown patch, this bug resolves long lived "charset picked up from last attachment" problem at last.
Duping old bugs for "charset picked up from last attachment" issue to this bug, for ease of tracking, search etc.
No longer blocks: 597821
Flags: in-testsuite+
OS: Windows XP → All
Hardware: x86 → All
Comment on attachment 8817622 [details] [diff] [review]
1026989-charset-body-test.patch (v1)

Review of attachment 8817622 [details] [diff] [review]:
-----------------------------------------------------------------

Nice test. But it does not pass for me even with the patch. Somehow the title does not contain the charset string.

But could we check the charset with some better function? E.g. editor->GetDocumentCharacterSet() ?

::: mail/test/mozmill/composition/test-reply-multipart-charset.js
@@ +15,5 @@
> +var RELATIVE_ROOT = "../shared-modules";
> +var MODULE_REQUIRES = ["folder-display-helpers", "compose-helpers", "window-helpers"];
> +
> +Cu.import("resource://gre/modules/Services.jsm");
> +Cu.import("resource:///modules/mailServices.js");

These seem unused.

@@ +27,5 @@
> +  for (let lib of MODULE_REQUIRES) {
> +    collector.getModule(lib).installInto(module);
> +  }
> +
> +  draftsFolder = get_special_folder(Ci.nsMsgFolderFlags.Drafts, true);

This seems unused.

@@ +55,5 @@
> +  close_window(msgc);
> +
> +  // Check the charset in the compose window. It should be EUC-KR
> +  // like in the input message.
> +  assert_true(cwc.window.document.title.includes("EUC-KR"));

Please add a message here that is shown when the check fails. Otherwise the test failure looks strange.
E.g.
  assert_true(cwc.window.document.title.includes("EUC-KR"), "Expected EUC-KR charset not found");

@@ +67,5 @@
> +  subtest_reply_editAsNew_forward(4);
> +}
> +
> +function teardownModule() {
> +}

This is not needed when empty.

::: mail/test/mozmill/shared-modules/test-compose-helpers.js
@@ +149,5 @@
> + *
> + * @return The loaded window of type "msgcompose" wrapped in a MozmillController
> + *         that is augmented using augment_controller.
> + */
> +function open_compose_with_edit_as_new(aController) {

Document aController. It seems to be the main window controller, not e.g. a compose window one.

@@ +153,5 @@
> +function open_compose_with_edit_as_new(aController) {
> +  if (aController === undefined)
> +    aController = mc;
> +
> +  windowHelper.plan_for_new_window("msgcompose");

Can you test whether a message is actually selected first?

@@ +154,5 @@
> +  if (aController === undefined)
> +    aController = mc;
> +
> +  windowHelper.plan_for_new_window("msgcompose");
> +  aController.click(aController.eid("menu_editMsgAsNew"));

What is this clicking on? Please open the menu first (using click_menus_in_sequence).
Attachment #8817622 - Flags: review?(acelists) → feedback+
(In reply to :aceman from comment #31)
> But could we check the charset with some better function? E.g.
> editor->GetDocumentCharacterSet() ?
Yes, I thought of a better test, but nothing came to mind. I'll see.

> ::: mail/test/mozmill/shared-modules/test-compose-helpers.js
Did you notice that open_compose_with_forward_as_attachments?

It's not fair to complain about all those shortcomings of *existing* code. Am I going to fix the whole file now? There are several functions: reply, reply all, reply list, forward, forward as attachment. Do you want all those fixed now?
(In reply to Jorg K (GMT+1) from comment #20)
> 1026989-charset-body-test.patch (v1)

Does other part follow after following?
> +Just some text.
> +--------------FBDBC8434E9A3FF82D6E6F3E
> diff --git a/mail/test/mozmill/composition
If no data after it, does this test contain "no close boundary" test?
If not, it should be close boundary.
By the way, boundary = "--" + BoundaryText, "--" + BoundaryText + "--", so "-" in BoundaryText is better avoided for visibility, and there is no need of long BoundaryText for testing.
Will fix, thanks.
(In reply to Jorg K (GMT+1) from comment #32)
> > ::: mail/test/mozmill/shared-modules/test-compose-helpers.js
> Did you notice that open_compose_with_forward_as_attachments?
> 
> It's not fair to complain about all those shortcomings of *existing* code.
> Am I going to fix the whole file now? There are several functions: reply,
> reply all, reply list, forward, forward as attachment. Do you want all those
> fixed now?

From the patch it is not seen whether it is just duplicated code ;) OK, you do not need to fix them all.
OK, let's see whether this passes for you.

This includes a test for UTF-16 from bug 961983, so obviously you need to get the patch from that bug and apply it.

If you approve the test before Magnus approves the other change, I will refactor this accordingly. Or you can steal the review in the other bug.
Attachment #8817622 - Attachment is obsolete: true
Attachment #8817701 - Flags: review?(acelists)
(In reply to Jorg K (GMT+1) from comment #37)
> This includes a test for UTF-16 from bug 961983, so obviously you need to
> get the patch from that bug and apply it.

This is not how it should be done.

You just fix the tests here and get it landed. Then in bug 961983 you add the test to this same test file.
Test for this bug only.
Attachment #8817701 - Attachment is obsolete: true
Attachment #8817701 - Flags: review?(acelists)
Attachment #8817708 - Flags: review?(acelists)
Oops, never edit patches manually ;-)
Attachment #8817708 - Attachment is obsolete: true
Attachment #8817708 - Flags: review?(acelists)
Attachment #8817711 - Flags: review?(acelists)
Comment on attachment 8817711 [details] [diff] [review]
1026989-charset-body-test.patch (v3b).

Review of attachment 8817711 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, this is mostly fine now.

Just that it still doesn't pass on Linux for me.
Started https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=911356fa24aae9afcca7cbe94e7ad1bb06d3128d to see if it is just me.

::: mail/test/mozmill/composition/test-reply-multipart-charset.js
@@ +49,5 @@
> +
> +  // Check the charset in the compose window. Somehow the property
> +  // is returned lower case.
> +  let charset = cwc.e("content-frame").contentDocument.charset;
> +  assert_true(charset == aCharset.toLowerCase(),

Now that you have 2 values, you can use assert_equals(charset, aCharset.toLowerCase(), "the messages").
In this way I have found this still doesn't pass for me as I get 'utf-8' as the charset in document, not EUC-KR.

But when I have the sample message in a folder and reply to it manually, I get EUC-KR in the title. What may be wrong when running the test?

@@ +54,5 @@
> +              "Compose window has the wrong charset");
> +  close_compose_window(cwc);
> +}
> +
> +function test_reply_editAsNew_forward() {

Maybe test_replyEditAsNewForward_charset().

@@ +55,5 @@
> +  close_compose_window(cwc);
> +}
> +
> +function test_reply_editAsNew_forward() {
> +  subtest_reply_editAsNew_forward(1, "./multipart-charset.eml", "EUC-KR");

Please add some comment on what this block tests, like "// Check the charset is taken from the msg body.", considering the next patch will add more tests here that check something else.
(In reply to :aceman from comment #41)
> Just that it still doesn't pass on Linux for me.

Met with following bug in test?
- When nothing is loaded at message pane yet, Reply etc. from context menu at Thread pane uses default charset
- When mail of charset=XXX is loaded at message pane, Reply etc. from context menu at Thread pane uses charset=XXX
- When message pane is not shown, Reply etc. from context menu at Thread pane uses default charset

If so, "Load the mail in message pane, invoke Reply etc. in onload event handler", "Set charset=EUC-KR in message pane" etc. may be needed in test.
I don't know which field is looked upon invoking composer, but at least following depends on shown mail at messagepane.
> browser id=messagepane
>   #document
>     HTML
>       Body
>         DIV <= lang attribute value of this DIV
>                x-unicode : utf-8 mail      (composed with Unicode)           
>                ja        : iso-2022-jp mail(composed with iso-2022-jp) 
>                x-western : windows-1252 mai(composed with Western)     
>           #text <= body text of shown plain text mail
Problem due to "difference of charset/lang in messagepane" is not observed on plain text mail in Thunderbird 45.
It seems multipart mail only problem in Tb 45. 
aceman, "you resolve the bug first, then do test for this bug" may be fastest way :-)the
FYI.
If window.gDBView is available in test environment, function like following can be used for loading message.
  gDBView.loadMessageByMsgKey = function loadMessageByMsgKey()
  gDBView.loadMessageByUrl = function loadMessageByUrl()
Sorry, message was opened by let msgc = open_message_from_file(file); in test.
So, the message is selected at hidden gDBView and message is loaded in id=messagepane at the opened message window.
When eml file of test mail or test message in mail folder was loaded in new message window by Tb 45, lang of DIV in id=messagepane was already set as lang="x-cyrillic".
So, internal path may be different between mail in message pane of messenger and mail in standalone message window.
I don't know difference between actual Thunderbird environment and test environment, but I think we are better careful about difference of DIV in id=messagepane/lang="ko"(3pane window) and DIV in id=messagepane/lang="x-cyrillic"(standalone message window)..
(In reply to :aceman from comment #41)
> Started
> https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=911356fa24aae9afcca7cbe94e7ad1bb06d3128d
> to see if it is just me.
Thanks, it's "just" Linux and Mac where it doesn't pass. I haven't looked into the detail.

> But when I have the sample message in a folder and reply to it manually, 
> I get EUC-KR in the title. What may be wrong when running the test?
Having it in a folder changes the behaviour, that's why I do it twice here:
https://dxr.mozilla.org/comm-central/rev/29b034f30fcb12b0e252a33502ba44521f5768d9/mail/test/mozmill/composition/test-forward-utf8.js#100
About following in test.
> + cwc = open_compose_with_reply(msgc);

open_compose_with_reply(msgc) is WindowScopeOfThisTest. ... .open_compose_with_reply(WindowScopeOfOpenedMessageWindow.window);
If "window" is referred explicitly or implicitly in open_compose_with_reply, if it's ordinal module, this is WindowScopeOfThisTest, or if it's jsm module, there is no "window" of ordinal environment.
If such thing is relevant, I think msgc. ... .open_compose_with_reply(msg) is better for test.
OK, now I run the messages through a folder. That's the more realistic scenario anyway.

I submitted a try for together with the changes for bug 961983:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=82ea022b7dc80f4c91c7c2f654083a52a2108a26
Attachment #8817711 - Attachment is obsolete: true
Attachment #8817711 - Flags: review?(acelists)
Attachment #8817789 - Flags: review?(acelists)
Comment on attachment 8817789 [details] [diff] [review]
1026989-charset-body-test.patch (v4a).

Review of attachment 8817789 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, seems to work now. I wonder why the behaviour differs when opening from a local file (and differently on platforms). Another bug? :)

::: mail/test/mozmill/composition/test-reply-multipart-charset.js
@@ +26,5 @@
> +  for (let lib of MODULE_REQUIRES) {
> +    collector.getModule(lib).installInto(module);
> +  }
> +
> +  folderToStoreMessages = create_folder("FolderWithMessages");

OK, please remove the folder in teardownModule now.

@@ +77,5 @@
> +  press_delete(mc);
> +}
> +
> +function test_replyEditAsNewForward_charset() {
> +  subtest_replyEditAsNewForward_charset(1, "./multipart-charset.eml", "EUC-KR");

Please add the comment I asked for here.
Attachment #8817789 - Flags: review?(acelists) → review+
(In reply to :aceman from comment #50)
> Thanks, seems to work now. I wonder why the behaviour differs when opening
> from a local file (and differently on platforms). Another bug? :)
Hmm, well, opening from a file is not common practise, so I don't know how far I want to support that since it also seems platform dependent.

> OK, please remove the folder in teardownModule now.
OK.

> Please add the comment I asked for here.
I did:
https://hg.mozilla.org/try-comm-central/rev/cefa311f72413fd74d88a02002e54c6949df7d80#l2.37

Is that not what you wanted?
(In reply to :aceman from comment #50)
> I wonder why the behaviour differs when opening from a local file (and differently on platforms). Another bug? :)

As I wrote in comment #46, at least following difference exists:
(a) DIV in id=messagepane/lang="ko" (message pane of 3pane window=messenger window)
(b) DIV in id=messagepane/lang="x-cyrillic" (standalone message window, perhaps same in Tab)
(b) indicates that the test mail/eml is opened as "koi8-r" mail.
This difference can be called "bug", if design/intent is same mail handling in both messenger window and message window.
But, if design/intent is "same display", it's never bug, because "same display" is surely achieved.

I don't know well difference between "Reply at message window before patch" and "Reply at message window before patch".
But this bug is surely reproduced by "Reply at message window before patch" in actual Thunderbird 45 on Win10.

I don't know difference between actual Thunderbird environment and test environment,
I don't know difference between "test on Linux/Mac OS X" and "test on Windows".
If Mac OS X only difference, it may be "Apple Menu is located in OS instead of Tb", as seen in source comment of folderWidgets.xml.
But it's difference on both Linux and Mac OS X who are member of *nix family(Linux, Berkley).
So I suspected "window scope in test environment".

Because "problem in test" doesn't look to occur when test mail is loaded from mail folder to stand alone message window in test environment, it may be difference between "load from eml file" and "load from mail folder".
Aceman, how do you delete a folder in a mozmill test? In test-forward-utf8.js I don't delete it either.
I don't see any test that would do it or any function to do it.
Nothing in: test-folder-display-helpers.js
Flags: needinfo?(acelists)
https://hg.mozilla.org/comm-central/rev/a01768ba37569080112f94aa5f7db03ee86e4e00
https://hg.mozilla.org/comm-central/rev/7a69461597e94a4a25157f0a8e17d1ec53df90b9

I landed the test with extra comments as requested by Aceman and Magnus in bug 961983.

I don't remove the folder since there doesn't appear to be a straight-forward way to do it. Eight Mozmill tests in mozmill/composition create folder and never clean up.

Once again, many thanks to Philippe Martinak wherever you might be.
Flags: needinfo?(philippe.martinak)
Flags: needinfo?(acelists)
Target Milestone: --- → Thunderbird 53.0
Comment on attachment 8817615 [details] [diff] [review]
1026989-charset-body.patch from bug 715823 by Philippe Martinak

[Approval Request Comment]
Regression caused by (bug #): That seems to have been wrong since day one of MIME.
User impact if declined: Some messages get sent with an incorrect character set.
Testing completed (on c-c, etc.): Manual and with test.
Risk to taking this patch (and alternatives if risky):
One-word change in MIME, of course that could blow up big time ;-)
Attachment #8817615 - Flags: approval-comm-esr45?
Attachment #8817615 - Flags: approval-comm-beta?
Attachment #8817615 - Flags: approval-comm-aurora+
Comment on attachment 8817789 [details] [diff] [review]
1026989-charset-body-test.patch (v4a).

[Approval Request Comment]
See preceding comment. This is the test.
Attachment #8817789 - Flags: approval-comm-esr45?
Attachment #8817789 - Flags: approval-comm-beta?
Attachment #8817789 - Flags: approval-comm-aurora+
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
> Resolution: --- → FIXED

Thanks a lot for your "digging out hidden-for-long-time/fully-aged treasure=1 bit only patch" and quick fixing of this bug and many long lived old bugs at once.

By the way, please fill/correct following table
>            Messenger Window  Message Window
>                      from-mail-folder from-eml-file
> After patch                           
>  Actual Thunderbird                  
>    MS Windows    ○         ○        ?
>    Linux/Mac OS X  ○         ○        ?
>  Test environment                       
>    MS Windows    ○         ○        ○
>    Linux/Mac OS X  ○         ○        × <= why?
In v3 test, composer window is opened by cwc = open_compose_with_reply(msgc);
and state is checked only by let charset = cwc.e("content-frame").contentDocument.charset;
cwc is too quickly returned before all composing setup at Composer window is successfully executed in low workload *nix environment on fast CPU?
It looks for me wait time/sleep time is hard coded in relevant code in /mail/test/mozmill/shared-modules.
If such case, other elements such as gMsgCompose object's content, Subject field, body text are better checked for detection of error in test itself.
Sorry, this code and tests work as can be seen on comm-central and try-comm-central. The patch was reviewed by Aceman who is the resident expert on these tests. We won't make any more changes in this bug.

If you want to test it, get tomorrow's Daily build, or if you can't wait, go to comm-central, click on a "B" and then on the Tinderbox build directory on the bottom left.
This does not mean I can notice everything. We have many timing problems in the tests.
WADA, if you have a specific proposal what to do better, please open a new bug.
I never say change is needed. I merely want to find reason why your test v3 didn't work as expected in aceman's test and your test, despite that bad thing is never senn in your test v3. Why you had to change simple v3 to complex v4?

(In reply to :aceman from comment #60)
> We have many timing problems in the tests
> WADA, if you have a specific proposal what to do better, please open a new bug.

I see. library for testing != official code.
I was pretty angry simply because Jorg K had to rewrite simple/nothing-is-bad test v3 :-)
"charset picked up from last attachment" phenomenon was observed on pre-checked mark of "View/Text Encoding" in all of Message pane of messenger window, Tab of messenger window opened by "Open Message in New Tab", Message window opened by "Open Message in New Window" of Thunderbird 45.
Is this same problem as this bug?
(In reply to WADA from comment #62)
> Is this same problem as this bug?
I think so. My test e-mail with Korean body and Russian attachment shows as Korean with this patch but as Russian without it. There is only one spot in MIME that got it wrong with a plethora of bad effects. Philippe Martinak found that spot before he vanished in the fog ;-)
(In reply to WADA from comment #61)
> I never say change is needed. I merely want to find reason why your test v3
> didn't work as expected in aceman's test and your test, despite that bad
> thing is never senn in your test v3. Why you had to change simple v3 to
> complex v4?

The reason is your comment 57. Opening message from eml file behaves differently than opening from mail folder (at least on Linux/OS X). The charset is chosen differently. This bug fixed the more common scenario of opening from mail folder.

It will be fine if you clone this bug into a new one for the remaining scenario.
Only 1 bit change from 0 to 1 in many/huge Thunderbird modules resolved many long lived problems at once, at a glance.
Who discovered the pretty small spot is great, but who dig out hidden/aged pretty small "only 1 bit change" is also great ;-)
Anyway, thanks a lot to developers for fixing.
Almost all mysteries for me started from bug 597821 has disappered. (due to aceman's comment #64, I have to change from "All mysteries" to "Almost all mysteries" :-) )
(In reply to Jorg K (GMT+1) from comment #63)
> (In reply to WADA from comment #62)
> > Is this same problem as this bug?
> I think so. My test e-mail with Korean body and Russian attachment shows as Korean with this patch, but as Russian without it.
> There is only one spot in MIME that got it wrong with a plethora of bad effects.

If so, phenopomen looks for me;
1. "charset of last attachment" is stored as (*) "charset value of attribute or property at somewhere" upon message display.
2. In message display itself, charset of each part is correctly used for message display/attachment display.
3. View/Text Encoding shows (*) in messagepane.
4. Composer action such as Reply, Forward, Edit As New also uses the (*) at somewhere in messagepane.
   So, all of them is affected by "charset of last attachment of displayed mail".
5. Fortunately, "charset from last attachment of Replied/Forwared/Edited mail in messagepane" doesn't exist in following case,
   following case is not affected by "charset from last attachment of the Replied/Forwared/Edited mail".
   - Composer action when no message is displayed in messagepane, compose via. context menu.
   - Composer action when message of other charset is shown at messagepane, compose via. context menu.
   - Composer action when message pane is hidden, compose via. context menu.
   Needless to say, because "charset value of other mail at somewhere in messagepane" is used, different problem occurs.
6. In "Forward Inline" case, many issues were produced by "charset of last attachment",
   for example, "garbled message text", "garbled file name", "garbled attachment data".
   So, while many developers were playing with bug 715823, problem in "Forward Inline" was resolved by other bug.
7. Then, developers stopped to play with bug 715823, because interesting "garbled display" disappeared in "Forward Inline" case,
   despite that problems due to "charset from last attachment" surely existed and was known.
   "Edit As New" case : Even though it was pointed in bug 715823, it was ignored.
   "Reply" case : Even though Reply case was pointed in other bug(s), it was hidden by bad duping.
   "Forward Attachment" case : This bug is for Forward. If ordinal software development, I believe at least both cases are checked.
8. Some years later, "charset from last attachment"==UTF16-LE case happened in some users environment.
   Then, all of bad in bug 715823 was exposed and corrected by this bug, and poor tiny patch was rescued from bug 715823,
   or valuable treasure hidden in bug 715823 was dig out :-)
9. Fortunately, automated test for "open from eml" case was first proposed test.
   So, possible issue in "open from eml in Linux/OS X" was exposed, with no effort, except excess work by Jorg K & aceman :-)

It looks for me that outstanding not-so-small/not-negilible issues are 5 and 9 only.
I believe that majority of mysteries around charset in mail composition will disappear by fixing of this bug, because charset of message body will be held in messagepane almost always.
Cause/solution of 5 is now clear by this bug.
  Composer almost always looks "charset of displayed mail at somewhere in messagepane.
  If context menu passes "row in gDBView" or msgDBHdr or messageKey to composer, problem is resolved,
  because composer can not know whether requested mail is shown in messagepane or not,
  except when Message Window/Tab where "the requested message is displayed in messagepane" is guaranteed.
  If composer doesn't always blindly rely on "charset value held at somewhere in messagepane", problem is resolved.
Issue of 9 may not be problem.
  There is no law of "reply/forward has to be composed in charset of received mail".
  User can use any charset in mail composition which mailer supports.
  "Which charset should be/is better pre-selected upon mail composition by mailer" depends on environment.
  So, if OS recommended utf-8, "propose utf-8 as pre-selected charset" is normal behavior as mailer.

Anyway, all outstanding issues are better processed after patch of this bug is landed, in order to avoid affect by long lived problem of "charset is picked up from last attachment", in order to avoid useless work.
I'll look into bug for 5 again. Patch is perhaps landed today or tomorrow.
Summary: Reply/"Edit as new"/"Forward As->Attachment" charset picked up from last attachment (picking up UTF-16LE is particularly bad, but fixed in bug 961983) → Reply/"Edit as new"/"Forward As->Attachment"/Display message: charset picked up from last attachment
Checked with trunk nightly 2016-12-11 build on Win10.
I could see consistent body charset use in any of Reply xxx, Forward Attachment, Edit As New, and view message at Message Pane, Open in Mew Window/Tab, Open Saved Message, and consistent body charset disply in any View/Text Encoding in above cases.
=> Verified.

Phenomenon(issue) of 5, outstanding charset mystery, was also changed to consistent and clear and easy-to-understand.
a) When a mail of charset=XXX is already loaded in messagepane,
   copmpose via Context Menu always uses the charset=XXX of messagepane, except when Forward Inline.
b) When no mail is loaded in message pane(after restart),
    when iso-8859-2+koi8-r mail, copmpose via Context Menu used utf-8, except when Forward Inline.
    when EUC-KR+koir-8 mail, copmpose via Context Menu used koi8-r, except when Forward Inline.
In b), I believe "always use default composing charset" is better and suffient. No need of doing something for this case.
Status: RESOLVED → VERIFIED
User Story: (updated)
Checked with aurora(52) 2016-12-11 build on Win10. Quick check only, no problem was observerved => verified too in aurora.
Summary: Reply/"Edit as new"/"Forward As->Attachment"/Display message: charset picked up from last attachment → Reply/"Edit as new"/"Forward As->Attachment"/Display message/Text Encoding of View, Options etc. : charset picked up from last attachment
To aceman and Jorg K.

I found bug 715823 comment #54 by Nikolay Shopik on 2015-07-20 : Does bug 597369 sound similar?
So, I looked bug 597369.
> Bug 597369 - "Open Draft"/"Forward"/"Edit As New"/"Reply" creates message composition with incorrect character encoding
> if messages wasn't viewed before. Uses charset from previously viewed message.
This was/is mystery I could/can see for long time, before patch of this bug, after patch of this bug.
  When mail of charset=XXX is shown in messagepane, "compose via. context menu of other mail at Thread pane" uses charset=XXX.
Change history of bug 597369:
  aleth@instantbird.org 2016-02-07 06:08:30 : resolved/fixed
  aleth@instantbird.org 2016-02-07 10:00:25 : reopened
  acelists@atlas.sk     2016-02-07 13:24:00 : resolved/fixed
Bug 597369 is 45/affected, 46/fixed, 47/fixed.

Even though the Bug 597369 was fixed, why I still can observe same phenomenon?
Regressin after it by patch of this bug?
Flags: needinfo?(jorgk)
Flags: needinfo?(acelists)
Oh, CharsetOverride etc. was relevant. following is patch of that bug.
> +  if (messageArray && messageArray.length == 1) {
> +    if (GetMsgKeyFromURI(messageArray[0]) != gMessageDisplay.keyForCharsetOverride) {
> +      msgWindow.charsetOverride = false;
> +    }
Because mail data is converted to utf-8(or Utf16) for mail data display, "charset of which at where for what" is pretty difficult, and Parser's interpretation is also relevant to phenomenon.
Bug 597369 was a different issue. If you viewed message 1 and changed the encoding for that message, then answered message 2 without viewing it, the charset of message 1 was used as override charset for message 2, which is clearly wrong.

Overriding the original charset of the message allows you to use a charset that is different to the one determined for the message to view it and reply to it. But this override shouldn't carry over to any other message.

This bug here lead to the charset being incorrectly detected in the first place, one of the causes why you wanted to override the charset.

BTW, the sister-bug of bug 597369 is bug 1239658. If you want the full history, you also need to understand bug 494912 which is the true cause of bug 1239658. The symptom that as tried to be fixed in bug 494912, and whose alleged fix in bug 494912 caused a plethora of regressions, was fixed in bug 1251783, once again, what you call a "one bit" fix in MIME.

In summary, two small MIME bugs (plus bug 597369), both solved be turning 'false' to 'true' caused an tsunami of bugs. Sad but true. I fixed them all. I believe this bug here closes the door on the sad history of wrong charsets being used.

Is there anything else wrong now? If so, please give some STR in a new bug. Otherwise I'd really like to bring the discussion to a close.

BTW, Aleth and Aceman just did some landings in bug 597369, so they have nothing do with this issue.
Flags: needinfo?(jorgk)
Flags: needinfo?(acelists)
Attachment #8817615 - Flags: approval-comm-beta? → approval-comm-beta+
Attachment #8817789 - Flags: approval-comm-beta? → approval-comm-beta+
duping some bugs to this bug which were hidden by repeted duping.

FYI.
Example of loss of Reply case due to inappropriate && repeated duping.
 Bug 755169 pointed "charset picked up from last attachment" in message display and reply/forward -> duped to bug 716983
 Because bug 716983 clearly pointed all of "charset picked up from last attachment", garbled didplay etc. in "Forward Inline" case,
 I did do nothing on "duping to bug 716983", and duped some bugs to bug 716983 because new report was already duped to bug 716983.
 Bug 716983(and bug 701818) was suddenly duped to bug 715823 by someone,
 and nothing was resolved by bug 715823 because "Forward Inline" case only was fortunately or unforunately resolved by unknown bug,
 despite that at least "Edit As New" case was reported in comment and was clearly stated in Bug Summary of the bug 715823,
 despite that duped bug by someone already had some duped bugs.
Bug which explicitly referred to Reply case was bug 755169 only. Referring to Reply was in my comment in one or a few bugs only.
Other old bugs were for "garbled display in Forward Inline case" because impact of "garbled display" is pretty large in Forward Inline case.
Jorg K, you should have been aware of the bug 755169 before this bug is opened for UTF16-LE case :-)
  I was't aware of that bug 755169 was closed as dup of bug 716983 for long time.
  After it, I was aware of dups of bug 716983, so I duped some bugs to it because it already had dup bug(s).
  When I was aware of duping of bug 755169 to bug 716983, I thought it's not problem,
  because I stupidly believed that someone surely tests Reply case too while fixing problem in Forward Inline case.
  If Reply case was kept separately and kept open, I believe you could know bug 755169.
  Sorry for hiding bug 755169.

For "Edit As New" in Bug Summary of bug 715823.
I looked into bug 715823 which I was not interested in after duping, because some peoples started to play with the problem in the bug, and because I stupidly believed problems will be surely resolved because phenomenon is pretty clear.

Initial report was for "garbled display in Forward Inline" only, and it was kept for long time.
"Edit As New" in bug summary was added by me on 2014-06-01 because I saw referring to "Edit As New" in a comment, after duping of bug 716983 to bug 715823 on 2013-10-24.
Almost all activities on the bug looked done at end of 2013 and autumn of 2014. The bug seemed already almost thrown away by Assigned To: person during 2013/2014.
I wasn't aware of RESOLVED/FIXED on 2016-04-04 by you. I was aware of it when I saw this bug.
  I was surprised : "charset from last attachment" issue is pretty old problem. Why processed again in new bug?
Even if FIXED can be called your mistake because of existence of "Edit As New" in bug summary, I can't say so because almost all comments, activities in the bug, were for "garbled display in Forward Inline case" only.
"Edit As New" exists in a comment and in bug summary by me only.
And, test mails attached to the bug can be called "one of worst test mails", as you were confused by the test mails. If test mail was one in this bug or one in bug 701818, I believe confusion couldn't be produced.
And, for "Forward Inline" case, your closing as FIXED is valid because it was surely already resolved, although WORKSFORME might be better because resolved by "Unknown patch".

I believed that "main issue is charset from last part" was understood by developers because I pointed it in bug 597821 and bug 701818(and in bug 716983 too).
However, it was stupid belief. My pointing out seems lost or hidden or ignored due to bug 715823, sadly. i.e. My QA work in bug 597821, bug 701818, and bug 716983 at B.M.O was absolutely useless.

I should have added aceman and Jorg K to cc: of bug 597821 and bug 701818(and, hidden bug 755169 by us).
Whiteboard: [Only "Forward Inline" case was already resolved by unknown bug/patch]
Depends on: 961983
Whiteboard: [Only "Forward Inline" case was already resolved by unknown bug/patch] → [Only "Forward Inline" case was already resolved by unknown bug/patch] [see bug 961983 for UTF16-LE specific issue]
No longer blocks: 1323377
See Also: → 1323377
Comment on attachment 8817615 [details] [diff] [review]
1026989-charset-body.patch from bug 715823 by Philippe Martinak

Not landing in 45.7.0 due to dependency on bug 961983
Attachment #8817615 - Flags: approval-comm-esr45? → approval-comm-esr45-
Attachment #8817789 - Flags: approval-comm-esr45? → approval-comm-esr45-
You got this wrong. You need to apply this first.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: