Closed Bug 1323377 Opened 7 years ago Closed 7 years ago

Composer(except ForwardInline/ReplyWithTemplate) uses charset of currently displayed message in messagepane instead of charset of requested message when requested via. context menu at thread pane

Categories

(MailNews Core :: Composition, defect)

defect
Not set
normal

Tracking

(thunderbird51 fixed, thunderbird52 fixed, thunderbird53 fixed)

RESOLVED FIXED
Thunderbird 53.0
Tracking Status
thunderbird51 --- fixed
thunderbird52 --- fixed
thunderbird53 --- fixed

People

(Reporter: World, Assigned: jorgk-bmo)

References

Details

Attachments

(4 files, 13 obsolete files)

5.22 KB, text/plain
Details
2.47 KB, text/plain
Details
7.05 KB, text/plain
Details
30.72 KB, patch
jorgk-bmo
: review+
Details | Diff | Splinter Review
+++ This bug was initially created as a clone of Bug #1026989 +++

Phenomenon itself is pretty old phenomenon, and some bugs for this issue are already opened.
But "picking up charset" is drastically sorted out and improved by bug 1026989, so phenomenon is now consistent, clear, easy-to-understand.
And, "charset is picked up from last attachment" is still observed in special case.
So open new bug.

Checked with 53.0a1 (2016-12-11) (32-bit) on Win10.

[STR-A - when message is shown in messagepane]
1. View message(charset=ZZZ) => message is displayed in messagepane
2. At different message(plain/text, charset=XXX) in thread pane, right click, choose any compose request except ForwardInline via Context menu.
3. Options/Text Encoding = ZZZ instead of XXX.
Once message is displsayed, messagepane is visible or hidden is irrelevant. "charset at messagepane" is used.

[STR-B - when about:blank or start page is shown at messegepane just after restart of Tb]
1. Restart Tb. At messagepane, blank or startup page is shown.
2. At a message(plain/text, charset=XXX) in thread pane, right click, choose any compose request except ForwardInline via Context menu.
3. used charset(charset in titlebar), Options/Text Encoding
   iso-8859-2 mail, KOI8-R attachment : utf-8, Options/Text Encoding = Unicode
   EUC-KR mail, KOI8-R attachment : KOI8-R, Options/Text Encoding = Unicode
         ("charset is picked up from last attachment" is still observed in this case)
"startup page" is HTML, so this may be phenomenon when HTML mail.
Following structure is shown by DOM Inspector.
  hbox
    browser
      document
        html
        HTML
          HEAD
          BODY ...

Magic/trick in ForwardInline(and ReplyWithTemplate) :
ReplyWithTemplate(reply by message filter) utilizes ForwardInline, and "displayed message at messagepane" doesn't exist. So "message to be used/its charset" can not be blindly obtained from messagepane.
And ForwardInline and ReplyWithTemplate shares code.
> https://hg.mozilla.org/comm-central/annotate/63ee59a8e786/mailnews/compose/src/nsMsgCompose.cpp#l1829
> // If we are forwarding inline or replying with template, mime did already
> // setup the compose fields therefore we should stop now.
> if (type == nsIMsgCompType::ForwardInline ||
>     type == nsIMsgCompType::ReplyWithTemplate)
Sorry, wrong statement about "EUC-KR mail, KOI8-R attachment" case.
I was confused "Subject: ... ABC (KOI8-R)" in titlebar which is shown by my modified Jorg K's test mail 
 with "Subject: ... ABC" + "(KOI8-R)" in titlebar for used charset.
In this case, used charset=utf-8,OPtions/Text Encoding=Unicode is also shown.

When nothing is shown at messagepane, "always use utf-8" may be a best practice as mailer.
All is variant of test mail of bug 1026989 for ease of observation.
User Story: (updated)
Whiteboard: [Only "Forward Inline" case was already resolved by unknown bug/patch] [see bug 961983 for UTF16-LE specific issue]
No longer blocks: 715823
No longer depends on: 961983, 1026989
FYI.
At thread pane, when right clicked and context menu is shown, focus/selected-state is temporally moved to the row in msgDBView, and after context menu is closed, it's immediately moved back to originally selected row at thread pane.
So, when composer is invoked, focus/selected-state is already moved back to originally selected mail at thread pane.
This bug may be a reslut of following.
1. When context menu is shown, focus/selected-state was permanently moved to the row where context memnu was shown in old version.
   Composer code may rely on this behavior.
2. After it, focus/selected-state was moved back to originally selected row from a Tb version(or a Mozilla MailNews version), because it's behavior what majority of users expect and want.
   Composer code is not modified according to the change yet.
Component: Message Compose Window → Composition
Product: Thunderbird → MailNews Core
See Also: → 1026989
Summary: Composer(except ForwardInline/ReplyWithTemplate) uses charset of currently displayed message in messagepane instead of charset of requested mail when requested via context menu → Composer(except ForwardInline/ReplyWithTemplate) uses charset of currently displayed message in messagepane instead of charset of requested message when requested via. context menu at thread pane
Just after following code block for ForwardInline/ReplyWithTemplate,
> // If we are forwarding inline or replying with template, mime did already
> // setup the compose fields therefore we should stop now.
> if (type == nsIMsgCompType::ForwardInline ||
>     type == nsIMsgCompType::ReplyWithTemplate)
> {
>    ...
>  Finally,
>    return NS_OK;
> }
charset is obtained by following code.
> https://hg.mozilla.org/comm-central/annotate/63ee59a8e786/mailnews/compose/src/nsMsgCompose.cpp#l1897
> GetTopmostMsgWindowCharacterSet(mailCharset, &mCharsetOverride);
> if (!mailCharset.IsEmpty())
>   charset = mailCharset;
>   charsetOverride = mCharsetOverride;
> }
In GetTopmostMsgWindowCharacterSet and called sub functions, multiple DOM windows are searched and topmost window is used.
This is apparently for "multiple messenger windows" case. "Multiple messages in multiple Tabs(=message window) in multiple messenger windows" may also be cared.
In these kind of function, I can't think "right clicked msdDBHdr at a row of gDBView in a messenger window" can be easily selected.
I think "currently selected message in gDBView of messenger window which is chosen by GetTopmostMsgWindowCharacterSet" is usually used.

Because requested message(msgDBHdr) by right click is surely used by composer, I think "requested msgDBHdr" is available in composer code. Why it can not be utilized in charset/charsetOverride determination?
Request via. too many paths arrives at charset/charsetOverride determination step?
Aha!
After multiple Tab support, following can occur.
 multiple 3pane Tabs + multiple Tabs for message windows(open message in New Tab) + otherr Taba such as Trouble Shooting Information
In this case, window.gDBView is gDBView of last selected 3pane Tab or last selected Tab for messenge window.
If other Tab such as Trouble Shooting Information is last selected Tab, window.gDBView is not changed.
When Reply etc. of Main menu or Toolbar button, Reply etc. can be requested even when other Tab is currently active, so "Tab where requested message is selected==currently active Tab" is not always guaranteed.
So multiple DOM windows are currently searched for "requested message in a 3pane Tab or Tab for message window" upon charset/charsetOverride determination.

In such case, "found message by DOM window search" == "last selected/shown message in last selected Tab" == requested message.
However, when request via. Context menu, unforunately,
 "found message by DOM window search" == "last selected/shown message in last selected Tab" != requested message.
Thanks for reporting this.

I can reproduce STR-A in TB 45.x and Daily TB 53. I'm surprised I haven't run into this problem myself:
View message 1 in the preview pane, then reply to message 2 via the right-click menu without viewing it.

Picks up the charset from message 1. I thought fixed something similar in bug 597369. Right now I'm confused what was fixed in bug 597369, looks like it had to do with the character set override, so when you manually selected a different encoding. But I can still carry the override over to a different message, so I don't understand what has happened. I'll look into it.
I didn't check "charset change via. View/Text Encoding" case, because "any reporter of bug in B.M.O what I read in the past" didn't report such issue.

[STR-ZZZ - extended STR when message is shown in messagepane]

1. View message #1(charset=AAA) => message#1(charset=AAA) is displayed in messagepane
2. At message#1 in thread pane, compose via. context menu
   -> charset=AAA,Option/Text Encoding=string for AAA (charset in messagepane is used)
3. At message#2(charset=BBB) in thread pane, compose via. context menu
   -> charset=AAA,Option/Text Encoding=string for AAA (charset in messagepane is used)

4. At message pane for message#1(charset=AAA),
   change chaset to charset=CCC by View/Text Encoding 
5. At message#1 in thread pane, compose via. context menu
   -> charset=CCC,Option/Text Encoding=string for CCC (charset in messagepane is used)
6. At message#2(charset=BBB) in thread pane, compose via. context menu
   -> charset=CCC,Option/Text Encoding=string for CCC (charset in messagepane is used)

I think that clear/consistent/never-confusing test result couldn't be obtained unless bug 1026989 was fixed and crisp/good test mail was provided.
As you actually experienced in a bug which I refferred many times, bad test mails pretty easily produces confusions and sometimes easily leads to wrong direction.
Unless bug 1026989 was fixed, once "test mail with different charset in last attachment” was mixed in mail example, it could easily lead to wrong direction.
If original/actual mail data, due to lo--ng and badly-formed HTML and lo-ng base64 attachment data, it's pretty hard(usually nearly impossible) to understand mail structure and parameters of header in each part.
I was suprized when I saw your test mail of bug 1026989. Why developers (can) create such crisp test mail?
I had belieaved that purpose of creating crisp test mail is to surely explain about essense of problem to developer :-)
Oh, Bug 597369 was one of what I said "some bugs for this issue are already opened". Setting dependency for ease of tracking, search.
Blocks: 597369
Oh, Bug 701694, which was duped to Bug 597369 on 2016-01-05, was also one of I knew. Who duped :-)
From perspective of QA people who don't know detail of issue, this bug is regression between Tb 47 and Tb 53 because bug 597369 was fixed once.
But, even though external phenomenon/str are pretty similar, conditions/phenomena looks different between bug 597369 and this bug (which is phenomenon after change by bug 597369 was landed and bug 1026989 was fixed).
Jorg K, is this bug regression between Tb 47 and Tb 53? Or oustanding issue after change of bug 597369?
Sorry but I checked Tb 45 and Tb 53 only.
I looked at bug 597369 (and bug 701694) again, there was also bug 1239658. The fixes provided stopped messages looking garbled. You can easily check in TB 38 that you can produce various garbled results which cannot be produced in TB 45 any more to where those bugs got uplifted.

Once messages stopped looking garbled, people stopped complaining and no one ever noticed the bug you're reporting here since no one paid attention to the title bar.

So what were seeing is yet another outstanding issue in the field of getting the encoding wrong. As I said before, we don't see any garbled messages anywhere, merely the chosen character set is wrong.

I'm confirming again: Reply, "Edit as new", "Forward as attachment" all pick the charset from the previously viewed message ("Forward inline" does not), but after you save, the original erroneous charset is sometimes upgraded to something useful, mostly UTF-8.

I'll look into it.
I'm looking again at nsMsgCompose.cpp, in particular
https://dxr.mozilla.org/comm-central/rev/e69ced0974d588e438569c22f71a0fcbf84c2b8e/mailnews/compose/src/nsMsgCompose.cpp#1962

There the topmost window's charset is retrieved together with an override flag. Why? Well, the user has the chance to change the charset used by the system in case it's not right.

The override-charset should then be used instead for the charset of the message, that's what 'mCharsetOverride' is for: Is the charset overridden.

The fault in the code is to use the charset of the topmost/last displayed message although no override is set. Great.

The whole charset processing in this function is a terrible piecemeal with special processing for "x-windows-949" and nsIMsgCompType::ReplyWithTemplate.

That will be fun to clean up.
This is harder than I thought.

The patch sorts out the bad processing in nsMsgCompose::CreateMessage() which mainly uses the charset of the displayed message, which is somewhat natural, since people will want to reply to the message they're viewing.

Turns out that "plan B", |msgHdr->GetCharset(getter_Copies(charset));| frequently doesn't return *any* charset.

So if not taken from the displayed message and not coming from the database, where should I get the charset from?

Ideally from MIME of course, but as the code notes, that only seems to work for "forward inline" and some other cases.

This needs more investigation.
Assignee: nobody → jorgk
Status: NEW → ASSIGNED
> Turns out that "plan B", |msgHdr->GetCharset(getter_Copies(charset));| frequently doesn't return *any* charset.

Message display utilizes MIME code. If not, I believe no one can display mail :-)
No easy way to call Mime code.
Or too many malformed/not-well-formed mails? multipart/mixed[ multipart/related{ multipart/alternative(text/plain+text/html) + img + many non-used pdf parts} + complex structure parts ] etc.
Because data held in id=messagepane, which was obtained by fighting with not-well-formed mails, can not be used in this bug's case.
If so, there is only two wayshim.
a) Fight with not-well-formed mails in composer code by himself with fully utilizing mime code.
b) Load the message to hidden/temporary message window. Fighting with the not-well-formed mails is done by "mail display code" :-)
Correction.
Because data held in id=messagepane, which was obtained by fighting with not-well-formed mails, can not be used in this bug's case, there is only two ways.
How could FilterInline bypass the issue before writing following comment?
> // If we are forwarding inline or replying with template, mime did already
> // setup the compose fields therefore we should stop now.
> if (type == nsIMsgCompType::ForwardInline ||
>     type == nsIMsgCompType::ReplyWithTemplate)
> {
>    ...
>  Finally,
>    return NS_OK;
> }
(In reply to WADA:World Anti-bad-Duping Agency from comment #19)
> How could ForwardInline bypass the issue before writing following comment?
I don't know yet, I'll have to read the MIME code.
d(In reply to Jorg K (GMT+1) from comment #11)
> I'm confirming again: Reply, "Edit as new", "Forward as attachment" all pick
> the charset from the previously viewed message ("Forward inline" does not),
> but after you save, the original erroneous charset is sometimes upgraded to
> something useful, mostly UTF-8.

I could see the "saved as utf-8" when composing charset=EUC-KR due to this bug.
But when composing charset=iso-8859-2 due to this bug, draft was saved as iso-8859-2 in my test.
It looks that Draft Save looks different field or Draft code has different philosophy.

Revised STR(by quick check, so maybe inaccurate in some cases).

[STR-ZZZ-V2 - extended STR V2 when message is shown in messagepane]

1. View message #1(charset=AAA) => message#1(charset=AAA) is displayed in messagepane
2-a. At message#1 in thread pane, compose via. context menu
     -> charset=AAA,Option/Text Encoding=string for AAA (charset in messagepane is used)
2-b. Save As Draft -> saved as utf-8(when AAA=EUC-KR), or saved as AAA(depends on AAA)
3-a. At message#2(charset=BBB) in thread pane, compose via. context menu
     -> charset=AAA,Option/Text Encoding=string for AAA (charset in messagepane is used)
3-b. Save As Draft -> saved as utf-8(when AAA=EUC-KR), or saved as AAA(depends on AAA)

4.   At message pane for message#1(charset=AAA),
     change charset to charset=CCC by View/Text Encoding 
5-a.  At message#1 in thread pane, compose via. context menu
     -> charset=CCC,Option/Text Encoding=string for CCC (charset in messagepane is used)
5-b. Save As Draft -> saved as utf-8(when CCC=EUC-KR), or saved as CCC(depends on CCC)
6-a. At message#2(charset=BBB) in thread pane, compose via. context menu
     -> charset=CCC,Option/Text Encoding=string for CCC (charset in messagepane is used)
6-b. Save As Draft -> saved as utf-8(when CCC=EUC-KR), or saved as CCC(depends on CCC)
For "EUC-KR mail with KOI8-R attachment was saved as utf-8".
  cause was filename="=?iso-8859-2?Q?v=B9b=2Etxt?=" of attanhnent.
  due to this, charset is changed to utf-8 from EUC-KR.
This was test mail for bug 701818 to force garbled didplay in attachment file name in problem duplication test.
This is good example of bad test mail which produces confusion and leads to wrong direction :-)

Sorry, I'll update test mail.
For about:blank/startup page case.
 about:blank  : structure = HTML -> HEAD/BODY, but nothing is shown, so charset is not available
 startup page : structure = html, HTML -> HEAD/BODY
                "HTML" part is written in UTF-8, but "html" part has nothing -> no charset is available in this browser/document
When about:blank is shown in id=message pane :
 Edit As New  : composed in charset of mail
 Reply to All : composed in utf-8(default charset, or falls into utf-8)
 Forward As   : both Inline and Attachment is grayed out
                (command controller is utilized. nothing is found in id=messagepane, so context menu is disabled)
Each feature has his own philosophy.
startup page case.
"html tag before HTML tag" is perhaps to avoid Forward As Inline/Attachment of "startup page".
Quick Summary of problem.
> (A) Disable compose request in Main menu/Toolbar button
>     when Active Tab != mail Tab(3pane Tab/message Tab only).
>     Addon Manager, Trouble shooting Information Global Search result etc. are excluded.
>     => "requested mail is in currently active Tab" is guaranteed
>     => There is no need to search DOM windows for topmost window
>     
> (B) After (A), any case falls into one of next in Active Mail Tab of Messenmger Window or Message Window.
> 
>       requested message  messagepane  requested message is              charset is
>          at gDBView                      held in id=messagepane
> 
>  (1)  selected           visible      yes                               not altered
>  (2)  selected           visible      yes                               altered
>  (3)  not selected       visible      no / other message is held        N/A
>  (4)  not selected       visible      no / nothing is held              N/A
>                                            (about:blank,startup page)
>     
>  (5)  selected           hidden       yes / viewed before hiding        not altered
>  (6)  selected           hidden       yes / viewed before hiding        altered
>  (7)  selected           hidden       no  / other message is held       N/A
>  (8)  selected           hidden       no  / nothing is held             N/A
>  (9)  not selected       hidden       no  / other message is held       N/A
> (10)  not selected       hidden       no  / nothing is held             N/A
Thanks for your input, but I don't have enough time to study all your comments carefully.

I know what the problem is (analysis in comment #14, also see attachment 8818625 [details] [diff] [review]) and I'm thinking about how to best fix it. Once I have a solution, I will do a try build on Windows that you can test, OK?
Thanks.
My comments are preparation work for fix verification test, except following.

> -  bool charsetOverride = false;
> +  mCharsetOverride = false;
>    GetTopmostMsgWindowCharacterSet(mailCharset, &mCharsetOverride);
> -  if (!mailCharset.IsEmpty())
I think GetTopmostMsgWindowCharacterSet is not needed, if Menu(main menu/app menu) and Toolbar button ignores request if Active Tab!=Mail Tab(3pane tab or messege tab), although disabling menu is additionally needed.
Digging around MIME showed that the charset it also set up correctly for "Edit as new" and "Edit Draft", not only "Forward inline".

This patch makes use of this. So in summary:
Forward inline: Was always working.
Edit as new/Edit Draft: Now working.
Reply and Forward as attachment: to be solved.

To be continued.
Attachment #8818625 - Attachment is obsolete: true
(In reply to Jorg K (GMT+1) from comment #30)
> WIP: 1323377-charset-compose.patch (v2).

> +        // Note: Some messages don't seem to return a charset here,
> +        // in this case the message will be sent with the default
> +        // in the comp fields.
>          rv = msgHdr->GetCharset(getter_Copies(charset));
It looks that data of message is already available in compFields.
Does msgHdr->GetCharset() return msgDBHdr.Charset? Or analyze badly-formed message data?
When "other message of other charset" is loaded in id=messagepane when compose is requested, from where composing charset is obtained?

msgDBHdr.Charset was not set at least when test mail of multipar/mixed. It was set only when non multipart mail.
If Content-Type:multipart/mixed;boundary="...";charset=iso-8859-1 is intentionally set, msgDBHdr.Charset=iso-8859-1 was set.
I could understand a little about reason why charset determination is not so easy when relevant message is not loaded in id=messagepane.

IIRC, "state of mail has attachment" is saved in msgDBHdr by mail viewing in case of that multipart/mixed has only one part, or in case of that excess/non-displayable part is contained in multipart/related.
When mail was viewed first time, why "message display code" or mime doesn't store it in msgDBHdr.Charset for multipart message?
The valuable "original charset of message body" was obtained by fighting with malformed/badly-formed message. Casual mail application programer can generated any badly-formed mail.
Definition of msgDBHdr.Charset=charset of main Content-Type: header? Is it impossible to change definition to "message body charset for Thunderbird"?
I feel service like GetCharacterSetFromMsgDBHdr is kind service for composer by mime code or mail display code.
  If multipart, don't set msgDBHdr.Charset upon first header parsing,
  because "meaning of charset in this content-type" is not defined.
  If multipart and msgDBHdr.Charset is saved upon mail viewing or if non-multipart, return msgDBHdr.Charset,
  else analyze mail data as done upon mail display and store it in msgDBHdr.Charset.

By the way, I now can predict charset used by composer in many cases, so fix verification test is easier than before.

Reason why mystery for me occurs when messagepane is hidden.
(a) Folder is opened with messagepane is shown -> messagepane is hidden
    One of next is loaded in id=messagepane, and message pane is hidden.
      about:blank, startup page, message(charset is not altered), message(charset is altered)
    Because nothing is loaded to id=messepane once message pane is hidden, charset of id=messagepane won't be changed.
(b) messagepane is already hidden -> folder open
    One of next is loaded in id=messagepane, and won't be changed
      nothing, about:blank, startup page
Because messagepane is invisible, I can't know what is shown in id=messagepane.

I was taught "Text Encoding looks same item" by Jorg K, and after fix of bug 1026989, I'm never confused by "charset of last part". So I can predict "charset used by composer" in many cases, regardless of messagepane is visible or hidden, regardless of requested message is shown or not.
View/Text Encoding => charset=XXX is checked, or View/Text Encoding is grayed out(about:blank or startup page)
- charset=XXX is checked => charset=XXX is used(except by Forward, Forward As Inline)
- Text Encoding is grayed out => behavior depends on feature, so I can't predict all cases, but following was observed.
  - "Edit As New" seems to look into message data and to use charset of the message
  - When about:blank or "startup page", "Forward As" was grayed out in some cases, 
    but "Forward As" was enabled in some cases(depends on Tb build?)    
    When "Forward As" is enabled and usable, "Forward As Attachment" looks to use default composing charset.
FYI.

Determination of (a) "requested mail == mail shown in id=messagepane" or (b) "requested mail != mail shown in id=messagepane".

In composer window(Reply, ForwardAsAttachment case), URL of replied/forwarded mail is held in gmsgCompose..originalMsgURI.
> window.gMsgCompose.originalMsgURI = mailbox-message://nobody@Local%20Folders/CharsetLastPart#2
In id=messagepane(browser tag) ina Tab/Window, URL of currently shown message is held in followeing property.
> document.getElementById("messagepane").contentDocument.URL =
> mailbox:///C:/Users/Wada/AppData/Roaming/Thunderbird/Profiles/q8yh6xh8.Release/Mail/Local%20Folders/CharsetLastPart?number=2
This is same in Virtual Folder(Search Folder, Unified Folder), because URL of message is based on real mail folder.

If real mail folder, "same folder" is guaranteed, so comparing "#2" part with "?number=2" part can be used to know "mail in id=messagepane is requested mail or not".
But if virtual folder, different real mail folder can be used. So, comparison of mailbox-message: URL and mailbox: URL is neded.

If same, any data in id=messagepane can be used.
If different, and if multipart mail, analysis of replied/forwarded mail by mime is needed to know body charaset of replied/forwarded mail, because msgDBHdr.Charset is not set by header parser(or incorrect if charset is specied in Content-Type:multipart/... header).
FYI.
A simple(but many changes are needed) bypass of "requested mail!=mail in id=messagepane of 3pane tab" issue.
Define hidden/faked id=messagepane in composer window, and if "requested mail!=mail in id=messagepane of 3pane tab", load repled/forwarded mail into the hidden/faked id=messagepane of composer window. No new function/function change in non-composer component is needed.
Another simple solution : Why "same charset" is mandatory? Why "always use default composing charset" is wrong? If the default charset doesn't have code point, Thunderbird uses utf-8 automatically/silently.
> Digging around MIME showed that the charset it also set up correctly for "Edit as new" and "Edit Draft", not only "Forward inline".

I thought "Edit As New" called complex mime function by himself when failed to obtain charset.

> Reply and Forward as attachment: to be solved.
> To be continued.

Same magic/trick as ForwardInline?
After studying the compose pipeline, I found the charset in the quote processing for replies in MIME. I still have to think about how to lift it from there back into the composition and the comp fields.
Attachment #8819073 - Attachment is obsolete: true
(In reply to Jorg K (GMT+1) from comment #35)
> WIP: 1323377-charset-compose.patch (v3).

I've found magic/trick :-)
> + // LoadDraftOrTemplate() is run in nsMsgComposeService::OpenComposeWindow() for
> + // five types, the ones covered here and ForwardInline/ReplyWithTemplate
> + // which are covered above.
Blocks: 1324264
No longer blocks: 1324264
See Also: → 1324264
OK, after many hours of work looking at the compose pipeline I found a solution here.

What you need to know is that forward/edit draft/edit as new processing is very different to reply processing. The former runs the original message through MIME early on and retrieves the charset. The latter also runs the message through MIME, but so far never paid any attention to the charset retrieved from MIME. So this needs to change.

MIME has the charset and calls nsMsgProtocol::SetContentCharset() which was so far unimplemented. So I implemented it. The problem is how to get the charset from the nsMsgProtocol lifted back into the compose. That's what this is for:

+      // Get the charset from the protocol where MIME left it.
+      nsCOMPtr<nsIMsgQuote> msgQuote;
+      compose->GetMsgQuote(getter_AddRefs(msgQuote));
+
+      if (msgQuote) {
+        nsCOMPtr<nsIChannel> quoteChannel;
+        msgQuote->GetQuoteChannel(getter_AddRefs(quoteChannel));
+        if (quoteChannel) {
+          // XXX Hack, hack, hack: this doesn't compile:
+          // nsCOMPtr<nsMsgProtocol> protocol = do_QueryInterface(quoteChannel);
+          nsMsgProtocol* protocol = static_cast<nsMsgProtocol*>(quoteChannel.get());
+          if (protocol) {
+            protocol->GetContentCharset(charset);
+            printf("=== Retrieved charset from protocol |%s|\n", charset.get());
+            if (!charset.IsEmpty())
+              compFields->SetCharacterSet(charset.get());
+          }
+        }
+      }

Basically, the compose object stores a pointer to the nsIMsgQuote object. From it, I get the "quote channel". And since nsMsgProtocol inherits from nsIChannel, I have the nsMsgProtocol to retrieve the charset from.

It all works, but it's a little ugly. So can you guys help me to make this nicer.
Attachment #8819620 - Attachment is obsolete: true
Attachment #8819648 - Flags: feedback?(mkmelin+mozilla)
Attachment #8819648 - Flags: feedback?(acelists)
Full try run on Windows so Wada can get an installer/executable to try:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=2a3577201c1bb11b9b15efd9b874203357dc1f39

Once completed, builds and logs will be available at:
https://archive.mozilla.org/pub/thunderbird/try-builds/mozilla@jorgk.com-2a3577201c1bb11b9b15efd9b874203357dc1f39/

I expect some failures in some charset processing, so let's see ;-)
Comment on attachment 8819648 [details] [diff] [review]
WIP: 1323377-charset-compose.patch (v4) - working.

Review of attachment 8819648 [details] [diff] [review]:
-----------------------------------------------------------------

Noticed a bug. Let's see whether a test will find it, too ;-)

::: mailnews/compose/src/nsMsgCompose.cpp
@@ +2583,5 @@
> +          if (protocol) {
> +            protocol->GetContentCharset(charset);
> +            printf("=== Retrieved charset from protocol |%s|\n", charset.get());
> +            if (!charset.IsEmpty())
> +              compFields->SetCharacterSet(charset.get());

One thing missing here: I we already set an override charset or the user wants to reply in his default charset, we shouldn't overwrite this here.
So Wada, this when only work when *not* using an override and *not* answering in the default charset. Please keep this in mind when testing, or wait for the next version.
OK, just a small tweak not to overwrite the override/default.

Try here:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=3b40ab09706a50a8d707e955ccd68deb51ac7f16

Once completed, builds and logs will be available at:
https://archive.mozilla.org/pub/thunderbird/try-builds/mozilla@jorgk.com-3b40ab09706a50a8d707e955ccd68deb51ac7f16/

Once completed, builds and logs will be available at:
https://archive.mozilla.org/pub/thunderbird/try-builds/mozilla@jorgk.com-3b40ab09706a50a8d707e955ccd68deb51ac7f16/

Wada, better test this.
Attachment #8819648 - Attachment is obsolete: true
Attachment #8819648 - Flags: feedback?(mkmelin+mozilla)
Attachment #8819648 - Flags: feedback?(acelists)
Attachment #8819652 - Flags: feedback?(mkmelin+mozilla)
Attachment #8819652 - Flags: feedback?(acelists)
I see. I'll try to find hole in your patch as many as possible :-)

By the way, I prefer "hidden messagepane in composer window" method.
"Load mail in composer". That's all. All is done by mail display+Mime code, so you don't need to read code of many components. :-)
I merely summarized about "composed mail charset==original mail charset" cases.
I merely pointed that "reply mail charset=EUC-KR for requested mail charset=iso-8859-2 when mail of EUC-KR is shown" is mystery for user.
Why "composed mail charset==original mail charset" is always mandatory? Why "always propose default charset" is not allowed as mailer? Default charset is his own choice, and he can change composing charset && his default charset any time as he likes.
The try from comment #38 has a test failure and the one from comment #41 didn't compile :-(
So not ready for testing.

(In reply to WADA:World Anti-bad-Duping Agency from comment #42)
> Why "composed mail charset==original mail charset" is always mandatory?
That's the default. The user can choose a fixed default charset that will then be used if possible:
Tools > Options, Display, Advanced button, last item:
"When possible use the default character encoding in replies".

If you forward as attachment, it's useful to use the same charset in the body of the new message and in the forwarded one. When the recipient opens the message, body and attached message will then be displayed correctly since the inline preview uses the charset of the body. In other words, there is only one charset to display body and inline attachments.
This should be OK for a review now. All test should pass now:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=8faeeca9a6080de565cb3c93ee9801d884bf45b3

Once completed, builds and logs will be available at:
https://archive.mozilla.org/pub/thunderbird/try-builds/mozilla@jorgk.com-8faeeca9a6080de565cb3c93ee9801d884bf45b3/

I can't fix the "forward as attachment" case since the original message(s) does/do not pass through mime at all. All other compose types that have an original message pass this through MIME, some earlier, some later.

I will add some comments to the patch.
Attachment #8819652 - Attachment is obsolete: true
Attachment #8819652 - Flags: feedback?(mkmelin+mozilla)
Attachment #8819652 - Flags: feedback?(acelists)
Attachment #8819686 - Flags: review?(mkmelin+mozilla)
Comment on attachment 8819686 [details] [diff] [review]
1323377-charset-compose.patch (v6).

Review of attachment 8819686 [details] [diff] [review]:
-----------------------------------------------------------------

I left a few debug statements in the code.

::: mailnews/compose/src/nsMsgCompose.cpp
@@ +1613,5 @@
> +{
> +  // No matter what, we should block x-windows-949 (our internal name)
> +  // from being used for outgoing emails (bug 234958).
> +  if (aCharset.Equals("x-windows-949",
> +        nsCaseInsensitiveCStringComparator()))

Oops, funny indentation here form copying the original.

@@ +2057,5 @@
> +          return rv;
> +        printf("=== charset from message |%s|\n", charset.get());
> +        if (charset.IsEmpty()) {
> +          // As a last resort we fall back to the charset from the window
> +          // hoping the messages was viewed before the attaching it.

Oops, grammar error here: the message was ...

@@ -2027,5 @@
> -      // nothing else. That is passed in through the compFields.
> -      if (type == nsIMsgCompType::ReplyWithTemplate) {
> -         rv = compFields->GetCharacterSet(getter_Copies(charset));
> -         NS_ENSURE_SUCCESS(rv,rv);
> -      }

This dead code was removed. ReplyWithTemplate is a code path a little above that exists the function early, so this code never ran.

@@ -2033,5 @@
> -      // No matter what, we should block x-windows-949 (our internal name)
> -      // from being used for outgoing emails (bug 234958)
> -      if (charset.Equals("x-windows-949",
> -            nsCaseInsensitiveCStringComparator()))
> -        charset = "EUC-KR";

Removed here and moved into InitEditor() because this really also needs to be run the Forward case, but wasn't since Forward is a code path a little above that exists the function early.

@@ +2339,5 @@
>                                                           bool quoteHeaders,
>                                                           bool headersOnly,
>                                                           nsIMsgIdentity *identity,
> +                                                         nsIMsgQuote* msgQuote,
> +                                                         bool charsetFixed,

The QuotingOutputStreamListener copies some data from the compose object so it's available in QuotingOutputStreamListener::OnStopRequest()

@@ +2588,5 @@
> +          mQuote->GetQuoteChannel(getter_AddRefs(quoteChannel));
> +          if (quoteChannel) {
> +            // XXX Hack, hack, hack: this doesn't compile:
> +            // nsCOMPtr<nsMsgProtocol> protocol = do_QueryInterface(quoteChannel);
> +            nsMsgProtocol* protocol = static_cast<nsMsgProtocol*>(quoteChannel.get());

This I'd like to improve, but sadly the commented-out line above doesn't compile. Help!!

@@ +3429,5 @@
> +                                    mWhatHolder != 1,
> +                                    !bAutoQuote || !mHtmlToQuote.IsEmpty(),
> +                                    m_identity,
> +                                    mQuote,
> +                                    mCharsetOverride || mAnswerDefaultCharset,

If I have a "fixed" charset, either due to an override or due to answering with the default charset, then I don't want to apply the charset determined by MIME later.

::: mailnews/mime/emitters/nsMimeBaseEmitter.cpp
@@ -564,5 @@
>  nsMimeBaseEmitter::UpdateCharacterSet(const char *aCharset)
>  {
> -  if ( (aCharset) && (PL_strcasecmp(aCharset, "US-ASCII")) &&
> -        (PL_strcasecmp(aCharset, "ISO-8859-1")) &&
> -        (PL_strcasecmp(aCharset, "UTF-8")) )

I don't see why US-ASCII, ISO-8859-1 and UTF-8 were excluded here. If I reply to a message in one of those charsets, I also need to pass this information to the nsMsgProtocol object where I can later retrieve it.
Document with an analysis of the compose pipeline.

It shows that forward/edit draft/edit as new run the original message through MIME *before* calling nsMsgCompose::CreateMessage() which runs before nsMsgCompose::InitEditor(). For replies, the original message is run though a "quoting process" which also passes the message through MIME, so the charset of the original message can be retrieved late.
Comment on attachment 8819686 [details] [diff] [review]
1323377-charset-compose.patch (v6).

Review of attachment 8819686 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/compose/src/nsMsgCompose.cpp
@@ +2587,5 @@
> +          nsCOMPtr<nsIChannel> quoteChannel;
> +          mQuote->GetQuoteChannel(getter_AddRefs(quoteChannel));
> +          if (quoteChannel) {
> +            // XXX Hack, hack, hack: this doesn't compile:
> +            // nsCOMPtr<nsMsgProtocol> protocol = do_QueryInterface(quoteChannel);

should be RefPtr. nsCOMPtr is only for interface pointers and nsMsgProtocol is a concrete class. 

But, I didn't manage to get that to compile either. Too weak c++ fuu :(
OK, let's ask Kent. Kent, how do I cast form an interface class to a concrete class? Here is the code:

          nsCOMPtr<nsIChannel> quoteChannel;
          mQuote->GetQuoteChannel(getter_AddRefs(quoteChannel));
          if (quoteChannel) {
            // XXX Hack, hack, hack: this doesn't compile:
            // nsCOMPtr<nsMsgProtocol> protocol = do_QueryInterface(quoteChannel);
            nsMsgProtocol* protocol = static_cast<nsMsgProtocol*>(quoteChannel.get());
            if (protocol) {
              protocol->GetContentCharset(charset);

With the cast this actually works. mQuote is a nsCOMPtr<nsIMsgQuote> and that class offers:
readonly attribute nsIChannel quoteChannel;

MIME calls nsMsgProtocol::SetContentCharset() and the nsMsgProtocol class inherits from ...
class NS_MSG_BASE nsMsgProtocol : public nsIStreamListener
                                , public nsIChannel
                                , public nsITransportEventSink

So from the nsIChannel, how to I "up-cast" to the nsMsgProtocol?
Flags: needinfo?(rkent)
BTW, the call in MIME to set the charset is here:
https://dxr.mozilla.org/comm-central/rev/b020cffa188fac078bff74c9e19c98b044ea4173/mailnews/mime/emitters/nsMimeBaseEmitter.cpp#598
  mChannel->SetContentCharset(nsDependentCString(aCharset));
mChannel is this: nsCOMPtr<nsIChannel> mChannel;

This content charset is a bit of a mystery to me, there are also implementations in IMAP and NNTP but nothing for mailboxes. I'd just like to get to the charset that the MIME code supplies to SetContentCharset() somehow in my compose code and the "channel" appeared to be the link here.
"Kent, how do I cast form an interface class to a concrete class?" Well you can do is as you did, but it breaks any XPCOM uses outside of a particular implementation. Which means, please do not do that. You can look at my patches in Bug 1285679 to see my attempts to undo earlier such casts in the message database, which make it impossible to do JS overrides of the message database. If I was the reviewer, I would not allow new casts like this.

In this particular case, I don't understand why this is a problem. GetContentCharset is an attribute of he channel, and quoteChannel is a channel, so can't you just do quoteChannel->GetContentCharset(charset) ?
Flags: needinfo?(rkent)
(In reply to Jorg K (GMT+1) from comment #44)
> Once completed, builds and logs will be available at:
> https://archive.mozilla.org/pub/thunderbird/try-builds/mozilla@jorgk.com-
> 8faeeca9a6080de565cb3c93ee9801d884bf45b3/
> 
> I can't fix the "forward as attachment" case since the original message(s)
> does/do not pass through mime at all. All other compose types that have an
> original message pass this through MIME, some earlier, some later.

Quick check result("When possible use the default character encoding in replies" was surely Unchecked).

(Case-1) Selected message is shown in messagepane(EUC-KR).
         Request compose at context menu for mail=iso-8859-2 at Thread pane.
  Reply               : iso-8859-2 is used, regardless of charset change in messagepane (resolved) 
  Edit As New         : iso-8859-2 is used, regardless of charset change in messagepane (resolved)
  ForwardAsAttachment : EUC-KR is used as before. (if charset in messagepane is changed, it's used)
  Edit Draft Message  : iso-8859-2 is used (resolved) (same on double click)
(Case-2) select message, mail in messagepane=EUC-KR, hide message pane,
         request compose at context menu for mail=iso-8859-2 at Thread pane.
  All is same as Case-1, because charset in id=messagepane is not used after patch except in ForwardAsAttachment.
  So, "mystery when messagepane is hidden after message display" won't occur after patch except in ForwardAsAttachment.
  although View/Text Encoding shows EUC-KR(bug 1324264).
(Case-3) Message window, message Tab.
  This is same as "a message is selected, message is shown in messagepane, request compose on selected message".
  So, "charset change at messagepane" is relevant.
  "new charset in messagepane" is used by Reply, Edit As New, ForwardAsAttachment, Edit Draft Message. (not broken by patch)
(Case-4) "When possible use the default character encoding in replies" = checked
  Reply : default outgoing charset(GB18030) is used. (not broken by patch)
(Case-5) Tab/Window switch while patch is running at composer window.
  I can't produce such condition.
  EventListener, efficient code which runs at best timing, forced delay in composer, ... , are needed to interfere patch.

In SeaMonkey, used charset is always shown even when default charset. But when I did compose(new msg), I was aware of "no charset" in window title, so I viewed preference setting. Nothing was shown in outgoing message charset of Text Encoding setting. After set it, charset was shown as expected. SeaMonkey's setting was broken by your patch... I remembered that select box was broken in a release or a build in the past.
This patch is functionally the same as v6, but it removes the C++ hack.
Kent, thank you so much for your input, I didn't see the forest for the trees.

Wada, thanks for testing. If I read your summary in comment #63 correctly, the patch fixes what it intends to fix, sadly can't fix "forward as attachment", and doesn't break anything that's already working, so forward/edit as new/etc. Right? So you're happy with it, right?

As for the Window title: Yes, we only show the charset if it's not the default:
https://dxr.mozilla.org/comm-central/rev/c46a741a72139a42eb2001033bead4a3db459c1a/mail/components/compose/content/MsgComposeCommands.js#2880
SM is different:
https://dxr.mozilla.org/comm-central/rev/c46a741a72139a42eb2001033bead4a3db459c1a/suite/mailnews/compose/MsgComposeCommands.js#1380
but the do suppress the display at times, too.
Attachment #8819686 - Attachment is obsolete: true
Attachment #8819686 - Flags: review?(mkmelin+mozilla)
Attachment #8820185 - Flags: review?(mkmelin+mozilla)
(In reply to Jorg K (GMT+1) from comment #64)
> Wada, If I read your summary in comment #63 correctly,
> the patch fixes what it intends to fix, sadly can't fix "forward as
> attachment", and doesn't break anything that's already working, so
> forward/edit as new/etc. Right? So you're happy with it, right?

Sadly, I couldn't find hole of your patch :-)
If you completely throw away ForwardAsAttachement case, I reject your patch, because ForwardAsAttachment case is contained in bug summary.
But if you'll take some time to think about how to deal with it, I hope immediate patch landing.
For us, composer users, source code/source comment is absolutely irrelevant.
What we composer users want is "many mysteries disappear at once, at a glance".

Yes of course I'm happy. Thanks for fixing bug. Please go ahead.
To fix "Forward as Attachment", we'd have to rewrite the compose pipeline, to pass the attached message through MIME to get its charset.

The patch here is really a little hack to steal the correct charset from MIME during processing, and where MIME doesn't process anything, there's nothing to steal.

The patch still improves the "Forward as attachment" case a little, since for non-multipart messages the correct charset will be retrieved. Try it, forward a single-part message as attachment.
Refreshed to latest trunk.
Attachment #8820185 - Attachment is obsolete: true
Attachment #8820185 - Flags: review?(mkmelin+mozilla)
Attachment #8821730 - Flags: review?(mkmelin+mozilla)
After some deliberation I decided that for "forward as attachment" is makes more sense to use the charset in which new messages are composed.
Attachment #8821730 - Attachment is obsolete: true
Attachment #8821730 - Flags: review?(mkmelin+mozilla)
Attachment #8821781 - Flags: review?(mkmelin+mozilla)
For improvement in "Tab switch while your patch is being invoked at composer window" case.

(1) After command controller use, "Composer menu/Toolbar button in messagener window" is disabled correctly.
(1-1) "Composer menu/Toolbar button in messagener window" is already disabled when active tab != 3pane Tab/message tagb.
(1-2) When no mail is selected at thread pane of active 3pane tab, "Composer menu/Toolbar button in messagener window".
(1-3) Exception is : If a mail is shown in messagepane of 3pane tab, and the messagepane is hidden after mail display,
      when the 3pane tab is active tab in messenger window,
      "Composer menu/Toolbar button in messagener window" is not disabled.
(2) In message window and active message tab, status is same as following in 3pane tab.
    mail is selected at thread pane, and the mail is shown in messagepane, and messagepane is not hiddden.

i.e.
"Compose request is requested from UI element in currently active 3pane tab/message tab of currently active/topmost messenger window(or active/topmost message window) in multiple messenger windows/message windows environment" is guaranteed in current Thunderbird.

So, there is no need of searching messagepane in TopMostMessengerWindow or TopMostMessegeWindow using window enumerator for charsetOverride by Composer, if requester of compose passes "window or messagepane" to Composer.
Exception is : Compose request via. mailto:/MAPI while Thunderbird is not started yet.
               In this case, "window or messagepane of active tab/window" doesn't exist.

If "Tab(window) switch while your patch is being invoked at composer window" is problem, such (big) change is perhaps needed, because current "searching TopmostMessenger/MessageWindow using window enumerator" is useless in such case.
Further, addon can request compose from active/top messenger window #2 for a message in messagepane of non-active 3pane tab of non-active/non-topmost messenger window #1 :-)

.


 
.
Refreshed to latest trunk after bug 1325745 and bug 1301640 backout.
Attachment #8821781 - Attachment is obsolete: true
Attachment #8821781 - Flags: review?(mkmelin+mozilla)
Attachment #8821976 - Flags: review?(mkmelin+mozilla)
Comment on attachment 8821976 [details] [diff] [review]
1323377-charset-compose.patch (v8) - refreshed.

Review of attachment 8821976 [details] [diff] [review]:
-----------------------------------------------------------------

Looks ok to me. r=mkmelin

Would be great to have a test for this as it's quite easy to regress it

::: mail/test/mozmill/composition/test-reply-multipart-charset.js
@@ +83,5 @@
>    // Check that the charset is taken from the message body.
>    subtest_replyEditAsNewForward_charset(1, "./multipart-charset.eml", "EUC-KR");
>    subtest_replyEditAsNewForward_charset(2, "./multipart-charset.eml", "EUC-KR");
>    subtest_replyEditAsNewForward_charset(3, "./multipart-charset.eml", "EUC-KR");
> +  // For "forward as attachment" we use the default charset.

please add "(which is UTF-8") to the comment

::: mailnews/base/util/nsMsgProtocol.cpp
@@ +595,5 @@
>  
>  NS_IMETHODIMP nsMsgProtocol::SetContentCharset(const nsACString &aContentCharset)
>  {
> +  m_Charset.Assign(aContentCharset);
> +  printf("=== nsMsgProtocol::SetContentCharset %s\n", m_Charset.get());

want to remove this of course

::: mailnews/base/util/nsMsgProtocol.h
@@ +148,5 @@
>    nsCOMPtr<nsIProgressEventSink> mProgressEventSink;
>    nsCOMPtr<nsIInterfaceRequestor> mCallbacks;
>    nsCOMPtr<nsISupports>       mOwner;
>    nsCString                   m_ContentType;
> +  nsCString                   m_Charset;

gah, such mix of naming. I'd go with mCharset though

::: mailnews/compose/src/nsMsgCompose.cpp
@@ +1609,5 @@
>  }
>  
> +static nsresult
> +fixCharset(nsCString &aCharset)
> +{

nit: maybe put all of this on the same line

@@ +1630,5 @@
> +  // outgoing encoding for e-mail. MIME can't handle those messages
> +  // encoded in ASCII-incompatible encodings.
> +  if (NS_FAILED(rv) ||
> +      StringBeginsWith(aCharset, NS_LITERAL_CSTRING("UTF-16"))) {
> +    NS_WARNING("Suppressing UTF-16");

as we handle it gracefully a warning seems uncalled for

@@ +1907,5 @@
>    NS_ENSURE_SUCCESS(rv, rv);
>  
> +  /**
> +   * Forward inline and reply with template processing.
> +   */

Not a function documentation, so /** comment style */ is not appropriate here. Please just use //

@@ +1971,5 @@
>    }
>  
> +  /**
> +   * All other processing.
> +   */

Not a function documentation, so /** comment style */ is not appropriate here

@@ +2007,5 @@
> +    // which are covered above.
> +    char* compFieldsCharset;
> +    m_compFields->GetCharacterSet(&compFieldsCharset);
> +    charset.Assign(nsDependentCString(compFieldsCharset));
> +    printf("=== Edit as new/Edit draft |%s|\n", compFieldsCharset);

... and remove this

@@ +2568,5 @@
> +          nsCOMPtr<nsIChannel> quoteChannel;
> +          mQuote->GetQuoteChannel(getter_AddRefs(quoteChannel));
> +          if (quoteChannel) {
> +            quoteChannel->GetContentCharset(charset);
> +            printf("=== Retrieved charset from channel |%s|\n", charset.get());

remove debug

::: mailnews/imap/src/nsImapProtocol.cpp
@@ +9619,5 @@
>  
>  NS_IMETHODIMP nsImapMockChannel::SetContentCharset(const nsACString &aContentCharset)
>  {
> +  m_Charset.Assign(aContentCharset);
> +  printf("=== nsImapMockChannel::SetContentCharset %s\n", m_Charset.get());

remove debug
Attachment #8821976 - Flags: review?(mkmelin+mozilla) → review+
Thanks for the generous review. Of course I've addressed all the issues and nits. I went a little further, and s/m_ContentType/mContentType/ which also hit an NNTP file.

As for
> +  /**
> +   * Forward inline and reply with template processing.
> +   */
I wanted to mark those sections clearly in the file, as Kent requested one in bug 629738 comment #73: "This comment applies to the entire section below, so it needs to be formatted differently to make it clear that it is a major section comment, ...".
In the end, this got accepted and landed:
https://dxr.mozilla.org/comm-central/rev/b5cbf4a5f294c84cee905a0e22c32ea34cc66359/mailnews/imap/src/nsImapProtocol.cpp#9193
/**
 * Part processing (rest of this function).
 */
Very hard to serve two masters with different taste ;-(

I'll file another bug for adding some tests. They should not only cover answering without viewing the message, but also using a charset override and forcing the reply to the default charset.
Attachment #8821976 - Attachment is obsolete: true
Some more tweaks:
1) White-space
2) Don't use same variable as in and out here:
   ccm->GetCharsetAlias(aCharset.get(), aCharset);
3) Removed unnecessary include file.
Attachment #8822054 - Attachment is obsolete: true
Attachment #8822059 - Flags: review+
Sorry for yet another loop here. I've removed some redundant code and added a comment instead ;-)
Attachment #8822059 - Attachment is obsolete: true
Attachment #8822065 - Flags: review+
Blocks: 1325967
https://hg.mozilla.org/comm-central/rev/5d7cb058c29636d12e3c85f6092fb5b6066bf6d0

Landed with a two-line difference to the attached patch: I killed an unused variables.

I filed bug 1325967 for improving the test.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 53.0
Comment on attachment 8822065 [details] [diff] [review]
1323377-charset-compose.patch (v8d).

Let's take this to Aurora and Beta after a little baking. It's a little risky, but we should fix this "mystery" (as Wada calls it) and make the solution available to our users.

Note that the landed version differs from the attached patch by two lines.
Attachment #8822065 - Flags: approval-comm-beta+
Attachment #8822065 - Flags: approval-comm-aurora+
(In reply to Jorg K (GMT+1) from comment #82)
> As for
> > +  /**
> > +   * Forward inline and reply with template processing.
> > +   */
> I wanted to mark those sections clearly in the file, as Kent requested one
> in bug 629738 comment #73: "This comment applies to the entire section
> below, so it needs to be formatted differently to make it clear that it is a
> major section comment, ...".

Well you can mark sections, but there are much better ways than that particular commenting style. Typically something like

// *** THE BELOW DOES X....

or 

// ::: THE BELOW DOES
Flags: in-testsuite+
There is a problem with displaying umlauts in Trunk which has now reached Aurora due to a comment in a German news group. Not reliable reproducible, so I cannot open a new bug, but it happens often. Here is an example from the thread '2017 - first SM' in mozilla.test.

First the umlaut ü was displayed perfectly.
http://www.triffids.de/pub/screenshot/um170101-b.png

Then the ancestor of the posting was select, then the same posting again.
http://www.triffids.de/pub/screenshot/um170101-c.png
After backing out '1323377-charset-compose.patch (v8d)', omitting the first part which would fail but concerns only a test, the problem seems to have gone away.
To back out the test you'd need to back out bug 1325967, too, which modifies the test.

I'm using TB Daily writing in German and I haven't seen wrong Umlaute anywhere.

The change here mainly affected replying. Your images showing sent replies? So can you attach the original message and the reply, since you're saying that it happens often.

I really need a semi-reproducible case to fix this. Obviously messing up people's e-mail is not a good idea.
The problem seems not to occur in E-Mails, only in news groups. And they are not consistent. You get different results trying the same. A guess from a commentator is a race condition. Obviously a way is needed to reproduce this reliable. So far no one has succeeded.

How to see it by yourself? Subscribe to de.test. There are lots of umlauts. Then step through the messages using the key n. Should not take long to see the problem. And then you will despair to reproduce. ;)

There is a thread in de.comm.software.mozilla.nightly-builds. SM-Trunk: Umlaute.
(In reply to Hartmut Figge from comment #92)
> How to see it by yourself? Subscribe to de.test. There are lots of umlauts.
> Then step through the messages using the key n. Should not take long to see
> the problem. And then you will despair to reproduce. ;)
Sorry, you don't mention "reply" anywhere here. Looks like you're merely viewing lost of messages in quick succession by holding down "n". Then you ran into bug 1287336 (also reported as bug 1113511).

The patch here changes overall system behaviour a little since MIME now sets the charset of the channel. So that could shift the balance a bit.
(In reply to Jorg K (GMT+1) from comment #93)

> Sorry, you don't mention "reply" anywhere here. Looks like you're merely
> viewing lost of messages in quick succession by holding down "n". Then you
> ran into bug 1287336 (also reported as bug 1113511).

No, I'm not holding down 'n'. My news server would not be pleased by that. I see the problem reading news in a normal way. Very annoying. Not only to me but also to others. On Win and on Linux.

> The patch here changes overall system behaviour a little since MIME now sets
> the charset of the channel. So that could shift the balance a bit.

Well, that 'bit' has a remarkable effect. *g*
Sorry, I'm not so familiar with news feeds. I have a subscription to http://www.tagesschau.de/xml/rss2 and don't see the problem there. But these are all messages from the same source and they are all UTF-8 encoded.

If you can kindly give me the exact feed URL, I can try your feed as well.
Sorry, I didn't know that you are not used to the usenet. My own main news server is free and mostly reliable. You can get an account using http://www.albasani.net/index.html.en and then subscribe to de.test.

Or you could use news.mozilla.org which requires no registration. There you could subscribe to mozilla.test. Probably looking at the postings in the thread '017 - first SM' will give you wrongly displayed umlauts.
Let's take the discussion to bug 1328131.
Depends on: 1328131
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: