Closed Bug 1328131 Opened 3 years ago Closed 3 years ago
NNTP Messages loaded and displayed with incorrect encoding
This is most likely a duplicate of bug 1287336, but here we have a more reproducible case that doesn't involve displaying messages in rapid succession. STR: Subscribe to mozilla.test on news.mozilla.org. View the thread "2017 - first SM". The second message "Re: 2017 - first SM" has this body: ü. That was pasted from Character Map. Ã¼. That was pasted from Character Map. Sometimes the ü shows, sometimes Ã¼. I get the Ã¼ when using down-arrow from the first message and ü when using up-arrow from the third message, but it's not consistent. In the end the ü will always show. Leaving the news group and later returning brings back the garbled character. The message is winwindows-1252 encoded, I'm not sure why the ü (0xFC) is shown as Ã¼. Apparently this was aggravated by bug 1323377.
The problem wit the umlauts seems to be independent from the used charset. Happens often on news groups but is not reliably reproducible. A commentator mentioned that is does not happen for E-Mails. Another commentator assumed a race condition. I'm currently using a SM-Trunk built with back out of the fix in bug 1323377 and have not seen the problem since.
Backing out this tiny bit of bug 1323377 fixes the problem. Alta, can you enlighten me on what this forwarding business is about: https://dxr.mozilla.org/comm-central/rev/2e479125e5d9df6778fc45ba58e4923575c7bb49/mailnews/news/src/nsNntpMockChannel.cpp#218 I assume the forwarded call now does something instead of returning NS_ERROR_NOT_IMPLEMENTED.
Assignee: nobody → jorgk
(In reply to Jorg K (GMT+1) from comment #2) > Created attachment 8823100 [details] [diff] [review] > 1328131-charset-feeds.patch WFM
i assume it's some macro to do some hack/override, but otherwise don't know any specifics of charset in nntp, other than nntp servers (the poor ones still not converted to utf8) should have the right encoding set in the news account, and reply/forward may need to ensure that's respected in a way different from email. unlike email, i think it would be legitimate to at some very near point only accept utf8 encoding from nntp servers and be done with this mess once and for all. also, don't conflate news/nntp/newsgroups with feeds (atom/rss).
(In reply to Hartmut Figge from comment #3) > WFM Sure, but you can't have that since it will break things big time in mail replies. We need to work out why setting the charset on the channel has such bad consequences for nntp.
OK, I debugged it further. nsMsgProtocol::GetContentCharset() is called for the news message before the charset was set with nsMsgProtocol::SetContentCharset(). In other words, you get the charset for the previously viewed message. The first message is in US-ASCII/ISO-8859-1, the second one with the ü in windows-1252. So we display the windows-1252 data as ISO-8859-1. Hmm, that shouldn't be a problem since they are mostly the same, especially for the ü. But nsMsgProtocol::GetContentCharset() doesn't always return something, sometimes is returns nothing (empty string) and then the message displays correctly. All a little mysterious. Debugging continues.
Here's another result: The charset is fetched from the channel here: https://dxr.mozilla.org/mozilla-central/rev/31ffcb82ced81bb75faa800d2b7e883a3761a03b/dom/html/nsHTMLDocument.cpp#710 That happens for mailbox and IMAP messages, too, but there always an empty string is returned since the charset is set by MIME before it's fetched. For NNTP that call sometimes returns a stale value. Maybe the nsMsgProtocol object isn't properly initialised. Debugging continues.
Patch coming, you'll see that this is a regression from bug 1323377. Sorry 'bout this :-(
Sorry Magnus, I broke this, so here's the one line fix.
Summary: Messages sometimes loaded and displayed with incorrect encoding → NNTP Messages loaded and displayed with incorrect encoding
Comment on attachment 8823115 [details] [diff] [review] 1328131-reset-charset-for-nntp-reuse.patch (v1). Review of attachment 8823115 [details] [diff] [review]: ----------------------------------------------------------------- LGTM, r=mkmelin - appears the nsMsgProtocol::InitFromURI calls from nntp are looking for trouble though... ::: mailnews/base/util/nsMsgProtocol.cpp @@ +78,5 @@ > mailUrl->GetStatusFeedback(getter_AddRefs(statusFeedback)); > mProgressEventSink = do_QueryInterface(statusFeedback); > } > + > + // NNTP reuses the object and calls this again on the same object. While true, that's not knowledge this class should need to have, so remove this and keep the comment on a general level.
Attachment #8823115 - Flags: review?(mkmelin+mozilla) → review+
Fixed comment as requested.
Seamonkey Aurora branch is affected. Fix must be applied there too.
Comment on attachment 8823143 [details] [diff] [review] 1328131-reset-charset-for-nntp-reuse.patch (v1b). Uplift to Aurora and Beta since this is where the regressing bug 1323377 landed (or will land for beta). Coming up in the next 24 hours.
Beta (TB 51): https://hg.mozilla.org/releases/comm-beta/rev/94c29c2b3a6ff6ac8dffae2d5fa915673f8c476a Fix uplifted together with regressing bug and enhanced test. Should all be good now: https://hg.mozilla.org/releases/comm-beta/rev/bdc870f9fe257618f91319559c7eb3d5b77b1b8c (bug 1323377) https://hg.mozilla.org/releases/comm-beta/rev/26265ec715cf1ffa58436256667ebc108c90eebd (bug 1325967)
You need to log in before you can comment on or make changes to this bug.