Closed Bug 1328131 Opened 3 years ago Closed 3 years ago

NNTP Messages loaded and displayed with incorrect encoding


(MailNews Core :: Networking: NNTP, defect)

Not set


(thunderbird51 fixed, thunderbird52 fixed, thunderbird53 fixed)

Thunderbird 53.0
Tracking Status
thunderbird51 --- fixed
thunderbird52 --- fixed
thunderbird53 --- fixed


(Reporter: jorgk, Assigned: jorgk)




(1 file, 2 obsolete files)

This is most likely a duplicate of bug 1287336, but here we have a more reproducible case that doesn't involve displaying messages in rapid succession.

Subscribe to mozilla.test on
View the thread "2017 - first SM". The second message
"Re: 2017 - first SM" has this body:
ü.  That was pasted from Character Map.
ü.  That was pasted from Character Map.

Sometimes the ü shows, sometimes ü. I get the ü when using down-arrow from the first message and ü when using up-arrow from the third message, but it's not consistent. In the end the ü will always show. Leaving the news group and later returning brings back the garbled character.

The message is winwindows-1252 encoded, I'm not sure why the ü (0xFC) is shown as ü.

Apparently this was aggravated by bug 1323377.
The problem wit the umlauts seems to be independent from the used charset. Happens often on news groups but is not reliably reproducible. A commentator mentioned that is does not happen for E-Mails. Another commentator assumed a race condition.

I'm currently using a SM-Trunk built with back out of the fix in bug 1323377 and have not seen the problem since.
Component: Feed Reader → Networking: NNTP
Attached patch 1328131-charset-feeds.patch (obsolete) — Splinter Review
Backing out this tiny bit of bug 1323377 fixes the problem.

Alta, can you enlighten me on what this forwarding business is about:

I assume the forwarded call now does something instead of returning NS_ERROR_NOT_IMPLEMENTED.
Assignee: nobody → jorgk
Flags: needinfo?(alta88)
(In reply to Jorg K (GMT+1) from comment #2)

> Created attachment 8823100 [details] [diff] [review]
> 1328131-charset-feeds.patch

i assume it's some macro to do some hack/override, but otherwise don't know any specifics of charset in nntp, other than nntp servers (the poor ones still not converted to utf8) should have the right encoding set in the news account, and reply/forward may need to ensure that's respected in a way different from email. unlike email, i think it would be legitimate to at some very near point only accept utf8 encoding from nntp servers and be done with this mess once and for all.

also, don't conflate news/nntp/newsgroups with feeds (atom/rss).
Flags: needinfo?(alta88)
(In reply to Hartmut Figge from comment #3)
Sure, but you can't have that since it will break things big time in mail replies. We need to work out why setting the charset on the channel has such bad consequences for nntp.
OK, I debugged it further.
nsMsgProtocol::GetContentCharset() is called for the news message before the charset was set with nsMsgProtocol::SetContentCharset(). In other words, you get the charset for the previously viewed message.

The first message is in US-ASCII/ISO-8859-1, the second one with the ü in windows-1252. So we display the windows-1252 data as ISO-8859-1. Hmm, that shouldn't be a problem since they are mostly the same, especially for the ü. But nsMsgProtocol::GetContentCharset() doesn't always return something, sometimes is returns nothing (empty string) and then the message displays correctly.

All a little mysterious. Debugging continues.
Here's another result:
The charset is fetched from the channel here:
That happens for mailbox and IMAP messages, too, but there always an empty string is returned since the charset is set by MIME before it's fetched.

For NNTP that call sometimes returns a stale value. Maybe the nsMsgProtocol object isn't properly initialised. Debugging continues.
Patch coming, you'll see that this is a regression from bug 1323377. Sorry 'bout this :-(
Blocks: 1323377
Sorry Magnus, I broke this, so here's the one line fix.
Attachment #8823100 - Attachment is obsolete: true
Attachment #8823115 - Flags: review?(mkmelin+mozilla)
Summary: Messages sometimes loaded and displayed with incorrect encoding → NNTP Messages loaded and displayed with incorrect encoding
Comment on attachment 8823115 [details] [diff] [review]
1328131-reset-charset-for-nntp-reuse.patch (v1).

Review of attachment 8823115 [details] [diff] [review]:

LGTM, r=mkmelin - appears the nsMsgProtocol::InitFromURI calls from nntp are looking for trouble though...

::: mailnews/base/util/nsMsgProtocol.cpp
@@ +78,5 @@
>      mailUrl->GetStatusFeedback(getter_AddRefs(statusFeedback));
>      mProgressEventSink = do_QueryInterface(statusFeedback);
>    }
> +
> +  // NNTP reuses the object and calls this again on the same object.

While true, that's not knowledge this class should need to have, so remove this and keep the comment on a general level.
Attachment #8823115 - Flags: review?(mkmelin+mozilla) → review+
Fixed comment as requested.
Attachment #8823115 - Attachment is obsolete: true
Attachment #8823143 - Flags: review+
Keywords: checkin-needed
Seamonkey Aurora branch is affected. Fix must be applied there too.
Comment on attachment 8823143 [details] [diff] [review]
1328131-reset-charset-for-nntp-reuse.patch (v1b).

Uplift to Aurora and Beta since this is where the regressing bug 1323377 landed (or will land for beta).

Coming up in the next 24 hours.
Attachment #8823143 - Flags: approval-comm-beta+
Attachment #8823143 - Flags: approval-comm-aurora+
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 53.0
Keywords: checkin-needed
Blocks: TB52found
You need to log in before you can comment on or make changes to this bug.