NNTP Messages loaded and displayed with incorrect encoding

RESOLVED FIXED in Thunderbird 53.0

Status

MailNews Core
Networking: NNTP
RESOLVED FIXED
5 months ago
4 months ago

People

(Reporter: Jorg K (GMT+2), Assigned: Jorg K (GMT+2))

Tracking

(Blocks: 1 bug)

Trunk
Thunderbird 53.0
Dependency tree / graph

Thunderbird Tracking Flags

(thunderbird51 fixed, thunderbird52 fixed, thunderbird53 fixed)

Details

Attachments

(1 attachment, 2 obsolete attachments)

(Assignee)

Description

5 months ago
This is most likely a duplicate of bug 1287336, but here we have a more reproducible case that doesn't involve displaying messages in rapid succession.

STR:
Subscribe to mozilla.test on news.mozilla.org.
View the thread "2017 - first SM". The second message
"Re: 2017 - first SM" has this body:
ü.  That was pasted from Character Map.
ü.  That was pasted from Character Map.

Sometimes the ü shows, sometimes ü. I get the ü when using down-arrow from the first message and ü when using up-arrow from the third message, but it's not consistent. In the end the ü will always show. Leaving the news group and later returning brings back the garbled character.

The message is winwindows-1252 encoded, I'm not sure why the ü (0xFC) is shown as ü.

Apparently this was aggravated by bug 1323377.

Comment 1

5 months ago
The problem wit the umlauts seems to be independent from the used charset. Happens often on news groups but is not reliably reproducible. A commentator mentioned that is does not happen for E-Mails. Another commentator assumed a race condition.

I'm currently using a SM-Trunk built with back out of the fix in bug 1323377 and have not seen the problem since.

Updated

5 months ago
Component: Feed Reader → Networking: NNTP
(Assignee)

Comment 2

5 months ago
Created attachment 8823100 [details] [diff] [review]
1328131-charset-feeds.patch

Backing out this tiny bit of bug 1323377 fixes the problem.

Alta, can you enlighten me on what this forwarding business is about:
https://dxr.mozilla.org/comm-central/rev/2e479125e5d9df6778fc45ba58e4923575c7bb49/mailnews/news/src/nsNntpMockChannel.cpp#218

I assume the forwarded call now does something instead of returning NS_ERROR_NOT_IMPLEMENTED.
Assignee: nobody → jorgk
Flags: needinfo?(alta88)

Comment 3

5 months ago
(In reply to Jorg K (GMT+1) from comment #2)

> Created attachment 8823100 [details] [diff] [review]
> 1328131-charset-feeds.patch

WFM

Comment 4

5 months ago
i assume it's some macro to do some hack/override, but otherwise don't know any specifics of charset in nntp, other than nntp servers (the poor ones still not converted to utf8) should have the right encoding set in the news account, and reply/forward may need to ensure that's respected in a way different from email. unlike email, i think it would be legitimate to at some very near point only accept utf8 encoding from nntp servers and be done with this mess once and for all.

also, don't conflate news/nntp/newsgroups with feeds (atom/rss).
Flags: needinfo?(alta88)
(Assignee)

Comment 5

5 months ago
(In reply to Hartmut Figge from comment #3)
> WFM
Sure, but you can't have that since it will break things big time in mail replies. We need to work out why setting the charset on the channel has such bad consequences for nntp.
(Assignee)

Comment 6

5 months ago
OK, I debugged it further.
nsMsgProtocol::GetContentCharset() is called for the news message before the charset was set with nsMsgProtocol::SetContentCharset(). In other words, you get the charset for the previously viewed message.

The first message is in US-ASCII/ISO-8859-1, the second one with the ü in windows-1252. So we display the windows-1252 data as ISO-8859-1. Hmm, that shouldn't be a problem since they are mostly the same, especially for the ü. But nsMsgProtocol::GetContentCharset() doesn't always return something, sometimes is returns nothing (empty string) and then the message displays correctly.

All a little mysterious. Debugging continues.
(Assignee)

Comment 7

5 months ago
Here's another result:
The charset is fetched from the channel here:
https://dxr.mozilla.org/mozilla-central/rev/31ffcb82ced81bb75faa800d2b7e883a3761a03b/dom/html/nsHTMLDocument.cpp#710
That happens for mailbox and IMAP messages, too, but there always an empty string is returned since the charset is set by MIME before it's fetched.

For NNTP that call sometimes returns a stale value. Maybe the nsMsgProtocol object isn't properly initialised. Debugging continues.
(Assignee)

Comment 8

5 months ago
Patch coming, you'll see that this is a regression from bug 1323377. Sorry 'bout this :-(
Blocks: 1323377
(Assignee)

Comment 9

5 months ago
Created attachment 8823115 [details] [diff] [review]
1328131-reset-charset-for-nntp-reuse.patch (v1).

Sorry Magnus, I broke this, so here's the one line fix.
Attachment #8823100 - Attachment is obsolete: true
Attachment #8823115 - Flags: review?(mkmelin+mozilla)
(Assignee)

Updated

5 months ago
Summary: Messages sometimes loaded and displayed with incorrect encoding → NNTP Messages loaded and displayed with incorrect encoding

Comment 10

5 months ago
Comment on attachment 8823115 [details] [diff] [review]
1328131-reset-charset-for-nntp-reuse.patch (v1).

Review of attachment 8823115 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM, r=mkmelin - appears the nsMsgProtocol::InitFromURI calls from nntp are looking for trouble though...

::: mailnews/base/util/nsMsgProtocol.cpp
@@ +78,5 @@
>      mailUrl->GetStatusFeedback(getter_AddRefs(statusFeedback));
>      mProgressEventSink = do_QueryInterface(statusFeedback);
>    }
> +
> +  // NNTP reuses the object and calls this again on the same object.

While true, that's not knowledge this class should need to have, so remove this and keep the comment on a general level.
Attachment #8823115 - Flags: review?(mkmelin+mozilla) → review+
(Assignee)

Comment 11

5 months ago
Created attachment 8823143 [details] [diff] [review]
1328131-reset-charset-for-nntp-reuse.patch (v1b).

Fixed comment as requested.
Attachment #8823115 - Attachment is obsolete: true
Attachment #8823143 - Flags: review+
(Assignee)

Updated

5 months ago
Keywords: checkin-needed

Comment 12

5 months ago
Seamonkey Aurora branch is affected. Fix must be applied there too.
(Assignee)

Comment 13

5 months ago
Comment on attachment 8823143 [details] [diff] [review]
1328131-reset-charset-for-nntp-reuse.patch (v1b).

Uplift to Aurora and Beta since this is where the regressing bug 1323377 landed (or will land for beta).

Coming up in the next 24 hours.
Attachment #8823143 - Flags: approval-comm-beta+
Attachment #8823143 - Flags: approval-comm-aurora+
(Assignee)

Comment 14

5 months ago
https://hg.mozilla.org/comm-central/rev/281d3a046b986dd07f8f2f1fcde8112a05cd9cbd
Status: NEW → RESOLVED
Last Resolved: 5 months ago
status-thunderbird51: --- → affected
status-thunderbird52: --- → affected
status-thunderbird53: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 53.0
(Assignee)

Updated

5 months ago
Keywords: checkin-needed
(Assignee)

Comment 15

5 months ago
Aurora (TB 52):
https://hg.mozilla.org/releases/comm-aurora/rev/338dac1c7ce578e18d673c4f33afe7e89f203d04
status-thunderbird52: affected → fixed
(Assignee)

Comment 16

5 months ago
Beta (TB 51):
https://hg.mozilla.org/releases/comm-beta/rev/94c29c2b3a6ff6ac8dffae2d5fa915673f8c476a

Fix uplifted together with regressing bug and enhanced test. Should all be good now:
https://hg.mozilla.org/releases/comm-beta/rev/bdc870f9fe257618f91319559c7eb3d5b77b1b8c (bug 1323377)
https://hg.mozilla.org/releases/comm-beta/rev/26265ec715cf1ffa58436256667ebc108c90eebd (bug 1325967)
status-thunderbird51: affected → fixed
Blocks: 1313774
You need to log in before you can comment on or make changes to this bug.