Closed Bug 1511950 Opened 2 years ago Closed 2 years ago

cp932 encoded messages display as garbage again


(MailNews Core :: Internationalization, defect)

Not set


(thunderbird_esr52 unaffected, thunderbird_esr6064+ fixed, thunderbird64 wontfix, thunderbird65 fixed)

Thunderbird 65.0
Tracking Status
thunderbird_esr52 --- unaffected
thunderbird_esr60 64+ fixed
thunderbird64 --- wontfix
thunderbird65 --- fixed


(Reporter: ku, Assigned: jorgk-bmo)



(Keywords: regression)


(4 files, 3 obsolete files)

Attached file cp932.eml (obsolete) —
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36

Steps to reproduce:

Read a Japanese text mail which is created on older iOS 10.3.2 or such.

This is similar problem that is previously reported:

Actual results:

Display is messed up like:



Expected results:

This is good display:


Attachment #9029394 - Attachment is obsolete: true
Regression window:

Suspect: 	f558febc1ead	Jorg K — Bug 1363281 - Port bug 1261841 |Replace uconv with encoding_rs| to mailnews. rs=mkmelin, f=hsivonen
Component: Untriaged → Internationalization
Product: Thunderbird → MailNews Core
Blocks: 1363281
@Jorg K,
The problem seems to be old regression since 56. 
Can you please look into this?
Flags: needinfo?(jorgk)
Attachment #9029395 - Attachment mime type: message/rfc822 → text/plain
Flags: needinfo?(jorgk)
Henri, what's the story with cp932? Sounds like the "code page 932", so the Windows native encoding in Japan, a variation on Shift JIS. When I edit the message from
Content-Type: text/plain; charset=cp932
Content-Type: text/plain; charset=shift_jis
the message displays fine. Is cp932 no longer supported?
Flags: needinfo?(hsivonen)
cp932 is not an Encoding Standard-supported label. The encoding itself is supported.
Flags: needinfo?(hsivonen) → needinfo?(annevk)
annevk, considering that this is a regression, how did the cp932 label not get included in the standard based on Gecko previously knowing about the label?
"View" > "Text Encoding" > "●Japanese(Shift_JIS)" temporarily fixes. However, the text will be corrupted again when re-viewing the message.

In this case, I think that "Auto-detect" is not working correctly.
Hi I heard about a workaround using "" file upon my googling about this problem. e.g.

Is it still functional?

(However the topics about are 10 years old (2009 or such). It is not recent topic.)

It says we should have this file:
- C:/Program Files/Mozilla Thunderbird/res/


# Aliases for Shift_JIS

Append the following line should solve the problem.


I'm not sure about this. just fyi.
(In reply to KU from comment #8)
> Hi I heard about a workaround using "" file upon my
> googling about this problem. e.g.
> Is it still functional?

No, is in the tree and, indeed, contains

# Aliases for Shift_JIS

...but apparently is no longer used for message decoding purpose if it's not taking effect here.

That line came from bug 542823.
Indeed, it's here:

More recently, we also added cp936=gbk here: reply to Henri Sivonen (:hsivonen) from comment #9)

(In reply to Henri Sivonen (:hsivonen) from comment #9)
> ...but apparently is no longer used for message decoding purpose if it's not
> taking effect here.
Right. We still have some additional alias code in nsCharsetAlias.cpp
but the calls to nsCharsetAlias::GetPreferred() got removed here :-(
in bug 1363281.

I'll see how to put this back.
Assignee: nobody → jorgk
I don't recall. Per Firefox did not always support it at least and Internet Explorer didn't either. Unfortunately the Chrome/Safari/ICU situation at the time wasn't documented.

I think the rationale used to whether to include labels was mostly whether the implementations with majority market share had them, since if they didn't, the expectation was that sites using the label were doing so erroneously and relied on a default kicking in.
Flags: needinfo?(annevk)
This should do it. The message displays fine now. Try:

We should add a test for cp932 and perhaps cp936=gbk.
Attachment #9029557 - Flags: review?(mkmelin+mozilla)
Comment on attachment 9029557 [details] [diff] [review]
1511950-GetCharsetAlias.patch (v1)

Review of attachment 9029557 [details] [diff] [review]:

::: mailnews/base/util/nsMsgI18N.cpp
@@ +88,5 @@
> +  NS_ENSURE_SUCCESS(rv, rv);
> +
> +  nsCString newCharset;
> +  rv = ccm->GetCharsetAlias(PromiseFlatCString(aCharset).get(), newCharset);

should check rv

::: mailnews/mime/src/mimemoz2.cpp
@@ +816,5 @@
> +  NS_ENSURE_SUCCESS(rv, -1);
> +
> +  nsCString newCharset;
> +  rv = ccm->GetCharsetAlias(input_charset, newCharset);

and here
Attachment #9029557 - Flags: review?(mkmelin+mozilla) → review+
Added check of return value. I'm still thinking about where to best add a test.
Attachment #9029557 - Attachment is obsolete: true
Attachment #9029575 - Flags: review+
Keywords: leave-open
Pushed by
Add missing calls to GetCharsetAlias() to take Thunderbird's aliases into account. r=mkmelin
Attached patch 1511950-jsmime-part.patch - WIP (obsolete) — Splinter Review
I'm noticed that these subjects work:
Subject: =?shift_jis?B?grGCsYLJlnuVtoKqgquC3IK3gUI=?=
Subject: =?shift_jis?Q?=82=b1=82=b1=82=c9=96=7b=95=b6=82=aa=82=ab=82=dc=82=b7=81=42?=
giving ここに本文がきます。 (Here comes the text.) but using cp932 it doesn't work :-(

So there is more code missing.
With this patch, these tests pass:
mach xpcshell-test comm/mailnews/mime/test/unit/
mach xpcshell-test comm/mailnews/mime/jsmime/test/
mach xpcshell-test comm/mailnews/intl/test/unit/test_decode_utf-7_internal.js

It also undoes the damage done here:

I still need to add a test for decoding cp932 in bodies which isn't covered anywhere yet. Still thinking where to best add it.

With all the tests passing locally this try should also be successful, however, after the last merge, the state of the tree is still unknown, might be busted.
Attachment #9029592 - Flags: review?(mkmelin+mozilla)
Attachment #9029584 - Attachment is obsolete: true
Try run is good, so here an additional test. Copied from test-base64-display.js.
Attachment #9029619 - Flags: review?(mkmelin+mozilla)
Attachment #9029575 - Attachment description: 1511950-GetCharsetAlias.patch (v1b) → 1511950-GetCharsetAlias.patch (v1b) [landed in comment #15]
Attachment #9029592 - Flags: review?(mkmelin+mozilla) → review+
Comment on attachment 9029619 [details] [diff] [review]
1511950-test.patch (v1)

Review of attachment 9029619 [details] [diff] [review]:

LGTM, r=mkmelin
Attachment #9029619 - Flags: review?(mkmelin+mozilla) → review+
Keywords: leave-open
Attachment #9029575 - Flags: approval-comm-esr60+
Pushed by
Add GetCharsetAlias() call to JS Mime. r=mkmelin
Add MozMill test for message display in cp932. r=mkmelin
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 65.0
The patch says:
> +  // The following will throw if the charset is unknown.
> +  let newCharset = manager.getCharsetAlias(charset);

It seems that does not cover even nearly all the labels from the Encoding Standard. What mechanism is there to ensure that this change didn't remove support for most encoding labels from Thunderbird? That is, where's the code to delegate to mozilla::Encoding::ForLabel() for stuff that isn't covered in
(In reply to Jorg K (GMT+1) from comment #22)
> 16003487aabb9202fe21e4fc43777478d0dd916e/mailnews/intl/nsCharsetAlias.cpp#26

Thanks. Somehow, I managed to miss that on searchfox.
BTW, this fixed another bug where unsupported encodings where shipped off to the UTF-7 decoder:
Error: UTF7TextDecoder: This code should never be reached for x-mac-croatian

Now we get instead:
RangeError: The given encoding 'x-mac-croatian' is not supported.
Follow-up: Adapt test test-cp932-display.js to ESR 60. r+a=jorgk
Thank you!
I have confirmed that Thunderbird 60.4.0 can display cp932 mail!
And we've added tests so it won't break again :-) - Sorry 'bout the inconvenience.
You need to log in before you can comment on or make changes to this bug.