Closed Bug 1120813 Opened 10 years ago Closed 9 years ago

MS932 not recognized as a label of Shift_JIS

Categories

(Core :: Internationalization, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox46 --- fixed
firefox47 --- fixed
thunderbird_esr38 --- affected

People

(Reporter: mizota.toshiki, Assigned: mkmelin)

References

Details

(Keywords: regression, testcase, Whiteboard: [regression:TB31])

Attachments

(4 files, 2 obsolete files)

Attached file maile_image.xlsx
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.3; rv:11.0) like Gecko

Steps to reproduce:
Open a mail and try to choose an encoding.

Actual results:
The encoding cannot be chosen for some mails.
Can choose: Subject ISO-2022-JP / Content-Type: iso-2022-jp
Cannot choose: Subject Shift_JIS / Content-Type: MS932
* It worked in version 24.6. After updating to 31.3 it does not work.

Expected results:
The encoding can be chosen.
Summary: sjis → Can't choose Japanese encoding sjis (Subject Shift_JIS / Content-type: MS932)
Thanks for reporting this. Just an observation - if this is a regression starting at version 31.0 (I'm not saying it is), it's hard to understand why this is only now being reported 5 months after release.
I could confirm this issue with the attached message. However, on my Linux box the issue also exists in Thunderbird 24.6.0. Note that this message does not contain any characters that exist only in ms932.
Attachment #8548560 - Attachment mime type: message/rfc822 → text/plain
mizota san, thanks for reporting this issue. I'd like to make sure that the ms932 message was displayed correctly in Thunderbird 24.6. Could you please attach a message that is GOOD on 24.6 but NG (no good) on 31.3?
I am sorry. I did not set needinfo?
Flags: needinfo?(mizota.toshiki)
I can reproduce.

Steps:
1. Save attachment 8548560 [details] as ms932.eml
2. Drag and Drop the ms932.eml into INBOX of local folder
3. Select the message and View > View – Character Encoding

Actual Results:
The menu item is disabled

Expected Results:
The menu item should be enabled.

Regression window
Good:
https://hg.mozilla.org/mozilla-central/rev/c67a79064fd4
https://hg.mozilla.org/comm-central/rev/109486fb2b22
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.0a1 ID:20140427030201
Bad:
https://hg.mozilla.org/mozilla-central/rev/4d926af89907
https://hg.mozilla.org/comm-central/rev/c8a51e3fbe7e
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.0a1 ID:20140428030201

Pushlog
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=c67a79064fd4&tochange=4d926af89907
https://hg.mozilla.org/comm-central/pushloghtml?fromchange=109486fb2b22&tochange=c8a51e3fbe7e

Regressed by: Bug 999881
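For anyone who cannot use the attachment, a message with the same header shape can be generated locally. The following is a minimal sketch, not the reporter's actual mail: the helper name `build_test_message`, the `example.invalid` addresses, and the body text are all made up for illustration, and Python's `cp932` codec is used only to produce Shift_JIS-compatible bytes.

```python
from email import message_from_bytes


def build_test_message(charset_label: str, body: str) -> bytes:
    """Return raw RFC 822 bytes whose Content-Type declares `charset_label`.

    Hypothetical helper for testing; addresses and subject are placeholders.
    """
    headers = (
        "From: sender@example.invalid\r\n"
        "To: recipient@example.invalid\r\n"
        "Subject: charset label test\r\n"
        "MIME-Version: 1.0\r\n"
        f"Content-Type: text/plain; charset={charset_label}\r\n"
        "Content-Transfer-Encoding: 8bit\r\n"
        "\r\n"
    ).encode("ascii")
    # The body bytes are Shift_JIS-compatible; only the label is unusual.
    return headers + body.encode("cp932")


raw = build_test_message("MS932", "テスト")

# Parse the message back and read the charset label it declares.
declared = message_from_bytes(raw).get_content_charset()
print(declared)

# Write the bytes to a file to get something that can be dropped into a folder:
# with open("ms932.eml", "wb") as f:
#     f.write(raw)
```

Saving `raw` as `ms932.eml` should give a file usable in step 1 above, assuming the menu behaviour depends only on the declared label and not on other properties of the original message.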
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(mconley)
I had misunderstood "Can't choose" to mean "failed auto-detection"...
Flags: needinfo?(mizota.toshiki)
(In reply to Wayne Mery (:wsmwk) from comment #1)
> Just an observation - if this is a regression starting at version 31.0 (I'm
> not saying it is), it's hard to understand why this is only now being
> reported 5 months after release.

Because it requires a message charset whose value is not anything we recognize.
Flags: needinfo?(mconley)
Flags: needinfo?(Pidgeot18)
Attached patch Add ms932 (obsolete) — Splinter Review
I am not sure adding ms932 is the right thing to do.
Assignee: nobody → hiikezoe
I am guessing the problematic message in attachment 8547985 [details] was sent by a Java system. mizota san, does the message really contain extended characters specific to ms932 (IBM extended)? If not, windows-31j should be used there instead.
Flags: needinfo?(mizota.toshiki)
> Could you please attach a message which is GOOD on 24.6 but NG on 31.3?

ikezoe san, I'm sorry, this message is a "trade secret", so I can't attach it. I asked the person who sent the email to send a sample mail, but they could not send one. I downloaded other versions from this site: http://ftp.mozilla.org/pub/mozilla.org/thunderbird/releases/
29.0b1 is GOOD and 30.0b1 is NG.
Flags: needinfo?(mizota.toshiki)
> mizota san, does the messsage surely contain extended characters specified
> for ms932 (IBM extended)?
> If not, windows-31j should be used instead there.

ikezoe san, how can I check whether the message contains extended characters specific to ms932?
Attached file maile_image2.xlsx
Attached a file from when it worked (24.1.1 - 29.0b1).
(In reply to mizota.toshiki from comment #11)
> > mizota san, does the messsage surely contain extended characters specified
> > for ms932 (IBM extended)?
> > If not, windows-31j should be used instead there.
>
> ikezoe san , how can I check the message is
> extended characters specified for ms932.

I misunderstood ms932. According to a Wikipedia article [1], ms932 means windows-31j in Java prior to version 1.4.

[1] http://ja.wikipedia.org/wiki/Microsoft%E3%82%B3%E3%83%BC%E3%83%89%E3%83%9A%E3%83%BC%E3%82%B8932#OEM.E3.82.B3.E3.83.BC.E3.83.89.E3.83.9A.E3.83.BC.E3.82.B8.E3.81.AE.E7.B5.B1.E5.90.88

mizota san, can the charset of your system be changed to windows-31j? If it cannot be changed, I will set the review? flag on attachment 8548613 [details] [diff] [review]. There is no other way to fix this issue.
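As a rough cross-check outside Gecko, Python's own codec registry also treats `ms932` as another name for its Windows code page 932 codec. The sketch below illustrates only that, plus the earlier observation that ordinary Japanese text yields the same bytes under both names; the sample string is made up, and nothing here reflects how Thunderbird or Java resolve the label.

```python
import codecs

# Python's codec registry resolves "ms932" as an alias of its cp932 codec.
info = codecs.lookup("ms932")
print(info.name)

# Ordinary kanji and kana (no vendor extension characters) encode to the
# same bytes under plain Shift_JIS and under the codec that "ms932" maps to.
sample = "日本語のメール"
same_bytes = sample.encode("shift_jis") == sample.encode(info.name)
print(same_bytes)

# Bytes produced under the "ms932" name decode back to the original text.
round_trip = sample.encode("ms932").decode("ms932")
```

For text like this, the label is the only thing that differs between an `ms932` message and a `Shift_JIS` one, which matches the note above that the attached message contains no ms932-only characters.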
Flags: needinfo?(mizota.toshiki)
I have to see this mail, so I will use 24.8.1 at my couple address. Thanks for the support.
Anne, did you go for the intersection rather than the union when including labels in the Encoding Standard? Based on the research you posted, it seems that Presto-Opera recognized ms932 as a label of Shift_JIS.

(In reply to Hiroyuki Ikezoe (:hiro) from comment #13)
> mizota san, the charset of your system can not be changed to windows-31j?

Do you mean in whatever Java program that generated emails labeled as ms932?
Flags: needinfo?(annevk)
Summary: Can't choose Japanese encoding sjis (Subject Shift_JIS / Content-type: MS932) → MS932 not recognized as a label of Shift_JIS
(In reply to Henri Sivonen (:hsivonen) from comment #15)
> Anne, did you go for the intersection rather than the union when including
> labels in the Encoding Standard? Based on the research you posted, it seems
> that Presto-Opera recognized ms932 as a label of Shift_JIS.
>
> (In reply to Hiroyuki Ikezoe (:hiro) from comment #13)
> > mizota san, the charset of your system can not be changed to windows-31j?
>
> Do you mean in whatever Java program that generated emails labeled as ms932?

I am guessing that programs in Japan built on Java 1.4 and earlier versions use ms932.
(In reply to Henri Sivonen (:hsivonen) from comment #15)
> Anne, did you go for the intersection rather than the union when including
> labels in the Encoding Standard? Based on the research you posted, it seems
> that Presto-Opera recognized ms932 as a label of Shift_JIS.

Mostly based on what the majority of user agents recognized. We had to be conservative to some extent as e.g. recognizing "shift-jis" as "shift_jis" causes compatibility issues. Adding "ms932" seems like it might be worth it.
Flags: needinfo?(annevk)
(In reply to Anne (:annevk) from comment #17)
> Adding "ms932" seems like it might be worth it.

Filed https://www.w3.org/Bugs/Public/show_bug.cgi?id=27851
I'm trying to understand what Thunderbird should do with this. Looking at the discussions in https://www.w3.org/Bugs/Public/show_bug.cgi?id=27851 it looks like this is leaning toward WONTFIX. Is that correct?
(In reply to Kent James (:rkent) from comment #19)
> I'm trying to understand what Thunderbird should do with this. Looking at
> the discussions in https://www.w3.org/Bugs/Public/show_bug.cgi?id=27851 it
> looks like this is leaning toward WONTFIX. Is that correct?

The W3 argues that adding new labels is probably not a net win, and I would WONTFIX adding ms932.

However, we appear to have a problem that giving an illegal charset as a label causes the charset selector to be disabled... which, if true, is a bug that really ought to be fixed.
(In reply to Joshua Cranmer [:jcranmer] from comment #20)
...
>
> The W3 argues that adding new labels is probably not a net win, and I would
> WONTFIX adding ms932.
>
> However, we appear to have a problem that giving an illegal charset as a
> label causes the charset selector to be disabled... which, if true, is a bug
> that really ought to be fixed.

If the problem of "the charset selector to be disabled" is fixed, will a message with charset ms932 somehow be rendered correctly?

I just noticed a posting on a large Japanese BBS in which a user describes receiving such e-mails, with ms932 as the charset, from an automated warehouse management system. It looks like the old Java system used to build the inventory and delivery management system used ms932 instead of the proper label (Windows-31J ???) to send out e-mails in Japanese, and there is nothing the user (the mere recipient of the automatic e-mail from the warehouse) could do. He was forced to switch to livemail or something, and has not been heard from since.

TIA
https://www.w3.org/Bugs/Public/show_bug.cgi?id=27851 has been fixed by adding ms932, so we should do the same. Is mailnews/intl/charsetalias.properties still used, or is it sufficient to add it to dom/encoding/labelsencodings.properties?
If it's added to dom/encoding/labelsencodings.properties that should be sufficient. mailnews/intl/charsetalias.properties is checked only if there's no match in dom/encoding/labelsencodings.properties
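The lookup order described above can be pictured as a two-stage table lookup. This is an illustrative sketch only: the function name `resolve_charset` and both dictionaries are hypothetical stand-ins rather than the contents or parsing code of the real `.properties` files, and the mail-only alias entry is invented. The Shift_JIS labels in the primary table are the ones listed in the Encoding Standard, including `ms932`.

```python
from typing import Optional

# Hypothetical stand-in for dom/encoding/labelsencodings.properties.
PRIMARY_LABELS = {
    "csshiftjis": "Shift_JIS",
    "ms932": "Shift_JIS",
    "ms_kanji": "Shift_JIS",
    "shift-jis": "Shift_JIS",
    "shift_jis": "Shift_JIS",
    "sjis": "Shift_JIS",
    "windows-31j": "Shift_JIS",
    "x-sjis": "Shift_JIS",
}

# Hypothetical stand-in for mailnews/intl/charsetalias.properties.
# The entry below is invented purely to exercise the fallback path.
MAIL_ALIASES = {
    "x-example-mail-only": "Shift_JIS",
}

ASCII_WHITESPACE = "\t\n\x0c\r "


def resolve_charset(label: str) -> Optional[str]:
    """Resolve a charset label: primary table first, mail aliases second."""
    key = label.strip(ASCII_WHITESPACE).lower()
    encoding = PRIMARY_LABELS.get(key)
    if encoding is None:
        # Only consulted when the primary table has no match.
        encoding = MAIL_ALIASES.get(key)
    return encoding


print(resolve_charset(" MS932 "))
print(resolve_charset("x-example-mail-only"))
print(resolve_charset("not-a-charset"))
```

In this sketch an unknown label yields `None` rather than an error, so the caller decides what to do with an unrecognized charset; whether the real code behaves that way is not established here.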
What's next?
Assignee: hiikezoe → nobody
Flags: needinfo?(mizota.toshiki) → needinfo?(mkmelin+mozilla)
Keywords: testcase
Whiteboard: [regression:TB31]
(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #24)
> What's next?

IMHO we should add it, to make e-mails from legacy e-mail software / applications readable in Thunderbird.
(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #24)
> What's next?

Someone should write a patch to add the "ms932" alias to labelsencodings.properties. I'll happily review it.
Component: Mail Window Front End → Internationalization
Product: Thunderbird → Core
Assignee: nobody → mkmelin+mozilla
Flags: needinfo?(mkmelin+mozilla)
Attached patch bug1120813_ms932_shift_jis.patch (obsolete) — Splinter Review
Attachment #8548613 - Attachment is obsolete: true
Attachment #8718055 - Flags: review?(VYV03354)
Status: NEW → ASSIGNED
Comment on attachment 8718055 [details] [diff] [review] bug1120813_ms932_shift_jis.patch

Please also change the following files:
/dom/encoding/test/test_TextDecoder.js
/testing/web-platform/tests/tools/html5lib/html5lib/constants.py
/testing/web-platform/tests/encoding/resources/encodings.js
And here: /testing/web-platform/tests/dom/nodes/Document-characterSet-normalization.html
Comment on attachment 8718055 [details] [diff] [review] bug1120813_ms932_shift_jis.patch Waiting for a patch update.
Attachment #8718055 - Flags: review?(VYV03354)
Thx for the pointers, and sorry for the delay
Attachment #8718055 - Attachment is obsolete: true
Attachment #8721354 - Flags: review?(VYV03354)
Comment on attachment 8721354 [details] [diff] [review] bug1120813_ms932_shift_jis.patch LGTM about dom/encodings. Ms2ger, please review wpt changes.
Attachment #8721354 - Flags: review?(VYV03354)
Attachment #8721354 - Flags: review?(Ms2ger)
Attachment #8721354 - Flags: review+
Comment on attachment 8721354 [details] [diff] [review] bug1120813_ms932_shift_jis.patch

Review of attachment 8721354 [details] [diff] [review]:
-----------------------------------------------------------------

> Bug 1120813 - MS932 not recognized as a label of Shift_JIS. r?emk

"Add support for the MS932 label of Shift_JIS", or something along those lines. r+ with those changes.

::: testing/web-platform/tests/tools/html5lib/html5lib/constants.py
@@ +3049,5 @@
> 'latin5': 'windows-1254',
> 'latin6': 'iso8859-10',
> 'latin8': 'iso8859-14',
> 'latin9': 'iso8859-15',
> + 'ms932': 'shift_jis',

Please revert this change.
Attachment #8721354 - Flags: review?(Ms2ger) → review+
(In reply to :Ms2ger from comment #33)
> > + 'ms932': 'shift_jis',
>
> Please revert this change.

May I ask why? E.g. mskanji is also there a few lines down
The code it changes is part of html5lib, which doesn't accept changes based only on our peer review, unlike the tests in wpt.
OS: Windows 7 → All
Hardware: x86_64 → All
Version: 31 Branch → Trunk
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Blocks: 1252508
Comment on attachment 8721354 [details] [diff] [review] bug1120813_ms932_shift_jis.patch

Approval Request Comment
[Feature/regressing bug #]: bug 999881, really bug 943252 I believe
[User impact if declined]: content in ms932 can't be read. Such content is rare but existing.
[Describe test coverage new/current, TreeHerder]: code and tests landed on trunk
[Risks and why]: no risk, just adding an alias
[String/UUID change made/needed]: none
Attachment #8721354 - Flags: approval-mozilla-release?
Attachment #8721354 - Flags: approval-mozilla-aurora?
Comment on attachment 8721354 [details] [diff] [review] bug1120813_ms932_shift_jis.patch (wrong flag, meant mozilla-beta of course)
Attachment #8721354 - Flags: approval-mozilla-release? → approval-mozilla-beta?
Comment on attachment 8721354 [details] [diff] [review] bug1120813_ms932_shift_jis.patch Too late for 45.
Attachment #8721354 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Marking 46 as affected. This is a regression from some time ago so I don't think we need to track.
Comment on attachment 8721354 [details] [diff] [review] bug1120813_ms932_shift_jis.patch OK to uplift to aurora, adds tests, fixes an older regression
Attachment #8721354 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
But effectively we will have to wait until Thunderbird 52 :(
We'll put it in on the "thunderbird 45 version branch" (or whatever we'll call it) - bug 1252508.
Landed in THUNDERBIRD_45_VERBRANCH for inclusion in Thunderbird 45: https://hg.mozilla.org/releases/mozilla-esr45/rev/95e3b775e1ce