Closed Bug 65702 Opened 25 years ago Closed 24 years ago

Can't decode subject if it contains ASCII and Japanese

Categories

(MailNews Core :: Internationalization, defect, P3)

Tracking

(Not tracked)

VERIFIED WORKSFORME

People

(Reporter: kazhik, Assigned: jgmyers)

References

Details

(Keywords: intl)

Attachments

(1 file)

If the subject of a message contains both encoded ASCII and encoded Japanese text, Mozilla cannot decode it. For example, take this subject:
あいうえお(abcdefghijklmnopqrstuvwxyz)
Mew (a mailer for Emacs) encodes it as follows:
Subject: =?iso-2022-jp?B?GyRCJCIkJCQmJCgkKhsoQihhYmNkZQ==?= =?us-ascii?Q?fghijklmnopqrstuvwxyz)?=
Mozilla can't decode it, but NC4.7 can.
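As a cross-check of what a correct decoder should produce for the header in this report, here is a minimal Python sketch. Python is an assumption of this note and is not used anywhere in the bug; `decode_subject` is a hypothetical helper name. It relies on the standard library's `email.header.decode_header`, which splits the value into encoded-words and drops the whitespace between adjacent ones, as RFC 2047 requires.

```python
from email.header import decode_header

# Header value copied from the report (without the "Subject: " prefix).
RAW_SUBJECT = (
    "=?iso-2022-jp?B?GyRCJCIkJCQmJCgkKhsoQihhYmNkZQ==?= "
    "=?us-ascii?Q?fghijklmnopqrstuvwxyz)?="
)


def decode_subject(raw: str) -> str:
    """Decode each encoded-word with its own labeled charset and join them."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            # Chunks without a label are treated as US-ASCII (an assumption).
            parts.append(chunk.decode(charset or "ascii"))
        else:
            # decode_header returns str when the header had no encoded-words.
            parts.append(chunk)
    return "".join(parts)
```

Calling `decode_subject(RAW_SUBJECT)` should give the five hiragana characters followed by the full ASCII run in parentheses, with no space inserted where the two encoded-words meet.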
Attached file Testcase(mail file)
Changing platform to All and marking this as P3, mozilla0.9.1.
Keywords: intl, nsbeta1
OS: Windows 2000 → All
Priority: -- → P3
Hardware: PC → All
Target Milestone: --- → mozilla0.9.1
QA Contact: momoi → ji
Change QA contact to ji.
The MIME decoder is being rewritten by jgmyers@netscape.com; reassigning to him.
Assignee: nhotta → jgmyers
Target Milestone: mozilla0.9.1 → ---
My rewrite will indeed handle this.
Status: NEW → ASSIGNED
Depends on: 58114
Fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
The fix changed the MIME decoder to do a charset conversion and always return a UTF-8 string. That implementation raises a couple of issues:
* By always returning UTF-8, there is no way for the caller to correct mislabeled charset headers (e.g. Big5 text labeled as ISO-8859-1, Shift_JIS text labeled as US-ASCII).
* There are places in libmime that optimize charset conversions. By doing the charset conversion inside the MIME decoder, we cannot take advantage of them.
The first issue caused the regression in bug 65277. I am reopening this bug and proposing a better implementation:
* Do the UTF-8 conversion only if the header contains multiple charsets. That can be done by pre-parsing the header to check which charsets it uses. This adds a little overhead, but avoiding the charset conversion inside the decoder helps performance.
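The pre-parsing idea in the comment above could look roughly like the following Python sketch. It is an illustration only, not the libmime code: the function names, the regular expression, and the choice to return a `(bytes, charset)` pair are all assumptions made here.

```python
import re
from email.header import decode_header

# Matches an RFC 2047 encoded-word: =?charset?encoding?text?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?[bBqQ]\?[^?]*\?=")

# Sample inputs: the two-charset header from this bug and its first word alone.
MULTI = (
    "=?iso-2022-jp?B?GyRCJCIkJCQmJCgkKhsoQihhYmNkZQ==?= "
    "=?us-ascii?Q?fghijklmnopqrstuvwxyz)?="
)
SINGLE = "=?iso-2022-jp?B?GyRCJCIkJCQmJCgkKhsoQihhYmNkZQ==?="


def charsets_in_header(raw: str) -> set:
    """First pass: collect the charset labels used in the header."""
    return {m.group(1).lower() for m in ENCODED_WORD.finditer(raw)}


def decode_preserving_charset(raw: str):
    """Return (data, charset).

    Single charset: the undecoded bytes plus their label, so the caller can
    still override a wrong label. Multiple charsets: UTF-8 bytes, "utf-8".
    """
    chunks = decode_header(raw)
    labels = charsets_in_header(raw)
    if len(labels) == 1 and all(isinstance(c, bytes) for c, _ in chunks):
        return b"".join(c for c, _ in chunks), labels.pop()
    # Mixed charsets (or no encoded-words at all): convert everything.
    text = "".join(
        c.decode(cs or "ascii") if isinstance(c, bytes) else c
        for c, cs in chunks
    )
    return text.encode("utf-8"), "utf-8"
```

With these inputs, `SINGLE` comes back as raw ISO-2022-JP bytes with its label, while `MULTI` is converted to UTF-8. The cost of this design is the extra scan of the header before decoding.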
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Blocks: 65277
A better approach would be to pass an override charset down to the encoded-word decoder. Converting to UTF-8 only in the multi-charset case requires a two-pass decoder and prevents charset override in the multi-charset case. Which charset conversion optimizations are you referring to? The only one I know of is the one that no-ops conversions between UTF-8 and US-ASCII. I believe this bug should remain closed as fixed, with the override work done in bug 65277.
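The override approach suggested in this comment can be sketched in Python as follows. This is a hedged illustration rather than the actual decoder interface: `decode_with_override`, its `override_charset` parameter, and the mislabeled sample header are hypothetical and were constructed for this note.

```python
import base64
from email.header import decode_header


def decode_with_override(raw, override_charset=None):
    """Decode encoded-words in one pass.

    If override_charset is given, it replaces every charset label in the
    header, including in headers that mix several labels.
    """
    parts = []
    for chunk, label in decode_header(raw):
        if isinstance(chunk, bytes):
            charset = override_charset or label or "ascii"
            # errors="replace" keeps a wrong override from raising.
            parts.append(chunk.decode(charset, errors="replace"))
        else:
            parts.append(chunk)
    return "".join(parts)


# Hypothetical mislabeled header: Big5 bytes labeled as ISO-8859-1.
_payload = base64.b64encode("\u4e2d\u6587".encode("big5")).decode("ascii")
MISLABELED = "=?iso-8859-1?B?" + _payload + "?="
```

Decoding `MISLABELED` without an override trusts the ISO-8859-1 label and yields mojibake; passing `"big5"` recovers the intended two characters. The decoder still returns Unicode in every case, which is the point of this design: the caller corrects the label up front instead of re-converting afterwards.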
Blocks: 68344
I think my proposal has minimal impact on the caller (and less chance of another regression) because it basically asks to go back to the old behavior. I am not sure anybody cares about overriding in the multiple-charset case. In fact, multiple charsets in a single header are rarely seen. So another option would be to not support multiple charsets at all, which removes the need for pre-parsing. Anyway, please try whatever you think is right to fix the problem, but please test to prevent another regression. About the optimization: we cache the charset converters, which saves extra createInstance and getService calls.
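The converter caching mentioned at the end of this comment amounts to looking up a converter once per charset and reusing it. A rough Python analogue, assuming nothing about how libmime actually stores its converters (`get_decoder` and `to_unicode` are hypothetical names):

```python
import codecs
from functools import lru_cache


@lru_cache(maxsize=None)
def get_decoder(charset: str):
    """Look up the decoder for a charset once and reuse it afterwards."""
    return codecs.getdecoder(charset)


def to_unicode(data: bytes, charset: str) -> str:
    """Convert bytes to str using the cached decoder for this charset."""
    text, _consumed = get_decoder(charset.lower())(data)
    return text
```

Repeated conversions for the same charset then skip the lookup, which is the saving the comment describes for the converter objects.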
No longer blocks: 68344
Blocks: 68344
This code is still working. The override work is being done in bug 65277.
Status: REOPENED → RESOLVED
Closed: 25 years ago → 25 years ago
Resolution: --- → FIXED
The test mail that the original reporter attached doesn't show up in the folder. I'd like to reopen this bug.
I suggest you wait another day. You may be seeing bug 75390.
It doesn't show up with previous builds either, such as the 04/02 build. I'll wait until tomorrow anyway.
The attached test mail doesn't show up with today's trunk build (04/11) either. Reopening the bug.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Once I added a "From " line to the test case, it worked for me.
Status: REOPENED → RESOLVED
Closed: 25 years ago → 24 years ago
Resolution: --- → WORKSFORME
The original testcase does include the "From:" line. How did you edit it to get it to work?
"From ", not "From:".
Yes, it appears correctly. Marking it as verified.
Status: RESOLVED → VERIFIED
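The "From " versus "From:" distinction in the comments above matters for mbox-style folder files, where each message starts with a "From " separator line (no colon) that is distinct from the "From:" header. The sketch below uses Python's `mailbox` module to illustrate that general behavior; it says nothing about how Mozilla's own folder parser is written, and the addresses, date, and `count_messages` helper are placeholders invented for this example.

```python
import mailbox
import os
import tempfile

# A message with a "From:" header but no mbox separator line.
MESSAGE = (
    b"From: sender@example.com\n"
    b"Subject: test\n"
    b"\n"
    b"body\n"
)
# mbox separator line; the address and date are placeholders.
FROM_LINE = b"From sender@example.com Wed Apr 11 00:00:00 2001\n"


def count_messages(raw: bytes) -> int:
    """Write raw bytes to a temporary file and count the messages that
    Python's mbox reader finds in it."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        box = mailbox.mbox(path, create=False)
        try:
            return len(box)
        finally:
            box.close()
    finally:
        os.remove(path)
```

Here `count_messages(MESSAGE)` finds no messages, while `count_messages(FROM_LINE + MESSAGE)` finds one, which matches the symptom reported in this thread: the test mail only showed up once the separator line was added.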
Product: MailNews → Core
Product: Core → MailNews Core