Closed Bug 65702 Opened 24 years ago Closed 23 years ago

Can't decode subject if it contains ASCII and Japanese

Categories

(MailNews Core :: Internationalization, defect, P3)

defect

Tracking

(Not tracked)

VERIFIED WORKSFORME

People

(Reporter: kazhik, Assigned: jgmyers)

References

Details

(Keywords: intl)

Attachments

(1 file)

If the subject of a message contains both encoded ASCII text and encoded
Japanese text, Mozilla cannot decode it.

For example,

あいうえお(abcdefghijklmnopqrstuvwxyz)

Mew (a mailer for Emacs) encodes this subject as follows:

Subject: =?iso-2022-jp?B?GyRCJCIkJCQmJCgkKhsoQihhYmNkZQ==?=
 =?us-ascii?Q?fghijklmnopqrstuvwxyz)?=

Mozilla can't decode it, but NC4.7 can.
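For reference, the expected decoding can be reproduced outside the mail client. The following is a minimal sketch using Python's standard `email.header` module, not Mozilla's decoder; `decode_subject` is a hypothetical helper name. It decodes each encoded-word with its own charset label and joins the results, dropping the folding whitespace between the two adjacent encoded-words.

```python
from email.header import decode_header

# The folded Subject value from the report (two adjacent encoded-words).
raw_subject = (
    "=?iso-2022-jp?B?GyRCJCIkJCQmJCgkKhsoQihhYmNkZQ==?=\n"
    " =?us-ascii?Q?fghijklmnopqrstuvwxyz)?="
)

def decode_subject(value):
    """Decode each encoded-word with its own charset and join the results."""
    pieces = []
    for chunk, charset in decode_header(value):
        # decode_header returns bytes for encoded-words, str for plain headers.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "us-ascii", errors="replace")
        pieces.append(chunk)
    return "".join(pieces)

print(decode_subject(raw_subject))
```

The printed subject should be the five hiragana あいうえお followed by (abcdefghijklmnopqrstuvwxyz). Note that the first encoded-word also carries the ASCII run "(abcde" after the escape back to ASCII.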
Attached file Testcase(mail file)
Changed platform to All and marked this as P3, mozilla0.9.1.
Keywords: intl, nsbeta1
OS: Windows 2000 → All
Priority: -- → P3
Hardware: PC → All
Target Milestone: --- → mozilla0.9.1
QA Contact: momoi → ji
Change QA contact to ji.
The MIME decoder is being rewritten by jgmyers@netscape.com; reassigning to him.
Assignee: nhotta → jgmyers
Target Milestone: mozilla0.9.1 → ---
My rewrite will indeed handle this.
Status: NEW → ASSIGNED
Depends on: 58114
Fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
The fix changed the MIME decoder to do a charset conversion and always return
a UTF-8 string. That implementation introduced a couple of issues.

* By always returning UTF-8, there is no way for the caller to correct
mislabeled charset headers (e.g. Big5 text labeled as ISO-8859-1, Shift_JIS
text labeled as US-ASCII).

* Some places in libmime optimize charset conversions. By doing the charset
conversion inside the MIME decoder, we cannot take advantage of them.
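The mislabeling concern in the first point can be illustrated with a small Python sketch. It assumes Shift_JIS bytes that a sender labeled as US-ASCII, and it assumes the eager decoder substitutes a replacement character for undecodable bytes; how Mozilla's decoder actually treats such bytes is not stated in this report.

```python
# Shift_JIS bytes that the sending mailer mislabeled as US-ASCII (assumed case).
payload = "日本語".encode("shift_jis")
label = "us-ascii"

# A decoder that always converts to UTF-8 using the label: bytes that are
# invalid in the labeled charset are replaced, so the original text is
# already lost by the time the caller receives the string.
eager_utf8 = payload.decode(label, errors="replace").encode("utf-8")

# A decoder that returns the raw bytes and the label instead: the caller
# can still reinterpret the bytes with the charset the user selects.
still_recoverable = payload.decode("shift_jis")

print(eager_utf8.decode("utf-8"))
print(still_recoverable)
```

Once the conversion has happened inside the decoder, no later override can bring the original characters back from `eager_utf8`.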

The first issue caused a regression of bug 65277. I am reopening this bug and
proposing a better implementation:
* Do the UTF-8 conversion only if the header contains multiple charsets.
That can be done by pre-parsing the header to check which charsets it uses.
This adds a little overhead, but avoiding the charset conversion inside the
decoder improves performance.
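A rough Python sketch of this two-pass proposal follows. It is an illustration only, not libmime code; `charsets_in`, `decode_header_value`, and the simplified encoded-word regular expression are all hypothetical. The first pass collects the charset labels; the second converts to UTF-8 only when more than one label is present, and otherwise returns the raw bytes with their original label.

```python
import re
from email.header import decode_header

# Simplified RFC 2047 encoded-word pattern: =?charset?encoding?text?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?[bBqQ]\?[^?\s]*\?=")

def charsets_in(value):
    """First pass: collect the charset labels used in the header."""
    return {m.group(1).lower() for m in ENCODED_WORD.finditer(value)}

def _as_bytes(chunk):
    # decode_header yields str for headers without any encoded-words.
    return chunk if isinstance(chunk, bytes) else chunk.encode("ascii", "replace")

def decode_header_value(value):
    """Return (bytes, charset); convert to UTF-8 only for mixed charsets."""
    charsets = charsets_in(value)
    parts = decode_header(value)
    if len(charsets) <= 1:
        # Single charset: hand back raw bytes so the caller can still override.
        charset = next(iter(charsets), "us-ascii")
        return b"".join(_as_bytes(c) for c, _ in parts), charset
    # Mixed charsets: no single label fits, so convert everything to UTF-8.
    text = "".join(
        _as_bytes(c).decode(cs or "us-ascii", "replace") for c, cs in parts
    )
    return text.encode("utf-8"), "utf-8"
```

For the subject in this report, the sketch returns UTF-8 because the header mixes iso-2022-jp and us-ascii; a header with a single label comes back unconverted.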

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Blocks: 65277
A better approach would be to pass an override charset down to the encoded-word 
decoder.  Converting to UTF-8 only in the multi-charset case requires a 2-pass 
decoder and prevents charset override in the multi-charset case.
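The override approach can be sketched in Python as follows. This is only an illustration of the idea, not the actual decoder interface; `decode_words` and its `override` parameter are hypothetical names. The decoder stays single-pass and always returns Unicode text, but an override charset supplied by the caller takes precedence over the label on each encoded-word.

```python
import base64
from email.header import decode_header

def decode_words(value, override=None):
    """Decode encoded-words; `override`, if given, replaces every label."""
    pieces = []
    for chunk, label in decode_header(value):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(override or label or "us-ascii", errors="replace")
        pieces.append(chunk)
    return "".join(pieces)

# Big5 text that the sender labeled as ISO-8859-1 (assumed example).
big5_bytes = "中文".encode("big5")
mislabeled = "=?iso-8859-1?B?" + base64.b64encode(big5_bytes).decode("ascii") + "?="

print(decode_words(mislabeled))                    # mojibake from the wrong label
print(decode_words(mislabeled, override="big5"))   # the intended text
```

In this sketch the override is applied to every encoded-word, including correctly labeled ones; whether that is the right policy for mixed-charset headers is a separate design question.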

What charset conversion optimizations are you talking about?  The only one I 
know of is the one which no-ops conversions between UTF-8 and US-ASCII.

I believe this bug should remain closed fixed and override work be done on bug 
65277.
Blocks: 68344
I think my proposal has minimal impact on the caller (and less chance of
another regression) because it basically asks to go back to the old
behavior. I am not sure anybody cares about overriding the multiple-charset case.
In fact, multiple charsets in a single header are rarely seen.
So the other option could be not supporting multiple charsets at all, in which
case the pre-parsing is not needed.

Anyway, please try whatever you think is right to fix the problem, but please
test to prevent another regression.

About the optimization: we cache the charset converters, which saves extra
createInstance and getService calls.
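The caching idea can be sketched in Python by analogy. This is not the libmime cache; `get_decoder` and `to_unicode` are hypothetical names, and Python's `codecs` module stands in for the converter service. The converter for each charset name is looked up once and reused on later calls.

```python
import codecs
from functools import lru_cache

@lru_cache(maxsize=None)
def get_decoder(charset):
    """Look up a decoder once per charset name and reuse it afterwards."""
    return codecs.getdecoder(charset)

def to_unicode(raw, charset):
    # Normalize the name so "US-ASCII" and "us-ascii" share one cache entry.
    text, _consumed = get_decoder(charset.lower())(raw, "replace")
    return text

print(to_unicode(b"abc", "US-ASCII"))
print(to_unicode(b"xyz", "us-ascii"))   # served from the cache
print(get_decoder.cache_info())
```

If the conversion is done inside the decoder without such a cache, every header would pay the lookup cost again.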

No longer blocks: 68344
Blocks: 68344
This code is still working.  Override work is being done as bug 65277
Status: REOPENED → RESOLVED
Closed: 24 years ago → 23 years ago
Resolution: --- → FIXED
The test mail that the original reporter attached doesn't show up in the folder.
I'd like to reopen this bug.
I suggest you wait another day.  You may be seeing bug 75390.
It doesn't show up with previous builds, such as the 04/02 build, either.
I'll wait until tomorrow anyway.
The attached test mail doesn't show up with today's trunk build (04/11)
either. Reopened the bug.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Once I added a "From " line to the test case, it worked for me.
Status: REOPENED → RESOLVED
Closed: 23 years ago → 23 years ago
Resolution: --- → WORKSFORME
The original testcase does include the "From:" line.
How did you edit it to get it to work?
"From ", not "From:".
Yes, it appears correctly. Marking it as verified.
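The difference between the "From:" header and the "From " separator line can be demonstrated with Python's standard `mailbox` module. This is an analogy, on the assumption that the folder file is parsed as an mbox-style file in which each message must start with a "From " line; the sender address and date below are placeholders, and `count_messages` is a hypothetical helper.

```python
import mailbox
import os
import tempfile

# A message with a "From:" header but no mbox "From " separator line.
MESSAGE = (
    "From: sender@example.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
# mbox separator line; address and date are placeholders.
SEPARATOR = "From sender@example.com Thu Jan  1 00:00:00 1970\n"

def count_messages(folder_text):
    """Write folder_text to a temporary mbox file and count parsed messages."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "folder")
        with open(path, "w", newline="\n") as fh:
            fh.write(folder_text)
        box = mailbox.mbox(path, create=False)
        try:
            return len(box)
        finally:
            box.close()

print(count_messages(MESSAGE))              # no separator line
print(count_messages(SEPARATOR + MESSAGE))  # separator line added
```

With Python's parser, the file without the separator yields no messages at all, which matches the symptom of the test mail not showing up in the folder.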
Status: RESOLVED → VERIFIED
Product: MailNews → Core
Product: Core → MailNews Core