If the subject of a message contains both an encoded word with Japanese text and an encoded word with ASCII text, Mozilla cannot decode it. For example, take this subject:
$B$"$$$&$($*(B(abcdefghijklmnopqrstuvwxyz)
Mew (mailer for Emacs) encodes this subject as below:
Subject: =?iso-2022-jp?B?GyRCJCIkJCQmJCgkKhsoQihhYmNkZQ==?= =?us-ascii?Q?fghijklmnopqrstuvwxyz)?=
Mozilla can't decode it, but NC4.7 can.
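As a cross-check of what this header should decode to, here is a minimal Python sketch using the standard library's email.header.decode_header. It is not Mozilla's decoder, and the helper name decode_subject is made up for illustration. Each encoded-word is decoded with its own charset label and the pieces are joined; the whitespace between two adjacent encoded-words is dropped, as RFC 2047 requires.

```python
from email.header import decode_header

# Subject value exactly as quoted in the report.
SUBJECT = (
    "=?iso-2022-jp?B?GyRCJCIkJCQmJCgkKhsoQihhYmNkZQ==?= "
    "=?us-ascii?Q?fghijklmnopqrstuvwxyz)?="
)


def decode_subject(raw):
    """Decode every encoded-word with its own charset and join the pieces.

    decode_subject is a hypothetical helper name, not a Mozilla function.
    """
    pieces = []
    for data, charset in decode_header(raw):
        if isinstance(data, bytes):
            # Unlabeled bytes are treated as ASCII (an assumption of this sketch).
            pieces.append(data.decode(charset or "ascii"))
        else:
            # decode_header returns str when the header has no encoded-words.
            pieces.append(data)
    return "".join(pieces)


decoded = decode_subject(SUBJECT)
# ascii() keeps the output printable on consoles without Japanese support.
print(ascii(decoded))
```

With the subject above, the result should be the five hiragana あいうえお followed by (abcdefghijklmnopqrstuvwxyz).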
Changing platform to All and marking this as P3, moz0.9.1.
Change QA contact to ji.
The MIME decoder is being rewritten by firstname.lastname@example.org; reassigning to him.
My rewrite will indeed handle this.
Fix checked in.
The fix changed the MIME decoder to do a charset conversion and always return a UTF-8 string. That implementation raises a couple of issues:
* By always returning UTF-8, there is no way for the caller to correct mislabeled charset headers (e.g. Big5 text labeled as ISO-8859-1, Shift_JIS text labeled as US-ASCII).
* libmime optimizes charset conversions in several places. By doing the charset conversion inside the MIME decoder, we cannot take advantage of them.
The first issue caused a regression of bug 65277. I am reopening this bug to propose a better implementation:
* Do the UTF-8 conversion only if the header contains multiple charsets. That can be done by pre-parsing the header to check which charsets it uses. This adds a little overhead, but avoiding the charset conversion inside the decoder helps performance.
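The pre-parse step proposed here could look roughly like the following. This is a Python sketch only, not libmime code; ENCODED_WORD, charsets_in_header and needs_utf8_conversion are hypothetical names. It counts the distinct charset labels in a header and reports whether there is more than one.

```python
import re

# One RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?[bBqQ]\?[^?\s]*\?=")


def charsets_in_header(raw):
    """Return the set of lower-cased charset labels used by encoded-words."""
    labels = set()
    for match in ENCODED_WORD.finditer(raw):
        # Drop an optional RFC 2231 language suffix such as "*en".
        labels.add(match.group(1).split("*")[0].lower())
    return labels


def needs_utf8_conversion(raw):
    """True only when the header mixes two or more charsets."""
    return len(charsets_in_header(raw)) > 1


mixed = (
    "=?iso-2022-jp?B?GyRCJCIkJCQmJCgkKhsoQihhYmNkZQ==?= "
    "=?us-ascii?Q?fghijklmnopqrstuvwxyz)?="
)
single = "=?ISO-8859-1?Q?caf=E9?= =?iso-8859-1?Q?_au_lait?="

print(needs_utf8_conversion(mixed))   # two labels: iso-2022-jp and us-ascii
print(needs_utf8_conversion(single))  # one label, compared case-insensitively
```

In the single-charset case the caller would get the raw bytes and the label back unchanged, so a charset override remains possible.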
A better approach would be to pass an override charset down to the encoded-word decoder. Converting to UTF-8 only in the multi-charset case requires a 2-pass decoder and prevents charset override in the multi-charset case. What charset conversion optimizations are you talking about? The only one I know of is the one which no-ops conversions between UTF-8 and US-ASCII. I believe this bug should remain closed fixed and override work be done on bug 65277.
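To illustrate the override idea, here is a minimal Python sketch built on the standard library's email.header.decode_header. It is not the Mozilla decoder; decode_with_override and its override_charset parameter are hypothetical. When an override is given, it replaces the charset label of every encoded-word, and the output is always a Unicode string. Leaving unlabeled text alone is a choice made for this sketch only.

```python
import base64
from email.header import decode_header


def decode_with_override(raw, override_charset=None):
    """Decode a header, optionally ignoring the charset labels it carries."""
    pieces = []
    for data, charset in decode_header(raw):
        if isinstance(data, str):
            pieces.append(data)  # header had no encoded-words
        elif charset is None:
            # Unlabeled text between encoded-words: not overridden here.
            pieces.append(data.decode("ascii", errors="replace"))
        else:
            effective = override_charset or charset
            pieces.append(data.decode(effective, errors="replace"))
    return "".join(pieces)


# Build a mislabeled header: Big5 bytes that claim to be ISO-8859-1.
payload = base64.b64encode("\u4e2d\u6587".encode("big5")).decode("ascii")
mislabeled = "=?iso-8859-1?B?" + payload + "?="

as_labeled = decode_with_override(mislabeled)           # garbled Latin-1 text
corrected = decode_with_override(mislabeled, "big5")    # the intended text
print(ascii(as_labeled), ascii(corrected))
```

Because the override is applied per encoded-word, this works in a single pass and also covers the multi-charset case.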
I think my proposal has minimal impact on the caller (and less chance of another regression) because it basically asks to go back to the old behavior. I am not sure anybody cares about overriding in the multiple-charset case. In fact, multiple charsets in one header are rarely seen. So the other option could be not to support multiple charsets at all, in which case there is no need for the pre-parsing. Anyway, please try whatever you think is right to fix the problem, but please test to prevent another regression. About the optimization, we cache the charset converters, which saves extra createInstance and getService calls.
This code is still working. Override work is being done in bug 65277.
The test mail that the original reporter attached doesn't show up in the folder. I'd like to reopen this bug.
I suggest you wait another day. You may be seeing bug 75390.
It doesn't show up with previous builds either, such as the 04/02 build. I'll wait until tomorrow anyway.
The attached testing mail doesn't show up either with today's trunk build (04/11). Reopened the bug.
Once I added a "From " line to the test case, it worked for me.
The original testcase does include the "From:" line. How did you edit it to get it to work?
"From ", not "From:".
Yes, it appears correctly. Marked it as verified.
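For anyone else who hits the same confusion: the "From " line is the mbox separator that starts each message in a folder file, which is different from the "From:" header. A rough Python sketch of adding one to a raw test message follows. The function name add_mbox_separator is made up, and the MAILER-DAEMON sender is only a placeholder; the exact separator Mozilla writes may differ.

```python
import time


def add_mbox_separator(raw_message, sender="MAILER-DAEMON"):
    """Prepend an mbox "From " separator line unless one is already there."""
    if raw_message.startswith("From "):
        return raw_message
    # asctime-style date, e.g. "Thu Jan  1 00:00:00 1970".
    stamp = time.asctime(time.gmtime())
    return "From %s %s\n%s" % (sender, stamp, raw_message)


testcase = "From: someone@example.com\nSubject: test\n\nbody\n"
mbox_entry = add_mbox_separator(testcase)
print(mbox_entry.splitlines()[0])
```

The "From:" header stays in place on the second line; only the separator is added above it.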