Build: 091311 0.9.4 build
When the Universal auto-detector is turned on, Japanese sjis attachments with no charset info are displayed only partially correctly.
Steps to reproduce:
1. Launch mail.
2. Turn on the universal auto-detector by selecting View | Character Coding | Auto Detect | Universal.
3. Select a mail with a sjis attachment which doesn't have charset info.
You can see that some strings in the attachment are still garbled. I'll attach test mails later.
Nominating for nsbranch since sjis attachment with no charset info is frequently seen in Japanese mails.
Identifying Japanese correctly from a single line of input is beyond the capability of my design for the universal charset detector. I believe we should fix bug 12481 to avoid similar problems. Shirley, send me your attachment via email. Most likely this bug will be resolved as wontfix.
Shanjian, what do you mean by "single line of input"? The attachment of the mail is pretty big, not just one line. I'll send you the attachments.
Currently in mail/news, strings are sent to the charset detector line by line, and the charset detector is asked to give a decision for each line. That's why you can see that some lines are displayed correctly while other lines are interpreted as another script. Bug 12481 was filed long ago but still hasn't been taken care of.
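To illustrate the line-by-line behavior described above, here is a minimal sketch. It is not the real detector: `toy_detect`, the `THRESHOLD` value, the scoring rule, and the Latin-1 fallback for undetected lines are all assumptions made up for illustration. It only shows how short lines can fall under a confidence cutoff while longer lines in the same attachment pass it.

```python
from typing import List, Optional

THRESHOLD = 0.95  # hypothetical confidence cutoff, not the real detector's value


def toy_detect(data: bytes) -> Optional[str]:
    """Toy stand-in for a detector: report Shift_JIS only when confident enough."""
    try:
        text = data.decode("shift_jis")
    except UnicodeDecodeError:
        return None
    # Each non-ASCII character counts as one piece of evidence (assumed rule).
    evidence = sum(1 for ch in text if ord(ch) > 0x7F)
    confidence = 1.0 - 0.5 ** evidence
    return "shift_jis" if confidence >= THRESHOLD else None


def decode_line_by_line(lines: List[bytes]) -> List[str]:
    """Ask for a decision per line; undetected lines fall back to Latin-1 (assumed)."""
    out = []
    for line in lines:
        charset = toy_detect(line) or "latin-1"
        out.append(line.decode(charset, errors="replace"))
    return out


lines = [s.encode("shift_jis") for s in ["日本語", "Hi 日本", "これはテストです"]]
per_line = [toy_detect(line) for line in lines]  # short lines yield None
decoded = decode_line_by_line(lines)             # mix of correct and garbled lines
```

In this sketch only the third line carries enough evidence on its own, so the first two come out garbled even though the whole attachment is in one charset.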
If you change the autodetect to Japanese, then these attachments are displayed correctly. Is that still the default setting for the localized Japanese builds? Shanjian, when Universal auto fails to determine the charset, what does it use as the default? I went into Edit|Preferences...|Navigator|Languages and set my Default Character Coding to SJIS, but the attachments still had lines of garbage Latin1. I would have expected that when it could not detect the charset, the default from this pref would be used.
Bob, the universal charset detector does not report a charset if its confidence level does not reach a certain threshold. When the universal charset detector reports nothing, I don't know what happens in mail/news conversion. I am looking into bug 12481 now, and hope I can find a solution to this long-standing problem.
Bob, for the Japanese build, the Japanese auto-detector is turned on by default.
The real solution for this issue is to change the mail/news detection arch to pass the whole data to the detector instead of passing it line by line. One line is simply too small a chunk of data to detect. The current arch won't work. Fixing the code without fixing the arch won't fix it.
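A rough sketch of the whole-data approach, under stated assumptions: `AccumulatingDetector`, its `feed`/`result` methods, the `MIN_EVIDENCE` cutoff, and the Latin-1 fallback are hypothetical names and values for illustration, not the mail/news or detector API. The point is only that evidence is gathered across all lines and one decision is applied to the whole attachment.

```python
from typing import List, Optional


class AccumulatingDetector:
    """Toy stand-in: collects every chunk and decides once, at the end."""

    MIN_EVIDENCE = 5  # hypothetical number of non-ASCII characters required

    def __init__(self) -> None:
        self._chunks: List[bytes] = []

    def feed(self, chunk: bytes) -> None:
        self._chunks.append(chunk)

    def result(self) -> Optional[str]:
        data = b"\n".join(self._chunks)
        try:
            text = data.decode("shift_jis")
        except UnicodeDecodeError:
            return None
        evidence = sum(1 for ch in text if ord(ch) > 0x7F)
        return "shift_jis" if evidence >= self.MIN_EVIDENCE else None


def decode_attachment(lines: List[bytes]) -> List[str]:
    """Feed all lines first, then decode every line with the single decision."""
    detector = AccumulatingDetector()
    for line in lines:
        detector.feed(line)
    charset = detector.result() or "latin-1"  # fallback is an assumption
    return [line.decode(charset, errors="replace") for line in lines]


originals = ["日本語", "Hi 日本", "これはテストです"]
whole = decode_attachment([s.encode("shift_jis") for s in originals])
```

Because the short lines share the decision made from the full buffer, they decode the same way as the long ones.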
> The real solution for this issue is to change the mail/news detection arch to
> pass the whole data to the detector instead of pass line by line. One line is
> simply too small chunk of data to detect.
Yes, I agree that is the main problem. But it also appears that the defaulting behavior is not what a user would expect. That should be fixed in addition to the line-by-line limitation. But since autodetect Universal is not the default setting, I think/hope most users will probably not see this problem, so we can leave this nsbranch-... As firstname.lastname@example.org confirmed, localized Japanese builds use autodetect Japanese as the default setting.
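The defaulting behavior a user would expect can be sketched as follows. This is an illustration only: `decode_with_fallback` and its parameters are hypothetical, and `default_charset` stands in for the Default Character Coding pref; it does not describe what mail/news actually does when the detector reports nothing.

```python
from typing import Optional


def decode_with_fallback(data: bytes, detected: Optional[str], default_charset: str) -> str:
    """Use the detector's answer when there is one, else the user's default charset."""
    charset = detected if detected is not None else default_charset
    return data.decode(charset, errors="replace")


sample = "日本語".encode("shift_jis")
# Detector reported nothing (None); the pref decides how the bytes are read.
with_pref = decode_with_fallback(sample, None, "shift_jis")
with_latin1 = decode_with_fallback(sample, None, "latin-1")
```

With the pref set to SJIS the text is readable; with a Latin-1 default the same bytes come out as the kind of garbage described above.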
After bug 12481 was fixed, this bug should go away.
Verified with 11/19 builds. The display is correct when the universal auto-detector is turned on, but the browser crashes on Linux when viewing the 2nd mail in the attached testing folder with the auto-detector turned off (bug 110858).
Sorry, added the keyword to the wrong bug.