Closed
Bug 301915
Opened 19 years ago
Closed 9 years ago
Improve charset/encoding Auto Detect to handle ISO-8859-2
Categories
(Core :: Internationalization, enhancement)
RESOLVED
WONTFIX
People
(Reporter: petr, Assigned: smontagu)
Attachments
(1 file)
739 bytes, message/rfc822
User-Agent: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.7.10) Gecko/20050723
Build Identifier: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.7.10) Gecko/20050723

If the default character encoding is set to ISO-8859-2 and Auto-Detect is set to Universal, messages written in ISO-8859-2 that lack a character encoding header (generated by Outlook Express, for example) are displayed as ISO-8859-1 instead.

Reproducible: Always

Steps to Reproduce:
1. Set the default character set to ISO-8859-2.
2. Set Auto-Detect to Universal.
3. Go to news://microsoft.public.cs.windows, for example.
4. Find a message with accented characters that lacks a correct MIME header.

Actual Results:
Accented characters are displayed as if the message were in ISO-8859-1.

Expected Results:
Accented characters should be displayed as ISO-8859-2.

To display a message in the right charset, it is necessary to change the default charset to some other value and then back to ISO-8859-2, separately for every message, or to switch Auto-Detect off. The behavior is the same on Windows 98 and Windows XP.
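The symptom can be illustrated outside the mail client. The following is a minimal Python sketch, not the client's code path; the sample sentence is an assumption chosen for illustration and is not taken from the attached message. It encodes Czech text as ISO-8859-2 and then reads the same bytes back under both charsets:

```python
# Illustrative Czech sample text (an assumption, not from the attachment).
text = "Příliš žluťoučký kůň"

# The bytes as an unlabeled ISO-8859-2 message body would carry them.
raw = text.encode("iso-8859-2")

# Reading the bytes with the charset they were written in restores the text.
as_latin2 = raw.decode("iso-8859-2")

# Reading the same bytes as ISO-8859-1 never fails, because every byte value
# is valid there, but the accented letters come out as different characters.
as_latin1 = raw.decode("iso-8859-1")

print(as_latin2)
print(as_latin1)
```

Because ISO-8859-1 accepts every byte value, the wrong choice produces no decoding error, only wrong characters on screen.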
Comment 1•19 years ago
Please find an example message, save it as a .EML file, and attach it to this bug (using the Create New Attachment link above). I believe Auto Detect/Universal only identifies ISO-8859-1 vs. Win-1252 vs. UTF-8, as well as some subset of various Chinese, Russian and perhaps some other encodings. If so, detection of the various flavors of 8859 would be an enhancement, and one I'm guessing would be fairly difficult to implement. FWIW, I run with Auto Detect turned off.
Assignee: mail → smontagu
Component: MailNews: Main Mail Window → Internationalization
OS: Windows XP → All
Product: Mozilla Application Suite → Core
QA Contact: amyy
Hardware: PC → All
Summary: Incorrect characrter encoding used for messages without character encoding in the header → Incorrect character encoding used for messages without character encoding in the header
Version: unspecified → Trunk
Comment 2 (Reporter)•19 years ago
This message is wrongly displayed as ISO-8859-1 if the default charset is ISO-8859-2 and Auto-Detect is set to Universal.
Comment 3 (Reporter)•19 years ago
(In reply to comment #1)
> Please find an example message, save it as a .EML file, and attach it to this
> bug (using the Create New Attachment link above).

Done. In this text, the ISO-8859-2 character xB9 <U0161> LATIN SMALL LETTER S WITH CARON is displayed several times as 3/4.

> I believe Auto Detect/Universal only identifies ISO-8859-1 vs. Win-1252 vs.
> UTF-8, as well as some subset of various Chinese, Russian and perhaps some other
> encodings. If so, detection of the various flavors of 8859 would be an
> enhancement, and one I'm guessing would be fairly difficult to implement.

I don't know the concept behind the "Auto-Detect" and "Default" character encoding settings, but I'd suppose that if I have ISO-8859-2 as the default and a message is in ISO-8859-2 (without proper MIME headers), the Auto-Detect feature should not change it to ISO-8859-1. It is not necessary to distinguish between the various flavors of ISO-8859, but then the default one should be chosen.

> FWIW, I run with Auto Detect turned off.

Of course, this is possible, but I think it is not very practical when there is a mix of UTF-8 and ISO-8859-2 messages.
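The behavior requested here could be sketched roughly as follows. This is only an illustration of the proposal, not how the Mozilla detector works; the function name `decode_unlabeled` and its parameters are hypothetical. It separates UTF-8 from single-byte text by attempting a strict UTF-8 decode, and otherwise uses the configured default rather than guessing among the ISO-8859 flavors:

```python
def decode_unlabeled(raw: bytes, default_charset: str = "iso-8859-2") -> str:
    """Hypothetical helper: decode a message body that has no charset header."""
    try:
        # Strict UTF-8 decoding rejects byte sequences that are not well-formed
        # UTF-8, which is the usual case for accented single-byte text.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Do not guess among ISO-8859 variants; use the user's default.
        return raw.decode(default_charset, errors="replace")


# A UTF-8 body and an ISO-8859-2 body both come out readable.
print(decode_unlabeled("š".encode("utf-8")))
print(decode_unlabeled("š".encode("iso-8859-2")))
```

A scheme like this handles a mix of UTF-8 and ISO-8859-2 messages, at the cost of never recognizing any other legacy encoding.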
Comment 4 (Assignee)•19 years ago
See bug 115114, especially comments 6 and 14. However, since we now have a Latin-1 detector, we might want to experiment with turning the Latin-2 detector back on.

(In reply to comment #3)
> I don't know what is concept of "Auto-Detect" and "Default" character encoding
> settings, but I'd suppose that if I have iso-8859-2 as default and message is in
> iso-8859-2 (without proper MIME headers) it should not be changed to iso-8859-1
> by Auto-Detect feature.

The encoding detected by autodetection has higher priority than your default encoding.
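As a rough sketch of that precedence (the function `pick_charset` and its parameters are hypothetical, and placing a declared header charset first is an assumption not stated in this bug):

```python
from typing import Optional


def pick_charset(declared: Optional[str], detected: Optional[str], default: str) -> str:
    """Hypothetical illustration of charset precedence for one message."""
    if declared:
        # Assumption: a charset declared in the MIME header is used as-is.
        return declared
    if detected:
        # Per the comment above: a detector result outranks the default.
        return detected
    # The default applies only when the detector reports nothing.
    return default


# The case from this bug: no header, detector reports ISO-8859-1,
# user default is ISO-8859-2. The detector result is the one used.
print(pick_charset(None, "iso-8859-1", "iso-8859-2"))
```

Under this ordering the default is consulted only when the detector returns no result, which is why setting the default to ISO-8859-2 does not help while Auto-Detect is on.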
Comment 5•19 years ago
There are some other 8859-x schemes for Latin-based alphabets, such as -15, that might be included under this.
Severity: normal → enhancement
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Incorrect character encoding used for messages without character encoding in the header → Improve charset/encoding Auto Detect to handle ISO-8859-2
Updated•15 years ago
QA Contact: amyy → i18n
The "universal" detector is gone.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX