Closed Bug 148369 (opened 23 years ago, closed 21 years ago)

Better autodetection for ISO-8859-1, Windows-1252

Categories

(Core :: Internationalization, defect)

Platform: x86, Linux
Type: defect
Priority: Not set
Severity: normal

RESOLVED WORKSFORME

People

(Reporter: ian, Assigned: shanjian)

Details

(Keywords: intl, top100, Whiteboard: [Hixie-P0])

I use the universal character encoding autodetection algorithm (View, Character Coding, Autodetect, Universal). However, it frequently picks UTF-8 when the character encoding is actually Windows-1252 or ISO-8859-1. Either the universal autodetector should be improved, or a new autodetection algorithm should be written specifically for Western character encodings. To detect Windows-1252, a good start would be to look for bytes in the 0x80-0x9F range, which are control characters in ISO-8859-1 and cannot stand alone in UTF-8, especially 0x92 (the curly apostrophe) and 0x93 and 0x94 (smart double quotes).
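The heuristic suggested in the description can be sketched roughly as follows. This is only an illustration of the idea, not the code of the universal autodetector; the function name `guess_western_encoding` and the order of the checks are assumptions made for the example.

```python
def guess_western_encoding(data: bytes) -> str:
    """Hypothetical heuristic: guess between UTF-8, Windows-1252 and ISO-8859-1.

    This is NOT the Mozilla universal detector, only a sketch of the idea
    from the bug description.
    """
    # Pure 7-bit input is identical in all three encodings.
    if all(b < 0x80 for b in data):
        return "ascii"

    # Strict decoding rejects stray bytes such as a lone 0x92 or 0xE9,
    # so text that decodes cleanly is treated as UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass

    # Bytes 0x80-0x9F are C1 control characters in ISO-8859-1 but mostly
    # printable characters in Windows-1252 (0x92 apostrophe, 0x93/0x94
    # smart quotes), so their presence points to Windows-1252.
    if any(0x80 <= b <= 0x9F for b in data):
        return "windows-1252"

    return "iso-8859-1"


if __name__ == "__main__":
    print(guess_western_encoding(b"it\x92s a \x93test\x94"))
    print(guess_western_encoding(b"caf\xe9"))
    print(guess_western_encoding("café".encode("utf-8")))
```

A real detector would need more than this: a few bytes in the 0x80-0x9F range are unassigned in Windows-1252, and short Western-encoded strings can occasionally happen to be valid UTF-8.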
Keywords: intl
Whiteboard: [Hixie-P0]
QA Contact: ruixu → ylong
Reassigning to the autodetector owner.
Assignee: yokoyama → shanjian
accepting.
Status: NEW → ASSIGNED
Possibly related: Bug 148369, Bug 159295.
There are several autodetection related bugs that affect top100 sites. This is one of them.
Keywords: top100
I believe in comment 3, Christian Franke meant to xref to bug 158285, rather than to this bug. Bug 159295 (likely the same problem as this or 158285, but in mail/news) probably should be duped to bug 177505, which has sample messages attached.
On closer examination, bug 159295 is unrelated. Bug 158285, however, appears to be about this same issue. That bug has been WFM'd because the autodetector correctly identifies its test case, and it appears to correctly identify the test case for this bug as well. Hixie, do you want to keep this bug open, or is the autodetector working sufficiently well for you at this point? (There are a few known autodetector bugs, of course, but none about differentiating between Western encodings and UTF-8.) (If this is to be kept open, perhaps it should be changed to 'enhancement'?)
testcase is wfm
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → WORKSFORME