Closed Bug 536506 Opened 15 years ago Closed 15 years ago

[zh-TW] Set default value of intl.charset.default to UTF-8 and turn on Universal Charset Detector

Categories

(Mozilla Localizations :: zh-TW / Chinese (Traditional), defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: timdream, Assigned: timdream)

Details

Everything is ready; I just need to explain things here for the record.

intl.charset.default affects:
a) The "Accept-Charset" request header
b) The encoding used for text files or web pages that do not specify their encoding, which is really rare nowadays.

By changing the value from Big5 to UTF-8, we could
1) make misconfigured servers behave the same way as they do with an en-US browser, even when they erroneously "obey" the Accept-Charset header.
2) keep the browser behaving the same way as the en-US version when it encounters a UTF-8 plain text file.
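
To illustrate point (b) and item 2 above, here is a minimal Python sketch of fallback decoding. It is not Gecko's implementation: decode_with_fallback and its parameters are hypothetical names, and the `default` argument merely stands in for intl.charset.default. It shows that an undeclared UTF-8 file decodes correctly with a UTF-8 default but turns into garbage with a Big5 default.

```python
from typing import Optional


def decode_with_fallback(data: bytes, declared: Optional[str], default: str) -> str:
    """Decode with the declared encoding, or with `default` when none is declared.

    `default` stands in for intl.charset.default (an assumption for illustration).
    """
    encoding = declared or default
    # errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
    return data.decode(encoding, errors="replace")


# A UTF-8 plain text file that does not declare its encoding.
utf8_file = "繁體中文".encode("utf-8")

print(decode_with_fallback(utf8_file, None, "utf-8"))  # decoded as intended
print(decode_with_fallback(utf8_file, None, "big5"))   # mojibake: wrong default
```

A page that declares its encoding is unaffected by the default, which is why only the rare pages in condition (b) see any change.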

To minimize impact, the change would also turn on the Universal Charset Detector (see ref), a decade-old piece of code that guesses the encoding of the legacy pages that fit condition (b). I chose Universal over the zh-TW detector because, over the years, use of zh-CN websites in Taiwan has grown rapidly, and most of the time they don't follow web standards.
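
As a rough picture of what "guessing" means for pages in condition (b), here is a naive Python sketch that tries candidate encodings in order and returns the first one that decodes without error. This is only an illustration and not how the Universal Charset Detector works; guess_encoding and its candidate list are hypothetical. Strict decoding alone cannot reliably tell Big5 from GB-family encodings, since many byte sequences are valid in both, so a real detector needs more than this validity check.

```python
from typing import Iterable, Optional


def guess_encoding(
    data: bytes,
    candidates: Iterable[str] = ("utf-8", "big5", "gb18030"),
) -> Optional[str]:
    """Return the first candidate encoding that decodes `data` without error."""
    for name in candidates:
        try:
            data.decode(name)  # strict decoding: raises on invalid byte sequences
        except UnicodeDecodeError:
            continue
        return name
    return None


# A legacy Big5 page with no declared encoding is not valid UTF-8,
# so the guess falls through to the next candidate.
legacy_page = "中文".encode("big5")
print(guess_encoding(legacy_page))
print(guess_encoding("中文".encode("utf-8")))
```

UTF-8 is tried first because non-UTF-8 text is very unlikely to pass a strict UTF-8 check by accident.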

I've already asked our beta users to test the new configuration values; so far none of them has noticed any difference. The changes would apply to existing profiles, so I think it's better to ship the change in a new version, i.e. Firefox 3.6.

ref: http://www.mozilla.org/projects/intl/chardet.html

see also (forum discussion in Chinese): http://forum.moztw.org/viewtopic.php?f=4&t=28987

Tim
Committed to 1.9.2 and trunk. Marking as RESOLVED FIXED.

http://hg.mozilla.org/l10n-central/zh-TW/rev/0bb7951b7fe0
http://hg.mozilla.org/releases/l10n-mozilla-1.9.2/zh-TW/rev/4a32d99f179d
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Simon, does switching on the universal charset detector come with a perf cost?
It can cause bug 61363, which is annoying, but other than that I don't think the perf cost is serious.

Instead of "Universal", I would consider the option of "Chinese", which will detect both Simplified and Traditional (and Latin-1 and UTF- encodings), but won't activate the other detector modules.
(In reply to Simon Montagu from comment #3)
> Instead of "Universal", I would consider the option of "Chinese", which will
> detect both Simplified and Traditional (and Latin-1 and UTF- encodings), but
> won't activate the other detector modules.

Filed bug 844114.