Closed Bug 268222 Opened 20 years ago Closed 15 years ago

Set character encoding auto-detection on by default in fr-FR builds

Categories

(Mozilla Localizations :: fr / French, defect)

x86
All
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: jmdesp, Assigned: bugzilla.fr)

Details

Setting auto-detection on solves a lot of encoding problems for French users.
It has already been turned on for the Swedish port (bug 267644), and the
bug to turn it on globally (bug 264871) is likely to be blocked for quite a while
yet, so this would be needed for the French localization.

It is as beneficial for French users to turn it on as it is for Swedish users. On the
newsgroup fr.comp.infosystemes.www.navigateurs, users are very frequently
instructed to do that.
We'll take a look at what the sv-SE builds have done. This probably needs a little
testing. Reassigning.
Assignee: bellot → filip
QA Contact: bellot → benoit.leseul
The needed change is nothing more than a one-line change to the preference in
firefox-l10n.js:

+pref("intl.charset.detector", "universal_charset_detector");

I should have said I had looked :) 

Yes, I know it is a one-line change; what should be tested are the actual
consequences of this change. I for one never had any serious problems with
autodetection off.

I see in the dependencies of bug 264871 that autodetection has some bugs in
detecting windows-1252, which is IMHO the most probable encoding for a document
without charset information that a French-speaking user would want to read.

So are you sure that turning it on would actually be better for our users? Or
will it solve some edge cases while some others will be broken?
I just found an example of a page which is broken with autodetection on.

It is the bug for the CVS account request from jerome :) 
https://bugzilla.mozilla.org/show_bug.cgi?id=297829

Deer Park thinks that page is in Chinese (it should probably be UTF-8). I don't
remember such problems with autodetection off. 

Since autodetection doesn't remember if the encoding was manually corrected (it
is wrong again even with a simple reload!), I don't think it is wise to do this
change for now. Not marking wontfix though, autodetection may get better in the
future. Maybe there should be some machine-learning in it, like Thunderbird has
for spam.
Benoit, I'll be a bit brusque, but you know next to nothing about the problem,
while I'm very well aware of everything it implies. For French users, the
benefits of enabling auto-detection outweigh the drawbacks by a very large
margin.

The bug to fix the Chinese problem is assigned to me, even if unfortunately I have had
no time to work on it since I took responsibility for it. It almost never
affects French pages. Fortunately, there are enough accented characters in a
real French page, and they are distributed in such a way that the auto-detector
never gets it wrong. Meanwhile, the choice between UTF-8 and ISO-8859-1 is much more
often annoying for French people, especially when you search from Google: the results
are returned in UTF-8, while the pages they refer to are ISO-8859-1 without any
encoding information. That one is a frequent problem for French people!
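
The UTF-8 versus ISO-8859-1 mismatch described above can be reproduced in a few lines of Python. This is only an illustrative sketch: the sample string and the `naive_decode` helper are hypothetical, and the fallback shown here (strict UTF-8 first, windows-1252 otherwise) is a much simpler heuristic than the universal charset detector discussed in this bug.

```python
# Illustrative sketch only; this is NOT Mozilla's universal charset detector.
# The sample text and the helper name are made up for the example.
text = "Résultats de la recherche : été, français"

latin1_bytes = text.encode("iso-8859-1")  # page served without charset info
utf8_bytes = text.encode("utf-8")         # e.g. a search results page


def naive_decode(raw):
    """Hypothetical fallback: try strict UTF-8, else assume windows-1252."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode("windows-1252"), "windows-1252"


# Forcing the wrong decoder is what a user sees with detection off and
# the wrong encoding still selected from the previous page.
forced_utf8 = latin1_bytes.decode("utf-8", errors="replace")
forced_latin1 = utf8_bytes.decode("iso-8859-1")

print(forced_utf8)    # accents replaced by U+FFFD
print(forced_latin1)  # accents shown as two-character mojibake
print(naive_decode(latin1_bytes))
print(naive_decode(utf8_bytes))
```

Accented French text encoded as ISO-8859-1 is almost never valid UTF-8, because an accented byte followed by a plain ASCII letter is not a legal UTF-8 sequence. That is why even this crude fallback separates the two cases for this sample.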

This setting is what we always recommend to users who have character problems
on news://fr.comp.infosystemes.www.navigateurs, and there has never been a complaint
about problems with the auto-detector; instead we get thanks, just like here:
http://groups.google.fr/group/fr.comp.infosystemes.www.navigateurs/msg/c279c441ec6741fa

The people who are really annoyed by the Chinese problem are the UK users,
because the pound symbol frequently causes that problem, and I wouldn't recommend
turning auto-detection on for a UK version.
Summary: Set character encoding auto-detection on by default → Set character encoding auto-detection on by default in fr-FR builds
If you do decide to change this pref, may I suggest that, rather than adding
it to firefox-l10n.js, you do it by editing the string in
toolkit/chrome/global/intl.properties instead? Change this:

intl.charset.detector=

to:

intl.charset.detector=universal_charset_detector

It seems that what we did for sv-SE on the Aviary branch was not really the
right way. The value for this pref in firefox-l10n.js is overridden by an empty
pref in intl.properties. And even though about:config reports that
"intl.charset.detector" is set to "universal_charset_detector", the
auto-detector is in fact not used. In View -> Character Encoding -> Auto-Detect,
the option "Off" is selected. And worse, manually selecting "Universal" is no
longer possible.
Mass resolving old l10n bugs. These haven't changed since 2006, WORKSFORME.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → WORKSFORME