Closed Bug 1034960 Opened 10 years ago Closed 10 years ago

Restore menu option to fallback to UTF-8 (or other) encoding in case it is not specified in the content

Categories

(Firefox :: Settings UI, defect)

30 Branch
x86_64
All
defect
Not set
normal

RESOLVED WONTFIX

People

(Reporter: awd, Unassigned)

Using Firefox 30, on a web page, follow a link that opens a UTF-8-encoded non-English plain-text file (e.g. myfile.txt) in a new tab. Observe garbled characters. In FF Options > Content > Advanced, observe that UTF-8 is no longer present in the list of encodings to fall back to. The user has to manually switch to UTF-8 through the View menu to see the content, and has to do this every time a file like this is opened.

This option was removed as of FF 28 per Bug 910192.

This option, however, was used to instruct the browser to fall back to UTF-8 (or another user-preferred encoding) for non-legacy encodings, such as in the scenario described above.

Please restore this option.

Related bugs:
Bug 967981 - Provide some way for addons/prefs to control the default fallback character set 
Bug 910192 - Get rid of intl.charset.default as a localizable pref and deduce the fallback from the locale
Blocks: 910192
Status: UNCONFIRMED → NEW
Ever confirmed: true
I am strongly opposed to restoring the option to use UTF-8 as a general fallback that would extend to text/html from http and https URLs. It's worth noting that doing so would interfere with bug 910211. I think this should be resolved as WONTFIX.

For the online case described in comment 0, changing the fallback to accommodate one misauthored site would break other misauthored sites and, therefore, would not be a general fix. Moreover, as a side effect, the user would end up with an abnormal configuration. A user who is a Web author could then end up relying on that abnormal configuration when authoring new content, thereby contributing to the problem.
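
To illustrate the trade-off described above, here is a minimal Python sketch (not Firefox code). The sample text and the helper name decode_with_fallback are hypothetical, and windows-1252 is only assumed as a typical locale-derived fallback. It shows that whichever single fallback is chosen, unlabeled content in the other encoding comes out garbled.

```python
# Two "misauthored" pages: neither declares its encoding.
utf8_page = "naïve café".encode("utf-8")
legacy_page = "naïve café".encode("windows-1252")


def decode_with_fallback(data: bytes, fallback: str) -> str:
    """Decode unlabeled bytes using a single configured fallback encoding.

    Invalid byte sequences become U+FFFD instead of raising an error.
    """
    return data.decode(fallback, errors="replace")


if __name__ == "__main__":
    for fallback in ("windows-1252", "utf-8"):
        print(fallback)
        print("  UTF-8 page: ", decode_with_fallback(utf8_page, fallback))
        print("  legacy page:", decode_with_fallback(legacy_page, fallback))
```

With the windows-1252 fallback the UTF-8 page turns into mojibake, and with the UTF-8 fallback the legacy page picks up replacement characters, so switching the fallback only moves the breakage from one set of pages to another.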

Although I think we should avoid introducing more magic for the Web (http and https) case, it's worth noting that autodetecting UTF-8 for text/plain only would be less problematic than autodetecting it for text/html.

As for the case where a user is working from file: URLs on documents that then get served with the appropriate HTTP header to other people, for the file: case there are two potential fixes that would be more appropriate than what's being requested here:

 1) Since file: URLs (practically always) point to finite files, we could scan all the bytes for UTF-8ness before parsing and use UTF-8 if the bytes look like UTF-8. This wouldn't involve the problems that are involved in detecting UTF-8 from a network stream.

 2) Failing #1, we could introduce a checkbox "Use UTF-8 as the fallback for file: URLs".
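
Option 1 above could look roughly like the following Python sketch. This is an illustration of the idea only, not how Gecko is implemented. The function names and the windows-1252 default are assumptions made for the example. It relies on the file being finite, so every byte can be validated before an encoding is chosen.

```python
from pathlib import Path


def looks_like_utf8(data: bytes) -> bool:
    """Return True if the complete byte string is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def choose_encoding(data: bytes, fallback: str = "windows-1252") -> str:
    """Pick UTF-8 when all bytes validate, else the configured fallback."""
    return "utf-8" if looks_like_utf8(data) else fallback


def choose_file_encoding(path: str, fallback: str = "windows-1252") -> str:
    """Read a local file in full and choose its encoding before parsing."""
    return choose_encoding(Path(path).read_bytes(), fallback)
```

Note that pure ASCII content also validates as UTF-8, which is harmless because ASCII bytes decode to the same characters under either choice. The same check is much harder to apply to a network stream, where parsing starts before all the bytes have arrived.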
"View > Character Encoding > Autodetect > Universal" is another menu option that is gone from FF version 30, but could've helped to solve the issue reported in this bug. Please consider restoring this option if no other solutions are possible.
(In reply to Angry Bird from comment #2)
> "View > Character Encoding > Autodetect > Universal" is another menu option
> that is gone from FF version 30, but could've helped to solve the issue
> reported in this bug. Please consider restoring this option if no other
> solutions are possible.

Definitely not restoring that one. The "Universal" detector wasn't actually universal. Rather, it was arbitrary in what it detected and what it didn't (Thai but not Vietnamese, Hebrew but not Arabic, Hungarian but not Polish, etc.).
OS: Windows 8.1 → All
WONTFIX per comment 1.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX