Default Encoding for all pages is Unicode and NOT Western




13 years ago
10 years ago


(Reporter: Mike Yamnitsky, Unassigned)


1.5.0.x Branch
Windows XP

Firefox Tracking Flags

(Not tracked)


(Whiteboard: CLOSEME 07/14)



13 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b5) Gecko/20051006 Firefox/1.4.1
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b5) Gecko/20051006 Firefox/1.4.1

Whenever a page whose source does not specify encoding loads in Firefox, the
default encoding is set to Unicode which causes incorrect display of many
characters.  Additionally, View->Character Encoding->Auto-Detect is always set
to "OFF" and there is no way to turn it on.

Reproducible: Always

Steps to Reproduce:
1.  Go to any English-language webpage
2.  Go under View->Character Encoding.
3.  Unicode (UTF8) is highlighted;  View->Character Encoding->Auto-Detect says "OFF"

Actual Results:  
the page loaded using Unicode encoding instead of Western (1252)

Expected Results:  
the page should have loaded using Western (1252) encoding, and the Auto-Detect
should be settable to "ON"
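
The symptom described above can be reproduced outside the browser. The following Python sketch is an illustration only, not Firefox's decoding code, and the sample string is an assumption chosen for the demo. It writes text as windows-1252 bytes and then reads them back as UTF-8, which is roughly what happens when a Western page with no declared encoding is treated as Unicode.

```python
# Illustration only: bytes written as windows-1252, read back as UTF-8.
# The sample text is hypothetical, not taken from the report:
# "caf" + e-acute, an en dash, and a pair of curly double quotes.
text = "caf\u00e9 \u2013 \u201cquoted\u201d"
raw = text.encode("windows-1252")

# Decoding with the encoding the bytes were written in round-trips cleanly.
as_western = raw.decode("windows-1252")

# Decoding the same bytes as UTF-8 hits invalid byte sequences;
# errors="replace" substitutes U+FFFD (the replacement character) for them.
as_utf8 = raw.decode("utf-8", errors="replace")

print(as_western)
print(as_utf8)
```

Plain ASCII letters survive either way, which is why an English-language page can look mostly fine and only break on accented letters and typographic punctuation.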

Comment 1

Set your desired default character encoding in Tools -> Options -> Content ->
Fonts & Colors -> Advanced...

The default is ISO-8859-1, so you would've had to change it yourself to UTF-8.

Comment 2

13 years ago
I haven't changed anything, I just installed Beta 2.0 -- that's all.  I can't
even find where I would change the default encoding!

Comment 3

It's beta 2 for 1.5, not a beta for 2.0. :P

The place to change it is right there where I said it is...

You can also go to about:config (type it in the location bar), type "charset" in the filter box, and change the intl.charset.default preference to "windows-1252" or "ISO-8859-1" or whatever other character encoding you want. (Double-click it.)

Comment 4

13 years ago
intl.charset.default IS set to ISO-8859-1, but every page I open (e.g. the page I'm typing this on right now) is UTF-8.

Comment 5

13 years ago
(In reply to comment #4)
> intl.charset.default IS set to ISO-8859-1, but every page I open (e.g.
> the page I'm typing this on right now) is UTF-8.

I have tested using 1.5 branch build 20051027 on Windows XP.

If it is a page that you have already opened, the character encoding is cached, so the testing is not valid.

If you are on an XXX-encoded page, open a new window (I have not tested with tabs), type in the URL of a page that does not specify its character encoding, and the resulting page is interpreted as XXX-encoded instead of using the value of intl.charset.default, then you are experiencing bug 158285, and this bug would be a duplicate of it.

Or are you seeing that "View -> Character Encoding" always shows UTF-8 when you open a new blank tab or window? This does not seem to affect how the encoding is selected for a new page. Because it is misleading, it probably warrants its own bug report, if one has not yet been filed.

If the above does not fully cover what you are seeing, please provide a concrete test case using a fresh profile, stating which page and which character encoding setting you start from, since (unfortunately) that choice matters.

Comment 6

12 years ago
I am having a similar problem in the opposite direction:

Wrong character encoding chosen for Unicode XHTML documents.

I am trying to view some DocBook XHTML files, and they are in Unicode.
Viewed with Firefox, I see funny little As with umlauts everywhere,
unless I switch to View->Character Encoding->Unicode.

Unicode (UTF8) is actually the default Encoding I have in my Font preferences. 

Anyway, for some reason the browser keeps switching back into
Western (ISO 8859-1) when I view another page in the same document, or
if I reload. Why does it keep switching back to Western like that?

It seems as if auto-detect determines which encoding is correct and then uses the OTHER one! And further, my default setting in preferences is totally ignored.

The very top of the XHTML page is this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "">
<html xmlns=""><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title> About the Authors </title><link rel="stylesheet" href="oop.css" type="text/css" /><meta name="generator" content="DocBook XSL Stylesheets V1.68.1" />
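
The stray accented As described in this comment are what UTF-8 bytes typically look like when they are read as ISO-8859-1. The Python sketch below is an illustration under that assumption, not a reproduction of the reporter's files; the sample characters and the helper name `misread_as_western` are made up for the demo.

```python
# Illustration only: UTF-8 bytes misread as ISO-8859-1 (Western).
# The sample characters are hypothetical, chosen to show the pattern:
# e-acute, no-break space, right single quotation mark.
samples = ["\u00e9", "\u00a0", "\u2019"]


def misread_as_western(s: str) -> str:
    """Encode as UTF-8, then decode the resulting bytes as ISO-8859-1."""
    return s.encode("utf-8").decode("iso-8859-1")


for s in samples:
    # repr() makes invisible or control characters in the result visible.
    print(repr(s), "->", repr(misread_as_western(s)))
```

Each non-ASCII character turns into two or three characters, and the first of them is usually an accented A, such as U+00C3 or U+00C2.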

Comment 7

12 years ago
One more thing:
it does NOT MATTER ONE BIT whether auto-detect is set to "Off" or "Universal".
Is it possible to switch off auto-detect?

Comment 8

12 years ago
The bug is still unsolved in version
It becomes increasingly important over time for me.
The percentage of pages displayed correctly goes down.
Files on the local file system are rendered incorrectly by default
(doc files from Apache/PHP/MySQL originally in UTF-8 but
converted to ISO-8859-1 and edited locally, to give an annoying example).

I can confirm that not a single configuration option has any influence on the wrong charset selection, not even anything that's accessible via about:config!
(Maybe something in the configuration got messed up, but I'm unable to find any possible reason for that in the forum.)

It is not limited to Windows XP, as shown in the header of this bug report.
I'm using Windows 98.

Switching off auto-detection completely (even though that is impossible at the moment) would not be a useful workaround, since more and more files are encoded in UTF-8, which is in principle a good thing. But not all editors are up to date. UltraEdit, for example, is unusable with Unicode files on Windows 98 at the moment.
It is only a matter of the changeover to UTF-8 encoding, but it is very annoying while in progress.

Comment 9

12 years ago

I realized the reason for this behaviour:

The wrong character encoding is chosen when it is wrongly declared in a
"META HTTP-EQUIV" tag in the HTML file.
This happened because I edited a file locally and the charset was converted, since the editor I used lacks support for the original charset.

So this turns out not to be a problem in Firefox (quite the contrary), only a missing hint in the user documentation.

It would be good to add something to the help file of Firefox and the other Mozilla products. The current text is VERY short and uninformative, although correct in itself:

Character Encoding

        The character encoding selected here will be used to display pages that
        do not specify which encoding to use.

It should get at least this addition:

        Meta tags in HTML documents always override the user-defined encoding.
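
The precedence this comment describes can be sketched in a few lines of Python. This is a simplified illustration, NOT Firefox's actual algorithm: the function name `pick_encoding`, the regular expression, and the 1024-byte scan window are all assumptions made for the example, and it ignores other signals such as HTTP headers.

```python
import re

# Simplified sketch, NOT Firefox's actual algorithm: a charset declared in a
# <meta> tag wins, and the user's default is only used as a fallback.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def pick_encoding(page: bytes, user_default: str) -> str:
    """Return the charset declared in a meta tag, else the user default."""
    match = META_CHARSET.search(page[:1024])  # only scan the start of the page
    if match:
        return match.group(1).decode("ascii")
    return user_default


declared = b'<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />'
undeclared = b"<html><head><title>No charset</title></head></html>"

print(pick_encoding(declared, "ISO-8859-1"))    # the declared value is used
print(pick_encoding(undeclared, "ISO-8859-1"))  # falls back to the default
```

Under this model, a file that was converted to ISO-8859-1 but still carries a `charset=UTF-8` meta tag is decoded as UTF-8 regardless of the user's preference, which matches the behaviour reported here.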

Comment 10

Since the reporter didn't show up again after comment #5, maybe his problem is solved?  Maybe this bug could be closed as INVALID?

Comment 11

Reporter, do you still see this problem with the latest Firefox 2? If not, can you please close this bug as WORKSFORME. Thanks!
Whiteboard: CLOSEME 07/14
Version: unspecified → 1.5.0.x Branch

Comment 12

10 years ago
No response, incomplete. 
Last Resolved: 10 years ago
Resolution: --- → INCOMPLETE