Bug 608609 - Firefox may sometimes display a page with the wrong encoding
Status: UNCONFIRMED
Whiteboard: [WFM?]
Product: Core
Classification: Components
Component: Internationalization
Version: 15 Branch
Platform: All All
Importance: -- normal
Assigned To: Nobody; OK to take it and work on it
URL: http://atilf.atilf.fr and beyond
Depends on:
Blocks: 254868
Reported: 2010-10-31 10:23 PDT by André Pirard
Modified: 2015-11-12 22:57 PST

Description André Pirard 2010-10-31 10:23:52 PDT
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.11) Gecko/20101013 Ubuntu/10.04 (lucid) Firefox/3.6.11
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.11) Gecko/20101013 Ubuntu/10.04 (lucid) Firefox/3.6.11

Full, detailed story:  https://bugs.launchpad.net/ubuntu/+bug/228988

Ever since I started using it, I have seen Firefox (2.x and 3.x) *sometimes* display pages using the wrong character encoding. Sometimes a page displays correctly and later the same page does not, for no apparent reason and without any special action on my part. However, I have found a procedure that is reproducible on all my systems, including freshly installed ones (e.g. Ubuntu Karmic as is), but unfortunately not on all other systems (mainly US ones).
Display http://atilf.atilf.fr
Click 'Entrez dans le TLF'
Fill in search field: 'erreur', click 'Valider 1'
Select the word 'ERREUR', a window pops up
Click on 'le TLF1' in that window
The pop-up window invariably displays ISO-8859-1 as UTF-8, as shown in
http://launchpadlibrarian.net/55706907/Screenshot-Mozilla%20Firefox.png

See more details at the Ubuntu launchpad URLs.
Sorry if the questions I was asked there made that report a mess.

Side notes:

It is a bad idea to specify the character encoding of a Web page in the envelope (SMTP or HTTP MIME), because it puts the burden of modifying the page on whoever saves it to a file rather than on whoever wrote the page in the first place.
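
To illustrate the side note above: if the encoding travels only in the HTTP envelope, a program that saves the page has to write that information into the file itself, or it is lost. The following is a minimal sketch of that step, assuming an HTML page held as bytes; the helper name add_meta_charset is hypothetical and this is not a description of what Firefox does when saving.

```python
import re

# Matches an opening <head> tag, with or without attributes.
HEAD_RE = re.compile(rb"<head(\s[^>]*)?>", re.IGNORECASE)
# Rough check for an existing charset declaration near the top of the page.
META_RE = re.compile(rb"<meta[^>]+charset", re.IGNORECASE)


def add_meta_charset(html_bytes: bytes, charset: str) -> bytes:
    """Return the page with a <meta charset> added, unless one is already there.

    `charset` is the value that was only available in the HTTP header.
    """
    if META_RE.search(html_bytes[:1024]):
        return html_bytes  # the page already declares its own encoding
    meta = b'<meta charset="' + charset.encode("ascii") + b'">'
    patched, count = HEAD_RE.subn(lambda m: m.group(0) + meta, html_bytes, count=1)
    if count == 0:
        # No <head> found: prepend the declaration as a fallback.
        return meta + html_bytes
    return patched


page = b"<html><head><title>x</title></head></html>"
saved = add_meta_charset(page, "iso-8859-1")
```

A page that already carries its own declaration passes through unchanged, which is the case the side note argues for.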

Any feature that lets the user correctly display a bad page not only encourages Web authors to keep writing bad pages but also leads users to display good pages the wrong way. The Web is full of horror stories from people who have not understood character sets.


Reproducible: Sometimes

Steps to Reproduce:
1. Display http://atilf.atilf.fr
2. Click 'Entrez dans le TLF'
3. Fill in search field: 'erreur', click 'Valider 1'
4. Select the word 'ERREUR', a window pops up
5. Click on 'le TLF1' in that window
The pop-up window invariably displays ISO-8859-1 as UTF-8, as shown in
http://launchpadlibrarian.net/55706907/Screenshot-Mozilla%20Firefox.png

Actual Results:  
http://launchpadlibrarian.net/55706907/Screenshot-Mozilla%20Firefox.png

Expected Results:  
The same page as it appears after forcing ISO-8859-1.

Thanks.
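
For readers unfamiliar with the symptom reported above (ISO-8859-1 content displayed as UTF-8), here is a minimal sketch of what it looks like at the byte level. The sample word is an assumption chosen for illustration, not text taken from the TLF page, and errors="replace" is used as a rough stand-in for how a browser substitutes undecodable bytes.

```python
def decode_as(data: bytes, encoding: str) -> str:
    """Decode bytes the way a viewer would, replacing undecodable bytes."""
    return data.decode(encoding, errors="replace")


# A French word with an accented letter, stored as ISO-8859-1 (one byte per character).
page_bytes = "définition".encode("iso-8859-1")

correct = decode_as(page_bytes, "iso-8859-1")  # the intended rendering
wrong = decode_as(page_bytes, "utf-8")         # the rendering described in this report

print(correct)
print(wrong)
```

The byte 0xE9 ('é' in ISO-8859-1) is not a valid UTF-8 sequence on its own, so the accented letter is replaced by U+FFFD while the ASCII letters survive.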
Comment 1 Nickolay_Ponomarev 2010-11-27 09:47:34 PST
Assuming that in step 4 by "select" you meant double-click the word in red, I can't reproduce on Mac, either using 3.6.12 or using Firefox 4 nightlies.

This is the page that opens in a frame in the pop-up: http://194.214.124.200/dendien/scripts/tlfiv5/displayp.exe?18;s=atilf_40_2841007530;i=ft-4-2.htm;;
It doesn't send the charset in HTTP headers at all and doesn't specify the charset in the HTML, as far as I can see.
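
A minimal sketch of the check described above, assuming the Content-Type header value and the first bytes of the body are already at hand. The function names and the fallback parameter are hypothetical; the lookup order (header, then meta tag, then a default) is a simplification and not Firefox's actual detection code.

```python
import re
from email.message import Message
from typing import Optional

# Rough pattern for <meta charset=...> and <meta http-equiv ... charset=...>.
META_RE = re.compile(rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE)


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def charset_from_meta(body: bytes) -> Optional[str]:
    """Return the charset declared in a <meta> tag near the top of the page, or None."""
    match = META_RE.search(body[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def effective_charset(content_type: Optional[str], body: bytes, fallback: str) -> str:
    """Header first, then meta tag, then whatever default the viewer falls back to."""
    return charset_from_header(content_type) or charset_from_meta(body) or fallback


# A page like the one described: no charset anywhere, so the fallback decides.
print(effective_charset("text/html", b"<html><body>x</body></html>", "windows-1252"))
```

When neither source declares a charset, the result depends entirely on the fallback, which may differ between systems and profiles.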

Can you reproduce using a clean profile? <http://support.mozilla.com/en-US/kb/Basic+Troubleshooting> With a build from mozilla.org? What's selected in View -> Character Encoding -> Autodetect?
Comment 2 André Pirard 2010-11-28 03:39:14 PST
Hello, Nikolay?
> Assuming that in step 4 by "select" you meant double-click the word in red,
No, by 'select' I mean 'select'.
> I can't reproduce on Mac, neither using 3.6.12, nor using Firefox 4 nightlies.
That's the meaning of 'may', of 'Reproducible: Sometimes' and of the comments on Launchpad. The occurrence seems to depend on the system used, on the page and on some circumstances. I chose this particular example because it always fails for me, although it never does for others. At least 7 people have confirmed having seen something like this. But most people just shudder.
> This is the page that opens in a frame in the pop-up:
> http://194.214.124.200/dendien/scripts/tlfiv5/displayp.exe?18;s=atilf_40_2841007530;i=ft-4-2.htm;;
The error is not in that page.  I said "Click on 'le TLF1' in that window" and the error is in a page like this
http://194.214.124.200/dendien/scripts/tlfiv5/xxx.exe?mthk=1;forced=40;s=2807581110;mot=ERREUR;pot=40,53,41,42,43,44,46;from=atilf,51,2807581110;ru=no;
> It doesn't send the charset in HTTP headers at all and doesn't specify the
> charset in the HTML, as far as I can see.
Correct. No tlfi page does it. And that is why I said about the second page that it "displays ISO8859-1 as UTF-8". That is the problem. It probably affects only pages without a character set declaration.
BTW, specifying the character set in HTTP is a crazy invention because it means that a program storing a page must modify it (and Firefox fails to do that).
> Can you reproduce using a clean profile?
Yes, I have been doing that on request for four years now. Somebody should conclude that the problem should be analyzed, not watched.

I said this. Isn't that what you mean?
> In VirtualBox[es], I have Ubuntu 10.04, 10.10, Windows 1.01, XP and 7 all [but 1.01 ;-)]
> exhibiting the Bug with native Firefox or its latest Ubuntu update.
> Maverick has 3.6.10 and my running Lucid [has] 3.6.11.
These systems were just installed or very little used (just for some installations and tests), Ubuntu ran the Firefox version that came with it and the latest Firefox was installed in Windows.
> <http://support.mozilla.com/en-US/kb/Basic+Troubleshooting>
> With a build from mozilla.org? 
This troubleshooting article is indecent.  Asking naive people to erase all the cookies that were recorded to ease their life and then to make tests in a blank Firefox profile sounds like a bad joke.

In my running system I installed 3.6.12 and I checked that the problem still occurs in a blank profile.
> What's selected in View -> Character Encoding -> Autodetect?
That is a feature that is totally undocumented, that nobody understands [I have some good guesses], that everybody and his dog use to try to solve their problems the wrong way and that shouldn't exist.
Firefox is not supposed to autodetect anything and my setting is most evidently (off).

Shall I say thank you and bye?
Comment 3 André Pirard 2012-10-27 05:35:22 PDT
FYI

1) This bug, that admittedly does not hit everyone, has been confirmed by some 6 persons on Launchpad for Ubuntu.  So, why 'unconfirmed'?

2) I reported it for release 3.x; I have since upgraded to release 15 and the bug is still there.

3) Since then, I have noticed a highly repeatable test: if I open a plain text file (not HTML), Firefox consistently displays it with the Cyrillic code page.

4) Why Cyrillic?
a) because I have it configured in View/Character encoding selections?
b) because I sometimes use Cyrillic?
c) ... ?

5) That may be the reason why it is the code page in which *some* HTML pages are erroneously displayed.

6) There are at least 2 reasons why I may report the bug more than others:
a) not many users of Western characters have Cyrillic configured and use it
b) I don't just shrug

7) I'm available to give more information to a developer or to conduct nondestructive tests, if that's what solving a bug means.
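
As a rough illustration of the plain-text observation in point 3, the sketch below shows how the same bytes read under a Western and under a Cyrillic single-byte code page. The comment does not say which Cyrillic code page Firefox picked, so windows-1251 is an assumption, as is the sample word.

```python
from typing import Dict, Iterable


def show_under(raw: bytes, encodings: Iterable[str]) -> Dict[str, str]:
    """Decode the same bytes under several encodings for side-by-side comparison."""
    return {name: raw.decode(name, errors="replace") for name in encodings}


# A Western European word saved as ISO-8859-1, as a plain text file might contain.
raw = "définition".encode("iso-8859-1")

views = show_under(raw, ["iso-8859-1", "windows-1251"])
for name, text in views.items():
    print(name, text)
```

Unlike the UTF-8 case, the single-byte Cyrillic code page decodes this byte without error: the accented letter simply turns into a Cyrillic letter, so the wrong choice of encoding shows up as stray Cyrillic characters in otherwise readable text.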
