User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.11) Gecko/20101013 Ubuntu/10.04 (lucid) Firefox/3.6.11
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.11) Gecko/20101013 Ubuntu/10.04 (lucid) Firefox/3.6.11

Full, detailed story: https://bugs.launchpad.net/ubuntu/+bug/228988

Ever since I started using it, I have seen Firefox (2.x and 3.x) *sometimes* display pages using the wrong character encoding. A page displays correctly at one time and incorrectly later, with no apparent reason and no special action on my part.

However, I have found a procedure (see Steps to Reproduce) that reproduces the problem on all my systems, including freshly installed ones such as a stock Ubuntu Karmic, but unfortunately not on all other systems (mainly US ones).

See more details at the Ubuntu Launchpad URLs. Sorry if the questions I was asked there made that report a mess.

Side notes:
It is a bad idea to specify the character encoding of a Web page in the envelope (SMTP or HTTP MIME), because it puts the burden of modifying the page on whoever stores it to a file rather than on whoever wrote the page in the first place.
Any feature that lets the user correctly display a bad page not only encourages Web authors to keep writing bad pages but also leads users to display good pages the wrong way.
The Web is full of horror stories caused by people who have not understood character sets.

Reproducible: Sometimes

Steps to Reproduce:
1. Display http://atilf.atilf.fr
2. Click 'Entrez dans le TLF'
3. Fill in the search field: 'erreur', click 'Valider 1'
4. Select the word 'ERREUR', a window pops up
5. Click on 'le TLF1' in that window

The pop-up window invariably displays ISO8859-1 as UTF-8, as shown in http://launchpadlibrarian.net/55706907/Screenshot-Mozilla%20Firefox.png

Actual Results:
http://launchpadlibrarian.net/55706907/Screenshot-Mozilla%20Firefox.png

Expected Results:
same after forcing ISO8859-1

Thanks.
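For readers unfamiliar with the symptom, here is a minimal Python sketch of what "ISO8859-1 displayed as UTF-8" means at the byte level. The sample word is an assumption chosen for illustration and is not taken from the TLF page; the sketch only shows the decoding mismatch, not how Firefox chooses an encoding.

```python
# Illustration only: the sample word is an assumption, not text from the TLF page.
text = "définition"

# Bytes as a server would send them for an ISO-8859-1 page with no charset declared.
raw = text.encode("iso-8859-1")

# What a reader sees if the bytes are wrongly interpreted as UTF-8:
# the lone byte 0xE9 is not a valid UTF-8 sequence, so it becomes U+FFFD.
wrong = raw.decode("utf-8", errors="replace")

# What a reader sees once ISO-8859-1 is forced.
right = raw.decode("iso-8859-1")

print(repr(wrong))
print(repr(right))
```

The accented letter is the only character affected, which matches the kind of damage visible in the screenshot: plain ASCII text survives while accented characters turn into replacement glyphs.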
Assuming that in step 4 by "select" you meant double-click the word in red, I can't reproduce on Mac, neither using 3.6.12, nor using Firefox 4 nightlies.

This is the page that opens in a frame in the pop-up:
http://18.104.22.168/dendien/scripts/tlfiv5/displayp.exe?18;s=atilf_40_2841007530;i=ft-4-2.htm;;
It doesn't send the charset in HTTP headers at all and doesn't specify the charset in the HTML, as far as I can see.

Can you reproduce using a clean profile?
<http://support.mozilla.com/en-US/kb/Basic+Troubleshooting>
With a build from mozilla.org?

What's selected in View -> Character Encoding -> Autodetect?
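The check described above (no charset in the HTTP headers, none in the HTML) can be sketched in Python as follows. The helper name `declared_charsets` and the regular expression are assumptions made for this illustration; the function works on a header value and a body that have already been fetched, and it is a rough scan, not a full HTML parser.

```python
import re
from email.message import Message
from typing import Optional, Tuple

# Rough pattern for <meta ... charset=...>; an approximation, not a real HTML parser.
META_CHARSET = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def declared_charsets(
    content_type: Optional[str], body: bytes
) -> Tuple[Optional[str], Optional[str]]:
    """Return (charset from the Content-Type header, charset from a meta tag)."""
    http_charset = None
    if content_type:
        # Reuse the stdlib MIME header parser to read the charset parameter.
        msg = Message()
        msg["Content-Type"] = content_type
        http_charset = msg.get_content_charset()

    # Only look at the start of the document, where a meta tag would normally be.
    match = META_CHARSET.search(body[:1024])
    meta_charset = match.group(1).decode("ascii").lower() if match else None
    return http_charset, meta_charset


# A page like the one described: neither source declares an encoding.
print(declared_charsets("text/html", b"<html><head><title>TLF</title></head></html>"))
```

When both values are `None`, the browser has no declaration to rely on and falls back to its own defaults or detection settings, which is the situation under discussion here.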
Hello Nikolay?

> Assuming that in step 4 by "select" you meant double-click the word in red,

No, by 'select' I mean 'select'.

> I can't reproduce on Mac, neither using 3.6.12, nor using Firefox 4 nightlies.

That's the meaning of 'may', 'Reproducible: Sometimes' and the comments on Launchpad. The occurrence seems to depend on the system used, on the page and on some circumstances. I chose this particular example because it always fails for me, though not for everyone else. At least 7 people have confirmed having seen something like this. But most people just shrug.

> This is the page that opens in a frame in the pop-up:
> http://22.214.171.124/dendien/scripts/tlfiv5/displayp.exe?18;s=atilf_40_2841007530;i=ft-4-2.htm;;

The error is not in that page. I said "Click on 'le TLF1' in that window", and the error is in a page like this:
http://126.96.36.199/dendien/scripts/tlfiv5/xxx.exe?mthk=1;forced=40;s=2807581110;mot=ERREUR;pot=40,53,41,42,43,44,46;from=atilf,51,2807581110;ru=no;

> It doesn't send the charset in HTTP headers at all and doesn't specify the
> charset in the HTML, as far as I can see.

Correct. No tlfi page does. And that is why I said about the second page that it "displays ISO8859-1 as UTF-8". That is the problem. It probably affects only pages without a character set declaration. BTW, specifying the character set in HTTP is a crazy invention, because it means that a program storing a page must modify it (and Firefox fails to do that).

> Can you reproduce using a clean profile?

Yes, I have been doing that on request for four years now. At some point somebody should conclude that the problem needs to be analyzed, not watched. I already said this; isn't it what you mean?

> In VirtualBox[es], I have Ubuntu 10.04, 10.10, Windows 1.01, XP and 7 all [but 1.01 ;-)]
> exhibiting the Bug with native Firefox or its latest Ubuntu update.
> Maverick has 3.6.10 and my running Lucid [has] 3.6.11.

These systems were freshly installed or very little used (just for some installations and tests). Ubuntu ran the Firefox version that came with it, and the latest Firefox was installed in Windows.

> <http://support.mozilla.com/en-US/kb/Basic+Troubleshooting>
> With a build from mozilla.org?

This troubleshooting article is indecent. Asking naive people to erase all the cookies that were recorded to ease their life and then to run tests in a blank Firefox profile sounds like a bad joke. On my running system I installed 3.6.12 and checked that the problem still occurs in a blank profile.

> What's selected in View -> Character Encoding -> Autodetect?

That is a feature that is totally undocumented, that nobody understands [I have some good guesses], that everybody and his dog use to try to solve their problems the wrong way, and that shouldn't exist. Firefox is not supposed to autodetect anything, and my setting is, most evidently, (off).

Shall I say thank you and bye?
FYI

1) This bug, which admittedly does not hit everyone, has been confirmed by some 6 persons on Launchpad for Ubuntu. So, why 'unconfirmed'?
2) I reported it for release 3.x; I upgraded to release 15 and the bug is still there.
3) Since then, I have noticed a highly repeatable test: if I open a plain text file (not HTML), Firefox consistently displays it with the Cyrillic codepage.
4) Why Cyrillic?
   a) because I have it configured in the View/Character encoding selections?
   b) because I sometimes use Cyrillic?
   c) ... ?
5) That may be the reason why it is the codepage in which *some* HTML pages are erroneously displayed.
6) There are at least 2 reasons why I may report the bug more than others:
   a) not many users of Western characters have Cyrillic configured and use it
   b) I don't just shrug
7) I'm available to give more information to a developer or to conduct nondestructive tests, if that's what solving a bug means.
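To make the plain-text observation concrete, the sketch below decodes the same bytes under several candidate encodings. The sample text, the helper name `decode_candidates` and the choice of windows-1251 as "the Cyrillic codepage" are all assumptions for illustration; the report does not say which Cyrillic encoding Firefox picked.

```python
# Hypothetical sample: accented French text stored as ISO-8859-1 bytes,
# as it might sit in a plain text file with no encoding information.
raw = "erreur détectée".encode("iso-8859-1")


def decode_candidates(data: bytes, encodings):
    """Decode the same bytes under each candidate encoding for comparison."""
    return {name: data.decode(name, errors="replace") for name in encodings}


# windows-1251 is an assumed stand-in for "the Cyrillic codepage".
views = decode_candidates(raw, ["iso-8859-1", "windows-1251", "utf-8"])
for name, text in views.items():
    print(f"{name:>12}: {text}")
```

Under a Cyrillic single-byte encoding every byte is valid, so accented Latin letters silently become Cyrillic letters instead of replacement glyphs. A file that shows Cyrillic letters where accents should be therefore points to a Cyrillic default or fallback rather than to a UTF-8 misreading.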