Closed Bug 815279 Opened 12 years ago Closed 10 years ago

Broken pages: Firefox (unlike Opera/Webkit/Chrome) ignores XML encoding declaration in HTML

Categories

(Core :: DOM: HTML Parser, defect)

17 Branch
defect
Not set
normal

RESOLVED DUPLICATE of bug 673087

People

(Reporter: xn--mlform-iua, Unassigned)

Details

(Keywords: intl)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:17.0) Gecko/17.0 Firefox/17.0
Build ID: 20121119183901

Steps to reproduce:
Open http://www.xn--elqus623b.net/XKCD/1137.html
1. The HTTP response header says Content-Type: text/html
2. There is an XML prolog: <?xml version="1.0" encoding="utf-8" ?>
3. There is no <meta> element.

Actual results:
The page was parsed the way IE parses it: as ISO-8859-1.

Expected results:
The page should be parsed the way Safari/Opera/Chrome (and W3m) parse it: as UTF-8.
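To make the expected behavior concrete, here is a minimal Python sketch of a decoder that honors an XML encoding declaration at the start of a text/html byte stream. It is an illustration only, not the algorithm used by Gecko, WebKit, or Opera. The names `sniff_xml_encoding` and `decode_page`, the 1024-byte scan limit, and the ISO-8859-1 fallback are all assumptions made for this example.

```python
import re
from typing import Optional

# Matches an XML declaration at the very start of the byte stream and
# captures the value of its encoding pseudo-attribute.
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(data: bytes) -> Optional[str]:
    """Return the declared encoding label (lowercased), or None if absent."""
    # Hypothetical limit: only the first 1024 bytes are inspected.
    match = _XML_DECL.match(data[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def decode_page(data: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode with the declared encoding, else with an assumed fallback."""
    return data.decode(sniff_xml_encoding(data) or fallback, errors="replace")


# A page shaped like the one in the report: XML prolog, no <meta>, UTF-8 bytes.
page = '<?xml version="1.0" encoding="utf-8" ?>\n<p>测试</p>'.encode("utf-8")

with_declaration = decode_page(page)              # honors the prolog
ignoring_declaration = page.decode("iso-8859-1")  # what the report describes
```

Here `with_declaration` contains the original characters, while `ignoring_declaration` contains the mojibake described under "Actual results".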
Component: Untriaged → HTML: Parser
Keywords: intl
OS: Mac OS X → All
Product: Firefox → Core
Hardware: x86 → All
Summary: Broken page because Firefox doesn't respect XML prolog in HTML files → Broken pages: Firefox (unlike Opera/Webkit/Chrome) ignores XML encoding declaration in HTML
I ran into a similar issue when testing pages using "File -> Open". It might be OK if the page is loaded from a web server, but the page should look the same regardless of how the encoding is obtained, as long as it is the same encoding (UTF-8 in this case) from wherever the encoding is specified ("charset=" in a Content-Type header - which isn't available for pages loaded from the file system, a directive with encoding="UTF-8", a <meta> tag, etc.).

For a page loaded via File, Open, it appears that Firefox is paying attention to the byte order mark (BOM) but not the encoding directive.

Given the HTML 5 source below (based on a simple example of HTML 5 web page code at http://www.HTML-5.com/tutorials/basic-html-code.html) ...

1) In an editor, do File, Save As and save the file with UTF-8 encoding _with_ the BOM. In Firefox, do File, Open and browse to the saved file. The international characters are displayed correctly.

2) In the editor again, do File, Save As and save the file with UTF-8 encoding but _without_ the BOM. In Firefox, do File, Open and browse to the saved file. The international characters are wrong, presumably because the encoding directive is being ignored.

<?xml version="1.1" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Other TLDs <!-- February 9, 2014 --></title>
<link rel="stylesheet" type="text/css" href="/styles/screen.css"/>
</head>
<body>
<p>测试
<br/>परीक्षा
<br/>испытание
<br/>테스트
<br/>測試
<br/>பரிட்சை
<br/>δοκιμή
<br/>テスト
<br/>טעסט
<br/>زمایشی
<br/>إختبار
</p>
</body>
</html>
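The two save-as cases above can be reproduced outside the browser. The following Python sketch assumes a decoder that looks only at the BOM and otherwise falls back to a legacy single-byte encoding; the helper names `encoding_from_bom` and `decode_local_file` and the windows-1252 fallback are hypothetical choices for illustration, not a description of Firefox internals.

```python
import codecs
from typing import Optional


def encoding_from_bom(data: bytes) -> Optional[str]:
    """Return a codec name if the data starts with a UTF-8 BOM, else None."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # this codec also strips the BOM while decoding
    return None


def decode_local_file(data: bytes, fallback: str = "windows-1252") -> str:
    """BOM-only detection: any XML encoding declaration is ignored."""
    return data.decode(encoding_from_bom(data) or fallback, errors="replace")


text = '<?xml version="1.1" encoding="UTF-8"?>\n<p>δοκιμή</p>'
with_bom = codecs.BOM_UTF8 + text.encode("utf-8")  # case 1: saved with BOM
without_bom = text.encode("utf-8")                 # case 2: saved without BOM

case_1 = decode_local_file(with_bom)     # characters survive
case_2 = decode_local_file(without_bom)  # characters are garbled
```

Under these assumptions `case_1` equals the original text and `case_2` does not, even though both files carry the same encoding="UTF-8" declaration.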
I have the same problem with the page http://www.mathsim.eu/interna/git-use.html

Using Ubuntu 14.04 with Firefox 29.0
Build identifier: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:29.0) Gecko/20100101 Firefox/29.0
The following is supposed to be a non-breaking space character in ISO-8859-1 (0xa0): Â

In every other Mac program it shows as a space. Only Firefox seems to show Â, which is what the UTF-8 encoding of that character (0xC2 0xA0) looks like when it is decoded as ISO-8859-1. I encountered this ticket when looking for the reason.

We tried changing the encoding declaration from
<?xml version="1.1" encoding="UTF-8"?>
to:
<?xml version="1.1" encoding="ISO-8859-1"?>
<?xml version="1.1" encoding="iso-8859-1"?>

We loaded the file both locally and via a server. ÂÂÂÂÂ shows everywhere.

It's part of a fairly complex Java application that produces these files, so it's quite agonizing to sort out, as the encoding can seemingly be set in several places.
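For reference, the stray Â can be reproduced with a few lines of Python. This is only a sketch of the byte-level mechanics; it assumes (without confirmation from the report) that the generated files actually contain the UTF-8 bytes for the non-breaking space and are then decoded as ISO-8859-1.

```python
nbsp = "\u00a0"  # NO-BREAK SPACE

latin1_bytes = nbsp.encode("iso-8859-1")  # a single byte: 0xA0
utf8_bytes = nbsp.encode("utf-8")         # two bytes: 0xC2 0xA0

# Decoding the UTF-8 bytes as ISO-8859-1 yields "Â" followed by a
# non-breaking space, which is the symptom described above.
misread = utf8_bytes.decode("iso-8859-1")

# The opposite mismatch (a lone 0xA0 byte decoded as UTF-8) is invalid
# UTF-8 and produces the replacement character instead of "Â".
opposite = latin1_bytes.decode("utf-8", errors="replace")
```

Checking the raw bytes of the generated file (0xA0 alone versus 0xC2 0xA0) would show which side of the pipeline, the Java application or the browser, is applying the unexpected encoding.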
Status: UNCONFIRMED → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE