Closed
Bug 582788
Opened 14 years ago
Closed 14 years ago
iso-10646 in meta not rejected as a non-ASCII-superset encoding
Categories
(Core :: DOM: HTML Parser, defect, P2)
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
blocking2.0 | --- | betaN+ |
People
(Reporter: stream, Assigned: hsivonen)
References
()
Details
(Keywords: regression)
Attachments
(1 file)
4.77 KB, patch | bzbarsky: review+
From: https://input.mozilla.com/en-US/search/?product=firefox&sentiment=sad
Some users report that the page http://cda.ipmailing.it/IPMailing/forms/optInForm1.asp "won't show properly" (their words). The page works in Firefox 3.6.8, so this is a regression. I'm not sure about the component, but both the page and its view source are unreadable in Firefox 4 b2.
Comment 1 • 14 years ago (Reporter)
Firefox 3.6.8 detects the page encoding as Western (ISO-8859-1); Firefox 4b2 detects it as UTF-16BE. There is another problem with View Source: I can't change the encoding until the View Source window is reloaded. Should I file a separate bug for this?
Comment 2 • 14 years ago (Reporter)
(In reply to comment #1)
> There is another problem with View source, i cant change the encoding until the
> view source window is reloaded
Anyway, filed bug 582795.
Comment 3 • 14 years ago
This is fallout from the HTML5 parser. I can get the broken behavior in 3.6.8 if I turn that parser on. Note that charsetalias.properties aliases iso-10646 to UTF-16BE. So I'm not quite sure why the old parser didn't use that encoding here... but if I change the meta to say "UTF-16BE" we render the page correctly. So there's some sort of special-casing going on here, and it seems to happen _before_ alias resolution. Should it happen after?
Comment 4 • 14 years ago (Assignee)
(In reply to comment #3)
> So there's some sort
> of special-casing going on here, and it seems to happen _before_ alias
> resolution. Should it happen after?
Looks like it needs to happen after. The code is: http://mxr-test.konigsberg.mozilla.org/mozilla-central/source/parser/html/nsHtml5MetaScannerCppSupplement.h#87
OS: Windows XP → All
Priority: -- → P2
Hardware: x86 → All
Summary: Firefox 4 b2 cant render page with encoding iso-10646 → iso-10646 in meta not mapped to UTF-8
Comment 5 • 14 years ago (Assignee)
I wonder why http://mxr-test.konigsberg.mozilla.org/mozilla-central/source/parser/html/nsHtml5MetaScannerCppSupplement.h#112 isn't working
Comment 6 • 14 years ago (Assignee)
The bug is instead that similar checks are missing here: http://mxr-test.konigsberg.mozilla.org/mozilla-central/source/parser/html/nsHtml5StreamParser.cpp#750
Comment 7 • 14 years ago (Assignee)
Safari seems to special-case UTF-16 without alias resolution, so it doesn't sniff to UTF-8. It then rejects UTF-16BE as a non-ASCII-based encoding, so the default encoding (Windows-1252) kicks in.
Updated • 14 years ago (Assignee)
Summary: iso-10646 in meta not mapped to UTF-8 → iso-10646 in meta not rejected as a non-ASCII-superset encoding
Comment 8 • 14 years ago (Assignee)
I've examined four Web pages that declare iso-10646 in meta. (Thankfully, they are rare.) Three were ASCII. One was Windows-1252. So from this data, it seems we should *not* do alias resolution before the UTF-16 to UTF-8 aliasing step.
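A minimal sketch of the ordering discussed above, not the actual patch. It assumes three steps: literal UTF-16 labels are mapped to UTF-8 before alias resolution, the remaining labels are resolved through an alias table, and anything that resolves to a non-ASCII-superset encoding is rejected so the caller can fall back to its default. All function names and the alias table are hypothetical; only the iso-10646 to UTF-16BE alias comes from comment 3.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <optional>
#include <string>

// Hypothetical helper: ASCII-lowercase a charset label.
static std::string ToLowerAscii(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Hypothetical stand-in for an alias table such as charsetalias.properties.
// Only the iso-10646 entry is taken from this bug; the rest is illustrative.
static std::optional<std::string> ResolveAlias(const std::string& lowered) {
  static const std::map<std::string, std::string> kAliases = {
      {"iso-10646", "UTF-16BE"},
      {"utf-8", "UTF-8"},
      {"windows-1252", "windows-1252"},
  };
  auto it = kAliases.find(lowered);
  if (it == kAliases.end()) return std::nullopt;
  return it->second;
}

// Assumption: for this sketch, only the UTF-16 family counts as
// "not an ASCII superset".
static bool IsAsciiSuperset(const std::string& canonical) {
  return canonical != "UTF-16" && canonical != "UTF-16BE" &&
         canonical != "UTF-16LE";
}

// Returns the encoding to use for a meta declaration, or std::nullopt if the
// declaration should be ignored (the caller then uses its default encoding).
std::optional<std::string> EncodingFromMeta(const std::string& label) {
  const std::string lowered = ToLowerAscii(label);
  // Step 1: literal UTF-16 labels become UTF-8 *before* alias resolution.
  if (lowered == "utf-16" || lowered == "utf-16be" || lowered == "utf-16le") {
    return std::string("UTF-8");
  }
  // Step 2: resolve aliases for everything else.
  std::optional<std::string> canonical = ResolveAlias(lowered);
  if (!canonical) return std::nullopt;
  // Step 3: reject encodings that are not ASCII supersets.
  if (!IsAsciiSuperset(*canonical)) return std::nullopt;
  return canonical;
}
```

Under these assumptions, a literal "UTF-16BE" declaration is treated as UTF-8, while "iso-10646" resolves to UTF-16BE only after step 1 and is therefore rejected rather than honored.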
Comment 9 • 14 years ago (Assignee)
(In reply to comment #7)
> Safari seems to special-case UTF-16 without alias resolution so it doesn't
> sniff to UTF-8.
Chances are I'm misreading Safari's encoding menu. :-(
Comment 10 • 14 years ago (Assignee)
Comment 11 • 14 years ago
Comment on attachment 461189: Unify late meta treatment with prescan treatment, remove UTF-32 to UTF-8 aliasing while at it
This looks fine, but shouldn't all those calls be LowerCaseEqualsLiteral? r=me with that.
Attachment #461189 - Flags: review?(bzbarsky) → review+
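For illustration only: the review comment asks for comparisons that ignore the case of the declared label. A rough standalone equivalent of that kind of check, using a hypothetical function name and standard C++ rather than the Mozilla string classes, might look like this. It assumes the literal passed in is already lowercase.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in: returns true if `value`, with ASCII letters
// lowercased, equals `lowerLiteral`. `lowerLiteral` is assumed to be
// lowercase already; it is not normalized here.
bool LowerCaseEqualsSketch(std::string_view value,
                           std::string_view lowerLiteral) {
  if (value.size() != lowerLiteral.size()) return false;
  for (std::size_t i = 0; i < value.size(); ++i) {
    char c = value[i];
    if (c >= 'A' && c <= 'Z') {
      c = static_cast<char>(c - 'A' + 'a');  // ASCII-only lowercasing
    }
    if (c != lowerLiteral[i]) return false;
  }
  return true;
}
```

With a check of this shape, a meta declaration of "UTF-16BE", "utf-16be", or "Utf-16Be" would all match the same lowercase literal.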
Updated • 14 years ago
blocking2.0: ? → betaN+
Comment 12 • 14 years ago (Assignee)
(In reply to comment #11)
> This looks fine, but shouldn't all those calls be LowerCaseEqualsLiteral? r=me
> with that.
Thanks. Pushed with that change: http://hg.mozilla.org/mozilla-central/rev/98617b5a532b
The corresponding spec change is being tracked as http://www.w3.org/Bugs/Public/show_bug.cgi?id=10260
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED