Closed
Bug 240962
Opened 21 years ago
Closed 21 years ago
Charset in Content-Type header is ignored for XHTML
Categories: Core :: DOM: Core & HTML (defect)
Status: VERIFIED FIXED
People: Reporter: supermario, Assigned: bugzilla-mozilla-20000923
Details: 5 keywords
Attachments: 1 file (patch, 8.98 KB): bzbarsky: review+, bzbarsky: superreview+, dbaron: approval1.7+
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7b) Gecko/20040414 Firefox/0.8
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8a) Gecko/20040418 Firefox/0.8
Firefox ignores the charset given in the Content-Type header on XHTML sites
served with the content type "application/xhtml+xml". On the example page, the
Content-Type header is:
Content-Type: application/xhtml+xml; charset=iso-8859-1
But Firefox displays the site as UTF-8.
Reproducible: Always
Steps to Reproduce:
1. Go to www.ego4u.de
2. German umlauts are not displayed correctly.
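For illustration, the charset parameter of a header like the one quoted in this report can be pulled out as sketched below. This is a minimal sketch and not Gecko code: the helper names ExtractCharset and CheckExtractCharset are hypothetical, and the parsing is deliberately simplified (no whitespace around '=', no ';' inside quoted values).

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper, not a Gecko API. Looks for a "charset" parameter in a
// Content-Type header value and stores its lower-cased value in `charset`.
// Returns false if the header carries no usable charset parameter.
// Simplifications: no whitespace around '=', no ';' inside quoted values.
bool ExtractCharset(const std::string& contentType, std::string& charset) {
  // Lower-case a copy so that "Charset=" and "CHARSET=" also match.
  std::string lower(contentType);
  std::transform(lower.begin(), lower.end(), lower.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });

  const std::string key = "charset=";
  // Parameters follow the media type, each introduced by ';'.
  std::string::size_type pos = lower.find(';');
  while (pos != std::string::npos) {
    std::string::size_type start = lower.find_first_not_of(" \t", pos + 1);
    if (start == std::string::npos) break;
    std::string::size_type end = lower.find(';', start);
    std::string param = lower.substr(
        start, end == std::string::npos ? std::string::npos : end - start);
    if (param.compare(0, key.size(), key) == 0) {
      std::string value = param.substr(key.size());
      value.erase(value.find_last_not_of(" \t") + 1);  // trim trailing blanks
      if (value.size() >= 2 && value.front() == '"' && value.back() == '"') {
        value = value.substr(1, value.size() - 2);  // drop surrounding quotes
      }
      if (value.empty()) return false;
      charset = value;
      return true;
    }
    pos = end;
  }
  return false;
}

// Usage example with the header quoted in the report.
void CheckExtractCharset() {
  std::string charset;
  assert(ExtractCharset("application/xhtml+xml; charset=iso-8859-1", charset));
  assert(charset == "iso-8859-1");
  assert(!ExtractCharset("application/xhtml+xml", charset));
}
```

With the header from the example page, the helper yields iso-8859-1, which is the charset the page should have been rendered with.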
Updated•21 years ago
Summary: Charset is ignored on XHTML sites → Charset in Content-Type header is ignored for XHTML
Comment 1•21 years ago
Confirmed with Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a)
Gecko/20040418 Firefox/0.8.0+
Manually changing the character encoding has no effect. UTF-8 is still
selected.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 2•21 years ago
Didn't bz recently fix a bug related to this that might have caused it?
For future reference: charset detection is not done in frontend-specific code,
so this should be in the Browser product. I'm not sure which component, though;
I'll let bz correct it :)
Comment 3•21 years ago
Hmm, bug 240321. Sorry, that was jst, not bz.
Comment 4•21 years ago
Er, yes. Compare
http://lxr.mozilla.org/seamonkey/source/content/html/document/src/nsHTMLDocument.cpp#830
to
http://lxr.mozilla.org/seamonkey/source/content/xml/document/src/nsXMLDocument.cpp#578
We should probably push the TryChannelCharset function on nsHTMLDocument up to
nsDocument or something, and call it from both of those places and from
nsXULDocument::PrepareToLoadPrototype.
jst, do you have time to deal with this? We need at least the XML document part
on the 1.7 branch, since bug 240321 landed there...
Assignee: firefox → general
Component: General → DOM
OS: Windows XP → All
Product: Firefox → Browser
QA Contact: ian
Hardware: PC → All
Version: unspecified → Trunk
Comment 5•21 years ago
Not that anyone here appears to be trying to suggest this, but, make sure we
don't take the <meta http-equiv="Content-Type"> line into account when
determining the charset of an XHTML document.
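The constraint in this comment can be sketched as a small precedence rule. This is only an illustration under stated assumptions, not how Gecko implements it: the names CharsetSource, CharsetHint, ResolveCharset and CheckResolveCharset are hypothetical, the UTF-8 fallback is assumed for the sketch, and other sources such as a byte order mark or the XML declaration are left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical charset sources, ordered from weakest to strongest.
enum class CharsetSource { Default = 0, MetaTag = 1, Channel = 2 };

struct CharsetHint {
  CharsetSource source;
  std::string charset;
};

// Returns the charset of the strongest applicable hint, or "UTF-8" if none
// applies. For XML-based documents, <meta http-equiv> hints are skipped.
std::string ResolveCharset(bool isXml, const std::vector<CharsetHint>& hints) {
  std::string best = "UTF-8";  // assumed fallback for this sketch
  CharsetSource bestSource = CharsetSource::Default;
  for (const CharsetHint& hint : hints) {
    if (hint.charset.empty()) continue;
    if (isXml && hint.source == CharsetSource::MetaTag) continue;
    if (hint.source >= bestSource) {
      best = hint.charset;
      bestSource = hint.source;
    }
  }
  return best;
}

// Usage example: the same <meta> hint is honoured for text/html only.
void CheckResolveCharset() {
  const std::vector<CharsetHint> hints = {
      {CharsetSource::MetaTag, "iso-8859-1"}};
  assert(ResolveCharset(true, hints) == "UTF-8");        // XHTML: meta ignored
  assert(ResolveCharset(false, hints) == "iso-8859-1");  // text/html: meta used
}
```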
Comment 6•21 years ago
And be sure to also test "Manually changing the character encoding", as
mentioned in comment 1 (which seems unaddressed by bz's comment).
Comment 7•21 years ago
Manually changing the character encoding is generally unsupported by XML at the
moment. It's not clear to me that it ever should be supported for XML. No
matter what, that should be in a separate bug filed on intl so people with intl
clue can comment.
> <meta http-equiv="Content-Type">
Handled by the parser, not this code. See
http://lxr.mozilla.org/seamonkey/source/htmlparser/src/nsParser.cpp#2180
Comment 8•21 years ago
I won't be able to patch and test this any time soon... but we do need this
fixed for 1.7. Can someone take this?
Flags: blocking1.7?
Keywords: helpwanted
Updated•21 years ago
Keywords: regression, xhtml
Updated•21 years ago
Flags: blocking1.7? → blocking1.7+
Assignee
Updated•21 years ago
Assignee: general → silver
Assignee
Comment 10•21 years ago
This puts TryChannelCharset into nsDocument, as bz suggested. It also makes
nsHTMLDocument (when IsXHTML()) and nsXMLDocument call it, replacing their
current behaviours (HTML: always use UTF-8; XML: duplicated code to check the
channel charset).
I have tested my patched non-debug Mozilla with XML and XHTML documents sent
without a charset (UTF-8 is used) and sent with charset Windows-1252 in the HTTP
header (Windows-1252 is used). The site http://www.ego4u.de/ (in this bug's URL)
also correctly shows the German umlauts. I can test other scenarios if needed.
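The shape of the change described in this comment can be sketched with simplified stand-in classes. This is not the attached patch and not the Gecko class hierarchy: Channel, Document, XmlDocument, HtmlDocument, StartLoad and CheckSharedHelper are hypothetical names, and only the method name TryChannelCharset and the IsXHTML condition are taken from the comment.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for an HTTP channel; not a Gecko interface.
struct Channel {
  std::string contentCharset;  // empty when the header carried no charset
};

// Stand-in base class that owns the shared helper.
class Document {
 public:
  virtual ~Document() = default;
  const std::string& Charset() const { return mCharset; }

 protected:
  // Adopts the channel charset if there is one; returns true if it did.
  bool TryChannelCharset(const Channel* channel) {
    if (!channel || channel->contentCharset.empty()) return false;
    mCharset = channel->contentCharset;
    return true;
  }

  std::string mCharset = "UTF-8";  // used when the channel gives no charset
};

class XmlDocument : public Document {
 public:
  // Calls the shared helper instead of carrying its own copy of the check.
  void StartLoad(const Channel* channel) { TryChannelCharset(channel); }
};

class HtmlDocument : public Document {
 public:
  explicit HtmlDocument(bool isXhtml) : mIsXhtml(isXhtml) {}

  void StartLoad(const Channel* channel) {
    if (mIsXhtml) {
      TryChannelCharset(channel);  // before the fix: always UTF-8
    }
    // Charset detection for text/html is out of scope for this sketch.
  }

 private:
  bool mIsXhtml;
};

// Usage example mirroring the tested scenarios.
void CheckSharedHelper() {
  Channel none{""};
  Channel win{"windows-1252"};
  XmlDocument xml;
  xml.StartLoad(&none);
  assert(xml.Charset() == "UTF-8");
  HtmlDocument xhtml(true);
  xhtml.StartLoad(&win);
  assert(xhtml.Charset() == "windows-1252");
}
```

Keeping the helper in the base class means both document types share one code path for the channel charset, which is why the change adds no new charset logic.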
Comment 11•21 years ago
Comment on attachment 147502 [details] [diff] [review]
Move TryChannelCharset up to nsDocument, and use in HTML/XML documents.
r+sr=bzbarsky. The testing you did should be just fine. Thank you for picking
this up!
Attachment #147502 - Flags: superreview+
Attachment #147502 - Flags: review+
Assignee
Comment 12•21 years ago
Comment on attachment 147502 [details] [diff] [review]
Move TryChannelCharset up to nsDocument, and use in HTML/XML documents.
This fixes a big regression for XHTML and is low risk (no new charset code,
just existing code moved and used for XHTML).
Attachment #147502 - Flags: approval1.7?
Attachment #147502 - Flags: approval1.7? → approval1.7+
Assignee
Comment 13•21 years ago
Checked in, tinderboxes have cycled.
I believe the patch covered this bug completely (it certainly fixed the site in
question), so I'm marking this bug FIXED. If this is not the case, please reopen
the bug stating why.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Comment 15•21 years ago
Silver, did you mean to misspell your email in the patch and check-in?
Updated•21 years ago
Keywords: fixed1.7 → verified1.7
Updated•6 years ago
Component: DOM → DOM: Core & HTML