Closed Bug 240962 Opened 21 years ago Closed 21 years ago

Charset in Content-Type header is ignored for XHTML

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
major

VERIFIED FIXED

People

(Reporter: supermario, Assigned: bugzilla-mozilla-20000923)

Details

(5 keywords)

Attachments

(1 file)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7b) Gecko/20040414 Firefox/0.8
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8a) Gecko/20040418 Firefox/0.8

FF ignores given Charset on XHTML sites served with content type "application/xhtml+xml". On the example page, the content type header is:

Content-Type: application/xhtml+xml; charset=iso-8859-1

But FF displays the site with charset UTF-8.

Reproducible: Always

Steps to Reproduce:
1. go to www.ego4u.de
2. you can't see German-Umlauts
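For readers unfamiliar with the header involved: the charset is a parameter on the Content-Type value quoted in the report. The following is a minimal sketch of pulling that parameter out of a header string. ExtractCharset is a hypothetical helper written for illustration only, not Gecko's code (Gecko reads the value from the channel), and it does not handle a quoted value that itself contains a semicolon.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not a Gecko API): copies the charset parameter of a
// Content-Type header value into aCharset and returns true, or returns false
// if the header carries no charset parameter.
bool ExtractCharset(const std::string& aContentType, std::string& aCharset) {
  // Work on a lower-cased copy so "Charset=" and "charset=" both match.
  std::string lower(aContentType);
  std::transform(lower.begin(), lower.end(), lower.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

  const std::string key = "charset=";
  std::string::size_type pos = lower.find(';');
  while (pos != std::string::npos) {
    // Skip the ';' and any whitespace in front of the parameter name.
    std::string::size_type start = lower.find_first_not_of(" \t", pos + 1);
    if (start == std::string::npos) {
      break;
    }
    if (lower.compare(start, key.size(), key) == 0) {
      std::string::size_type valueStart = start + key.size();
      std::string::size_type valueEnd = lower.find(';', valueStart);
      std::string value = aContentType.substr(
          valueStart,
          valueEnd == std::string::npos ? std::string::npos : valueEnd - valueStart);
      // Trim trailing whitespace, then strip one pair of surrounding quotes.
      while (!value.empty() && (value.back() == ' ' || value.back() == '\t')) {
        value.pop_back();
      }
      if (value.size() >= 2 && value.front() == '"' && value.back() == '"') {
        value = value.substr(1, value.size() - 2);
      }
      if (value.empty()) {
        return false;
      }
      aCharset = value;
      return true;
    }
    pos = lower.find(';', start);
  }
  return false;
}
```

With the header from this report, the helper yields "iso-8859-1", which is the encoding the page should have been displayed in.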
Summary: Charset is ignored on XHTML sites → Charset in Content-Type header is ignored for XHTML
Confirmed with Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a) Gecko/20040418 Firefox/0.8.0+

Manually changing the character encoding has no effect. UTF-8 is still selected.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Didn't bz recently fix a bug related to this that might've caused this?

For future reference: charset detection is not done in frontend-specific code, so this should be in the Browser product. I'm not sure which component, though, so I'll let bz correct it :)
hmm, bug 240321. sorry, that was jst, not bz
Er, yes. Compare
http://lxr.mozilla.org/seamonkey/source/content/html/document/src/nsHTMLDocument.cpp#830
to
http://lxr.mozilla.org/seamonkey/source/content/xml/document/src/nsXMLDocument.cpp#578

We should probably push the TryChannelCharset function on nsHTMLDocument up to nsDocument or something and call it from both those places and from nsXULDocument::PrepareToLoadPrototype.

jst, do you have time to deal with this? At least the XML document part we need on the 1.7 branch, since bug 240321 landed there...
Assignee: firefox → general
Component: General → DOM
OS: Windows XP → All
Product: Firefox → Browser
QA Contact: ian
Hardware: PC → All
Version: unspecified → Trunk
Not that anyone here appears to be trying to suggest this, but, make sure we don't take the <meta http-equiv="Content-Type"> line into account when determining the charset of an XHTML document.
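A rough sketch of the precedence being asked for, under stated assumptions: CharsetHints and ChooseCharset are hypothetical names, an empty string means "not supplied", and the BOM and XML declaration (which an XML parser would also consult) are left out for brevity. The only point illustrated is that the <meta> value is never considered when the document is served as XML.

```cpp
#include <string>

// Hypothetical inputs for illustration; an empty string means "not supplied".
struct CharsetHints {
  std::string httpCharset;  // charset parameter of the Content-Type header
  std::string metaCharset;  // charset from <meta http-equiv="Content-Type">
};

// Sketch only: picks an encoding label for a document.
// aServedAsXml is true for documents sent as application/xhtml+xml.
std::string ChooseCharset(bool aServedAsXml, const CharsetHints& aHints,
                          const std::string& aHtmlFallback) {
  // A charset in the HTTP header is honoured in both modes.
  if (!aHints.httpCharset.empty()) {
    return aHints.httpCharset;
  }
  // XML path: the <meta> element is deliberately ignored. This sketch goes
  // straight to the XML default and skips BOM / XML declaration handling.
  if (aServedAsXml) {
    return "UTF-8";
  }
  // text/html path: <meta> may be consulted, then a caller-supplied fallback.
  if (!aHints.metaCharset.empty()) {
    return aHints.metaCharset;
  }
  return aHtmlFallback;
}
```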
And be sure to also test "Manually changing the character encoding" as mentioned in comment 1 (which seems unaddressed by bz's comment)
Manually changing the character encoding is generally unsupported by XML at the moment. It's not clear to me that it ever should be supported for XML. No matter what, that should be in a separate bug filed on intl so people with intl clue can comment.

> <meta http-equiv="Content-Type">

Handled by the parser, not this code. See http://lxr.mozilla.org/seamonkey/source/htmlparser/src/nsParser.cpp#2180
I won't be able to patch and test this any time soon... but we do need this fixed for 1.7. Can someone take this?
Flags: blocking1.7?
Keywords: helpwanted
Keywords: regression, xhtml
Flags: blocking1.7? → blocking1.7+
#1 bug has been filed to bug 234628.
Blocks: 234628
Keywords: intl
Assignee: general → silver
This puts TryChannelCharset into nsDocument, as bz suggested. It also makes nsHTMLDocument (when IsXHTML()) and nsXMLDocument call it, instead of their current behaviours (HTML: always use UTF-8, XML: duplicated code to check channel charset). I have tested my patched non-debug Mozilla with XML and XHTML documents sent without a charset (uses UTF-8) and sent with charset Windows-1252 in the HTTP header (uses Windows-1252). The site http://www.ego4u.de/ (in this bug's URL) also correctly shows the German-Umlauts. I can test other scenarios if needed.
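The shape of the change described above can be sketched roughly as follows. This is a simplified illustration, not the attached patch: Channel, Document, XMLDocument and HTMLDocument are stand-in types, and the TryChannelCharset signature shown here is an assumption made for the example rather than the real Gecko one.

```cpp
#include <string>

// Hypothetical stand-in for a network channel; an empty string means the
// server sent no charset in the Content-Type header.
struct Channel {
  std::string contentCharset;
};

class Document {
 public:
  virtual ~Document() = default;
  const std::string& Charset() const { return mCharset; }

 protected:
  // Shared helper living on the base class. Returns true and overwrites
  // aCharset only when the channel supplied a charset.
  static bool TryChannelCharset(const Channel& aChannel, std::string& aCharset) {
    if (aChannel.contentCharset.empty()) {
      return false;
    }
    aCharset = aChannel.contentCharset;
    return true;
  }

  std::string mCharset = "UTF-8";
};

class XMLDocument : public Document {
 public:
  void StartLoad(const Channel& aChannel) {
    mCharset = "UTF-8";                     // default when nothing else is known
    TryChannelCharset(aChannel, mCharset);  // header charset overrides it
  }
};

class HTMLDocument : public Document {
 public:
  explicit HTMLDocument(bool aIsXHTML) : mIsXHTML(aIsXHTML) {}

  void StartLoad(const Channel& aChannel) {
    if (mIsXHTML) {
      // Previously this branch always used UTF-8; now it shares the helper.
      mCharset = "UTF-8";
      TryChannelCharset(aChannel, mCharset);
    }
    // The plain-HTML path (meta tags, detectors and so on) is omitted here.
  }

 private:
  bool mIsXHTML;
};
```

In this sketch, a document loaded from a channel with no charset stays on UTF-8, and one loaded with Windows-1252 in the header ends up as Windows-1252, matching the two cases tested above.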
Comment on attachment 147502
Move TryChannelCharset up to nsDocument, and use in HTML/XML documents.

r+sr=bzbarsky. The testing you did should be just fine. Thank you for picking this up!
Attachment #147502 - Flags: superreview+
Attachment #147502 - Flags: review+
Comment on attachment 147502
Move TryChannelCharset up to nsDocument, and use in HTML/XML documents.

This fixes a big regression for XHTML and has a low risk (no new charset code, just moved and used for XHTML).
Attachment #147502 - Flags: approval1.7?
Checked in, tinderboxes have cycled. I believe the patch covered this bug completely (it certainly fixed the site in question), so I'm marking this bug FIXED. If this is not the case, please reopen the bug stating why.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
v.
Status: RESOLVED → VERIFIED
Silver, did you mean to misspell your email in the patch and check-in?
Keywords: fixed1.7
Keywords: fixed1.7 → verified1.7
Component: DOM → DOM: Core & HTML