Closed Bug 107904 Opened 19 years ago Closed 16 years ago
Problems with XHTML as text/html in HTML tokenizer
From a conversation with harishd on IRC tonight: CParserContext::SetMimeType() isn't setting the XML parsing flag for XHTML served as text/html, and the tokenizer will eventually need that flag to parse XML-style empty tags properly. The solution is probably to modify the doctype-sniffing code upstream of the tokenizer so that XHTML DTDs set both the strict and XML flags. (Actually, layout is handling text/html XHTML in quirks mode right now, but that's another bug.)
Looks like the most likely solution would be to hack nsParser.cpp around line 1105 so that documents with an XHTML doctype are set to, say, aDocType = eXHTML. In "if (aDocType == eXML)", s/XML/XHTML/, and the tokenizer should work correctly. The parse mode will still be set to strict; however, I'm not sure exactly how the parser decides whether to use the HTML or the XML tokenizer, so I don't know how this would affect that mechanism.
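A rough sketch of the doctype-sniffing idea above, written as standalone C++ rather than as a patch. None of these names come from nsParser.cpp: DocType, ParseMode, SniffResult and SniffDoctype are hypothetical, and treating every other non-empty public identifier as strict is a simplification made only for the example. The one detail taken from the XHTML specs is that their public identifiers begin with "-//W3C//DTD XHTML".

```cpp
#include <string>

// Hypothetical stand-ins for the parser's doctype and mode enums.
enum class DocType { HTML, XHTML, XML };
enum class ParseMode { Quirks, Strict };

// What the sniffer would hand to the tokenizer (hypothetical).
struct SniffResult {
    DocType docType;
    ParseMode parseMode;
    bool xmlEmptyTags;  // true if "<br/>"-style tags should be accepted
};

// Classify a document served as text/html by its DOCTYPE public identifier.
SniffResult SniffDoctype(const std::string& publicId) {
    // XHTML public identifiers start with this prefix.
    const std::string xhtmlPrefix = "-//W3C//DTD XHTML";
    if (publicId.compare(0, xhtmlPrefix.size(), xhtmlPrefix) == 0) {
        // XHTML doctype: set both the strict and the XML-style flags.
        return {DocType::XHTML, ParseMode::Strict, true};
    }
    if (publicId.empty()) {
        // No doctype at all: assume quirks (simplification).
        return {DocType::HTML, ParseMode::Quirks, false};
    }
    // Any other doctype: assume strict HTML (simplification).
    return {DocType::HTML, ParseMode::Strict, false};
}
```

With this shape, SniffDoctype("-//W3C//DTD XHTML 1.0 Strict//EN") would report DocType::XHTML with the strict and XML-style flags both set, while an HTML 4.01 identifier would stay DocType::HTML. How the real parser would then pick a tokenizer is left open, as noted above.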
Status: NEW → ASSIGNED
Priority: -- → P3
Target Milestone: --- → mozilla0.9.8
mass move to 1.1
Target Milestone: mozilla1.0 → mozilla1.1
Pardon my ignorance, but is this bug requesting that XHTML documents served as text/html be parsed as XML (as they would if they were served with an XML mime type)?
I should hope not! This is in reference to XML-style empty elements being incompatible with strict SGML parsing: see <URL:http://www.cs.tut.fi/~jkorpela/html/empty.html>.
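To make the incompatibility concrete, here is a toy illustration (not Mozilla code; Token and ReadSlashTag are hypothetical names) of the two readings of "<br/>". It assumes the usual description of SGML SHORTTAG handling, where "<br/" already closes the start tag and the trailing ">" becomes character data, whereas XML treats "<br/>" as a single empty-element tag. It only accepts input of the exact form "<name/>" and does not validate the name.

```cpp
#include <string>
#include <vector>

// Hypothetical token type, for illustration only.
struct Token {
    enum Kind { StartTag, EmptyTag, Text } kind;
    std::string value;
};

// Tokenize input of the exact form "<name/>" under one of two readings.
// Returns an empty vector for any other input.
std::vector<Token> ReadSlashTag(const std::string& src, bool sgmlNet) {
    if (src.size() < 4 || src.front() != '<' ||
        src.compare(src.size() - 2, 2, "/>") != 0) {
        return {};
    }
    // Everything between "<" and "/>" is taken as the element name.
    const std::string name = src.substr(1, src.size() - 3);
    if (sgmlNet) {
        // Strict SGML reading: "<name/" ends the start tag, ">" is text.
        return {Token{Token::StartTag, name}, Token{Token::Text, ">"}};
    }
    // XML reading: one empty-element tag.
    return {Token{Token::EmptyTag, name}};
}
```

Under the SGML reading, ReadSlashTag("<br/>", true) yields a br start tag followed by a stray ">" text token; under the XML reading it yields one empty br element.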
Target Milestone: mozilla1.1alpha → Future
According to the HTMLWG, XHTML sent as text/html must not be parsed any differently than HTML sent as text/html. So this would be WONTFIX, no? I wish we implemented the NET feature...
I'm going to go out on a limb here and wontfix this.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → WONTFIX