Closed Bug 155723 Opened 22 years ago Closed 11 years ago

innerHTML will need to be fixed to work with XHTML

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: bzbarsky, Unassigned)

References

Details

(Whiteboard: [has landed, needs testing?])

Attachments

(4 files, 1 obsolete file)

With the fix for bug 125746, we will need to do some changes to innerHTML to make it work right with XHTML elements (eg, CDATA kids of textareas are not well supported by the current setup).
Um, CDATA kids of textareas? Sorry folks but when XHTML is served with the correct MIME type (application/xhtml+xml), innerHTML does not work *at all*...
Yeah, that's bug 133827. I plan to fix it sometime in the not-too-far future. That's why I used the future tense when filing this bug -- "will need to fix" instead of "need to fix".
QA Contact: lchiang → ian
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6b) Gecko/20031119 <br /> are transformed into <br> when I look at the innerhtml value. That's maybe not the best place to ask but why is the html code transformed ? I see that IE also transforms that code (and of course the way to transform it is different :( ).
I agree. The innerHTML should fit the page's DOCTYPE. IE is already far behing as it converts every tag to capitals, alphabetizes all html tag attributes, and even unquotes non-numeric values. I think we should fix this so we can remain forward compatible. M$ dug their own grave :p. I also think that since an XHTML document would have proper tag endings, it would benefit the browser in speed when working with innerHTML.
Attached patch tentative patchSplinter Review
Reasons for the changes: Change in ConfirmPrefix(): nsXMLContentSerializer::ConfirmPrefix(nsAString& aPrefix, const nsAString& aURI) { - if (aPrefix.Equals(kXMLNS)) { + if (aPrefix.Equals(kXMLNS) || + (aPrefix.Equals(kXMLPrefix) && aURI.Equals(kXMLNameSpaceURI))) { return PR_FALSE; } needed to stop, e.g., <p xml:lang="en"> from becoming <p xml:lang="en" xlmns:xml="http://www.w3.org/XML/1998/namespace"> (i.e., xml: is built-in and doesn't need to be declared) ============= Change in nsGenericHTMLElement::SetAttrAndNotify() + rv = mNodeInfo->NodeInfoManager()->GetNodeInfo(aAttribute, aPrefix, aNamespaceID, getter_AddRefs(ni)); Needed because the prefix (e.g. on xml:lang) is lost with XHTML elements for no apparent reasons. This has been a bug all along. But it does no harm because nobody has been using the prefix so far. Consumers relies on the namespaceID which is kept properly. For serialization purposes however, the prefix is not optional, it is needed. EXAMPLE OF OUTPUT with the patch ================================= The output isn't pretty with the patch however... In its eagerness to produce a self-contained output, the XML serializer gives an output with "generated" prefices... For example, in my internal testing with view-selection-source somewhere: <a0:body xmlns:a0="http://www.w3.org/1999/xhtml"> <a0:span lang="test">lang</a0:span> <a0:br/> break <a0:br/> </a0:body> =========== It is okay otherwise. But such an output will make people wonder was is going on.
> Needed because the prefix (e.g. on xml:lang) is lost with XHTML elements for no > apparent reasons. sicking? What's up with that? rbs, there is also the SetInnerHTML issue, though bug 133827 covers most of that... once that's fixed, we'll have to see what remains to be done. For example, what if SetInnerHTML is passed a string that's not well-formed?
Depends on: 133827
> what if SetInnerHTML is passed a string that's not well-formed? Seems fair to fire a DOM error in the usual DOM way. (i.e., only commit changes to the DOM if everything is ok; preserve things as they were before an invalid SetInnerHTML).
Yeah, i noticed that we're loosing the prefix the other day too. It's actually fairly bad since it'll affect callers of 'setAttribute'. Same problem exists in XUL too.
Bug 45627 was the last time that a major surgery happened to the XML serializer. Cc:ing heikki and jst. I wonder if the "generated prefixes" can go away -- c.f. comment #5. Heikki was apparently puzzled by |ConfirmPrefix| too and ignored it at some point, but re-enabled it again, writing that it was needed for scripts. But he left it in the dark the problem that it solved? Also, jst mentioned that things might be different with nsINodeInfo. Well, nsINodeInfo is the foundation now, any thoughts on how to prettify the output -- beyond what is in comment #5 above?
*** Bug 240652 has been marked as a duplicate of this bug. ***
The change in comment 5 for nsGenericHTMLElement::SetAttrAndNotify() to get the real prefix has been checked in bug 248172.
Depends on: 248172
(In reply to comment #9) > Bug 45627 was the last time that a major surgery happened to the XML serializer. > Cc:ing heikki and jst. I wonder if the "generated prefixes" can go away -- c.f. > comment #5. > > Heikki was apparently puzzled by |ConfirmPrefix| too and ignored it at some > point, but re-enabled it again, writing that it was needed for scripts. But he > left it in the dark the problem that it solved? Also, jst mentioned that things > might be different with nsINodeInfo. Well, nsINodeInfo is the foundation now, > any thoughts on how to prettify the output -- beyond what is in comment #5 above? The problem that generated prefixes fix is that anyone can insert nodes through the DOM that have a namespace URI and a random or empty or previously existing prefix that's totally unrelated to the prefixes declared at that point through xmlns attributes. There's other ways to solve this, but we always have to keep track of the prefix->namespace URI mapping. I think your example in comment 5 could be improved by checking for an empty prefix and use a default namespace instead of generating a prefix if there hasn't been a default namespace set before. Or you could go all out and always map the prefix from the DOM, it'll lead to surprising behaviour in certain edge cases but I'm not sure we care and I think it would work.
Note that you'll still need to handle conflicting prefixes (inserting http://foo/|x:foo and http://bar/|x:bar attributes in one element for example).
Conflicting attributes are pretty rare anyway -- but yeah, when they occur we need to (probably arbitrarily, since I doubt we can know which one came first) change one of the prefixes on output.
Blocks: 251695
This doesn't assume that non-HTML documents have to be XHTML. You can have other documents (eg svg, etc)... Probably doesn't matter much for serialization purposes, but.... First issue: The XML serializer is pretty terrible. With this patch, if I take: <div id="thetarget" onclick="clickHandler()">Click me <br/> More text here <span>And more</span> <img /></div> the innerHTML of the <div/> is: Click me <a0:br xmlns:a0="http://www.w3.org/1999/xhtml"/> More text here <a1:span xmlns:a1="http://www.w3.org/1999/xhtml">And more</a1:span> <a2:img xmlns:a2="http://www.w3.org/1999/xhtml"/> The code really needs to be better at not outputing redundant namespace URIs.... and perhaps at using xmlns="whatever" in preference to making up random prefixes for nodes that started out without prefix.
BTW, these redundant namespace declarations and prefixes often break rendering. There are many users out there that serialize i.e. transformation result trees to (X)HTML and the browser simply doesn't render that properly with the redundant prefixes/declarations. Could be the chrome CSS, i dont really know. i have even reccomended the xml method without namespaces in the result tree! The browser will render that fine when the result tree has no namespace and the host document is XHTML which is actually wrong behaviour. Just wanted to also say a better serializer is needed, but *please* dont change the interfaces in a non backwards compatible way. Thanks for listening.
While I agree about the prefixes I wonder which namespace declaration is redundant in Click me <a0:br xmlns:a0="http://www.w3.org/1999/xhtml"/> More text here <a1:span xmlns:a1="http://www.w3.org/1999/xhtml">And more</a1:span> <a2:img xmlns:a2="http://www.w3.org/1999/xhtml"/>
Ah, oops. You're right that none of those tags are nested... The real testcase should be: <div id="thetarget" onclick="clickHandler()"><span>And more<img /> </span> </div> which gives: <a0:span xmlns:a0="http://www.w3.org/1999/xhtml">And more<a0:img/> </a0:span> By using default namespaces, we could reduce the number of namespace prefixes hanging about considerably, I think. Manos, the XML generated here is valid (if ugly), so if there is a problem with rendering it please file a bug on it (and cc me).
Ok, submitted bug 263144. IMHO this should be fixed via innerHTML as I also suggest and try to back up there.
Depends on: 263225
I filed bug 263225, with patch, on the unnecessary namespace prefixes the serializer creates.
I did some research on this. This is NOT a bug. It is valid according to the XHTML 1.1 standards. Say you have valid XHTML 1.1, then someone does this: object.innerHTML = '<b><i></b></i>', you are now officially breaking the most fundamental of all standards rules. Your standards are completely broken. This is critical for two reasons: 1) XHTML 1.1 is XML and 2) XHTML 1.1 is sent with application/xhtml+xml ONLY(not text/html!!), thus it is parsed by the browser. Having it parsed properly in it's static structure and then destroying it in the dynamic structure is inappropriate. IE team promotes the ability to write to innerHTML, the Mozilla team promotes standards. The two a mutually exclusive. If you need similar functionality, then use the DOM to create the elements. Please close this bug.
> It is valid according to the XHTML 1.1 standards. XHTML has nothing to say about this extension to the DOM. > then someone does this: object.innerHTML = '<b><i></b></i>' Mozilla throws a DOM_SYNTAX_ERR exception if you try to do this in XHTML. As you would know if you'd, say, tried it. > Please close this bug. Please do your research on the code you're commenting on, not on unrelated specifications, before commenting on bugs.
Well, the truth will prove itself in the end. XHTML has nothing to do with the DOM. PERIOD(well, you know that). That's why we use the DOM to get the functionality that XHTML cannot provide. Two separate worlds which can live in harmony. You know this, you sound like a serious pro. Probably better than me in some areas, but I respecfully think you are flawed here. It throws an exception to make sure you do not violate the standards. This is very apprpriate. I wish IE did that. In fact in my .NET code, in all my custom libraries. If you try to violate a standard. BAM. Exception it thrown. Mozilla is forcing you to be proper. Please thank them. I found this out the hard way by switching to XHTML1.1 with application/xhtml+xml. I immediately posted in all the bug forums about this "SERIOUS" bug and that it "NEEDS" to be fixed. Well, then I grew up and recanted of that. I rewrote my code to be more standards based. The DOM helps you do that. Please don't be a jerk, this is not a forum. This is not a bug, it is a complaint about convienence which I admit that I was party to, But I retract my statement on calling this a bug and I wish I could remove my bug reports from a few websites. It was completely immature of me to do that. You're right. I should have done my research first. I'm sorry. End of Thread. (In reply to comment #23) > > It is valid according to the XHTML 1.1 standards. > > XHTML has nothing to say about this extension to the DOM. > > > then someone does this: object.innerHTML = '<b><i></b></i>' > > Mozilla throws a DOM_SYNTAX_ERR exception if you try to do this in XHTML. As > you would know if you'd, say, tried it. > > > Please close this bug. > > Please do your research on the code you're commenting on, not on unrelated > specifications, before commenting on bugs.
David, I think you misunderstand what this bug is about. It's about two issues, mostly: 1) _Reading_ the innerHTML property is broken in XHTML in some cases (while writing works). 2) Both reading and writing are broken for certain (X)HTML node classes that are parsed "specially" in HTML (but not in XHTML, hence it being broken). Please feel free to send me mail privately if you have any more questions about this bug, but don't add any more long misguided comments here, ok? Certainly not ones in which you quote an entire previous comment. As you said, this is not a message board. And this bug is not exactly a discussion. It's just a way to track development filed based on a discussion about innerHTML that happened back in 2002 between one Mozilla developer (me) and another Mozilla developer (the bug's assignee).
Comment on attachment 161091 [details] [diff] [review] Another partial patch I think we should just take this patch for 1.8. It makes things much better, though possibly not completely perfect...
Attachment #161091 - Flags: superreview?(jst)
Attachment #161091 - Flags: review?(bugmail)
Flags: blocking1.8b4?
Flags: blocking1.8b4? → blocking1.8b4+
Comment on attachment 161091 [details] [diff] [review] Another partial patch sr=jst
Attachment #161091 - Flags: superreview?(jst) → superreview+
Comment on attachment 161091 [details] [diff] [review] Another partial patch Requesting 1.8b4 approval. This just makes the innerHTML serialize as the MIME type of the document (so as XML for XML documents, basically). HTML documents are not affected, but for XHTML/XML we get a string that can be passed to the innerHTML setter and not cause an XML parse error...
Attachment #161091 - Flags: approval1.8b4?
Attachment #161091 - Flags: approval1.8b4? → approval1.8b4+
bz, this causes lonely elements from media documents (such as from view image) to fail because there are no serializers for img/gif & etc. Suggesting to add this fallback (it would be more consistent to use the original containing mimetype where these elements came from): [...] + if (!docEncoder) +- docEncoder = do_CreateInstance(..."text/html"...)... NS_ENSURE_TRUE(docEncoder, NS_ERROR_FAILURE);
Er... people are doing innerHTML for those cases? <sigh>.... Patch coming up.
Attached patch Hackety-hack (obsolete) — Splinter Review
Attachment #189506 - Flags: superreview?(jst)
Attachment #189506 - Flags: review?(jst)
Checked in the previous patch, by the way. We still need to deal with HTMLElement impls that hack together a weird innerHTML getter and setter, I think... Not sure how many of those are left.
Yeah, there are all sort of bookmarklets these days...
The patch checked in from this bug (attachment 161091 [details] [diff] [review]) seems to have broken ChatZilla rather badly, but don't panic just yet! + QueryInterface (function) 3 lines + message (string) 'Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIDOMNSHTMLElement.innerHTML]' + result (number) 2147500037 + name (string) 'NS_ERROR_FAILURE' + filename (string) 'chrome://chatzilla/content/static.js' + lineNumber (number) 1356 + columnNumber (number) 0 + location (object) + inner (object) null + data (object) null + initialize (function) 3 lines I'm going to explain what ChatZilla does, why I think it broke, and what I think it going to fix it. Firstly, ChatZilla creates elements (using document.createElementNS) when processing messages, to apply effects like *bold* and so on. They're nothing more than html:span usually. Now, I am reasonably sure these elements are sorta-attached to the XUL ChatZilla document at this point. So along comes innerHTML (which we use as a mega-hack to flatten the message for logging purposes), and it goes to get a doc encoder thingy for XUL. Oops, there isn't one! So we fail. Now, that's all a little dodgy to start with, but it seems that it /could/ work (like it has since at least Mozilla 1.0). I think bz's hackety-hack patch will fix it. So, really, I'm just warning you that the dups (if we get any) may be a little... strange. :)
so question... what about random xml content? what about, say, svg? do all XML mime types that can contain XHTML need their own encoder?
(In reply to comment #31) > Created an attachment (id=189506) [edit] > Hackety-hack > I have created a build including this patch and can verify that it fixes the issue with chatzilla described in comment #34.
Attachment #189506 - Flags: superreview?(jst)
Attachment #189506 - Flags: superreview-
Attachment #189506 - Flags: review?(jst)
Attachment #189506 - Flags: review-
This should fix XUL to use the XML serializer... Problem is that other consumers of serializers will fail to deal with XUL, still. :(
Attachment #189506 - Attachment is obsolete: true
Attachment #189651 - Flags: superreview?(jst)
Attachment #189651 - Flags: review?(jst)
(In reply to comment #37) > Created an attachment (id=189651) [edit] > Better hackity-hack > > This should fix XUL to use the XML serializer... Problem is that other > consumers of serializers will fail to deal with XUL, still. :( I have verified that chatzilla works with this patch as well.
Comment on attachment 189651 [details] [diff] [review] Better hackity-hack r+sr=jst
Attachment #189651 - Flags: superreview?(jst)
Attachment #189651 - Flags: superreview+
Attachment #189651 - Flags: review?(jst)
Attachment #189651 - Flags: review+
Comment on attachment 189651 [details] [diff] [review] Better hackity-hack Requesting 1.8b4 approval. more innerHTML in XML fun...
Attachment #189651 - Flags: approval1.8b4?
*** Bug 301273 has been marked as a duplicate of this bug. ***
Attachment #189651 - Flags: approval1.8b4? → approval1.8b4+
Checked that in too.
Whiteboard: [has landed, needs testing?]
Chris, see comment 32. But yes, the basic reading of innerHTML in XML docs needs testing...
What's left to do here? Have we fixed the blocking1.8b4 parts?
Yeah, the blocking parts are fixed. Perhaps we should file a followup bug on the remaining work...
Assignee: bugmail → general
Flags: blocking1.8b4+
Depends on: 340800
What work remains? Seems like this bug should've been closed a couple years ago, eh?
Um... read the bug? Comment 0 and comment 32 pretty much cover what remains to be done, and has remained to be done the entire time this bug has been filed.
Assignee: general → nobody
QA Contact: ian → general
I've already filed a bug for this on Webkit (https://bugs.webkit.org/show_bug.cgi?id=30923), but this seems like the right bug to append information to: XHTML character entities don't work when you feed them into XHTML innerHTML. For instance: var element = document.createElement("div"); element.innerHTML = "test &nbsp;"; Will fail. It shouldn't - non-breaking space is a valid character entity in the XHTML namespace, but despite recognizing XHTML elements, Firefox does not recognize entities. That would seem to fly in the face of the purpose of innerHTML - that any string passed to it would be parsed as though inserted into the page. And of course, in the case of a text/html DOM document, this example succeeds.
Component: DOM: Mozilla Extensions → DOM
Attachment #189651 - Flags: checkin+
Attachment #161091 - Flags: checkin+
I believe this is now following the relevant spec, so we're all good here.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: