Closed Bug 799937 Opened 12 years ago Closed 9 years ago

Make createElement()'s result always in the HTML namespace

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
minor

RESOLVED WONTFIX

People

(Reporter: ayg, Assigned: ayg)

References

Details

(Keywords: addon-compat)

Attachments

(1 file, 1 obsolete file)

data:text/html,<!DOCTYPE html>
<script>
document.documentElement.textContent = 
document.implementation
        .createDocument(null, "", null)
        .createElement("x")
        .namespaceURI === null;
</script>

This is "true" in all browsers.  However, the spec says it should be the HTML namespace:

"""
Return a new element with no attributes, namespace set to the HTML namespace, local name set to localName, and node document set to the context object.
"""
http://dom.spec.whatwg.org/#dom-document-createelement

I'm told that what browsers actually do here is complicated and varies, whereas this is simple and should work.  In particular, it's no different for the HTML case, so it's unlikely to break anything anyone cares about.
Tested by this, which will be imported into our test suite soon enough:

http://w3c-test.org/webapps/DOMCore/tests/approved/Node-properties.html

Not a particularly extensive test, but should be good enough for something we don't care much about.  :)
Flags: in-testsuite+
I'm not at all sure we should do this.
What do we do now, in fact?  I can't figure it out from source code inspection.
If I read the code correctly, in (X)HTML documents createElement returns elements in
the HTML ns, in XUL documents in the XUL ns, and otherwise in the null namespace.
So what initializes mDefaultElementType to 0?  It seems to me like it never gets initialized for non-HTML, non-XUL documents.  Clearly, we can leave behavior for XUL documents alone, since those are nonstandard anyway.

Also, how do you figure that XHTML documents return in the HTML NS?  AFAICT, that line only runs for nsHTMLDocument -- that doesn't include XML documents at all, does it?  It doesn't make sense that XHTML documents should return elements in the null namespace.  (Do they?  I have to run now, can't test.)
Document has a zeroing operator new, so unless initialized to something else, all the
member variables are 0/false.
(In reply to :Aryeh Gregor from comment #5)
> Also, how do you figure that XHTML documents return in the HTML NS?  AFAICT,
> that line only runs for nsHTMLDocument -- that doesn't include XML documents
> at all, does it?
nsHTMLDocument is the document when we're dealing with XHTML
The HTML spec does not have the HTMLDocument interface anymore. All documents are just Document per spec. This is to make the members usable on any document. I believe the SVG WG are going to follow this approach for SVGDocument.

http://www.whatwg.org/specs/web-apps/current-work/multipage/dom.html#the-document-object
http://www.whatwg.org/specs/web-apps/current-work/multipage/browsers.html#htmldocument
> Also, how do you figure that XHTML documents return in the HTML NS?

nsHTMLDocument is used for documents of type application/xhtml+xml.

Simon, I'm aware that Hixie wants to flatten all the document interfaces.  It's not clear to me that it's desirable, much less web-compatible (in fact, there are some cases involving XMLDocument where it's known to not be web-compatible; it's possible the spec handles those already, though).
(In reply to Boris Zbarsky (:bz) from comment #9)
> Simon, I'm aware that Hixie wants to flatten all the document interfaces. 
> It's not clear to me that it's desirable,

Why not?

> much less web-compatible (in fact,
> there are some cases involving XMLDocument where it's known to not be
> web-compatible; it's possible the spec handles those already, though).

XMLDocument is special in the spec (it has a load() method).

http://dom.spec.whatwg.org/#dom-domimplementation-createdocument
http://www.whatwg.org/specs/web-apps/current-work/multipage/dom.html#loading-xml-documents

Are there other problems?
> Why not?

Well, for a simple example it will put various currently-HTML-specific stuff on SVG documents and vice versa.  That can easily break sites, especially ones that use inline event handlers...

It also gets you the problem that some of the things on these various document subtypes are pretty nonsensical for the other subtypes.  For example, are we going to keep .rootElement returning SVGSVGElement on HTML documents?

As far as XMLDocument, there was more weirdness than just load() but that might be handled by the LenientThis bits we added...
The only thing SVGDocument has that HTMLDocument doesn't is rootElement.

$ grep -aP "rootElement" web200904 
//var rootElement = document.getElementById("facets");
<html xmlns="http://www.w3.org/1999/xhtml" lang="ja" xml:lang="ja" id="rootElement" >
<html xmlns="http://www.w3.org/1999/xhtml" lang="ja" xml:lang="ja" id="rootElement" >
	var rootElement = request.responseXML.documentElement;
	if(rootElement.hasChildNodes()){
	if(rootElement.childNodes.length > 1){
	else if(rootElement.childNodes.length == 1){
		      var rootElement = document.getElementById("searchBody");
		      var titles = getElementsByAttribute(rootElement, "div", "id", "resultTitle");
		      var descriptions = getElementsByAttribute(rootElement, "div", "id", "resultDescription");
		      var locationInfo = getElementsByAttribute(rootElement, "div", "id", "resultLocationInfo");


No rootElement in event handlers in this data set.

Whether it should return null when the root is not <svg>, or maybe if it should be dropped altogether, is up to the SVG WG, I guess.

As for HTMLDocument members in SVG documents, I thought the SVG people wanted that.
Attached patch: Patch (obsolete)
So take a step back here: if we don't want this, what do we want to spec?  Gecko's behavior now (ignoring XUL) is to give it the HTML namespace if it's an (X)HTML document, and null otherwise.  "(X)HTML document" is determined by what?  MIME type of the document?  Is the procedure for deciding whether something is an HTML document sane and/or standardizable?  Do we want to ask other browsers to implement it?

Also, is it ever useful to make createElement() return a null namespace?  An element with null namespace will normally do nothing interesting, correct?  The XHTML namespace might do the wrong thing, if you really wanted an SVG element or something, but at least it stands a chance of being correct.

More generally, I don't see any problem with saying "createElement() is for HTML, use createElementNS() for other stuff".  There are already all kinds of ways in which HTML is special in DOM; what's wrong with adding another one if that's simplest?  Non-HTML documents are fairly unimportant, and we should spec the simplest possible solution that doesn't change behavior for HTML documents.  Hardcoding createElement() as HTML-only looks like it, assuming there's no compat problem.


Anyway, I'm fine with r- for this, but only if you tell me exactly what you want the spec to say instead and why.  :)
Attachment #670359 - Flags: review?(bzbarsky)
Try (Linux64 debug only): https://tbpl.mozilla.org/?tree=Try&rev=588fa16450c9
So in order:

1)  I'm not quite sure what behavior we want here.  This is more sicking/smaug
    territory.  I personally have no objections to this particular change, modulo
    backwards compat concerns.
2)  "(X)HTML document" in Gecko is determined purely based on MIME type.  text/html,
    application/xhtml+xml, and the various synthetic (image/plugin/media) documents are
    all in this bucket.
3)  An element with null namespace will normally do nothing interesting, yes.  The one
    question I have here is whether there are scripts that depend on the current
    behavior (e.g. by using createElement on a data doc to create "script" elements and
    relying on them to not execute when inserted into the main doc).
    What do other UAs do?
And in particular, I'd like an answer to the question at the end of item 3 before reviewing, if possible.
(In reply to Boris Zbarsky (:bz) from comment #15)
> 1)  I'm not quite sure what behavior we want here.  This is more
> sicking/smaug
>     territory.  I personally have no objections to this particular change,
> modulo
>     backwards compat concerns.

Smaug seems skeptical.  Sicking, what do you think?

> 2)  "(X)HTML document" in Gecko is determined purely based on MIME type. 
> text/html,
>     application/xhtml+xml, and the various synthetic (image/plugin/media)
> documents are
>     all in this bucket.

Is this something we could reasonably spec?  It basically just depends on document.contentType, which can't change after the page begins to load, right?  It doesn't seem too bad.  zcorpan/annevk, what are your objections to this again?

> 3)  An element with null namespace will normally do nothing interesting,
> yes.  The one
>     question I have here is whether there are scripts that depend on the
> current
>     behavior (e.g. by using createElement on a data doc to create "script"
> elements and
>     relying on them to not execute when inserted into the main doc).
>     What do other UAs do?

In the simple case from comment #0, they all make the namespace null.  I was told that they use different tactics for deciding when to do this -- someone mentioned something about looking at the namespace of the root element, if any.  I don't know what's wrong with MIME type sniffing, exactly.

Anne or Simon, do you have any more details on what other implementations do?  There isn't something simple we could spec that's more compatible?  Gecko's behavior seems pretty simple.
> It basically just depends on document.contentType, which can't change after the page
> begins to load, right?

That's correct.  It would not be hard to spec if desired.  Whether other UAs are interested is a separate question, of course.

And again, I personally don't mind using the HTML namespace as long as other UAs commit to converging on it and it doesn't break existing things...
Making APIs more dependent on context than needed is just hostile towards developers (script portability, predictability, etc.). However, given that effectively nobody uses XML maybe that does not matter. Opera uses the root element namespace to determine a bunch of these things by the way. I suppose WebKit has a somewhat similar strategy to Gecko, but have not checked. No idea about IE.
I did a bunch of research here:

https://www.w3.org/Bugs/Public/show_bug.cgi?id=19431#c6

It looks like IE9/Gecko/WebKit all basically agree that it's the HTML namespace if the page was served with an HTML MIME type, otherwise it's null.  IE10 seems to have changed to match the spec, and Opera keys off the namespace of the initial root element.  I think the spec should change to match the dominant behavior.  If that gets agreed upon over there, I'll resolve this WONTFIX.
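
To make the two candidate rules concrete, here is a rough sketch that models them as plain functions of the document's MIME type. This is only an illustration, not Gecko or spec code: the function names and the `HTML_MIME_TYPES` set are made up for this example, and XUL and the synthetic image/plugin/media documents are left out.

```javascript
// Illustrative model only; these names are hypothetical, not browser APIs.
const HTML_NS = "http://www.w3.org/1999/xhtml";

// MIME types treated as "(X)HTML" in this sketch (XUL and synthetic documents omitted).
const HTML_MIME_TYPES = new Set(["text/html", "application/xhtml+xml"]);

// Rule in the spec text quoted in comment 0: always the HTML namespace,
// so the content type is ignored.
function specNamespace(contentType) {
  return HTML_NS;
}

// Rule described above for IE9/Gecko/WebKit: the HTML namespace only when
// the document has an HTML MIME type, otherwise the null namespace.
function mimeBasedNamespace(contentType) {
  return HTML_MIME_TYPES.has(contentType) ? HTML_NS : null;
}

// Print both answers side by side for a few content types.
for (const type of ["text/html", "application/xhtml+xml", "application/xml", "image/svg+xml"]) {
  console.log(type, "spec:", specNamespace(type), "mime-based:", mimeBasedNamespace(type));
}
```

The two rules only disagree for the non-HTML MIME types, which is the case this bug is about.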
Since I was asked, I don't have much of an opinion here. On the surface I do like Hixie's approach of trying to make all document types behave the same. However it definitely adds complexity and it's hard to say what the result will be down the line.
Opera 12.00 looks at the root element's namespace to decide which document interface to use. How this works exactly I'm not sure. Opera 12.10 looks at the MIME type. However, we seem to make HTMLDocument members available on all documents.

Chrome seems to use Document for text/xml and application/xhtml+xml; SVGDocument for image/svg+xml; HTMLDocument for text/html. HTMLDocument members are available in all documents.

Note that loading a document in a browsing context is not the only case we need to consider. There's also createDocument, and Anne recently added a `new Document()` constructor to the DOM spec. IIRC Acid3 expects HTMLDocument members to be available on a createDocument-created XHTML 1.0 doc.
Comment on attachment 670359
Patch

Nix the "bug NNN" comment, and if we decide to do this, patch looks fine.
Attachment #670359 - Flags: review?(bzbarsky) → review+
I wrote tests:

https://www.w3.org/Bugs/Public/show_bug.cgi?id=19431#c12

My conclusion is that IE9/10 and Chrome 23 dev behave basically the same as Gecko, with minor variations, and going with Gecko's behavior would make the spec about one line longer.  On the other hand, the current spec is a bit simpler, and perhaps better matches what authors would actually want.  So do we want to change Gecko or the spec?
I would probably be fine with either one....

Note that it's not just one line longer because of createDocument crap.  :(
Okay, so can I go ahead and push this to inbound, or do I need to get anyone further to agree to it?  I agree in principle this behavior is nicer for people who are doing stupid things like serving HTML as application/xml.
As long as sicking and smaug agree, we're good to go.
So the only change is that XML documents will create html elements when .createElement is called? XUL and SVG documents will still create XUL and SVG elements?

If so, that sounds good to me.
SVG documents don't currently create SVG elements.

It is IMHO odd if XML documents start creating html elements.
Once you then serialize such stuff, you get a quite different result than now.
(In reply to Jonas Sicking (:sicking) from comment #28)
> So the only change is that XML documents will create html elements when
> .createElement is called? XUL and SVG documents will still create XUL and
> SVG elements?

XUL, yes.  SVG is not special-cased per spec.  The idea is that in XML, you should be using createElementNS() -- we're making createElement() HTML-specific.  IMO, this is probably better than making createElement() silently vary its behavior in a critical way depending on whether the page is served as application/xml or application/xhtml+xml.

(In reply to Olli Pettay [:smaug] from comment #29)
> It is IMHO odd if XML documents start creating html elements.
> Once you then serialize such stuff, you get quite different result than now.

Yes, this is a significant change for anyone using createElement() in XML documents that are not served with application/xhtml+xml (and also aren't XUL).  Realistically, it seems hard to believe that there are enough of those that it will be an issue.
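
For scripts that need the same result under either behavior, one option is to always name the namespace explicitly. The sketch below assumes nothing beyond the standard `createElementNS(namespace, qualifiedName)` method; the helper names `createHtmlElement` and `createNullNamespaceElement` are hypothetical and only for illustration.

```javascript
// Hypothetical helpers, not from this bug: ask for the namespace explicitly
// so the result does not depend on how createElement() picks its default.
const XHTML_NAMESPACE = "http://www.w3.org/1999/xhtml";

// Always creates an element in the HTML namespace, whatever the document type.
function createHtmlElement(doc, localName) {
  return doc.createElementNS(XHTML_NAMESPACE, localName);
}

// Always creates an element in the null namespace, e.g. for XML used as a data container.
function createNullNamespaceElement(doc, localName) {
  return doc.createElementNS(null, localName);
}
```

In a browser, `createHtmlElement(document.implementation.createDocument(null, "", null), "x")` would give an element in the HTML namespace regardless of which createElement() behavior ships, and the second helper keeps data-container code on the null namespace.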
XML is used as a data container reasonably often, so I would be surprised if the change did not
break pages, especially some intranet pages.
(In reply to Olli Pettay [:smaug] from comment #31)
> XML is used as data container reasonable often, so I would be surprised if
> the change would not
> break pages, especially some intranet pages.

So does that mean we're not okay with trying this and we should change the spec, or we can give it a shot but should be on the lookout for regressions?  I don't particularly mind either, but we should make a decision here one way or the other.
Okay, so someone needs to make a call here on whether we want to change to match the spec or push for the spec to change.  Until then, unassigning, because this issue is much too minor to warrant investing more time and energy.
Assignee: ayg → nobody
Status: ASSIGNED → NEW
Flags: needinfo?
Flags: needinfo?
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f38554fc7737

This should be green, because I fixed all the orange from the last try run in the new patch.  But it's still running as I write this, so make sure to check before checking in.
Assignee: nobody → ayg
Attachment #670359 - Attachment is obsolete: true
Status: NEW → ASSIGNED
https://hg.mozilla.org/integration/mozilla-inbound/rev/f6759ed53f46
Keywords: checkin-needed
Depends on: 1214621
And we have extension bustage from this already.
Keywords: addon-compat
https://hg.mozilla.org/integration/mozilla-inbound/rev/645b0892ebfc
Wait.  So the spec here doesn't match any actual UAs, based on the comments in https://www.w3.org/Bugs/Public/show_bug.cgi?id=19431

It was worth trying, but given that and given the known resulting breakage, I'm backing this out.  Spec needs to be fixed.

Backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/645b0892ebfc
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Component: DOM → DOM: Core & HTML