getElementsByTagName should match on localName not tagName (for interop)

RESOLVED WONTFIX

Status

()

Core
DOM: Core & HTML
RESOLVED WONTFIX
8 years ago
a year ago

People

(Reporter: hsivonen, Assigned: bz)

Tracking

({dev-doc-complete, site-compat})

Trunk
mozilla44
dev-doc-complete, site-compat
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox44 wontfix, firefox45 wontfix, firefox46 wontfix)

Details

(URL)

Attachments

(2 attachments, 5 obsolete attachments)

(Reporter)

Description

8 years ago
Steps to reproduce:
 1) Navigate to http://software.hixie.ch/utilities/js/live-dom-viewer/saved/111

Expected results:
Expected the log to show
log: 0
log: 1
as in Safari and Opera.

Actual results:
The log shows
log: 1
log: 0

Upcoming spec:
http://simon.html5.org/specs/web-dom-core#dom-document-getelementsbytagname

See also:
http://krijnhoetmer.nl/irc-logs/whatwg/20090514#l-381

Comment 1

8 years ago
What does IE do?
(Reporter)

Comment 2

8 years ago
According to zcorpan in the IRC log cited above, it uses an equivalent of localName, but I haven't verified this.
Note that changing this will require changes to at least the following tests:

dom/tests/mochitest/dom-level2-core/test_elementsetattributens03.html
dom/tests/mochitest/dom-level2-core/test_getAttributeNS02.html
dom/tests/mochitest/dom-level2-core/test_getAttributeNS03.html
dom/tests/mochitest/dom-level2-core/test_getAttributeNS04.html
dom/tests/mochitest/dom-level2-core/test_getAttributeNS05.html
dom/tests/mochitest/dom-level2-core/test_getAttributeNodeNS01.html
dom/tests/mochitest/dom-level2-core/test_getAttributeNodeNS02.html
dom/tests/mochitest/dom-level2-core/test_hasAttributeNS03.html
dom/tests/mochitest/dom-level2-core/test_hasAttributeNS04.html
dom/tests/mochitest/dom-level2-core/test_importNode05.html
dom/tests/mochitest/dom-level2-core/test_importNode06.html
dom/tests/mochitest/dom-level2-core/test_localName01.html
dom/tests/mochitest/dom-level2-core/test_namespaceURI02.html
dom/tests/mochitest/dom-level2-core/test_nodehasattributes03.html
dom/tests/mochitest/dom-level2-core/test_prefix02.html
dom/tests/mochitest/dom-level2-core/test_prefix03.html
dom/tests/mochitest/dom-level2-core/test_prefix05.html
dom/tests/mochitest/dom-level2-core/test_prefix09.html
dom/tests/mochitest/dom-level2-core/test_setAttributeNS02.html
dom/tests/mochitest/dom-level2-core/test_setAttributeNS04.html
dom/tests/mochitest/dom-level2-core/test_setAttributeNS05.html
dom/tests/mochitest/dom-level2-core/test_setAttributeNS09.html
dom/tests/mochitest/dom-level2-core/test_setAttributeNodeNS01.html
dom/tests/mochitest/dom-level2-core/test_setAttributeNodeNS03.html
Created attachment 447306 [details] [diff] [review]
Patch v1
Assignee: nobody → Ms2ger
Status: NEW → ASSIGNED
Attachment #447306 - Flags: review?(Olli.Pettay)

Comment 5

7 years ago
Uh, changing DOM 2 tests :/

Comment 6

7 years ago
So where is the new behavior defined?

I hope somewhere else than just in http://simon.html5.org/specs/web-dom-core
That one hasn't been updated for a while, and not sure how well it has been
reviewed.
Yeah, it seems to me that the behavior of getElementsByTagName is very clearly specified in the DOM spec at the moment.  I'm surprised that so many browsers get it totally wrong...
(Reporter)

Comment 8

7 years ago
I fail to see use cases the DOM Level 2 / Gecko behavior of letting a prefix be meaningful would be more appropriate than matching on local name while wild carding the prefix / namespace.

That is, I don't see why other browsers would be persuaded to change.
Well, for one thing the "match on local name while wild carding prefix/namespace" has an obvious way to do it: getElementsByTagNameNS("*", localName).  The non-ns version, on the other hand, has a behavior that's not achievable in other ways (and is actually used, last I checked).
(Reporter)

Comment 10

7 years ago
(In reply to comment #9)
> The non-ns version, on the other hand, has a behavior that's not
> achievable in other ways (and is actually used, last I checked).

Surely it can't be in use in cross-browser code in a way where the difference matters, since it doesn't work the same cross-browser.
No, but I believe I've seen instances of it in browser-sniffed code...

Comment 12

7 years ago
FWIW, we (Opera) consider this a bug as far as I know.
Anne, which behavior do you consider as bug? Opera's current behavior?

Comment 14

7 years ago
Yes, matching on local name rather than tag name.
Comment on attachment 447306 [details] [diff] [review]
Patch v1

So atm I don't see reason to change our behavior. If HTML5 has a bug,
better to report that in W3C or whatwg mailing list
Attachment #447306 - Flags: review?(Olli.Pettay) → review-

Comment 16

7 years ago
FWIW, Henri convinced me that matching on local name is better. Web DOM Core defines that now.
And the reason that matching on local name is better because...?
(Reporter)

Comment 18

7 years ago
(In reply to comment #17)
> And the reason that matching on local name is better because...?

I'm not sure what exactly I've said that convinced Anne, so I'm building an answer from scratch here:

 * In text/html, there's no difference, so this is mostly an XML thing. In practical terms, there no harm or benefit in text/html either way.
 * In architecturally sound XML, you never give significance to the prefix. Thus, an author who actually wants to match on tagName is wanting the wrong thing.
 * In implementations like Gecko, matching on local name is simpler, because the local name is stored as an atom but the tagName isn't.
 * Element selectors wildcard the namespace when you use the syntax that isn't explicit about namespaces, so doing the same in the DOM would make the platform more self-consistent.
Fwiw, I'd like to see someone address my concerns in comment 9...  Both the "there's no way to do this without this function" and "sites will break" aspects.
(Reporter)

Comment 20

7 years ago
(In reply to comment #19)
> Fwiw, I'd like to see someone address my concerns in comment 9...  Both the
> "there's no way to do this without this function" 

I addressed this concern in comment 18 by asserting that no one should want to do this (where "this" is paying attention to the prefix in XML). 

> and "sites will break" aspects.

This is a general problem whenever there's UA sniffing and reliance of browser differences. I don't know of a solution other than breaking sites that UA sniff.
> no one should want to do this

Yet in practice it turns out people do (e.g. the RDFa folks, iirc).  You may think they're misguided, but...

> I don't know of a solution other than breaking sites that UA sniff.

The question is why we should be shouldering the burden of such breakage in this case.

Updated

5 years ago
Duplicate of this bug: 816012
Duplicate of this bug: 31671
Using the test-case from comment #0 -- Chrome 45 matches the spec, IE 11 does not, and we do not.  Do we want to change, or the spec to change?  Asking module owners for a decision here.

(I got here because I'm going through web-platform-tests failures and trying to resolve them, and this causes failures in dom/nodes/Document-getElementsByTagName.*.)
Flags: needinfo?(peterv)
Flags: needinfo?(jst)
Has anyone asked if Microsoft would be willing to change their implementation to also match the spec? If they do, we could follow and everyone would match the spec. If not, we should reach out to the Chrome devs and advocate for a change in their implementation and a change in the spec to match.
Flags: needinfo?(jst)
Keywords: dev-doc-needed

Comment 26

2 years ago
Reached out to Microsoft: https://lists.w3.org/Archives/Public/www-archive/2015Oct/0008.html.

Comment 27

2 years ago
According to Travis https://lists.w3.org/Archives/Public/www-archive/2015Oct/0023.html Microsoft Edge already matches 0, 1 from comment 0's test. So it's just us at this point.
Flags: needinfo?(jst)
Talked to bz, he can live with trying to change behaviour. Let's try it.

Unfortunately this is one of those cases where breakage (in browser-sniffed code) will be complicated to detect :-(.
Flags: needinfo?(peterv)
Created attachment 8673138 [details] [diff] [review]
Patch v2

Ms2ger's patch no longer applies, but this patch includes his test changes.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=511fb9bedfaf

Someone else will probably need to be the one to land this (and fix it up if it needs fixup).
Assignee: Ms2ger → ayg
Attachment #447306 - Attachment is obsolete: true
Flags: needinfo?(jst)
Attachment #8673138 - Flags: review?(bzbarsky)
Comment on attachment 8673138 [details] [diff] [review]
Patch v2

>+    return matchHTML ? ni->Equals(mHTMLMatchAtom) : ni->Equals(mXMLMatchAtom);

So... as you might have noticed, and as mentioned earlier in this bug, this is now identical to the "wildcard" case.

Specifically, after this change getElementsByTagName(foo) is identical to getElementsByTagNameNS("*", ASCIILowerCase(foo)), right?

So I think we should just set the namespace id to kNameSpaceID_Wildcard when we do the lowercasing in NS_GetContentList, and remove the "unknown" block in Match, replacing it with an assert that the namespace id is not unknown.

We should also fix the docs in nsContentListDeclarations.h and use that header instead of the manual declarations of NS_GetContentList in Element.h and nsIDocument.h.
Attachment #8673138 - Flags: review?(bzbarsky) → review-
Also, while we're here, we're not exactly following the spec in terms of whether we should use the html or xml atom.  In particular, per spec the determination of whether we're in an "HTML document" is done at list-creation time.  We do it at match time.  You can see the difference in http://web.mit.edu/bzbarsky/www/testcases/dom-queries/test_getElementsByTagName_1.html for example.  We should either fix this as part of this bug or get a followup filed on it...  We'd need to include the HTMLness of the document in the cache key or something.
Flags: needinfo?(bzbarsky)
Created attachment 8673867 [details] [diff] [review]
getElementsByTagName should match on localName not tagName,

Olli, do you mind reviewing this?  Really, just the dom/base changes; I've reviewed the rest
Attachment #8673867 - Flags: review?(bugs)
Assignee: ayg → bzbarsky
Flags: needinfo?(bzbarsky)
Created attachment 8673921 [details] [diff] [review]
With string header hell fixed
Attachment #8673921 - Flags: review?(bugs)
Attachment #8673867 - Attachment is obsolete: true
Attachment #8673867 - Flags: review?(bugs)
Created attachment 8673945 [details] [diff] [review]
Argh, can't change the namespace id without messing up the cache keys
Attachment #8673945 - Flags: review?(bugs)
Attachment #8673921 - Attachment is obsolete: true
Attachment #8673921 - Flags: review?(bugs)
Created attachment 8674018 [details] [diff] [review]
Might even pass try now

Sorry for the churn.  :(
Attachment #8674018 - Flags: review?(bugs)
Attachment #8673945 - Attachment is obsolete: true
Attachment #8673945 - Flags: review?(bugs)
Comment on attachment 8674018 [details] [diff] [review]
Might even pass try now

scary change. crossing fingers.
Attachment #8674018 - Flags: review?(bugs) → review+
Created attachment 8674244 [details] [diff] [review]
Additional patch to make xcpshell tests patch
Attachment #8673138 - Attachment is obsolete: true

Comment 38

2 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/d8012b35413b
> So I think we should just set the namespace id to kNameSpaceID_Wildcard when we do the
> lowercasing in NS_GetContentList

To be clear, this doesn't work because we cache the nodelist in the cache entry looked up with kNameSpaceID_Unknown, but remove in the list destructor using whatever namespace id is stored in the list.

And we can't cache under kNameSpaceID_Wildcard, because (kNameSpaceID_Wildcard, someLocalName) is not the same thing as (kNameSpaceID_Unknown, someLocalName); you'd have to compare both xmlatom and htmlatom in the cache key to allow collapsing the two namespace ids together.
https://hg.mozilla.org/mozilla-central/rev/d8012b35413b
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
status-firefox44: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
Posted the site compatibility doc: https://www.fxsitecompat.com/en-US/docs/2015/getelementsbytagname-now-matches-localname-instead-of-tagname/
Keywords: site-compat

Comment 42

a year ago
This change breaks the management page of the embedded web server of my HP Officejet Pro 8610 printer. It works on Chrome because of browser sniffing. So when the change reaches release, people will blame Firefox instead of HP.
Depends on: 1236329
Note that I just backed this out on m-c to fix bug 1236329 and plan to back this out on branches as well.  I have also filed a spec bug at https://github.com/whatwg/dom/issues/143
Resolution: FIXED → WONTFIX
status-firefox44: fixed → wontfix
status-firefox45: --- → wontfix
status-firefox46: --- → wontfix

Comment 44

a year ago
The Firefox 44 for Developers doc needs to be updated, since this change was backed out and is now wontfix:

> The Element.getElementsByTagName() now match on local name instead of qualified name, that is, xyz:abc will match any element with abc rather than only xyz:abc (bug 492933).
Done thanks, good catch!
Keywords: dev-doc-needed → dev-doc-complete

Updated

a year ago
Blocks: 1222189
You need to log in before you can comment on or make changes to this bug.