Last Comment Bug 492933 - getElementsByTagName should match on localName not tagName (for interop)
: getElementsByTagName should match on localName not tagName (for interop)
Status: RESOLVED WONTFIX
: dev-doc-complete, site-compat
Product: Core
Classification: Components
Component: DOM: Core & HTML (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla44
Assigned To: Boris Zbarsky [:bz] (still a bit busy)
:
:
Mentors:
http://software.hixie.ch/utilities/js...
: 31671 816012 (view as bug list)
Depends on: 1236329
Blocks: 1222189
  Show dependency treegraph
 
Reported: 2009-05-14 01:44 PDT by Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03)
Modified: 2016-01-30 14:53 PST (History)
19 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
wontfix
wontfix
wontfix


Attachments
Patch v1 (23.91 KB, patch)
2010-05-25 04:24 PDT, :Ms2ger (⌚ UTC+1/+2)
bugs: review-
Details | Diff | Splinter Review
Patch v2 (24.52 KB, patch)
2015-10-13 08:35 PDT, Aryeh Gregor (:ayg) (away until October 25)
bzbarsky: review-
Details | Diff | Splinter Review
getElementsByTagName should match on localName not tagName, (31.98 KB, patch)
2015-10-14 13:23 PDT, Boris Zbarsky [:bz] (still a bit busy)
no flags Details | Diff | Splinter Review
With string header hell fixed (32.82 KB, patch)
2015-10-14 14:51 PDT, Boris Zbarsky [:bz] (still a bit busy)
no flags Details | Diff | Splinter Review
Argh, can't change the namespace id without messing up the cache keys (31.67 KB, patch)
2015-10-14 15:30 PDT, Boris Zbarsky [:bz] (still a bit busy)
no flags Details | Diff | Splinter Review
Might even pass try now (35.76 KB, patch)
2015-10-14 19:26 PDT, Boris Zbarsky [:bz] (still a bit busy)
bugs: review+
Details | Diff | Splinter Review
Additional patch to make xcpshell tests patch (2.18 KB, patch)
2015-10-15 07:02 PDT, Boris Zbarsky [:bz] (still a bit busy)
no flags Details | Diff | Splinter Review

Description Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2009-05-14 01:44:50 PDT
Steps to reproduce:
 1) Navigate to http://software.hixie.ch/utilities/js/live-dom-viewer/saved/111

Expected results:
Expected the log to show
log: 0
log: 1
as in Safari and Opera.

Actual results:
The log shows
log: 1
log: 0

Upcoming spec:
http://simon.html5.org/specs/web-dom-core#dom-document-getelementsbytagname

See also:
http://krijnhoetmer.nl/irc-logs/whatwg/20090514#l-381
Comment 1 Olli Pettay [:smaug] (reviewing overload) 2009-05-14 01:48:07 PDT
What does IE do?
Comment 2 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2009-05-14 01:51:02 PDT
According to zcorpan in the IRC log cited above, it uses an equivalent of localName, but I haven't verified this.
Comment 3 :Ms2ger (⌚ UTC+1/+2) 2010-04-06 02:39:50 PDT
Note that changing this will require changes to at least the following tests:

dom/tests/mochitest/dom-level2-core/test_elementsetattributens03.html
dom/tests/mochitest/dom-level2-core/test_getAttributeNS02.html
dom/tests/mochitest/dom-level2-core/test_getAttributeNS03.html
dom/tests/mochitest/dom-level2-core/test_getAttributeNS04.html
dom/tests/mochitest/dom-level2-core/test_getAttributeNS05.html
dom/tests/mochitest/dom-level2-core/test_getAttributeNodeNS01.html
dom/tests/mochitest/dom-level2-core/test_getAttributeNodeNS02.html
dom/tests/mochitest/dom-level2-core/test_hasAttributeNS03.html
dom/tests/mochitest/dom-level2-core/test_hasAttributeNS04.html
dom/tests/mochitest/dom-level2-core/test_importNode05.html
dom/tests/mochitest/dom-level2-core/test_importNode06.html
dom/tests/mochitest/dom-level2-core/test_localName01.html
dom/tests/mochitest/dom-level2-core/test_namespaceURI02.html
dom/tests/mochitest/dom-level2-core/test_nodehasattributes03.html
dom/tests/mochitest/dom-level2-core/test_prefix02.html
dom/tests/mochitest/dom-level2-core/test_prefix03.html
dom/tests/mochitest/dom-level2-core/test_prefix05.html
dom/tests/mochitest/dom-level2-core/test_prefix09.html
dom/tests/mochitest/dom-level2-core/test_setAttributeNS02.html
dom/tests/mochitest/dom-level2-core/test_setAttributeNS04.html
dom/tests/mochitest/dom-level2-core/test_setAttributeNS05.html
dom/tests/mochitest/dom-level2-core/test_setAttributeNS09.html
dom/tests/mochitest/dom-level2-core/test_setAttributeNodeNS01.html
dom/tests/mochitest/dom-level2-core/test_setAttributeNodeNS03.html
Comment 4 :Ms2ger (⌚ UTC+1/+2) 2010-05-25 04:24:47 PDT
Created attachment 447306 [details] [diff] [review]
Patch v1
Comment 5 Olli Pettay [:smaug] (reviewing overload) 2010-05-25 04:37:25 PDT
Uh, changing DOM 2 tests :/
Comment 6 Olli Pettay [:smaug] (reviewing overload) 2010-05-25 04:39:59 PDT
So where is the new behavior defined?

I hope somewhere else than just in http://simon.html5.org/specs/web-dom-core
That one hasn't been updated for a while, and not sure how well it has been
reviewed.
Comment 7 Boris Zbarsky [:bz] (still a bit busy) 2010-05-25 05:24:24 PDT
Yeah, it seems to me that the behavior of getElementsByTagName is very clearly specified in the DOM spec at the moment.  I'm surprised that so many browsers get it totally wrong...
Comment 8 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2010-05-25 05:45:34 PDT
I fail to see use cases the DOM Level 2 / Gecko behavior of letting a prefix be meaningful would be more appropriate than matching on local name while wild carding the prefix / namespace.

That is, I don't see why other browsers would be persuaded to change.
Comment 9 Boris Zbarsky [:bz] (still a bit busy) 2010-05-25 05:48:48 PDT
Well, for one thing the "match on local name while wild carding prefix/namespace" has an obvious way to do it: getElementsByTagNameNS("*", localName).  The non-ns version, on the other hand, has a behavior that's not achievable in other ways (and is actually used, last I checked).
Comment 10 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2010-05-25 05:56:11 PDT
(In reply to comment #9)
> The non-ns version, on the other hand, has a behavior that's not
> achievable in other ways (and is actually used, last I checked).

Surely it can't be in use in cross-browser code in a way where the difference matters, since it doesn't work the same cross-browser.
Comment 11 Boris Zbarsky [:bz] (still a bit busy) 2010-05-25 06:16:19 PDT
No, but I believe I've seen instances of it in browser-sniffed code...
Comment 12 Anne (:annevk) 2010-05-26 02:48:11 PDT
FWIW, we (Opera) consider this a bug as far as I know.
Comment 13 Olli Pettay [:smaug] (reviewing overload) 2010-06-22 07:29:55 PDT
Anne, which behavior do you consider as bug? Opera's current behavior?
Comment 14 Anne (:annevk) 2010-06-23 03:34:40 PDT
Yes, matching on local name rather than tag name.
Comment 15 Olli Pettay [:smaug] (reviewing overload) 2010-08-02 09:25:29 PDT
Comment on attachment 447306 [details] [diff] [review]
Patch v1

So atm I don't see reason to change our behavior. If HTML5 has a bug,
better to report that in W3C or whatwg mailing list
Comment 16 Anne (:annevk) 2010-10-16 05:09:41 PDT
FWIW, Henri convinced me that matching on local name is better. Web DOM Core defines that now.
Comment 17 Olli Pettay [:smaug] (reviewing overload) 2010-10-16 10:33:26 PDT
And the reason that matching on local name is better because...?
Comment 18 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2010-10-18 04:26:26 PDT
(In reply to comment #17)
> And the reason that matching on local name is better because...?

I'm not sure what exactly I've said that convinced Anne, so I'm building an answer from scratch here:

 * In text/html, there's no difference, so this is mostly an XML thing. In practical terms, there no harm or benefit in text/html either way.
 * In architecturally sound XML, you never give significance to the prefix. Thus, an author who actually wants to match on tagName is wanting the wrong thing.
 * In implementations like Gecko, matching on local name is simpler, because the local name is stored as an atom but the tagName isn't.
 * Element selectors wildcard the namespace when you use the syntax that isn't explicit about namespaces, so doing the same in the DOM would make the platform more self-consistent.
Comment 19 Boris Zbarsky [:bz] (still a bit busy) 2010-10-18 21:54:25 PDT
Fwiw, I'd like to see someone address my concerns in comment 9...  Both the "there's no way to do this without this function" and "sites will break" aspects.
Comment 20 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2010-10-19 02:00:05 PDT
(In reply to comment #19)
> Fwiw, I'd like to see someone address my concerns in comment 9...  Both the
> "there's no way to do this without this function" 

I addressed this concern in comment 18 by asserting that no one should want to do this (where "this" is paying attention to the prefix in XML). 

> and "sites will break" aspects.

This is a general problem whenever there's UA sniffing and reliance of browser differences. I don't know of a solution other than breaking sites that UA sniff.
Comment 21 Boris Zbarsky [:bz] (still a bit busy) 2010-10-19 12:57:47 PDT
> no one should want to do this

Yet in practice it turns out people do (e.g. the RDFa folks, iirc).  You may think they're misguided, but...

> I don't know of a solution other than breaking sites that UA sniff.

The question is why we should be shouldering the burden of such breakage in this case.
Comment 22 :Ms2ger (⌚ UTC+1/+2) 2012-11-28 05:22:50 PST
*** Bug 816012 has been marked as a duplicate of this bug. ***
Comment 23 Aryeh Gregor (:ayg) (away until October 25) 2015-10-07 06:40:22 PDT
*** Bug 31671 has been marked as a duplicate of this bug. ***
Comment 24 Aryeh Gregor (:ayg) (away until October 25) 2015-10-07 06:47:17 PDT
Using the test-case from comment #0 -- Chrome 45 matches the spec, IE 11 does not, and we do not.  Do we want to change, or the spec to change?  Asking module owners for a decision here.

(I got here because I'm going through web-platform-tests failures and trying to resolve them, and this causes failures in dom/nodes/Document-getElementsByTagName.*.)
Comment 25 Johnny Stenback (:jst, jst@mozilla.com) 2015-10-11 14:07:46 PDT
Has anyone asked if Microsoft would be willing to change their implementation to also match the spec? If they do, we could follow and everyone would match the spec. If not, we should reach out to the Chrome devs and advocate for a change in their implementation and a change in the spec to match.
Comment 26 Anne (:annevk) 2015-10-12 00:54:19 PDT
Reached out to Microsoft: https://lists.w3.org/Archives/Public/www-archive/2015Oct/0008.html.
Comment 27 Anne (:annevk) 2015-10-12 23:08:03 PDT
According to Travis https://lists.w3.org/Archives/Public/www-archive/2015Oct/0023.html Microsoft Edge already matches 0, 1 from comment 0's test. So it's just us at this point.
Comment 28 Peter Van der Beken [:peterv] 2015-10-13 07:52:21 PDT
Talked to bz, he can live with trying to change behaviour. Let's try it.

Unfortunately this is one of those cases where breakage (in browser-sniffed code) will be complicated to detect :-(.
Comment 29 Aryeh Gregor (:ayg) (away until October 25) 2015-10-13 08:35:21 PDT
Created attachment 8673138 [details] [diff] [review]
Patch v2

Ms2ger's patch no longer applies, but this patch includes his test changes.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=511fb9bedfaf

Someone else will probably need to be the one to land this (and fix it up if it needs fixup).
Comment 30 Boris Zbarsky [:bz] (still a bit busy) 2015-10-13 09:11:17 PDT
Comment on attachment 8673138 [details] [diff] [review]
Patch v2

>+    return matchHTML ? ni->Equals(mHTMLMatchAtom) : ni->Equals(mXMLMatchAtom);

So... as you might have noticed, and as mentioned earlier in this bug, this is now identical to the "wildcard" case.

Specifically, after this change getElementsByTagName(foo) is identical to getElementsByTagNameNS("*", ASCIILowerCase(foo)), right?

So I think we should just set the namespace id to kNameSpaceID_Wildcard when we do the lowercasing in NS_GetContentList, and remove the "unknown" block in Match, replacing it with an assert that the namespace id is not unknown.

We should also fix the docs in nsContentListDeclarations.h and use that header instead of the manual declarations of NS_GetContentList in Element.h and nsIDocument.h.
Comment 31 Boris Zbarsky [:bz] (still a bit busy) 2015-10-13 09:29:35 PDT
Also, while we're here, we're not exactly following the spec in terms of whether we should use the html or xml atom.  In particular, per spec the determination of whether we're in an "HTML document" is done at list-creation time.  We do it at match time.  You can see the difference in http://web.mit.edu/bzbarsky/www/testcases/dom-queries/test_getElementsByTagName_1.html for example.  We should either fix this as part of this bug or get a followup filed on it...  We'd need to include the HTMLness of the document in the cache key or something.
Comment 32 Boris Zbarsky [:bz] (still a bit busy) 2015-10-14 13:23:55 PDT
Created attachment 8673867 [details] [diff] [review]
getElementsByTagName should match on localName not tagName,

Olli, do you mind reviewing this?  Really, just the dom/base changes; I've reviewed the rest
Comment 33 Boris Zbarsky [:bz] (still a bit busy) 2015-10-14 14:51:20 PDT
Created attachment 8673921 [details] [diff] [review]
With string header hell fixed
Comment 34 Boris Zbarsky [:bz] (still a bit busy) 2015-10-14 15:30:16 PDT
Created attachment 8673945 [details] [diff] [review]
Argh, can't change the namespace id without messing up the cache keys
Comment 35 Boris Zbarsky [:bz] (still a bit busy) 2015-10-14 19:26:37 PDT
Created attachment 8674018 [details] [diff] [review]
Might even pass try now

Sorry for the churn.  :(
Comment 36 Olli Pettay [:smaug] (reviewing overload) 2015-10-15 05:53:11 PDT
Comment on attachment 8674018 [details] [diff] [review]
Might even pass try now

scary change. crossing fingers.
Comment 37 Boris Zbarsky [:bz] (still a bit busy) 2015-10-15 07:02:59 PDT
Created attachment 8674244 [details] [diff] [review]
Additional patch to make xcpshell tests patch
Comment 39 Boris Zbarsky [:bz] (still a bit busy) 2015-10-15 07:11:49 PDT
> So I think we should just set the namespace id to kNameSpaceID_Wildcard when we do the
> lowercasing in NS_GetContentList

To be clear, this doesn't work because we cache the nodelist in the cache entry looked up with kNameSpaceID_Unknown, but remove in the list destructor using whatever namespace id is stored in the list.

And we can't cache under kNameSpaceID_Wildcard, because (kNameSpaceID_Wildcard, someLocalName) is not the same thing as (kNameSpaceID_Unknown, someLocalName); you'd have to compare both xmlatom and htmlatom in the cache key to allow collapsing the two namespace ids together.
Comment 40 Carsten Book [:Tomcat] 2015-10-16 05:39:07 PDT
https://hg.mozilla.org/mozilla-central/rev/d8012b35413b
Comment 41 Kohei Yoshino [:kohei] 2015-10-23 03:27:37 PDT
Posted the site compatibility doc: https://www.fxsitecompat.com/en-US/docs/2015/getelementsbytagname-now-matches-localname-instead-of-tagname/
Comment 42 Oriol 2015-12-29 13:20:32 PST
This change breaks the management page of the embedded web server of my HP Officejet Pro 8610 printer. It works on Chrome because of browser sniffing. So when the change reaches release, people will blame Firefox instead of HP.
Comment 43 Boris Zbarsky [:bz] (still a bit busy) 2016-01-05 12:09:05 PST
Note that I just backed this out on m-c to fix bug 1236329 and plan to back this out on branches as well.  I have also filed a spec bug at https://github.com/whatwg/dom/issues/143
Comment 44 Bob 2016-01-17 22:22:07 PST
The Firefox 44 for Developers doc needs to be updated, since this change was backed out and is now wontfix:

> The Element.getElementsByTagName() now match on local name instead of qualified name, that is, xyz:abc will match any element with abc rather than only xyz:abc (bug 492933).
Comment 45 Jean-Yves Perrier [:teoli] 2016-01-17 23:41:43 PST
Done thanks, good catch!

Note You need to log in before you can comment on or make changes to this bug.