Last Comment Bug 499656 - nsContentList should have dual atoms: HTML and other
: nsContentList should have dual atoms: HTML and other
Status: RESOLVED FIXED
: html5
Product: Core
Classification: Components
Component: DOM: Core & HTML (show other bugs)
: Trunk
: All All
: -- normal with 1 vote (vote)
: mozilla2.0b7
Assigned To: David Zbarsky (:dzbarsky)
:
Mentors:
http://livedom.validator.nu/?%3C!DOCT...
: 589134 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2009-06-22 02:55 PDT by Henri Sivonen (:hsivonen)
Modified: 2010-09-15 15:02 PDT (History)
6 users (show)
bzbarsky: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Patch (29.16 KB, patch)
2009-07-26 18:11 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Review
Updated Patch with tests (27.97 KB, patch)
2009-07-29 10:22 PDT, David Zbarsky (:dzbarsky)
jonas: review-
Details | Diff | Review
Patch (31.61 KB, patch)
2010-08-26 13:42 PDT, David Zbarsky (:dzbarsky)
jonas: review+
Details | Diff | Review
Patch with nits r=sicking (31.65 KB, patch)
2010-08-31 19:47 PDT, David Zbarsky (:dzbarsky)
jst: approval2.0+
Details | Diff | Review

Description Henri Sivonen (:hsivonen) 2009-06-22 02:55:08 PDT
Steps to reproduce:
 1) Load http://livedom.validator.nu/?%3C!DOCTYPE%20html%3E%0A%3Csvg%3E%0A%3CfeBlend%2F%3E%0A%3C%2Fsvg%3E%0A%3Cscript%3Ew%28document.getElementsByTagName%28%27svg%27%29.length%29%3C%2Fscript%3E%0A%3Cscript%3Ew%28document.getElementsByTagName%28%27feBlend%27%29.length%29%3C%2Fscript%3E

Expected results:
Expected log to say
1
1

Actual results:
Log says
1
0

See http://lists.w3.org/Archives/Public/public-html/2009Apr/0081.html

As of HTML5, SVG names with uppercase ASCII letters can enter into a text/html DOM. Even without an HTML5 parser, such elements can be introduced with createElementNS().

Currently, nsContentList atoms associated with text/html DOMs are lowercased. This makes it impossible to match SVG camelCase names. Going forward, nsContentList should have dual atoms: one atom for matching against elements in the http://www.w3.org/1999/xhtml namespace and another for matching against all other elements.

The atom for matching against elements in the http://www.w3.org/1999/xhtml namespace should be ASCII-lowercased if the document is a text/html document. The other atom should always be in the original case.
Comment 1 Henri Sivonen (:hsivonen) 2009-06-22 02:56:23 PDT
See also bug 499655.
Comment 2 David Zbarsky (:dzbarsky) 2009-07-26 18:11:52 PDT
Created attachment 390759 [details] [diff] [review]
Patch
Comment 3 Boris Zbarsky [:bz] (Out June 25-July 6) 2009-07-26 21:01:33 PDT
Drive-by comments:

1)  Why change the signature of the constructor that takes a match function?
    Those types of content lists never need the extra atom.
2)  This doesn't look like it does the right thing (which is using the HTML atom
    for HTML content and the XML atom otherwise).  Some tests for correct
    behavior in HTML documents with XML content in them (which are missing from
    this patch!) would have caught that, I think.
Comment 4 David Zbarsky (:dzbarsky) 2009-07-29 10:22:14 PDT
Created attachment 391369 [details] [diff] [review]
Updated Patch with tests
Comment 5 Jonas Sicking (:sicking) PTO Until July 5th 2009-08-25 23:07:42 PDT
Comment on attachment 391369 [details] [diff] [review]
Updated Patch with tests

Why change the order of the arguments to NS_GetContentList, but not to the nsContentList ctor? I don't really care about the order, but it's nice to be consistent.

Also, I think it might be worth caching the result of (mRoot->GetCurrentDoc()->IsCaseSensitive()). Then implement ParentChainChanged (it's one of the notifications on nsIMutationObserver), and update the cached value there.

>@@ -703,35 +710,41 @@ nsContentList::Match(nsIContent *aConten
> {
>   if (!aContent)
>     return PR_FALSE;
> 
>   NS_ASSERTION(aContent->IsNodeOfType(nsINode::eELEMENT),
>                "Must have element here");
> 
>   if (mFunc) {
>-    return (*mFunc)(aContent, mMatchNameSpaceId, mMatchAtom, mData);
>+    return (*mFunc)(aContent, mMatchNameSpaceId, mXMLMatchAtom, mData);
>   }
> 
>-  if (mMatchAtom) {
>-    nsINodeInfo *ni = aContent->NodeInfo();
>+  if (!mXMLMatchAtom)
>+    return PR_FALSE;
> 
>-    if (mMatchNameSpaceId == kNameSpaceID_Unknown) {
>-      return (mMatchAll || ni->QualifiedNameEquals(mMatchAtom));
>-    }
>+  nsINodeInfo *ni = aContent->NodeInfo();
> 
>-    if (mMatchNameSpaceId == kNameSpaceID_Wildcard) {
>-      return (mMatchAll || ni->Equals(mMatchAtom));
>-    }
>+  PRBool matchHTML = aContent->GetNameSpaceID() == kNameSpaceID_XHTML &&
>+    !(aContent->GetCurrentDoc()->IsCaseSensitive());
> 
>-    return ((mMatchAll && ni->NamespaceEquals(mMatchNameSpaceId)) ||
>-            ni->Equals(mMatchAtom, mMatchNameSpaceId));
>+  if (mMatchNameSpaceId == kNameSpaceID_Unknown) {
>+    return mMatchAll || ni->QualifiedNameEquals(mXMLMatchAtom)  || 
>+      (matchHTML &&  ni->QualifiedNameEquals(mHTMLMatchAtom));
>   }
>-
>-  return PR_FALSE;
>+  
>+  if (mMatchNameSpaceId == kNameSpaceID_Wildcard) {
>+    return mMatchAll || ni->Equals(mXMLMatchAtom) || 
>+      (matchHTML &&  ni->Equals(mHTMLMatchAtom));
>+  }
>+  
>+  return (mMatchAll && ni->NamespaceEquals(mMatchNameSpaceId)) ||
>+    ni->Equals(mXMLMatchAtom, mMatchNameSpaceId)  || 
>+    (matchHTML && ni->Equals(mXMLMatchAtom, mMatchNameSpaceId));
>+  
> }

If mMatchAll is true you are unnecessarily computing matchHTML. Might be worth dealing with mMatchAll separately.

Also, if matchHTML is false, you don't want to do the comparison with mXMLMatchAtom, right? So for example:

  ni->QualifiedNameEquals(mXMLMatchAtom)  ||
  (matchHTML &&  ni->QualifiedNameEquals(mHTMLMatchAtom)

should be

  matchHTML ? ni->QualifiedNameEquals(mHTMLMatchAtom) :
              ni->QualifiedNameEquals(mXMLMatchAtom)


>diff --git a/content/base/src/nsDocument.cpp b/content/base/src/nsDocument.cpp
>--- a/content/base/src/nsDocument.cpp
>+++ b/content/base/src/nsDocument.cpp
>@@ -4266,20 +4266,30 @@ nsDocument::CreateEntityReference(const 
>   *aReturn = nsnull;
>   return NS_OK;
> }
> 
> NS_IMETHODIMP
> nsDocument::GetElementsByTagName(const nsAString& aTagname,
>                                  nsIDOMNodeList** aReturn)
> {
>-  nsCOMPtr<nsIAtom> nameAtom = do_GetAtom(aTagname);
>-  NS_ENSURE_TRUE(nameAtom, NS_ERROR_OUT_OF_MEMORY);
>-
>-  nsContentList *list = NS_GetContentList(this, nameAtom, kNameSpaceID_Unknown).get();
>+  nsCOMPtr<nsIAtom> XMLAtom = do_GetAtom(aTagname);
>+  NS_ENSURE_TRUE(XMLAtom, NS_ERROR_OUT_OF_MEMORY);
>+
>+
>+  nsAutoString lowercaseName(aTagname);
>+  ToLowerCase(lowercaseName);
>+  nsCOMPtr<nsIAtom> HTMLAtom = do_GetAtom(lowercaseName);
>+  NS_ENSURE_TRUE(HTMLAtom, NS_ERROR_OUT_OF_MEMORY);

Please name the variables xmlAtom/htmlAtom to follow existing convention (variable names start with lower case).

The above is faster as:
nsAutoString lowercaseName;
ToLowerCase(aTagname, lowercaseName);

Also no need for two empty lines.

Same applies in nsGenericElement::GetElementsByTagName

>+
>+
>+  nsContentList *list = NS_GetContentList(this, 
>+                                          kNameSpaceID_Unknown, 
>+                                          HTMLAtom, 
>+                                          XMLAtom).get();

No need for two empty lines.

>diff --git a/content/html/document/src/nsHTMLDocument.cpp b/content/html/document/src/nsHTMLDocument.cpp

> nsHTMLDocument::GetLinks(nsIDOMHTMLCollection** aLinks)
> {
>   if (!mLinks) {
>-    mLinks = new nsContentList(this, MatchLinks, nsnull, nsnull);
>+    mLinks = new nsContentList(this, MatchLinks, nsnull, nsnull, nsnull);

Does this compile? I though you weren't changing this signature? Same in a few other places.

Looks good otherwise. Would like to see an updated patch though.
Comment 6 Boris Zbarsky [:bz] (Out June 25-July 6) 2010-08-23 10:21:49 PDT
*** Bug 589134 has been marked as a duplicate of this bug. ***
Comment 7 Boris Zbarsky [:bz] (Out June 25-July 6) 2010-08-23 10:22:14 PDT
David, any chance of you picking this up again?
Comment 8 David Zbarsky (:dzbarsky) 2010-08-23 11:33:02 PDT
Sure, I forgot about this
Comment 9 David Zbarsky (:dzbarsky) 2010-08-26 13:42:52 PDT
Created attachment 469594 [details] [diff] [review]
Patch

I'm not sure that this is correct, because we always pass mXMLMatchAtom to the callback in ::Match.  Maybe it should check matchHTML and then pass in mXMLMatchAtom/mHTMLMatchAtom accordingly?
Comment 10 Jonas Sicking (:sicking) PTO Until July 5th 2010-08-30 16:59:48 PDT
Comment on attachment 469594 [details] [diff] [review]
Patch

>diff --git a/content/base/src/nsContentList.cpp b/content/base/src/nsContentList.cpp
...
>   if (!list) {
>     // We need to create a ContentList and add it to our new entry, if
>     // we have an entry
>-    list = new nsContentList(aRootNode, aMatchAtom,
>-                             aMatchNameSpaceId);
>+    list = new nsContentList(aRootNode, aMatchNameSpaceId,
>+                             aHTMLMatchAtom, aXMLMatchAtom);
>+    NS_ASSERTION(list, "Infallible malloc failed.");

Nit: No need for this assertion IMHO.


> nsContentList::nsContentList(nsINode* aRootNode,
>-                             nsIAtom* aMatchAtom,
>                              PRInt32 aMatchNameSpaceId,
>+                             nsIAtom* aHTMLMatchAtom,
>+                             nsIAtom* aXMLMatchAtom,
>                              PRBool aDeep)
>   : nsBaseContentList(),
>-    nsContentListKey(aRootNode, aMatchAtom, aMatchNameSpaceId),
>+    nsContentListKey(aRootNode, aHTMLMatchAtom, aXMLMatchAtom, aMatchNameSpaceId),
>     mFunc(nsnull),
>     mDestroyFunc(nsnull),
>     mData(nsnull),
>     mState(LIST_DIRTY),
>     mDeep(aDeep),
>     mFuncMayDependOnAttr(PR_FALSE)
> {
>   NS_ASSERTION(mRootNode, "Must have root");
>-  if (nsGkAtoms::_asterix == mMatchAtom) {
>+  if (nsGkAtoms::_asterix == mHTMLMatchAtom) {
>     mMatchAll = PR_TRUE;

Assert that mXMLMatchAtom is also nsGkAtoms::_asterix in this case.

> nsContentList::Match(Element *aElement)
...
>+  nsIDocument* doc = aElement->GetCurrentDoc();
>+  PRBool matchHTML = aElement->GetNameSpaceID() == kNameSpaceID_XHTML &&
>+    doc && doc->IsHTML();

you want GetOwnerDoc() instead of GetCurrentDoc().

>diff --git a/content/base/src/nsContentList.h b/content/base/src/nsContentList.h
>   nsContentListKey(nsINode* aRootNode,
>-                   nsIAtom* aMatchAtom, 
>+                   nsIAtom* aHTMLMatchAtom,
>+                   nsIAtom* aXMLMatchAtom,
>                    PRInt32 aMatchNameSpaceId)
>-    : mMatchAtom(aMatchAtom),
>+    : mHTMLMatchAtom(aHTMLMatchAtom),
>+      mXMLMatchAtom(aXMLMatchAtom),
>       mMatchNameSpaceId(aMatchNameSpaceId),
>       mRootNode(aRootNode)
>   {
>+    NS_ASSERTION((aXMLMatchAtom == nsnull && aHTMLMatchAtom == nsnull) ||
>+                 (aXMLMatchAtom != nsnull && aHTMLMatchAtom != nsnull), "Either neither or both atoms should be null");
>   }

Ugly/neat way of doing this is
NS_ASSERTION((aXMLMatchAtom == nsnull) == (aHTMLMatchAtom == nsnull), "...");
or
NS_ASSERTION(!aXMLMatchAtom == !aHTMLMatchAtom, "...");


>diff --git a/content/base/src/nsDocument.cpp b/content/base/src/nsDocument.cpp
> nsDocument::GetElementsByTagName(const nsAString& aTagname)
...
>+  nsAutoString lowercaseName;
>+  ToLowerCase(aTagname, lowercaseName);
>+  nsCOMPtr<nsIAtom> xmlAtom = do_GetAtom(aTagname);
>+  nsCOMPtr<nsIAtom> htmlAtom = do_GetAtom(lowercaseName);
>+
>+  return NS_GetContentList(this, kNameSpaceID_Unknown, htmlAtom, xmlAtom);

Actually, you should use nsContentUtils::ASCIIToLower here. Sorry for not catching that earlier. (yes, I know that the old code got this wrong).
Same in nsGenericElement::GetElementsByTagName

>diff --git a/content/base/test/test_bug499656.html b/content/base/test/test_bug499656.html
...
>+div1 = document.createElementNS("http://www.w3.org/1999/xhtml","test");
>+div2 = document.createElementNS("http://www.w3.org/1999/xhtml","TEst");
>+div3 = document.createElementNS("test","test");
>+div4 = document.createElementNS("test","TEst");
>+
>+content = document.getElementById("content");
>+
>+content.appendChild(div1);
>+content.appendChild(div2);
>+content.appendChild(div3);
>+content.appendChild(div4);
>+
>+list = document.getElementsByTagName('test');
>+is(list.length, 2, "Number of elements found");
>+ok(list[0] == div1, "First element didn't match");
>+ok(list[1] == div3, "Third element didn't match");
>+
>+list = document.getElementsByTagName('TEst');
>+is(list.length, 2, "Wrong number of elements found");
>+ok(list[1] == div1, "First element didn't match");
>+ok(list[2] == div4, "Fourth element didn't match");

Can you also test that getElementsByTagNameNS *doesn't* do any case folding.
And test that document.createElement("TEst") and document.createElement("test") produces elements which ends up in the first list, but not the second.

Same in test_bug499656.xhtml, but there the createElement("TEst") should end up in the second list but not the first.

r=me with those changes.
Comment 11 David Zbarsky (:dzbarsky) 2010-08-31 19:47:58 PDT
Created attachment 471010 [details] [diff] [review]
Patch with nits r=sicking

Using the AsCIILowercase ended up fixing the test failure
Comment 12 Boris Zbarsky [:bz] (Out June 25-July 6) 2010-09-09 21:14:07 PDT
Comment on attachment 471010 [details] [diff] [review]
Patch with nits r=sicking

This needs approval to land, right?
Comment 13 Boris Zbarsky [:bz] (Out June 25-July 6) 2010-09-15 15:02:44 PDT
Pushed http://hg.mozilla.org/mozilla-central/rev/3a418c95938d

Note You need to log in before you can comment on or make changes to this bug.