Closed Bug 499656 Opened 15 years ago Closed 14 years ago

nsContentList should have dual atoms: HTML and other

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla2.0b7

People

(Reporter: hsivonen, Assigned: dzbarsky)

References

()

Details

(Keywords: html5)

Attachments

(1 file, 3 obsolete files)

Steps to reproduce:
 1) Load http://livedom.validator.nu/?%3C!DOCTYPE%20html%3E%0A%3Csvg%3E%0A%3CfeBlend%2F%3E%0A%3C%2Fsvg%3E%0A%3Cscript%3Ew%28document.getElementsByTagName%28%27svg%27%29.length%29%3C%2Fscript%3E%0A%3Cscript%3Ew%28document.getElementsByTagName%28%27feBlend%27%29.length%29%3C%2Fscript%3E

Expected results:
Expected log to say
1
1

Actual results:
Log says
1
0

See http://lists.w3.org/Archives/Public/public-html/2009Apr/0081.html

As of HTML5, SVG names with uppercase ASCII letters can enter into a text/html DOM. Even without an HTML5 parser, such elements can be introduced with createElementNS().

Currently, nsContentList atoms associated with text/html DOMs are lowercased. This makes it impossible to match SVG camelCase names. Going forward, nsContentList should have dual atoms: one atom for matching against elements in the http://www.w3.org/1999/xhtml namespace and another for matching against all other elements.

The atom for matching against elements in the http://www.w3.org/1999/xhtml namespace should be ASCII-lowercased if the document is a text/html document. The other atom should always be in the original case.
See also bug 499655.
Attached patch Patch (obsolete) — Splinter Review
Attachment #390759 - Flags: review?(jonas)
Drive-by comments:

1)  Why change the signature of the constructor that takes a match function?
    Those types of content lists never need the extra atom.
2)  This doesn't look like it does the right thing (which is using the HTML atom
    for HTML content and the XML atom otherwise).  Some tests for correct
    behavior in HTML documents with XML content in them (which are missing from
    this patch!) would have caught that, I think.
Attached patch Updated Patch with tests (obsolete) — Splinter Review
Attachment #390759 - Attachment is obsolete: true
Attachment #391369 - Flags: review?(jonas)
Attachment #390759 - Flags: review?(jonas)
Comment on attachment 391369 [details] [diff] [review]
Updated Patch with tests

Why change the order of the arguments to NS_GetContentList, but not to the nsContentList ctor? I don't really care about the order, but it's nice to be consistent.

Also, I think it might be worth caching the result of (mRoot->GetCurrentDoc()->IsCaseSensitive()). Then implement ParentChainChanged (it's one of the notifications on nsIMutationObserver), and update the cached value there.

>@@ -703,35 +710,41 @@ nsContentList::Match(nsIContent *aConten
> {
>   if (!aContent)
>     return PR_FALSE;
> 
>   NS_ASSERTION(aContent->IsNodeOfType(nsINode::eELEMENT),
>                "Must have element here");
> 
>   if (mFunc) {
>-    return (*mFunc)(aContent, mMatchNameSpaceId, mMatchAtom, mData);
>+    return (*mFunc)(aContent, mMatchNameSpaceId, mXMLMatchAtom, mData);
>   }
> 
>-  if (mMatchAtom) {
>-    nsINodeInfo *ni = aContent->NodeInfo();
>+  if (!mXMLMatchAtom)
>+    return PR_FALSE;
> 
>-    if (mMatchNameSpaceId == kNameSpaceID_Unknown) {
>-      return (mMatchAll || ni->QualifiedNameEquals(mMatchAtom));
>-    }
>+  nsINodeInfo *ni = aContent->NodeInfo();
> 
>-    if (mMatchNameSpaceId == kNameSpaceID_Wildcard) {
>-      return (mMatchAll || ni->Equals(mMatchAtom));
>-    }
>+  PRBool matchHTML = aContent->GetNameSpaceID() == kNameSpaceID_XHTML &&
>+    !(aContent->GetCurrentDoc()->IsCaseSensitive());
> 
>-    return ((mMatchAll && ni->NamespaceEquals(mMatchNameSpaceId)) ||
>-            ni->Equals(mMatchAtom, mMatchNameSpaceId));
>+  if (mMatchNameSpaceId == kNameSpaceID_Unknown) {
>+    return mMatchAll || ni->QualifiedNameEquals(mXMLMatchAtom)  || 
>+      (matchHTML &&  ni->QualifiedNameEquals(mHTMLMatchAtom));
>   }
>-
>-  return PR_FALSE;
>+  
>+  if (mMatchNameSpaceId == kNameSpaceID_Wildcard) {
>+    return mMatchAll || ni->Equals(mXMLMatchAtom) || 
>+      (matchHTML &&  ni->Equals(mHTMLMatchAtom));
>+  }
>+  
>+  return (mMatchAll && ni->NamespaceEquals(mMatchNameSpaceId)) ||
>+    ni->Equals(mXMLMatchAtom, mMatchNameSpaceId)  || 
>+    (matchHTML && ni->Equals(mXMLMatchAtom, mMatchNameSpaceId));
>+  
> }

If mMatchAll is true you are unnecessarily computing matchHTML. Might be worth dealing with mMatchAll separately.

Also, if matchHTML is false, you don't want to do the comparison with mXMLMatchAtom, right? So for example:

  ni->QualifiedNameEquals(mXMLMatchAtom)  ||
  (matchHTML &&  ni->QualifiedNameEquals(mHTMLMatchAtom)

should be

  matchHTML ? ni->QualifiedNameEquals(mHTMLMatchAtom) :
              ni->QualifiedNameEquals(mXMLMatchAtom)


>diff --git a/content/base/src/nsDocument.cpp b/content/base/src/nsDocument.cpp
>--- a/content/base/src/nsDocument.cpp
>+++ b/content/base/src/nsDocument.cpp
>@@ -4266,20 +4266,30 @@ nsDocument::CreateEntityReference(const 
>   *aReturn = nsnull;
>   return NS_OK;
> }
> 
> NS_IMETHODIMP
> nsDocument::GetElementsByTagName(const nsAString& aTagname,
>                                  nsIDOMNodeList** aReturn)
> {
>-  nsCOMPtr<nsIAtom> nameAtom = do_GetAtom(aTagname);
>-  NS_ENSURE_TRUE(nameAtom, NS_ERROR_OUT_OF_MEMORY);
>-
>-  nsContentList *list = NS_GetContentList(this, nameAtom, kNameSpaceID_Unknown).get();
>+  nsCOMPtr<nsIAtom> XMLAtom = do_GetAtom(aTagname);
>+  NS_ENSURE_TRUE(XMLAtom, NS_ERROR_OUT_OF_MEMORY);
>+
>+
>+  nsAutoString lowercaseName(aTagname);
>+  ToLowerCase(lowercaseName);
>+  nsCOMPtr<nsIAtom> HTMLAtom = do_GetAtom(lowercaseName);
>+  NS_ENSURE_TRUE(HTMLAtom, NS_ERROR_OUT_OF_MEMORY);

Please name the variables xmlAtom/htmlAtom to follow existing convention (variable names start with lower case).

The above is faster as:
nsAutoString lowercaseName;
ToLowerCase(aTagname, lowercaseName);

Also no need for two empty lines.

Same applies in nsGenericElement::GetElementsByTagName

>+
>+
>+  nsContentList *list = NS_GetContentList(this, 
>+                                          kNameSpaceID_Unknown, 
>+                                          HTMLAtom, 
>+                                          XMLAtom).get();

No need for two empty lines.

>diff --git a/content/html/document/src/nsHTMLDocument.cpp b/content/html/document/src/nsHTMLDocument.cpp

> nsHTMLDocument::GetLinks(nsIDOMHTMLCollection** aLinks)
> {
>   if (!mLinks) {
>-    mLinks = new nsContentList(this, MatchLinks, nsnull, nsnull);
>+    mLinks = new nsContentList(this, MatchLinks, nsnull, nsnull, nsnull);

Does this compile? I though you weren't changing this signature? Same in a few other places.

Looks good otherwise. Would like to see an updated patch though.
Attachment #391369 - Flags: review?(jonas) → review-
David, any chance of you picking this up again?
Sure, I forgot about this
Assignee: nobody → dzbarsky
Attached patch Patch (obsolete) — Splinter Review
I'm not sure that this is correct, because we always pass mXMLMatchAtom to the callback in ::Match.  Maybe it should check matchHTML and then pass in mXMLMatchAtom/mHTMLMatchAtom accordingly?
Attachment #391369 - Attachment is obsolete: true
Attachment #469594 - Flags: review?(jonas)
Comment on attachment 469594 [details] [diff] [review]
Patch

>diff --git a/content/base/src/nsContentList.cpp b/content/base/src/nsContentList.cpp
...
>   if (!list) {
>     // We need to create a ContentList and add it to our new entry, if
>     // we have an entry
>-    list = new nsContentList(aRootNode, aMatchAtom,
>-                             aMatchNameSpaceId);
>+    list = new nsContentList(aRootNode, aMatchNameSpaceId,
>+                             aHTMLMatchAtom, aXMLMatchAtom);
>+    NS_ASSERTION(list, "Infallible malloc failed.");

Nit: No need for this assertion IMHO.


> nsContentList::nsContentList(nsINode* aRootNode,
>-                             nsIAtom* aMatchAtom,
>                              PRInt32 aMatchNameSpaceId,
>+                             nsIAtom* aHTMLMatchAtom,
>+                             nsIAtom* aXMLMatchAtom,
>                              PRBool aDeep)
>   : nsBaseContentList(),
>-    nsContentListKey(aRootNode, aMatchAtom, aMatchNameSpaceId),
>+    nsContentListKey(aRootNode, aHTMLMatchAtom, aXMLMatchAtom, aMatchNameSpaceId),
>     mFunc(nsnull),
>     mDestroyFunc(nsnull),
>     mData(nsnull),
>     mState(LIST_DIRTY),
>     mDeep(aDeep),
>     mFuncMayDependOnAttr(PR_FALSE)
> {
>   NS_ASSERTION(mRootNode, "Must have root");
>-  if (nsGkAtoms::_asterix == mMatchAtom) {
>+  if (nsGkAtoms::_asterix == mHTMLMatchAtom) {
>     mMatchAll = PR_TRUE;

Assert that mXMLMatchAtom is also nsGkAtoms::_asterix in this case.

> nsContentList::Match(Element *aElement)
...
>+  nsIDocument* doc = aElement->GetCurrentDoc();
>+  PRBool matchHTML = aElement->GetNameSpaceID() == kNameSpaceID_XHTML &&
>+    doc && doc->IsHTML();

you want GetOwnerDoc() instead of GetCurrentDoc().

>diff --git a/content/base/src/nsContentList.h b/content/base/src/nsContentList.h
>   nsContentListKey(nsINode* aRootNode,
>-                   nsIAtom* aMatchAtom, 
>+                   nsIAtom* aHTMLMatchAtom,
>+                   nsIAtom* aXMLMatchAtom,
>                    PRInt32 aMatchNameSpaceId)
>-    : mMatchAtom(aMatchAtom),
>+    : mHTMLMatchAtom(aHTMLMatchAtom),
>+      mXMLMatchAtom(aXMLMatchAtom),
>       mMatchNameSpaceId(aMatchNameSpaceId),
>       mRootNode(aRootNode)
>   {
>+    NS_ASSERTION((aXMLMatchAtom == nsnull && aHTMLMatchAtom == nsnull) ||
>+                 (aXMLMatchAtom != nsnull && aHTMLMatchAtom != nsnull), "Either neither or both atoms should be null");
>   }

Ugly/neat way of doing this is
NS_ASSERTION((aXMLMatchAtom == nsnull) == (aHTMLMatchAtom == nsnull), "...");
or
NS_ASSERTION(!aXMLMatchAtom == !aHTMLMatchAtom, "...");


>diff --git a/content/base/src/nsDocument.cpp b/content/base/src/nsDocument.cpp
> nsDocument::GetElementsByTagName(const nsAString& aTagname)
...
>+  nsAutoString lowercaseName;
>+  ToLowerCase(aTagname, lowercaseName);
>+  nsCOMPtr<nsIAtom> xmlAtom = do_GetAtom(aTagname);
>+  nsCOMPtr<nsIAtom> htmlAtom = do_GetAtom(lowercaseName);
>+
>+  return NS_GetContentList(this, kNameSpaceID_Unknown, htmlAtom, xmlAtom);

Actually, you should use nsContentUtils::ASCIIToLower here. Sorry for not catching that earlier. (yes, I know that the old code got this wrong).
Same in nsGenericElement::GetElementsByTagName

>diff --git a/content/base/test/test_bug499656.html b/content/base/test/test_bug499656.html
...
>+div1 = document.createElementNS("http://www.w3.org/1999/xhtml","test");
>+div2 = document.createElementNS("http://www.w3.org/1999/xhtml","TEst");
>+div3 = document.createElementNS("test","test");
>+div4 = document.createElementNS("test","TEst");
>+
>+content = document.getElementById("content");
>+
>+content.appendChild(div1);
>+content.appendChild(div2);
>+content.appendChild(div3);
>+content.appendChild(div4);
>+
>+list = document.getElementsByTagName('test');
>+is(list.length, 2, "Number of elements found");
>+ok(list[0] == div1, "First element didn't match");
>+ok(list[1] == div3, "Third element didn't match");
>+
>+list = document.getElementsByTagName('TEst');
>+is(list.length, 2, "Wrong number of elements found");
>+ok(list[1] == div1, "First element didn't match");
>+ok(list[2] == div4, "Fourth element didn't match");

Can you also test that getElementsByTagNameNS *doesn't* do any case folding.
And test that document.createElement("TEst") and document.createElement("test") produces elements which ends up in the first list, but not the second.

Same in test_bug499656.xhtml, but there the createElement("TEst") should end up in the second list but not the first.

r=me with those changes.
Attachment #469594 - Flags: review?(jonas) → review+
Using the AsCIILowercase ended up fixing the test failure
Attachment #469594 - Attachment is obsolete: true
Keywords: checkin-needed
Comment on attachment 471010 [details] [diff] [review]
Patch with nits r=sicking

This needs approval to land, right?
Attachment #471010 - Flags: approval2.0?
Attachment #471010 - Flags: approval2.0? → approval2.0+
Pushed http://hg.mozilla.org/mozilla-central/rev/3a418c95938d
Status: NEW → RESOLVED
Closed: 14 years ago
Flags: in-testsuite+
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b7
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: