Closed Bug 357450 Opened 18 years ago Closed 17 years ago

Implement getElementsByClassName

Categories

(Core :: DOM: Core & HTML, enhancement)

enhancement
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla1.9

People

(Reporter: sayrer, Assigned: sayrer)

References

()

Details

Attachments

(1 file, 18 obsolete files)

42.46 KB, patch
Details | Diff | Splinter Review
getElementsByClassName would be nice to have for our own use. 

getElementsByCSSSelector would be the more expressive alternative that would require me to write another language inside a string literal. This is not that bug.
Use cases: 

http://www.google.com/search?q=getElementsByClassName
Results 1 - 10 of about 430,000 for getElementsByClassName.
One could use this to extract microformats from a page with a single function call, f.e. this call, which grabs all hCards:

document.getElementsByClassName("vcard");
Turns out we have a standard for this too.

http://whatwg.org/specs/web-apps/current-work/#getelementsbyclassname

Sweet.
OS: Mac OS X 10.3 → All
Hardware: PC → All
Summary: getElementsByClassName → Implement getElementsByClassName
In the spec, Hixie says:

"There is an open issue on whether we should use multiple arguments or just one argument that needs to be split on spaces."

I think we should allow both, sort of. The first (and only?) argument should be allowed to be a string or an array. If it's a string, we'll call split() for you.

Thoughts?
OS: All → Mac OS X 10.3
Hardware: All → PC
Both seem reasonable, yeah. There's been a lot of discussion on this on the WHATWG list.
OS: Mac OS X 10.3 → All
Hardware: PC → All
At the moment the WHATWG spec defines a single method that takes a single argument, which must be an array. Other arguments would cause an exception to be thrown. I recommend implementing that for now; if, during implementation and initial deployment, we find that Web authors frequently don't understand this, then we can change to some more suitable model. Please keep the WHATWG list apprised of your implementation experience. Thanks!
I'll take this. It'll be fun.
Assignee: general → sayrer
This is the first DOM method I know of that takes an array, so I had to unpack it by hand. Kind of irritating, but should also work in PyXPCOM, etc.
fwiw, it may be worth using nsTArray<nsString> instead of nsStringArray.  Though might now matter much, and eventually we'd like to make nsStringArray just be nsTArray<nsString>...
Attachment #247592 - Attachment is obsolete: true
fwiw, that could use some OOM check and the delete calls should happen on the right pointers (cast to the actual class you implement, not some superclass), no?
Attached patch WIP - Fix bugs bz spotted (obsolete) — Splinter Review
Attachment #247628 - Attachment is obsolete: true
but the methods live in the wrong place and I haven't done document.getElementsByClassName yet.
Attachment #247690 - Attachment is obsolete: true
jst says:

"the least overhead would probably be to write a non-scriptable method that takes an nsTArray<nsAString> and write JS glue in nsDOMClassInfo.cpp to make it work with either a single string or an array of strings"
You'd need nsTArray<nsString>, but I agree that that's the way to go.
Attached patch WIP - helper class style (obsolete) — Splinter Review
Attachment #247712 - Attachment is obsolete: true
Attachment #247923 - Attachment is obsolete: true
Attachment #247924 - Attachment is obsolete: true
Attached patch WIP - works. now for cleanup. (obsolete) — Splinter Review
Works on documents and elements, unit tests cover HTML, XHTML, XHTML/SVG, and XUL/XHTML/SVG.
Attachment #247984 - Attachment is obsolete: true
Attached patch getElementsByClassName (obsolete) — Splinter Review
Attachment #247999 - Attachment is obsolete: true
Attached patch getElementsByClassName (obsolete) — Splinter Review
Attachment #248004 - Attachment is obsolete: true
Attachment #248007 - Flags: review?(jst)
hmm, maybe I should change this to return a node list if it gets an argument. That's what getElementsByTagName seems to do.
That's what the spec says to do...
(In reply to comment #25)
> That's what the spec says to do...

Actually, you told me to throw an exception for something like

> document.getElementsByClassName({})

or

> document.getElementsByClassName([[]])

but browsers don't seem to do that.


<html>
<body>
 <p id="null">WebKit finds this</p>
 <p id="undefined">Firefox and WebKit find this</p>
 <p id=",">Firefox and WebKit find this</p>
 
 <script type="text/javascript">
 document.write(document.getElementById(null) + "<br>" + 
                document.getElementById(undefined) + "<br>" + 
                document.getElementById(["",""]))
 </script>
   
</body>
</html>
(In reply to comment #24)
> hmm, maybe I should change this to return a node list if it gets an argument.
> That's what getElementsByTagName seems to do.
> 

ah, this should have said "if it gets a *bogus* argument"... getElementsByTagName seems to return a node list no matter what.
Oh, a bogus argument. Yeah, do whatever you think is right for that case.
Comment on attachment 248007 [details] [diff] [review]
getElementsByClassName

"A call to getElementsByClassName(['ccc', 'bbb']) would only return one node, however, namely p3. A call to document.getElementById('example').getElementsByClassName('ccc bbb') would return the same thing."

Oops, I missed the string tokenization requirement.
Attachment #248007 - Flags: review?(jst)
Robert, that's in one of the non normative examples. I'm pretty sure that's a bug in the spec. (Although personally I think having it accept a single string might make more sense than an array given how .className works.)
Attached patch getElementsByClassName (obsolete) — Splinter Review
OK, this one seems to be compliant, and I added unit tests for all of the examples in the WHATWG doc. However, things got messy again. I think I want to create an interface called nsIDOMDocumentOrElement (not a node). Since I already had to write a tearoff for nsGenericElement, I'm guessing I can use it in nsDocument.
Attachment #248007 - Attachment is obsolete: true
Attached patch everything in the tearoff (obsolete) — Splinter Review
Attachment #248289 - Attachment is obsolete: true
Attached patch getElementsByClassName (obsolete) — Splinter Review
Attachment #248314 - Attachment is obsolete: true
Comment on attachment 248333 [details] [diff] [review]
getElementsByClassName

there's a little bit of duplication in nsIDOMClassInfo, between the two helper classes. I could probably make that go away with a new helper they would both inherit from. Not sure if that's a good idea.
Attachment #248333 - Flags: review?(jst)
Comment on attachment 248333 [details] [diff] [review]
getElementsByClassName

+	nsIDOMNSDocumentOrElement.idl		\

I'd much rather to see this new method be added to nsIDOMNSDocument and a new nsIDOMNSElement interface than a new combined document&element interface.

+interface nsIDOMNSDocumentOrElement : nsISupports
+{
+  // Web Applications 1.0
+  [noscript] nsIDOMNodeList GetElementsByClassNameTArray(in StringArray classes);
+  nsIDOMNodeList getElementsByClassName(in DOMString classesString);

This is something that really should be discussed in the WhatWG group, but wouldn't this be easier to implement and just as easy to use if the argument to this method was a string that lets you specify more than one class name by separating them with whitespace, rather than to go into this whole mess with arrays (which has so far ben avoided completely in the DOM APIs)?

- In nsElementSH::NewResolve():

+    JSString *str = JSVAL_TO_STRING(id);
+    JSAutoRequest ar(cx);
+    JSFunction *cfnc =
+      ::JS_DefineFunction(cx, obj, ::JS_GetStringBytes(str),
+                          GetElementsByClassName, 0, 0);

Using JS_DefineUCFunction(cx, obj, ::JS_GetStringChars(str), ::JS_GetStringLength(str), ...) here would potentially save us from needing to keep the single byte version of the function name around. I know we do this in other places in nsDOMClassInfo.cpp (due to some of the code predating JS_DefineUCFunction), but the same applies to those places as well.

- In doGetElementsByClassName():

The isDoc argument looks unused.

+  } else if (JSVAL_IS_OBJECT(argv[0]) &&
+             JS_ValueToObject(cx, argv[0], &arrObj) &&
+             JS_IsArrayObject(cx, arrObj)) {

You don't need JS_ValueToObject() here, just use the JSVAL_TO_OBJECT() macro directly.

+    jsval elemVal;

This variable is only used inside the below loop, move the declaration there.

+      for (jsuint i=0; i < arrayLen; i++) {
+        jsstr = nsnull;
+        if (JS_GetElement(cx, arrObj, i, &elemVal) &&
+            JSVAL_IS_PRIMITIVE(elemVal)) {
+          jsstr = JS_ValueToString(cx, elemVal);
+        }

Why allow non-string primitive values in the array but not objects? I'd say we either allow any type of object as long as it converts to a string that's a valid class name, or we only allow string values. If we choose to permit only string values, use the macro JSVAL_TO_STRING() here and not the more expensive JS_ValueToString() call.

+    rv = element->GetElementsByClassNameTArray(&classes,
+                                               getter_AddRefs(nodeList));

Seems like the C++-only method in this interface should take a reference to a nsTArray, not a pointer.

+  if (NS_FAILED(rv)) {
+    nsDOMClassInfo::ThrowJSException(cx, rv);
+    return JS_FALSE;
+  }
+
+  return NS_SUCCEEDED(rv);

The return JS_FALSE inside the if check is not necessary, fall through and you'll get the same result.

+JSBool JS_DLL_CALLBACK
+nsDocumentSH::GetElementsByClassName(JSContext *cx, JSObject *obj, uintN argc,
+                                    jsval *argv, jsval *rval)

This looks *identical* to nsElementSH::GetElementsByClassName() if you remove the unused isDoc argument to doGetElementsByClassName(). Maybe make this a non-class static and call the same method from both places?

- In nsContentList::~nsContentList():

+  if (mDestroyFunc) {
+    // Clean up mData
+    (*mDestroyFunc)(mData);

Maybe only call this if mData is non-null to prevent ever needing to null check the data in the destroy function implementation?

- In content/base/src/nsDocument.cpp:

+// NS_INTERFACE_MAP_ENTRY_TEAROFF(nsIDOMNSDocumentOrElement, 
+//                               new nsDocumentOrElement(this->mRootContent));
+

Don't need this.

+nsresult
+nsDocumentOrElement::GetElementsByAtomArray(nsIContent* aContent,
+                                            nsCOMArray<nsIAtom>* aClasses,
+                                            nsIDOMNodeList** aReturn)
+{
+  if (aClasses->Count() > 0 && aContent) {
+    nsContentList* elements =
+      new nsContentList(aContent, MatchClassNames,
+                        DestroyClassNameArray, aClasses);
+    NS_ENSURE_TRUE(elements, NS_ERROR_OUT_OF_MEMORY);

This'll leak aClasses...

+    *aReturn = elements;
+  } else {
+    *aReturn = new nsBaseContentList();
+    if (!*aReturn) {
+      return NS_ERROR_OUT_OF_MEMORY;

and so will this.

+    }
+  }

And the above duplicates the null check, you could move that out here. And "elements" in the if() case above is not needed, assign directly into *aReturn there too.

+  NS_IF_ADDREF(*aReturn);

*aReturn is already null-checked, so use NS_ADDREF().

- In nsHTMLDocument.cpp:

+nsHTMLDocument::DestroyNameAttribute(void* aData)
+{
+  nsString* elementName = NS_STATIC_CAST(nsString*, aData);
+  delete elementName;
 }

There's a few of these around. Maybe stick one of these in nsContentUtils or nsContentList and let call-sites that use nsString data pass in that shared nsString destroying method?

And all the match functions (or call sites that create the nsContentLists) would now need to null-check the data argument in case we're OOM.

Other than that this looks good, but r- until we figure out the above...
Attachment #248333 - Flags: review?(jst) → review-
>
> This is something that really should be discussed in the WhatWG group, but
> wouldn't this be easier to implement and just as easy to use if the argument to
> this method was a string that lets you specify more than one class name by
> separating them with whitespace, rather than to go into this whole mess with
> arrays (which has so far ben avoided completely in the DOM APIs)?

iirc, this has been discussed to death on the WhatWG list--I don't have a preference either way.
Attachment #248333 - Attachment is obsolete: true
Attached patch nits fixed (obsolete) — Splinter Review
#whatwg says we can change to string argument only. Besides, we can always add the array goop later if consensus shifts.
Attachment #251691 - Attachment is obsolete: true
Attachment #251694 - Flags: review?(jst)
I assume the quirks-mode difference between this code and CSS matching is purposeful?  In quirks mode, class names are case-insensitive for matching purposes...
Attached patch fixed flubbed null check (obsolete) — Splinter Review
Attachment #251694 - Attachment is obsolete: true
Attachment #251701 - Flags: review?(jst)
Attachment #251694 - Flags: review?(jst)
Comment on attachment 251701 [details] [diff] [review]
fixed flubbed null check

>? dom/public/idl/core/nsIDOMNSElement.idl

You forgot to add this to the patch.


>Index: dom/public/idl/core/nsIDOMNSDocument.idl

>+  nsIDOMNodeList getElementsByClassName(in DOMString classes);

Need to bump the IID here.
Attached patch do the idl files right (obsolete) — Splinter Review
good catch, thanks
Attachment #251701 - Attachment is obsolete: true
Attachment #251724 - Flags: review?(jst)
Attachment #251701 - Flags: review?(jst)
(In reply to comment #39)
> I assume the quirks-mode difference between this code and CSS 
> matching is purposeful?  In quirks mode, class names are 
> case-insensitive for matching purposes...

Quirks mode is just for legacy pages; since there are no legacy pages that use this API, no quirks are needed, IMHO.
Comment on attachment 251724 [details] [diff] [review]
do the idl files right

- In nsDocument::DestroyClassNameArray(void* aData):

+  if (aData) {
+    nsCOMArray<nsIAtom>* classes = NS_STATIC_CAST(nsCOMArray<nsIAtom>*, aData);
+    delete classes;

The if (aData) check isn't needed, delete is null safe.

- In content/base/src/nsDocument.h:

+  /**
+   * Utility methods for getElementsByClassName
+   */
+  static nsresult GetElementsByClassNameHelper(nsIContent* aContent,
+                                               const nsAString& aClasses,
+                                               nsIDOMNodeList** aReturn);
+   
+  // nsContentList match functions
+  static PRBool MatchClassNames(nsIContent* aContent, PRInt32 aNamespaceID,
+                                nsIAtom* aAtom, void* aData);
+  static void DestroyClassNameArray(void* aData);

Since the above are used by both nsDocument and nsGenericElement one could argue that the above belongs in nsContentUtils rather than in nsDocument. Thoughts?

And it looks like the MatchClassNames() and DestroyClassNameArray() functions could be non-class statics, or at least protected static methods as they're only ever used internally by GetElementsByClassNameHelper().

+NS_INTERFACE_MAP_BEGIN(nsNSElementTearoff)
+ NS_INTERFACE_MAP_ENTRY(nsIDOMNSElement)

Add a space there to match surrounding indentation.

r=jst with that.
Attachment #251724 - Flags: review?(jst) → review+
(In reply to comment #44)
> 
> Since the above are used by both nsDocument and nsGenericElement one could
> argue that the above belongs in nsContentUtils rather than in nsDocument.
> Thoughts?

I can't think of a reason they would be useful elsewhere, though. I followed the pattern used in nsXULElement and nsXULDocument, which share match functions that live in nsXULDocument. Either location is OK with me.

Attached patch address jst's comments (obsolete) — Splinter Review
Attachment #251724 - Attachment is obsolete: true
Attachment #252419 - Flags: superreview?(bzbarsky)
Attachment #252419 - Flags: superreview?(bzbarsky) → superreview?(peterv)
It might take me a week or two to get to this.  :(
Comment on attachment 252419 [details] [diff] [review]
address jst's comments

>Index: testing/mochitest/tests/test_bug357450.js
>===================================================================

>+  var nodes = root.getElementsByClassName("f");

>+  is(nodes.length, 0, "empty string should get an empty nodelist");

Didn't you just pass in "f" instead of the empty string?

>+  nodes = root.getElementsByClassName([1]);

So we still support arrays?

>+  nodes = root.getElementsByClassName({});

>+  is(nodes.length, 0, "bogus arg should get an empty nodelist");

I'm guessing we don't do that if the object passed in provides a toString?

>Index: testing/mochitest/tests/test_bug357450.xul
>===================================================================

>\ No newline at end of file

Fix this.

>Index: testing/mochitest/tests/test_bug357450_svg.xhtml
>===================================================================

>\ No newline at end of file

>Index: content/base/src/nsDocument.cpp
>===================================================================

>+nsDocument::MatchClassNames(nsIContent* aContent,

>+  PRInt32 length = classes->Count();
>+  for (PRInt32 i=0; i < length; i++) {

Nits: declare i outside of the loop, spaces around = and ++i.

>Index: content/base/src/nsGenericElement.h
>===================================================================

>+  virtual ~nsNSElementTearoff() {};

Do we even need this?

>Index: content/html/document/src/nsHTMLDocument.cpp
>===================================================================

> nsHTMLDocument::MatchNameAttribute(nsIContent* aContent, PRInt32 aNamespaceID,

>+  nsAString* elementName = NS_STATIC_CAST(nsAString*, aData);

I'd cast to nsString*.

>Index: content/xul/document/src/nsXULDocument.cpp
>===================================================================

> nsXULDocument::MatchAttribute(nsIContent* aContent,

>+    nsAString* attrValue = NS_STATIC_CAST(nsAString*, aData);

Ditto.

>Index: dom/public/idl/core/nsIDOMNSElement.idl
>===================================================================

>+ * The Initial Developer of the Original Code is
>+ * Netscape Communications Corporation.
>+ * Portions created by the Initial Developer are Copyright (C) 2000
>+ * the Initial Developer. All Rights Reserved.
>+ *
>+ * Contributor(s):
>+ *   Vidur Apparao <vidur@netscape.com> (original author)
>+ *   Johnny Stenback <jst@netscape.com>

Heh, haven't seen Vidur in a while. ;-)

>+  nsIDOMNodeList getElementsByClassName(in DOMString classes);

Want to add a comment describing this, maybe with a link to the spec?

Looks great, thanks.
Attachment #252419 - Flags: superreview?(peterv) → superreview+
(In reply to comment #48)
> 
> >+  nodes = root.getElementsByClassName([1]);
> 
> So we still support arrays?

No, we coerce that to the string "1". see comment #26.
Attachment #252419 - Attachment is obsolete: true
Checking in dom/public/idl/core/Makefile.in;
/cvsroot/mozilla/dom/public/idl/core/Makefile.in,v  <--  Makefile.in
new revision: 1.13; previous revision: 1.12
done
Checking in dom/public/idl/core/nsIDOMNSDocument.idl;
/cvsroot/mozilla/dom/public/idl/core/nsIDOMNSDocument.idl,v  <--  nsIDOMNSDocument.idl
new revision: 1.14; previous revision: 1.13
done
RCS file: /cvsroot/mozilla/dom/public/idl/core/nsIDOMNSElement.idl,v
done
Checking in dom/public/idl/core/nsIDOMNSElement.idl;
/cvsroot/mozilla/dom/public/idl/core/nsIDOMNSElement.idl,v  <--  nsIDOMNSElement.idl
initial revision: 1.1
done
Checking in dom/src/base/nsDOMClassInfo.cpp;
/cvsroot/mozilla/dom/src/base/nsDOMClassInfo.cpp,v  <--  nsDOMClassInfo.cpp
new revision: 1.425; previous revision: 1.424
done
Checking in content/base/public/nsContentUtils.h;
/cvsroot/mozilla/content/base/public/nsContentUtils.h,v  <--  nsContentUtils.h
new revision: 1.117; previous revision: 1.116
done
Checking in content/base/src/nsContentList.cpp;
/cvsroot/mozilla/content/base/src/nsContentList.cpp,v  <--  nsContentList.cpp
new revision: 3.102; previous revision: 3.101
done
Checking in content/base/src/nsContentList.h;
/cvsroot/mozilla/content/base/src/nsContentList.h,v  <--  nsContentList.h
new revision: 3.68; previous revision: 3.67
done
Checking in content/base/src/nsDocument.cpp;
/cvsroot/mozilla/content/base/src/nsDocument.cpp,v  <--  nsDocument.cpp
new revision: 3.703; previous revision: 3.702
done
Checking in content/base/src/nsDocument.h;
/cvsroot/mozilla/content/base/src/nsDocument.h,v  <--  nsDocument.h
new revision: 3.325; previous revision: 3.324
done
Checking in content/base/src/nsGenericElement.cpp;
/cvsroot/mozilla/content/base/src/nsGenericElement.cpp,v  <--  nsGenericElement.cpp
new revision: 3.529; previous revision: 3.528
done
Checking in content/base/src/nsGenericElement.h;
/cvsroot/mozilla/content/base/src/nsGenericElement.h,v  <--  nsGenericElement.h
new revision: 3.241; previous revision: 3.240
done
Checking in content/html/content/src/nsHTMLTableRowElement.cpp;
/cvsroot/mozilla/content/html/content/src/nsHTMLTableRowElement.cpp,v  <--  nsHTMLTableRowElement.cpp
new revision: 1.112; previous revision: 1.111
done
Checking in content/html/document/src/nsHTMLDocument.cpp;
/cvsroot/mozilla/content/html/document/src/nsHTMLDocument.cpp,v  <--  nsHTMLDocument.cpp
new revision: 3.710; previous revision: 3.709
done
Checking in content/html/document/src/nsHTMLDocument.h;
/cvsroot/mozilla/content/html/document/src/nsHTMLDocument.h,v  <--  nsHTMLDocument.h
new revision: 3.204; previous revision: 3.203
done
Checking in content/xul/content/src/nsXULElement.cpp;
/cvsroot/mozilla/content/xul/content/src/nsXULElement.cpp,v  <--  nsXULElement.cpp
new revision: 1.682; previous revision: 1.681
done
Checking in content/xul/document/src/nsXULDocument.cpp;
/cvsroot/mozilla/content/xul/document/src/nsXULDocument.cpp,v  <--  nsXULDocument.cpp
new revision: 1.747; previous revision: 1.746
done
Checking in content/xul/document/src/nsXULDocument.h;
/cvsroot/mozilla/content/xul/document/src/nsXULDocument.h,v  <--  nsXULDocument.h
new revision: 1.191; previous revision: 1.190
done
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Flags: in-testsuite?
Flags: in-testsuite? → in-testsuite+
Hi, you guys might want to have a look at the testcases that fail here: http://tc.labs.opera.com/apis/getElementsByClassName/
(In reply to comment #52)
> Hi, you guys might want to have a look at the testcases that fail here:
> http://tc.labs.opera.com/apis/getElementsByClassName/
> 

You might, too. ;)

See comment 43 for details on "quirks mode" behavior.
Most of these failures look like our nodelist code missing the class attribute on the <html> element. The tokenization of the parameter and dynamic updates seem fine.
I think the quirks mode test fails because of the <html> element problem. One other thing that seems go wrong is not recognizing \f as space character (007 and 008). HTML5 currently defines that as space character and I've assumed that this API will simply reuse that definition.
> Most of these failures look like our nodelist code missing the class attribute
> on the <html> element.

It's not missing anything. It's being told to ignore the root by the caller.  That is, nsDocument::GetElementsByClassName is just buggy here.
Depends on: 395915
Filed bug 395915.

The whitespace issue should get a bug once that part of the API is stable, I guess.
Note that I'm not sure the 011.xml test is correct.  Since there is no <x> element defined by SVG, I'm not sure the "class" attribute applies to it.  Someone should check with the SVG working group, and I would suggest using valid markup for the test, unless you mean to be testing SVG error handling (which is not really defined on the DOM level, so I'm not sure what you'd be testing).
I have assumed that each element in the SVG namespace supports the global attributes much like each element in the HTML namespace does. I suppose this might not be what the SVG draft says, but it seems the right thing to do...
It's not clear that it is the right thing, per SVG spec.  As a simple example, per the spec schema the "class" attribute is only allowed on some SVG elements (the ones implementing SVGStylableElement) and not others.  We're actually violating that for existing SVG elements in the interests of having this API work better on valid SVG documents.  But for invalid ones, I don't think we really care to make it work.
Depends on: 399334
Depends on: 390411
Depends on: iWeb
can folks that worked on this comment on the possible side effect detected in bug 421015

-thanks, chofmann
Speaking of side effects this one is also a possible candidate: bug #428293
Depends on: 433833
No longer depends on: gaelFS
Status: RESOLVED → VERIFIED
Target Milestone: --- → mozilla1.9
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.