Last Comment Bug 591467 - Implement HTML Microdata API
: Implement HTML Microdata API
Status: RESOLVED FIXED
: dev-doc-complete
Product: Core
Classification: Components
Component: DOM (show other bugs)
: Trunk
: All All
: -- normal with 12 votes (vote)
: mozilla16
Assigned To: David Zbarsky (:dzbarsky)
:
:
Mentors:
http://www.whatwg.org/specs/web-apps/...
: 742633 (view as bug list)
Depends on: 758928 802548 585381 663647 758903 761747 763229 763626 800850 801988 802831 802874 810011 824153 909633
Blocks: 806263
  Show dependency treegraph
 
Reported: 2010-08-27 14:06 PDT by David Zbarsky (:dzbarsky)
Modified: 2016-05-25 05:41 PDT (History)
35 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
WIP (8.57 KB, patch)
2010-08-27 14:06 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
Testcase (1.43 KB, text/html)
2010-08-27 14:07 PDT, David Zbarsky (:dzbarsky)
no flags Details
WIP Part 1 (37.79 KB, patch)
2010-08-30 17:55 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
WIP Part 2 (7.60 KB, patch)
2010-08-30 17:56 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
WIP Part 3 (7.64 KB, patch)
2010-08-30 17:56 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
itemvalue test (1.79 KB, text/html)
2010-08-30 18:09 PDT, David Zbarsky (:dzbarsky)
no flags Details
unroll.js (1.70 KB, application/x-javascript)
2010-08-30 18:10 PDT, David Zbarsky (:dzbarsky)
no flags Details
WIP Part 1 (37.70 KB, patch)
2010-09-26 18:35 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
WIP v1.1 (50.79 KB, patch)
2010-09-28 15:02 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
wip v1.2 (56.99 KB, patch)
2010-09-29 19:35 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
wip v1.2 (60.90 KB, patch)
2010-09-29 19:39 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
Test part 1 (4.25 KB, text/html)
2010-10-03 17:28 PDT, David Zbarsky (:dzbarsky)
no flags Details
Test part 2 (5.85 KB, text/html)
2010-10-03 17:29 PDT, David Zbarsky (:dzbarsky)
no flags Details
mostly working patch (62.38 KB, patch)
2010-10-03 17:32 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
working patch with tests (76.79 KB, patch)
2010-10-16 20:51 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
working patch with tests (84.58 KB, patch)
2010-10-19 17:28 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
patch with tests (113.49 KB, patch)
2011-06-13 12:44 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
Patch (114.72 KB, patch)
2011-08-22 22:55 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
Patch (116.04 KB, patch)
2011-08-23 05:43 PDT, David Zbarsky (:dzbarsky)
hsivonen: review-
Details | Diff | Splinter Review
Patch with tests (306.59 KB, patch)
2012-05-11 14:41 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
Patch with tests (302.31 KB, patch)
2012-05-11 15:08 PDT, David Zbarsky (:dzbarsky)
hsivonen: review-
Details | Diff | Splinter Review
Fixes to pass tryserver (2.63 KB, patch)
2012-05-11 20:30 PDT, David Zbarsky (:dzbarsky)
hsivonen: review+
Details | Diff | Splinter Review
Import tests (228.66 KB, patch)
2012-05-22 03:04 PDT, :Ms2ger (⌚ UTC+1/+2)
no flags Details | Diff | Splinter Review
Patch (274.94 KB, patch)
2012-05-24 16:01 PDT, David Zbarsky (:dzbarsky)
bzbarsky: review-
Details | Diff | Splinter Review
Patch with files added (297.61 KB, patch)
2012-05-25 13:08 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
Interdiff (76.46 KB, patch)
2012-05-27 01:49 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review
Fix tests to match spec (67.64 KB, patch)
2012-05-28 22:04 PDT, David Zbarsky (:dzbarsky)
Ms2ger: review+
Details | Diff | Splinter Review
Fix tests to match spec (67.71 KB, patch)
2012-05-29 11:11 PDT, David Zbarsky (:dzbarsky)
Ms2ger: checkin+
Details | Diff | Splinter Review
Import the test (232.83 KB, patch)
2012-05-29 12:56 PDT, :Ms2ger (⌚ UTC+1/+2)
no flags Details | Diff | Splinter Review
Interdiff (297.71 KB, patch)
2012-05-29 23:14 PDT, David Zbarsky (:dzbarsky)
bzbarsky: review-
Details | Diff | Splinter Review
Hashtable and leak fix (5.97 KB, patch)
2012-05-31 17:11 PDT, David Zbarsky (:dzbarsky)
bzbarsky: review-
Details | Diff | Splinter Review
Interdiff from last review (24.89 KB, patch)
2012-06-01 13:15 PDT, David Zbarsky (:dzbarsky)
bzbarsky: review+
Details | Diff | Splinter Review
(Hopefully) final interdiff (7.14 KB, patch)
2012-06-03 00:14 PDT, David Zbarsky (:dzbarsky)
bzbarsky: review+
Details | Diff | Splinter Review
Rollup patch (83.23 KB, patch)
2012-06-04 10:47 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Splinter Review

Description David Zbarsky (:dzbarsky) 2010-08-27 14:06:33 PDT
Created attachment 470017 [details] [diff] [review]
WIP

WIP attached.  Only implements enough to run the attached testcase.
Problems:
-For each loop does not work
-itemref isn't implemented
-multiple properties with the same itemprop are not combined
-itemValue is unimplemnted, so the url has the wrong value (should be href, not textContent)
-etc.
Comment 1 David Zbarsky (:dzbarsky) 2010-08-27 14:07:19 PDT
Also, it uses nsIHTMLCollection as the interface instead of nsIHTMLPropertiesCollection
Comment 2 David Zbarsky (:dzbarsky) 2010-08-27 14:07:42 PDT
Created attachment 470018 [details]
Testcase
Comment 3 David Zbarsky (:dzbarsky) 2010-08-30 17:55:25 PDT
Created attachment 470650 [details] [diff] [review]
WIP Part 1
Comment 4 David Zbarsky (:dzbarsky) 2010-08-30 17:56:07 PDT
Created attachment 470652 [details] [diff] [review]
WIP Part 2
Comment 5 David Zbarsky (:dzbarsky) 2010-08-30 17:56:23 PDT
Created attachment 470653 [details] [diff] [review]
WIP Part 3
Comment 6 David Zbarsky (:dzbarsky) 2010-08-30 18:08:22 PDT
WIP Part 1 is the main implementation.  It implements most of itemValue, except for calling itemValue on an element with an itemscope property.  It also implements itemprop, itemref, itemid, itemscope, and element.properties.

Issues: 
1. For each loop does not work, but that is fixable by making it implement the right interface
2. element.properties currently returns an nsIDOMHTMLCollection.  It should instead return an nsIDOMHTMLPropertiesCollection.  WIP Part 2 defines part of the nsIDOMHTMLPropertiesCollection. (I was trying to get namedItem to work before writing the rest of it).  WIP Part 3 changes nsGenericHTMLElement to use nsIDOMHTMLPropertiesCollection for Properties.  I'm not sure why, but with WIP Part 3, there is an issue where attempting to access the properties object after getting it gives an error: "Illegal operation on WrappedNative prototype object"
3. document.getItems() is not implemented, but with itemprop working correctly, it shouldn't be too hard.

I will attach a test for itemvalue - it uses a JS file unroll.js
Unroll.js has 2 functions that return a string representation of a properties object.  There is getJSON, which takes an array of nodes, and returns the JSON representation of their microdata, as specified in http://www.whatwg.org/specs/web-apps/current-work/multipage/links.html#json.
The other function is unrollProps, which takes the result of an element.properties call (so currently an HTMLCollection), and returns a string of propnames and their values, which is slightly more readable than the JSON string.

If anyone wants to take this, feel free to do so.
Comment 7 David Zbarsky (:dzbarsky) 2010-08-30 18:09:26 PDT
Created attachment 470659 [details]
itemvalue test
Comment 8 David Zbarsky (:dzbarsky) 2010-08-30 18:10:00 PDT
Created attachment 470661 [details]
unroll.js
Comment 9 David Zbarsky (:dzbarsky) 2010-09-26 18:35:29 PDT
Created attachment 478691 [details] [diff] [review]
WIP Part 1

fix itemValue to work correctly
Comment 10 David Zbarsky (:dzbarsky) 2010-09-28 15:02:27 PDT
Created attachment 479186 [details] [diff] [review]
WIP v1.1

document.getItems() not yet implemented
element.properties.namedItem.values not yet implemented

stuff leaks
Comment 11 David Zbarsky (:dzbarsky) 2010-09-29 19:35:37 PDT
Created attachment 479683 [details] [diff] [review]
wip v1.2

Implement document.getItems
Comment 12 David Zbarsky (:dzbarsky) 2010-09-29 19:39:00 PDT
Created attachment 479684 [details] [diff] [review]
wip v1.2

this time with new files
Comment 13 David Zbarsky (:dzbarsky) 2010-10-03 17:28:19 PDT
Created attachment 480547 [details]
Test part 1
Comment 14 David Zbarsky (:dzbarsky) 2010-10-03 17:29:01 PDT
Created attachment 480548 [details]
Test part 2
Comment 15 David Zbarsky (:dzbarsky) 2010-10-03 17:32:02 PDT
Created attachment 480549 [details] [diff] [review]
mostly working patch

I haven't quite gotten PropertyNodeList.values to work - for some reason, it gives
Exception: Permission denied for <file://> to create wrapper for object of class UnnamedClass Source File: file:///C:/Documents%20and%20Settings/David/Desktop/microdata/test.html Line: 83, Column: 0 Category: content javascript

Also, nothing leaks, but I'm pretty sure that I messed up the refcounting somehow...I don't think I should be calling release in nsDOMNodeList::GetNodeAt
Comment 16 Jonas Sicking (:sicking) No longer reading bugmail consistently 2010-10-08 00:27:30 PDT
As it's too late for this to make FF4, I don't think I'll have time to look at this until FF4 has calmed down a bit.
Comment 17 David Zbarsky (:dzbarsky) 2010-10-16 20:51:58 PDT
Created attachment 483798 [details] [diff] [review]
working patch with tests

adds the tests as mochitest/reftest
fix the issue with GetValues
Comment 18 David Zbarsky (:dzbarsky) 2010-10-19 17:28:07 PDT
Created attachment 484559 [details] [diff] [review]
working patch with tests

So I ran the tests at http://gitorious.org/microdatajs/microdatajs/blobs/master/test/microdata.unit.js and found several bugs.  This patch fixes them.
We should probably add them to automated tests.
Comment 19 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2011-06-10 01:37:08 PDT
Any chance of getting this reviewed and landed now that Firefox 4 is in the past? Opera seems to be implementing this: http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-June/031951.html
Comment 20 Olli Pettay [:smaug] (reviewing overload) 2011-06-10 03:25:08 PDT
(And as usually when reviewing code which implements some new draft, it is
better to review the draft first and make sure it is sane :) )
Comment 21 David Zbarsky (:dzbarsky) 2011-06-13 12:44:49 PDT
Created attachment 538980 [details] [diff] [review]
patch with tests
Comment 22 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2011-07-27 08:16:46 PDT
FWIW, Opera released a snapshop with the API: http://my.opera.com/desktopteam/blog/2011/07/27/latency-microdata-qresync
Comment 23 Jonas Sicking (:sicking) No longer reading bugmail consistently 2011-08-03 17:30:10 PDT
Comment on attachment 538980 [details] [diff] [review]
patch with tests

Review of attachment 538980 [details] [diff] [review]:
-----------------------------------------------------------------

I'm realizing I won't have time to do this review.

mrbkap gloriously offered to take this. Below is the comments I had so far before giving up.

Sorry I haven't been able to help more :(

::: content/base/src/nsDOMNodeList.cpp
@@ +45,5 @@
> +
> +nsDOMNodeList::nsDOMNodeList(nsINode* aParent)
> + : mParent(aParent)
> +{
> +  mElements = new nsCOMArray<nsINode>;

Make mElements a nsCOMArray<nsINode> rather than a nsCOMArray<nsINode>*. That'll save both code and cycles.

We should probably give this a more descriptive name, like nsDOMStaticNodeList to separate it from all other nodelists. Though really, why does the spec return a static nodelist rather than an array of nodes?

::: content/base/src/nsDOMNodeList.h
@@ +67,5 @@
> +    mElements->Clear();
> +  }
> +
> +protected:
> +  nsINode* mParent;

What prevents mParent from going away while the nodelist is still alive? I suspect you need to make this a strong reference.

Either way you need to add cycle collection to this class.

::: content/html/content/reftests/591467-1-ref.html
@@ +1,1 @@
> +<!DOCTYPE html>

Hmm.. Why are you writing this as a reftest rather than a mochitest? That way you could turn the remaining work into todo's as well.

::: content/html/content/src/nsGenericHTMLElement.cpp
@@ +354,5 @@
> +  : mRoot(aRoot),
> +    mDoc(aDoc),
> +    mIsDirty(true)
> +{
> +  mProperties = new nsCOMArray<nsIContent>;

Make this an inline member. I.e. mProperties should be a nsCOMArray, not a pointer to one.

@@ +356,5 @@
> +    mIsDirty(true)
> +{
> +  mProperties = new nsCOMArray<nsIContent>;
> +  mNames = new nsDOMStringList();
> +  NS_ADDREF(mNames);

Make mNames a nsRefPtr<nsDOMStringList>

@@ +414,5 @@
> +
> +  nsAutoString aName(name);
> +  nsPropertyNodeList* propertyList = new nsPropertyNodeList(this, mDoc, aName);
> +
> +  CallQueryInterface(propertyList, aResult);

You can just do

NS_ADDREF(*aResult = propertyList);

@@ +466,5 @@
> +static int CompareTreeOrder(nsIContent* aContent1, nsIContent* aContent2, void* aData)
> +{
> +  PRUint16 pos;
> +  aContent2->CompareDocumentPosition(aContent1, &pos);
> +  return 3 - pos & 6;

Huh? What magic numbers are these? Should you use nsContentUtils::PositionIsBefore? Or at least nsINode::CompareDocPosition?
Comment 24 Jonas Sicking (:sicking) No longer reading bugmail consistently 2011-08-03 17:31:18 PDT
Also, would be great if someone could help mrbkap with reviewing that this implements the spec properly. Probably by reviewing the tests.

Is there a decent test suite for this?
Comment 25 Jonas Sicking (:sicking) No longer reading bugmail consistently 2011-08-03 17:33:32 PDT
Hmm.. Is the NodeList returned here supposed to be live? If so I understand why a NodeList rather than an Array is returned.

If that's the case, could you use nsContentList rather than rolling your own nodelist?
Comment 26 Rob Crowther 2011-08-08 16:10:20 PDT
> Is there a decent test suite for this?

Opera recently announced that "Tests written while implementing the APIs will also be shortly submitted to the W3C test suite."

http://my.opera.com/ODIN/blog/2011/08/02/what-s-new-in-opera-development-snapshots-27-july-2011-edition
Comment 27 Rob Crowther 2011-08-09 07:02:19 PDT
And today they submitted them:

http://w3c-test.org/html/tests/submission/Opera/microdata/001.html
Comment 28 David Zbarsky (:dzbarsky) 2011-08-22 22:55:39 PDT
Created attachment 555033 [details] [diff] [review]
Patch

Fixes most of the failures exposed by Opera tests.  I think the rest are due to NodeList/HTMLCollection not matching WebIDL specs.  Still needs cycle collection.
Comment 29 David Zbarsky (:dzbarsky) 2011-08-23 05:43:03 PDT
Created attachment 555081 [details] [diff] [review]
Patch

Previous version had a problem with the DOMStringList returned by GetNames() not updating properly.
Comment 30 :Ms2ger (⌚ UTC+1/+2) 2011-08-23 05:50:30 PDT
Comment on attachment 555081 [details] [diff] [review]
Patch

>--- /dev/null
>+++ b/content/base/src/nsDOMNodeList.h

>+#ifndef nsDOMNodeList_h___
>+#define nsDOMNodeTokenList_h___

That isn't going to work...
Comment 31 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2011-09-06 05:21:55 PDT
Comment on attachment 555081 [details] [diff] [review]
Patch

>+ * The Initial Developer of the Original Code is
>+ * David Zbarsky.

If you wrote this as a contractor, the initial developer should be the Mozilla Foundation. See http://blog.gerv.net/2010/02/mpl_initial_developer_for_mozilla_employ/

>+ * ***** END LICENSE BLOCK ***** */
>+#include "nsDOMNodeList.h"

Please add a blank line between the above two lines.

>+NS_IMETHODIMP    
>+nsDOMNodeList::Item(PRUint32 aIndex, nsIDOMNode** aReturn)
>+{
>+  nsINode* element = mElements.SafeObjectAt(aIndex);
>+  if (!element)
>+  {

Please put the { on the same line as the |if|.

>+nsIContent* 
>+nsDOMNodeList::GetNodeAt(PRUint32 aIndex)
>+{
>+  nsINode* node = mElements.SafeObjectAt(aIndex);
>+  nsIContent* content = nsnull;
>+  if (node) {
>+    CallQueryInterface(node, &content);
>+    NS_RELEASE(node);
>+  }
>+  return content;
>+}

Is there a reason not to write this as
{
  nsCOMPtr<nsIContent> content =
    do_QueryInterface(mElements.SafeObjectAt(aIndex));
  return content;
}
?

Unless there's a reason not to, please write it like this.

>+#ifndef nsDOMNodeList_h___
>+#define nsDOMNodeTokenList_h___

Please remove "Token".

>+            document.getElementById("results").innerHTML += (a ? "PASS" : "FAIL") + " - " + b + "<br />";

This test case should run a little faster if you replace the pattern
foo.innerHTML += bar;
with
foo.insertAdjacentHTML("beforeend", bar);

>+protected:
>+  void EnsureFresh();
>+  nsAutoString mName;

Fields should be nsString instead of nsAutoString.

>+NS_IMETHODIMP
>+nsDOMHTMLPropertiesCollection::NamedItem(const nsAString & aName, 
>+                                         nsIDOMPropertyNodeList** aResult)
>+{
>+  EnsureFresh();
>+
>+  if (mNamedItemEntries.Get(aName, aResult)) {
>+    NS_ADDREF(*aResult);
>+    return NS_OK;
>+  }
>+
>+  PropertyNodeList* propertyList = new PropertyNodeList(this, mRoot, aName);
>+
>+  NS_ADDREF(*aResult = propertyList);
>+  NS_ADDREF(*aResult);
>+  mNamedItemEntries.Put(aName, *aResult);
>+  return NS_OK;
>+}

Why is the above method written as if mNamedItemEntries didn't take care of addrefing the stuff that's in it? It seems to me that this always ends up addrefing *aResult an incorrect extra time.

>+void
>+nsDOMHTMLPropertiesCollection::CrawlPropertiesRecursive(nsIContent* aContent)
>+{
>+  if (!(aContent->IsElement() && 

To avoid a large number of invocations of CrawlPropertiesRecursive that just return early, please change the elementness check to an assertion here and do an elementness check before calling CrawlPropertiesRecursive recursively at the end of this method.

> aContent->AsElement()->IsHTML())) {
>+    return;
>+  }

The spec doesn't check for the element namespace. Hixie tells me that that's an oversight. Even so, the code above would fail to crawl into HTML descendants of non-HTML elements. Hixie tells me that crawling past non-HTML nodes into HTML descendants is expected and the Microdata attributes should be disregarded on non-HTML nodes.

(Personally, I don't see the technical point of not supporting Microdata on SVG or MathML. Unfortunately, I suspect it's politically expedient not to.)

http://www.w3.org/Bugs/Public/show_bug.cgi?id=14039

Please remove the IsHTML() check here and add it to later places so in this method so that the item* attributes get ignored on non-HTML but non-HTML is still crawled for HTML descendants when non-HTML occurs as a descendant of aContent or as the target of itemref.

>+      nsIContent* element = doc ? doc->GetElementById(ref) : GetElementById(aContent, ref);

(I think that right now, doc->GetElementById() doesn't implement what the spec requires, but I guess this is OK since eventual convergence is expected.)

>+  if (!ref.Equals(itemId)) {

Is this check together with the 
>+    if (child != mRoot) {
below really sufficient for avoiding infinite loops when following itemref?

Consider <mRoot itemscope><foo id=a><foo><foo itemref=a>.

Shouldn't this algorithm have a collection of already-seen nodes like the collection called "memory" in http://www.whatwg.org/specs/web-apps/current-work/#the-properties-of-an-item ?

Please add a test case that demonstrates that evil cyclical itemrefs don't cause infinite loops.

(Also, why is this algorithm implemented recursively instead of implementing the algorithm in the spec that uses an explicit "pending" queue? Is this for avoiding heap allocations for the pending queue? I'm a bit worried about recursive algorithms whose recursion depth is controlled by Web content, though I suppose having such an algorithm here is no worse than having such algorithms already in layout and, thus, there's no need to change the patch on that point.)

>+  for (nsIContent* child = aContent->GetFirstChild();
>+       child;
>+       child = child->GetNextSibling()) {
>+    if (child != mRoot) {
>+      CrawlPropertiesRecursive(child);

Referring to the earlier comment about removing the elementness check from the top of the method: Please make this call conditional on child->IsElement() being true.

>+  nsresult rv;
>+  nsCOMPtr<nsIWritableVariant> out = do_CreateInstance(NS_VARIANT_CONTRACTID, &rv);
>+
>+  NS_ENSURE_SUCCESS(rv, rv);

Please change this to
nsCOMPtr<nsIWritableVariant> out = new nsVariant();
in order to avoid burning cycles in useless XPCOM abstractions and in OOM checks when the allocator is infallible anyway.

>+  nsTArray<nsIVariant*> values;
>+
>+  PRUint32 length = mElements.Count();
>+  if (length == 0) {
>+    out->SetAsEmptyArray();
>+  } else {
>+      for(PRUint32 i=0; i < length; ++i) {

Please add a space on both sides of the equals sign.

>+        nsIVariant* itemValue;
>+        static_cast<nsGenericHTMLElement*>(mElements.ObjectAt(i))->GetItemValue(&itemValue);
>+        values.AppendElement(itemValue);

I can see why your nsTArray holds plain nsIVariant*s, but the way ownership is managed manually here makes me nervous. Please add a comment that explains why you are managing the ownership of the elements of the array manually and please mention that they are manually released at the end of this method.

>+      }
>+      out->SetAsArray(nsIDataType::VTYPE_INTERFACE_IS,
>+                      &NS_GET_IID(nsIVariant),
>+                      values.Length(),
>+                      values.Elements());
>+  }
>+
>+  *aValues = out.forget().get();

Please change to
out.forget(aValues);

>+  for(PRUint32 i=0; i < values.Length(); ++i) {

Please add a space on both sides of the equals sign.

>+    NS_RELEASE(values[i]);
>+  }
>+
>+  return NS_OK;
>+}


>+void
>+PropertyNodeList::AttributeChanged(nsIDocument *aDocument, Element* aElement,

For consistency, please move the asterisk to the left side of the space for the nsIDocument pointer.

>+void
>+PropertyNodeList::ContentInserted(nsIDocument *aDocument,

Also here.

>+PropertyNodeList::ContentRemoved(nsIDocument *aDocument,

And here.

>+  PRUint32 count = mCollection->mProperties.Count();
>+  for (PRUint32 i=0; i < count; ++i) {

Please add a space on both sides of the equals sign.

>+nsGenericHTMLElement::~nsGenericHTMLElement()
>+{
>+  if (mItemRef) {
>+    mItemRef->DropReference();
>+  }
>+
>+  if (mItemProp) {
>+    mItemProp->DropReference();
>+  }
>+  
>+  NS_IF_RELEASE(mProperties);
>+}

Why is the refcounting of mProperties managed manually instead of making mProperties a smart pointer?

Doesn't mProperties end up creating reference cycles through its mDoc and mRoot? That is, shouldn't these references participate in cycle collection? If they really don't create reference cycles, it might be worthwhile to add a comment mentioning why not.

>+NS_IMETHODIMP
>+nsGenericHTMLElement::GetItemValue(nsIVariant** aValue)
>+{
>+  nsresult rv;
>+  nsCOMPtr<nsIWritableVariant> out = do_CreateInstance(NS_VARIANT_CONTRACTID, &rv);
>+
>+  NS_ENSURE_SUCCESS(rv, rv);

Please use 
nsCOMPtr<nsIWritableVariant> out = new nsVariant();
above.

>+  if (!HasAttr(kNameSpaceID_None, nsGkAtoms::itemprop)) {
>+    out->SetAsEmpty();
>+    *aValue = out.forget().get();

Please use
out.forget(aValue);

>+  *aValue = out.forget().get();

And again.

>+  nsresult rv;
>+  nsCOMPtr<nsIWritableVariant> out = do_CreateInstance(NS_VARIANT_CONTRACTID, &rv);
>+  NS_ENSURE_SUCCESS(rv, rv);

nsCOMPtr<nsIWritableVariant> out = new nsVariant();

>+  *aResult = out.forget().get();

out.forget(aValue);

>+  nsresult rv;
>+  nsCOMPtr<nsIWritableVariant> out = do_CreateInstance(NS_VARIANT_CONTRACTID, &rv);
>+  NS_ENSURE_SUCCESS(rv, rv);

nsCOMPtr<nsIWritableVariant> out = new nsVariant();

>+  *aResult = out.forget().get();

out.forget(aValue);

>+NS_IMETHODIMP
>+nsGenericHTMLElement::GetProperties(nsIDOMHTMLPropertiesCollection** aReturn)
>+{
>+  if (!mProperties) {
>+    NS_ADDREF(mProperties = new nsDOMHTMLPropertiesCollection(this));

Wouldn't need to addref manually here if mProperties was a smart pointer.

>+  virtual void GetItemValueText(nsAString& text);
>+  virtual void SetItemValueText(const nsAString & text);

For consistency, please delete the space before the ampersand.

>+  nsIDOMHTMLPropertiesCollection*  mProperties;

Why not a smart pointer here?

>+function testDOMSettableTokenListReflection(tag, attr, prop) {
>+  var elm = document.createElement(tag);
>+  var list = elm[prop];
>+
>+  // content attribute -> DOMSettableTokenList
>+  function verifyTokenList(value, allItems, notContainItems) {
>+    equals(list.value, value, 'token list .value'+value);
>+    equals(list.length, allItems.length, 'token list length');
>+    for (var i=0; i < list.length && i < allItems.length; i++) {

Please add a space on both sides of the equals sign.

>+      equals(list.item(i), allItems[i], 'token list .item() getter');
>+      equals(list[i], allItems[i], 'token list [] getter');
>+      ok(list.contains(allItems[i]), 'token list contains '+allItems[i]);
>+    }
>+    for (i=0; i<notContainItems.length; i++) {

Please add a space on both sides of the equals sign and the less-than sign.

>+function testDOMSettableTokenListExceptions(tag, prop) {
>+  var throwMethods = ['contains', 'add', 'remove', 'toggle'];
>+  for (var i=0; i<throwMethods.length; i++) {

Please add a space on both sides of the equals sign and the less-than sign.

>+function verifyValues(actual, expected, message) {
>+  equals(actual.length, expected.length, message+'.length');
>+  for (var i=0; i < actual.length && i < expected.length; i++) {

Please add a space on both sides of the equals sign.

>+// static
>+void
>+nsHTMLDocument::GetItemsRecursive(nsINode* root, nsDOMNodeList** results, 
>+                                  nsTArray<nsString>* tokens, PRBool matchAll)

Where does this get called and why is this recursive? If this no longer gets called and was part of an earlier implementation iteration, please remove this method.

>+public:
>+  static nsIClassInfo *doCreate(nsDOMClassInfoData* aData)

Please put the asterisk consistently on the left side of the space.

>+  {
>+    return new nsHTMLPropertiesCollectionSH(aData);
>+  }
>+};
>+

>@@ -59,16 +60,20 @@ interface nsIDOMNSHTMLElement : nsISuppo
> 
>            attribute long             tabIndex;
> 
>            attribute DOMString        contentEditable;
>   readonly attribute boolean          isContentEditable;
> 
>            // for WHAT-WG drag and drop
>            attribute boolean          draggable;
>+           
>+           attribute nsIVariant       itemValue;
>+           attribute nsIVariant       itemProp;
>+           attribute nsIVariant       itemRef;

Could you, please, put these on nsIDOMHTMLElement unless there's a reason why they have to be here? It's confusing to have some Microdata members on nsIDOMHTMLElement and others on nsIDOMNSHTMLElement.

r- until these comments are addressed, though overall this looks very good.
Comment 32 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2011-09-06 05:32:58 PDT
(In reply to Henri Sivonen (:hsivonen) from comment #31)
> (Also, why is this algorithm implemented recursively instead of implementing
> the algorithm in the spec that uses an explicit "pending" queue? Is this for
> avoiding heap allocations for the pending queue? I'm a bit worried about
> recursive algorithms whose recursion depth is controlled by Web content,
> though I suppose having such an algorithm here is no worse than having such
> algorithms already in layout and, thus, there's no need to change the patch
> on that point.)

Thinking about this more: Wouldn't it be reasonable to implement this iteratively (instead of recursively) with an explicit "pending" collection as in the spec if the collection is a stack-allocated nsAutoTArray to avoid heap-allocations in common cases?
Comment 33 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2011-09-09 10:26:38 PDT
>+#ifndef nsDOMNodeList_h___

Also, symbols with two underscores are reserved to the compiler, so there's should only be one underscore after the 'h' even though we do have multiple underscores in analogous places all over the codebase.
Comment 34 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2011-10-13 09:33:34 PDT
itemtype now allows multiple values:
http://html5.org/tools/web-apps-tracker?from=6667&to=6668
Comment 35 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2011-10-14 06:22:04 PDT
also http://html5.org/tools/web-apps-tracker?from=6679&to=6680
Comment 36 David Zbarsky (:dzbarsky) 2012-05-11 14:41:41 PDT
Created attachment 623311 [details] [diff] [review]
Patch with tests

I know the test is a little messy, it's because it's a copy of the microdata tests submitted by Opera to W3C.

I haven't updated this patch to the itemtype changes, but that should be a relatively small change.

Everything else should work, tryservering it now.
Comment 37 David Zbarsky (:dzbarsky) 2012-05-11 15:08:29 PDT
Created attachment 623322 [details] [diff] [review]
Patch with tests

and now with 100% less DOS newlines
Comment 38 David Zbarsky (:dzbarsky) 2012-05-11 20:30:02 PDT
Created attachment 623384 [details] [diff] [review]
Fixes to pass tryserver
Comment 39 :Ms2ger (⌚ UTC+1/+2) 2012-05-21 04:50:36 PDT
Comment on attachment 623322 [details] [diff] [review]
Patch with tests

Review of attachment 623322 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/base/src/nsDOMNodeList.cpp
@@ +1,1 @@
> +/* ***** BEGIN LICENSE BLOCK *****

MPL2

::: content/base/src/nsDOMNodeList.h
@@ +41,5 @@
> +#include "nsINodeList.h"
> +#include "nsDOMClassInfoID.h"
> +#include "nsIContent.h"
> +
> +class nsDOMNodeList : public nsINodeList

Is this class actually used?

::: content/html/content/src/nsGenericHTMLElement.cpp
@@ +295,5 @@
> +  cb->NoteXPCOMChild(aEntry);
> +  return PL_DHASH_NEXT;
> +}
> +
> +NS_IMPL_CYCLE_COLLECTION_CLASS(nsDOMHTMLPropertiesCollection)

And a file of their own for the implementation too?

@@ +362,5 @@
> +}
> +
> +JSObject*
> +nsDOMHTMLPropertiesCollection::WrapObject(JSContext *cx, JSObject *scope,
> +                                          bool *triedToWrap)

* to the left

@@ +401,5 @@
> +                                            nsWrapperCache **aCache)
> +{
> +  nsPropertyNodeList* item = nsnull;
> +  nsIDOMPropertyNodeList* list = static_cast<nsIDOMPropertyNodeList*>(item);
> +  NamedItem(aName, &list);

Something seems weird here

@@ +511,5 @@
> +  mProperties.Sort(CompareTreeOrder, nsnull);
> +
> +  // Create the names DOMStringList
> +  PRUint32 count = mProperties.Count();
> +  for(PRUint32 i = 0; i < count; ++i) {

for (

@@ +530,5 @@
> +  }
> +}
> +
> +static nsIContent*
> +GetElementById(nsIContent* aContent, nsAString& aId)

const nsAString&, I think. I'm surprised that you need to reimplement this, though.

@@ +533,5 @@
> +static nsIContent*
> +GetElementById(nsIContent* aContent, nsAString& aId)
> +{
> +  while (nsIContent* parent = aContent->GetParent()) {
> +    aContent = parent;

SubtreeRoot(), maybe?

@@ +551,5 @@
> +  
> +void
> +nsDOMHTMLPropertiesCollection::CrawlPropertiesRecursive(nsIContent* aContent)
> +{
> +  NS_ASSERTION(aContent->IsElement(), "should only be called for elements");

Please pass an Element*, then.

@@ +606,5 @@
> +    mDoc->AddMutationObserver(this);
> +  }
> +  nsString name(aName);
> +  if (name.FindCharInSet(" \r\n\t\f") >= 0) {
> +    mIsEmpty = true;

I'm confused. Are you trying to test that the sting only has space characters, or?

@@ +621,5 @@
> +  }
> +}
> +
> +void
> +nsPropertyNodeList::SetDocument(nsIDocument* aDoc) {

{ on a new line

@@ +657,5 @@
> +nsPropertyNodeList::GetNodeAt(PRUint32 aIndex)
> +{
> +  EnsureFresh();
> +  nsCOMPtr<nsIContent> content =
> +    do_QueryInterface(mElements.SafeObjectAt(aIndex));

Will this ever not be an nsIContent? (Or Element?) If not, mElements should probably have a more specific type.

@@ +730,5 @@
> +  PRUint32 length = mElements.Count();
> +  if (length == 0) {
> +    out->SetAsEmptyArray();
> +  } else {
> +      for(PRUint32 i = 0; i < length; ++i) {

Indentation

@@ +878,5 @@
> +}
> +
> +nsDOMHTMLPropertiesCollection::nsDOMHTMLPropertiesCollection(nsGenericHTMLElement* aRoot)
> +  : mRoot(aRoot),
> +    mDoc(aRoot->GetCurrentDoc()),

, on the next line

@@ +4165,5 @@
> +    return NS_OK;
> +  }
> +
> +  bool itemScope;
> +  GetItemScope(&itemScope);

This should have an accessor with a nicer signature. (Bonus points if it's Paris-bindings-ready)

::: content/html/content/src/nsGenericHTMLElement.h
@@ +161,5 @@
> +  NS_IMETHOD GetItemValue(nsIVariant** aValue);
> +  NS_IMETHOD SetItemValue(nsIVariant* aValue);
> +private:
> +  virtual void GetItemValueText(nsAString& text);
> +  virtual void SetItemValueText(const nsAString& text);

I find it strange that those are private, and not protected.

@@ +836,5 @@
>  };
>  
>  
>  //----------------------------------------------------------------------
> +class PropertyStringList : public nsDOMStringList

Can all these classes get files of their own? nsGenericHTMLElement.h is pretty big already...

@@ +877,5 @@
> +  NS_DECL_NSIMUTATIONOBSERVER_CONTENTAPPENDED
> +  NS_DECL_NSIMUTATIONOBSERVER_CONTENTINSERTED
> +  NS_DECL_NSIMUTATIONOBSERVER_CONTENTREMOVED
> +
> +  

One empty line will do

@@ +879,5 @@
> +  NS_DECL_NSIMUTATIONOBSERVER_CONTENTREMOVED
> +
> +  
> +  NS_DECL_CYCLE_COLLECTION_SCRIPT_HOLDER_CLASS_AMBIGUOUS(nsDOMHTMLPropertiesCollection,
> +                                                         nsIHTMLCollection)

Indentation

::: content/html/content/src/nsHTMLAreaElement.cpp
@@ +181,5 @@
> +
> +void
> +nsHTMLAreaElement::SetItemValueText(const nsAString& aValue)
> +{
> +  SetAttr(kNameSpaceID_None, nsGkAtoms::href, aValue, PR_TRUE);

true

::: content/html/content/src/nsHTMLAudioElement.cpp
@@ +123,5 @@
> +
> +void
> +nsHTMLAudioElement::SetItemValueText(const nsAString& aValue)
> +{
> +  SetAttr(kNameSpaceID_None, nsGkAtoms::src, aValue, PR_TRUE);

true

::: content/html/content/src/nsHTMLIFrameElement.cpp
@@ +150,5 @@
> +
> +void
> +nsHTMLIFrameElement::SetItemValueText(const nsAString& aValue)
> +{
> +  SetAttr(kNameSpaceID_None, nsGkAtoms::src, aValue, PR_TRUE);

true

::: content/html/content/src/nsHTMLImageElement.cpp
@@ +259,5 @@
> +
> +void
> +nsHTMLImageElement::SetItemValueText(const nsAString& aValue)
> +{
> +  SetAttr(kNameSpaceID_None, nsGkAtoms::src, aValue, PR_TRUE);

true

::: content/html/document/src/nsHTMLDocument.cpp
@@ +1793,5 @@
>  
> +static bool MatchItems(nsIContent* aContent, PRInt32 aNameSpaceID, 
> +                       nsIAtom* aAtom, void* aData)
> +{
> +  if(!(aContent->IsElement() && aContent->AsElement()->IsHTML())) {

if (

@@ +1804,5 @@
> +    return false;
> +  }
> +
> +  nsTArray<nsString>* tokens = static_cast<nsTArray<nsString>*>(aData);
> +  if (tokens->Length() == 0) {

IsEmpty()

@@ +1822,5 @@
> +
> +static void* CreateTokens(nsINode* aRootNode, const nsString* types)
> +{
> +  nsTArray<nsString>* tokens = nsnull;
> +  tokens = new nsTArray<nsString>();

One line

@@ +1825,5 @@
> +  nsTArray<nsString>* tokens = nsnull;
> +  tokens = new nsTArray<nsString>();
> +  nsAString::const_iterator iter, end;
> +  types->BeginReading(iter);
> +  types->EndReading(end);

I think these should just be

PRUnichar* iter = types->BeginReading();
PRUnichar* end = types->EndReading();

now

@@ +1836,5 @@
> +  nsAString::const_iterator start(iter);
> +  
> +  // parse the tokens
> +  while (iter != end) {
> +    start = iter;

I think it would be clearer if you declared this variable inside the loop

@@ +1859,5 @@
> +  nsContentList* elements = 
> +    NS_GetFuncStringContentList(this, MatchItems, DestroyTokens, 
> +                                CreateTokens, types).get();
> +  NS_ENSURE_TRUE(elements, NS_ERROR_OUT_OF_MEMORY);
> +  *aReturn = elements;

nsRefPtr<nsContentList> elements =
  NS_GetFuncStringContentList(this, MatchItems, DestroyTokens,
                              CreateTokens, types);
NS_ENSURE_TRUE(elements, NS_ERROR_OUT_OF_MEMORY);
elements.forget(aReturn);

please.

::: js/xpconnect/src/codegen.py
@@ +398,5 @@
>          template = resultConvTemplates.get(typeName)
>      elif isInterfaceType(type):
>          if isVariantType(type):
> +            template =  ("    XPCLazyCallContext lccx(JS_CALLER, cx, obj);\n"
> +                         "    return xpc_qsVariantToJsval(lccx, result, ${jsvalPtr});\n")

Nice.
Comment 40 David Zbarsky (:dzbarsky) 2012-05-21 10:07:05 PDT
Comment on attachment 623322 [details] [diff] [review]
Patch with tests

bz, please review the dom bindings changes
Comment 41 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2012-05-22 02:15:35 PDT
(In reply to David Zbarsky from comment #36)
> I know the test is a little messy, it's because it's a copy of the microdata
> tests submitted by Opera to W3C.

I'm rubber-stamping the contents of the test on the assumption that the Opera folks know what they are doing, but:
 * Please add a line feed to the end of the test file.
 * Shouldn't the test file have some BSD license header that has the appropriate copyright legend for W3C tests?

I'm a bit uncomfortable with landing without <time> support. But not uncomfortable enough not to land. Is there a follow-up bug on file?

<track> also. I wonder if bits of the patch for bug 629350 could be landed ahead of WebVTT support or if showing DOM traits for <track> without actual WebVTT support would be worse than incomplete Microdata support.

>+/* ***** BEGIN LICENSE BLOCK *****
>+ * Version: MPL 1.1/GPL 2.0/LGPL 2.1

The license blocks need to be updated to MPL 2.0 boilerplate: http://www.mozilla.org/MPL/headers/

>+nsDOMNodeList::nsDOMNodeList(nsINode* aParent)
>+ : mParent(aParent)
>+{ }

Where is this class ever instantiated?

>+NS_IMPL_ADDREF(nsDOMNodeList)
>+NS_IMPL_RELEASE(nsDOMNodeList)

Can this class get away with not supporting cycle collection for sure?

>+NS_IMETHODIMP
>+nsDOMHTMLPropertiesCollection::NamedItem(const nsAString& aName, 
>+                                         nsIDOMNode** aResult)
>+{
>+  return NS_OK;
>+}

Where does this get called? Or this just exist to satisfy some inheritance oddities and never actually gets called?

>+void
>+nsDOMHTMLPropertiesCollection::CrawlPropertiesRecursive(nsIContent* aContent)

I guess this will stay recursive then. :-/ 

>+{
>+  NS_ASSERTION(aContent->IsElement(), "should only be called for elements");
>+  
>+  if (aContent != mRoot && aContent->AsElement()->IsHTML()) {
>+    if (aContent->HasAttr(kNameSpaceID_None, nsGkAtoms::itemprop) &&
>+      (mProperties.IndexOf(aContent) == -1)) {
>+      mProperties.AppendObject(aContent);
>+    }
>+
>+    if (aContent->HasAttr(kNameSpaceID_None, nsGkAtoms::itemscope)) {
>+      return;
>+    }
>+  }
>+
>+  nsDOMSettableTokenList* itemRef = 
>+    static_cast<nsGenericHTMLElement*>(aContent)->GetItemRefInternal();
>+
>+  nsAutoString itemId;
>+  aContent->GetAttr(kNameSpaceID_None, nsGkAtoms::id, itemId);
>+  nsIDocument* doc = aContent->GetCurrentDoc();
>+  
>+  PRUint32 length;
>+  itemRef->GetLength(&length);
>+  nsAutoString ref;
>+  for (PRUint32 currRef = 0; currRef < length; ++currRef) {
>+    itemRef->Item(currRef, ref);
>+    if (!ref.Equals(itemId)) {
>+      nsIContent* content = doc ? doc->GetElementById(ref) : GetElementById(aContent, ref);
>+      if (content && content->IsElement()) {
>+        CrawlPropertiesRecursive(content);
>+      }
>+    }
>+  }
>+
>+  for (nsIContent* child = aContent->GetFirstChild();
>+       child;
>+       child = child->GetNextSibling()) {
>+    if (child != mRoot && child->IsElement()) {
>+      CrawlPropertiesRecursive(child);
>+    }
>+  }
>+}

I have trouble verifying by inspection that this algorithm is equivalent to the one given in the spec. In particular, this algorithm seems to process itemref on non-mRoot nodes. It seems to me that the algorithm in the spec only processes itemref on root. What am I missing?

>+nsPropertyNodeList::nsPropertyNodeList(nsDOMHTMLPropertiesCollection* aCollection, 
>+                                       nsIContent* aParent, const nsAString& aName)
>+  : mParent(aParent),
>+    mDoc(aParent->GetCurrentDoc()),
>+    mCollection(aCollection),
>+    mIsDirty(true)
>+{
>+  SetIsDOMBinding();
>+  if (mDoc) {
>+    mDoc->AddMutationObserver(this);
>+  }
>+  nsString name(aName);
>+  if (name.FindCharInSet(" \r\n\t\f") >= 0) {
>+    mIsEmpty = true;

This is about rejecting names that contain space characters, because such names never match, right? Please add a comment explaining what's going on.

>+ NS_IMPL_INT_ATTR_DEFAULT_VALUE(nsGenericHTMLElement, TabIndex, tabindex, -1)
>+ NS_IMPL_BOOL_ATTR(nsGenericHTMLElement, Hidden, hidden)

Please delete the leading extra space per line added here.

>+  nsresult rv;
>+  nsCOMPtr<nsIWritableVariant> out = do_CreateInstance(NS_VARIANT_CONTRACTID, &rv);
>+  NS_ENSURE_SUCCESS(rv, rv);
>+
>+  out->SetAsInterface(NS_GET_IID(nsIDOMDOMSettableTokenList), mItemProp);
>+  *aResult = out.forget(aResult);

Fixed in the other patch, it seems.

>+++ b/js/xpconnect/src/codegen.py

>-            template = "    return xpc_qsVariantToJsval(lccx, result, ${jsvalPtr});\n"
>+            template =  ("    XPCLazyCallContext lccx(JS_CALLER, cx, obj);\n"
>+                         "    return xpc_qsVariantToJsval(lccx, result, ${jsvalPtr});\n")

I'm not at all competent to review this bit (or other Paris Bindings details). Leaving those to bz.

Also, seconding the stylistic comments (including file organization) that Ms2ger made.
Comment 42 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2012-05-22 02:17:00 PDT
Comment on attachment 623322 [details] [diff] [review]
Patch with tests

Tentatively setting to r- to indicate that I've done a round of looking at the patch but wasn't ready to r+ at least not without answers to the questions in my previous comment.
Comment 43 :Ms2ger (⌚ UTC+1/+2) 2012-05-22 03:04:56 PDT
Created attachment 625949 [details] [diff] [review]
Import tests

Now we can import W3C tests automatically, I'd do that. You will probably need to add a file with the expected failures at dom/imptests/failures/html/tests/submission/Opera/microdata/test_001.html.json; see existing files around there and dom/imptests/README.
Comment 44 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2012-05-22 03:14:16 PDT
(In reply to Ms2ger from comment #43)
> Created attachment 625949 [details] [diff] [review]
> Import tests
> 
> Now we can import W3C tests automatically, I'd do that.

Oh, that works now. Yeah, it would good to use the general import mechanism that has a README with legal stuff already.
Comment 45 David Zbarsky (:dzbarsky) 2012-05-22 10:56:25 PDT
(In reply to Henri Sivonen (:hsivonen) from comment #44)
> (In reply to Ms2ger from comment #43)
> > Created attachment 625949 [details] [diff] [review]
> > Import tests
> > 
> > Now we can import W3C tests automatically, I'd do that.
> 
> Oh, that works now. Yeah, it would good to use the general import mechanism
> that has a README with legal stuff already.

Unfortunately the actual test that Opera submitted uses callers (calling properties("prop"), for instance), which we don't support, so we end up throwing on a bunch of tests.  I removed all the callers because they obscured actual failures.
Comment 46 David Zbarsky (:dzbarsky) 2012-05-22 11:12:39 PDT
> I'm a bit uncomfortable with landing without <time> support. But not
> uncomfortable enough not to land. Is there a follow-up bug on file?

I think Microdata makes a lot of sense even without time.  bug 629801 is about implementing the <time> element, so once that is fixed I can add the Microdata part of it.

> <track> also. I wonder if bits of the patch for bug 629350 could be landed
> ahead of WebVTT support or if showing DOM traits for <track> without actual
> WebVTT support would be worse than incomplete Microdata support.

I think it would make sense to add Microdata support after <track> lands


> >+/* ***** BEGIN LICENSE BLOCK *****
> >+ * Version: MPL 1.1/GPL 2.0/LGPL 2.1
> 
> The license blocks need to be updated to MPL 2.0 boilerplate:
> http://www.mozilla.org/MPL/headers/

I'll do this.

> >+nsDOMNodeList::nsDOMNodeList(nsINode* aParent)
> >+ : mParent(aParent)
> >+{ }
> 
> Where is this class ever instantiated?

This class is never used, I'm removing it.
 
> Where does this get called? Or this just exist to satisfy some inheritance
> oddities and never actually gets called?

This is a method on nsIHTMLCollection which we implement for bindings. (bz, is this necessary?)

> >+void
> >+nsDOMHTMLPropertiesCollection::CrawlPropertiesRecursive(nsIContent* aContent)
> 
> I guess this will stay recursive then. :-/ 

It's a lot easier to do it this way than to implement the algorithm in the spec, and as you mentioned, layout already has recursive algorithms.


> I have trouble verifying by inspection that this algorithm is equivalent to
> the one given in the spec. In particular, this algorithm seems to process
> itemref on non-mRoot nodes. It seems to me that the algorithm in the spec
> only processes itemref on root. What am I missing?

Hmmm, it seems that you are correct, but I think it would make sense to allow itemref anywhere.  For example, say you have a Person item that itemrefs a car.  If you then have a Family item that itemrefs multiple people, you won't have the cars as properties.  I will ask Hixie about this.

> This is about rejecting names that contain space characters, because such
> names never match, right? Please add a comment explaining what's going on.

Ok,

> >+ NS_IMPL_INT_ATTR_DEFAULT_VALUE(nsGenericHTMLElement, TabIndex, tabindex, -1)
> >+ NS_IMPL_BOOL_ATTR(nsGenericHTMLElement, Hidden, hidden)
> 
> Please delete the leading extra space per line added here.

Ok.
Comment 47 Philip Jägenstedt 2012-05-23 01:15:46 PDT
(In reply to David Zbarsky from comment #45)
> (In reply to Henri Sivonen (:hsivonen) from comment #44)
> > (In reply to Ms2ger from comment #43)
> > > Created attachment 625949 [details] [diff] [review]
> > > Import tests
> > > 
> > > Now we can import W3C tests automatically, I'd do that.
> > 
> > Oh, that works now. Yeah, it would good to use the general import mechanism
> > that has a README with legal stuff already.
> 
> Unfortunately the actual test that Opera submitted uses callers (calling
> properties("prop"), for instance), which we don't support, so we end up
> throwing on a bunch of tests.  I removed all the callers because they
> obscured actual failures.

This is required by the spec: http://www.whatwg.org/specs/web-apps/current-work/multipage/common-dom-interfaces.html#htmlpropertiescollection

If you are deliberately not supporting this, should it be dropped from the spec as well?
Comment 48 Boris Zbarsky [:bz] (still a bit busy) 2012-05-23 06:47:31 PDT
I believe we have repeatedly pushed back on the whole "legacycaller" thing for HTMLCollection.  We don't support it now, and we don't have any plans to so far, since it doesn't seem to be necessary for web compat.  I believe the relevant W3C bugs are in some sort of limbo....
Comment 49 Boris Zbarsky [:bz] (still a bit busy) 2012-05-23 06:50:59 PDT
Ah, and in fact it's gone for HTMLCollection itself.  Given that, having it for HTMLPropertiesCollection seems like a spec bug: the whole point of legacycaller is that it should be used when specifying existing APIs that have that wacky behavior.

I filed https://www.w3.org/Bugs/Public/show_bug.cgi?id=17161
Comment 50 Philip Jägenstedt 2012-05-23 07:51:11 PDT
Thanks, I commented on that bug.
Comment 51 Boris Zbarsky [:bz] (still a bit busy) 2012-05-23 23:10:04 PDT
Sorry for the lag here; for some reason I didn't get mail for the review request... :(  Will try to get to this ASAP.
Comment 52 Boris Zbarsky [:bz] (still a bit busy) 2012-05-23 23:20:56 PDT
Though one comment: The implementation of CrawlPropertiesRecursive looks wrong to me (unlike the spec algorithm, which at first glance is correct).

Consider the following simple example:

  <div id="x">
    <span itemscope itemref="x"></span>
  </div>

As far as I can tell, the impl of CrawlPropertiesRecursive in this patch will go into an infinite loop on this markup.
Comment 53 Boris Zbarsky [:bz] (still a bit busy) 2012-05-23 23:24:43 PDT
Oh, and also... it's worth adding getters to nsDOMTokenList that return the atoms instead of the strings and use those here; it'd save you a bunch of string ops.
Comment 54 Boris Zbarsky [:bz] (still a bit busy) 2012-05-23 23:31:48 PDT
Ans also also... what guarantees that aContent in CrawlPropertiesRecursive is an HTML element?  Certainly the static_cast to call GetItemRefInternal assumes it.

One other note: the code seems to only add HTML elements to mProperties, but the spec (http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#the-properties-of-an-item step 10) says to put _any_ element with an @itemprop in there.  I believe this is a spec bug that we should raise.
Comment 55 David Zbarsky (:dzbarsky) 2012-05-24 11:09:48 PDT
(In reply to Boris Zbarsky (:bz) from comment #52)
> Though one comment: The implementation of CrawlPropertiesRecursive looks
> wrong to me (unlike the spec algorithm, which at first glance is correct).
> 
> Consider the following simple example:
> 
>   <div id="x">
>     <span itemscope itemref="x"></span>
>   </div>
> 
> As far as I can tell, the impl of CrawlPropertiesRecursive in this patch
> will go into an infinite loop on this markup.

I have an array of properties, and if I hit an element that is already a property I don't examine it's children.
Comment 56 David Zbarsky (:dzbarsky) 2012-05-24 11:12:52 PDT
(In reply to Boris Zbarsky (:bz) from comment #54)
> Ans also also... what guarantees that aContent in CrawlPropertiesRecursive
> is an HTML element?  Certainly the static_cast to call GetItemRefInternal
> assumes it.

I assert it at the top.  I only call it for nsIContents that are elements when traversing the tree, and the original call is for an nsGenericHTMLElement.
I'll just change the signature to take an Element.
 
> One other note: the code seems to only add HTML elements to mProperties, but
> the spec
> (http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.
> html#the-properties-of-an-item step 10) says to put _any_ element with an
> @itemprop in there.  I believe this is a spec bug that we should raise.

If you read a little further in the spec, it says this:
"Currently, the itemscope, itemprop, and other microdata attributes are only defined for HTML elements. This means that attributes with the literal names "itemscope", "itemprop", etc, do not cause microdata processing to occur on elements in other namespaces, such as SVG."
Comment 57 Boris Zbarsky [:bz] (still a bit busy) 2012-05-24 11:22:15 PDT
> I have an array of properties, and if I hit an element that is already a property

Neither of the elements above is a property.

> I assert it at the top.

You assert that it's an Element, not that it's an HTMLElement.  An HTMLElement can have non-HTML Element kids.

> If you read a little further in the spec, it says this:

Great!
Comment 58 David Zbarsky (:dzbarsky) 2012-05-24 16:01:35 PDT
Created attachment 626999 [details] [diff] [review]
Patch
Comment 59 David Zbarsky (:dzbarsky) 2012-05-25 13:08:11 PDT
Created attachment 627332 [details] [diff] [review]
Patch with files added
Comment 60 Boris Zbarsky [:bz] (still a bit busy) 2012-05-25 20:45:16 PDT
Comment on attachment 626999 [details] [diff] [review]
Patch

>Bug 591467 - Implement HTML5 Microdata API
>* * *
>try: -b do

Fix that before landing, please.

>+++ b/content/base/src/nsGenericElement.cpp
>+    tmp->DeleteProperty(nsGkAtoms::properties);

I think calling this microdataProperties or something would be a lot clearer...

>+++ b/content/html/content/src/Makefile.in
>+		nsDOMHTMLPropertiesCollection.cpp \

Can we name this HTMLPropertiesCollection.cpp?

And call the class mozilla::dom::HTMLPropertiesCollection (with some "using mozilla::dom;" in C++ files as needed to make it simple to work with)?

Note that I'd like the former, even if we don't do the latter.  But I'd really prefer the latter too.

>+++ b/content/html/content/src/nsDOMHTMLPropertiesCollection.cpp
>+typedef mozilla::dom::Element Element;

And hey, if this whole thing were inside namespace mozilla and namespace dom,
you wouldn't need that bit.  ;)

Just make sure that the DOMCI_DATA is outside the namespaces.

>+TraverseEntries(const nsAString& aKey, nsIDOMPropertyNodeList* aEntry, void* aData)

TraverseNamedProperties?

>+NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN(nsDOMHTMLPropertiesCollection)

Shouldn't you mNamedItemEntries.EnumerateRead(ReleaseEntries, NULL) in here?

But more on this below.

>+SetDocuments(const nsAString& aKey, nsIDOMPropertyNodeList* aEntry, void* aData)

SetPropertyListDocument, perhaps?

>+  static_cast<nsPropertyNodeList*>(aEntry)->SetDocument(

That cast will be able to go away, I suspect.  More on this below.  ;)

>+nsDOMHTMLPropertiesCollection::NamedItem(const nsAString& aName,
>+                                         nsIDOMNode** aResult)
>+{
>+  return NS_OK;

Need to set *aResult to null.

This method should ideally go away once we migrate this stuff to webidl.  Can you file a followup bug on that, depending on Peter's bug for moving all this stuff to webidl?

>+nsISupports*
>+nsDOMHTMLPropertiesCollection::GetNamedItem(const nsAString& aName,
>+                                            nsWrapperCache **aCache)
>+{
>+  nsIDOMPropertyNodeList* list = nsnull;
>+  NamedItem(aName, &list);
>+  *aCache = nsnull;
>+  return list;

I think you should reverse this and have this method do the real work so that you can actually set *aCache.

Note also that this code as written leaks if NamedItem() actually addrefs its out param.  Another reason to reverse things: this code should be able to not addref at all.

>+nsDOMHTMLPropertiesCollection::GetNodeAt(PRUint32 aIndex)
>+  return static_cast<nsIContent*>(mProperties.SafeObjectAt(aIndex));

You don't need the cast, do you?

>+nsDOMHTMLPropertiesCollection::NamedItem(const nsAString& aName,
>+                                         nsIDOMPropertyNodeList** aResult)
>+  if (mNamedItemEntries.Get(aName, aResult)) {
>+    return NS_OK;

But aResult wasn't addrefed here.  This will crash when caller releases and objects end up dead.

>+  NS_ADDREF(*aResult = propertyList);
>+  mNamedItemEntries.Put(aName, *aResult);

And _this_ will have a dead object in mNamedItemEntries once the caller releases.

It really sounds like you want mNamedItemEntries holding references, right?  Why not just make it an nsRefPtrHashtable<nsStringHashKey, PropertyNodeList> (after renaming the nsDOMPropertyNodeList class accordingly)?  Then it'll hold refs, your CC code will actually make sense, and this stuff can be made more sane.  Plus since you'll have the PropertyNodeList* in getters you'll be able to set *aWrapperCache above.

>+nsDOMHTMLPropertiesCollection::AttributeChanged(nsIDocument *aDocument, Element* aElement,
>+  mIsDirty = true;

Is it worth at all whitelisting the set of attributes that affect this stuff?  Maybe not.

>+MarkDirty(const nsAString& aKey, nsIDOMPropertyNodeList* aEntry, void* aData)
>+{
>+  static_cast<nsPropertyNodeList*>(aEntry)->mIsDirty = true;

And you wouldn't need the cast here.

>+nsDOMHTMLPropertiesCollection::EnsureFresh()
>+  mProperties.Clear();
>+  mNames->Clear();
>+  mNamedItemEntries.EnumerateRead(MarkDirty, nsnull);

Hmm.  So this is not clearing mNamedItemEntries, which I _think_ is correct, because the spec requires that namedItem() return a list even for names for which there are no entries (unlike [], which doesn't do that).

Do the tests actuall test this?  If not, they should.

>+    nsDOMSettableTokenList* itemProp =
>+    static_cast<nsGenericHTMLElement*>(mProperties.ObjectAt(i))->

Is there a reason mProperties is not an nsTArray<nsRefPtr<nsGenericHTMLElement> > ?

>+      bool contains = mNames->ContainsInternal(propName);

Please document here, and in ContainsInternal, that it's _really_ important that it not call EnsureFresh?

This worries me a bit because this setup is quadratic in number of names, unless we're keeping them sorted somehow or something...  Maybe a follow to see about doing this better if needed?

>+static nsIContent*
>+GetElementById(nsIContent* aContent, const nsAString& aId)

Please call this GetElementByIdForConnectedSubtree or something?  And again, I think this should be taking an nsIAtom.  nsDOMSettableTokenList already operates on atoms.  You'll just need to expose a way to get those out.  Something along the lines of GetParsedAttr on nsDOMTokenList, say.  Just make it public.

I can live with the atomization being a followup if you'd prefer, but it would really simplify and speed up this code...

>+  nsINode* node = aContent->SubtreeRoot(); 
>+  if (!node->IsNodeOfType(nsINode::eCONTENT))
>+    return nsnull;

Why?  This seems wrong....  Not that you should ever hit this case, anyway... I guess you can keep this if you add a NS_NOTREACHED or something?

>+    aContent->GetAttr(kNameSpaceID_None, nsGkAtoms::id, itemId);

See, with atoms you could just compare to GetID().

>+void nsDOMHTMLPropertiesCollection::CrawlProperties()
>+    if (!ref.Equals(itemId)) {

No, this is wrong.  If there are multiple elements in the document with the given id, one of which comes before mRoot, then per spec we should be walking that element.

If there weren't tests for this already, please add some (and contribute them back to the spec test suite).

>+      if (content && content->IsElement()) {

The return value of GetElementById() is always an Element.  In fact, you should perhaps make your GetElementById return Element*.  That's what nsIDocument::GetElementById returns.

>+nsDOMHTMLPropertiesCollection::CrawlPropertiesRecursive(Element* aElement)

Is there a reason to not use an iterative walk here, using GetNextNode() and GetNextNonChildNode() (the latter to handle @itemscope)?

>+nsPropertyNodeList::nsPropertyNodeList(nsDOMHTMLPropertiesCollection* aCollection,
>+  // If there is a space in the name, we will never match it

Is this really worth optimizing for?

>+    mName.Assign(aName);

If it is, assign |name| here, not aName.  It'll be faster.

>+nsPropertyNodeList::GetValues(nsIVariant** aValues)
>+  nsTArray<nsIVariant*> values;

I assume this can't be nsTArray< nsCOMPTr<nsIVariant> > because you need an nsIVariant** for SetAsArray?  Probably worth documenting....

>+      static_cast<nsGenericHTMLElement*>(mElements.ObjectAt(i))->GetItemValue(&itemValue);

If mElements were nsTArray<nsRefPtr<nsGenericHTMLElement> >, no cast needed.

>+nsPropertyNodeList::EnsureFresh()
>+  if (mIsEmpty) {
>+    return;

If we _are_ doing this optimization, why can't this be the first thing in the method?

>+    nsDOMSettableTokenList* itemProp =
>+      static_cast<nsGenericHTMLElement*>(prop)->GetItemPropInternal();

Again, no cast would be needed here...

>+NS_INTERFACE_MAP_BEGIN_CYCLE_COLLECTION(PropertyStringList)
...
>+  NS_INTERFACE_MAP_ENTRIES_CYCLE_COLLECTION(PropertyStringList)

BEGIN_CYCLE_COLLECTION already did ENTRIES_CYCLE_COLLECTION, so you don't need it here.
>+nsGenericHTMLElement::~nsGenericHTMLElement()

This totally doesn't belong here.  Please put it in the right file, in the right place!

>+nsDOMHTMLPropertiesCollection::nsDOMHTMLPropertiesCollection(nsGenericHTMLElement* aRoot)

This should be up above near where we do the CC stuff for this class.

>+  mNamedItemEntries.Init();

Can you file a followup bug to think about moving hashtable init into the ctor, now that we have infallible init?

>+++ b/content/html/content/src/nsDOMHTMLPropertiesCollection.h
>+class nsDOMHTMLPropertiesCollection : public nsIDOMHTMLPropertiesCollection,
>+  nsresult Init();

I don't see such a method anywhere.  Did this actually link?  In any case, remove this, please.

>+  void EnsureFresh();
>+  void CrawlProperties();
>+  void CrawlPropertiesRecursive(mozilla::dom::Element* startNode);

Please document.

And also document the member variables, please.

>+class nsPropertyNodeList : public nsINodeList,
>+public:
>+  bool mIsDirty;

Please make this protected and expose a public SetDirty() method or something.

>+++ b/content/html/content/src/nsGenericHTMLElement.cpp
>@@ -1103,16 +1105,21 @@ nsGenericHTMLElement::BindToTree(nsIDocu
>+    nsDOMHTMLPropertiesCollection* properties = 
>+      static_cast<nsDOMHTMLPropertiesCollection*>(GetProperty(nsGkAtoms::properties));

Worth checking the HasProperties() bit first?  Seems like it should be.

> nsGenericHTMLElement::UnbindFromTree(bool aDeep, bool aNullParent)
>+  nsDOMHTMLPropertiesCollection* properties = 
>+    static_cast<nsDOMHTMLPropertiesCollection*>(GetProperty(nsGkAtoms::properties));

And here.

>@@ -3485,8 +3504,158 @@ nsGenericHTMLElement::ChangeEditableStat
>+NS_IMPL_STRING_ATTR(nsGenericHTMLElement, ItemType, itemtype)

itemType is supposed to return a DOMSettableTokenList, no?  Surprised there are no tests for this.

>+NS_IMPL_URI_ATTR(nsGenericHTMLElement, ItemId, itemid)

Hmm.  I guess this is correct, but the spec is slightly unclear here.  Please add tests for this if we don't have them already?

>+nsGenericHTMLElement::SetItemValue(nsIVariant* aValue)
>+  if (!HasAttr(kNameSpaceID_None, nsGkAtoms::itemprop) ||
>+    HasAttr(kNameSpaceID_None, nsGkAtoms::itemscope)) {

Please fix the indent.

>+nsGenericHTMLElement::GetItemRef(nsIVariant** aResult)

Why nsIVariant** and not nsIDOMDOMSettableTokenList**?  That looks wrong to me.  Or is that to deal with the PutForwards business?

This should call GetItemRefInternal.

>+nsGenericHTMLElement::SetItemRef(nsIVariant* aValue)

Man, am I looking forward to this moving to Paris bindings.  This would be way simpler....

>+nsGenericHTMLElement::GetItemProp(nsIVariant** aResult)

Again, please call GetItemPropInternal.

>+++ b/content/html/content/src/nsGenericHTMLElement.h
>@@ -109,16 +115,35 @@ public:
>+  virtual void GetItemValueText(nsAString& text);
>+  virtual void SetItemValueText(const nsAString& text);

Document that these are used to implement the behavior of GetSetItemValue when
the item has @itemprop but not @itemscope?

>+  nsRefPtr<nsDOMSettableTokenList> mItemRef;
>+  nsRefPtr<nsDOMSettableTokenList> mItemProp;

I'm not entirely happy adding two words to every HTML element for this edge case.  Can you please either use properties or put this in DOMSlots or something?

>+++ b/content/html/content/src/nsHTMLAnchorElement.cpp
>+nsHTMLAnchorElement::GetItemValueText(nsAString& aValue)
>+{
>+  GetURIAttr(nsGkAtoms::href, nsnull, aValue);

GetHref?

>+nsHTMLAnchorElement::SetItemValueText(const nsAString& aValue)
>+  SetAttr(kNameSpaceID_None, nsGkAtoms::href, aValue, true);

SetHref?

Or is there a difference?

>+++ b/content/html/content/src/nsHTMLAreaElement.cpp

Same here.

>+++ b/content/html/content/src/nsHTMLAudioElement.cpp

Again, Get/SetSrc unless there's a difference.  If there is, document what it is, please.

Similar for the other HTML element classes.

>+++ b/content/html/content/src/nsHTMLSharedObjectElement.cpp

I believe the changes to this file are wrong for <applet>.  That should be calling on through to the parent class.

>+++ b/content/html/document/src/nsHTMLDocument.cpp
>+static bool MatchItems(nsIContent* aContent, PRInt32 aNameSpaceID, 
>+  if (!elem->HasAttr(kNameSpaceID_None, nsGkAtoms::itemscope) ||
>+    elem->HasAttr(kNameSpaceID_None, nsGkAtoms::itemprop)) {

Fix indent, please.

>+  nsTArray<nsString>* tokens = static_cast<nsTArray<nsString>*>(aData);

Seems like this might work better as an array of atoms....

>+  nsAutoString itemType;
>+  elem->GetItemType(itemType);
>+  return tokens->Contains(itemType);

No, this is wrong.  You want to get the itemType as an array (of atoms) and make sure it's a subset of the other.  As written, this will fail if the element has multiple item types.

Again, please add tests if those weren't present already.

>+nsHTMLDocument::GetItems(const nsAString& types, PRUint8 _argc, nsIDOMNodeList** aReturn)
>+  *aReturn = elements.get();

You meant to use forget() here somewhere.  Right now you're not giving *aReturn a ref!

>+++ b/dom/interfaces/html/nsIDOMHTMLDocument.idl
>+  [optional_argc] nsIDOMNodeList  getItems([optional] in DOMString types);

Why do you need the optional_argc if you never use it?

You need to rev the IID here.

>+++ b/dom/interfaces/html/nsIDOMHTMLElement.idl
>+           attribute DOMString                      itemType;

This needs to be an nsIVariant like itemPop and itemRef.

And for all three document why, please?

>+++ b/dom/interfaces/html/nsIDOMHTMLPropertiesCollection.idl

MPL2, please.

>+interface nsIDOMHTMLPropertiesCollection : nsISupports

Why can't this just inherit from nsIDOMHTMLCollection?  Then you'd only need to declare "names" here; everything else would just get inherited...

Unless you really care about having the XPCOM GetNamedItem returning nsIDOMPropertyNodeList... which I personally don't care about.

Or is this to avoid multiple inheritance due to inheriting from nsIHTMLCollection?  Or because you need to keep supporting xpconnect bindings when new list bindings are disabled?

>+++ b/dom/interfaces/html/nsIDOMPropertyNodeList.idl

MPL2.

>+interface nsIDOMPropertyNodeList : nsISupports {

Again, why not just inherit from nsIDOMNodeList?

>+++ b/js/xpconnect/src/dom_quickstubs.qsconf
>-    'nsIDOMHTMLElement.*',

Why this change?  If you're having to not quickstub something, please document what the something is and why it's not quickstubbed.

>+    'nsIDOMHTMLPropertiesCollection.item',
>+    'nsIDOMHTMLPropertiesCollection.length',

Do we need to do this for the new nodelist too?
Comment 61 David Zbarsky (:dzbarsky) 2012-05-26 22:54:22 PDT
(In reply to Boris Zbarsky (:bz) from comment #60)
> Comment on attachment 626999 [details] [diff] [review]
> Patch
> This method should ideally go away once we migrate this stuff to webidl. 
> Can you file a followup bug on that, depending on Peter's bug for moving all
> this stuff to webidl?

Filed bug 758903.  I couldn't find the bug you're talking about.

> But aResult wasn't addrefed here.  This will crash when caller releases and
> objects end up dead.

The reason that I don't addref in NamedItem is that an addref happens later - somewhere in the binding code.
I've checked by setting breakpoints in AddRef and Release.  Also, the way the code is written right now doesn't leak or hold onto dead objects.

> 
> It really sounds like you want mNamedItemEntries holding references, right? 
> Why not just make it an nsRefPtrHashtable<nsStringHashKey, PropertyNodeList>
> (after renaming the nsDOMPropertyNodeList class accordingly)? 

I didn't use an nsRefPtrHashtable because when an item is first added, you have to already be holding a ref to it, otherwise it gets destroyed.
I figured I may as well just add the ref manually to avoid addrefing and releasing multiple times.  Also, I don't want Get to addref, because then I would be addrefing items twice (see above).

> 
> Do the tests actuall test this?  If not, they should.

The tests test this, and the patch currently fails, because it creates a PropertyNodeList even for [] used with an undefined name.
I'm not sure how to distinguish [] from GetNamedItem calls, perhaps you know.
There is also another issue - when we call collection[-1], the test claims we should return undefined, but we end up converting -1 to a String and creating a PropertyNodeList because of the issue above.

> This worries me a bit because this setup is quadratic in number of names,
> unless we're keeping them sorted somehow or something...  Maybe a follow to
> see about doing this better if needed?

Filed bug 758928


> No, this is wrong.  If there are multiple elements in the document with the
> given id, one of which comes before mRoot, then per spec we should be
> walking that element.

I don't think this is a problem anymore.


> Can you file a followup bug to think about moving hashtable init into the
> ctor, now that we have infallible init?

Filed bug 758929


> itemType is supposed to return a DOMSettableTokenList, no?  Surprised there
> are no tests for this.

The tests were written against an older version of the spec, when itemType was a string.
I changed this, and we now fail these tests.

> >+NS_IMPL_URI_ATTR(nsGenericHTMLElement, ItemId, itemid)
> 
> Hmm.  I guess this is correct, but the spec is slightly unclear here. 
> Please add tests for this if we don't have them already?

There are tests for this.  We fail one test which sets "" to itemId and expects to get "" back, but we end up resolving the URI.
I read http://www.whatwg.org/specs/web-apps/current-work/#resolve-a-url, which points to http://tools.ietf.org/html/rfc3986#section-5.2
I think the relevant part is 
"if (R.path == "") then
 T.path = Base.path;"
so an empty URI should still resolve, which I think makes the test wrong.
 

> Why nsIVariant** and not nsIDOMDOMSettableTokenList**?  That looks wrong to
> me.  Or is that to deal with the PutForwards business?

Yep, because of PutForwards.  I added a comment to the idl.

> 
> Similar for the other HTML element classes.
> 
> >+++ b/content/html/content/src/nsHTMLSharedObjectElement.cpp
> 
> I believe the changes to this file are wrong for <applet>.  That should be
> calling on through to the parent class.

Good catch.


> 
> No, this is wrong.  You want to get the itemType as an array (of atoms) and
> make sure it's a subset of the other.  As written, this will fail if the
> element has multiple item types.
> 
> Again, please add tests if those weren't present already.

I implemented this, so we now fail the relevant tests (they expect an item to match if it's itemtype is equal to any of the types getItems is called with, rather than all of them).  I will write correct tests.

> >+nsHTMLDocument::GetItems(const nsAString& types, PRUint8 _argc, nsIDOMNodeList** aReturn)
> >+  *aReturn = elements.get();
> 
> You meant to use forget() here somewhere.  Right now you're not giving
> *aReturn a ref!

I believe this gets reffed when the ContentList gets a binding.  This code doesn't crash.


> Why can't this just inherit from nsIDOMHTMLCollection?  Then you'd only need
> to declare "names" here; everything else would just get inherited...
> 
> Unless you really care about having the XPCOM GetNamedItem returning
> nsIDOMPropertyNodeList... which I personally don't care about.
> 
> Or is this to avoid multiple inheritance due to inheriting from
> nsIHTMLCollection?  Or because you need to keep supporting xpconnect
> bindings when new list bindings are disabled?

Basically it's because it works the way it is and I don't think it's worth changing since we'll move to webidl eventually.
Comment 62 David Zbarsky (:dzbarsky) 2012-05-27 01:49:30 PDT
Created attachment 627546 [details] [diff] [review]
Interdiff

Working on the tests now.
Comment 63 David Zbarsky (:dzbarsky) 2012-05-27 13:11:45 PDT
*** Bug 742633 has been marked as a duplicate of this bug. ***
Comment 64 Boris Zbarsky [:bz] (still a bit busy) 2012-05-28 10:50:11 PDT
Yikes.  Your interdiff was after the file rename and it looks like you didn't use hg rename for the rename or something?  Any chance of an interdiff that would actually be meaningful for the new files?

(In reply to David Zbarsky from comment #61)
> The reason that I don't addref in NamedItem is that an addref happens later
> - somewhere in the binding code.

So what?  This is an XPCOM getter.  It can be called by arbitrary consumers, who will expect to hold an owning ref to the object once the call returns.

The binding code is not calling NamedItem() directly, is it?  It's calling GetNamedItem(), I would bet.  And your GetNamedItem implementation is violating the XPCOM contract for NamedItem, which is why it ends up working out, sort of.

> Also, the way the code is written right now doesn't leak or hold onto dead objects.

If anyone from C++ ever calls your NamedItem, you will be holding on to dead objects.

> I didn't use an nsRefPtrHashtable because when an item is first added, you
> have to already be holding a ref to it, otherwise it gets destroyed.

Worth filing a bug on that, but even if that's the case, just do this:

  nsRefPtr<MyClass> obj = new MyClass();
  myTable.Put(aName, obj);

Yes, it's an extra addref+release the first time for the name (when creating the object).  I think that's just fine.

> Also, I don't want Get to addref

Use GetWeak, then.

> The tests test this, and the patch currently fails, because it creates a
> PropertyNodeList even for [] used with an undefined name.

I actually meant that the tests test that namedItem() with a name not in the collection returns a list.  But it sounds like the tests test both that that happens and that [] does not have that behavior?

> I'm not sure how to distinguish [] from GetNamedItem calls, perhaps you know.

My guess is that you don't actually want the "forward(getNamedItem)" bit in your IDL, since namedItem is supposed to have quite different behavior from getNamedItem (the latter is what implements [], and should only return non-null for lists that actually map something).  And then you need to have slightly different implementations for NamedItem and GetNamedItem in your C++.  Try that?  If it doesn't work, check with peterv?

> > No, this is wrong.  If there are multiple elements in the document with the
> > given id, one of which comes before mRoot, then per spec we should be
> > walking that element.
> 
> I don't think this is a problem anymore.

Why not?  Sure still looks like a problem to me.  Consider a testcase like this:

  <div id="foo" itemprop="bar"></div>
  <span itemscope id="foo" itemref="foo"></span>

The .properties for the <span> should have the <div> as the "bar" prop, right?  Does it with your code?  Do the tests have a test for this?
  
> I changed this, and we now fail these tests.

Awesome.  ;)  Fix the tests and get those pushed back to the test suite as needed?

> There are tests for this.  We fail one test which sets "" to itemId and
> expects to get "" back, but we end up resolving the URI.

I believe our behavior is correct, yeah.  Please push back on the tests?

> I implemented this, so we now fail the relevant tests (they expect an item
> to match if it's itemtype is equal to any of the types getItems is called
> with, rather than all of them).  I will write correct tests.

Again, we should upstream those.

> I believe this gets reffed when the ContentList gets a binding.  This code
> doesn't crash.

It will if a GC actually happens, so that the JS object is collected and drops its reference.  At that point your refcount can underflow.

If it doesn't crash, in this case all that means it's not being tested enough.  ;)

> Basically it's because it works the way it is and I don't think it's worth
> changing since we'll move to webidl eventually.

Agreed.  Just add a comment, please.
Comment 65 David Zbarsky (:dzbarsky) 2012-05-28 22:04:24 PDT
Created attachment 627852 [details] [diff] [review]
Fix tests to match spec
Comment 66 :Ms2ger (⌚ UTC+1/+2) 2012-05-29 02:42:27 PDT
Comment on attachment 627852 [details] [diff] [review]
Fix tests to match spec

Review of attachment 627852 [details] [diff] [review]:
-----------------------------------------------------------------

Please add a test that properties(foo) throws.

zcorpan approved the changes with these two changes, but with reservation that Opera will defer more careful review until they update their implementation to match the spec.

I can address those comments if you like; let me know?

::: tests/submission/Opera/microdata/001.html
@@ +3,5 @@
>  	<head>
>  		<meta charset="UTF-8">
>  		<title>Microdata tests</title>
> +		<script type="text/javascript" src="http://w3c-test.org/resources/testharness.js"></script>
> +		<script type="text/javascript" src="http://w3c-test.org/resources/testharnessreport.js"></script>

Leave those as they are.
Comment 67 David Zbarsky (:dzbarsky) 2012-05-29 11:11:40 PDT
Created attachment 628023 [details] [diff] [review]
Fix tests to match spec

Ms2ger, you know what to do with this.
Comment 68 :Ms2ger (⌚ UTC+1/+2) 2012-05-29 11:48:06 PDT
Comment on attachment 628023 [details] [diff] [review]
Fix tests to match spec

https://dvcs.w3.org/hg/html/rev/50bd3075d1b6
Comment 69 :Ms2ger (⌚ UTC+1/+2) 2012-05-29 12:56:01 PDT
Created attachment 628072 [details] [diff] [review]
Import the test
Comment 70 David Zbarsky (:dzbarsky) 2012-05-29 23:14:52 PDT
Created attachment 628247 [details] [diff] [review]
Interdiff

This interdiff should be better.  Just ignore the test file, I deleted it since we're going to import the test suite.
Comment 71 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2012-05-30 07:00:34 PDT
Comment on attachment 626999 [details] [diff] [review]
Patch

Unsetting the review request, because Boris is looking at this and making better review comments than I would have.
Comment 72 David Zbarsky (:dzbarsky) 2012-05-31 14:40:04 PDT
bz, Ignore the refcounting for NamedItem/GetNamedItem.  It's not quite right, and I'm going to switch to the nsRefPtrHashtable in the next iteration of this patch.
Comment 73 David Zbarsky (:dzbarsky) 2012-05-31 17:11:51 PDT
Created attachment 628984 [details] [diff] [review]
Hashtable and leak fix
Comment 74 Boris Zbarsky [:bz] (still a bit busy) 2012-05-31 19:58:31 PDT
Comment on attachment 628247 [details] [diff] [review]
Interdiff

>+++ b/content/html/content/src/HTMLPropertiesCollection.cpp
>+SetPropertyListDocument(const nsAString& aKey, PropertyNodeList* aEntry, void* aData)
>+  aEntry->SetDocument( static_cast<nsIDocument*>(aData));

Nix the extra space there?

>+HTMLPropertiesCollection::Item(PRUint32 aIndex, nsIDOMNode** aResult)
>+  nsGenericHTMLElement* property = mProperties.SafeElementAt(aIndex);
>+  *aResult = property ? property->AsDOMNode() : NULL;
>+  return NS_OK;

Hey, this code _used_ to be correct.  ;)

You need an NS_IF_ADDREF(*aResult) here.

>+HTMLPropertiesCollection::GetNamedItem(const nsAString& aName,

This shouldn't be addrefing its return value, as you presumably discovered.

>+HTMLPropertiesCollection::NamedItem(const nsAString& aName,

Wonder whether it makes any sense to factor out the "ensure property list" code between here and GetNamedItem....

>+class TreeOrderComparator {
>+      return nsContentUtils::PositionIsBefore(const_cast<nsGenericHTMLElement*>(aElem1),
>+                                              const_cast<nsGenericHTMLElement*>(aElem2));

File a followup bug to make PositionIsBefore work on const nodes?  Though I guess it does use IndexOf(), so is fundamentally not that const-safe.  :(

>+HTMLPropertiesCollection::EnsureFresh()

Come to think of it, an explicit comment saying that mNamedItemEntries is not being cleared on purpose might be a good idea.

Also, do you really need to GetItemPropInternal here?  Seems like GetParsedAttr() would work just fine, and then you could work with dependent atom strings as needed, and not force allocations of the nsDOMSettableTokenList.

>+HTMLPropertiesCollection::CrawlProperties()

Again, just call GetParsedAttr() on the element if you don't need the nsDOMSettableTokenList for anything else?

>+        nsString id;
>+        ref->ToString(id);
>+        element = doc->GetElementById(id);

  element = doc->GetElementById(nsDependentAtomString(ref));

>+      if (element && element != mRoot) {

Is it worth to check that element is not a descendant of mRoot?  Might not be....  I guess the check for mRoot is pretty cheap, even though it'll pretty much never be mRoot, right?

>+HTMLPropertiesCollection::CrawlPropertiesRecursive(Element* aElement)

Probably better to rename this to CrawlSubtree().

>+    if (!aContent->IsElement()) {

Since you plan to check IsHTML() and GetNextNode() if not anyway, can't you do:

  if (aContent == mRoot || !aContent->IsHTML()) {
    // Move on to the next node in the tree
    aContent = aContent->GetNextNode(aElement);
  } else {
    MOZ_ASSERT(aContent->IsElement(), "IsHTML() returned true!");
    Element* element = aContent->AsElement();
    // etc, with some stuf outdented since now you know this is HTML
  }

>+      if (element != mRoot && element->IsHTML()) {

Document that the !mRoot check is not just an optimization.  It's actually needed for correctness, because the spec explicitly excludes an element from being one of its own properties.

>+PropertyNodeList::PropertyNodeList(HTMLPropertiesCollection* aCollection,
>+  mName.Assign(aName);

Just toss mName(aName) in the initializer list instead?

>+PropertyNodeList::GetValues(nsIVariant** aValues)
>       static_cast<nsGenericHTMLElement*>(mElements.ObjectAt(i))->GetItemValue(&itemValue);

I still claim that mElements should be nsTArray<nsRefPtr<nsGenericHTMLElement> >.

>+PropertyNodeList::EnsureFresh()

And here, again seems like you should be able to use GetParsedAttr instead of GetItemPropInternal.

> }
>+}

Add comments like "// namespace dom" and "// namespace mozilla" after those close curlies?  And similar in the header.

>+++ b/content/html/content/src/nsGenericHTMLElement.cpp
> nsGenericHTMLElement::GetItemRef(nsIVariant** aResult)
>+  out->SetAsInterface(NS_GET_IID(nsIDOMDOMSettableTokenList), itemRef);

You should probably make the type of itemRef be nsIDOMDOMSettableTokenList*.  Otherwise you're depending on the exact inheritance chain of nsDOMSettableTokenList, since SetAsInterface takes void*, not nsISupports*.

Similar comments in GetItemProp and GetItemType.

I don't see where you're traversing/unlinking these three new properties in nsGenericHTMLElement (and I guess the first patch had that problem too).

Also, shouldn't they have property destructors that release the nsDOMSettableTokenLists?

>+++ b/content/html/content/src/nsHTMLAudioElement.cpp
> nsHTMLAudioElement::SetItemValueText(const nsAString& aValue)
>+  // Can't call GetSrc because we don't have a JSContext

s/GetSrc/SetSrc/

>+++ b/content/html/content/src/nsHTMLVideoElement.cpp
> nsHTMLVideoElement::SetItemValueText(const nsAString& aValue)
>+  // Can't call GetSrc because we don't have a JSContext

Likewise.

>+++ b/content/html/document/src/nsHTMLDocument.cpp
>+  nsDOMSettableTokenList* itemType = elem->GetItemTypeInternal();

Why allocate the nsDOMSettableTokenList when you just need the parsed attribute?  Use GetParsedAttr().

>+  if (attr->Type() == nsAttrValue::eAtomArray) {

I don't think you need this.  This attr was parsed via ParseAtomArray.  Just use a loop like so:

  for (PRUint32 i = 0; i < tokens->Length(); i++) {
    if (!attr->Contains(tokens->ElementAt(i), eCaseMatters)) {
      return false;
    }
  }
Comment 75 Boris Zbarsky [:bz] (still a bit busy) 2012-05-31 20:02:26 PDT
Comment on attachment 628984 [details] [diff] [review]
Hashtable and leak fix

The hashtable bits are good.

The delete calls are wrong, I think.  Can't JS still be holding refs to those objects?  Really, you want to give those objects destructors that will release them, and that should fix the leak.  Well, that, and doing cycle collection properly for them.
Comment 76 David Zbarsky (:dzbarsky) 2012-06-01 13:15:45 PDT
Created attachment 629315 [details] [diff] [review]
Interdiff from last review
Comment 77 Boris Zbarsky [:bz] (still a bit busy) 2012-06-01 22:25:45 PDT
Comment on attachment 629315 [details] [diff] [review]
Interdiff from last review

>+++ b/content/base/src/nsGenericElement.cpp

Might be good to put the new stuff in unlink/traverse behind an IsHTML() check, come to think of it.

The traverse code assumes that nsISupports is on the primary inheritance chain for nsDOMSettableTokenList.  That's probably OK (esp. since Paris bindings require that in general), but please document that in both nsDOMSettableTokenList or nsDOMTokenList?

Alternately, you need to cast to nsIDOMDOMSettableTokenList* when storing and retrieving the prop (and when retrieving you can then further cast to nsDOMTokenList*).

Working with void* is a huge pain.  :(

>+++ b/content/html/content/src/nsGenericHTMLElement.cpp

>+nsGenericHTMLElement::GetTokenList(nsIAtom* aAtom)
>+  NS_ADDREF(list);
>+  return list;

If you're addrefing, you should return already_AddRefed<nsDOMSettableTokenList>.

But I don't think you want this addref.  It certainly just makes the consumers leak!

> nsGenericHTMLElement::SetItemRef(nsIVariant* aValue)
>+  nsRefPtr<nsDOMSettableTokenList> itemRef = GetTokenList(nsGkAtoms::itemref);

Why does this need to be a strong ref?  I don't think it does.

> nsGenericHTMLElement::SetItemProp(nsIVariant* aValue)
>+  nsRefPtr<nsDOMSettableTokenList> itemProp = GetTokenList(nsGkAtoms::itemprop);

Same here.

> nsGenericHTMLElement::SetItemType(nsIVariant* aValue)
>+  nsRefPtr<nsDOMSettableTokenList> itemType = GetTokenList(nsGkAtoms::itemtype);

And here.

r=me with those issues fixed.  I'd like to see the interdiff fixing them, though.

And thank you again for the interdiffs!
Comment 78 David Zbarsky (:dzbarsky) 2012-06-03 00:14:47 PDT
Created attachment 629565 [details] [diff] [review]
(Hopefully) final interdiff

Try looks green.
Comment 79 Boris Zbarsky [:bz] (still a bit busy) 2012-06-03 00:18:35 PDT
Comment on attachment 629565 [details] [diff] [review]
(Hopefully) final interdiff

>-    NS_ADDREF(aAtom);
>+    NS_ADDREF(list);

Uh.... yes.  Very much.  I have no idea how I missed that!

r=me
Comment 80 David Zbarsky (:dzbarsky) 2012-06-04 10:47:44 PDT
Created attachment 629849 [details] [diff] [review]
Rollup patch
Comment 81 David Zbarsky (:dzbarsky) 2012-06-04 16:54:21 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/28dd33748be5
Comment 82 Boris Zbarsky [:bz] (still a bit busy) 2012-06-04 18:34:39 PDT
David, are you going to land the tests too?
Comment 83 David Zbarsky (:dzbarsky) 2012-06-04 18:58:37 PDT
I did, I just didn't post the cset.
https://hg.mozilla.org/integration/mozilla-inbound/rev/05e95c78c017
Comment 85 Kohei Yoshino [:kohei] 2012-10-15 09:24:08 PDT
[FYI] Movable Type has been affected by this change. The specific feature no longer works as a custom DOM property called itemId has been used in their code. Today Six Apart published the workaround patch:
https://github.com/movabletype/movabletype/commit/83d2f3d21d9c9a951d7e872d70bac5d355bd3d4d
http://www.movabletype.jp/faq/firefox-16-patches.html

I've added this bug to our site compatibility document:
https://dev.mozilla.jp/2012/10/firefox-16-site-compatibility/
Comment 86 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2013-08-26 21:28:02 PDT
Note that bug 909633 proposes removing this feature.
Comment 87 Jean-Yves Perrier [:teoli] 2016-05-25 09:46:41 PDT
Marking as dev-doc-complete as we had docs and the feature got removed (docs update listed in bug 909633)

Note You need to log in before you can comment on or make changes to this bug.