Closed Bug 1037483 Opened 10 years ago Closed 8 years ago

update firefox microformats parser with microformats2 support

Categories

(Firefox Graveyard :: SocialAPI, defect)

32 Branch
x86
macOS
defect
Not set
normal

Tracking

(firefox47 fixed)

RESOLVED FIXED
Firefox 47
Tracking Status
firefox47 --- fixed

People

(Reporter: mixedpuppy, Assigned: mixedpuppy)

References

(Blocks 1 open bug)

Details

(Whiteboard: microformats,microformats2)

Attachments

(2 files, 11 obsolete files)

22.03 KB, patch
mixedpuppy
: review+
Details | Diff | Splinter Review
1.20 MB, patch
mixedpuppy
: review+
Details | Diff | Splinter Review
The microformats modules is v1, we should update to support v2.  Bug 926417 led my to try using the microformats-shiv project on github.  That could be used as a module, but should be written to take advantage of internal apis in firefox.
For reference, the microformats2 parsing spec: http://microformats.org/wiki/microformats2-parsing
Whiteboard: microformats,microformats2
Summary: update firefox microformats support → update firefox microformats support with microformats2 support
Summary: update firefox microformats support with microformats2 support → update firefox microformats parser with microformats2 support
I've been using microformats-shiv as well. It's a much better implementation than what I wrote. My microformats API didn't get any traction (as far as I know).

I'm not sure it should even be in Firefox anymore...
Attached patch microformats (obsolete) — Splinter Review
Patch that adds microformats v2 support, updates PageMetaData.jsm to also use microformats
You are a gentleman and a scholar.
We had some microformats tests. Does this pass them? I'm wondering how close the objects will be.
(In reply to Mike Kaply [:mkaply] from comment #6)
> We had some microformats tests. Does this pass them? I'm wondering how close
> the objects will be.

Glenn has done a lot of work on the test suite, I'm looking at how to integrate it into the codebase, but am not so familiar with mocha.  You can see his new branch here:

https://github.com/glennjones/microformat-shiv/tree/dev

update-tests pulls all the tests down from the test repo.
(In reply to Mike Kaply [:mkaply] from comment #2)
> I've been using microformats-shiv as well. It's a much better implementation
> than what I wrote. My microformats API didn't get any traction (as far as I
> know).
> 
> I'm not sure it should even be in Firefox anymore...

Can we get a follow-up bug to pull it?
(In reply to Jet Villegas (:jet) from comment #8)
> (In reply to Mike Kaply [:mkaply] from comment #2)
> > I've been using microformats-shiv as well. It's a much better implementation
> > than what I wrote. My microformats API didn't get any traction (as far as I
> > know).
> > 
> > I'm not sure it should even be in Firefox anymore...
> 
> Can we get a follow-up bug to pull it?

Well, it appears some components are starting to use it now (and we need to check add-ons). Maybe now's the time to actually try to get some traction.
Attached patch microformats-shiv (obsolete) — Splinter Review
Attachment #8632273 - Attachment is obsolete: true
Attached patch microformats-tests (obsolete) — Splinter Review
Test infrastructure for microformats2 integrated to marionette.
Attached patch microformats-share (obsolete) — Splinter Review
PageMetaData and share support
The results of the microformats-shiv are similar to what Microformats.js does.

Do you see any reason not to simply replace one with the other?
(In reply to Mike Kaply [:mkaply] from comment #13)
> The results of the microformats-shiv are similar to what Microformats.js
> does.
> 
> Do you see any reason not to simply replace one with the other?

not really, there was only one addon on AMO that used the old parser.  removal can be a separate patch or bug.
In addition to reviewing the current attachments, I am reviewing:
* the v1.3 update to microformats-shiv at https://github.com/glennjones/microformat-shiv and 
* the update to microformats-tests at https://github.com/microformats/tests
r+ microformats-share
(still reviewing attachment + updates for microformats-shiv and microformats-tests)
Attached patch microformats-tests (obsolete) — Splinter Review
Attachment #8645998 - Attachment is obsolete: true
Attachment #8667997 - Flags: review?(tantek)
Attached patch microformats-shiv (obsolete) — Splinter Review
Attachment #8645997 - Attachment is obsolete: true
Attachment #8667998 - Flags: review?(tantek)
Attached patch microformats-share (obsolete) — Splinter Review
Attachment #8646000 - Attachment is obsolete: true
Attachment #8667999 - Flags: review?(tantek)
Attachment #8667999 - Flags: review?(tantek) → review+
(In reply to Shane Caraveo (:mixedpuppy) from comment #17)
> Created attachment 8667997 [details] [diff] [review]
> microformats-tests

I don't seem to be able to review this attachment, e.g. when I go to https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1037483&attachment=8667997 I get the error message: "Attachment 8667997 [details] [diff] is not an attachment to bug 1037483" - Shane could you double-check the reviewer flag on this attachment?
Attachment #8667997 - Attachment is patch: true
Still reviewing microformats-tests and microformats-shiv attachments and will be doing so this weekend til I've gotten through all the files in both!
Comment on attachment 8667997 [details] [diff] [review]
microformats-tests

Review of attachment 8667997 [details] [diff] [review]:
-----------------------------------------------------------------

Very thorough number and combination of test files and harness. A few prose spelling errors in comments, but those can be fixed in a subsequent small patch.
Attachment #8667997 - Flags: review?(tantek) → review+
Comment on attachment 8667998 [details] [diff] [review]
microformats-shiv

Review of attachment 8667998 [details] [diff] [review]:
-----------------------------------------------------------------

A few prose spelling errors in comments, but those can be fixed in a subsequent small patch.

In addition, the following minor fixes can also be made in a subsequent small patch:
* Uses of ".indexOf(':')" should be more conservatively ".indexOf('://')" (no known examples where it would break either way)
* removeWhiteSpace should be renamed collapseWhiteSpace since it "leaves a single space" according to comments (and code)
* for modules.maps['h-recipe'] - 'tag': {} - should likely be like h-entry, h-review etc.
'category': {
				'map': 'p-category',
				'relAlt': ['tag']
},

I've updated the h-recipe spec accordingly to explicitly mention "p-category" in the properties list and for Parser Compat just like h-entry (omission was an oversight).
Attachment #8667998 - Flags: review?(tantek) → review+
(In reply to Tantek Çelik from comment #22)
> Very thorough number and combination of test files and harness. A few prose
> spelling errors in comments, but those can be fixed in a subsequent small
> patch.

A follow-up just to fix those will make it more difficult for future developers to dig through past changes. Let's just fix those now in this patch, while we're here. Should be super easy if it's not code change.
There are many included bits of external code in the shiv patch. Prior to landing we need to get all the licensing for that code approved.

Mossop, what's the right way to do that? I usually email authors to get the code re-licensed under MPL, but want to confirm that's the correct path.
Flags: needinfo?(dtownsend)
Looks like it is MIT licensed which I believe we can just include as-is but I defer licensing questions to someone who knows what they're talking about like Gerv.
Flags: needinfo?(dtownsend) → needinfo?(gerv)
The shiv patch has a bunch of different things inside, pretty sure I saw some with different licenses than MIT.
dietrich and mossop - I put microformats-shiv under MIT, as it seems to be one of most widely know and understood of the family of permissive free licenses.  

If we can keep as MIT that would be great, I have nothing against MPL, but this code is used in other JavaScript projects. In some cases companies have a short list of pre-approved licenses of which MIT seems quite common. A dual license MIT/MPL would also be great, but I am not sure if that is possible? I would consider putting it under MPL only license as a last resort, but it would most likely reduce it uptake elsewhere which would be a shame.

There are two pieces of 3rd party code in microformats-shiv, one is under a BDS license the other has a public domain statement.  Interestingly neither of these two pieces of code is needed for Firefox over version 26.0; they are in the library as polyfills for missing/broken features in older browsers.

We could as dietrich suggests ask the authors of these two polyfills to relicense there code or I could change the project to produce a version of the microformats-shiv.js without the polyfills just for use in Firefox.  Updating the build to produce the version without the polyfills would be easy, about 30 mins work.
I would suggest updating the library to produce a non-polyfills version (why carry unused code). After that, MIT licensing is fine. You will need to add a copy of the MIT license to about:license (toolkit/content/license.html), in alphabetical order. It should be fairly obvious how to do that. Set r? to me on the patch.

Gerv
Flags: needinfo?(gerv)
Attached patch microformats-tests (obsolete) — Splinter Review
Attachment #8667997 - Attachment is obsolete: true
Attachment #8671593 - Flags: review?(tantek)
Attached patch microformats-shiv (obsolete) — Splinter Review
Attachment #8667998 - Attachment is obsolete: true
Attachment #8671594 - Flags: review?(tantek)
Shanes last update to the patch hopefully addresses all current issues. As per Gerv comment, it now has no 3rd party polyfills and embedded licenses. It also addressed all of Tantek review points:

* Added p-category support to v1 hrecipe instead of tag{}
* Renamed removeWhiteSpace to collapseWhiteSpace
* Update test for protocol separator to indexOf('://')
* Corrected spelling errors in comments and try to make them clearer
Comment on attachment 8671594 [details] [diff] [review]
microformats-shiv

Review of attachment 8671594 [details] [diff] [review]:
-----------------------------------------------------------------

Verified code fixes have been made. I didn't keep track of each specific comment typo - and this review tool does not show a diff from the previous patch. If any further typos are found in comments, they can be fixed in subsequent patches that will be trivial for future developers to look past (since such patches will actually show diffs, instead of this first patch of the code which even when updated here shows everything in its entirety as "new").
Attachment #8671594 - Flags: review?(tantek) → review+
Comment on attachment 8671593 [details] [diff] [review]
microformats-tests

Review of attachment 8671593 [details] [diff] [review]:
-----------------------------------------------------------------

I didn't keep track of each specific comment typo - and this review tool does not show a diff from the previous patch. If any further typos are found in comments, they can be fixed in subsequent patches that will be trivial for future developers to look past (since such patches will actually show diffs, instead of this first patch of the code which even when updated here shows everything in its entirety as "new").
Attachment #8671593 - Flags: review?(tantek) → review+
(In reply to Dietrich Ayala (:dietrich) from comment #24)
> (In reply to Tantek Çelik from comment #22)
> > Very thorough number and combination of test files and harness. A few prose
> > spelling errors in comments, but those can be fixed in a subsequent small
> > patch.
> 
> A follow-up just to fix those will make it more difficult for future
> developers to dig through past changes. Let's just fix those now in this
> patch, while we're here. Should be super easy if it's not code change.

Dietrich, looks like all fixes and other reviews are in. Could you take a quick look to make sure your concerns are handled so we can land this?
Comment on attachment 8667999 [details] [diff] [review]
microformats-share

r+, just a few questions below.

>-      gBrowser.selectedBrowser.messageManager.sendAsyncMessage("PageMetadata:GetMicrodata", null, { target });
>+      gBrowser.selectedBrowser.messageManager.sendAsyncMessage("PageMetadata:GetPageData", null, { target });

GetMicroData is removed entirely? If so, have you taken a look at how many consumers of this exist, eg in the wild in add-ons? (we can search source of all add-ons on AMO).

> var PageMetadataMessenger = {
>   init() {
>     addMessageListener("PageMetadata:GetPageData", this);
>-    addMessageListener("PageMetadata:GetMicrodata", this);
>   },
>   receiveMessage(message) {
>     switch(message.name) {
>       case "PageMetadata:GetPageData": {
>-        let result = PageMetadata.getData(content.document);
>+        let target = message.objects.target;
>+        let result = PageMetadata.getData(content.document, target);

Hm, is message.objects guaranteed to be not null?


>@@ -35,20 +36,21 @@ this.PageMetadata = {
>    * - title
>    * - Metadata specified in <meta> tags, including OpenGraph data
>    * - Links specified in <link> tags (short, canonical, preview images, alternative)
>    * - Content that can be found in the page content that we consider useful metadata
>    * - Microdata, as defined by the spec:
>    *   http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata-2.html
>    *
>    * @param {Document} document - Document to extract data from.
>+   * @param {Element} [target] - Optional element to restrict microdata lookup to.
>    * @returns {Object} Object containing the various metadata, normalized to
>    *                   merge some common alternative names for metadata.
>    */
>-  getData(document) {
>+  getData(document, target = null) {

why nullifying that?

>@@ -63,21 +65,26 @@ this.PageMetadata = {
>           shentry.value && shentry.value.URIWasModified) {
>         return result;
>       }
>     }
> 
>     this._getMetaData(document, result);
>     this._getLinkData(document, result);
>     this._getPageData(document, result);
>-    result.microdata = this.getMicrodata(document);
>+    result.microdata = this.getMicrodata(document, target);
>+    result.microformats = this.getMicroformats(document, target);
>

Those are sync operations - can they block UI?
Attachment #8667999 - Flags: review+
Comment on attachment 8671593 [details] [diff] [review]
microformats-tests

These are tests for the shiv. Are there tests for the priv js code that you've changed in these patches?
Blocks: 543630
(In reply to Dietrich Ayala (:dietrich) from comment #37)
> Comment on attachment 8671593 [details] [diff] [review]
> microformats-tests
> 
> These are tests for the shiv. Are there tests for the priv js code that
> you've changed in these patches?

To clarify: Are there pre-existing tests for the chrome microformat code in Firefox to make sure that 1) it works as expected and 2) it doesn't get regressed by other future changes in Firefox?

If not, this is the time to write them. Should be trivial to do so - mozilla-central contains many examples of writing tests that check that chrome JS handles web content in the expected way.
Flags: needinfo?(mixedpuppy)
(In reply to Dietrich Ayala (:dietrich) from comment #37)
> Comment on attachment 8671593 [details] [diff] [review]
> microformats-tests
> 
> These are tests for the shiv. Are there tests for the priv js code that
> you've changed in these patches?

There are pagedata tests in the socialapi stuff
(In reply to Shane Caraveo (:mixedpuppy) from comment #39)
> (In reply to Dietrich Ayala (:dietrich) from comment #37)
> > These are tests for the shiv. Are there tests for the priv js code that
> > you've changed in these patches?
> 
> There are pagedata tests in the socialapi stuff

Do those run per-push, and do they exercise this code? And how thoroughly?
(In reply to Dietrich Ayala (:dietrich) from comment #40)
> (In reply to Shane Caraveo (:mixedpuppy) from comment #39)
> > (In reply to Dietrich Ayala (:dietrich) from comment #37)
> > > These are tests for the shiv. Are there tests for the priv js code that
> > > you've changed in these patches?
> > 
> > There are pagedata tests in the socialapi stuff
> 
> Do those run per-push, and do they exercise this code? And how thoroughly?

They are mochitests, and do excercise this code pretty well, but not specifically the microformats part.  I can easily add a simple test for that.

On the comments on the other patch, I can leave GetMicroData in, it's no big deal.  Blocking UI may be an issue with the call to getMicroformats.
Flags: needinfo?(mixedpuppy)
Shane, Dietrich, we're all here at Mozlando this week - let's set aside an hour to resolve any outstanding questions and land this!


Regarding:

> but not specifically the microformats part.  I can easily add a simple test for that.

Shane, do you think you can take care of this in the next day?


> On the comments on the other patch, I can leave GetMicroData in, it's no big deal.

I'd rather drop if there's no good reason to keep. Let's discuss in person to understand why/why not.


> Blocking UI may be an issue with the call to getMicroformats.

Can we resolve this "may" with an actual or not? Blocking UI is a pretty bad thing in any situation, and especially bad as anything to be introduce. I'd like to understand *how* "Blocking UI may be an issue", and preferably falsify that assertion so that we can respond to any questions in the future with a "no" and "here's why".
Perf code-review IRL at Orlando: Nothing blocks page load by default. Only action occurs upon user interaction with Share. There's some cost on microformat detection, but is in response to that user action only. As long as tryserver run is clean, r=me for landing.
TODO:

1. add microformats test to PageMetaData module tests.
2. pull out microdata support (for bug 909633)

note: microformats will block UI if called, this only happens when a user specifically takes action to share.  [performance data to follow]

Performance results

Averaged over 3 page renders. Laptop used was a MacBook Pro (Retina, Mid 2014) and phone used was a HTC M8. Used latest browsers 2015-07-21

Microformats.count()
site                        Firefox Fennec  Chrome  Chrome for Andriod  Opera  Safari
bbc.co.uk (no micorformats)     4ms   33ms     3ms                45ms    4ms     2ms
glennjones.net/notes            5ms   60ms     3ms                28ms    2ms     3ms
tantek.com                      5ms   64ms     7ms                52ms    4ms     3ms

Microformats.getParent()
site                        Firefox Fennec  Chrome  Chrome for Andriod  Opera  Safari
bbc.co.uk (no micorformats)     2ms    7ms     1ms                 5ms    1ms     1ms
glennjones.net/notes           18ms  192ms    14ms               101ms   10ms     7ms
tantek.com                     16ms  237ms    12ms               116ms   10ms     9ms

Microformats.get()
site                        Firefox Fennec  Chrome  Chrome for Andriod  Opera  Safari
bbc.co.uk (no micorformats)     5ms   61ms     6ms                42ms*   4ms     2ms
glennjones.net/notes           91ms  790ms    68ms               503ms   76ms    46ms
tantek.com                    163ms 1481ms   116ms               980ms  120ms    59ms
* Random results in 170ms range seen
note: socialapi share is the only current user of microformats, performance notes for desktop are relevant there, fennec will only be relevant if microformats is used there in the future.
Sorry I missed the remoting into code-review at Orlando. Was offline for a couple of days and did not get Tanteks invite at the time. Its great to see this moving again.
Attached patch microformats-share (obsolete) — Splinter Review
update the share patch replacing microdata with microformats (updates tests as well).
Attachment #8667999 - Attachment is obsolete: true
Attachment #8700861 - Flags: review+
Glenn,
The shiv is used from a content script where we do not have the URL class.  We need to address that in modules.url.resolve somehow.  Sorry, this only showed up once I revamped all the chrome side tests.
Flags: needinfo?(glennjonesnet)
Shane, I think we have a couple of options. The first is to back track and reinstate Gary Court uri-js http://github.com/garycourt/uri-js which works fine, but has a custom license which is one of the reasons we took it out. 

The second is to make use of the DOMParser object create a very small html doc with a <base> and <a> tag. The resulting tiny DOM object will allows to get a the resolved URL. On one hand this is great as it uses the browsers internal URL parsing, but could be slower if there are lots a URLs to revolve. The code for resolving URLs is only called for URL data in a discovered microformat and then only if the URL does not start with a protocol schema i.e. http://

In terms a getting this out I would go with DOMParser options, will not legal review. Happy to knock up the code
Flags: needinfo?(glennjonesnet)
Lets go with the domparser.  Could it be cached and only replaced if the baseurl changes?  Maybe also do the check for typeof(URL) and use it if it is available?
OK, just about to go out for evening will have a look at in the morning. Caching it so we only create a single instance pre a parse is a good idea. I known from past work you can change the <base> in an existing and it updates a absolute paths you can get from <a> tag.  

The use of typeof(URL) is a bit more of an issue, I wrote about while testing shiv. Short answer is IE team only built one of the two URL object spec so you cannot use testing without breaking the whole script in IE 
In depth - http://glennjones.net/articles/the-problem-with-window-url
Ok, lets just do what will work best.
OK, microformats shiv is now updated to use the DOMParser method when the URL API is not available. I have tested the performance which does not seem to be an issue at all.

Using DOMParser Method – With cached DOM object and links to nodes:

Desktop - Avg time to resolve URL 0.014 milliseconds for 400k calls
Android - Avg time to resolve URL 0.080 milliseconds for 400k calls  (HTC OneX Android 4)

Shane if you use the upgrade script it should pull the new version.

Any problems let me know
Attached patch microformats v2 (obsolete) — Splinter Review
consolidated microformats v2 patch.  carry forward r+ from tantek.  addresses jsm specific issues with URL and DOMParser.
Assignee: nobody → mixedpuppy
Attachment #8671593 - Attachment is obsolete: true
Attachment #8671594 - Attachment is obsolete: true
Attachment #8704757 - Flags: review+
removes microdata, adds microformats, updates tests for microformats.  carry forward r+ from dietrich
Attachment #8700861 - Attachment is obsolete: true
Attachment #8704760 - Flags: review+
Blocks: 1237480
Small update...

try tests are failing (win/lin) because of some difference in test packaging.  On osx the entire toolkit/components/microformats directory is packaged, allowing the tests to include the shiv.  On win/lin only toolkit/components/microformats/test is packaged, and all tests fail because the shiv is unavailable.  Will continue to track down a fix for this.
Attached patch microformats v2Splinter Review
The fix was to move the manifest.ini file into the microformats directory.  Looks like we're passing now.  carry forward the r+
Attachment #8704757 - Attachment is obsolete: true
Attachment #8707087 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/ed69cf9c7396
https://hg.mozilla.org/mozilla-central/rev/96f4549cab32
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 47
Blocks: 923806
Flags: needinfo?(mixedpuppy)
Depends on: 1244647
:mixedpuppy, the fix for this bug did not remove the old microformats parser AFAIK, should we create another bug to remove the old one? does it have any remaining dependencies?
:mkaply - any objections?
Tantek, removal would be blocked by bug 1237480.  We should do a new bug for removal.
Product: Firefox → Firefox Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: