Closed Bug 497543 Opened 15 years ago Closed 12 years ago

Provide a thumbnail service

Categories

(Toolkit :: Places, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla12

People

(Reporter: mak, Assigned: ttaubert)

References

(Depends on 7 open bugs, Blocks 1 open bug)

Details

(Keywords: dev-doc-needed)

Attachments

(8 files, 16 obsolete files)

22.26 KB, image/png
Details
93.02 KB, image/png
Details
6.54 KB, patch
dietrich
: review+
Details | Diff | Splinter Review
15.23 KB, patch
Details | Diff | Splinter Review
9.99 KB, patch
Details | Diff | Splinter Review
5.27 KB, patch
Details | Diff | Splinter Review
2.45 KB, patch
dietrich
: review+
Details | Diff | Splinter Review
1023 bytes, patch
dietrich
: review+
Details | Diff | Splinter Review
We want to show bookmarks/history thumbnails in some future view, and many extensions are implementing their own thumbnail service.
We could provide our own for them.

Some requests:
- separate db
- the more async as possible

Some open question:
- Which items should have a thumbnail? bookmarks, history, both, most visited?
- full or cropped thumbnails, which size?
- normal resize or liquid resize (slow, should be in a separate thread)?
- are there privacy concerns?
Blocks: 387750
One more question is the location of this db on the filesystem. From the past I know that people don't wanna have it located inside their profile. Could we use the profiles cache folder as storage location?
iirc dietrich tried that and said was harder to do. what's the point of users not wanting that in the profile folder? size? privacy? maybe we can find a different solution.
As far as I can remember from older bug reports (I cannot find those at the moment) users don't wanted to have them due to disk space limits for remote profiles. We have a couple of big files already in the profile which makes it harder to work with.
we could provide prefs to disable or limit the service for those, does not appear a so common use case.
Attached file service module (obsolete) —
Attached file moz-thumb protocol handler (obsolete) —
attached is a simple prototype of a standalone thumbnail service. it's a js module that inits when Cu.import()ed, watches for page loads, and stores thumbnails in the disk cache.

very easy to use: thumbnails can be accessed in content using moz-thumb://<your url>.

disk cache is fast and simple, but is lost in the event of unclean shutdowns, so not persistent enough, would need to move to more permanent storage.
Summary: Provide a Places thumbnails service → Provide a thumbnail service
(In reply to comment #4)
> we could provide prefs to disable or limit the service for those, does not
> appear a so common use case.

on some mobile devices the profile folder is on a space-limited mount. Other mounts have much more space. Currently, Fennec tries to push as much "big ticket" items to other mounts (like the page cache) using prefs that allow us to tell the services where to store the data.
i think i could take this as next target after expiration starting from dietrich's first exploration. I did some experience with async streams yesterday playing with aeropeek icon stuff, that will be useful.

Maybe we should not implement a new protocol handler but use moz-anno:thumb: like we did for moz-anno:favicon:
But at the same time that would complicate the above in a maybe unneeded way and would require this service to provide an xpcom interface, need to look at that code.

Another question is if this should just cache page thumbnails when they are available in browser, or load urls in background. the former has clearly advantages since the page is already shown in browser, but can't provide thumbs for not yet visited pages/bookmarks. Maybe an hybrid of the 2 approaches?
Loading pages in a separate window will cause additional http requests to be fired (take in count security, XHR!, performances issues coming from duplicating a page full of heavy flash content, etc). Favicons are loaded by the browser and stored by faviconSvc, maybe browser could do the same for thumbnails.

What to store? We can't just store all bookmarks and history clearly. Store items on thumb request? if we go for the async http request we can do that. Otherwise we could only store on visit and that could give a bad experience (lot of empties thumbs, especially after a bookmarks restore or a cpd).

Where to store? Mark comment tells us we should not use profile folder, or better, we can use it but we have to provide a pref to move the datastore. At that point probably the best option is an sqlite db that will self contain thumbs rather than a folder (also for "privacy feeling"). But being items frequently added and removed that could bring to fragmentation (vacuuming a large db can hurt even if we can do that async). Need to compare size and speed of a folder vs a db. 

When to expire? thumbs will need to be refreshed, but not too often, still we will probably have to save last access date, and expire thumbs not used from more than x days. Clearly we also need to observe page changes.

Would make sense to unify this to binary annotations? It is pretty clear that places.sqlite is not a good place for binary annotations, it grows too much making everything damn slow. Since we plan to move out binary annotations and this looks like a binary annotation, could make sense to do that now?

enough questions for now...
Assignee: nobody → ttaubert
Status: NEW → ASSIGNED
Blocks: 455553
Attached patch WIP v1 (obsolete) — Splinter Review
So... that's quite a big patch. It introduces a new protocol handler for the "moz-thumb" protocol. You can just add an image with the following URL:

moz-thumb://default?url=http%3A%2F%2Ftwitter.com%2F&width=201&height=127

That URL delivers an image for http://twitter.com/ with a width of 201px and a height of 127px. If the thumbnail is not contained in the cache it creates a hidden frame in the background, loads the URL, takes a screenshot and writes the thumbnail to the cache.

"default" denotes the algorithm that should be used to create the thumbnail. That's a placeholder for the moment and we should provide an easy hook to register new algorithms later - e.g. something that tries to detect the most interesting part of the web page or something that excludes ads and the like.

Looking forward to your feedback! :) PS: If you want to try it out you can just use the latest UX build. I pushed it together with the new "New Tab" page.
Attachment #383545 - Attachment is obsolete: true
Attachment #383546 - Attachment is obsolete: true
Attachment #570674 - Flags: feedback?(mak77)
Attached patch WIP v2 (obsolete) — Splinter Review
Fixes and some refactoring.
Attachment #570674 - Attachment is obsolete: true
Attachment #570971 - Flags: feedback?(mak77)
Attachment #570674 - Flags: feedback?(mak77)
(In reply to Tim Taubert [:ttaubert] from comment #11)
> Created attachment 570674 [details] [diff] [review] [diff] [details] [review]

> That URL delivers an image for http://twitter.com/ with a width of 201px and
> a height of 127px. If the thumbnail is not contained in the cache it creates
> a hidden frame in the background, loads the URL, takes a screenshot and
> writes the thumbnail to the cache.

This is something to think about, since means adding a moz-thumb image to any content may visit the page in background, and this hurts users privacy. Suppose I am connected to some public wifi and a new tab page or a bookmarks view shows these thumbnails, my own navigation details are going to be disclosed to the network, in form of these background visits.
As for the favicons protocol, it's likely we may need a version that fetches from network and caches and a version that returns an image only if one is already cached. The hard part is to figure when the user may be fine to allow background visits and when not. For sure in PB mode we should not fetch, nor cache anything.

> Looking forward to your feedback! :) PS: If you want to try it out you can
> just use the latest UX build. I pushed it together with the new "New Tab"
> page.

OK, thanks.
(In reply to Marco Bonardo [:mak] from comment #13)
> (In reply to Tim Taubert [:ttaubert] from comment #11)
> > Created attachment 570674 [details] [diff] [review] [diff] [details] [review] [diff] [details] [review]
> 
> > That URL delivers an image for http://twitter.com/ with a width of 201px and
> > a height of 127px. If the thumbnail is not contained in the cache it creates
> > a hidden frame in the background, loads the URL, takes a screenshot and
> > writes the thumbnail to the cache.
> 
> This is something to think about, since means adding a moz-thumb image to
> any content may visit the page in background, and this hurts users privacy.
> Suppose I am connected to some public wifi and a new tab page or a bookmarks
> view shows these thumbnails, my own navigation details are going to be
> disclosed to the network, in form of these background visits.

Completely different issue: I've just seen a user report that sound was playing because of the background load. Seems like this approach is going to open a can of worms. I think we should just display placeholder image for uncached thumbnails. For the new tab page at least, this shouldn't happen too often after a warmup phase.
Agree and related to that, if the page does some heavy computation or has plug-ins like Flash, this will likely hang the UI.

Related issues:
- if we use the default disk cache expiration threshold of a week, a bunch of pages will lack a thumbnail.
- If we only show page that have been visited, we expose a privacy concern when the user clears history but some page has thumbnails (thus has been visited), some has not.

So we may need a different expiration handling, one possibility may be to use a separate cache folder and an history observer (onDeleteURI, onDeleteVisits, onClearHistory), on the other side this may create a pretty large cache folder if we just cope with history expiration since these images are "big" in size, so we may also expire thumbnails that have not been requested in the last X days.
I discover that when thumbnail service are doing caching, if the website got any type of player, it will come out some mystery sound and cant stop it until I close firefox, if I open the new tab page and it will play it again if the website which have player are doing caching.
Comment on attachment 570971 [details] [diff] [review]
WIP v2

Review of attachment 570971 [details] [diff] [review]:
-----------------------------------------------------------------

I think I'm not going deep into a code review, since the above issues likely need an overhaul of thumbnails caching and approach, so would be useless nitpicking.
I think you can probably better split the code in a PageThumbs.jsm module and have pageThumbsProtocol.js just take care of the network and protocol stuff.

::: browser/components/thumbnails/ThumbnailProtocol.js
@@ +42,5 @@
> + * Overview
> + * This service provides a protocol implementation that allows the simple
> + * creation and caching of thumbnails for URLs.
> + */
> +#endif

nit: ifdef 0 for the license seems uncommon in browser, I can just find 3 instances of it.

@@ +51,5 @@
> +const Cc = Components.classes;
> +const Cr = Components.results;
> +const Ci = Components.interfaces;
> +
> +// define the scheme to use - "moz-thumb://..."

unless it's intended to generate a thumb for any kind of data you pass into, I suppose it would be better to restrict the naming here, maybe something like "moz-page-thumb"

@@ +72,5 @@
> +XPCOMUtils.defineLazyGetter(this, "gNetUtil", function () {
> +  var obj = {};
> +  Cu.import("resource://gre/modules/NetUtil.jsm", obj);
> +  return obj.NetUtil;
> +});

can't this use lazyModuleGetter?

::: browser/components/thumbnails/ThumbnailUtils.jsm
@@ +68,5 @@
> +    if (aOptions && aOptions.refresh)
> +      url += "&refresh=1";
> +
> +    return Services.io.newURI(url, "UTF8", null);
> +  }

I wonder if, rather than a full module for a single methods, wouldn't be simpler to add it to utilityOverlay, or make a global browserUtils.jsm module for browser (even if it may end up becoming a too easy throw-garbage-into-me target).
Otherwise, I'd probably move most of the code you have in ThumbnailProtocol.js in a more generic PageThumbnails.jsm module that would also include this method. There is no reason all of that code is part of the really simple protocol implementation.
Attachment #570971 - Flags: feedback?(mak77)
Bah, my understanding was that we were going to take this approach:

1. first check cache
2. check open tabs
3. load in background

I apologize if I didn't communicate that very well :(

IIRC there's already code for #2 in the ctrl+tab feature. Maybe could add some lines to add it to the cache in our moz-thumb way and separate that code out to ensure it always runs for page loads.

#3 is not do-able at all if we can't disable plug-ins. I'm not worried about the thumbnail not looking correct without a plug-in loaded - not a reason to block this.

Cache: default expiration should be fine since we'll be refreshing the thumbnail whenever the page loads (via #2)

Cache being blown: yeah this'll be sub optimal until we get more advanced cache features like pinning.

Privacy: If the pages being loaded in the background are visited already, there's not such a big concern IMO. If they're not visited, then we should load them. Regardless, we need to schedule a privacy review asap, instead of ruminating on it here and then having to change it later.
I have attached a screenshot of a bug where a local address, http://ringring/clapclap, loads the thumbnail for a site that I have never visited before, http://ringring.com

Another bug I find with this patch is that sites that require credentials causes the Master Password dialog to appear when I open a new tab. The thumbnail for the site doesn't load until I cancel the Master Password dialog.

Based on these two issues, I don't think we should be going over the network to get thumbnails, and show a placeholder thumbnail if we don't have one in our cache.
(In reply to Marco Bonardo [:mak] from comment #15)
> - If we only show page that have been visited, we expose a privacy concern
> when the user clears history but some page has thumbnails (thus has been
> visited), some has not.

If the user clears their history, what sites would we show on the New Tab page anyways? If we had thumbnails in our cache at the time of history cleaning, shouldn't we also just clear the thumbnails?
(In reply to Jared Wein [:jwein and :jaws] from comment #20)
> If the user clears their history, what sites would we show on the New Tab
> page anyways? If we had thumbnails in our cache at the time of history
> cleaning, shouldn't we also just clear the thumbnails?

depends on the implementation, it's not automatic, surely it's feasible.
Attached patch Part 1 - JavaScript Module (obsolete) — Splinter Review
I rewrote the whole patch. It does now consist of a JSM, a protocol/channel implementation and a part integrated into the browser.

The browser part takes care of capturing screenshots of web pages when specific events occur (TabSelect, SSTabRestored, load). The JSM creates a thumbnail and writes it to the file cache. Those cached images are now accessible via the protocol (moz-page-thumb://) and can just be included as the src of an <img> tag.

This is actually the same approach that browser-tabPreviews.js (ctrl-tab feature) takes but caches the thumbnails and makes them easily accessible.

With some additions this is should become the thumbnail service for Panorama and Ctrl-Tab as well. The only thing missing here is that these components should be able to check if there's a stored thumbnail for live tabs and if not, request one immediately. This should be done in a follow-up bug.

To be clear: This new patch does *not* capture thumbnails in the background (like the previous did). Thumbnails are being captured while browsing these pages. So users will typically start with a page of empty thumbnails that will probably be filled quickly because those are the most-visited/favorite pages.
Attachment #570971 - Attachment is obsolete: true
Attachment #582789 - Flags: review?(dietrich)
Attached patch Part 3 - Browser integration (obsolete) — Splinter Review
Attachment #582792 - Flags: review?(dao)
First of all, I am having problems with the fact that I have 2 tabs opened and still when I open the New Tab Page, it shows the thumbnails of those two tabs as blank. n order to get their thumbnails up, I have to open those tabs from the new tab page itself and wait for 5 to 10 seconds, and then when I open the new tab page again, the thumbnail appears.
Secondly , if a thumbnail is not present , then It should be visually clear. I don't know if its this bug's problem or New Tab Feature's Bug 455553 . But the blank thumbnail and the background of the page make it hard to differentiate the thumbnail from the background . May be provide a slight border or so ? or change some colors.
(In reply to scrapmachines from comment #25)
> First of all, I am having problems with the fact that I have 2 tabs opened
> and still when I open the New Tab Page, it shows the thumbnails of those two
> tabs as blank. n order to get their thumbnails up, I have to open those tabs
> from the new tab page itself and wait for 5 to 10 seconds, and then when I
> open the new tab page again, the thumbnail appears.

Probably the tabs' URLs don't match those on the New Tab Page. Overall that's not a big problem but rather a first-use experience... these are your most-visited tabs so you're gonna have thumbnails for them pretty quickly I guess. There is a bug that says we should exclude app tabs because they really don't make sense here.

> Secondly , if a thumbnail is not present , then It should be visually clear.
> I don't know if its this bug's problem or New Tab Feature's Bug 455553 . But
> the blank thumbnail and the background of the page make it hard to
> differentiate the thumbnail from the background . May be provide a slight
> border or so ? or change some colors.

Yeah, that's not exactly this bug's problem. We're working in refining the overall design of the grid so this would be solved in another bug.
(In reply to Tim Taubert [:ttaubert] from comment #27)
> (In reply to scrapmachines from comment #25)
> > First of all, I am having problems with the fact that I have 2 tabs opened
> > and still when I open the New Tab Page, it shows the thumbnails of those two
> > tabs as blank. n order to get their thumbnails up, I have to open those tabs
> > from the new tab page itself and wait for 5 to 10 seconds, and then when I
> > open the new tab page again, the thumbnail appears.
> 
> Probably the tabs' URLs don't match those on the New Tab Page. Overall
> that's not a big problem but rather a first-use experience... these are your
> most-visited tabs so you're gonna have thumbnails for them pretty quickly I
> guess. There is a bug that says we should exclude app tabs because they
> really don't make sense here.

Both the tabs' URLs are exactly same.
I tested with 3 app tabs and 3 normal tabs. I cleared the cache to remove all thumbnails and restarted. After all tabs were restored the thumbnails were visible on the 'New Tab Page'. I have no idea why this shouldn't be working for you...
Comment on attachment 582791 [details] [diff] [review]
Part 2 - moz-page-thumb:// Protocol and Channel

Review of attachment 582791 [details] [diff] [review]:
-----------------------------------------------------------------

Some questions still

::: browser/components/thumbnails/PageThumbsProtocol.js
@@ +93,5 @@
> +   * Constructs a new channel from the given URI for this protocol handler.
> +   * @param aUri The URI for which to construct a channel.
> +   * @return The newly created channel.
> +   */
> +  newChannel: function Proto_newChannel(aUri) {

global-nit: uppercase aURI wherever it's an nsIURI

@@ +94,5 @@
> +   * @param aUri The URI for which to construct a channel.
> +   * @return The newly created channel.
> +   */
> +  newChannel: function Proto_newChannel(aUri) {
> +    let {url} = parseUri(aUri);

So we only use the url part for now?
do we ignore passing into nonexistent algorithm? what about sizes? (it's fine if those are follow-ups, just to understand how we plan to handle errors)

@@ +97,5 @@
> +  newChannel: function Proto_newChannel(aUri) {
> +    let {url} = parseUri(aUri);
> +
> +    // Prevent recursion if a thumbnail for a thumbnail is requested.
> +    if (0 == url.indexOf(PageThumbs.scheme + "://"))

I suspect a substr would be more efficient. In the non-matching case indexOf should walk all of the string.

@@ +98,5 @@
> +    let {url} = parseUri(aUri);
> +
> +    // Prevent recursion if a thumbnail for a thumbnail is requested.
> +    if (0 == url.indexOf(PageThumbs.scheme + "://"))
> +      aUri = Services.io.newURI(url, "UTF8", null);

NewUtil.newURI would allow you to avoid the last optional argument.

Though, this is sort of a weak check (I may ask for thumb of the thumb of the thumb), so I'm not sure how much is really useful, probably we don't even do this for common protocols like http.
Unless it's a common case you hit, I'd probably just return a 404

@@ +132,5 @@
> +Channel.prototype = {
> +  /** 
> +   * Tracks if the channel has been opened, yet.
> +   */
> +  _wasOpened: false,

could you just use _isPending?

@@ +152,5 @@
> +
> +    this._isPending = true;
> +    this._wasOpened = true;
> +
> +    let self = this;

you can use bind(this) on the function

@@ +189,5 @@
> +      aCallback();
> +      return;
> +    }
> +
> +    let self = this;

ditto on bind(this)

@@ +200,5 @@
> +      if (!inputStream || !inputStream.available()) {
> +        if (aEntry)
> +          aEntry.close();
> +
> +        aCallback();

OT: I don't see the point of skipping braces around ifs if then we add newlines to improve readability, at that point would be simpler to brace according to the code style guide

@@ +208,5 @@
> +      // Read the cache entry's data.
> +      NetUtil.asyncFetch(inputStream, function (aData, aStatus) {
> +        // We might have been canceled while waiting.
> +        if (self.canceled)
> +          return;

is it ok to not notify the callback with empty data if we were canceled?

@@ +212,5 @@
> +          return;
> +
> +        // Check if we have a valid data stream.
> +        if (!Components.isSuccessCode(aStatus) || !aData.available())
> +          aDataStream = null;

what is aDataStrem? it's not defined anywhere, maybe you meant aData = null?

@@ +217,5 @@
> +
> +        aEntry.close();
> +        aCallback(aData);
> +      });
> +    });

I suppose in some really rare case asyncFetch may throw, should we catch that and invoke the callback in such a case?

@@ +314,5 @@
> +    // Synchronous data delivery is not implemented.
> +    throw Cr.NS_ERROR_NOT_IMPLEMENTED;
> +  },
> +
> +  /* :::::::: nsIHttpChannel ::::::::::::::: */

you don't seem to respect the idl

@@ +384,5 @@
> +   * @param aValue The response header value to set.
> +   */
> +  setResponseHeader: function Channel_setResponseHeader(aHeader, aValue, aMerge) {
> +    if (aMerge)
> +      throw Cr.NS_ERROR_NOT_IMPLEMENTED;

I think that per idl you should just ignore this, not throw ("This flag is ignored if the specified header does not support merging").

@@ +386,5 @@
> +  setResponseHeader: function Channel_setResponseHeader(aHeader, aValue, aMerge) {
> +    if (aMerge)
> +      throw Cr.NS_ERROR_NOT_IMPLEMENTED;
> +
> +    this._responseHeaders[aHeader.toLowerCase()] = aValue;

note that, per idl, if aValue is empty you should remove the header from the array, not set it to an empty value.

@@ +454,5 @@
> + * Parses a given uri and extracts all parameters relevant to this protocol.
> + * @param aUri The URI to parse.
> + * @return The parsed parameters.
> + */
> +function parseUri(aUri) {

Uppercase URI helps clarifying this takes a nsIURI

@@ +456,5 @@
> + * @return The parsed parameters.
> + */
> +function parseUri(aUri) {
> +  let params = {};
> +  let [host, query] = aUri.spec.split("?");

what's "host"? what's used for? It's not a host in the strict term, so this looks confusing. It's probably what we may call a "path".

I think you should add documentation at the top of this protocol, explaining its parts and their usage.

@@ +463,5 @@
> +    let [key, value] = param.split("=").map(decodeURIComponent);
> +    params[key.toLowerCase()] = value;
> +  });
> +
> +  params.algorithm = host.replace(/^.*\/+/, "").replace(/\/+.*$/, "");

I'm having an hard time understanding what exotic case this is handling (especially the second replace, "test1/test2/test3///" is solved to "") and why we should handle it.
A comment explaining when this may happen, would have helped.
Attachment #582791 - Flags: review?(mak77)
Comment on attachment 582789 [details] [diff] [review]
Part 1 - JavaScript Module

>--- /dev/null
>+++ b/browser/components/thumbnails/PageThumbs.jsm

Please put standalone modules in browser/modules/.
Comment on attachment 582792 [details] [diff] [review]
Part 3 - Browser integration

>+  _determineTabForLoadEvent:
>+    function Thumbnails_determineTabForLoadEvent(aEvent) {

Please organize the code such that this can return a browser. Getting the browser is cheaper than getting the tab.

>+    let doc = aEvent.originalTarget;
>+    if (doc instanceof HTMLDocument) {

What is this check for?

>+      let win = doc.defaultView;
>+      while (win.frameElement)
>+        win = win.frameElement.ownerDocument.defaultView;
>+
>+      return gBrowser._getTabForContentWindow(win);
>+    }
>+  }

Let this method always explicitly return a value.

Looks pretty good overall.
Attachment #582792 - Flags: review?(dao) → review-
Depends on: 707313
Attached patch Part 1 - JavaScript Module (obsolete) — Splinter Review
(In reply to Dão Gottwald [:dao] from comment #31)
> Please put standalone modules in browser/modules/.

Done.
Attachment #582789 - Attachment is obsolete: true
Attachment #583904 - Flags: review?(dietrich)
Attachment #582789 - Flags: review?(dietrich)
Attached patch Part 1 - JavaScript Module (obsolete) — Splinter Review
Replaced all tab arguments so that we're now working with browsers.
Attachment #583904 - Attachment is obsolete: true
Attachment #583961 - Flags: review?(dietrich)
Attachment #583904 - Flags: review?(dietrich)
Attached patch Part 3 - Browser integration (obsolete) — Splinter Review
(In reply to Dão Gottwald [:dao] from comment #32)
> Please organize the code such that this can return a browser. Getting the
> browser is cheaper than getting the tab.

Done.
Attachment #582792 - Attachment is obsolete: true
Attachment #583962 - Flags: review?(dao)
Comment on attachment 583962 [details] [diff] [review]
Part 3 - Browser integration

>+XPCOMUtils.defineLazyModuleGetter(this, "PageThumbs",
>+  "resource:///modules/PageThumbs.jsm");
>+
>+/**
>+ * A singleton that takes care of keeping thumbnails of web pages up-to-date
>+ * automatically.
>+ */
>+let Thumbnails = {

Thumbnails vs. PageThumbs is pretty subtle. Any chance to get a more descriptive name for either or both of these objects? You're in the global browser window scope here; you share this with lots of other browser and extension code, so try to be as deliberate as possible with what you define and expose.
(In reply to Dão Gottwald [:dao] from comment #36)
> Thumbnails vs. PageThumbs is pretty subtle. Any chance to get a more
> descriptive name for either or both of these objects? You're in the global
> browser window scope here; you share this with lots of other browser and
> extension code, so try to be as deliberate as possible with what you define
> and expose.

+1, would be better to be explicit as possible here.
Comment on attachment 583961 [details] [diff] [review]
Part 1 - JavaScript Module

Review of attachment 583961 [details] [diff] [review]:
-----------------------------------------------------------------

would prefer s/thumbs/thumbnails/g in all usages.

f+ for the first pass. needs some api and naming improvements, and tests.

::: browser/modules/PageThumbs.jsm
@@ +52,5 @@
> +
> +/**
> + * The default height for page thumbnails.
> + */
> +const THUMBNAIL_HEIGHT = 127;

should explain why these these are the default values.

@@ +71,5 @@
> +XPCOMUtils.defineLazyServiceGetter(this, "gAppShellService",
> +  "@mozilla.org/appshell/appShellService;1", "nsIAppShellService");
> +
> +XPCOMUtils.defineLazyServiceGetter(this, "gCacheService",
> +  "@mozilla.org/network/cache-service;1", "nsICacheService");

can you instead add these to Services.jsm?

@@ +95,5 @@
> +   * @param aWidth The thumbnail's width.
> +   * @param aHeight The thumbnail's height.
> +   * @return The thumbnail image's url.
> +   */
> +  getThumbnailForUrl:

maybe getThumbnailURLForURL, since it's not actually returning the thumbnail itself. kind of a mouthful though. maybe look at the favicon service for how we named similar apis there? maybe even getThumbnailURL would be ok, since the whole thing is url-centric.

@@ +106,5 @@
> +  },
> +
> +  /**
> +   * Captures a thumbnail for the given browser.
> +   * @param aBrowser The browser to capture a thumbnail from.

should add detail about exactly what this means. eg, captures and does what with it?

should say explicitly what a "browser" is.

why passing browser instead of the content window?

do we want a callback for when capture and storage is complete?

@@ +122,5 @@
> +    // Draw the window contents to the canvas.
> +    ctx.drawWindow(contentWindow, sx, sy, sw, sh, THUMBNAIL_BG_COLOR,
> +                   ctx.DRAWWINDOW_DO_NOT_FLUSH);
> +
> +    // Store the capture thumbnail in the file cache.

nit: genericize the comment - might be some other backing store at some point (which is why you abstracted that bit into a separate private method in the first place, i assume?)

@@ +133,5 @@
> +   * @param aBrowser The browser whose content is depicted in the thumbnail.
> +   * @param aCanvas The canvas containing the thumbnail's image data.
> +   */
> +  _store: function PageThumbs_store(aBrowser, aCanvas) {
> +    let key = aBrowser.currentURI.spec;

can you ensure that we're using the same source for browsing history, favicon key, which URI gets bookmarked, etc? the fact that the URL used as the cache key is defined separately from the caller asking to save the thumbnail makes me a little nervous.

@@ +142,5 @@
> +      if (!aEntry)
> +        return;
> +
> +      // Extract image data from the canvas.
> +      self._readImageData(aCanvas, function (aData) {

is there a way to read the canvas data as a stream, so that we can copy from input stream direct to output stream, instead of reading all into memory at once?

@@ +159,5 @@
> +
> +  /**
> +   * Reads the image data from a given canvas and passes it to the callback.
> +   * @param aCanvas The canvas to read the image data from.
> +   * @param aCallback The callbak that the data is passed.

nit: typo

also, would be more explicit as "The function that the image data is passed to."

@@ +261,5 @@
> +    return inputStream && inputStream.available();
> +  },
> +
> +  /**
> +   * Opens the cache entry identified by the given keys

keys?

@@ +263,5 @@
> +
> +  /**
> +   * Opens the cache entry identified by the given keys
> +   * @param aKey The key identifying the desired cache entry.
> +   * @param aAccess The desired access mode.

define what modes are available, or point at where they're defined

@@ +264,5 @@
> +  /**
> +   * Opens the cache entry identified by the given keys
> +   * @param aKey The key identifying the desired cache entry.
> +   * @param aAccess The desired access mode.
> +   * @param aCallback The callback to be called when the cache entry was opened.

"function to be called" would be more explicit

@@ +301,5 @@
> +
> +/**
> + * Define a lazy getter for the cache session.
> + */
> +XPCOMUtils.defineLazyGetter(PageThumbsCache, "_session", function () {

_cacheSession would be more explicit.
Attachment #583961 - Flags: review?(dietrich) → feedback+
Attached patch Part 3 - Browser integration (obsolete) — Splinter Review
(In reply to Dão Gottwald [:dao] from comment #36)
> Thumbnails vs. PageThumbs is pretty subtle. Any chance to get a more
> descriptive name for either or both of these objects? You're in the global
> browser window scope here; you share this with lots of other browser and
> extension code, so try to be as deliberate as possible with what you define
> and expose.

I renamed Thumbnails to gBrowserThumbnails and turned PageThumbs from a global into a "private property" (of gBrowserThumbnails).
Attachment #583962 - Attachment is obsolete: true
Attachment #585574 - Flags: review?(dao)
Attachment #583962 - Flags: review?(dao)
Attached patch Part 3 - Browser integration (obsolete) — Splinter Review
Forgot to rename Thumbnails to gBrowserThumbnails in browser.js (delayed startup).
Attachment #585574 - Attachment is obsolete: true
Attachment #585747 - Flags: review?(dao)
Attachment #585574 - Flags: review?(dao)
(In reply to Marco Bonardo [:mak] from comment #30)
> > +  newChannel: function Proto_newChannel(aUri) {
> 
> global-nit: uppercase aURI wherever it's an nsIURI

Fixed.

> > +  newChannel: function Proto_newChannel(aUri) {
> > +    let {url} = parseUri(aUri);
> 
> So we only use the url part for now?
> do we ignore passing into nonexistent algorithm? what about sizes? (it's
> fine if those are follow-ups, just to understand how we plan to handle
> errors)

Yes, sorry if I wasn't clear enough about this. We're only using the url part for now as the thumbnailer captures screenshots in the right size. Later this should be a slightly larger size and thumbnails will be down-scaled on-demand.

> > +    // Prevent recursion if a thumbnail for a thumbnail is requested.
> > +    if (0 == url.indexOf(PageThumbs.scheme + "://"))
> 
> I suspect a substr would be more efficient. In the non-matching case indexOf
> should walk all of the string.

I removed the whole part but for reference please see bug 713813 (on why indexOf() is faster here).

> Though, this is sort of a weak check (I may ask for thumb of the thumb of
> the thumb), so I'm not sure how much is really useful, probably we don't
> even do this for common protocols like http.
> Unless it's a common case you hit, I'd probably just return a 404

I removed this part. It was necessary before because that could've been a real loop (because we took screenshots on-demand).

> > +  _wasOpened: false,
> 
> could you just use _isPending?

I think there is a difference between wasOpened (private) and isPending (public API). So wasOpened just prevents that the channel is opened twice and will not be set to false when finished. isPending will be false when the request has succeeded and can be accessed using nsIRequest.

> > +    let self = this;
> 
> you can use bind(this) on the function

Yes, I did this but I'm not really a big fan of .bind(). This is pretty useful for a one-line callback but otherwise just mixes up the reading flow (if you have to look at the bottom of a big callback to determine the value of 'this').

> > +      NetUtil.asyncFetch(inputStream, function (aData, aStatus) {
> > +        // We might have been canceled while waiting.
> > +        if (self.canceled)
> > +          return;
> 
> is it ok to not notify the callback with empty data if we were canceled?

We're cancelled intentionally by the caller so the channel listener shouldn't expect any further onStart/StopRequest or onDataAvailable notifications.

> > +        // Check if we have a valid data stream.
> > +        if (!Components.isSuccessCode(aStatus) || !aData.available())
> > +          aDataStream = null;
> 
> what is aDataStrem? it's not defined anywhere, maybe you meant aData = null?

Yeah, typo... Fixed.

> > +        aEntry.close();
> > +        aCallback(aData);
> > +      });
> > +    });
> 
> I suppose in some really rare case asyncFetch may throw, should we catch
> that and invoke the callback in such a case?

Yeah, I guess that's possible. Done.

> > +    // Synchronous data delivery is not implemented.
> > +    throw Cr.NS_ERROR_NOT_IMPLEMENTED;
> > +  },
> > +
> > +  /* :::::::: nsIHttpChannel ::::::::::::::: */
> 
> you don't seem to respect the idl

According to the spec 'nsIChannel implementations are not required to implement this method'.

> > +  setResponseHeader: function Channel_setResponseHeader(aHeader, aValue, aMerge) {
> > +    if (aMerge)
> > +      throw Cr.NS_ERROR_NOT_IMPLEMENTED;
> 
> I think that per idl you should just ignore this, not throw ("This flag is
> ignored if the specified header does not support merging").

Fixed.

> > +  setResponseHeader: function Channel_setResponseHeader(aHeader, aValue, aMerge) {
> > +    if (aMerge)
> > +      throw Cr.NS_ERROR_NOT_IMPLEMENTED;
> > +
> > +    this._responseHeaders[aHeader.toLowerCase()] = aValue;
> 
> note that, per idl, if aValue is empty you should remove the header from the
> array, not set it to an empty value.

Fixed.

> > +function parseUri(aUri) {
> > +  let params = {};
> > +  let [host, query] = aUri.spec.split("?");
> 
> what's "host"? what's used for? It's not a host in the strict term, so this
> looks confusing. It's probably what we may call a "path".
> 
> I think you should add documentation at the top of this protocol, explaining
> its parts and their usage.

Done.

> > +    let [key, value] = param.split("=").map(decodeURIComponent);
> > +    params[key.toLowerCase()] = value;
> > +  });
> > +
> > +  params.algorithm = host.replace(/^.*\/+/, "").replace(/\/+.*$/, "");
> 
> I'm having an hard time understanding what exotic case this is handling
> (especially the second replace, "test1/test2/test3///" is solved to "") and
> why we should handle it.

I restructured the parseURI() method and it should be a lot clearer now.
Attachment #582791 - Attachment is obsolete: true
Attachment #585756 - Flags: review?(mak77)
(In reply to Tim Taubert [:ttaubert] from comment #41)
> I removed the whole part but for reference please see bug 713813 (on why
> indexOf() is faster here).

FWIW, that refers to a loop, surely in a loop I'd create the regex out of it, not inside it.
Comment on attachment 585747 [details] [diff] [review]
Part 3 - Browser integration

>+const THUMBNAIL_CAPTURE_DELAY_MS = "2000";

This should be a gBrowserThumbnails property.

>+      case "unload":
>+        this.uninit();
>+        break;

Just call uninit from BrowserShutdown.
See also bug 555337 -- Jeff said this was "causing problems for graphics guys", but I don't know what kind of problems exactly.
Jeff, in bug 555337 you said that frequently calling drawWindow() in the background "seems to be causing problems for graphics guys". Can you please elaborate on that? What's the actual problem here? We're going to introduce a New Tab Page and a Thumbnail Service and will be active by default.
*and both will be active by default.
Added tests.
Attachment #586073 - Flags: review?(dietrich)
Attached patch Part 3 - Browser integration (obsolete) — Splinter Review
(In reply to Dão Gottwald [:dao] from comment #43)
> >+const THUMBNAIL_CAPTURE_DELAY_MS = "2000";
> 
> This should be a gBrowserThumbnails property.

Done.

> >+      case "unload":
> >+        this.uninit();
> >+        break;
> 
> Just call uninit from BrowserShutdown.

Done.
Attachment #585747 - Attachment is obsolete: true
Attachment #586078 - Flags: review?(dao)
Attachment #585747 - Flags: review?(dao)
Attached patch Part 1 - JavaScript Module (obsolete) — Splinter Review
(In reply to Dietrich Ayala (:dietrich) from comment #38)
> > +/**
> > + * The default height for page thumbnails.
> > + */
> > +const THUMBNAIL_HEIGHT = 127;
> 
> should explain why these these are the default values.

Done.

> > +XPCOMUtils.defineLazyServiceGetter(this, "gAppShellService",
> > +  "@mozilla.org/appshell/appShellService;1", "nsIAppShellService");
> > +
> > +XPCOMUtils.defineLazyServiceGetter(this, "gCacheService",
> > +  "@mozilla.org/network/cache-service;1", "nsICacheService");
> 
> can you instead add these to Services.jsm?

Done.

> > +  getThumbnailForUrl:
> 
> maybe getThumbnailURLForURL, since it's not actually returning the thumbnail
> itself. kind of a mouthful though. maybe look at the favicon service for how
> we named similar apis there? maybe even getThumbnailURL would be ok, since
> the whole thing is url-centric.

Changed to getThumbnailURL().

> > +   * Captures a thumbnail for the given browser.
> > +   * @param aBrowser The browser to capture a thumbnail from.
> 
> should add detail about exactly what this means. eg, captures and does what
> with it?

Done.

> should say explicitly what a "browser" is.
> why passing browser instead of the content window?
> do we want a callback for when capture and storage is complete?

Passing in a contentWindow now and separated into two public methods (capture + store). store() does now take a callback argument.

> > +    // Store the capture thumbnail in the file cache.
> 
> nit: genericize the comment - might be some other backing store at some
> point (which is why you abstracted that bit into a separate private method
> in the first place, i assume?)

Done.

> > +  _store: function PageThumbs_store(aBrowser, aCanvas) {
> > +    let key = aBrowser.currentURI.spec;
> 
> can you ensure that we're using the same source for browsing history,
> favicon key, which URI gets bookmarked, etc? the fact that the URL used as
> the cache key is defined separately from the caller asking to save the
> thumbnail makes me a little nervous.

store() expects now a key provided by the caller. We're using 'browser.currentURI.spec' as that's what the bookmark button and also jetpack does with 'tab.url'.

> > +      // Extract image data from the canvas.
> > +      self._readImageData(aCanvas, function (aData) {
> 
> is there a way to read the canvas data as a stream, so that we can copy from
> input stream direct to output stream, instead of reading all into memory at
> once?

Alas, I don't think so. I searched but that seems to be the best/only method to do it.

> > +  /**
> > +   * Opens the cache entry identified by the given keys
> 
> keys?

... by a single key of course :)

> > +   * Opens the cache entry identified by the given keys
> > +   * @param aKey The key identifying the desired cache entry.
> > +   * @param aAccess The desired access mode.
> 
> define what modes are available, or point at where they're defined

Done.

> @@ +264,5 @@
> > +  /**
> > +   * Opens the cache entry identified by the given keys
> > +   * @param aKey The key identifying the desired cache entry.
> > +   * @param aAccess The desired access mode.
> > +   * @param aCallback The callback to be called when the cache entry was opened.
> 
> "function to be called" would be more explicit

Done.

> > +/**
> > + * Define a lazy getter for the cache session.
> > + */
> > +XPCOMUtils.defineLazyGetter(PageThumbsCache, "_session", function () {
> 
> _cacheSession would be more explicit.

Done.
Attachment #583961 - Attachment is obsolete: true
Attachment #586085 - Flags: review?(dietrich)
(In reply to Dão Gottwald [:dao] from comment #44)
> See also bug 555337 -- Jeff said this was "causing problems for graphics
> guys", but I don't know what kind of problems exactly.

I don't remember what this meant. How often will this call drawWindow()?
Whenever a content window (in a browser) fires a load event or a tab gets restored or selected. drawWindow() is always called with a delay of 2secs (dunno if that matters).
Comment on attachment 585756 [details] [diff] [review]
Part 2 - moz-page-thumb:// Protocol and Channel

Review of attachment 585756 [details] [diff] [review]:
-----------------------------------------------------------------

global-nit: may use MPL2

question: looks like some of the other parts have moved from "thumbs" to "thumbnails", I wonder if the protocol should be coherent with that and be moz-page-thumbnail:, as well as the source filenames? This is something more at an SR level.

::: browser/components/thumbnails/PageThumbsProtocol.js
@@ +45,5 @@
> + *
> + * moz-page-thumb://thumbnail?url=http%3A%2F%2Fwww.mozilla.org%2F&width=201&height=127
> + *
> + * This URL requests an image for 'http://www.mozilla.org/' with a width of
> + * 201px and a height of 127px.

Though both width and height are not yet supported, so this comment needs updating.

@@ +104,5 @@
> +  },
> +
> +  /**
> +   * Constructs a new channel from the given URI for this protocol handler.
> +   * @param aUri The URI for which to construct a channel.

uppercase aURI

@@ +130,5 @@
> +function Channel(aUri) {
> +  this._uri = aUri;
> +
> +  // nsIChannel
> +  this.originalURI = aUri;

Uppercase aURI in all of this constructor

@@ +191,5 @@
> +    let {url} = parseURI(this._uri);
> +
> +    // Return early if there's no valid URL given.
> +    if (!url) {
> +      aCallback();

nit: may explicitly pass null, just to clarify the callback expects something, also later you set aData = null and pass it, so would be more consistent

@@ +199,5 @@
> +    // Try to get a cache entry.
> +    PageThumbsCache.getReadEntry(url, function (aEntry) {
> +      let inputStream = aEntry && aEntry.openInputStream(0);
> +
> +      function cleanup(aData) {

a better name for this helper would be worth it, it does not just cleanup, it also invokes the callback, but that's unclear later. cleanup(aData) also looks like it's clearing data!

@@ +208,5 @@
> +      }
> +
> +      // Check if we have a valid entry and if it has any data.
> +      if (!inputStream || !inputStream.available()) {
> +        cleanup();

nit: as well as here

@@ +463,5 @@
> +};
> +
> +/**
> + * Parses a given URI and extracts all parameters relevant to this protocol.
> + * @param aUri The URI to parse.

nit: uppercase aURI
Attachment #585756 - Flags: review?(mak77) → review+
(In reply to Tim Taubert [:ttaubert] from comment #51)
> Whenever a content window (in a browser) fires a load event or a tab gets
> restored or selected. drawWindow() is always called with a delay of 2secs
> (dunno if that matters).

I don't really like this extra painting and would like to see a careful evaluation of the performance impact that this has.

Personally, I think it might make more sense to grab a copy of the existing painted copy instead of repainting. We should at least make sure that we're reusing the existing layers and not redrawing them.

Also, why do we need to regrab a thumbnail when a tab gets selected? And do we need thumbnails for all of the pages?
(In reply to Marco Bonardo [:mak] from comment #52)
> global-nit: may use MPL2

Yes, good idea.

> question: looks like some of the other parts have moved from "thumbs" to
> "thumbnails", I wonder if the protocol should be coherent with that and be
> moz-page-thumbnail:, as well as the source filenames? This is something more
> at an SR level.

Are you ok with me doing that in a follow-up bug? There are some other follow-up bugs as well that we'll tackle before pref'ing the New Tab Page on.

> > + * moz-page-thumb://thumbnail?url=http%3A%2F%2Fwww.mozilla.org%2F&width=201&height=127
> > + *
> > + * This URL requests an image for 'http://www.mozilla.org/' with a width of
> > + * 201px and a height of 127px.
> 
> Though both width and height are not yet supported, so this comment needs
> updating.

Done.

> > +    // Return early if there's no valid URL given.
> > +    if (!url) {
> > +      aCallback();
> 
> nit: may explicitly pass null, just to clarify the callback expects
> something, also later you set aData = null and pass it, so would be more
> consistent

Done.

> > +    PageThumbsCache.getReadEntry(url, function (aEntry) {
> > +      let inputStream = aEntry && aEntry.openInputStream(0);
> > +
> > +      function cleanup(aData) {
> 
> a better name for this helper would be worth it, it does not just cleanup,
> it also invokes the callback, but that's unclear later. cleanup(aData) also
> looks like it's clearing data!

Renamed to 'closeEntryAndFinish()'.
Attachment #585756 - Attachment is obsolete: true
(In reply to Tim Taubert [:ttaubert] from comment #54)
> > question: looks like some of the other parts have moved from "thumbs" to
> > "thumbnails", I wonder if the protocol should be coherent with that and be
> > moz-page-thumbnail:, as well as the source filenames? This is something more
> > at an SR level.
> 
> Are you ok with me doing that in a follow-up bug? There are some other
> follow-up bugs as well that we'll tackle before pref'ing the New Tab Page on.

provided the follow-up is addressed in the same release. Otherwise add-ons and docs would eventually have to be updated shortly, making this harder to change later.
Btw, mine was mostly a question, a decision should still be made.
I totally concur that we should streamline the wording and use "Thumbnail(s)" instead of "Thumb(s)". Just don't want to start changing the patches all over again. So I'm going to address this in the same release.
s/thumb/thumbnail/ should definitely happen, but yeah can be followup - doesn't need to block initial landing and testing by nightly users, since doesn't affect the experience at all.
Comment on attachment 586085 [details] [diff] [review]
Part 1 - JavaScript Module

Review of attachment 586085 [details] [diff] [review]:
-----------------------------------------------------------------

r=me

::: browser/modules/PageThumbs.jsm
@@ +67,5 @@
> +   * Gets the thumbnail image's url for a given web page's url.
> +   * @param aUrl The web page's url that is depicted in the thumbnail.
> +   * @param aWidth The thumbnail's width.
> +   * @param aHeight The thumbnail's height.
> +   * @return The thumbnail image's url.

width and height should be optional, no?

@@ +238,5 @@
> +
> +    let inputStream = aEntry.openInputStream(0);
> +
> +    // check if the entry actually contains data
> +    return inputStream && inputStream.available();

does this validation hit the disk? seems potentially expensive. can we just catch exceptions in getReadEntry instead?
Attachment #586085 - Flags: review?(dietrich) → review+
Attachment #586073 - Flags: review?(dietrich) → review+
(In reply to Dietrich Ayala (:dietrich) from comment #58)
> > +   * Gets the thumbnail image's url for a given web page's url.
> > +   * @param aUrl The web page's url that is depicted in the thumbnail.
> > +   * @param aWidth The thumbnail's width.
> > +   * @param aHeight The thumbnail's height.
> > +   * @return The thumbnail image's url.
> 
> width and height should be optional, no?

Yes, at the moment they're not even used at all. I removed them until we'll implement resizing in the protocol handler.

> > +    let inputStream = aEntry.openInputStream(0);
> > +
> > +    // check if the entry actually contains data
> > +    return inputStream && inputStream.available();
> 
> does this validation hit the disk? seems potentially expensive. can we just
> catch exceptions in getReadEntry instead?

Good catch, removed the whole part because the Protocol did this check again and we don't need it twice.
Attachment #586085 - Attachment is obsolete: true
(In reply to Jeff Muizelaar [:jrmuizel] from comment #53)
> I don't really like this extra painting and would like to see a careful
> evaluation of the performance impact that this has.

The performance impact kind of depends on the speed of the drawWindow() implementation. If this could be better we should fix it because it's an advertised API and add-ons use it.

> Personally, I think it might make more sense to grab a copy of the existing
> painted copy instead of repainting. We should at least make sure that we're
> reusing the existing layers and not redrawing them.

So you mean not using drawWindow()? Is there any way to get image data (in PNG format) saved to the cache? That's all we'd need. And a JavaScript API, of course.

Or do you mean using flags like (DRAWWINDOW_DO_NOT_FLUSH | DRAWWINDOW_ASYNC_DECODE_IMAGES) to reduce the performance impact?

> Also, why do we need to regrab a thumbnail when a tab gets selected? And do
> we need thumbnails for all of the pages?

Actually we don't need that for the New Tab Page because that doesn't necessarily require thumbnails to be up-to-date. But as this is aimed to be a thumbnail service that should consolidate all different kinds of custom thumbnailing codes in the browser (like Ctrl-Tab, TabPreviews, Drag/Drop of tabs, Bookmarks, ...) we need to stay somewhat up-to-date.
Comment on attachment 586078 [details] [diff] [review]
Part 3 - Browser integration

>+/**
>+ * A singleton that takes care of keeping thumbnails of web pages up-to-date
>+ * automatically.
>+ */

"Keeps thumbnails of open web pages up-to-date."? I don't think it makes sense to call this a singleton.

>+  /**
>+   * The delay in ms to wait before taking a screenshot of a page.
>+   */
>+  _captureDelayMS: 2000,

Three lines of dummy comments for each property will only make it harder to read this code. This for one seems self-explanatory.

>+  _tabEvents: ["TabClose", "TabSelect", "SSTabRestored"],

Doesn't the progress listener make SSTabRestored redundant?

>+  /**
>+   * Initializes the Thumbnails singleton and registers event handlers.
>+   */

>+  /**
>+   * Uninitializes the Thumbnails singleton and unregisters event handlers.
>+   */

Can be dropped.

>+  /**
>+   * Generic event handler.
>+   * @param aEvent The event to handle.
>+   */
>+  handleEvent: function Thumbnails_handleEvent(aEvent) {

Doesn't need to be documented, handleEvent means the same wherever you see it.

>+    let browser;

>+      case "TabClose":
>+        browser = aEvent.target.linkedBrowser;
>+        if (this._timeouts.has(browser)) {
>+          clearTimeout(this._timeouts.get(browser));
>+          this._timeouts.delete(browser);
>+        }
>+        break;

Could be written as:

case "TabClose": {
  let browser = ...;
  ...
  break;
}

>+  /**
>+   * State change progress listener for all tabs.
>+   * @param aBrowser The browser whose state has changed.
>+   * @param aWebProgress The nsIWebProgress that fired this notification.
>+   * @param aRequest The nsIRequest that has changed state.
>+   * @param aStateFlags Flags indicating the new state.
>+   * @param aStatus Error status code associated with the state change.
>+   */
>+  onStateChange: function Thumbnails_onStateChange(aBrowser, aWebProgress,
>+                                                   aRequest, aStateFlags, aStatus) {
>+    if (aStateFlags & Ci.nsIWebProgressListener.STATE_STOP &&
>+        aStateFlags & Ci.nsIWebProgressListener.STATE_IS_NETWORK)
>+      this.delayedCapture(aBrowser);
>+  },

This is again a standard callback. It's useful to stress that this is being called for all tabs, but you don't need to document the parameters.

>+  /**
>+   * Captures a screenshot of the given browser with a delay.
>+   * @param aBrowser The browser to take a screenshot of.
>+   */

>+  /**
>+   * Captures a screenshot of the given browser.
>+   * @param aBrowser The browser to take a screenshot of.
>+   */

Basically, you're just reading out the method names and parameters here...

>+    let self = this;
>+
>+    let timeout = setTimeout(function () {
>+      self._timeouts.delete(aBrowser);
>+      self.capture(aBrowser);
>+    }, this._captureDelayMS);

use bind(this)

>+  capture: function Thumbnails_capture(aBrowser) {
>+    if (!aBrowser.parentNode || !this.shouldCapture(aBrowser))
>+      return;

You're clearing the timeouts when getting TabClose, so aBrowser.parentNode shouldn't happen, should it?

>+  /**
>+   * Determines whether we should capture a screenshot of the given browser.
>+   * @param aBrowser The browser to possibly take a screenshot of.
>+   * @return Whether we should capture a screenshot.
>+   */
>+  shouldCapture: function Thumbnails_shouldCapture(aBrowser) {

Can you prefix methods with an underscore when they aren't supposed to be called from external code? As far as I can tell, all but handleEvent, onStateChange, init and uninit are strictly private. Maybe this helps fighting the desire to document everything.

>@@ -1804,16 +1806,17 @@ function BrowserShutdown() {
>   } else {
>     if (Win7Features)
>       Win7Features.onCloseWindow();
> 
>     gPrefService.removeObserver(ctrlTab.prefName, ctrlTab);
>     gPrefService.removeObserver(allTabs.prefName, allTabs);
>     ctrlTab.uninit();
>     TabView.uninit();
>+    gBrowserThumbnails.init();

uninit
Attached patch Part 3 - Browser integration (obsolete) — Splinter Review
(In reply to Dão Gottwald [:dao] from comment #61)
> >+/**
> >+ * A singleton that takes care of keeping thumbnails of web pages up-to-date
> >+ * automatically.
> >+ */
> 
> "Keeps thumbnails of open web pages up-to-date."? I don't think it makes
> sense to call this a singleton.

Fixed.

> >+  /**
> >+   * The delay in ms to wait before taking a screenshot of a page.
> >+   */
> >+  _captureDelayMS: 2000,
> 
> Three lines of dummy comments for each property will only make it harder to
> read this code. This for one seems self-explanatory.

Removed.

> >+  _tabEvents: ["TabClose", "TabSelect", "SSTabRestored"],
> 
> Doesn't the progress listener make SSTabRestored redundant?

Yes, good catch, removed.

> >+  /**
> >+   * Initializes the Thumbnails singleton and registers event handlers.
> >+   */
> 
> >+  /**
> >+   * Uninitializes the Thumbnails singleton and unregisters event handlers.
> >+   */
> 
> Can be dropped.

Removed.

> >+  /**
> >+   * Generic event handler.
> >+   * @param aEvent The event to handle.
> >+   */
> >+  handleEvent: function Thumbnails_handleEvent(aEvent) {
> 
> Doesn't need to be documented, handleEvent means the same wherever you see
> it.

Removed.

> >+    let browser;
> 
> >+      case "TabClose":
> >+        browser = aEvent.target.linkedBrowser;
> >+        if (this._timeouts.has(browser)) {
> >+          clearTimeout(this._timeouts.get(browser));
> >+          this._timeouts.delete(browser);
> >+        }
> >+        break;
> 
> Could be written as:
> 
> case "TabClose": {
>   let browser = ...;
>   ...
>   break;
> }

Yeah, I removed some code and this variable definition is rather legacy. Fixed.

> >+  /**
> >+   * State change progress listener for all tabs.
> >+   * @param aBrowser The browser whose state has changed.
> >+   * @param aWebProgress The nsIWebProgress that fired this notification.
> >+   * @param aRequest The nsIRequest that has changed state.
> >+   * @param aStateFlags Flags indicating the new state.
> >+   * @param aStatus Error status code associated with the state change.
> >+   */
> >+  onStateChange: function Thumbnails_onStateChange(aBrowser, aWebProgress,
> >+                                                   aRequest, aStateFlags, aStatus) {
> >+    if (aStateFlags & Ci.nsIWebProgressListener.STATE_STOP &&
> >+        aStateFlags & Ci.nsIWebProgressListener.STATE_IS_NETWORK)
> >+      this.delayedCapture(aBrowser);
> >+  },
> 
> This is again a standard callback. It's useful to stress that this is being
> called for all tabs, but you don't need to document the parameters.

Removed.

> >+  /**
> >+   * Captures a screenshot of the given browser with a delay.
> >+   * @param aBrowser The browser to take a screenshot of.
> >+   */
> 
> >+  /**
> >+   * Captures a screenshot of the given browser.
> >+   * @param aBrowser The browser to take a screenshot of.
> >+   */
> 
> Basically, you're just reading out the method names and parameters here...

Removed.

> >+    let self = this;
> >+
> >+    let timeout = setTimeout(function () {
> >+      self._timeouts.delete(aBrowser);
> >+      self.capture(aBrowser);
> >+    }, this._captureDelayMS);
> 
> use bind(this)

Done.

> >+  capture: function Thumbnails_capture(aBrowser) {
> >+    if (!aBrowser.parentNode || !this.shouldCapture(aBrowser))
> >+      return;
> 
> You're clearing the timeouts when getting TabClose, so aBrowser.parentNode
> shouldn't happen, should it?

Yes, that shouldn't happen, removed.

> >+  /**
> >+   * Determines whether we should capture a screenshot of the given browser.
> >+   * @param aBrowser The browser to possibly take a screenshot of.
> >+   * @return Whether we should capture a screenshot.
> >+   */
> >+  shouldCapture: function Thumbnails_shouldCapture(aBrowser) {
> 
> Can you prefix methods with an underscore when they aren't supposed to be
> called from external code? As far as I can tell, all but handleEvent,
> onStateChange, init and uninit are strictly private. Maybe this helps
> fighting the desire to document everything.

Done.

> >@@ -1804,16 +1806,17 @@ function BrowserShutdown() {
> >   } else {
> >     if (Win7Features)
> >       Win7Features.onCloseWindow();
> > 
> >     gPrefService.removeObserver(ctrlTab.prefName, ctrlTab);
> >     gPrefService.removeObserver(allTabs.prefName, allTabs);
> >     ctrlTab.uninit();
> >     TabView.uninit();
> >+    gBrowserThumbnails.init();
> 
> uninit

m(
Attachment #586078 - Attachment is obsolete: true
Attachment #586975 - Flags: review?(dao)
Attachment #586078 - Flags: review?(dao)
Comment on attachment 586975 [details] [diff] [review]
Part 3 - Browser integration

>+ * This Source Code is subject to the terms of the Mozilla Public License
>+ * version 2.0 (the "License"). You can obtain a copy of the License at
>+ * http://mozilla.org/MPL/2.0/.

This is an outdated version of the boilerplate...

>+      case "SSTabRestored":

remove
Attachment #586975 - Flags: review?(dao) → review+
(In reply to Dão Gottwald [:dao] from comment #63)
> >+ * This Source Code is subject to the terms of the Mozilla Public License
> >+ * version 2.0 (the "License"). You can obtain a copy of the License at
> >+ * http://mozilla.org/MPL/2.0/.
> 
> This is an outdated version of the boilerplate...

Updated (everywhere).

> >+      case "SSTabRestored":
> 
> remove

Done.
Attachment #586975 - Attachment is obsolete: true
(In reply to Tim Taubert [:ttaubert] from comment #60)
> (In reply to Jeff Muizelaar [:jrmuizel] from comment #53)
> > I don't really like this extra painting and would like to see a careful
> > evaluation of the performance impact that this has.
> 
> The performance impact kind of depends on the speed of the drawWindow()
> implementation. If this could be better we should fix it because it's an
> advertised API and add-ons use it.

Currently, when add-ons use drawWindow they see the cost of drawWindow directly. My concern with this service that it takes the cost of drawWindow and hides it from the users of the service giving the impression that it's cheap. Instead of having a particular feature responsible for the cost, it gets absorbed into the platform making the platform slower and more bloated.

> 
> > Personally, I think it might make more sense to grab a copy of the existing
> > painted copy instead of repainting. We should at least make sure that we're
> > reusing the existing layers and not redrawing them.
> 
> So you mean not using drawWindow()? Is there any way to get image data (in
> PNG format) saved to the cache? That's all we'd need. And a JavaScript API,
> of course.
> 

There is not really any better way that I know of. This is infrastructure that would need to be built.

> 
> Actually we don't need that for the New Tab Page because that doesn't
> necessarily require thumbnails to be up-to-date. But as this is aimed to be
> a thumbnail service that should consolidate all different kinds of custom
> thumbnailing codes in the browser (like Ctrl-Tab, TabPreviews, Drag/Drop of
> tabs, Bookmarks, ...) we need to stay somewhat up-to-date.

This means that we need to pay for the cost of the more aggressive users of the service even when they're not running and it makes it less attractive for particular features to optimize for their particular usecase. (e.g. drag/drop of tabs only needs a thumbnail of the dragged tab and wants it be live, Ctrl-tab probably doesn't need all of the thumbnails at once etc.). Further, if the service is exposed to add-ons we end having to support it indefinitely into the future, even if the interface turns out to be a poor one. This is not great.


+  /**
+   * Reads the image data from a given canvas and passes it to the callback.
+   * @param aCanvas The canvas to read the image data from.
+   * @param aCallback The function that the image data is passed to.
+   */
+  _readImageData: function PageThumbs_readImageData(aCanvas, aCallback) {
+    let dataUri = aCanvas.toDataURL(PageThumbs.contentType, "");
+    let uri = Services.io.newURI(dataUri, "UTF8", null);
+
+    NetUtil.asyncFetch(uri, function (aData, aResult) {
+      if (Components.isSuccessCode(aResult) && aData && aData.available())
+        aCallback(aData);
+    });
+  },

I'm not exactly sure what this function is for, but why is it using toDataURL instead of getImageData()?
(In reply to Jeff Muizelaar [:jrmuizel] from comment #65)
> Currently, when add-ons use drawWindow they see the cost of drawWindow
> directly. My concern with this service that it takes the cost of drawWindow
> and hides it from the users of the service giving the impression that it's
> cheap. Instead of having a particular feature responsible for the cost, it
> gets absorbed into the platform making the platform slower and more bloated.

* This is not how API usage actually plays out in add-on land (nor the core platform, actually). APIs get wrapped and re-wrapped and abstracted to Alaska and back. We need to solve this problem with telemetry and data and well-designed APIs, not by saying "please don't use this API so much" <- that approach fails consistently (see the sync vs async discussion in the LevelDB bug for more). We need to ensure that the API is constructed to be as low-cost as possible. But not providing it because it might get over-used just hurts us *and* users (see my next point). Using that logic we should stop writing any API that might touch the filesystem at any point. Or the network, that's even slower ;)

* Any add-on author can use drawWindow at terrible times, and can store the thumbnail in far crappier ways than we will. Here, we'll be doing it in the best possible way, and under continuous performance testing.

* We need this feature. We need thumbnails for New Tab, and we need it for other features that have been long-blocked on the lack of them. Our UX team chafes at this constraint on their vision of what Firefox can be. We need it for our add-on platform, which already exposes an API for tab thumbnails. Chrome/Safari/IE have thumbnails built-in and are building features on them, so we also need this from a competitive standpoint.

* We absolutely want our ecosystem to use this API. We want to encourage people to build great visualizations and navigational interfaces to history and bookmarks and tabs, and thumbnails are an integral part of that.

> > 
> > Actually we don't need that for the New Tab Page because that doesn't
> > necessarily require thumbnails to be up-to-date. But as this is aimed to be
> > a thumbnail service that should consolidate all different kinds of custom
> > thumbnailing codes in the browser (like Ctrl-Tab, TabPreviews, Drag/Drop of
> > tabs, Bookmarks, ...) we need to stay somewhat up-to-date.
> 
> This means that we need to pay for the cost of the more aggressive users of
> the service even when they're not running and it makes it less attractive
> for particular features to optimize for their particular usecase. (e.g.
> drag/drop of tabs only needs a thumbnail of the dragged tab and wants it be
> live, Ctrl-tab probably doesn't need all of the thumbnails at once etc.).
> Further, if the service is exposed to add-ons we end having to support it
> indefinitely into the future, even if the interface turns out to be a poor
> one. This is not great.

I generally agree about encouraging optimization per use-case, but not in this case. We've got at least three features using tab thumbnails, and want to build yet more. We need them to be available on-demand in the core, to make it as easy and painless and performant as possible.

And in the add-on ecosystem, you're talking about hundreds of add-ons all re-implementing poorly-performing storage of binary files, and storing the same data redundantly, in this case. They can all call drawWindow, and at inopportune times. Yes, we need to be defensive against poor usage, but we do that by building great APIs instead of no APIs.

We have a cache, optimized for this exact use-case. We have experienced core engineers (eg: you and Tim)). And the sum total of images saved from this is probably less than are loaded over the network by hitting the Yahoo.com front page every day, or browsing Facebook for a few minutes. If drawWindow is the problem that's left, then let's figure out what is wrong and fix it.
I did some measuring of how long drawWindow() takes:

Quad Core Linux: min/avg/max:  7/12/7 ms
MacBook Air:     min/avg/max: 14/45/90 ms

Those values are ok but not great. Probably worse than 90ms on a netbook. It would be great to speed up drawWindow() calls but making them async would help as well.

I noticed there is ctx.asyncDrawXULElement() but that is pretty useless without knowing when it's finished - I don't know why the MozAsyncCanvasRender was removed but this method still exists. With bug 619591 fixed I think we could switch to using this function without blocking the UI.
I'll file a follow-up bug about maybe switching to asyncDrawXULElement().
Backed out because it messed up tscroll :/

https://hg.mozilla.org/integration/fx-team/rev/3f877d406d8a
https://hg.mozilla.org/integration/fx-team/rev/97973b8c04be
https://hg.mozilla.org/integration/fx-team/rev/43588ad492ac
https://hg.mozilla.org/integration/fx-team/rev/f3cf8f1827a2

tscroll increase 11% on MacOSX 10.6 (rev4) Mozilla-Inbound
http://mzl.la/wlsrqr

tscroll increase 15.2% on MacOSX 10.5.8 Mozilla-Inbound
http://mzl.la/xzA09M

tscroll increase 2.35% on XP Mozilla-Inbound-Non-PGO
http://mzl.la/x6h0eN
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Ok, so it seems like we don't have any other choice than using ctx.drawWindow() to create page thumbnails which obviously doesn't seem an option because of talos.

asyncDrawXULElement() is only used if the XUL element is contained in a remote browser (e10s). If the browser/tab is local is just falls back to drawWindow().

@Jeff: What we'd really need is some kind of drawWindow() flag that makes it work in the background and fires an event when ready or takes a callback to call when finished. Is that even doable? We could of course also try to drastically speed up drawWindow()...
(In reply to Tim Taubert [:ttaubert] from comment #72)
> @Jeff: What we'd really need is some kind of drawWindow() flag that makes it
> work in the background and fires an event when ready or takes a callback to
> call when finished. Is that even doable? We could of course also try to
> drastically speed up drawWindow()...

We don't currently have any way of doing the actual painting in the background. i.e. All painting happens on the main thread.

There are some different things that could be investigated. First, we could make sure that we're only recompositing the layers of a page instead of repainting it entirely.

We could also build API to asynchronously get a copy (or scaled copy) of the currently painted window. That would avoid the cost of painting altogether but might limit what content you can get when.

I would suggest getting roc's advice as he knows more about what drawWindow does than I do.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #73)
> There are some different things that could be investigated. First, we could
> make sure that we're only recompositing the layers of a page instead of
> repainting it entirely.

The current patch redraws everything.

When thumbnailing a visible tab, we could just recompose the layer subtree for that tab. This should be very fast (apart from reading back from the GPU). With OMTC, it can be made asynchronous as well if necessary. This will require new layers API and some drawWindow implementation changes, though. We could make it even cheaper by making a new API to deliver a copy of the visible tab next time we recompose it normally; then we could keep a copy of what was composited for the visible tab, and read it back asynchronously.

Do you need to take thumbnails of hidden tabs?
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #74)
> Do you need to take thumbnails of hidden tabs?

Do you mean hidden or not selected tabs? Anyway, we don't really need that, we could just switch to only capturing the visible tab when it's selected.
(In reply to Tim Taubert [:ttaubert] from comment #71)
> Backed out because it messed up tscroll :/

The landing also caused regressions in SVG Opacity (on OS X 10.5 only) and Trace Malloc Allocs / Trace Malloc Maxheap (on all platforms).
(In reply to Dão Gottwald [:dao] from comment #31)
> >+++ b/browser/components/thumbnails/PageThumbs.jsm
> 
> Please put standalone modules in browser/modules/.

I don't think this matters much, but this isn't really how I foresaw modules/ being used. modules/ is for modules that wouldn't otherwise have a reasonable place to live - if browser/components/thumbnails is going to exist anyways (for the JS component in this case), you might as well put the related module there too (I'm assuming the component and module are related enough for this to be reasonable).
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #74)
> When thumbnailing a visible tab, we could just recompose the layer subtree
> for that tab. This should be very fast (apart from reading back from the
> GPU). With OMTC, it can be made asynchronous as well if necessary. This will
> require new layers API and some drawWindow implementation changes, though.
> We could make it even cheaper by making a new API to deliver a copy of the
> visible tab next time we recompose it normally; then we could keep a copy of
> what was composited for the visible tab, and read it back asynchronously.

Filed bug 720575.
This patch adds "scroll" listener to the page we'll take a thumbnail for. When the scroll event is received we'll reset the capture timer and wait a bit more to not judder when scrolling.
Attachment #590941 - Flags: review?(dietrich)
Don't capture thumbnails for SVG/XML documents as that will regress SVG talos tests. We can re-enable this when bug 720575 is fixed.
Attachment #590943 - Flags: review?(dietrich)
(In reply to Gavin Sharp (use gavin@gavinsharp.com for email) from comment #77)
> I don't think this matters much, but this isn't really how I foresaw
> modules/ being used. modules/ is for modules that wouldn't otherwise have a
> reasonable place to live - if browser/components/thumbnails is going to
> exist anyways (for the JS component in this case), you might as well put the
> related module there too (I'm assuming the component and module are related
> enough for this to be reasonable).

Moved the JSM back to browser/components/thumbnails/.
Attachment #590943 - Flags: review?(dietrich) → review+
Comment on attachment 587113 [details] [diff] [review]
Part 3 - Browser integration

Review of attachment 587113 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/base/content/browser-thumbnails.js
@@ +99,5 @@
> +    if (aBrowser.docShell.busyFlags != Ci.nsIDocShell.BUSY_FLAGS_NONE)
> +      return false;
> +
> +    // Don't take screenshots of about: pages.
> +    if (aBrowser.currentURI.schemeIs("about"))

what about resource:// URIs and other uncommon schemes? Places has a list for determining what URLs to store in history... maybe this should be in sync with that? what happens if you put a thumbnail URL in a thumbnail URL?
Comment on attachment 590941 [details] [diff] [review]
Part 5 - Don't capture thumbnails while the page is scrolled

Review of attachment 590941 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/base/content/browser-thumbnails.js
@@ +59,5 @@
>        case "TabSelect":
>          this._delayedCapture(aEvent.target.linkedBrowser);
>          break;
>        case "TabClose": {
> +        this._clearTimeout(aEvent.target.linkedBrowser);

do we also want to clear timeouts and recapture on top-level document load events?
Attachment #590941 - Flags: review?(dietrich) → review+
(In reply to Dietrich Ayala (:dietrich) from comment #82)
> what about resource:// URIs and other uncommon schemes? Places has a list
> for determining what URLs to store in history... maybe this should be in
> sync with that? what happens if you put a thumbnail URL in a thumbnail URL?

Will file a follow-up bug to implement a scheme white list.

(In reply to Dietrich Ayala (:dietrich) from comment #83)
> do we also want to clear timeouts and recapture on top-level document load
> events?

The timeout is initiated at some point. When the timeout is fired we capture and store a screenshot for the currently loaded page no matter where the user has navigated to in the meantime.
moz-page-thumb://thumbnail?url=.... can access from a web page.
It should access only in privileged site for security and privacy reason.

If can access on a web page, evil site owner can know that the user accessed or not the URL in past.
(In reply to teramako from comment #87)
> moz-page-thumb://thumbnail?url=.... can access from a web page.
> It should access only in privileged site for security and privacy reason.

Great finding. Could you please file a new bug on this? Thank you!
(In reply to Dão Gottwald [:dao] from comment #88)
> (In reply to teramako from comment #87)
> > moz-page-thumb://thumbnail?url=.... can access from a web page.
> > It should access only in privileged site for security and privacy reason.
> 
> Great finding. Could you please file a new bug on this? Thank you!

Reported: https://bugzilla.mozilla.org/show_bug.cgi?id=721398
(In reply to teramako from comment #89)
> Reported: https://bugzilla.mozilla.org/show_bug.cgi?id=721398

Unfortunately this isn't publicly accessible...
(In reply to Dão Gottwald [:dao] from comment #90)
> (In reply to teramako from comment #89)
> > Reported: https://bugzilla.mozilla.org/show_bug.cgi?id=721398
> 
> Unfortunately this isn't publicly accessible...

oops, sorry, I had been checked "security" :(.
I re-reported: https://bugzilla.mozilla.org/show_bug.cgi?id=721408
Depends on: 721408
Depends on: CVE-2012-0476
Is there a chance to get thumbnail service work without disk cache?
Not at the moment because that's the best place we have to efficiently store images. If you'd like (for whatever reason) to have other storage options investigated please file a bug saying so.
Depends on: 716944
Depends on: 705958
Depends on: 716949
Depends on: 721329
Depends on: 723050
Depends on: 705911
Depends on: 726347
Depends on: 727765
Depends on: 724408
Depends on: 726267
Depends on: 727120
Depends on: 725589
Depends on: 725189
Depends on: 736709
Depends on: 884519
Depends on: 1413097
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: