Last Comment Bug 634139 - Need the ability to extract the most representative color from a favicon
: Need the ability to extract the most representative color from a favicon
Status: RESOLVED FIXED
: dev-doc-complete
Product: Toolkit
Classification: Components
Component: Places (show other bugs)
: unspecified
: All All
: -- normal with 3 votes (vote)
: mozilla17
Assigned To: Andrew Hurle [:ahurle]
:
:
Mentors:
http://blog.margaretleibovic.com/post...
Depends on: 818936 774907 950546
Blocks: 779027 587909 747789 768703 783510
  Show dependency treegraph
 
Reported: 2011-02-14 17:05 PST by Alex Faaborg [:faaborg] (Firefox UX)
Modified: 2014-07-22 11:52 PDT (History)
22 users (show)
MattN+bmo: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Part 1 v1 - Add a service for finding the representative color in an image (48.60 KB, patch)
2012-07-07 00:53 PDT, Andrew Hurle [:ahurle]
MattN+bmo: feedback+
Details | Diff | Splinter Review
v2 - Addressing Matt's comments, size limit (51.27 KB, patch)
2012-07-12 00:19 PDT, Andrew Hurle [:ahurle]
MattN+bmo: review-
Details | Diff | Splinter Review
v3 - More cleanup, use chroma instead of saturation, comments (54.04 KB, patch)
2012-07-28 21:49 PDT, Andrew Hurle [:ahurle]
MattN+bmo: review+
Details | Diff | Splinter Review
v4 - Split worker into multiple files, remove dumps, address comments (54.30 KB, patch)
2012-08-03 14:18 PDT, Andrew Hurle [:ahurle]
no flags Details | Diff | Splinter Review

Description Alex Faaborg [:faaborg] (Firefox UX) 2011-02-14 17:05:13 PST
For a few different user interfaces we are planning, we are going to need the ability to extract the dominant color used in a site's favicon.

Two specific cases where we may use this ability:

Application buttons for pinned Web apps in the windows taskbar:
http://areweprettyyet.com/5/desktopApps/

Search engine buttons for user added engines (or engines shipped with Firefox that did not request a particular color):
http://areweprettyyet.com/5/searchBar/
Comment 2 Alex Faaborg [:faaborg] (Firefox UX) 2011-06-29 13:21:38 PDT
yep, forgot to cc Margaret
Comment 4 Andrew Hurle [:ahurle] 2012-06-26 18:05:11 PDT
I'm planning on using this implementation Matt suggested since it's MPL licensed:
https://github.com/harthur/rainbow/tree/master/chrome/content/analyzer
https://harthur.wordpress.com/2009/12/18/getting-the-color-scheme-of-a-website-using-canvas-and-hierarchical-clustering/

Will need to tweak it a bit for our purposes.  We're using a different algorithm than margaret's because hers won't cluster together similar colors, so it won't pick up dominant gradients for example.

I'll be implementing it as a JS XPCOM service that finds the representative color in an arbitrary image, and then the favicon service can use it to store colors in its database to avoid recalculating the color every time you want to display it.
Comment 5 Andrew Hurle [:ahurle] 2012-07-07 00:53:01 PDT
Created attachment 639942 [details] [diff] [review]
Part 1 v1 - Add a service for finding the representative color in an image

Part 2 will involve integrating this with the favicon service to cache representative colors in moz_favicons and provide a way to get that color.

This uses Heather Arthur's algorithm, modified to have the desirable properties described in the 99designs article, accompanied by tests for those properties.  

Big images can be really slow and tie up the worker, so I'm not sure if we should set a size cap and resize anything that's bigger, or maybe just give each big image its own worker.  Creating a new worker for _every_ image proved to be about 3x slower than having a static worker when processing many small images.
Comment 6 Matthew N. [:MattN] (In Taipei until Sep. 23) 2012-07-07 20:13:19 PDT
Comment on attachment 639942 [details] [diff] [review]
Part 1 v1 - Add a service for finding the representative color in an image

Review of attachment 639942 [details] [diff] [review]:
-----------------------------------------------------------------

Nice tests.  This seems to be coming along well.

I know some of my comments are on imported code but it'd be good to have them follow our coding style.

I didn't review the clusterfck.js or rgb.js portions yet but I think I have enough review comments for now.

::: toolkit/components/places/ColorAnalyzer.js
@@ +6,5 @@
> +
> +const Ci = Components.interfaces;
> +const Cc = Components.classes;
> +
> +Components.utils.import("resource://gre/modules/XPCOMUtils.jsm");

Use "Cu" here

@@ +9,5 @@
> +
> +Components.utils.import("resource://gre/modules/XPCOMUtils.jsm");
> +
> +const hiddenWindowDoc = Components.classes["@mozilla.org/appshell/appShellService;1"]
> +                      .getService(Components.interfaces.nsIAppShellService)

Nit: Use Cc/Ci

@@ +10,5 @@
> +Components.utils.import("resource://gre/modules/XPCOMUtils.jsm");
> +
> +const hiddenWindowDoc = Components.classes["@mozilla.org/appshell/appShellService;1"]
> +                      .getService(Components.interfaces.nsIAppShellService)
> +                      .hiddenDOMWindow.document;

Nit: fix indentation (align periods or use trailing periods and align the characters.

@@ +12,5 @@
> +const hiddenWindowDoc = Components.classes["@mozilla.org/appshell/appShellService;1"]
> +                      .getService(Components.interfaces.nsIAppShellService)
> +                      .hiddenDOMWindow.document;
> +const XHTML_NS = "http://www.w3.org/1999/xhtml";
> +const ERROR_COLOR = 0xFFFFFF;

I find the ERROR_COLOR idea a bit unusual.  Perhaps we should just have two callbacks (success and error)? Or we can use this internally but don't make guarantees about the color when the callback is called with success = false.

@@ +27,5 @@
> +ColorAnalyzer.prototype = {
> +  // todo: how do we handle massive images?  cap sizes?  make a new worker?
> +  findRepresentativeColor: function(imageURI, callback) {
> +    var image = hiddenWindowDoc.createElementNS(XHTML_NS, "img");
> +    var that = this;

Use .bind(this) on the onload function to avoid this.

@@ +28,5 @@
> +  // todo: how do we handle massive images?  cap sizes?  make a new worker?
> +  findRepresentativeColor: function(imageURI, callback) {
> +    var image = hiddenWindowDoc.createElementNS(XHTML_NS, "img");
> +    var that = this;
> +    image.onload = function() {

Nit: name the anonymous functions (ie. ColorAnalyzer_onload) to help with stack traces.

@@ +35,5 @@
> +      canvas.height = image.naturalHeight;
> +      var ctx = canvas.getContext("2d");
> +      ctx.drawImage(image, 0, 0);
> +      that.startJob(ctx.getImageData(0, 0, canvas.width, canvas.height),
> +                       callback);

Nit: align callback with ctx.

@@ +67,5 @@
> +  onWorkerError: function(error) {
> +    // this shouldn't happen, but just in case
> +    error.preventDefault();
> +    dump("ColorAnalyzer worker error in " + error.filename + " line " +
> +         error.lineno + ": " + error.message + "\n");

You can do Cu.reportError(error); here to have this go to the error console.

@@ +72,5 @@
> +    this.callbacks.shift().onComplete(false, ERROR_COLOR);
> +  },
> +
> +  classID: Components.ID("{d056186c-28a0-494e-aacc-9e433772b143}"),
> +  _xpcom_factory: XPCOMUtils.generateSingletonFactory(ColorAnalyzer),

Is this line needed?

@@ +77,5 @@
> +  QueryInterface: XPCOMUtils.generateQI([Ci.mozIColorAnalyzer])
> +};
> +
> +let components = [ColorAnalyzer];
> +var NSGetFactory = XPCOMUtils.generateNSGetFactory(components);

Nit: inline the array (make these two lines into one).

::: toolkit/components/places/ColorAnalyzer_worker.js
@@ +17,5 @@
> + * Heather Arthur.
> + * Portions created by the Initial Developer are Copyright (C) 2009
> + * the Initial Developer. All Rights Reserved.
> + *
> + * Contributor(s): 

I'm not sure what it takes to relicense this to MPL 2.0

@@ +37,5 @@
> +"use strict";
> +
> +// event.data is an ImageData object
> +onmessage = function(event) {
> +  var pixels = event.data.data;

To avoid this comment and make it more clear, can we just call the property imageData instead of data?

@@ +43,5 @@
> +  var height = event.data.height;
> +
> +  var t1 = Date.now();
> +  var allColors = getColors(pixels, width, height);
> +  var t2 = Date.now();

I'm assuming t1/t2 will either be used or go away eventually.

@@ +46,5 @@
> +  var allColors = getColors(pixels, width, height);
> +  var t2 = Date.now();
> +
> +  var colors = allColors.slice(0, 2400); // only merge top colors by frequency for speed
> +  var threshold = 12;

Add a comment about what threshold means.

@@ +52,5 @@
> +  var backgroundColor = getBackgroundColor(pixels, width, height);
> +  dump("backgroundColor: " + backgroundColor + "\n");
> +  
> +  var merged = mergeColors(colors, width * height, threshold).map(function(cluster) {
> +    var rv = cluster.canonical;

Nit: I prefer result or rv but nbd.

@@ +89,5 @@
> +
> +  var t3 = Date.now();
> +  //postMessage({colors: merged, pixelTime: t2 - t1, clusterTime: t3 - t2, num: allColors.length});
> +  
> +  postMessage(merged.length > 0 ? merged[0].color : null);

We should probably send the N most desirable colors where N is defined in a const (or passed from the creator with a default value).  This will make it more useful for the future.

@@ +98,5 @@
> +  var colorFrequency = {};
> +  var aliased = 0;
> +  for(var x = 0; x < width; x++) {
> +    for(var y = 0; y < height; y++) {
> +     var offs = x*4 + y*4*width;

Nit: use spaces around binary operators.  If you want to make the order of operations more clear, you can use brackets.

@@ +100,5 @@
> +  for(var x = 0; x < width; x++) {
> +    for(var y = 0; y < height; y++) {
> +     var offs = x*4 + y*4*width;
> +     
> +     if (data[offs + 3] < 25) // transparent

Using constants for the offsets of RGBA would make this more clear:
const COLOR_RED = 0;
const COLOR_GREEN = 1;
const COLOR_BLUE = 2;
const COLOR_ALPHA = 3;

@@ +113,5 @@
> +    }
> +  }
> +   
> +  var colors = [];
> +  for(color in colorFrequency)

"let color"

@@ +119,5 @@
> +  colors.sort(descendingSort);
> +  return colors;
> +}
> +
> +function mergeColors(colors, numPixels, threshold) {

Please add a comment on these helpers that are non-trivial.

@@ +124,5 @@
> +  var items = colors.map(function(item) {
> +    var color = item.color;
> +    var freq = item.freq;
> +    return {
> +      mean: convertRGBtoLAB(color >> 16, color >> 8 & 0xff, color & 0xff),

Using constants for 8 and 16 here and other places would be good

@@ +131,5 @@
> +      colors : [color],
> +      highFreq: freq,
> +      highRatio : freq / numPixels,
> +      highColor: color, // the individual color w/ the highest frequency in this cluster
> +      ratio: freq / numPixels, // ratio of page taken up by colors in this cluster

s/page/image/ ?

@@ +140,5 @@
> +  var merged = clusterfck.hcluster(items, distance, merge, threshold);
> +  return merged;
> +}
> +
> +function descendingSort(a, b){

descendingFreqSort

@@ +151,5 @@
> +  var r2 = c2.highRatio;
> +  var f1 = c1.highFreq;
> +  var f2 = c2.highFreq;  
> +  var handicap;
> +  if((r1 > 0.2 && r2 > 0.2) || (f1 > 200000 && f2 > 200000))

Nit: space after "if"

@@ +153,5 @@
> +  var f2 = c2.highFreq;  
> +  var handicap;
> +  if((r1 > 0.2 && r2 > 0.2) || (f1 > 200000 && f2 > 200000))
> +    handicap = 5; 
> +  else if((r1 > 0.14 && r2 > 0.14) || (f1 > 100000 && f2 > 100000))

Could you add more comments explaining what these cases are doing?

@@ +250,5 @@
> +// color is a Lab array
> +function getBackgroundColor(data, width, height) {
> +  // we'll assume that if the four corners are roughly the same color,
> +  // then that's the background color
> +  var coordinates = [[0,0], [width-1,0], [width-1,height-1], [0,height-1]];

Nit: spaces after the commas

@@ +255,5 @@
> +  
> +  // find the corner colors in LAB
> +  var cornerColors = [];
> +  for (var i = 0; i < coordinates.length; i++) {
> +    var offs = coordinates[i][0]*4 + coordinates[i][1]*4*width;

Nit: spaces around * here too

@@ +256,5 @@
> +  // find the corner colors in LAB
> +  var cornerColors = [];
> +  for (var i = 0; i < coordinates.length; i++) {
> +    var offs = coordinates[i][0]*4 + coordinates[i][1]*4*width;
> +    if(data[offs + 3] < 20) {

Use the constants in this loop too.

@@ +289,5 @@
> +
> +///// clusterfck.js
> +
> +/* 
> +Copyright (c) 2011 Heather Arthur <fayearthur@gmail.com>

We need to make sure that we meet any licensing restrictions.

@@ +441,5 @@
> +}
> +
> +var SINGLE_LINKAGE = function(c1, c2) { return c1; };
> +var COMPLETE_LINKAGE = function(c1, c2) { return c1; };
> +var AVERAGE_LINKAGE = function(c1, c2) { return c1; };

Not sure why these variables are all caps? We should change the case of the name and not assign the function to a variable.

@@ +544,5 @@
> +	H = Math.round(H);
> +	S = Math.round(rnd * S * 100)/rnd;
> +	V = Math.round(rnd * V * 100)/rnd;
> +	
> +	return new Array(H,S,V);

It would be more clear to return an object with three properties.

::: toolkit/components/places/mozIColorAnalyzer.idl
@@ +1,3 @@
> +#include "nsISupports.idl"
> +
> +interface mozIRepresentativeColorCallback;

I think it's preferred to just switch the order of the interfaces to avoid the forward declaration.

@@ +29,5 @@
> +[function, scriptable, uuid(e4089e21-71b6-40af-b546-33c21b90e874)]
> +interface mozIRepresentativeColorCallback : nsISupports
> +{
> +  /**
> +  * Must always be called when color analysis finishes.

Nit: s/Must always/Will/

::: toolkit/components/places/tests/browser/browser_colorAnalyzer.js
@@ +4,5 @@
> +
> +"use strict";
> +
> +const Cc = Components.classes;
> +const Ci = Components.interfaces;

I think Cc and Ci are already defined in browser-chrome tests.

@@ +8,5 @@
> +const Ci = Components.interfaces;
> +const CA = Cc["@mozilla.org/places/colorAnalyzer;1"]
> +             .getService(Ci.mozIColorAnalyzer);
> +const hiddenWindowDoc = Components.classes["@mozilla.org/appshell/appShellService;1"]
> +                      .getService(Components.interfaces.nsIAppShellService)

Nit: Use Cc/Ci and fix indentation here.

@@ +25,5 @@
> +
> +function newURI(uri) {
> +  var ioService = Components.classes["@mozilla.org/network/io-service;1"]  
> +                  .getService(Components.interfaces.nsIIOService);  
> +  return ioService.newURI(uri, "", null); 

You can use Services.io.newURI(...) in which case this helper may not be necessary.

@@ +59,5 @@
> +context of the canvas, sticks the generated canvas into findRepresentativeColor.
> +See frcTest.
> +*/
> +function canvasTest(width, height, paintCanvasFunc, expected, message, skipNextStep) {
> +  var canvas = hiddenWindowDoc.createElementNS(XHTML_NS, "canvas");

Nit: Please use "let" instead of "var" in new code.

@@ +198,5 @@
> +    ctx.fillStyle = "FFDDDD";
> +    ctx.fillRect(0, 0, 16, 16);
> +    ctx.fillStyle = "red";
> +    ctx.fillRect(0, 0, 3, 16);
> +  }, 0xFF0000, "interestingColorPreference analysis returns red");

TODO: make sure pink isn't excluded because it's a background color

@@ +201,5 @@
> +    ctx.fillRect(0, 0, 3, 16);
> +  }, 0xFF0000, "interestingColorPreference analysis returns red");
> +});
> +
> +// make sure the preference for interesting colors won't stupidly pick 1 pixel

1?
Comment 7 Andrew Hurle [:ahurle] 2012-07-12 00:19:52 PDT
Created attachment 641380 [details] [diff] [review]
v2 - Addressing Matt's comments, size limit

findRepresentativeColor will now fail if the image contains more than 128^2 pixels for perf reasons.  Also, the worker will now only consider the 500 most frequent colors for clustering, again for perf reasons.  I went through the rest of the imported code and fixed the coding style (hopefully).  The dump statements will eventually go away...

On my debug build and beefy laptop, a 16x16 image with one color takes something like 10ms, 128x128 with almost all pixels a unique color takes 1400ms, and the 32x32 firefox.ico file takes 900ms.

(In reply to Matthew N. [:MattN] from comment #6)
> @@ +37,5 @@
> > +"use strict";
> > +
> > +// event.data is an ImageData object
> > +onmessage = function(event) {
> > +  var pixels = event.data.data;
> 
> To avoid this comment and make it more clear, can we just call the property
> imageData instead of data?

I think event.data is unavoidable, so I have event.data.imageData.blah now (along with event.data.maxColors).

> @@ +113,5 @@
> > +    }
> > +  }
> > +   
> > +  var colors = [];
> > +  for(color in colorFrequency)
> 
> "let color"

For whatever reason, using "let" here was very slow, but "var" was fine, so I used "var" to achieve the clarity I think you were aiming for here.
Comment 8 :Gavin Sharp [email: gavin@gavinsharp.com] 2012-07-12 09:30:35 PDT
(In reply to Andrew Hurle [:ahurle] from comment #7)
> > > +  for(color in colorFrequency)
> > 
> > "let color"
> 
> For whatever reason, using "let" here was very slow, but "var" was fine, so
> I used "var" to achieve the clarity I think you were aiming for here.

If you have a reproducible, measurable testcase, you should file a JS engine bug about this.
Comment 9 Matthew N. [:MattN] (In Taipei until Sep. 23) 2012-07-17 02:28:17 PDT
Comment on attachment 641380 [details] [diff] [review]
v2 - Addressing Matt's comments, size limit

Review of attachment 641380 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good still.  I haven't seen major problems yet and I'm doing a deeper pass this time.

I would still prefer you use let instead of var in all of these new files.  If there is a major performance difference in a specific case (ie. a loop), use var in that case and file a bug then reference that bug in a comment near the use of "var".  A quick search didn't show existing bugs about the performance of let.

I only reviewed the worker so far and stopped at mergeClusters.  I'll finish the review pass tomorrow.

::: toolkit/components/places/ColorAnalyzer_worker.js
@@ +46,5 @@
> +const RED_SHIFT = 16;
> +const GREEN_SHIFT = 8;
> +
> +onmessage = function(event) {
> +  var imageData = event.data.imageData;

Would be worth documenting the arguments here in javadoc style

@@ +62,5 @@
> +  dump("getColors: " + (t2 - t1) + "\n");
> +
> +  // Only merge top colors by frequency for speed.
> +  // Images with more than this many unique colors will have reduced accuracy.
> +  var colors = allColors.slice(0, 500);

Move this number to a const. (ie. const MAX_CLUSTERED_COLORS) along with other magic numbers.

@@ +67,5 @@
> +
> +  // Each cluster of colors has a mean color in the Lab color space.
> +  // If the euclidean distance between the means of two clusters is greater
> +  // than this threshold, they won't be merged.
> +  var threshold = 12;

ditto

@@ +73,5 @@
> +  var backgroundColor = getBackgroundColor(pixels, width, height);
> +  dump("backgroundColor: " + backgroundColor + "\n");
> +
> +  var t3 = new Date();
> +  var merged = mergeColors(colors, width * height, threshold);

Nit: mergedColors would be more clear for the variable

@@ +78,5 @@
> +  var t4 = new Date();
> +  dump("mergeColors: " + (t4 - t3) + "\n");
> +
> +  merged = merged.map(function(cluster) {
> +    var result = cluster.canonical;

Nit: Use a more descriptive variable name here.

@@ +87,5 @@
> +    if (backgroundColor != null) {
> +      var backgroundDistance = labEuclidean(result.mean, backgroundColor);
> +      dump("backgroundDistance: " + backgroundDistance + "\n");
> +      if (backgroundDistance < 15) {
> +        multiplier = backgroundDistance / 15;

ditto

@@ +101,5 @@
> +    dump("saturation: " + saturation + "\n");
> +    if (saturation < 1) {
> +      saturation = 1;
> +    } else if (saturation > 90) {
> +      saturation = 90;

ditto…

@@ +124,5 @@
> +};
> +
> +// Given the pixel data and dimensions of an image, return an array of objects
> +// associating each unique color and its frequency in the image, sorted
> +// descending by frequency.  Sufficiently transparent colors are ignored.

Can you convert this to javadoc style like:
/**
 * Given the pixel data…
 *
 * @param data
 *        Pixel data of the image to get the colors from.
 * …
 */

Could you also add this to all functions that are non-trivial (ie. not descendingFreqSort)? They don't have to be long.

@@ +130,5 @@
> +  var colorFrequency = {};
> +  var aliased = 0;
> +  for (var x = 0; x < width; x++) {
> +    for (var y = 0; y < height; y++) {
> +      var offs = (x * 4) + (y * 4 * width);

Nit: offset would be clearer for a variable name

@@ +132,5 @@
> +  for (var x = 0; x < width; x++) {
> +    for (var y = 0; y < height; y++) {
> +      var offs = (x * 4) + (y * 4 * width);
> +
> +      if (data[offs + PIXEL_ALPHA] < 25) {

make 25 a const with a description

@@ +141,5 @@
> +                  | data[offs + PIXEL_GREEN] << GREEN_SHIFT
> +                  | data[offs + PIXEL_BLUE];
> +
> +      if (color in colorFrequency) {
> +        colorFrequency[color] = colorFrequency[color] + 1;

colorFrequency[color]++; ?

@@ +199,5 @@
> +  // a high frequency.  If so, make their distance seem larger, which makes
> +  // them less likely to be merged.
> +  if ((r1 > 0.2 && r2 > 0.2) || (f1 > 200000 && f2 > 200000)) {
> +    handicap = 5;
> +  } else if ((r1 > 0.14 && r2 > 0.14) || (f1 > 100000 && f2 > 100000)) {

Are frequencies in the second case referring to the number of pixels of the high color?  If so, then shouldn't the ranges be relative to size of the image?

@@ +215,5 @@
> +  return dist + handicap;
> +}
> +
> +// find the euclidean distance between two colors in the Lab color space
> +function labEuclidean(col1, col2) {

Nit: color1, color2

@@ +237,5 @@
> +
> +  var total = num1 + num2;
> +
> +  var mean = {
> +    l: (lab1.l * num1 + lab2.l * num2) / total,

Nit: Use "lightness:"

@@ +269,5 @@
> +  }
> +
> +  return {
> +    mean: mean,
> +    num: c1.num + c2.num,

numColors would be clearer but isn't it the same as colors.length? If so, we can remove this if there isn't a major perf impact looking up .length repeatedly.

@@ +281,5 @@
> +  };
> +}
> +
> +function dumpObj(obj) {
> +  dump(obj + ": {\n");

It seems like this is an unused debugging function and should be removed before landing.

@@ +293,5 @@
> +  }
> +  dump("}\n")
> +}
> +
> +

Nit: there's an extra new line here

@@ +299,5 @@
> +// it.  Returned color is a Lab array.
> +function getBackgroundColor(data, width, height) {
> +  // we'll assume that if the four corners are roughly the same color,
> +  // then that's the background color
> +  var coordinates = [[0, 0], [width-1, 0], [width-1, height-1], [0, height-1]];

Nit: spaces around "-" operator

@@ +304,5 @@
> +
> +  // find the corner colors in LAB
> +  var cornerColors = [];
> +  for (var i = 0; i < coordinates.length; i++) {
> +    var offs = (coordinates[i][0] * 4) + (coordinates[i][1] * 4 * width);

Nit: offset would be clearer for a variable name

@@ +307,5 @@
> +  for (var i = 0; i < coordinates.length; i++) {
> +    var offs = (coordinates[i][0] * 4) + (coordinates[i][1] * 4 * width);
> +    if (data[offs + PIXEL_ALPHA] < 20) {
> +      // we can't make very accurate judgements below this opacity
> +      return null;

I don't think we want to bail from getBackgroundColor if one color has a low alpha.  It seems like "continue" would be good and then bail from the function after the loop |if (!cornerColors.length)|

@@ +315,5 @@
> +                                      data[offs + PIXEL_BLUE]));
> +  }
> +
> +  // find the average color among the corners
> +  var averageColor = { l: 0, a: 0, b: 0 };

"lightness" here too

@@ +327,5 @@
> +  }
> +
> +  // if any of the corner colors deviate enough from the average, they aren't
> +  // similar enough to be considered the background color
> +  for (var i = 0; i < cornerColors.length; i++) {

for (let cornerColor of cornerColors) {

@@ +366,5 @@
> +    var module = { exports: {}};
> +    var exports = module.exports;
> +module.exports = (function() {
> +    var module = { exports: {}};
> +    var exports = module.exports;

I would prefer if the module/exports boilerplate was removed.

@@ +368,5 @@
> +module.exports = (function() {
> +    var module = { exports: {}};
> +    var exports = module.exports;
> +
> +var HierarchicalClustering = function(distance, merge, threshold) {

I think we generally just do:
function HierarchicalClustering(distance, merge, threshold) {

@@ +375,5 @@
> +  this.threshold = threshold == undefined ? Infinity : threshold;
> +}
> +
> +HierarchicalClustering.prototype = {
> +  cluster: function(items, snapshot, snapshotCallback) {

Add some javadocs comments to these functions explaining their arguments.  It seems like "snapshot" is the size of snapshots so it would be more clear as snapshotSize.

@@ +378,5 @@
> +HierarchicalClustering.prototype = {
> +  cluster: function(items, snapshot, snapshotCallback) {
> +
> +    var clusters = [];
> +    var dists = [];  // distances between each pair of clusters

Nit: "distances"

@@ +379,5 @@
> +  cluster: function(items, snapshot, snapshotCallback) {
> +
> +    var clusters = [];
> +    var dists = [];  // distances between each pair of clusters
> +    var mins = []; // closest cluster for each cluster

Nit: better variable name?

@@ +381,5 @@
> +    var clusters = [];
> +    var dists = [];  // distances between each pair of clusters
> +    var mins = []; // closest cluster for each cluster
> +    var index = []; // keep a hash of all clusters by key
> +    for (var i = 0; i < items.length; i++) {

let item in items

@@ +403,5 @@
> +        }
> +      }
> +    }
> +
> +

Nit: extra new line

@@ +406,5 @@
> +
> +
> +    var toMerge = this.closestClusters(clusters, dists, mins);
> +    var i = 0;
> +    while (toMerge) {

It seems like this can be written as a for loop at quick glance:
for (let i = 0; toMerge = this.closestClusters(clusters, dists, mins); i++) { (with toMerge defined above the loop).  This reduces the number of lines and the duplication of the call to closestClusters.

@@ +412,5 @@
> +        snapshotCallback(clusters);
> +      }
> +
> +      var c1 = index[toMerge[0]];
> +      var c2 = index[toMerge[1]];

Nit: Use clearer variable names since there are multiple terms with "c" as a prefix around here: color, cluster, canonical, etc.

@@ +421,5 @@
> +    }
> +    return clusters;
> +  },
> +
> +  mergeClusters: function(clusters, dists, mins, index, c1, c2) {

ditto

@@ +540,5 @@
> +return module.exports;   })();
> +return module.exports;   })()
> +
> +
> +/////// rgb.js

Can we keep this as a separate file and call importScripts to import it? rgb.js or colorConversion.js? We may also want to do the same with the clustering code.

@@ +597,5 @@
> +  H = Math.round(H);
> +  S = Math.round(rnd * S * 100) / rnd;
> +  V = Math.round(rnd * V * 100) / rnd;
> +
> +  return { h: H, s: S, v: V };

I guess I should have been more explicit that you should use the full word as the property name: hue, saturation, and value

@@ +632,5 @@
> +
> +  var L = (116 * Y) - 16;
> +  var A = 500 * (X - Y);
> +  var B = 200 * (Y - Z);
> +  return { l: L, a: A, b: B };

Nit: Use "lightness" instead of "l"
Comment 10 Matthew N. [:MattN] (In Taipei until Sep. 23) 2012-07-17 18:48:35 PDT
Comment on attachment 641380 [details] [diff] [review]
v2 - Addressing Matt's comments, size limit

Review of attachment 641380 [details] [diff] [review]:
-----------------------------------------------------------------

Just a few more comments but I'd like to take another look after the fixes.

::: toolkit/components/places/ColorAnalyzer.js
@@ +55,5 @@
> +  },
> +
> +  onImageError: function ColorAnalyzer_onImageError(image, callback) {
> +    Cu.reportError("ColorAnalyzer error: image at " + image.src +
> +                   " didn't load\n");

You don't need \n in calls to reportError (same below also).  You can also fit this on one line if you remove "error:" which will automatically be shown as the prefix in the error console.

::: toolkit/components/places/ColorAnalyzer_worker.js
@@ +526,5 @@
> +         .cluster(items, snapshot, snapshotCallback);
> +}
> +
> +clusterfck = {
> +  hcluster: hcluster,

The module stuff can be replaced by just having the function definitions inside the clusterfck object:

let clusterfck = {
  hcluster: function hcluster(…) {

  },

  single_linkage: function single_linkage(…) {

  },

  …
};

::: toolkit/components/places/mozIColorAnalyzer.idl
@@ +5,5 @@
> +[function, scriptable, uuid(e4089e21-71b6-40af-b546-33c21b90e874)]
> +interface mozIRepresentativeColorCallback : nsISupports
> +{
> +  /**
> +  * Will be called when color analysis finishes.

Nit: The comments are not aligned correctly for these 2 functions.  The * one the second line should align with the first one on the first line:
/**
 *

@@ +17,5 @@
> +  *        The representative color as an integer in RGB form.
> +  *        e.g. 0xFF0102 == rgba(255,1,2)
> +  *        If success is false, the value of color is not defined.
> +  */
> +  void onComplete(in boolean success, in unsigned long color);

I *think* we can make the 2nd argument |[optional] in unsigned long color| so that we can get rid of the ERROR_COLOR stuff and just call the callback with 1 argument on error.

@@ +39,5 @@
> +  *        A URI pointing to the image - ideally a data: URI, but any scheme
> +  *        that will load when setting the src attribute of a DOM img element
> +  *        should work.
> +  * @param callback
> +  *        Function to call when the representative color is found

…or an error occurs.

::: toolkit/components/places/tests/browser/Makefile.in
@@ +26,5 @@
>  	browser_settitle.js \
> +	colorAnalyzer/dino_head.ico \
> +	colorAnalyzer/firefox.ico \
> +	colorAnalyzer/thunderbird.ico \
> +	colorAnalyzer/amo.png \

I think we should avoid putting images of trademarks in toolkit. List of trademarks: https://www.mozilla.org/foundation/trademarks/list.html

You could instead copy some of the logos from elsewhere in the tree such as the ones at https://mxr.mozilla.org/mozilla-central/source/toolkit/themes/pinstripe/mozapps/extensions/
Comment 11 Justin Dolske [:Dolske] 2012-07-17 20:07:40 PDT
Comment on attachment 641380 [details] [diff] [review]
v2 - Addressing Matt's comments, size limit

Review of attachment 641380 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/places/ColorAnalyzer_worker.js
@@ +17,5 @@
> + * Heather Arthur.
> + * Portions created by the Initial Developer are Copyright (C) 2009
> + * the Initial Developer. All Rights Reserved.
> + *
> + * Contributor(s):

Pretty sure this can just be replaced with the standards new MPL2 boilerplate (since the existing license allows doing so, which is how we did it tree-wide). Can check with gerv to be sure, or harther since she's CC'd here.

A couple other random driveby comments:

* Speaking of harther, I bet she will have some insightful review comments! ;) At the very least I'd be curious for overall thoughts.

* Perhaps poke someone in GFX on the side (joedrew?) for long-term thoughts/ideas. Depending on how fancy we want to get with things, there might be ways to do some of this on-GPU or in optimized native code. Doesn't need to hold up this patch, but against I'd at least be curious for overall thoughts from domain experts.

* Can't help but wonder if some Typed Array goodness would be an optimization to look at (https://developer.mozilla.org/en/JavaScript_typed_arrays). Followup? Good-first-bug for investigation, even?
Comment 12 Heather Arthur [:harth] 2012-07-20 08:42:34 PDT
Hey guys, I wrote the algorithm you're planning on using there, and I'm sorry I didn't get on this bug faster.

I wrote that awhile ago and if I had to implement it again I'd probably look for other solutions. Hierarchical clustering does a good job, but as an algorithm is a pretty slow way of achieving this.

I tried TypedArrays a bit ago and I don't think it helped much at all.
Comment 13 Heather Arthur [:harth] 2012-07-20 09:04:26 PDT
Just mentioned this in an email, but you can re-license my code or do whatever you want with it.

As for rgb.js, I used it because it was convenient, but really all you need is a function that converts RGB to LAB, which is a common color conversion algorithm that you could just steal, see: https://github.com/harthur/color-convert/blob/master/conversions.js#L142 (also my code, you can just take the individual function without attribution).

If it works fine, then it could be a short-term solution, but we should investigate faster algos.
Comment 14 Andrew Hurle [:ahurle] 2012-07-28 21:49:53 PDT
Created attachment 646940 [details] [diff] [review]
v3 - More cleanup, use chroma instead of saturation, comments

Some notes:
- I've changed from using the saturation from HSV to using chroma calculated from Lab.  Saturation has a problem where you can have, say, hsl(0, 100%, 5%) and hsl(0, 80%, 30%), and we will prefer the first very dark color since it has higher saturation, even though it's close to black.  Chroma is better suited for the human perception of the intensity of color, and doesn't have this problem.  It also lets us get rid of the HSV conversion function in exchange for a one line chroma calculation.

- I've clarified/renamed/cleaned up some things that weren't specifically mentioned in the review.  Terminology should be clearer now.

- Dump statements/timers will go away once I know I don't have to make any more changes that can affect performance or accuracy

- I've left everything in one file for now - Matt, that's easier for you if you want to view the diffs since the last patch, right?


(In reply to Matthew N. [:MattN] from comment #10)
> I would still prefer you use let instead of var in all of these new files. 
> If there is a major performance difference in a specific case (ie. a loop),
> use var in that case and file a bug then reference that bug in a comment
> near the use of "var".  A quick search didn't show existing bugs about the
> performance of let.

I'll do this later - the seven |var|s left in there will reliably increase my test running time by 200 - 1000 ms each if I switch them to |let| (tried it on OSX, too).  The other ones I changed didn't seem to have an effect.  I think I'll just try to get one good testcase and file that, rather than seven :)

> @@ +199,5 @@
> > +  // a high frequency.  If so, make their distance seem larger, which makes
> > +  // them less likely to be merged.
> > +  if ((r1 > 0.2 && r2 > 0.2) || (f1 > 200000 && f2 > 200000)) {
> > +    handicap = 5;
> > +  } else if ((r1 > 0.14 && r2 > 0.14) || (f1 > 100000 && f2 > 100000)) {
> 
> Are frequencies in the second case referring to the number of pixels of the
> high color?  If so, then shouldn't the ranges be relative to size of the
> image?

Yes, it refers to the number of pixels of the high color.  This whole block doesn't really make sense to me - why use only the data about the single highest color when you could be scaling based on the size of the cluster as a whole?  Isn't the point to make it hard to merge big clusters?  I've changed it to make more sense and be a lot shorter.

> @@ +307,5 @@
> > +  for (var i = 0; i < coordinates.length; i++) {
> > +    var offs = (coordinates[i][0] * 4) + (coordinates[i][1] * 4 * width);
> > +    if (data[offs + PIXEL_ALPHA] < 20) {
> > +      // we can't make very accurate judgements below this opacity
> > +      return null;
> 
> I don't think we want to bail from getBackgroundColor if one color has a low
> alpha.  It seems like "continue" would be good and then bail from the
> function after the loop |if (!cornerColors.length)|

Perhaps we shouldn't bail immediately, but I don't think it's a good idea to assert that we've found the background color based on a single data point if all other corners are transparent.  Two is slightly sane, three is meh.  I'll accept (4 > cornerColors > 1), but require them to be closer together if there are fewer points.


> @@ +381,5 @@
> > +    var clusters = [];
> > +    var dists = [];  // distances between each pair of clusters
> > +    var mins = []; // closest cluster for each cluster
> > +    var index = []; // keep a hash of all clusters by key
> > +    for (var i = 0; i < items.length; i++) {
> 
> let item in items

Looks like for...in loops are incredibly slow compared to the old fashioned way, and shouldn't be used on arrays according to MDN docs.  Makes my tests run 6x slower.

> @@ +406,5 @@
> > +
> > +
> > +    var toMerge = this.closestClusters(clusters, dists, mins);
> > +    var i = 0;
> > +    while (toMerge) {
> 
> It seems like this can be written as a for loop at quick glance:
> for (let i = 0; toMerge = this.closestClusters(clusters, dists, mins); i++)
> { (with toMerge defined above the loop).  This reduces the number of lines
> and the duplication of the call to closestClusters.

Alright, hopefully you're okay with what I did to make it fit on one line
Comment 15 Matthew N. [:MattN] (In Taipei until Sep. 23) 2012-08-03 12:36:30 PDT
Comment on attachment 646940 [details] [diff] [review]
v3 - More cleanup, use chroma instead of saturation, comments

Review of attachment 646940 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Justin Dolske [:Dolske] from comment #11)
> Comment on attachment 641380 [details] [diff] [review]
> * Can't help but wonder if some Typed Array goodness would be an
> optimization to look at
> (https://developer.mozilla.org/en/JavaScript_typed_arrays). Followup?

Luckily ImageData returned from the canvas is a typed array (Uint8ClampedArray).


r=me. Great job! We can land this once we split the ColorAnalyzer_worker.js.

::: toolkit/components/places/ColorAnalyzer_worker.js
@@ +91,5 @@
> +    // metadata holds a bunch of information about the color represented by
> +    // this cluster
> +    let metadata = cluster.item;
> +
> +    // how much of the image this color is responsible for is the basis for

Typo: "is responsible for is the basis for"

@@ +346,5 @@
> + * @param height
> + *        The height of the image
> + *
> + * @return The background color of the image as a Lab object, or null if we
> + *          can't determine the background color

Nit: remove extra space here to align remark

@@ +476,5 @@
> +
> +    // initialize distance matrix and cached neighbors
> +    for (let i = 0; i < clusters.length; i++) {
> +      for (let j = 0; j <= i; j++) {
> +        var dist = (i == j) ? Infinity :

Was this one of the cases where "let" had a perf. impact?

@@ +653,5 @@
> +};
> +
> +
> +
> +/////// rgb.js

Nit: How about colorConversion.js since they don't all involve rgb?

@@ +716,5 @@
> +
> +  return {
> +    lightness: (116 * y) - 16,
> +            a:  500 * (x - y),
> +            b:  200 * (y - z)

Nit: should probably just align these with lightness since not much is gained from aligning the colons IMO.
Comment 16 Andrew Hurle [:ahurle] 2012-08-03 14:18:05 PDT
Created attachment 648855 [details] [diff] [review]
v4 - Split worker into multiple files, remove dumps, address comments

Try push: https://tbpl.mozilla.org/?tree=Try&rev=b7c5ee41ecb3
Comment 17 Matthew N. [:MattN] (In Taipei until Sep. 23) 2012-08-15 01:01:26 PDT
Great job Andrew!

https://hg.mozilla.org/integration/mozilla-inbound/rev/62cc5a20b2f1

Try results: https://tbpl.mozilla.org/?tree=Try&rev=bb9965f70f1f plus local Windows testing
Comment 18 Ed Morley [:emorley] 2012-08-15 09:46:02 PDT
https://hg.mozilla.org/mozilla-central/rev/62cc5a20b2f1
Comment 19 Mardeg 2012-08-15 09:51:31 PDT
Will the Addon-kit have access to this?
If so, will that need to be documented somewhere?
Comment 20 Jim Mathies [:jimm] 2012-09-20 14:43:45 PDT
I was trying to work with this module today and ran into security problems in loading into hidden window. Is this expected or should I file a follow up bug?

[JavaScript Error: "ColorAnalyzer: image at moz-anno:favicon:http://support.mozilla.org/media/img/favicon.ico didn't load" {file: "file:///T:/Mozilla/ELM-REL/dist/bin/components/ColorAnalyzer.js" line: 60}]

success: false 0 

uri: moz-anno:favicon:http://support.mozilla.org/media/img/favicon.ico 

Security Error: Content at resource://gre-resources/hiddenWindow.html may not load or link to moz-anno:favicon:http://support.mozilla.org/media/img/favicon.ico.
Comment 21 Matthew N. [:MattN] (In Taipei until Sep. 23) 2012-09-20 15:10:20 PDT
(In reply to Jim Mathies [:jimm] from comment #20)
> I was trying to work with this module today and ran into security problems
> in loading into hidden window. Is this expected or should I file a follow up
> bug?

This is expected as we're using the hiddenWindow which is unprivileged on non-OSX.  I don't know how we would make this work unless we resolve moz-anno:favicon:* to an HTTP URI in ColorAnalyzer.js before setting the image source. In this case you can just give the http URI rather than the moz-anno one (eg. http://support.mozilla.org/media/img/favicon.ico ) yourself. This is something which would probably need to be handled for bug 779027 too.
Comment 23 Marco Bonardo [::mak] 2012-12-06 09:21:04 PST
So, how did this code ended up into toolkit/components/places rather than as a separate component or in toolkit/content?

I filed Bug 818936 to find a proper home for this stuff.

Note You need to log in before you can comment on or make changes to this bug.