Last Comment Bug 710259 - New Safari migrator (OS X & Windows)
: New Safari migrator (OS X & Windows)
Status: RESOLVED FIXED
dev-doc-need for new PropertyListUtil...
: dev-doc-needed
Product: Firefox
Classification: Client Software
Component: Migration (show other bugs)
: unspecified
: All All
: P1 normal (vote)
: Firefox 14
Assigned To: Mano (::mano, needinfo? for any questions; not reading general bugmail)
:
Mentors:
: 388053 (view as bug list)
Depends on: 326701 713818 713820 737197 713519 713830 713832 718608 749548
Blocks: 718280 737861 396647 707224 739213
  Show dependency treegraph
 
Reported: 2011-12-13 09:29 PST by Mano (::mano, needinfo? for any questions; not reading general bugmail)
Modified: 2013-12-27 14:28 PST (History)
12 users (show)
mozillamarcia.knous: in‑litmus?
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
required CF stuff + preferences part, mostly (16.04 KB, text/plain)
2011-12-13 09:29 PST, Mano (::mano, needinfo? for any questions; not reading general bugmail)
no flags Details
that, + bookmarks (22.94 KB, text/plain)
2011-12-14 06:45 PST, Mano (::mano, needinfo? for any questions; not reading general bugmail)
no flags Details
very very close (102.81 KB, patch)
2011-12-21 07:44 PST, Mano (::mano, needinfo? for any questions; not reading general bugmail)
no flags Details | Diff | Review
Starting to work on Windows (111.43 KB, patch)
2011-12-23 18:47 PST, Mano (::mano, needinfo? for any questions; not reading general bugmail)
no flags Details | Diff | Review
some cleanup (111.41 KB, patch)
2011-12-24 07:08 PST, Mano (::mano, needinfo? for any questions; not reading general bugmail)
no flags Details | Diff | Review
checkpoint (111.94 KB, patch)
2011-12-27 06:49 PST, Mano (::mano, needinfo? for any questions; not reading general bugmail)
no flags Details | Diff | Review
bug fixes (110.68 KB, patch)
2011-12-28 05:49 PST, Mano (::mano, needinfo? for any questions; not reading general bugmail)
no flags Details | Diff | Review
Dict.jsm changes + test (2.56 KB, patch)
2012-01-02 02:55 PST, Mano (::mano, needinfo? for any questions; not reading general bugmail)
sid.bugzilla: review-
Details | Diff | Review
Property List Utils + tests (45.17 KB, patch)
2012-01-02 02:57 PST, Mano (::mano, needinfo? for any questions; not reading general bugmail)
no flags Details | Diff | Review
Dict changes + tests with comments addressed (4.81 KB, patch)
2012-01-12 10:18 PST, Mano (::mano, needinfo? for any questions; not reading general bugmail)
sid.bugzilla: review+
dtownsend: superreview+
Details | Diff | Review
Property List Utils + tests (41.31 KB, patch)
2012-01-15 04:49 PST, Mano (::mano, needinfo? for any questions; not reading general bugmail)
mak77: review+
dtownsend: superreview+
Details | Diff | Review
part 1, as checked in (39.22 KB, patch)
2012-01-22 13:54 PST, Mano (::mano, needinfo? for any questions; not reading general bugmail)
no flags Details | Diff | Review
Safari migrator (42.85 KB, patch)
2012-03-23 15:50 PDT, Mano (::mano, needinfo? for any questions; not reading general bugmail)
mak77: feedback+
Details | Diff | Review
patch (86.51 KB, patch)
2012-04-09 04:17 PDT, Mano (::mano, needinfo? for any questions; not reading general bugmail)
mak77: review+
Details | Diff | Review
for checkin (88.89 KB, patch)
2012-04-14 12:06 PDT, Mano (::mano, needinfo? for any questions; not reading general bugmail)
no flags Details | Diff | Review

Description Mano (::mano, needinfo? for any questions; not reading general bugmail) 2011-12-13 09:29:15 PST
Created attachment 581302 [details]
required CF stuff + preferences part, mostly

Ben Goodger wrote the Safari profile migrator quite a few years ago. I finished his work in 6/2005. The code has aged quite a bit, so did I.

Anyway, I'm rewriting the migrator in JS+ctypes. As part of the rewrite, I'll also update the migrator a little bit:
1. Use new async places apis.
2. Migrate newer preferences, like its phishing-blocker and window-to-tab behavior.
3. Stop migrating stuff the user cannot undo.
4. More!

A little check point, mostly for myself, is attached. Ignore the file name and its suffix.  It doesn't implement the migration API yet (I'm testing in the Error Console).
Comment 1 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2011-12-14 06:45:42 PST
Created attachment 581627 [details]
that, + bookmarks
Comment 2 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2011-12-21 07:44:03 PST
Created attachment 583492 [details] [diff] [review]
very very close
Comment 3 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2011-12-23 18:47:04 PST
Created attachment 584153 [details] [diff] [review]
Starting to work on Windows
Comment 4 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2011-12-24 02:47:09 PST
Various notes:
(a) Since Safari 3, a "DownloadPath" preference is used instead of "NSDefaultOpenDir". Since we only support 10.5+ (meaning Safari 4+), the migrator should only check for the new preference.
(b) Need to decide what to do about Safari's Top Sites feature.  Some users tend to use them instead of regular bookmarks, and the UI encourages that.  The current migrator will not take care of them at all, because Top Sites are saved in a separate Property List. There are three options:
  (1) Don't import Top Sites at all
  (2) Import them directly under the Bookmarks Menu Folder (or under "From Safari" folder in case of non-initial Import).
  (3) Import them in a "Top Sites" folder under the Bookmarks Menu folder (or under From Safari, as above).

I think (2) is the way to go. (1) is not most users would expect, and (3) leads the user to think that we have a Top Sites feature.

(c/b2) There's a similar problem with Safari's Reading List feature. Two options:
  (1) Don't import them it all.
  (2) Import them as unsorted bookmarks.

(d) While Safari's RSS feature is different from ours (It's a little bit less intuitive but a little bit more useful), we could support it in the migrator.

(e) We can import Spell Check as I type preference (WebContinuousSpellCheckingEnabled->layout.spellcheckDefault).

(f) We can paritally import the selection of items in the Reset Safari dialog. More on that later, maybe.

(h) Similar to that, we can import Clear on Exit preferences.

(i) Not sure about Safari's Enable Java preference.

(j) Fradulent websites preference can be imported.
Comment 5 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2011-12-24 07:08:10 PST
Created attachment 584193 [details] [diff] [review]
some cleanup
Comment 6 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2011-12-24 09:46:21 PST
*** Bug 388053 has been marked as a duplicate of this bug. ***
Comment 7 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2011-12-27 06:49:42 PST
Created attachment 584429 [details] [diff] [review]
checkpoint

Random notes:

1. Migration of all fonts settings and default charset now sort-of-works (much better than in the old migrator, but still sucks for intl).
2. Migrating downloads folder works as expected for special ~/Donwlods folder. The old migrator stopped importing the downloads folder setting back when Safari 3 was released.
3. Migrating spell-checker-enabled pref.
4. Migration of "Remove Download list items when Safari quits" now works with the user-facing prefs (clearOnShutdown.downloads) and not with the old download manager pref.
5. A lot of comments added in this revision.

Remaining:
1. Import Top Sites. This is a must-have given how it's promoted in Safari as a folder for bookmarks (as opposed to Reading List items). I'm still not sure if they should go to the bookmarks menu or to a sub folder (I'm referring to first-time migration).
2. Decide what to do with Reading List items. I think I'll go with Unsorted Bookmarks.
3. Import clear-private-data settings.
4. Import accept-cookies on Windows.

Asking for first-pass, as all of this items are just additions.
Comment 8 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2011-12-28 05:49:20 PST
Created attachment 584556 [details] [diff] [review]
bug fixes
Comment 9 Marco Bonardo [::mak] 2011-12-28 13:12:58 PST
Comment on attachment 584556 [details] [diff] [review]
bug fixes

Review of attachment 584556 [details] [diff] [review]:
-----------------------------------------------------------------

I still have to go through the 2 biggest files (SafariProfileMigrator.js and PropertyListReader.jsm), btw I prefer publishing comments in parts.
Actually, if PropertyListReader.jsm and the Dict.jsm changes would be in separate patches, would even be better!

::: browser/components/migration/content/migration.xul
@@ +68,4 @@
>        <!-- If you are adding a migrator, please add the appropriate
>             hooks to GetDefaultBrowserMigratorKey in
>             browser/components/migration/src/nsProfileMigrator.cpp -->
> +#ifndef XP_GNOME

OT: I honestly dislike this naming, using gnome to mean unix-not-mac is a bit nonsense :)

::: browser/components/migration/src/Makefile.in
@@ +56,4 @@
>  ifeq ($(OS_ARCH)_$(GNU_CXX),WINNT_)
>  CPPSRCS += nsIEProfileMigrator.cpp \
>             $(NULL)
> +endif     

some trailing spaces

::: browser/components/migration/src/MigrationUtils.jsm
@@ +14,5 @@
> + *
> + * The Original Code is Profile Migration Utils
> + *
> + * The Initial Developer of the Original Code is
> + * Mozilla Foundation.

nit: should be "the Mozilla Foundation"

@@ +41,5 @@
> +const Cu = Components.utils;
> +
> +Cu.import("resource://gre/modules/Services.jsm");
> +Cu.import("resource://gre/modules/FileUtils.jsm");
> +Cu.import("resource://gre/modules/PlacesUtils.jsm");

nit: While most likely doesn't matter, I'd probably use defineLazyModuleGetter on PlacesUtils

@@ +47,5 @@
> +const MIGRATION_BUNDLE = "chrome://browser/locale/migration/migration.properties";
> +
> +EXPORTED_SYMBOLS = ["MigrationUtils"];
> +
> +let MigrationUtils = {

global comment valid for all methods in this module: some documenting javadocs on methods please, since this should be used by others.

@@ +50,5 @@
> +
> +let MigrationUtils = {
> +  migrateItems: function(aMigrationObjects, aWantedItemsBitField, aReplace) {
> +    function notify(aMsg, aData)
> +      Services.obs.notifyObservers(null, "Migration:" + aMsg, aData);

I'd honestly prefer if we'd have a global scoped
let MESSAGE = {
  BEFORE_MIGRATE: "Migration:ItemBeforeMigrate",
  AFTER_MIGRATE: "Migration:ItemAfterMigrate",
  ERROR: ... and so on, it's simpler than having to build strings mentally when reading code, imo.
}

@@ +59,5 @@
> +    // stands for "all items".
> +    let pendingItems = wantedItems.length;
> +
> +    notify("Started");
> +    wantedItems.forEach(function(itemType) {

since we are tracking 2 nested countdowns and that's a bit confusing, maybe we could better distinguish the two using Array.shift() and while (Array.length) for the external wantedItems loop?
then you could just notify Ended after the while loop.

@@ +61,5 @@
> +
> +    notify("Started");
> +    wantedItems.forEach(function(itemType) {
> +      let itemMigObjs = [migObj for each (migObj in aMigrationObjects)
> +                         if (migObj.type == itemType && migObj.exists)];

this is practically filtering the available objects against the requested ones, right?
Wouldn't be simpler to use Array.filter and then just walk the remaining array?
Something like:
let migrators = aMigrationObjects.filter(function(obj) {return wantedItems.indexOf(Obj.type) != -1 && obj.exists;});
while (migrators.length) {
  let migrator = migrators.shift();
  ...

@@ +87,5 @@
> +  },
> +
> +  getMigrateData: function(aMigrationObjects)
> +    [migObj.type for each (migObj in aMigrationObjects)
> +     if (migObj.exists)].reduce(function(a, b) a |= b),

nit: fwiw, since aMigrationObjects is already an array (I find comprehension more useful to make arrays out of objects), you may even just
  aMigrationObjects.filter(function (obj) obj.exists)
                   .reduce(function(a, b) a |= b);

@@ +94,5 @@
> +    const BPM = Ci.nsIBrowserProfileMigrator;
> +    let all = [BPM.SETTINGS, BPM.COOKIES, BPM.HISTORY, BPM.FORMDATA,
> +               BPM.FORMDATA, BPM.PASSWORDS, BPM.BOOKMARKS, BPM.OTHERDATA];
> +    if (aBitField == BPM.ALL)
> +      return all;

I suppose in future we'll want to expose constants from the module (if we deCOM), maybe we could already provide shortcuts that map to the idl values?

@@ +106,5 @@
> +   let sourceName = bundle.GetStringFromName("sourceName" + aSourceNameStr);
> +   let folderName = bundle.formatStringFromName("importedBookmarksFolder",
> +                                                [sourceName], 1);
> +   let bms = PlacesUtils.bookmarks;
> +   return bms.createFolder(aParentId, folderName, bms.DEFAULT_INDEX);

may we avoid shortcutting PlacesUtils.bookmarks here? I don't see a compelling reason to do that

::: toolkit/content/Dict.jsm
@@ +116,5 @@
>    /**
> +   * Sets a lazy getter function for a key's value. If the key is a not a string,
> +   * it will be converted to a string before the set happens.
> +   */
> +  setWithLazyGetter: function Dict_setWithLazyGetter(aKey, aGetterFunction) {

imo s/setWith/setAs/
Btw, whoever worked on this module should probably co-review these changes.

@@ +123,5 @@
> +    if (!items.hasOwnProperty(prop))
> +      this._state.count++;
> +
> +    Object.defineProperty(items, prop, {
> +      get: function() {

label this anon function, somehow

@@ +125,5 @@
> +
> +    Object.defineProperty(items, prop, {
> +      get: function() {
> +        delete items[prop];
> +        return items[prop] = aGetterFunction();

should we apply a scope, like Object, here? (similarly to defineLazyGetter)
Comment 10 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2011-12-29 01:29:02 PST
Thanks for reviewing this part. I'll work on your comments and post separate patches as soon as I can.

Couple of disagreements though:
1. I really do like and prefer the Array comprehension syntax, a lot.  If it's just a matter of personal taste, I prefer to leave it the way it is.
2. :

@@ +125,5 @@
> +
> +    Object.defineProperty(items, prop, {
> +      get: function() {
> +        delete items[prop];
> +        return items[prop] = aGetterFunction();

should we apply a scope, like Object, here? (similarly to defineLazyGetter)

Now that we have function.bind in place, I prefer encouraging callers to adopt it.
Comment 11 Marco Bonardo [::mak] 2011-12-29 05:00:58 PST
(In reply to Mano from comment #10)
> 1. I really do like and prefer the Array comprehension syntax, a lot.  If
> it's just a matter of personal taste, I prefer to leave it the way it is.

Oh I like that syntax as well, but I usually compare the usual syntax and the array comprehension and try to take the clearer one. In some case the comprehension is really good, in some case it obfuscate a really simple filtering. Btw that was a nit.

> should we apply a scope, like Object, here? (similarly to defineLazyGetter)
> 
> Now that we have function.bind in place, I prefer encouraging callers to
> adopt it.

Sure, but if they don't, should we give them fallback?
Comment 12 Marco Bonardo [::mak] 2011-12-29 07:51:50 PST
Comment on attachment 584556 [details] [diff] [review]
bug fixes

Review of attachment 584556 [details] [diff] [review]:
-----------------------------------------------------------------

This is about the global plist module structure. Did not yet go through the (binary/xml) parsing code.

::: toolkit/content/PropertyListReader.jsm
@@ +14,5 @@
> + *
> + * The Original Code is Property List Reader.
> + *
> + * The Initial Developer of the Original Code is
> + * Mozilla Foundation.

nit: "the Mozilla Foundation"

@@ +35,5 @@
> + * the terms of any one of the MPL, the GPL or the LGPL.
> + *
> + * ***** END LICENSE BLOCK ***** */
> +
> +EXPORTED_SYMBOLS = ["ReadPropertyList", "ReadPropertyListFile"];

It's not mandatory to export an object named as the module, though imo would be nicer to have this just export a PropertyListReader object with fromFile, fromArrayBuffer and maybe getArrayBufferFormat() and getFileFormat() helpers returning PropertyListReader.TYPE_BINARY or PropertyListReader.TYPE_XML to detect the format. Otherwise, adding functionality in future, will have to keep exporting more symbols, with additional risk of naming conflicts.

@@ +67,5 @@
> +Cu.import("resource://gre/modules/Services.jsm");
> +Cu.import("resource://gre/modules/Dict.jsm");
> +
> +// For working with 64-bit integers.
> +Cu.import("resource://gre/modules/ctypes.jsm");

both Dict and ctypes may be lazyModuleGetter since they may not yet be loaded and are not needed immediately.
On a side note, there is a bug in lazyModuleGetter, where it only exports a single symbol, instead of all the symbols exported by the module, I should file a bug about that, soon or later, dunno if it would hurt you here. It should probably also inject any other enumberable property in the temp scope to the provided Object.

@@ +76,5 @@
> + * @param aFile
> + *        The file from which to read the property list (nsILocalFile).
> + * @param aCallback(aPropertyListRoot)
> + *        callback function to be called when the property is read or in case of an
> + *        error.  If the property list was read successfully, aPropertyListRoot

should clarify somewhere how the caller identifies errors, just by checking if the callback argument is null?

@@ +78,5 @@
> + * @param aCallback(aPropertyListRoot)
> + *        callback function to be called when the property is read or in case of an
> + *        error.  If the property list was read successfully, aPropertyListRoot
> + *        is set to the root object of the property list.  Usually, it would be
> + *        either a dictionary object (see Dict.jsm) or an array.

"when the property is read" sounds like the callback is invoked once for each property, while it's not.

"Usually, it would be either a dictionary object (see Dict.jsm) or an array." should clarify when either of them happen.  Should the caller always check it or is this predictable based on input or on properties of the file? Is is possible to unify both to a single kind of output, so there is not that feeling that "this will return something, up to you figure out what".

@@ +81,5 @@
> + *        is set to the root object of the property list.  Usually, it would be
> + *        either a dictionary object (see Dict.jsm) or an array.
> + */
> +function ReadPropertyListFile(aFile, aCallback) {
> +  if (!aFile.exists()) {

may be worth to also check aFile instanceof Ci.nsILocalFile?

@@ +117,5 @@
> + * @returns the root object of the property list.  Usually, it would be either
> + *          a dictionary object (see Dict.jsm) or an array.
> + * @throws if the buffer could not be read as a property list.
> + */
> +function ReadPropertyList(aBuffer) {

I find the differences between the two exposed methods puzzling. one gets a callback and never throws, the other one returns synchronously and throws, though the only apparent difference from the outside is the input. Would be healthier to have some coherence in input/output/exceptions handling in exposed stuff.
I'd probably have both get a callback and make this fake-async for now. Who knows, in future the parsing may be effectively done in a worker.
Comment 13 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2011-12-29 10:54:42 PST
(In reply to Marco Bonardo [:mak] from comment #12)

> It's not mandatory to export an object named as the module, though imo would
> be nicer to have this just export a PropertyListReader object with fromFile,
> fromArrayBuffer and maybe getArrayBufferFormat() and getFileFormat() helpers
> returning PropertyListReader.TYPE_BINARY or PropertyListReader.TYPE_XML to
> detect the format. Otherwise, adding functionality in future, will have to
> keep exporting more symbols, with additional risk of naming conflicts.
>

1. Scoping the methods in an object - sure.
2. What's the use case for getFileFormat? There's no difference in the JS representation of the property list.

> @@ +67,5 @@
> > +Cu.import("resource://gre/modules/Services.jsm");
> > +Cu.import("resource://gre/modules/Dict.jsm");
> > +
> > +// For working with 64-bit integers.
> > +Cu.import("resource://gre/modules/ctypes.jsm");
> 
> both Dict and ctypes may be lazyModuleGetter since they may not yet be
> loaded and are not needed immediately.

Will try.

> @@ +76,5 @@
> > + * @param aFile
> > + *        The file from which to read the property list (nsILocalFile).
> > + * @param aCallback(aPropertyListRoot)
> > + *        callback function to be called when the property is read or in case of an
> > + *        error.  If the property list was read successfully, aPropertyListRoot
> 
> should clarify somewhere how the caller identifies errors, just by checking
> if the callback argument is null?

I think null argument is enough.  We could pass the exception in as a second argument, but again, I cannot see a use case. For the sake of debugging, the exceptions are reported to the error console.

> "Usually, it would be either a dictionary object (see Dict.jsm) or an
> array." should clarify when either of them happen.  Should the caller always
> check it or is this predictable based on input or on properties of the file?
> Is is possible to unify both to a single kind of output, so there is not
> that feeling that "this will return something, up to you figure out what".

Well, I'm not sure what you mean here. As noted in the header of this file, all plist objects (number/boolean/data/date/string/array/dictionary) are represented, in this implementation, by normal js objects, with the exception of NSDictionary, for which we use Dict.jsm.  All property lists have a root object, which, in theory, could be of any of these types. However, in practice, only dict and array are used (btw, the XCode editor can edit array & dict rooted plists, but only create dict plists).

So, yes, the caller should either know what's the expected structure of the plist in question, or check it manually, not just for the root object, but for any object in this list (a key in dictionary could point to another dictionary, for example).

Now, we could make type-checking a little bit easier, and I considered doing so when I wrote this reader.
My idea was to use primitive-wrappers when possible and set the property-list-type as a property on the wrapper (or on the object i, namely:

new Boolean ...  .propertyListType = "bool" (for boolean);
new Number                         = "number" (for number);
new String                         = "string"
null (no way to wrap that nicely, afaict)
new Array                          = "array"
new Dict                           = "dictionary"
new ArrayBuffer                    = "data"
new Date                           = "date"

This may save Array.isArray + Dict.prototype.isPrototypeOf + ArrayBuffer.prototype.isPrototypeOf checks (the other types are just primitives), but I think it's an overkill. Bear in mind most of the time the structure of the property list is knows, thus usually only one check is needed, for example

if (!Array.isArray(plistObject))
  throw "Unexpected plist structure";

Another options is to add an helper to the module that returns the property list type for a given object or primitive.
Comment 14 Marco Bonardo [::mak] 2011-12-29 13:08:35 PST
(In reply to Mano from comment #13)
> 2. What's the use case for getFileFormat? There's no difference in the JS
> representation of the property list.

no idea, I was mostly thinking of future utils since this is a generic reader, no need to add those till they are needed.

> > should clarify somewhere how the caller identifies errors, just by checking
> > if the callback argument is null?
> 
> I think null argument is enough.  We could pass the exception in as a second
> argument, but again, I cannot see a use case. For the sake of debugging, the
> exceptions are reported to the error console.

ok, just clarify in the method comment.

> All property lists
> have a root object, which, in theory, could be of any of these types.
> However, in practice, only dict and array are used (btw, the XCode editor
> can edit array & dict rooted plists, but only create dict plists).

Ah, I see, I was confused a bit by the comment, so adding some of this explanation to it would help. It's unfortunate there isn't a single type for the root, but looks like it's the format itself dictating that.

> Another options is to add an helper to the module that returns the property
> list type for a given object or primitive.

this sounds interesting, rather than strings you may use constants exposed by the module object. Though actually the problematic cases are mostly ArrayBuffer and Dict since the other checks are common js checks. Maybe you could just Expose isArrayBuffer() and isDict() helpers.
Comment 15 Marco Bonardo [::mak] 2011-12-29 16:09:23 PST
Comment on attachment 584556 [details] [diff] [review]
bug fixes

Review of attachment 584556 [details] [diff] [review]:
-----------------------------------------------------------------

some comments on the parsing

::: toolkit/content/PropertyListReader.jsm
@@ +137,5 @@
> +// Here's the base structure of binary plists:
> +// * First four bytes - the magic number ("bplist")
> +// * Last 32 bytes - trailer with info about the offset of the offset table,
> +//   the size of each int the offset tabel, the number of objects, and the
> +//   root-object index in the offset table.

typo: "the size of each int the offset tabel" (both the nonsense phrase, and tabel)

@@ +141,5 @@
> +//   root-object index in the offset table.
> +// * Offset Table - the offset of each object. The integer size of the
> +//   references is specified in the trailer.
> +// * Objects - Each object starts at a offset pointed in the offset table.
> +//   See _readObject for how vairous types of objects are constructed.

typo: vairous

Btw, I think the comment should better separate the structure parts

[ Header (8 bytes) ]
* 6 bytes - "bplist" marker
* 2 bytes - format version, as 0 followed by any char. 00 is the only supported version.

[ Objects Table ]
Variable sized objects, see _redObject for how various types of objects are constructed.

[ Offsets Table ]
The offset of each object. The integer size is specified in the trailer.

[ Trailer (32 bytes) ]
* 5 bytes - unused, 0-filled
* 1 byte - unused, sort version
* 1 byte - size of integers in the offsets table
* 1 byte - size of references for arrays and dicts
* 8 bytes - number of offsets in the offsets table
* 8 bytes - index of the root object's offset in the offsets table
* 8 bytes - offset of the offsets table

@@ +145,5 @@
> +//   See _readObject for how vairous types of objects are constructed.
> +function BinaryFormatPropertyList(aBuffer) {
> +  this._buffer = aBuffer;
> +
> +  let jsMaxInt = Math.pow(2,53);

nit: this may be a const

@@ +156,5 @@
> +
> +  this._objects = [];
> +}
> +
> +BinaryFormatPropertyList.prototype = {

Give names to each function, many are lacking one.

Add javadocs to functions whose name is not self documenting

@@ +158,5 @@
> +}
> +
> +BinaryFormatPropertyList.prototype = {
> +  /* "static" */
> +  canProcess: function BinaryFormatPropertyList_canProcess(aBuffer) {

Mixing up js and c concepts doesn't look that good :)
maybe you could add this as an helper to the global PropertyListReader object... well, this may actually be the getArrayBufferFormat() helper I suggested in previous review step :)

@@ +161,5 @@
> +  /* "static" */
> +  canProcess: function BinaryFormatPropertyList_canProcess(aBuffer) {
> +    let magicNumber = [String.fromCharCode(c) for each (c in Uint8Array(aBuffer, 0, 8))].
> +                      join("");
> +    return magicNumber == "bplist00";

note that leopard also accept bplist01, and snow leopard accepts /bplist0.+/. even if I have not found any good description of the differences, may be just they decided to keep the format backwards compatible.

@@ +168,5 @@
> +  get root() this._readObject(this._rootObjectIndex),
> +
> +  _readTrailer: function BinaryFormatPropertyList__readTrailer() {
> +    // The trailer is the last 32 bytes of the stream, but the first 6 bytes are
> +    // unused.

just "// Skip the first 6 bytes, since they are unused", all the rest is already documented above

@@ +174,5 @@
> +    [this._offsetTableIntegerSize, this._objectRefSize] =
> +      this._readUnsignedInts(trailerOffset, 1, 2);
> +
> +    [this._numberOfObjects, this._rootObjectIndex, this._offsetTableOffset] =
> +      this._readUnsignedInts(trailerOffset + 2, 8, 3);

According to the plist documentation, all 8 or 16 bytes integers have to be considered signed, while all 1,2,4 bytes ones are always unsigned, so readUnsignedInts with size 8 sounds a bit strange. Maybe should be just _readInts().
The source doesn't seem to distinguish for integers used in objects or in the file header/trailer/descriptors.

@@ +187,5 @@
> +  // @@mano: remove when DataView is landed.
> +  _swapForBigEndian: function(aByteOffset, aIntSize, aNumberOfInts) {
> +    let bytesCount = aIntSize * aNumberOfInts;
> +    let bytes = new Uint8Array(this._buffer, aByteOffset, bytesCount);
> +    let swapedBytes = new Uint8Array(bytesCount);

typo: I think should be "swapped", not "swaped"

@@ +190,5 @@
> +    let bytes = new Uint8Array(this._buffer, aByteOffset, bytesCount);
> +    let swapedBytes = new Uint8Array(bytesCount);
> +    for (let i = 0; i < aNumberOfInts; i++) {
> +      for (let j = 0; j < aIntSize; j++) {
> +        swapedBytes[i * aIntSize + j] = bytes[i * aIntSize +  aIntSize - 1 - j];

while operators precedence is clear, having some explicit parenthesis may be less error prone

@@ +197,5 @@
> +
> +    return swapedBytes;
> +  },
> +
> +  _readSignedInt64: function(aByteOffset) {

As I said, I think there is no doubt all int64 values are signed, so I'm not sure if there's really the need to differentiate from the other readUnsignedInts method... do you have evidence this is false?
I think we can have a single method to rule them all :)

@@ +198,5 @@
> +    return swapedBytes;
> +  },
> +
> +  _readSignedInt64: function(aByteOffset) {
> +    let swapedBytes = this._swapForBigEndian(aByteOffset, 8, 1);

ditto: swapped (search & replace, so I don't have to further comment on this!)

@@ +210,5 @@
> +    return parseInt(int64.toString(), 10);
> +  },
> +
> +  _readReal: function(aByteOffset, aRealSize) {
> +    let invertedBytes = this._swapForBigEndian(aByteOffset, aRealSize, 1);

nit: be consistend using either "inverted" or "swapped" prefix

@@ +222,5 @@
> +
> +  _readDate: function(aByteOffset) {
> +    const NSDATE_REFERENCE_DATE = Date.parse("1 January 2001, GMT");
> +    let secondsSinceReferenceDate = this._readReal(aByteOffset, 8);
> +    return new Date(NSDATE_REFERENCE_DATE + secondsSinceReferenceDate * 1000);

I think you may do:
let date = new Date("1 January 2001, GMT");
date.setSeconds(this._readReal(aByteOffset, 8));
return date;

@@ +225,5 @@
> +    let secondsSinceReferenceDate = this._readReal(aByteOffset, 8);
> +    return new Date(NSDATE_REFERENCE_DATE + secondsSinceReferenceDate * 1000);
> +  },
> +
> +  _createString: function(aByteOffset, aNumberOfChars, aUnicode) {

why is this named _createString rather than _readString to cope with the other readers?

@@ +226,5 @@
> +    return new Date(NSDATE_REFERENCE_DATE + secondsSinceReferenceDate * 1000);
> +  },
> +
> +  _createString: function(aByteOffset, aNumberOfChars, aUnicode) {
> +    // Note about about unicode surrogate pairs: all unicode chars

typo: "about about"

@@ +269,5 @@
> +          if (ctypes.UInt64.compare(uint64, this._JS_MAX_INT_UNSIGNED) == 1) {
> +            if (aCanBeString)
> +              intsArray.push(uint64.toString());
> +            else
> +              throw "Integer too big to be read as float 64";

throw new Error("Integer too big to be read as float 64") should give better errors in the console (replace everywhere is worth it)

@@ +278,5 @@
> +        }
> +        return intsArray;
> +      }
> +      else
> +        throw "Unsupported size: " + aIntSize;

if one side of an if/else is braced, brace all of them

@@ +283,5 @@
> +    }
> +  },
> +
> +  _readUnsignedInt: function(aByteOffset, aIntSize, aCanBeString)
> +    this._readUnsignedInts(aByteOffset, aIntSize, 1, aCanBeString)[0],

I find this confusing, since the name differs only for the plural, and arguments position differs (and both have an optional arg), could you just use the plural version everywhere?

@@ +285,5 @@
> +
> +  _readUnsignedInt: function(aByteOffset, aIntSize, aCanBeString)
> +    this._readUnsignedInts(aByteOffset, aIntSize, 1, aCanBeString)[0],
> +
> +  _readDataLengthAndOffset: function(aObjectOffset) {

everywhere you pass offset and length, in that order, so you may want to _readOffsetAndLength, for coherence

@@ +286,5 @@
> +  _readUnsignedInt: function(aByteOffset, aIntSize, aCanBeString)
> +    this._readUnsignedInts(aByteOffset, aIntSize, 1, aCanBeString)[0],
> +
> +  _readDataLengthAndOffset: function(aObjectOffset) {
> +    let objectTypeAndMaybeLengthByte = this._readUnsignedInt(aObjectOffset, 1);

The documentation about what this means is spread below inside the _readObject, should be here instead, otherwise this clownshoe variable name is surprising :)
btw, I'd prefer an objectDescriptor var name here

@@ +288,5 @@
> +
> +  _readDataLengthAndOffset: function(aObjectOffset) {
> +    let objectTypeAndMaybeLengthByte = this._readUnsignedInt(aObjectOffset, 1);
> +    let maybeLength = (objectTypeAndMaybeLengthByte & 0x0F);
> +    if (maybeLength != 15) // 1111

parseInt("1111", 2) would be self-documenting

@@ +292,5 @@
> +    if (maybeLength != 15) // 1111
> +      return [maybeLength, aObjectOffset + 1];
> +
> +    let [objTypeBits, intSizeInfo] = this._splitByte(aObjectOffset + 1);
> +    let intSize = Math.pow(2, intSizeInfo);

this wants comments

@@ +324,5 @@
> +  },
> +
> +  // A NSDictionary in the binary format is stored as a list of references to
> +  // key-objects, followed by a list of references to the value-objecte for
> +  // those keys. The size of each list is aNumberOfObjcts * this._objectRefSize.

typos: "value-objecte" and "aNumberOfObjcts"

make this a proper javadoc please

@@ +325,5 @@
> +
> +  // A NSDictionary in the binary format is stored as a list of references to
> +  // key-objects, followed by a list of references to the value-objecte for
> +  // those keys. The size of each list is aNumberOfObjcts * this._objectRefSize.
> +  _wrapDictionary: function(aOffset, aNumberOfObjcts) {

typo: aNumberOfObjcts

@@ +329,5 @@
> +  _wrapDictionary: function(aOffset, aNumberOfObjcts) {
> +    if (aNumberOfObjcts == 0)
> +      return new Dict();
> +
> +    let dict = new Dict();

well, at this point put let dict before the if and return it?

@@ +346,5 @@
> +  },
> +
> +  // Reads and object
> +  // aObjectIndex - the index of the object in the offset table.
> +  _readObject: function(aObjectIndex) {

proper javadoc please

@@ +348,5 @@
> +  // Reads and object
> +  // aObjectIndex - the index of the object in the offset table.
> +  _readObject: function(aObjectIndex) {
> +    if (this._objects[aObjectIndex] !== undefined)
> +      return this._objects[aObjectIndex];

add comment

@@ +352,5 @@
> +      return this._objects[aObjectIndex];
> +
> +    let objOffset = this._offsetTable[aObjectIndex];
> +
> +    // The first four bits hold the object type (0-15. 7, 9, 11, 14, 15 unused).

looks like lottery numbers! this can go away, object types should be put into a global fake-enum, or into global object conts, so here we don't have case 0, but case TYPE_BOOL or case TYPES.BOOL or similar.

@@ +353,5 @@
> +
> +    let objOffset = this._offsetTable[aObjectIndex];
> +
> +    // The first four bits hold the object type (0-15. 7, 9, 11, 14, 15 unused).
> +    // For some types, additional info is held in the other 4 bytes.

I suppose you mean 4 bits, not bytes.

@@ +371,5 @@
> +          default:
> +            throw "Unexpected value!";
> +        }
> +      }
> +      // 0001 nnnn: integer (int size=2^nnnn)

nit: add newline before each type comment to give some separation

@@ +376,5 @@
> +      case 1: {
> +        let intSize = Math.pow(2, objInfo);
> +
> +        // For obejcts, 64-bit integers are always signed.  Negative integers
> +        // are always represented by a 64-bit integer.

typo: obejcts

btw, I'm not sure this is only valid for objects, I think it's a rule for the entire file contents

@@ +398,5 @@
> +        return this._objects[aObjectIndex] = this._readDate(objOffset + 1);
> +      }
> +      // 0100 nnnn/1111: Data (nnnn=could be either the data length in bytes,
> +      // or 1111, in which case the next byte  consists of the int-type of the,
> +      // followed by the data-length.

typo: "of the,"

@@ +417,5 @@
> +        return this._objects[aObjectIndex] =
> +          this._createString(offset, unicharsCount, true);
> +      }
> +      // 0111 - unsued
> +      // case 7:

you may just leave unused types out

@@ +426,5 @@
> +      // 1001 - unused
> +      // case 9:
> +      // 1010 - array, 1100 - set (The format is the same).
> +      case 10:
> +      case 12: {

well, 12 is Set, may be worth a comment, then the fallback is fine

@@ +454,5 @@
> +  let bytes = [b for each (b in bytesView)];
> +  let doc = domParser.parseFromBuffer(bytes, bytes.length, "application/xml");
> +
> +  let docElt = doc.documentElement;
> +  if (docElt .localName != "plist" || !docElt.firstElementChild)

extraneous whitespace after docElt
Comment 16 Marco Bonardo [::mak] 2011-12-30 04:47:40 PST
per mail discussion, feel free to ignore the request to unify the 2 integers readers.
Comment 17 Marco Bonardo [::mak] 2011-12-30 05:11:38 PST
Comment on attachment 584556 [details] [diff] [review]
bug fixes

I think there are enough comments for now, in the next version I'll look directly into the migrator, since I think you said it's the less complete code change here?
Comment 18 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2011-12-30 06:28:59 PST
Regarding unsigned/signed issue: I talked with Marco, and we figured out the current approach does the right thing.  However, I'll do away with the readUnsignedInt alias.
Comment 19 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-01-02 02:55:40 PST
Created attachment 585267 [details] [diff] [review]
Dict.jsm changes + test
Comment 20 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-01-02 02:57:26 PST
Created attachment 585268 [details] [diff] [review]
Property List Utils + tests

I still need to check for typos, and add checks for dictionary lazy getters.
Comment 21 Siddharth Agarwal [:sid0] (inactive) 2012-01-02 04:10:38 PST
Comment on attachment 585267 [details] [diff] [review]
Dict.jsm changes + test

I appreciate the abstraction, but I presume that we might eventually want to move to Harmony maps [1] -- and because those maps won't directly support lazy getters, will you still want Dict.jsm to be around after that, perhaps as a wrapper around Harmony maps then? Otherwise, do you have another way to do this that doesn't involve lazy getters?

[1] http://wiki.ecmascript.org/doku.php?id=harmony:simple_maps_and_sets
Comment 22 Marco Bonardo [::mak] 2012-01-03 08:45:59 PST
Comment on attachment 585268 [details] [diff] [review]
Property List Utils + tests

Review of attachment 585268 [details] [diff] [review]:
-----------------------------------------------------------------

still something to fix, but looks much much better. actually clearing since you told me you are addressing Dict requirements.

::: toolkit/content/PropertyListUtils.jsm
@@ +63,5 @@
> + *
> + * -------------
> + * 1) Property lists supports storing U/Int64 numbers, while JS can only handle
> + *    numbers that are in this limits of float-64 (±2^53).  For numbers that
> + *    do not outbound this limits, simple primivtive numebr are always used.

typo: primivtive numebr

Btw, may be better to have "String or Number" in the table

@@ +67,5 @@
> + *    do not outbound this limits, simple primivtive numebr are always used.
> + *    Otherwise, a String object.
> + * 2) About NSNull and NSSet values: While the xml format has no support for
> + *    representing null and set values, the documentation for the binary format
> + *    states that is supports stroing both types.  However, the Cocoa APIs for

typo: stroing

@@ +68,5 @@
> + *    Otherwise, a String object.
> + * 2) About NSNull and NSSet values: While the xml format has no support for
> + *    representing null and set values, the documentation for the binary format
> + *    states that is supports stroing both types.  However, the Cocoa APIs for
> + *    serializing proeprty lists do not seem to support either types (test with

typo: proeprty

@@ +76,5 @@
> + *    As for usage within OS X, not surprisingly there's no known usage of
> + *    storing either of these types in a property list.  It seems that, for now,
> + *    Apple is keeping the features of binary and xml formats in sync, probably as
> + *    long as the XML format is not officially deprecated.
> + * 3) About NSSet representaion: For the time being, we represent those

typo: representaion

@@ +87,5 @@
> +"use strict";
> +
> +let EXPORTED_SYMBOLS = ["PropertyListUtils"];
> +
> +const { classes: Cc, interfaces: Ci, utils: Cu } = Components;

Afaik, we prefer avoiding this style in our code, since it has been reported as being more expensive than the usual consts.

@@ +92,5 @@
> +
> +Cu.import("resource://gre/modules/Dict.jsm");
> +
> +// For working with 64-bit integers.
> +Cu.import("resource://gre/modules/ctypes.jsm");

defineLazyModuleGetter both or any of these? shouldn't be immediately needed.

@@ +104,5 @@
> +   * @param aCallback(aPropertyListRoot)
> +   *        @see fromArrayBuffer
> +   *
> +   * @note aCallback may be called synchronously.  Do not rely on exceptions
> +   * being swallowed.

I wonder if we should force the callback to always be async (even by just Services.tm.mainThread.dispatch), having mixed behavior may create funny random failures.

@@ +114,5 @@
> +      throw new Error("Invalid value for aCallback");
> +
> +    let file = aFile;
> +    if (aFile instanceof Ci.nsILocalFile) {
> +      if (!aFile.exists()) {

since you assigned to file, you may well use file here.

@@ +151,5 @@
> +   *        If it's not read successfully, aPropertyListRoot is set to null.
> +   *        The reaon for failure is reported to the Error Console.
> +   *
> +   * @note For the time being, this methods works synchronously, but you
> +   * should not rely on this beahavior.

And this increases my will to make the callback fake-async, so people won't rely on it being synchronous.
Honestly we were not really good at reading documentation notes in the past :)

@@ +165,5 @@
> +      if (BinaryPropertyListReader.prototype.canProcess(aBuffer)) {
> +        aCallback(new BinaryPropertyListReader(aBuffer).root)
> +      }
> +      else {
> +        // Convert the buffer into a XML tree.

nit: my sucky english knowledge suggests should be "an XML"

@@ +169,5 @@
> +        // Convert the buffer into a XML tree.
> +        let domParser = Cc["@mozilla.org/xmlextras/domparser;1"].
> +                        createInstance(Ci.nsIDOMParser);
> +
> +        // XXX bug Bug 714589: parseFromBuffer requires a normal js array.

typo: bug bug, btw s/XXX bug Bug/TODO bug/

@@ +191,5 @@
> +    }
> +  },
> +
> +  /**
> +   * Reads a dom document as a property list.

uppercase dom

@@ +196,5 @@
> +   *
> +   * @param aDOMDoc
> +   *        a DOM document to be read as a property list.
> +   * @returns the property list root object.  Use getPropertyListObjectType
> +   * to detect its type.

@return (no s)

@@ +199,5 @@
> +   * @returns the property list root object.  Use getPropertyListObjectType
> +   * to detect its type.
> +   * @throws if the aDOMDoc could not be read as a property list.
> +   */
> +  fromXML: function(aDOMDoc) new XMLPropertyListReader(aDOMDoc).root,

As I said previously, we should be consistent even if this is not a proper idl, if you want to expose fromXML as another form of input (apart from fromFile and fromArrayBuffer) it should have a similar signature, with a callback and the same behavior.

@@ +251,5 @@
> +   * Wraps a 64-bit stored in the form of a string primitive as a String object,
> +   * which we can later distiguish from regular string values.
> +   * @param aNumberStr
> +   *        a number in the form of a primitive string.
> +   * @returns a String wrapper around aNumberStr that can later be identified

@return (no s)

@@ +252,5 @@
> +   * which we can later distiguish from regular string values.
> +   * @param aNumberStr
> +   *        a number in the form of a primitive string.
> +   * @returns a String wrapper around aNumberStr that can later be identified
> +   * as holding 64-bit number.

... using getObjectType

@@ +314,5 @@
> +
> +BinaryPropertyListReader.prototype = {
> +  /**
> +   * Checks if the given buffer can be read as a binary property list.
> +   * It can be called on the prototype.

specify which format aBuffer is expected.

@@ +338,5 @@
> +                                               this._offsetTableIntegerSize,
> +                                               this._numberOfObjects);
> +  },
> +
> +  // @@mano: remove when DataView is landed.

make this a proper "// TODO bug 123456: whatever" comment

@@ +402,5 @@
> +    LENGTH_INT_SIZE_FOLLOWS: parseInt("1111", 2)
> +  },
> +
> +  /**
> +   * Returns an object descriptor in the form of two intergers: object type and

typo: intergers

@@ +406,5 @@
> +   * Returns an object descriptor in the form of two intergers: object type and
> +   * additional info.
> +   * @param aByteOffset
> +   *        the descriptor's offset.
> +   * @returns [objType, additionalInfo] - the object type and additional info.

@return (no s), please just search&replace so I won't comment further on it.

Add a @see OBJECT_TYPE_BITS and ADDITIONAL_INFO_BITS

@@ +408,5 @@
> +   * @param aByteOffset
> +   *        the descriptor's offset.
> +   * @returns [objType, additionalInfo] - the object type and additional info.
> +   */
> +  _readObjectDescriptor: function BPLR__readReal(aByteOffset) {

the label and the name don't match

@@ +416,5 @@
> +    return [(byte & 0xF0) >> 4, byte & 0x0F];
> +  },
> +
> +  _readDate: function BPLR__readDate(aByteOffset) {
> +    // That's the reference data for NSDate.

s/data/date/?

@@ +425,5 @@
> +
> +  _readString:
> +  function BPLR__readString(aByteOffset, aNumberOfChars, aUnicode) {
> +    // Note about about unicode surrogate pairs: all unicode chars here are
> +    // readas 2 byte integers, including unicode surrogate pairs, meaning they

typo: readas

btw, this may be made into a proper javadoc

@@ +435,5 @@
> +  },
> +
> +  /**
> +   * Reads array of unsigned ints in the buffer which are stored in big endian
> +   * form (if aIntSize > 1).

nit: Add a period after "in the buffer" and make the second part a proper separate phrase like "Integers larger than 1 byte are stored in big endian form."

@@ +446,5 @@
> +   *        The number of ints in the array.
> +   * @param [optional] aBigIntAllowed (default: false)
> +   *        Whether or not an integer in the array, that its value outbounds
> +   *        the limits of js numbers, can be returned as a string or as a
> +   *        ctypes object.

Honestly I can't read this comment...

@@ +448,5 @@
> +   *        Whether or not an integer in the array, that its value outbounds
> +   *        the limits of js numbers, can be returned as a string or as a
> +   *        ctypes object.
> +   * @returns array of integers.  If aIntSize is 8 and aCanBeString is set to
> +   * true, values may be returned in the form of a string.

aCanBeString doesn't exist anymore

@@ +461,5 @@
> +    // There are two reasons for the complexity you see here:
> +    // (1) 64-bit integers - For which we use ctypes. When possible, the
> +    //     number is converted back to js's default float 64 type.
> +    // (2) The DataView object for ArrayBuffer, which takes care of swaping
> +    //     bytes, is not yet implemented.

is there a bug for (2) we can report in the comment?

@@ +492,5 @@
> +  /**
> +   * Reads the data length of an object in the buffer (length means the number
> +   * of objects in the number of elements in the object, which is not always
> +   * the number of bytes) and the offset in the buffer at which the object's
> +   * data starts.

the part between parenthesis is confusing

@@ +499,5 @@
> +   * @returns [dataLength, dataOffset] - the data length, as explained above,
> +   * and the offset in the buffer at which the data starts.
> +   */
> +  _readDataLengthAndOffset:
> +  function BPLR__readDataLengthAndOffset(aObjectOffset) {

nit: I still think this should return the data inversed, the offset first, the length later, for consistency with the other methods that get offset, length in this order.

@@ +534,5 @@
> +
> +    let array = new Array(aNumberOfObjects);
> +    let readObjectBound = this._readObject.bind(this);
> +
> +    // Each indes in the returned array is a lazy getter for its object.

typo: indes

@@ +585,5 @@
> +   *        index at the object table
> +   * @returns the property list object at the given index.
> +   */
> +  _readObject: function(aObjectIndex) {
> +    // If the object is was previously read, return the cached object.

typo: is was

::: toolkit/content/tests/unit/test_propertyListsReader.js
@@ +40,5 @@
> +function checkValue(aPropertyListObject, aType, aValue) {
> +  do_check_eq(PropertyListUtils.getObjectType(aPropertyListObject), aType);
> +  if (aValue !== undefined) {
> +    if (typeof(aPropertyListObject) == "object") {
> +      // XXX Bug 714467

I see you are looking at this, in case it doesn't land before this one, improve the comment please.

@@ +46,5 @@
> +                  typeof(aValue.valueOf()));
> +      do_check_eq(aPropertyListObject.valueOf(), aValue.valueOf());
> +    }
> +    else {
> +      // Ditto

as well as here :)
Actually you may have an helper function to do this kind of comparison in the meanwhile.

@@ +112,5 @@
> +  }
> +}
> +
> +function readPropertyList(aFile, aCallback) {
> +  do_test_pending();

why do you need this? there should already be one pending test till the next call to run_next_test

@@ +116,5 @@
> +  do_test_pending();
> +  PropertyListUtils.fromFile(aFile, function(aPropertyListRoot) {
> +    // Null root indicates failure to read property list.
> +    // Note: It is imprtant not to run do_check_n/eq directly on Dict and array
> +    // objcts, because it cases their toString to get invoked, doing away with

typo: "objcts" and "cases"

@@ +118,5 @@
> +    // Null root indicates failure to read property list.
> +    // Note: It is imprtant not to run do_check_n/eq directly on Dict and array
> +    // objcts, because it cases their toString to get invoked, doing away with
> +    // all the lazy getter we'd like to test later.
> +    do_check_true(aPropertyListRoot !== null);

is do_check_neq(aPropertyListRoot, null) too weak?

@@ +121,5 @@
> +    // all the lazy getter we'd like to test later.
> +    do_check_true(aPropertyListRoot !== null);
> +    aCallback(aPropertyListRoot);
> +    do_test_finished();
> +    run_next_test();

as well as this finished/next call should not be needed, your checker callback should just invoke run_next_test when done, supposed is the last piece of code that should run for that test that seems true.
Comment 23 Marco Bonardo [::mak] 2012-01-03 08:48:23 PST
ehr, neverming the do_check_neq comment, I just figured out I should connect what I read 1 second before :)
Comment 24 Marco Bonardo [::mak] 2012-01-03 12:08:47 PST
As a side note, you can now use the MPL 2.0 license header:
http://www.mozilla.org/MPL/headers/
Regardless, any old 1.1 headers will be replaced shortly.
Comment 25 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-01-04 01:33:20 PST
> As I said previously, we should be consistent even if this is not a proper idl, if you
> want to expose fromXML as another form of input (apart from fromFile and fromArrayBuffer
> it should have a similar signature, with a callback and the same behavior.

The difference is on purpose, actually.  I don't think it's a good idea to pretend we can work with a DOM tree asynchronously. fromFile is always async because the file itself is read asynchronously.  fromArrayBuffer is async because we definitively can decide to move binary-plist parsing to a Worker/ChromeWorker at some point (though not sure if it's worth anything, given its lazyness), and we don't know if the buffer is binary.  fromXML, on the other hand, cannot be async, and that's by design (not mine, Gecko's).
Comment 26 Marco Bonardo [::mak] 2012-01-04 03:07:35 PST
(In reply to Mano from comment #25)
> fromXML, on
> the other hand, cannot be async, and that's by design (not mine, Gecko's).

Well, but it could be done on a separate thread, I suppose, doesn't the new html5 parser use multiple threads, for example?
Comment 27 Siddharth Agarwal [:sid0] (inactive) 2012-01-04 14:51:50 PST
Comment on attachment 585267 [details] [diff] [review]
Dict.jsm changes + test

Review of attachment 585267 [details] [diff] [review]:
-----------------------------------------------------------------

r-ing because I'd like to see more tests.

::: toolkit/content/Dict.jsm
@@ +118,5 @@
> +   *        The key to set
> +   * @param aGetterFunction
> +   *        A getter function to call the first time the key value is needed.
> +   *        Use Function.bind if you wish to run this function in a certain
> +   *        context.

I'd prefer aThunk here rather than aGetterFunction.

@@ +122,5 @@
> +   *        context.
> +   */
> +  setAsLazyGetter: function Dict_setAsLazyGetter(aKey, aGetterFunction) {
> +    let prop = convert(aKey);
> +    let items = this._state.items;

(At the time I wrote this, let was banned from strict mode, which is why I used var. let's fine now.)

::: toolkit/content/tests/unit/test_dict.js
@@ +219,5 @@
> +  do_check_eq(dict.get("foo"), "bar");
> +  do_check_true(dict.has("foo"));
> +  do_check_false(dict.isLazyGetter("foo"));
> +  do_check_eq(dict.get("foo"), "bar");
> +}

These tests are great, but I'd like to see some sort of spec and tests for iteration. What happens when
(a) listvalues or listitems is called
(b) dict.values or dict.items is iterated over

At what point does the lazy getter get executed? When the list/generator is first created or when we come across it during the iteration? Intuitively, (a) should call the lazy generator when initialized, and (b) should call it during the iteration.

Also, can we provide a guarantee that the thunk provided is going to be called at most once? I think this should be the case, but again, it'd be nice to have some sort of full-battery test (iteration, access, etc) followed by checking how many times the thunk gets called.
Comment 28 Siddharth Agarwal [:sid0] (inactive) 2012-01-04 14:53:14 PST
(In reply to Siddharth Agarwal [:sid0] from comment #27)
> (a) should call the lazy generator when initialized

Sorry, I meant to write "(a) should call the thunk when initialized".
Comment 29 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-01-12 10:18:04 PST
Created attachment 588088 [details] [diff] [review]
Dict changes + tests with comments addressed
Comment 30 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-01-15 04:49:50 PST
Created attachment 588729 [details] [diff] [review]
Property List Utils + tests

Dict.jsm changes are included to ease testing.
Comment 31 Siddharth Agarwal [:sid0] (inactive) 2012-01-16 06:06:02 PST
Comment on attachment 588088 [details] [diff] [review]
Dict changes + tests with comments addressed

Looks good, thanks!
Comment 32 Siddharth Agarwal [:sid0] (inactive) 2012-01-16 06:06:57 PST
(Have you found out whether you need sr yet?)
Comment 33 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-01-16 06:32:10 PST
Comment on attachment 588088 [details] [diff] [review]
Dict changes + tests with comments addressed

Thanks for the review!

Yes, it seems this needs sr. I'll also need sr for the new module.
Comment 34 Dave Townsend [:mossop] 2012-01-17 12:34:49 PST
Comment on attachment 588088 [details] [diff] [review]
Dict changes + tests with comments addressed

Review of attachment 588088 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/content/Dict.jsm
@@ +117,5 @@
> +   * @param aKey
> +   *        The key to set
> +   * @param aThunk
> +   *        A getter function to be called the first time the value for aKey is
> +   *        retrieved. It is guarantee that aThunk wouldn't be called more than

*guaranteed
Comment 35 Marco Bonardo [::mak] 2012-01-18 17:32:19 PST
Comment on attachment 588729 [details] [diff] [review]
Property List Utils + tests

Review of attachment 588729 [details] [diff] [review]:
-----------------------------------------------------------------

skipping the Dict changes since those are part of the other patch that is already reviewed.

I'd like a quick SR pass on exposed method names and the module addition to toolkit, if possible.

::: toolkit/content/PropertyListUtils.jsm
@@ +1,5 @@
> +/*
> + * This Source Code is subject to the terms of the Mozilla Public License
> + * version 2.0 (the "License"). You can obtain a copy of the License at
> + * http://mozilla.org/MPL/2.0/.
> + */

the boilerplate has been updated to have  /* and */ in line, get the new version from http://www.mozilla.org/MPL/headers/ please (should not change anymore)

@@ +33,5 @@
> + *
> + * -------------
> + * 1) Property lists supports storing U/Int64 numbers, while JS can only handle
> + *    numbers that are in this limits of float-64 (±2^53).  For numbers that
> + *    do not outbound this limits, simple primitive numebr are always used.

typo: numebr

@@ +37,5 @@
> + *    do not outbound this limits, simple primitive numebr are always used.
> + *    Otherwise, a String object.
> + * 2) About NSNull and NSSet values: While the xml format has no support for
> + *    representing null and set values, the documentation for the binary format
> + *    states that is supports storing both types.  However, the Cocoa APIs for

typo: is supports

@@ +90,5 @@
> +      throw new Error("aFile is not a file object");
> +    if (typeof(aCallback) != "function")
> +      throw new Error("Invalid value for aCallback");
> +
> +    // Any other exceptions should be thrown asynchronously.

not sure what's the concept of an asynchronous  exception, but I suppose the concept this comment tries to express is that any other exception should be catched to ensure we always invoke the callback?

@@ +103,5 @@
> +        }
> +
> +        let fileReader = Cc["@mozilla.org/files/filereader;1"].
> +                         createInstance(Ci.nsIDOMFileReader);
> +        let onLoadEnd = function(aEvent) {

nit: you don't use aEvent, so you may avoid it

@@ +112,5 @@
> +              throw new Error("Could not read file contents: " + fileReader.error);
> +
> +            root = this.readFromArrayBufferSync(fileReader.result);
> +          }
> +          finally {

may we reportError the exception?

@@ +121,5 @@
> +        fileReader.readAsArrayBuffer(file);
> +      }
> +      catch(ex) {
> +        aCallback(null);
> +        throw ex;

may we reportError the exception?

@@ +145,5 @@
> +      doc = domParser.parseFromBuffer(bytesView, bytesView.length,
> +                                      "application/xml");
> +    }
> +    catch(ex2) {
> +      throw new Error("aBuffer cannot be parsed as a DOM document: " + ex2);

why ex2? (the 2, I mean)

@@ +148,5 @@
> +    catch(ex2) {
> +      throw new Error("aBuffer cannot be parsed as a DOM document: " + ex2);
> +    }
> +
> +    return new XMLPropertyListReader(doc).root;

nit: you may move let doc and this return inside the try, and just return null here (regardless we'd throw before)

@@ +365,5 @@
> +
> +  _readDate: function BPLR__readDate(aByteOffset) {
> +    // That's the reference date of NSDate.
> +    let date = new Date("1 January 2001, GMT");
> +    date.setMilliseconds(this._readReal(aByteOffset, 8) * 1000);

Add a comment regarding the fact we have to use milliseconds since this is a real. So nobody will mess it up in future.

@@ +449,5 @@
> +  },
> +
> +  /**
> +   * Reads the data length of an object in the buffer and the offset in the
> +   * buffer at which the object's data starts.

This caused me looping in a while(1) :) Maybe "Reads from the buffer the data length and the offset at which the object's data starts."?

@@ +453,5 @@
> +   * buffer at which the object's data starts.
> +   * @param aObjectOffset
> +   *        the object's offset.
> +   * @return [dataOffset, dataLength] - the offset in the buffer at which the
> +   * data starts, and the data-length.

Since it returns [offset, length] should not be called _readDataLengthAndOffset, rather it should be _readOffsetAndDataLength, or _readDataOffsetAndLength, or readDataOffsetAndCount to enforce the fact it's not a size.

@@ +456,5 @@
> +   * @return [dataOffset, dataLength] - the offset in the buffer at which the
> +   * data starts, and the data-length.
> +   *
> +   * @note data-length is the number of objects in the object (elements in
> +   * an array, number of charcaters in a string, etc.). It is by no means

typo: charcaters

@@ +511,5 @@
> +  },
> +
> +  /**
> +   * Reads dictionary from the buffer and wrap it as a Dict object (as defined
> +   * in Dict.jsm).

typo: s/wrap/wraps/

@@ +514,5 @@
> +   * Reads dictionary from the buffer and wrap it as a Dict object (as defined
> +   * in Dict.jsm).
> +   * @param aObjectOffset
> +   *        the offset in the buffer at which the dictionary starts
> +   * @param aNumberOfObjcts

typo: aNumberOfObjcts

This is repeated across all the method too

@@ +703,5 @@
> +        return this._wrapDictionary(aDOMElt);
> +      case "array":
> +        return this._wrapArray(aDOMElt);
> +      default:
> +        throw "Unexpected tagname";

new Error()

@@ +714,5 @@
> +    // In case of an outbound, we fallback to return the number as a string.
> +    let numberAsString = aDOMElt.textContent.toString();
> +    let parsedNumber = parseInt(numberAsString, 10);
> +    if (isNaN(parsedNumber))
> +      throw "Could not parse integer value";

ditto

@@ +730,5 @@
> +    //   <key>my string key</key>
> +    //   <string>My String Key</string>
> +    // </dict>
> +    if (aDOMElt.children.length % 2 != 0)
> +      throw "Invalid dictionary";

ditto

@@ +737,5 @@
> +      let keyElem = aDOMElt.children[i];
> +      let valElem = aDOMElt.children[i + 1];
> +
> +      if (keyElem.localName != "key")
> +        throw "Invalid dictionary";

ditto

::: toolkit/content/tests/unit/test_propertyListsUtils.js
@@ +1,5 @@
> +/*
> + * This Source Code is subject to the terms of the Mozilla Public License
> + * version 2.0 (the "License"). You can obtain a copy of the License at
> + * http://mozilla.org/MPL/2.0/.
> + */

the boilerplate has been updated to have  /* and */ in line, get the new version from http://www.mozilla.org/MPL/headers/ please (should not change anymore)

@@ +65,5 @@
> +    // Data
> +    checkLazyGetterValue(array, 6, PropertyListUtils.TYPE_UINT8_ARRAY);
> +    let dataAsString = [String.fromCharCode(b) for each (b in array[6])].join("");
> +    do_check_eq(dataAsString, "2011-12-31T11:15:33Z")
> +    // Dict

nit: add newline before // Dict, is not related to // Data, isn't it?

@@ +83,5 @@
> +
> +function readPropertyList(aFile, aCallback) {
> +  PropertyListUtils.read(aFile, function(aPropertyListRoot) {
> +    // Null root indicates failure to read property list.
> +    // Note: It is imprtant not to run do_check_n/eq directly on Dict and array

typo: imprtant

@@ +84,5 @@
> +function readPropertyList(aFile, aCallback) {
> +  PropertyListUtils.read(aFile, function(aPropertyListRoot) {
> +    // Null root indicates failure to read property list.
> +    // Note: It is imprtant not to run do_check_n/eq directly on Dict and array
> +    // objcts, because it cases their toString to get invoked, doing away with

typo: objcts
Comment 36 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-01-19 03:00:49 PST
Comment on attachment 588729 [details] [diff] [review]
Property List Utils + tests

Dave, I need another sr, for adding the module to toolkit, and for the public methods in PropretyListUtils. Thanks!
Comment 37 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-01-19 03:03:48 PST
Comment on attachment 588729 [details] [diff] [review]
Property List Utils + tests

Ugh, see my last comment.
Comment 38 Dave Townsend [:mossop] 2012-01-19 13:23:45 PST
Comment on attachment 588729 [details] [diff] [review]
Property List Utils + tests

Review of attachment 588729 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/content/PropertyListUtils.jsm
@@ +131,5 @@
> +   * DO NOT USE ME.  Once Bug 718189 is fixed, this method won't be public.
> +   *
> +   * Synchronously read an ArrayBuffer contents as a property list.
> +   */
> +  readFromArrayBufferSync: function PLU_readFromArrayBufferSync(aBuffer) {

How about prefixing this with "_" to make it clearer that it shouldn't normally be used?
Comment 39 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-01-22 13:54:04 PST
Created attachment 590585 [details] [diff] [review]
part 1, as checked in

Property List utils + Dict changes.

http://hg.mozilla.org/integration/mozilla-inbound/rev/5ad0ee32f855
Comment 40 Ed Morley [:emorley] 2012-01-22 14:15:27 PST
Backed out of inbound for failures on all platforms:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=bf035b906c2e
https://tbpl.mozilla.org/php/getParsedLog.php?id=8745262&tree=Mozilla-Inbound
{
TEST-UNEXPECTED-FAIL | xpccheck | found test_propertyListsUtils.js in xpcshell.ini and not in directory '/builds/slave/m-in-lnx-dbg/build/toolkit/content/tests/unit'
}

https://hg.mozilla.org/integration/mozilla-inbound/rev/7de99b312d6f

When this relands, please can it be rebased, rather than merged again - since the latter is generally frowned upon. Thanks :-)
Comment 41 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-01-23 04:06:19 PST
Thanks for taking care of this. Relanded now:
http://hg.mozilla.org/integration/mozilla-inbound/rev/920c698ea0eb
Comment 42 Ed Morley [:emorley] 2012-01-23 11:45:50 PST
No problem :-)

Merged to m-c:
https://hg.mozilla.org/mozilla-central/rev/920c698ea0eb

(Leaving open for the rest).
Comment 43 Dietrich Ayala (:dietrich) 2012-01-26 11:19:19 PST
Mano, what's the status of this? What has landed and what's still in progress?
Comment 44 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-02-29 11:10:16 PST
Status update:
1. part 1 landed long ago
2. part 2 is bug 718608 - Marco is reviewing that.
3. part 3 is the safari migrator (included in the last "bug fixes" patch) - I finished updating it to the requirements of bug 718608. I'm now updating it to fit some feedback from limi about Top Sites and Reading List.
Comment 45 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-03-23 15:50:55 PDT
Created attachment 608895 [details] [diff] [review]
Safari migrator

The NewTabUtils part is included here for testing purpose only. I'll finalize it in bug 737197 (the TopSites bug will be updated as needed).

I've not tested this on Windows for a very long while. I'll do so tomorrow.
Comment 46 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-03-23 15:54:23 PDT
By the way, I completely dropped Cookies import.  Leopard's market share shrinks every month.
Comment 47 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-03-24 03:50:16 PDT
(In reply to Mano from comment #46)
> By the way, I completely dropped Cookies import.  Leopard's market share
> shrinks every month.

I thought I mention bug 712240 (Import of Safari-cookies is broken on Lion) somewhere in this bug, but I didn't.  Of course, that is the smaller error I made with this comment.  The more important one is the reference to Leopard, whereas the OS that matters here is Snow Leopard. Snow Leopard is still common enough, for use to consider keeping the Cookies.plist-import code.

Still, given that the new migrator will first ship with Firefox 14, to be released just a month or so before Mountain Lion,  I think my decision to drop legacy Cookies.plist import from this migrator makes sense.
Comment 48 Marco Bonardo [::mak] 2012-03-28 08:56:41 PDT
Comment on attachment 608895 [details] [diff] [review]
Safari migrator

Review of attachment 608895 [details] [diff] [review]:
-----------------------------------------------------------------

NewTabUtils.jsm changes are already in bug 737197, so should not be here, I won't look at those.

Some stuff to fix, but looks promising!

::: browser/components/migration/src/BrowserProfileMigrators.manifest
@@ +3,5 @@
>  component {4cec1de4-1671-4fc3-a53e-6c539dc77a26} ChromeProfileMigrator.js
>  contract @mozilla.org/profile/migrator;1?app=browser&type=chrome {4cec1de4-1671-4fc3-a53e-6c539dc77a26}
>  component {91185366-ba97-4438-acba-48deaca63386} FirefoxProfileMigrator.js
>  contract @mozilla.org/profile/migrator;1?app=browser&type=firefox {91185366-ba97-4438-acba-48deaca63386}
> +component {4b609ecf-60b2-4655-9df4-dc149e474da1} SafariMigrator.js

Others are called XXXProfileMigrator.js, maybe it's an overkill, but I think at this point we should stay consistent with naming and make this SafariProfileMigrator.js.

::: browser/components/migration/src/Makefile.in
@@ +58,5 @@
>  EXTRA_PP_COMPONENTS = \
>    ProfileMigrator.js \
>    ChromeProfileMigrator.js \
>    FirefoxProfileMigrator.js \
> +  SafariMigrator.js \

I assume on Linux, even if doesn't make sense to have it, it will just bail out.

::: browser/components/migration/src/SafariMigrator.js
@@ +10,5 @@
> +
> +Cu.import("resource://gre/modules/XPCOMUtils.jsm");
> +Cu.import("resource://gre/modules/FileUtils.jsm");
> +Cu.import("resource://gre/modules/Services.jsm");
> +Cu.import("resource://gre/modules/MigrationUtils.jsm");

MigrationUtils is not part of gre

@@ +19,5 @@
> +                                  "resource://gre/modules/PlacesUtils.jsm");
> +XPCOMUtils.defineLazyModuleGetter(this, "NetUtil",
> +                                  "resource://gre/modules/NetUtil.jsm");
> +XPCOMUtils.defineLazyModuleGetter(this, "NewTabUtils",
> +                                  "resource://gre/modules/NewTabUtils.jsm");

NewTabUtils is not part of gre

@@ +25,5 @@
> +let gPrefsDir;
> +let gSafariProfileDir;
> +
> +/**
> + * Get a preferences file from the preferences folder, if the folder and and

typo: and and
or use "(if both exists)" as in the GetProfileFile javadoc

@@ +28,5 @@
> +/**
> + * Get a preferences file from the preferences folder, if the folder and and
> + * file exis.  Null is returned otherwise.
> + */
> +function GetApplePreferencesFile(aFileName) {

camelCase please

@@ +37,5 @@
> +#else
> +      FileUtils.getDir("AppData", ["Apple Computer", "Preferences"], false);
> +#endif
> +    gPrefsDir = dir.exists() ? dir : null;
> +  }

could we rather make gPrefsDir a lazyGetter?

@@ +50,5 @@
> +/**
> + * Get a file in the Safari user-data directory (if both exist).  Null is returned
> + * otherwise.
> + */
> +function GetSafariProfileFile(aFileName) {

cameCalse please

@@ +59,5 @@
> +#else
> +      FileUtils.getDir("AppData", ["Apple Computer", "Safari"], false);
> +#endif
> +    gSafariProfileDir = dir.exists() ? dir : null;
> +  }

ditto on making a lazyGetter

at that point these 2 utils are basically doing the same, so you could just make a getSafariFile(dir, file) util and have the 2 dirs be lazyGetters...

@@ +77,5 @@
> + *
> + * So, rather than reading it three times, it's cached and managed here.
> + */
> +let MainPreferencesPropertyList = {
> +  init: function() {

MPPL_init

@@ +80,5 @@
> +let MainPreferencesPropertyList = {
> +  init: function() {
> +    delete this.__file;
> +    delete this.__dict;
> +    this._reading = false;

what is this _reading used for? It seems unused atm.

@@ +82,5 @@
> +    delete this.__file;
> +    delete this.__dict;
> +    this._reading = false;
> +    this._callbacks = [];
> +  },

hm, why not making this a proper instantiable CachedPropertyList(aPropertyList) object and just cache it in the migrator instance? Sounds easier to initialize and destroy.

@@ +109,5 @@
> +        this._dict = aDict;
> +        for (let callback of this._callbacks) {
> +          callback(aDict);
> +        }
> +        this._callbacks.splice(0);

while (this._callbacks.length > 0) {
  (this._callbacks.shift())(aDict);
}

@@ +116,5 @@
> +  },
> +
> +  // Workaround for nsIBrowserProfileMigrator.sourceHomePageURL until
> +  // it's replaced with an async method.
> +  readSync: function MPPL_readSync() {

sigh :( May we at least prefix with _ to show it's not standard and should ideally not be used? The same we do with _readFromArrayBufferSync...

@@ +135,5 @@
> +
> +/**
> + * Shared object for bookmarks import (reading-list items may, or may not, be
> + * under a separate resource.  See comment in above the ReadingList object
> + * declaration.

typo: in above the ReadingList object"

Btw, there is no ReadingList object above, so I don't understand this comment

@@ +180,5 @@
> +        runBatched: function() {
> +          this._self.migrateCollection(aEntries, aCollection);
> +        }
> +      }, null)
> +      this._batching = false;

don't understand this _batching complication, all this bookmarks stuff _so far_ is synchronous, so it can't enter/exit batch. Once it will be async it's likely we won't have a batch mechanism.

couldn't you just batch here and provide a _migrateCollectionInternal to do the actual work for now?

@@ +210,5 @@
> +    else {
> +      entriesFiltered = aEntries;
> +    }
> +
> +    if (entriesFiltered.length > 0) {

invert condition and bail out, instead of indenting everything below it.

@@ +225,5 @@
> +          // migration, would be the bookmarks menu).  This does not seem right
> +          // given that the result of moving bookmarks to the bookmarks root in
> +          // Safari is kind-of hidden bookmarks.  So, at least for the time
> +          // being, we just use the unfiled-bookmarks root, regardless of the
> +          // migration type (startup or post-startup)

compact this comment please, we actually don't care what the old migrator was doing, we look at the future! Btw, I agree with using unfiled, we do the same when some bookmark ends up in some place where it should not be, we move it to unfiled.

@@ +240,5 @@
> +          }
> +          else {
> +            folder = MigrationUtils.createImportedBookmarksFolder("Safari",
> +              PlacesUtils.toolbarFolderId);
> +          }

why is this managed differently from menuItemsFolder, they are basically doing same thing, should be consistent

@@ +248,5 @@
> +          // Reading list items are imported as regular bookmarks.
> +          // They are imported under their own folder, created either under the
> +          // bookmarks menu (in the case of startup migration), or under the
> +          // "From Safari" folder, located within the bookmarks menu (in the
> +          // case of post-startup migration). In the case of startup-migration.

"In the case of startup-migration." phrase looks out of context, even if the previous dot would be a comma, the condition is inverted.

@@ +252,5 @@
> +          // case of post-startup migration). In the case of startup-migration.
> +          // Since the imported folder doesn't function in the same manner as
> +          // in Safari, naming it "Reading List", when it's created right under
> +          // the bookmarks menu, would be quite misleading. Thus, in this case,
> +          // we suffix it with "(from Safari)".

As I say in the .properties file I think this is unneeded complication, we should just call this "Safari Reading List" and stop there.

@@ +263,5 @@
> +              folderName, PlacesUtils.bookmarks.DEFAULT_INDEX);
> +          break;
> +        }
> +        default:
> +          throw new Error("Unexpected value for aCollection!");

does this mean we bail out from the whole import process if we find a new collection? wouldn't be better to report it (if possible) but continue importing the rest? May be a bit more robust vs Safari upgrades.

@@ +274,5 @@
> +  // migrate the given array of safari bookmarks to the given places
> +  // folder.
> +  _migrateEntries: function BI__migrateEntries(aEntries, aFolderId) {
> +    for (let i = 0; i < aEntries.length; i++) {
> +      let entry = aEntries[i];

may use for...of

@@ +308,5 @@
> +  type: Ci.nsIBrowserProfileMigrator.SETTINGS,
> +
> +  migrate: function MPR_migrate(aCallback) {
> +    MainPreferencesPropertyList.read(
> +      MigrationUtils.wrapMigrateFunction(function(prefsDict) {

label the function, and aPrefsDict

@@ +310,5 @@
> +  migrate: function MPR_migrate(aCallback) {
> +    MainPreferencesPropertyList.read(
> +      MigrationUtils.wrapMigrateFunction(function(prefsDict) {
> +        if (!prefsDict)
> +          throw "Could not read preferences file";

please always throw new Error

@@ +315,5 @@
> +
> +        this._prefsDict = prefsDict;
> +
> +        let invert = function(webkitVal) !webkitVal;
> +        this._set("AlwaysShowTabBar", "browser.tabs.autoHide", invert);

while we should try to respect user's will, I think we should not import prefs that may dumb down the experience in firefox.
Like with tabs on top browser.tabs.autoHide doesn't make much sense (Australis will make that even less good), and there could be different reasons it had to be set in Safari.
I'd not import this one.

@@ +318,5 @@
> +        let invert = function(webkitVal) !webkitVal;
> +        this._set("AlwaysShowTabBar", "browser.tabs.autoHide", invert);
> +        this._set("AutoFillPasswords", "signon.rememberSignons");
> +        this._set("OpenNewTabsInFront", "browser.tabs.loadInBackground", invert);
> +        this._set("WebKitJavaScriptEnabled", "javascript.enabled");

Maybe the user is migrating cause websites don't work correctly due to this. Apart that, I think nowadays doesn't make much sense to import this pref, esp with out mission towards html5.
Plus our js engine is much better :)
Would not import this one.

@@ +324,5 @@
> +                   "dom.disable_open_during_load", invert);
> +
> +        // layout.spellcheckDefault is a boolean stored as a number.
> +        this._set("WebContinuousSpellCheckingEnabled",
> +                  "layout.spellcheckDefault", Number);

is Safari spellcheck worse than ours? is it enabled by default?
iirc ours is enabled by default, and if the safari one is disabled by default, we'd not do the right thing here.

@@ +337,5 @@
> +                  function(webkitVal) webkitVal ? 1 : 2);
> +
> +        // Default charset migration
> +        this._set("WebKitDefaultTextEncodingName", "intl.charset.default",
> +          function(webkitCharset) {

name the functions please

@@ +367,5 @@
> +  get exists() MainPreferencesPropertyList.exists,
> +
> +  /**
> +   * Attempts to migrates a preference from Safari.  Returns if the preference
> +   * has been migrated.

I think you meant "Returns whether the preference has been migrated"

@@ +406,5 @@
> +  },
> +
> +  // Fonts settings are quite problematic for migration, for a couple of
> +  // reasons:
> +  // (a) Every font preference in Gecko is set for a particular langauge.

typo: langauge

@@ +425,5 @@
> +  // For now, we use the language implied by the system locale as the
> +  // lang-group. The only exception is minimal font size, which is an
> +  // accessibility preference in Safari (under the Advanced tab). If it is set,
> +  // we set it for all languages.
> +  // As for the font type of the default font (serif/sans-serif), the default typ

typo: typ

@@ +520,5 @@
> +    // Retention Mode:      Safari  Firefox
> +    // ------------------------------------------
> +    // Remove Manually      0       browser.download.manager.retention: 2
> +    // Remove when Complete 2       browser.download.manager.retention: 0
> +    // Remove on Exit       1       Tottally Different Story

typo: tottally

@@ +535,5 @@
> +    // preferences but downlaods, are turned off.
> +    // We also set "Remember Downloads" in this case. While our UI does make it
> +    // possible to ask for downloads clearance on exit, alongside an unchecked
> +    // "Remember Downloads" checkbox, it is arguably meaningless, and certainly
> +    // confusing.

I think we should not care too much to import this properly. The downloads stuff is about to change with the new panel, we will always remove downloads, and consider downloads cleanup as part of clearing history (the dialog will have "Clear history and downloads"). Also our clear on shudown is actually clearing nothing since all downloads stay visible in the Library till you clear history too.
I don't want to add complication here that then we may have to spend time to undo soon. I'd say to just import browser.download.manager.retention and let exit behavior alone.

@@ +555,5 @@
> +};
> +
> +#ifdef XP_MACOSX
> +// On OS X, the cookie-accept policy preference is stored in a separate
> +// property list.

what about Win?

@@ +557,5 @@
> +#ifdef XP_MACOSX
> +// On OS X, the cookie-accept policy preference is stored in a separate
> +// property list.
> +let CookieBehavior = {
> +  type: Ci.nsIBrowserProfileMigrator.SETTINGS,

these migrators should use the MigrationUtils.resourceTypes stuff, all of them :)

@@ +616,5 @@
> +  get exists() this._file != null
> +};
> +
> +// The Reading List feature was introduced at the same time in Windows and
> +// Mac versions of Safari.  Not surpsingly, they are stored in the same

typo: surpsingly

@@ +617,5 @@
> +};
> +
> +// The Reading List feature was introduced at the same time in Windows and
> +// Mac versions of Safari.  Not surpsingly, they are stored in the same
> +// format in both versions.  sSurpsingly, only on windows there is a

typo: sSurpsingly, uppercase Windows

@@ +680,5 @@
> +          // to the right).
> +          // On top of that, in the case of non-startup-migration, our grid may
> +          // not be empty, and even if there're actually enough places for all
> +          // imported pinned links, we may not be able to use the exact same
> +          // indicies (For example, if the only cell used, both in Safari and

typo: indicies

@@ +722,5 @@
> +
> +        let bannedURLStrings = rootDict.get("BannedURLStrings");
> +        if (bannedURLStrings) {
> +          for (let bannedURL of bannedURLStrings) {
> +            NewTabUtils.blockedLinks.block({ url: bannedURL});

nit: missing whitespace after bannedURL

@@ +746,5 @@
> +      function(historyDict) {
> +        if (!historyDict)
> +          throw "Could not read history property list";
> +        if (!historyDict.has("WebHistoryDates"))
> +          throw "Unexpected history-property list format";

new Error pls

@@ +763,5 @@
> +          let lastVistDateDouble = parseFloat(entry.get("lastVisitedDate"));
> +          if (!isNaN(lastVistDateDouble)) {
> +            const NSDATE_REFERENCE_DATE = Date.parse("1 January 2001, GMT");
> +            let lastVisitDate = new Date(NSDATE_REFERENCE_DATE +
> +                                         lastVistDateDouble * 1000);

may you please move this stuff to a SafariTime conversion util with a javadoc? would cleanup this code quite a bit. It may take the entry and return either a Date or NaN. OR directly return a PRTime.

@@ +774,5 @@
> +                        });
> +          }
> +        });
> +        PlacesUtils.asyncHistory.updatePlaces(places);
> +      }.bind(this), aCallback));

updatePlaces should invoke the callback when done, not this.

@@ +802,5 @@
> +            let formHistory = Cc["@mozilla.org/satchel/form-history;1"].
> +                              getService(Ci.nsIFormHistory2);
> +            recentSearchStrings.forEach(function(searchString) {
> +              formHistory.addEntry("searchbar-history", searchString);
> +            });

hm, this is about to change in bug 697377, but since we don't have an ETA, we'll convert it in future.

::: browser/locales/en-US/chrome/browser/migration/migration.properties
@@ +12,5 @@
>  importedSearchUrlDesc=Type "%S <search query>" in the Location Bar to perform a search on %S.
>  
> +# Safari Reading List
> +importedSafariReadingListNoSuffix=Reading List
> +importedSafariReadingListSuffixed=Reading List (from Safari)

Honestly this distinction looks uneeded complication to me, may we just make this "Safari Reading List" in both cases?  It's not a big deal to just have the browser name there.
Comment 49 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-04-09 04:17:35 PDT
Created attachment 613266 [details] [diff] [review]
patch

The top sites resource is removed in this revision. I'll add it back in bug 713820.

The comment about resourceTypes isn't address - I'll fix it after the Firefox migrtor changes land.

This should be generally good-to-go.
Comment 50 Marco Bonardo [::mak] 2012-04-11 14:28:32 PDT
Comment on attachment 613266 [details] [diff] [review]
patch

Review of attachment 613266 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/migration/src/SafariProfileMigrator.js
@@ +10,5 @@
> +
> +Cu.import("resource://gre/modules/XPCOMUtils.jsm");
> +Cu.import("resource://gre/modules/FileUtils.jsm");
> +Cu.import("resource://gre/modules/Services.jsm");
> +Cu.import("resource://gre/modules/MigrationUtils.jsm");

not gre

@@ +24,5 @@
> +  this._file = aBookmarksFile;
> +}
> +Bookmarks.prototype = {
> +  type: Ci.nsIBrowserProfileMigrator.BOOKMARKS,
> +  migrate: function B_migrate(aCallback) {

nit: newline between props, plz

@@ +111,5 @@
> +        }
> +        else {
> +          folder = MigrationUtils.createImportedBookmarksFolder(
> +            "Safari", PlacesUtils.bookmarksMenuFolderId);
> +        }

you may compact this by

folder = PlacesUtils.bookmarksMenuFolderId;
if (!MigrationUtils.isStartupMigration)
 folder = MigrationUtils.createImportedBookmarksFolder("Safari", folder);

@@ +123,5 @@
> +          folder = MigrationUtils.createImportedBookmarksFolder("Safari",
> +            PlacesUtils.toolbarFolderId);
> +        }
> +        break;
> +      }

ditto

@@ +239,5 @@
> +      catch(ex) {
> +        Cu.reportError(ex);
> +        aCallback(false);
> +      }
> +    }.bind(this), aCallback);

this ", aCallback" looks wrong, is not this the PropertyListUtils.read()?

@@ +272,5 @@
> +    if (!alreadyReading) {
> +      PropertyListUtils.read(this._file, function readPrefs(aDict) {
> +        this._dict = aDict;
> +        for (let callback of this._callbacks) {
> +          callback(aDict);

may any of these throw and create us issues?

@@ +320,5 @@
> +                   "dom.disable_open_during_load", invert);
> +
> +        // layout.spellcheckDefault is a boolean stored as a number.
> +        this._set("WebContinuousSpellCheckingEnabled",
> +                  "layout.spellcheckDefault", Number);

I think I may have already asked this, though, is spellcheck enabled by default in Safari?
Asking cause our spellcheck is enabled by default if a dictionary is present, and works decently, would be a pity to disable it just cause we are copying a default pref value, the user would lose a feature he may never know about.

@@ +399,5 @@
> +   *        appropriate value for aMozPref.  If it's not passed, then the
> +   *        Safari value is set as is.
> +   *        If aConvertFunction returns undefined, then aMozPref is not set
> +   *        at all.
> +   * @return whether or not aMozPref was set.

fwiw, we don't use the return value anywhere
Comment 51 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-04-14 03:03:52 PDT
(In reply to Marco Bonardo [:mak] from comment #50)

> I think I may have already asked this, though, is spellcheck enabled by
> default in Safari?
> Asking cause our spellcheck is enabled by default if a dictionary is
> present, and works decently, would be a pity to disable it just cause we are
> copying a default pref value, the user would lose a feature he may never
> know about.

It's enabled by default. But anyway, note that preference file is designed not to include preferences which were not touched by the user (I'm speaking of user-facing preferences of course). This also applies to the spell-checker preference: if you have not set it, it's not there (and then our has check will fail).
Comment 52 Marco Bonardo [::mak] 2012-04-14 03:28:50 PDT
(In reply to Mano from comment #51)
> note that preference file is designed
> not to include preferences which were not touched by the user

perfect, thanks.
Comment 53 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-04-14 12:06:33 PDT
Created attachment 615075 [details] [diff] [review]
for checkin

While testing, I fixed few small bugs.
1. Somehow _set got broken between the iteration, setting safariVal rather than mozVal
2. The migration.xul changes introduced in early revisions for supporting windows were later omitted - fixed by completely removing the #ifdef mess from migration.xul and letting getMigrator fail silently. For now, I think that's good enough. However, we do need some XP_LINUX define.
3. I noticed two bugs wrt. to migrating the downloads folder:
3.1. Bug 745466  - patched.
3.2. The windows safari names the pref DownloadPath (rather than Download_s_Path as in the mac version). Fixed by checking for both.
4. Empty bookmark folder were not handled properly: these folders may not have a children dictionary. Fixed.
5. Completely removed the importing of the download cleanup policy. It turns out the UI for "remove when complete" is gone.

All of these changes are trivial, so landing this now.
Comment 54 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2012-04-14 12:13:36 PDT
http://hg.mozilla.org/integration/mozilla-inbound/rev/4343723f49e1

Note You need to log in before you can comment on or make changes to this bug.