Closed Bug 887043 Opened 11 years ago Closed 11 years ago

Use an async SQL query instead of a result to collect backups data

Categories

(Toolkit :: Places, defect)

Priority: Not set
Severity: normal

Tracking


RESOLVED FIXED
mozilla28

People

(Reporter: mak, Assigned: raymondlee)

References

Details

(Whiteboard: [qa-])

Attachments

(1 file, 6 obsolete files)

We should stop using results to build bookmark backups; a single async query should be enough and may speed up the process quite a bit. Moreover, we would then be able to do backups in the background instead of enforcing them on idle or shutdown.
We are using PlacesUtils.getFolderContents to get the results.

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/BookmarkJSONUtils.jsm#556

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/PlacesUtils.jsm#757

Shall we do the async query in JS in PlacesUtils or in C++?
(In reply to Raymond Lee [:raymondlee] from comment #1)
> We are using PlacesUtils.getFolderContents to get the results.
> 
> http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/
> BookmarkJSONUtils.jsm#556
> 
> http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/
> PlacesUtils.jsm#757
> 
> Shall we do the async query in JS in PlacesUtils or in C++?

I think it's best to use https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules/Sqlite.jsm
We should rewrite some internals in BookmarkJSONUtils.jsm to do an async query. PlacesUtils will stay untouched for now.
Sqlite.jsm may be fine, we should just not use getFolderContents.
Assignee: nobody → raymond
Status: NEW → ASSIGNED
Attached patch bug-887043-v1.diff (obsolete) — Splinter Review
Marco, I have created BookmarkRow based on the BookmarkNode. It gets all bookmarks using one async SQL query and then processes them to create a JSON object. Please give me some feedback on this.
Attachment #794049 - Flags: feedback?(mak77)
Blocks: 627487
A while ago I talked with Marco about adding some sort of "cloning" functionality to Sqlite.jsm so that in cases like this, where you have an open db connection handy, you could still use the nice & clean API Sqlite.jsm provides rather than falling back to the XPCOM API. I don't know whether a bug was filed or how trivial it might be to add that feature, but *if* it's trivial enough I think we should consider doing that first. It would be nice not to have to revise this code once more later :)
No longer blocks: 627487
Blocks: 818589
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Oops, pressed the wrong button
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch v2 (obsolete) — Splinter Review
Output bookmark guids to backups as well.
Attachment #803512 - Flags: feedback?(mak77)
Attachment #794049 - Attachment is obsolete: true
Attachment #794049 - Flags: feedback?(mak77)
(In reply to Mano from comment #5)
> A while ago I talked with Marco about adding some sort of "cloning"
> functionality to Sqlite.jsm so that in cases like this, where you have an
> open db connection handy, you could still use the nice & clean API
> Sqlite.jsm provides rather than falling back to the XPCOM API. I don't know
> whether a bug was filed or how trivial it might be to add that feature, but
> *if* it's trivial enough I think we should consider doing that first. It
> would be nice not to have to revise this code once more later :)

I couldn't find a bug about that, so I filed bug 915525. @Marco: what's your opinion on that?
Flags: needinfo?(mak77)
I think here we should open a new connection, do the backup, and then close it, so directly use Sqlite.jsm without the need to adopt any connection.
Doing that has a concurrency advantage and we just drop resources when we close it.
Flags: needinfo?(mak77)
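For reference, the open/query/close lifecycle Marco describes might look roughly like this. This is a sketch, not the final implementation: the `openConnection` stub below stands in for Sqlite.jsm's API so the snippet is self-contained, and the path and column list are placeholders; inside Gecko you would import Sqlite.jsm and point it at places.sqlite instead.

```javascript
// Sketch of the suggested lifecycle: open a dedicated connection, run the
// backup query, and close the connection so its resources are dropped.
// This stub mimics the shape of Sqlite.jsm's openConnection(), which
// resolves to a connection exposing execute() and close().
function openConnection(options) {
  return Promise.resolve({
    execute(sql) {
      // Stub result set; the real call resolves to an array of rows.
      return Promise.resolve([{ id: 1, parent: 0, title: "root" }]);
    },
    close() {
      return Promise.resolve();
    },
  });
}

async function backupBookmarks() {
  // sharedMemoryCache: false maximizes concurrency under WAL journaling,
  // which is the point of using a separate connection here.
  let conn = await openConnection({
    path: "places.sqlite", // placeholder path for the sketch
    sharedMemoryCache: false,
  });
  try {
    let rows = await conn.execute(
      "SELECT b.id, b.parent, b.position, b.title " +
      "FROM moz_bookmarks b ORDER BY b.parent, b.position");
    return rows.length;
  } finally {
    // Always drop the dedicated connection once the backup is done.
    await conn.close();
  }
}
```

The try/finally ensures the connection is closed even if the query fails, which matters since a dangling connection would hold database resources for the rest of the session.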
Attached patch v3 (obsolete) — Splinter Review
Switched to using Sqlite.jsm.
Attachment #803512 - Attachment is obsolete: true
Attachment #803512 - Flags: feedback?(mak77)
Attachment #804380 - Flags: review?(mak77)
Comment on attachment 804380 [details] [diff] [review]
v3

Review of attachment 804380 [details] [diff] [review]:
-----------------------------------------------------------------

I stopped the review in the middle, because I think we may look at this differently. The new code is a bit too bound to the old one; I'd actually prefer a rewrite based on the JSON contents rather than on the old code. But mostly, as it is, it's not generating the backups properly.

::: toolkit/components/places/BookmarkJSONUtils.jsm
@@ +575,5 @@
> +   * @param   aStream
> +   *          An nsIOutputStream. NOTE: it only uses the write(str, len)
> +   *          method of nsIOutputStream. The caller is responsible for
> +   *          closing the stream.
> +   * @param   aResolveShortcuts

for the current case, you can probably remove this and treat it as always false (that is, remove the only if condition that uses it later)

@@ +579,5 @@
> +   * @param   aResolveShortcuts
> +   *          Converts folder shortcuts into actual folders.
> +   * @param   aExcludeItems
> +   *          An array of item ids that should not be written to the backup.
> +   * @returns Task promise

I think there's no need to specify who generates the promise (a Task)

@@ +583,5 @@
> +   * @returns Task promise
> +   * @resolves the number of serialized uri nodes.
> +   */
> +  serializeJSONToOutputStream: function(aStream, aResolveShortcuts, aExcludeItems) {
> +    // Create JSON with async SQL call.

this could be merged into the javadoc by better specifying what the method does, rather than adding the comment here.

@@ +589,5 @@
> +      let nodeCount = 0;
> +      let nodeArray = [];
> +
> +      let dbFile = Services.dirsvc.get("ProfD", Ci.nsILocalFile);
> +      dbFile.append("places.sqlite");

I think you can use OS.Constants.Path.profileDir with Sqlite.jsm

@@ +591,5 @@
> +
> +      let dbFile = Services.dirsvc.get("ProfD", Ci.nsILocalFile);
> +      dbFile.append("places.sqlite");
> +
> +      let conn = yield Sqlite.openConnection({ path: dbFile.path });

please set "sharedMemoryCache: false" option, we want maximum concurrency here since we are using WAL journaling

@@ +600,5 @@
> +          "FROM moz_bookmarks b " +
> +          "LEFT JOIN moz_bookmarks t ON t.id = b.parent " +
> +          "LEFT JOIN moz_places h ON h.id = b.fk");
> +
> +        for (let row of results) {

there's something strange here: you seem to be getting ALL of the bookmarks, not just the root contents, and then later you recursively get all of the bookmarks again for each container, not just its children.
you should either use ORDER BY clauses (to order by parent and index) and read all of the bookmarks in one shot (this would be very nice, even if the code would have to change quite a bit to preprocess all of the results into a tree), or you should limit the query to a given parent (in this case it would be WHERE b.parent = 0, and later the given container you are entering).
I would prefer the former, so that we query just once instead of once per container, and already convert the .getResultByName calls into actual properties, though the latter could be a simpler solution to implement in the interim.

As it is this is probably generating invalid backups
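The "query once" variant suggested here might look roughly like this; a sketch only, with the SELECT list abbreviated for illustration rather than matching the final patch:

```javascript
// Hedged sketch of the single-query approach: select every bookmark and
// order by (parent, position) so the result set can be consumed in one
// pass, with children grouped under each parent in position order.
const BACKUP_QUERY =
  "SELECT b.id, b.parent, b.position, b.title, b.type, h.url " +
  "FROM moz_bookmarks b " +
  "LEFT JOIN moz_places h ON h.id = b.fk " +
  "ORDER BY b.parent, b.position";
```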

@@ +603,5 @@
> +
> +        for (let row of results) {
> +          let result = yield BookmarkRow._appendConvertedNode(results,
> +                                                              row,
> +                                                              row.getResultByName("position"),

it doesn't seem to make much sense to pass position separately from the row...

@@ +604,5 @@
> +        for (let row of results) {
> +          let result = yield BookmarkRow._appendConvertedNode(results,
> +                                                              row,
> +                                                              row.getResultByName("position"),
> +                                                              nodeArray,

I think it would be clearer as a simple "nodes"

@@ +631,5 @@
> +      let nodeCount = 0;
> +
> +      // Set index in order received
> +      // XXX handy shortcut, but are there cases where we don't want
> +      // to export using the sorting provided by the query?

this comment can go away from here

@@ +657,5 @@
> +        }
> +        yield this._addURIProperties(bRow, node);
> +        nodeCount++;
> +      } else if (type == Ci.nsINavBookmarksService.TYPE_FOLDER ||
> +                 type == Ci.nsINavBookmarksService.TYPE_DYNAMIC_CONTAINER) {

DYNAMIC_CONTAINERs have not been supported for quite some time; this can be removed
Attachment #804380 - Flags: review?(mak77) → review-
Attached patch v4 (obsolete) — Splinter Review
Attachment #804380 - Attachment is obsolete: true
Attachment #817192 - Flags: review?(mak77)
Marco: review ping
I'm doing this review today.
Sorry, I had some hardware problems (burned motherboard) and was also trying to figure out your approach here.
One thing I dislike is that it looks very slow: instead of doing a single pass over the results, you must "search" for the row you need for every container. So if you have 100 containers, you walk the whole result set 100 times, up to the container.
Due to that, it also looks a bit too error-prone.

My take on this was rather opposite, I think we should keep the original structure of the walking, and build something that works "like" the old result.
I'd walk the rows just once, build a tree of nodes (objects that look like nodes, containers will have a .children property that is an array of nodes), to quickly find parents keep a separate Map( parentId => object ) so when you read a row you can push to Map.get(parentId).children (pseudocode). The lookup should be really fast this way, and when you are done you just throw the Map away.

At that point you pretty much have a tree you can walk the same way as we were walking results, just you don't need to containerOpen/Close since .children is an array. In future we could then just make this object "final" and directly JSON.stringify it...

What do you think?
Flags: needinfo?(raymond)
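A minimal standalone sketch of that single-pass build, in plain JavaScript: the row objects are hypothetical stand-ins for the moz_bookmarks query results, and it assumes the query returns each parent before its children (which holds when ordering by parent and position, since a Places folder's id is smaller than its children's parent references).

```javascript
// Single-pass tree build: walk the rows once, keeping a Map(id => node)
// so each child can find its parent in O(1). Children are pushed in the
// order the (parent, position)-sorted query returns them, so no
// containerOpen/Close is needed; .children is just an array.
function buildTree(rows) {
  let root = { id: 0, children: [] };
  let nodesById = new Map([[root.id, root]]);
  for (let row of rows) {
    let node = { id: row.id, title: row.title, children: [] };
    nodesById.set(node.id, node);
    let parent = nodesById.get(row.parent);
    if (!parent) {
      // Unexpected per the review: every parent should already be in the
      // map by the time its children are read.
      throw new Error("Orphan row: " + row.id);
    }
    parent.children.push(node);
  }
  // The Map is only a lookup aid and can be thrown away; the tree is
  // what gets serialized (eventually just JSON.stringify-able).
  return root;
}
```

Usage: feed it the flat rows and walk `root.children` recursively the same way the old result-based code walked container nodes.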
Comment on attachment 817192 [details] [diff] [review]
v4

Review of attachment 817192 [details] [diff] [review]:
-----------------------------------------------------------------

better than before, but a bit too error-prone and slow.
Attachment #817192 - Flags: review?(mak77) → feedback-
Attached patch v5 (obsolete) — Splinter Review
(In reply to Marco Bonardo [:mak] from comment #15)
> Sorry, I had some hardware problems (burned motherboard) and was also
> trying to figure out your approach here.
> One thing I dislike is that it looks very slow: instead of doing a single
> pass over the results, you must "search" for the row you need for every
> container. So if you have 100 containers, you walk the whole result set
> 100 times, up to the container.
> Due to that, it also looks a bit too error-prone.
> 
> My take on this was rather opposite, I think we should keep the original
> structure of the walking, and build something that works "like" the old
> result.
> I'd walk the rows just once, build a tree of nodes (objects that look like
> nodes, containers will have a .children property that is an array of nodes),
> to quickly find parents keep a separate Map( parentId => object ) so when
> you read a row you can push to Map.get(parentId).children (pseudocode). The
> lookup should be really fast this way, and when you are done you just throw
> the Map away.
> 
> At that point you pretty much have a tree you can walk the same way as we
> were walking results, just you don't need to containerOpen/Close since
> .children is an array. In future we could then just make this object "final"
> and directly JSON.stringify it...
> 
> What do you think?

I have created a Map so we can do a quick lookup in _appendConvertedComplexNode().
Attachment #817192 - Attachment is obsolete: true
Attachment #825801 - Flags: feedback?(mak77)
Flags: needinfo?(raymond)
Comment on attachment 825801 [details] [diff] [review]
v5

Review of attachment 825801 [details] [diff] [review]:
-----------------------------------------------------------------

OK, this is much better. I think we may still avoid one lookup, but regardless, this code definitely needs a polish pass once all of this conversion work is done, so I don't think we should hold back this improvement trying to obtain perfection now.
Provided you did some local testing of backup/restore in addition to the automated tests we have (which are very simple, so testing on some real profiles would be very nice), I think this is good enough to land and remove some jankiness.
Finally, apologies that this took longer than expected.

::: toolkit/components/places/BookmarkJSONUtils.jsm
@@ +611,5 @@
> +          }
> +        }
> +
> +        let root = rowMap.get(0);
> +        if (root) {

we should probably throw if root is not defined, I can't think of a valid case

@@ +725,5 @@
> +    // XXXdietrich - store annos for non-bookmark items
> +  },
> +
> +  _addURIProperties: function BR__addURIProperties(
> +    aRow, aJSNode) {

please fix the indentation; one line should be fine
Attachment #825801 - Flags: feedback?(mak77) → review+
Comment on attachment 825801 [details] [diff] [review]
v5

Review of attachment 825801 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/places/BookmarkJSONUtils.jsm
@@ +778,5 @@
> +        repr[name] = value;
> +      repr.children = [];
> +
> +      let data = aRowMap.get(aNode.id);
> +      if (data) {

ah, I forgot: I don't think this case is expected at all; we should probably at least reportError it.
I mean: the case where data is not defined is not expected.
Attachment #8333750 - Attachment is obsolete: true
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/1408688b1f85
Status: REOPENED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → mozilla28
Blocks: 940181
Whiteboard: [qa-]
