Places Query API refactor/redesign

NEW
Unassigned

Status

()

defect
P3
normal
10 years ago
7 months ago

People

(Reporter: ddahl, Unassigned)

Tracking

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(blocking2.0 -)

Details

()

Attachments

(1 attachment, 26 obsolete attachments)

71.20 KB, patch
Details | Diff | Splinter Review
Reporter

Description

10 years ago
Create a sensible, easy to use Places Query API. Javascript - not C++, easy to use and elegant.

It would be nice to model this after more modern styles of database interaction including the ORM or CouchDB / BigTable.

We need to design our API that will allow the kinds of interaction the FX UI/UX team are planning for 3.7 and beyond.

See http://blog.mozilla.com/faaborg/2009/10/13/browsing-your-personal-web/
Reporter

Updated

10 years ago
Blocks: 523519
Reporter

Comment 1

10 years ago
Rough sketches are here, stillworking on more details, comments and ideas are welcome: https://wiki.mozilla.org/Firefox/Projects/PlacesQueryAPISketches
Reporter

Comment 2

10 years ago
I heard from mossop that you can use XUL templates with your own custom datasource. Is this documented anywhere. I've been reading through the XUL template guide, and I don't see it.

https://developer.mozilla.org/en/XUL/Template_Guide
Reporter

Comment 3

10 years ago
I have begun working on a tinkertoys version of this to demonstrate the ideas I have in a more concrete fashion. 

in a nutshell I have the following objects:

QueryProcessor processes "queryObject" input from the api consumer, which is basically a simple object with a phrase and a callback function:

// you need a UI call back first (optional)

function callback(results){ decorateUI(results); ... }

let queryObj = {phrase:"mozilla", uiCallback: callback};
let api = new QueryProcessor(queryObj);

The api object returns a QueryExecutor object, which has the fetch, narrow and widen methods attached to it.

when api.fetch() is called, the query is executed via executeAsync and the callback function is run (if present)

I will attach a working example fo this api.
Reporter

Comment 4

10 years ago
Posted file refactor of places query api (demo) (obsolete) —
There is a "dream schema" in this code that I have been playing with (in concept only) as well.
Reporter

Comment 5

10 years ago
Btw: this demo I have attached is easily run in jesse's Javascript Shell when installed as part of the extension developer's extension. It needs chrome privs of course. the demo uses an 'in memory' sqlite database.
Reporter

Comment 6

10 years ago
Wrapping the Existing Places QueryAPI, WIP. Tests soon.
Reporter

Comment 7

10 years ago
Reverted to wrapping the existing Places Query API, which unfortunately is synchronous. I plan to continue on another patch that targets 3.7/3.7 + 4.0 using mozStorage async apis.
Attachment #414421 - Attachment is obsolete: true
(In reply to comment #7)
> Reverted to wrapping the existing Places Query API, which unfortunately is
> synchronous. I plan to continue on another patch that targets 3.7/3.7 + 4.0
> using mozStorage async apis.
Yeah, if we are making a new API, it doesn't seem to make much sense to shoot ourselves in the foot by making it have some of the worst parts of the old one (synchronous!).
Bug 451915 - move Firefox/Places bugs to Firefox/Bookmarks and History. Remove all bugspam from this move by filtering for the string "places-to-b-and-h".

In Thunderbird 3.0b, you do that as follows:
Tools | Message Filters
Make sure the correct account is selected. Click "New"
Conditions: Body   contains   places-to-b-and-h
Change the action to "Delete Message".
Select "Manually Run" from the dropdown at the top.
Click OK.

Select the filter in the list, make sure "Inbox" is selected at the bottom, and click "Run Now". This should delete all the bugspam. You can then delete the filter.

Gerv
Component: Places → Bookmarks & History
QA Contact: places → bookmarks
Reporter

Updated

10 years ago
Blocks: 531940
Reporter

Comment 10

10 years ago
Just wanted to mention that some discussion of this vis a vis Jetpack is happening here: http://groups.google.com/group/mozilla-labs-jetpack/browse_thread/thread/b5f177128c75f35
Reporter

Comment 11

10 years ago
Attachment #414587 - Attachment is obsolete: true
Reporter

Comment 12

10 years ago
Test are non-working, passing tests soon, just wanted to save my work
Attachment #415789 - Attachment is obsolete: true
Reporter

Comment 13

10 years ago
Posted patch 0.1.2 Work in Progress (obsolete) — Splinter Review
Successfully wrapping the resultNodes by calling containerOpen = false after the callback is fired.
Attachment #416224 - Attachment is obsolete: true
Reporter

Comment 14

10 years ago
I think I will have a useful patch for review next week.
Reporter

Comment 15

10 years ago
Attachment #416832 - Attachment is obsolete: true
Reporter

Comment 16

10 years ago
(In reply to comment #16)
> Created an attachment (id=417015) [details]
> WIP Async Patch, just a stub, saving work

can you file async as a followup? let's get the basic api landed, and into people's hands to experiment with. then follow with async and more api candy.
Hardware: x86 → All
Target Milestone: Firefox 3.7 → ---
Component: Bookmarks & History → Places
Product: Firefox → Toolkit
QA Contact: bookmarks → places
Reporter

Comment 18

10 years ago
Added some meta data and another test
Attachment #413777 - Attachment is obsolete: true
Attachment #416991 - Attachment is obsolete: true
Attachment #417850 - Flags: review?(mak77)
Posted file review pass comments (obsolete) —
Comment on attachment 417850 [details] [diff] [review]
v 0.1.4 Patch for initial review

comments attached above
Attachment #417850 - Flags: review?(mak77)
Reporter

Comment 21

10 years ago
Thanks Marco, i'll have some fixes and commentry early monday.
Reporter

Comment 22

10 years ago
Posted file first-pass reply. (obsolete) —
working on the changes now.
Attachment #418870 - Flags: review?(mak77)
Reporter

Comment 23

10 years ago
Almost done with this review iteration. Tagging test is failing. Will return Tuesday after holiday.
Attachment #417850 - Attachment is obsolete: true
Comment on attachment 418870 [details]
first-pass reply.

>+
>+let booksvc = Cc["@mozilla.org/browser/nav-bookmarks-service;1"].
>+              getService(Ci.nsINavBookmarksService);
>+
>+var tagsvc = Cc["@mozilla.org/browser/tagging-service;1"].
>+             getService(Ci.nsITaggingService);
>
>>> These should be lazy getters
>
>Should I use __defineGetter__ in this scope for each of them then?

yes, you should not initialize them till you need them

>>> when could this throw?
>
>perhaps I am too careful:) I gues sit could not. I do that for console.log and wasn't thinking

it's just that try catch is verbose, it is fine to use it, but not for simple assignements or calls that will practically never throw

>>> is this code compatible with our MPL license? if it's distributed under a not compatible license than you can't use any part of it, even if it's freely available on the net.
>>> btw, you could just return typeof unless it's an object, then use isArray (At this point isArray would be a private method inside this one)
>>> PS: typeOf is a too much conflicting name with typeof, if you need it i'd say strictTypeOf
>
>I'll look into the licensing, i think it is public domain

unless explicitely licenced by the author we can't reuse code, just do the same with different code.

>+  for (let prop in aQueryObj) {
>+    this.queryObject[prop] = aQueryObj[prop];
>+  }
>
>>> so, why are you not using the passed-in object and just check phrase and where and set callback?
>
>because you can set other properties as well. this is more flexible.

i'm not sure i get the point, you are copying all properties, you mean we could want to set other properties without modifying the user's object?

>this.queryObject = aQueryObject;
>if (!("where" in this.queryObject) ||
>    ['history', 'bookmarks'].indexOf(this.queryObject.where) == -1)
>  this.queryObject.where = 'bookmarks';
>
>>> Apart this, do we really need the Object suffix? it looks redundant.

>I think we are already overloading the word 'query' so to me 'queryObject' is more of a configuration parameter. perhaps we change it to 'config'?

the word Object does not add any value to the var name, use valuable names

>+PlacesQuery.prototype = {
>+  __name__: "PlacesQuery",
>
>>> not sure what value is adding this to tests, btw you can avoid this and do
>
>since the object morphs itself, i ues that as meta data to check what object I am in when an exception is thrown. Again, more of a diagnostic thing that may or may not have much value in this iteration. next, for sure.
>
>var a = new PlacesQuery();
>PlacesQuery.prototype.isPrototypeOf(a); // true

So what is my above method with isPrototypeOf not covering about your case?

>+  if (!queryObj || typeof queryObj !== 'object') {
>+    throw new QueryException(ERRORS.NO_QUERY_OBJECT_X);
>+  }
>
>>> can this ever happen? shouldn't we ensure this in the caller? didn't we already?
>
>perhaps yes. I am being too careful maybe?

you should protect entry points, so if we could reach this situation, the check is fine.

>+    let bmUri = uri(aObj.url);
>
>>> fyi, this could throw if the user passes in a bad url
>
>so create an nsIUri? ok.

this is already creating an nsIURI, the point is that it could throw, you should handle the error

>+
>+  get genericType() {
>+    try {
>+      if (this._rawNode.lastModified > 0) {
>+        return 'bookmark';
>+      }
>+      else {
>+        return 'history';
>+      }
>+    }
>+    catch(ex) {
>+      return 'history';
>+    }
>+  },
>
>>> error handling is cool, but you should avoid it if there is nothing that can possibly throw
>
>ok, is there too much overhead in try/catch? if so - WTF?:)

no, but what can throw here? you are just doing an if/else and a return... try catch adds lot of noise without any valuable gain
Attachment #418870 - Flags: review?(mak77)
Reporter

Comment 25

10 years ago
Attachment #418940 - Attachment is obsolete: true
Attachment #419485 - Flags: review?(mak77)
Reporter

Comment 26

10 years ago
(In reply to comment #24)

> >ok, is there too much overhead in try/catch? if so - WTF?:)
> 
> no, but what can throw here? you are just doing an if/else and a return... try
> catch adds lot of noise without any valuable gain

I think that was added after I kept getting undefined values from closing the container too soon. I better check the most recent patch.
Reporter

Comment 27

10 years ago
Attachment #419485 - Attachment is obsolete: true
Attachment #419487 - Flags: review?(mak77)
Attachment #419485 - Flags: review?(mak77)
Posted file some more comment (obsolete) —
Attachment #419487 - Flags: review?(mak77)
Attachment #418466 - Attachment is obsolete: true
Attachment #418870 - Attachment is obsolete: true
Reporter

Comment 29

10 years ago
(In reply to comment #28)
> Created an attachment (id=419932) [details]
> some more comment

Mak: your comments are great. you are the efficiency master:) making most of those changes right now.


>why is this not defined in prototype?
>
>+
>+  let self = this;
>+  this.reconfigure = function reconfigure(queryConf, callback) {
>
>not sure why this method is added runtime in the constructor, please clarify
>

Becasue it uses a closure over a possible existing callback function, incase you are calling a similar query again - using the same callback function. So the phrase might change but not the callback. 


>+    self.queryConf = queryConf;
>+    if (callback) {
>+      self.queryConf.callback = callback;
>+    }
>
>hm, you inited .callback to callback even if undefined in placesQuery >constructor, i think it's sane to check everywhere or nowhere (provided you >check it before using it)

not always true, as the default state will init aCallback to 'undefined'. The callback is always optional.
Reporter

Comment 30

10 years ago
>+    self.query = hs.getNewQuery();
>
>purpose for saving query in current object?
>
>
>+    if (self.queryConf.phrase) {
>+      self.query.searchTerms = self.queryConf.phrase;
>+    }
>+    else {
>+      throw new Error(ERRORS.MISSING_PHRASE);
>+    }
>
>so, i can't config a query without a search term?

Correct. I was trying to make this optional before... hmm... thinking

perhaps the "reconfigure" method shold be called "reQuery", since it configures and then executes the query? The naming is a little off now. I am going withthe theory that you will want to change your phrase but not your callback as oftehn to replace results over and over when you are not sure what phrase to use.

>
>+    self.find();
>
>looks like you should pass in the callback?

The callback is already registered in this.queryConf.callback eariler in the constructor
Reporter

Comment 31

10 years ago
>+  find: function QueryExecutor_find(queryConf, callback) {
>+    if (queryConf) {
>+      this.reconfigure(queryConf, callback);
>+    }
>+    if (!this.queryConf.phrase) {
>+      throw new Error(ERRORS.MISSING_PHRASE);
>+    }
>
>this looks fragile, and phrase can be null, you init it to null.
>Most likely you need to add some documentation on top of you methods, or i >can't guess what are you trying to achieve.
>I guess find() always wants to look for a phrase? what will fail if the phrase >is not set, can't we just go on without looking for search term?

I think i originally init'd the phrase to null to keep it from being "undefined", which is how I was figuring out when the query was ready to execute. If there is no phrase, what will come back froma query? all rows? do we want a way to get all rows? if so, we might want another method for that.

I guess that should be if/else not if/if ?
Reporter

Comment 32

10 years ago
on second thoguht, i can kill that second if statement, the first one just needs a comment.(In reply to comment #31)
Reporter

Comment 33

10 years ago
>+  create: function QE_bookmark_create(aObj, aCallback) {
>
>i still don't think an object called placesQuery should have a .bookmark.create >method...
>then it would be a placesHelper, not a querying object. this design path needs >
>to be cleaned up still.

in SQL a query can be an insert, update or delete, but if you think the design is cleaner, i can make an "nsPlacesQueryHelper.jsm" that includes this. I kind of liked the lazy getter though - it does make the "bookmarks" property a bit independent.
Reporter

Comment 34

10 years ago
I tried to use the isPrototypeOf with my objects, but it fails. I do not explicitly set the prototype of the morphed object to the predecessor. I was justcheck ing for a __type__ property. I never wanted the prototype to be "inherited". no big deal as I don't really need this functionality right now. 

let qApi2 = new PlacesQuery({phrase: "baz"});

do_check_true(qApi2.prototype.isPrototypeOf('PlacesQuery'));

TEST-UNEXPECTED-FAIL | (xpcshell/head.js) | TypeError: qApi2.prototype is undefined

I also tried to get the prototype:

TEST-UNEXPECTED-FAIL | (xpcshell/head.js) | TypeError: qApi2.getPrototypeOf is not a function
Reporter

Comment 35

10 years ago
Posted patch v 0.1.7 Updated patch (obsolete) — Splinter Review
ready for more comments.
Attachment #419487 - Attachment is obsolete: true
Attachment #420018 - Flags: review?(mak77)
(In reply to comment #34)
> let qApi2 = new PlacesQuery({phrase: "baz"});
> 
> do_check_true(qApi2.prototype.isPrototypeOf('PlacesQuery'));
You want do_check_eq(qApi2.__proto__, PlacesQuery.prototype);
See https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference/Global_Objects/Object/proto
Reporter

Comment 37

10 years ago
thanks shawn. i need to stop programming before my brain melts.
(In reply to comment #34)
> do_check_true(qApi2.prototype.isPrototypeOf('PlacesQuery'));

btw, this should have been PlacesQuery.prototype.isPrototypeOf(qApi2)
Reporter

Comment 39

10 years ago
Posted file Scratchpad of queries (obsolete) —
Saving some queries I worked on today.
(In reply to comment #29)
> (In reply to comment #28)
> >+  let self = this;
> >+  this.reconfigure = function reconfigure(queryConf, callback) {
> >
> >not sure why this method is added runtime in the constructor, please clarify
> >
> 
> Becasue it uses a closure over a possible existing callback function, incase
> you are calling a similar query again - using the same callback function. So
> the phrase might change but not the callback. 

sorry, this is still a bit unclear to me, can you provide a code example of this behavior?
could be just that i get confused by the fact both methods use a var names "aCallback" and they are fighting in scopes.
Comment on attachment 420018 [details] [diff] [review]
v 0.1.7 Updated patch

As an alternative, couldn't we have a global Places object that can provide Places.query, Places.bookmarks and so on rather then having PlacesQuery.bookmarks? i think the jetpack is more toward that approach?

>diff --git a/toolkit/components/places/src/nsPlacesQueryAPI.jsm b/toolkit/components/places/src/nsPlacesQueryAPI.jsm

>+function PlacesQuery(aQueryConf, aCallback) {
>+
>+  if (!aQueryConf || typeof aQueryConf !== 'object') {
>+    throw new Error(ERRORS.NO_QUERY_CONFIG);
>+  }
>+
>+  this.queryConf = { phrase: null,
>+                     where: 'bookmarks',
>+                     callback: aCallback };
>+
>+  for (let prop in aQueryConf) {
>+    this.queryConf[prop] = aQueryConf[prop];
>+  }

please add comment about why you follow this way rather then the opposite (you already answered me, but i'd like that answer to be in a comment)

>+function QueryExecutor(queryConf, aCallback) {

>+  let self = this;
>+  this.reconfigure = function reconfigure(queryConf, aCallback) {

see comment 40

>+    if (self.queryConf.where === 'bookmarks') {
>+      self.queryOptions.queryType = Ci.nsINavHistoryQueryOptions.QUERY_TYPE_BOOKMARKS;
>+    }
>+    else if (self.queryConf.where === 'history') {
>+      self.queryOptions.queryType = Ci.nsINavHistoryQueryOptions.QUERY_TYPE_HISTORY;
>+    }

else throw?

>+  find: function QueryExecutor_find(queryConf, aCallback) {

>+    let _results = [];

not sure what's meaning that underscore in this context. Does not look like following any Moz code style (i could be wrong but i've never seen such). we use them for private members of objects, but not for vars afaict.

>+QE_bookmark.prototype = {
>+  createOne: function QE_bookmark_createOne(aObj, createdBookmarks) {
>+    if (!aObj.url) {
>+      throw new Error(ERRORS.BAD_ARGUMENT);
>+    }

should also check ("url" in aObj) before that otherwise will warn

>+    let bmUri;
>+    try {
>+      // XXX: make sure url is valid URI

no XXX

>+      bmUri = uri(aObj.url);
>+    }
>+    catch(ex) {
>+      throw new Error("Url is not valid");
>+    }

this should be defined in ERRORS i think?

>+    // XXXddahl: check for aObj.folderName to find the correct folder to
>+    // place the new bookmarks into
>+    if (!("title" in aObj)) {
>+      aObj.title = "";
>+    }
>+
>+    let id = bs.insertBookmark(bs.unfiledBookmarksFolder,
>+                               bmUri,
>+                               bs.DEFAULT_INDEX,
>+                               aObj.title);

notice that we support both a null title and an empty title, even if that sounds crazy, they are different things.
So, it depends if you want to expose this behavior, or flatten everything to empty strings.

>+    let tags = [];
>+    tags = tags.concat(aObj.tags);
>+    ts.tagURI(bmUri, tags);

did you ensure that ("tags" in aObj)?

>+    let _callback = function (results) {

ditto for _

>+    let _queryConf = { where:"bookmarks",
>+                       phrase: bmUri.spec };
>+    let _qapi = new PlacesQuery(_queryConf, _callback).find();

ditto for _

>+    if (isArray(aObj)) {
>+      for (let idx in aObj) {

use forEach on arrays, you can't be sure that nobody added external properties to the array object

>+
>+  get lastAccessTime() {
>+    return this._rawNode.time;
>+  },
>+
>+  get creationTime() {
>+    return this._rawNode.dateAdded;
>+  },
>+
>+  get modificationTime() {

actually i don't recall if we return microseconds or milliseconds, could make sense to return js Date objects or milliseconds maybe, if this is an api intended for js consumers.

>+ResultItem_getNodeType = function () {

i find the underscore confusing here, like if it's missing a dot

>diff --git a/toolkit/components/places/tests/unit/test_placesQueryApi.js b/toolkit/components/places/tests/unit/test_placesQueryApi.js

>+function isDate(aObj) {
>+  if (aObj.getTimezoneOffset){
>+    return true;
>+  }
>+  else {
>+    return false;
>+  }
>+}
>+
>+function isArray(aObj) {
>+  return (aObj.constructor.toString().indexOf("Array") != -1);
>+}

update code, the one in the module looks better


>+  dump("### Added visit with id of " + visitId + "\n");

use print() in xpcshell, it will add \n for you
Attachment #420018 - Flags: review?(mak77)
Reporter

Comment 42

10 years ago
>>+QE_bookmark.prototype = {
>>+  createOne: function QE_bookmark_createOne(aObj, createdBookmarks) {
>>+    if (!aObj.url) {
>>+      throw new Error(ERRORS.BAD_ARGUMENT);
>>+    }

>should also check ("url" in aObj) before that otherwise will warn

so this should be:

    if ("url" in aObj) {
      if (!aObj.url)  {
        throw new Error(ERRORS.BAD_ARGUMENT);
      }
    }

Seems redundant? what am I missing?
(In reply to comment #42)
> Seems redundant? what am I missing?

an && in the middle
ops sorry i meant
if (!("url" in aObj) || !aObj.url)
Reporter

Comment 45

10 years ago
Attachment #420018 - Attachment is obsolete: true
Attachment #423049 - Flags: review?(mak77)
Comment on attachment 423049 [details] [diff] [review]
v 0.1.8 Sync patch updated to jetpack code

please, ask me review once you've addressed Myk's comments, otherwise we end up doing overlapped work :)
Attachment #423049 - Flags: review?(mak77)
Reporter

Comment 47

9 years ago
marco: sorry, just making sure the jetpack changes still pass my xpcshell tests. I'll ask for review once this lands in jetpack:)
i'm fine with reviewing the thing, just that asking review to me and Myk at the same time is duping work. so once you get r+ from him, feel free to ask me another pass, it won't take long.
Reporter

Comment 49

9 years ago
I understand what you meant. What I should have said was that ( and had I been thinking straight) I would not have asked for review at this time.
Reporter

Comment 50

9 years ago
copied over committed patch from jetpack push, updated the tests, added a test for bookmark/tags issue. all tests pass. Do we need more tests? probably:) I still need to run this test on a 3.5 build which I do not have right now.
Attachment #419932 - Attachment is obsolete: true
Attachment #423049 - Attachment is obsolete: true
Attachment #424028 - Flags: review?(mak77)
So, i think we need to define a target use-case for this, especially for the bookmarking part. this is something like the 4th available access point: FUEL, Jetpack, basic API, this one. Before deciding where to put this we must understand what's the use case now that jetpack covers the add-ons needs.

Since, actually i'd prefer having a Places object and having Places.query, Places.bookmark, Places.history, but that ends up partly duplicating what the Jetpack is offering. (and by design is pretty similar)
This work has been useful for finalizing the jetpack, i think it's time to meet up with Dietrich and choose a direction for this inside-places implementation. Basicly i don't want to duplicate Jetpack unless we know we need to.
I thought the Jetpack API wrapped this new API. If that's not the case, than something has gone wrong. We should not be designing and maintaining multiple query APIs for the same set of data.

Re: FUEL - it has near-zero cost, because it's mostly unused afaict, and likely will be replaced by Jetpack APIs.
Reporter

Comment 53

9 years ago
At the very least, this is the code *for Jetpack*. this code SHOULD NOT be a part of the jetpack patch. It only is since this code has not landed on 1.9.1, 1.9.2 and trunk.

 The jetpack patch (after reboot) will import this .jsm
To clarify: You have an 0.8 Jetpack Places API that doesn't use this, and for Jetpack >0.8 it'll use this?
Reporter

Comment 55

9 years ago
This lands now with Jetpack 0.8

The plan is to import this jsm from places in jetpack's "reboot" release - maybe 0.9.
(In reply to comment #55)
> This lands now with Jetpack 0.8
> 
> The plan is to import this jsm from places in jetpack's "reboot" release -
> maybe 0.9.

Awesome, perfect. Comment #51 confused me.

Marco, the use-cases for this in my mind were the following:

- Jetpack API
- extensions (instead of FUEL)
- internal API for things like content pages
- eventually moving the views away from nsINavHistoryQuery/Result* to this, as it matures, removing as much XPCOM as possible in the process
The request looks like a miracle... A generalized api to achieve all of those tasks efficiently wrapping current specific apis

> - Jetpack API
> - extensions (instead of FUEL)

wfm

> - internal API for things like content pages

i have some concern about this, they need a specific interface, wrapping around apis won t give any good performance.
 
> - eventually moving the views away from nsINavHistoryQuery/Result* to this, as
> it matures, removing as much XPCOM as possible in the process

This is hard, unless it's a rewrite. Mano is already reducing com overhead, the perfs Are good and the code is well tested

i think we should choice a direction between helper for extendions/jetpack or a query api rewrite, they're quite different
Reporter

Comment 58

9 years ago
(In reply to comment #57)
> The request looks like a miracle... A generalized api to achieve all of those
> tasks efficiently wrapping current specific apis
> 
> > - internal API for things like content pages
> 
> i have some concern about this, they need a specific interface, wrapping around
> apis won t give any good performance.
> 

which is why we also have a bug for an asynchronous version of this api: bug 534983


> i think we should choice a direction between helper for extendions/jetpack or a
> query api rewrite, they're quite different

There is no reason why we cannot do this. You seem to be ruling out a new API that can handle *multiple use cases*.

As an example, every single web framework out there handles multiple use cases, and of course, you need to write custom code and queries at times to handle the outliers. 

The important issue here is that we are providing an elegant API that users can get their heads around in an hour, not days of study. 

The next step is to replace the guts with all-async queries and iterate on additions to the API as well as supporting additional properties to search on.

Frankly, we need to have a parallel query implementation working as our entire UI is completely changing. We need the flexibility to move in a radical direction if need be.

Do you think moving forward it makes sense to use an API and result node system that was built with a single UI element in mind (Tree)?
(In reply to comment #58)
> Do you think moving forward it makes sense to use an API and result node system
> that was built with a single UI element in mind (Tree)?

Current results API is complex, but versatile, sure was initially built for trees, but we use it on a toolbar, on menus, and with current changes for async folders and multiple viewers, it's most like a live-updating general purpose results system (that can be better, sure, and i think Mano still plans to convert it to JS so that nobody will anymore have to register bookmarks or history observers, they will just listen to a view).
Rewriting something better is clearly possible, but we should ask if it's needed and how much time will it take to have it working at least like the current one.
What i'm trying to say is that my scope is not to stop energy but to try direct it toward something useful before rather than later.

What are we trying to do:
- have nicer API for add-ons -> we can make a jetpack wrapper (here)
- have an async API -> jetpack async wrapper (here)
- have live-updating results from this API -> wrap around current result, pay attention to live-update.

This implementation is not bad, but mixes things like queries and bookmarks, and does not give me a general overview of where we are going.
David is pretty good in designing APIs, so i'd like to see a wiki page, or a diagram or whatever of the final idea, without taking in count implementation at all, and starting from the outside: we have history and bookmarks. Make your best API, introduce current API problems and fix it, introduce implementation problems and fix it. For example placesQuery.bookmark.create is not nice, places.bookmark.create is.

i want to see the puzzle image :)
so, i'd say, let's start from having a Place object, concentrate on Place.query API, make a nice API diagram that we can discuss on during next meeting.
Reporter

Comment 61

9 years ago
This is only a query and object creation system, the "M" in MVC, the exiting system bleeds over too much between M, V and C. I cannot tell where one starts and another ends in the current implementation. 

Sure a road map and diagrams would be great, but designing the entire end all be all version of this api on paper can be self defeating. We have a kernel to work with now, it needs async and some additional search properties. Iterate, Iterate. The logical steps are:

Asynchronous mozStorage usage, which requires a small bit of "SQLbuilding" to generate SQL queries for bookmarks, history and "unified" queries.

Next, ordering, searching by date, "parent node", bookmarks and other properties.

I really do not understand your problem with mixing queries and bookmarks. this is a non-issue.

In Django's ORM I would have everything under one roof:

places.objects.get(id=123) ## gets a single object from the database
places.objects.create(is_bookmark=True, title="foo title", url="http://foo.com", tags=["foo", "bar"]) ## creates the bookmarks with new tags
results = places.objects.filter(title__contains="foo bar", is_history=True) ## returns an array-like object where the query is not executed until the array is iterated

Also, each class can contain custom methods to do more complex queries, etc:

places.objects.filter_bookmarks_since("2009-12-31 00:00:00")[0] ## returns first bookmark created since that date

the basic ORM API handles 80% of use cases: 

places.objects.filter(title__contains="foo").order_by("-id") 
places.objects.create(title="foo",...)
placeObj = places.objects.get(id=9990)
placeObj.delete()

You still write sql when needed, getting a cursor is 1 line of code.

All modern web frameworks work like this. These new frameworks have *removed* the drudgery in web development. There are amazing abstractions for forms, data validation, db queries, HTTP middleware, xml parsing and on and on. The notion of real MVC is there and it makes the application development very clean.

We need to move in the same direction. We need to learn from these tools and techniques in our own internal APIs and systems. 

If this is all about the symantics of PlacesQuery.* vs. places.* in the API, fine. we can re-name it, I don't care. I think I started out by calling it PlacesAPI or something. It doesn't matter to me what it is called.
(In reply to comment #61)
> Sure a road map and diagrams would be great, but designing the entire end all
> be all version of this api on paper can be self defeating.

Not designing a target is defeating too.

> Asynchronous mozStorage usage, which requires a small bit of "SQLbuilding" to
> generate SQL queries for bookmarks, history and "unified" queries.

It is unclear atm if this is going to handle this here, i suppose we need a converter from whatever (API/place:uri) to a common internal format and an SQL builder. in my view this API is just one of the possible "inputs"

> places.objects.create(is_bookmark=True, title="foo title",
> url="http://foo.com", tags=["foo", "bar"]) ## creates the bookmarks with new
> tags
> results = places.objects.filter(title__contains="foo bar", is_history=True) ##
> returns an array-like object where the query is not executed until the array is
> iterated

this is nice, so why isn't there a simple diagram showing something like this? I don't want a complete API, i want a global overview of how things are related.
Also, it's .objects. not .query. I'm fine with having something like Places.entries, Places.entries.bookmarks.create, Places.entries.visits.create, Places.entries.bookmarks.filter, etc. i'm just not fine having bookmark being an attached helper to a query.
If i had bookmarksSvc.addVisit i'd consider it an issue, and here is the same since i have query_for_something.bookmark.create.

> Also, each class can contain custom methods to do more complex queries, etc:
> 
> places.objects.filter_bookmarks_since("2009-12-31 00:00:00")[0]

this should be an object passed to filter

> places.objects.filter(title__contains="foo").order_by("-id") 
> places.objects.create(title="foo",...)
> placeObj = places.objects.get(id=9990)
> placeObj.delete()

And this is much nicer than what we have above! Now i'm starting being interested. Make this Places object, start putting in objects or entries (i think entries is probably better?), providing mock bookmarks and visits objects, and hooking the filtering API there, reusing this queryAPI code.

> If this is all about the symantics of PlacesQuery.* vs. places.* in the API,
> fine. we can re-name it, I don't care. I think I started out by calling it
> PlacesAPI or something. It doesn't matter to me what it is called.

You're now moving toward what i want, not another useless wrapper but a completely different view of seeing the API. and i'm unsure why this did not start immediately that way.
Reporter

Comment 63

9 years ago
believe me, I wanted to start without wrapping anything. Just a pure-data api for getting, querying and creating "place" objects.
Depends on: 543888
Comment on attachment 424028 [details] [diff] [review]
v 1.0 Places Sync Patch for Firefox

clearing request, waiting for the new design.
Attachment #424028 - Flags: review?(mak77)
re-assigning to me
Assignee: ddahl → mak77
Status: NEW → ASSIGNED
Posted patch alpha wip for feedback (obsolete) — Splinter Review
Attachment #417015 - Attachment is obsolete: true
Attachment #421375 - Attachment is obsolete: true
Attachment #424028 - Attachment is obsolete: true
Posted patch wip alpha 2 (obsolete) — Splinter Review
added some doc to better present the idea and help feedback-ers
Attachment #443444 - Attachment is obsolete: true
Attachment #443464 - Flags: feedback?(adw)
Attachment #443464 - Flags: feedback?(ddahl)
Comment on attachment 443464 [details] [diff] [review]
wip alpha 2

If you all have some time, cool, otherwise don't feel guilty, i know we are all pretty busy.
Attachment #443464 - Flags: feedback?(dietrich)
Comment on attachment 443464 [details] [diff] [review]
wip alpha 2

feedback+ for simplicity, ease-of-use, jetpack-ableness. array of hashes is fine for results, for now.

per irc, please convert microsecond results do either millisecond or date objs.

in the TODO i'd list by priority and draw a line that says "LAND ME". most of what you have there seems like it can be in phase 2, fixed after initial landing.

what does the grouped result set look like?

some thoughts on the result values:

why expose pageId?
s/last/lastVisit/ ?
s/iconUrl/favicon/ ?
Attachment #443464 - Flags: feedback?(dietrich) → feedback+
(In reply to comment #69)
> what does the grouped result set look like?

there is no grouped result set so far, results are flat uri lists. once there groups should just be other queries. Unless i've misunderstood the question.

> why expose pageId?

We had a bunch of request for that, and we could find them useful in some case (using them in queries is faster), so I tried to make it "complete".

> s/last/lastVisit/ ?

yep, will do. I still have doubts on some name, escpecially time ones, while visits have a visit time, bookmarks can have a visit time but also a creation and a last modification time. So for example i could want to query bookmarks with visits between X and Y or modified between X and Y, while the latter would make no sense per visits. So the "when" entry i have is ambiguous, for example.

> s/iconUrl/favicon/ ?

I did that initially, but thought it was then hard to know if this was an url or the effective favicon image data, so i preferred being explicit
Attachment #443464 - Flags: feedback?(mano)
Hey, we need to talk ;)

The query mechanism looks very nice overall.   However, I don't like the representation - i.e. the callback with the array.  Reducing COM overhead doesn't mean that we have to drop the basic result node *interfaces*. They could work nicely here too - they won't do much, but they will allow us to use the current view interface for the "callback".  Long-term, I think that's much better.

Do note that changing the current patch to work that way shouldn't be hard at all.
Comment on attachment 443464 [details] [diff] [review]
wip alpha 2

For now...
Attachment #443464 - Flags: feedback?(mano) → feedback-
(In reply to comment #71)
> Do note that changing the current patch to work that way shouldn't be hard at
> all.

The idea is that the current result interface will be implemented by a wrapper around this one, that just returns simple static lists.

There should be no problem in building an object that reproduces current interface and uses internally this simple array querying method, it would cache the results locally and present them the way it wants (with the wanted interface), while at the same time we expose these simple arrays for implementers willing to just get a simple list of results. It is practically what we do today, we ask backend to give us results and we build a result around them.
As explained over IRC:
1. What I don't want to see is a two totally different sets of public interfaces.  Jetpack should expose nsINavHistoryResultNode and nsINavHistoryContainerResultNode
2. However, it doesn't have to be COMed.  And actually, it shouldn't.
3. Methods that are not implemented by this implementation of nsINavHistory*ResultNode should throw.
After a bit of discussion with Mano:
- I'll change some name to be adherent to current api
- I'll replace getChildren with a .query getter, so that you can get query for a container or directly call .query.execute(callback)
- add "this" specification in execute

this will remain a query object that returns raw static db data, the wrapper will take care of trying to replace what we have today
Comment on attachment 443464 [details] [diff] [review]
wip alpha 2

I didn't read the other comments before writing down these thoughts.

I'd like to see pages at the center.  So instead of thinking "Do I want to search bookmarks, pages, or visits?", I want to think "I want to search pages that are bookmarked" or "pages that are visited" or "pages that are bookmarked and visited" or "all pages".  (I know Places allows bookmarks that aren't backed by a page in moz_places, but IMO that's unnecessary, and we shouldn't bubble up that design into a new query API.)

"where" is a little confusing.  The value of one property on an object shouldn't change what the valid properties on that object are, and that's what "where" does.  Moreover, if bookmark-specific properties are present, they force "where" to be "bookmarks" even if it wasn't "bookmarks".  So I think if you stick with the three-way option of bookmarks, visited, and pages, there should be BookmarksQuery, VisitedQuery, and PagesQuery constructors, each taking options that are valid for it.  Or with the page-centric model I've suggested, "bookmarked" and "visited" properties that are objects with properties specific to them.  Like:

  // Pages that mention "foo" and are bookmarked in some folder.
  let query = {
    phrase: "foo",
    bookmarked: { folders: [folder] }
  };

  // Pages that mention "foo", are bookmarked in some folder, and visited
  // within some timespan.
  let query = {
    phrase: "foo",
    bookmarked: { folders: [folder] },
    visited: { when: [from, to] }
  };

There's a "limit" but no "offset".  Both would be useful for doing paged views for example.

There's no "sort".

ResultItem.getChildren should only be present when the result item is a container.  And "open" or "expand" might be a better name.

Be careful about what |this| refers to inside client callbacks.
Attachment #443464 - Flags: feedback?(adw) → feedback-
(In reply to comment #76)
> (I know Places allows bookmarks that aren't
> backed by a page in moz_places, but IMO that's unnecessary, and we shouldn't
> bubble up that design into a new query API.)

nope. anything with a URI has a record in moz_places. in the unvisited bookmark case, the moz_places row just has zero visits.
(In reply to comment #76)
> ResultItem.getChildren should only be present when the result item is a
> container.  And "open" or "expand" might be a better name.

yeah, something like open(callback) would be nice.
(In reply to comment #76)
> I'd like to see pages at the center.  So instead of thinking "Do I want to
> search bookmarks, pages, or visits?", I want to think "I want to search pages
> that are bookmarked" or "pages that are visited" or "pages that are bookmarked
> and visited" or "all pages". 

this cuts away visits (since you have more visits than pages, and visits to the same uris) and containers (folders, queries). since this is expected to replace our current query system in future, we can't really just have everything around pages. Sure I'd like to have pages in the middle, but that would require dropping boolmarks hierarchies as I already said, and I think we (and our users) are not ready for that.

(I know Places allows bookmarks that aren't
> backed by a page in moz_places, but IMO that's unnecessary, and we shouldn't
> bubble up that design into a new query API.)

Actually only containers and separators, but as I said before, visits are an issue, considering everything grouped by uri is different from getting latest visits.

> "where" is a little confusing.  The value of one property on an object
> shouldn't change what the valid properties on that object are, and that's what
> "where" does.

Sure, but there is no way to make page the center and have bookmarks hierarchies, separators and visits.

Moreover, if bookmark-specific properties are present, they
> force "where" to be "bookmarks" even if it wasn't "bookmarks".  So I think if
> you stick with the three-way option of bookmarks, visited, and pages, there
> should be BookmarksQuery, VisitedQuery, and PagesQuery constructors,

This isn't a bad idea, but that will mean implementers (And us) will than have to maintain 3 separate callers when most of the properties are the same

>   // Pages that mention "foo" and are bookmarked in some folder.
>   let query = {
>     phrase: "foo",
>     bookmarked: { folders: [folder] }
>   };
> 
>   // Pages that mention "foo", are bookmarked in some folder, and visited
>   // within some timespan.
>   let query = {
>     phrase: "foo",
>     bookmarked: { folders: [folder] },
>     visited: { when: [from, to] }
>   };
> 
> There's a "limit" but no "offset".  Both would be useful for doing paged views
> for example.

I added it initially, then I removed it since there is no gain in having offset, Sqlite just gathers results and drops them, thus any implementer can do the same without suffering perfomance problems. Adding an offset would let them believe we can manage that in a performant way, and i fear could bring bad perfomant callers. If SQLite should change we could add it in future.

> There's no "sort".

Yep, this is alpha, I added it today with other options.

> ResultItem.getChildren should only be present when the result item is a
> container.  And "open" or "expand" might be a better name.

As per talk with Mano this is going to be replaced by "query", that will just return a new PlacesQuery, if it's not a container will throw.
so just
if (result.readableType == "container)
  result.query.execute(callback, this);

or you can get the query object and execute it later (like in our openContainer case)

> Be careful about what |this| refers to inside client callbacks.

I added a this argument to the constructor, that I apply to callbacks if provided.

So, what I get from this is:
- being urlcentric: this means dropping any hierarchy, thus means we won't ever be able to replace our current query/result system with this one.
- Have 3 separate objects, based on "where" we search. I was planning in JetPack to have shortcuts, like Places.bookmarks, Places.pages, Places.visits that set where automatically. I could even split them here, I only fear that when managing a query result in code (like a treeview) we will then have to go three code paths and we will have three different ResultItem, that does not look like a simplification from what we have today.

Thoughts?
consider splitting objects and constructors will require who receives results to go looking of IsPrototypeOf (or we can give an helper, sure), with the current system they just check result.readableType... in the end it does not change much having one object or three, the caller will always have to know which kind of query and nodes is managing, so it's just matter fo different syntax to check that.
I can probably make some change to the API to go toward some more page-centric vision. I like .bookmarked and .visited suggestions from Drew, visits could be a special case for which I could provide an ungroupVisits option. Trying this way, if it works I'll have a nice merge of all good suggestions :)
Can you put up an etherpad or wiki page with example code for a few common use-cases such as history search, opening a bookmark folder, displaying visits for today, etc.
Posted patch wip alpha 3 (obsolete) — Splinter Review
this is more page-centric, and has some name that is more adherent to navHistoryResultNode
Attachment #443464 - Attachment is obsolete: true
Attachment #443464 - Flags: feedback?(ddahl)
Posted patch wip alpha 4 (obsolete) — Splinter Review
Reimplmented what was working in alpha 2, plus annotations, better tests and a bunch of fixes. Attaching as a checkpoint, since tomorrow I'll have to spend some time on reviews.
Still to figure out a nice way to handle .include and .exclude, will try to just add common bool options in .visited and .bookmarked, as suggested by Drew on IRC.
Will also put up examples in the wiki page as requested by Dietrich once the above is figured out.
Attachment #444445 - Attachment is obsolete: true
note to self: keywords, referringURI, GUIDs
Posted patch wip beta 1 (obsolete) — Splinter Review
apart suckiness of some piece of code (cleanup and perf come later in the game), this should implement most of flat querying (get list of pages, no containers).
Attachment #444536 - Attachment is obsolete: true
Attachment #445940 - Flags: feedback?(adw)
Attachment #445940 - Flags: feedback?(dietrich)
Attachment #445940 - Flags: feedback?(mano)
Drew, I'm not sure how we will handle compatibility in JetPack, should I just take all the code in this module and copy it into the JEP at each ne jetpack release? Importing it in the JEP is nice, but then we are not allowed to do any change to the "api" this exposes. Actually I'd hope we don't need changes to the external side of this, but by experience that's not a good bet.
Comment on attachment 445940 [details] [diff] [review]
wip beta 1


>+ * QueryConf.prototype = {
>+ *   phrase: string.
>+ *           Show only pages containing this string in either title, uri or
>+ *           tags.  Case insensitive.  Can use ^ and $ to match beginning or
>+ *           end.

i'd prefer "search" or "match" instead of "phrase" here. actually, i don't like either of those much. searchString is a bit long. searchFor?

>+ *   host: string.
>+ *         Show only pages containing this string in the host.  Case
>+ *         insensitive.  Can use ^ and $ to match beginning or end.
>+ *   uri: string.
>+ *        Show only pages containing this string in the uri.  Case
>+ *        insensitive.  Can use ^ and $ to match beginning or end.
>+ *   annotated: array of strings.
>+ *           Show only pages with these annotations (Either page or item).
>+ *   bookmarked: object {
>+ *                 tags: array of strings.
>+ *                       Show only pages tagged with these tags.
>+ *                 folders: array of numbers.
>+ *                          Show contents of these folders. (non-recursive)

comment should indicate that the contents of the folders are merged.

>+ *                 id: number.
>+ *                      Show only the bookmark with this id.
>+ *                 when: array of 2 Date objects.
>+ *                       Show only bookmarks created between these times.
>+ *                       Can use null beginning or end time to match till epoch
>+ *                       or now.
>+ *                 modified: array of 2 Date objects.
>+ *                           Show only bookmarks modified between these times.
>+ *                           Can use null beginning or end time to match till
>+ *                           epoch or now.

for both "when" and "modified", i'd prefer explicit options for start/end instead of an array. eg:

when: {start: XXX, end: YYY}

also, why "when" instead of "created"? the latter seems much more descriptive.

>+ *                 excludeNonContainers: boolean.
>+ *                                       Removes any non-container from results.
>+ *                                       Default is false.

instead of the negative, i'd prefer "onlyFolders" or something like that.

>+ *                 excludeReadOnlyContainers: boolean.
>+ *                                            Removes read only containers from
>+ *                                            results.  Default is false.
>+ *               }
>+ *   visited: object {
>+ *              count: array of numbers.
>+ *                     Show only pages with these so many visits.
>+ *                     Can use null minimum or maximum to match anything.

"count" is ambiguous - maybe "visitCount" or "numberOfVisits"?

s/these so/this/

what do you mean by "null minimum or maximum"?

>+ *              transitions: array fo transition types.
>+ *                           Show only pages with at least one visit with these
>+ *                           transitions.

s/fo/of/

>+ *              when: array of 2 Date objects.
>+ *                    Show only pages with visits between these times.
>+ *                    Can use null beginning or end time to match till epoch
>+ *                    or now.

ditto as above for explicit start/end properties

>+ *              excludeRedirectSources: boolean.
>+ *                                      Removes redirects sources from results.
>+ *                                      Default is false.
>+ *              excludeRedirectTargets: boolean.
>+ *                                      Removes redirects targets from results.
>+ *                                      Default is false.

Hm, I think one of those should be on by default. I'm not sure which though.

>+ *              includeHiddenPages: boolean.
>+ *                                  Returns also pages marked as hidden.
>+ *                                  Default is false.

"Includes pages marked as hidden in the results."

What's the use-case? What pages are marked hidden right now?

>+ *              includeVisits: boolean.
>+ *                             Returns all visits.
>+ *                             Default is false, that means visits are grouped
>+ *                             by uri, and no duplicates are returned.

visited.includeVisits. Hm, I think needs a different name that more clearly says that the results will be per-visit, not grouped by URI. ungrouped? allVisits?

>+ *            }
>+ *   sort: object {
>+ *           by: string.
>+ *               Either "none", "title", "time", "uri", "accessCount" or
>+ *               "lastModified", "frecency".  Defaults to "none".
>+ *           dir: string.
>+ *                Either "asc" or "desc".  Defaults to "asc".
>+ *         }
>+ *   group: string.
>+ *          Either "tags", "folders", "day", "month", "year" or "domain".
>+ *          Defaults to "none".
>+ *   limit: number.
>+ *          Maximum umber of results to return.  Defaults to all results.

number

>+ *   merge: string.
>+ *          How to merge this query's results with others in the same request.
>+ *          Either "union", "intersect" or "except".
>+ * }

Should add details for what each option does.

Feedback+. Outside of these external API comments, the rest of the patch looks pretty good.
Attachment #445940 - Flags: feedback?(dietrich) → feedback+
(In reply to comment #89)
> (From update of attachment 445940 [details] [diff] [review])
> 
> >+ * QueryConf.prototype = {
> >+ *   phrase: string.

> i'd prefer "search" or "match" instead of "phrase" here. actually, i don't like
> either of those much. searchString is a bit long. searchFor?

Not sure, David had good feelings about using "phrase", the old API is using "terms" that does not look nice still. I think I won't change it unless we find a good alternative

> >+ *   bookmarked: object {
> >+ *                 tags: array of strings.
> >+ *                       Show only pages tagged with these tags.
> >+ *                 folders: array of numbers.
> >+ *                          Show contents of these folders. (non-recursive)
> 
> comment should indicate that the contents of the folders are merged.

actually I've changed this to 

at: { folder: folderId, position: indexInFolder }

Why not supporting more folders? Well this was wrong in the old api imho, since more queries can be ORed, if I want to get results from multiple queries I should OR 2 queries, and since asking for a bookmark to be in more folders is a nonsense, this should have been a single folder id.
Other params are ANDed, this was the only one to be ORed.

> also, why "when" instead of "created"? the latter seems much more descriptive.

well, visited.when sounds better than visited.created, so I retained bookmarked.when for simmetry

> >+ *   visited: object {
> >+ *              count: array of numbers.
> >+ *                     Show only pages with these so many visits.
> >+ *                     Can use null minimum or maximum to match anything.
> 
> "count" is ambiguous - maybe "visitCount" or "numberOfVisits"?

I did not want to repeat the visit part since it's already visited.count

> >+ *              excludeRedirectSources: boolean.
> >+ *                                      Removes redirects sources from results.
> >+ *                                      Default is false.
> >+ *              excludeRedirectTargets: boolean.
> >+ *                                      Removes redirects targets from results.
> >+ *                                      Default is false.
> 
> Hm, I think one of those should be on by default. I'm not sure which though.

Unfortunately that creates confusion into users due to Hotels and Airports using redirecting portals (pages disappear then)

> >+ *              includeHiddenPages: boolean.
> >+ *                                  Returns also pages marked as hidden.
> >+ *                                  Default is false.
> 
> "Includes pages marked as hidden in the results."
> 
> What's the use-case? What pages are marked hidden right now?

pages with non visible transitions, redirects are also sometimes marked as hidden. history code is not crystal clear regarding this, but i need it for compatibility with old queries, even if as of today we don't use it anywhere (I have to check recent download history patches though)

> visited.includeVisits. Hm, I think needs a different name that more clearly
> says that the results will be per-visit, not grouped by URI. ungrouped?
> allVisits?

I like allVisits
Posted patch wip beta2 (obsolete) — Splinter Review
- a ton of bug fixes (jetpack tests helped here)
- added a first implementation of place: query deserializer, not complete but wanted to check coverage. tests missing. Could be finished in followups.
- added a first try at a grouping, alpha, just wanted to check how far I am. followups more likely.
- addressed first comments, some cleanup.

What I care more right now is the external interface, that is most likely the only showstopper to put this out and finish it along the way.

Dietrich, do you have any time for another pass in the next days/next week?

Anybody willing to make drive-by comments is more than welcome.
Attachment #445940 - Attachment is obsolete: true
Attachment #447870 - Flags: feedback?(dietrich)
Attachment #445940 - Flags: feedback?(mano)
Attachment #445940 - Flags: feedback?(adw)
I'm sorry for not providing feedback on your earlier patch.  Here's a drive-by on the API in the latest one.

>+ * EXAMPLES:
>+ *  function callback(results) {
>+ *    if (results.length == 0)
>+ *      dump("Finished collecting results\n");
>+ *    else
>+ *      dump("Collected " + results.length + " results this turn\n");
>+ *  }
>+ *  new PlacesQuery({_QueryConf_}, callback, thisObject);
>+ *  new PlacesQuery([{_QueryConf1_}, {_QueryConf2_}], callback, thisObject);

It would be a little less work for the client if the callback were called with only one result each time (like Array.forEach()), i.e., if you wrote the for-loop for him.

Does the query start immediately once you instantiate it?  If so, that would be surprising to me as someone using this API, since constructors usually don't have side effects like that.  I'd prefer a query.fetch(callback) or some similar method that would start it.

>+ * _QueryConf_ = {
>+ *   phrase: string.
>+ *           Containing this string in either title, uri or tags.  Case
>+ *           insensitive.  Can use ^ and $ to match at beginning or end.
>+ *   host: string.
>+ *         Containing this string in the host.  Case insensitive.
>+ *         Can use ^ and $ to match at beginning or end.
>+ *   uri: string.
>+ *        Containing this string in the uri.  Case insensitive.
>+ *        Can use ^ and $ to match beginning or end.

Since you allow ^ and $, how about just letting these be regexpes also?  Either don't make me learn non-conventions for strings, or go all the way and let me use regexpes.

>+ *   bookmarked: object
>+ *   {
>+ *     tags: array of strings.
>+ *           Tagged with these tags.
>+ *     at: object
>+ *     {
>+ *       folder: number.
>+ *               Inside this folder. (non-recursive)
>+ *       position: number.
>+ *                 At this position. (relative to folder).
>+ *                 If undefined or null matches all children.
>+ *     }

As I mentioned before, I like the separate bookmarked and visited objects, since they're different modes, but a sub-sub-object for position seems unnecessary.  Why not just simple folder and position properties?  Same for the when, modified, and count objects.

>+ *   sort: object
>+ *   {
>+ *     by: string.
>+ *         Either "none", "title", "time", "uri", "accessCount", "lastModified",
>+ *         "frecency".  Defaults to "none".
>+ *     dir: string.
>+ *          Either "asc" or "desc".  Defaults to "asc".
>+ *   }

Similar for sort.  "sortBy" or and "sortDir" properties would be simpler.

>+ *   group: string.
>+ *          Either "tags", "containers", "days", "months", "years" or "domains".
>+ *          Defaults to "none".
>+ *          NOTE: Not yet implemented.

One thing that confused me is allVisits.  Grouping is controlled by this group property, but allVisits also mentions grouping.  It would be simpler if whatever grouping allVisits does could be specified as part of this group property instead.

>+ *   limit: number.
>+ *          Maximum number of results to return.  Defaults to all results.

How about an offset too?  It would make paged views easier.

>+ *   merge: string.
>+ *          How to merge this query's results with others in the same request.
>+ *          Valid values:
>+ *          - "union": merge results from the 2 queries.
>+ *          - "except": exclude current results from the previous ones.
>+ *          - "intersect": only current results that are also in previous ones.

This is kind of cool.  So the merges are applied in the order of the queryconfs?  But, it's not so nice for simple cases, like union all of the queryconfs, since you have to write |merge: "union"| in all of them.  And what happens if you give multiple queryconfs but don't define merge on some of them?

>+ * NOTE: In case of multiple queries, sort, group and limit of the first query
>+ * will be used for the global result.

You can enforce this in the API by making these properties not be properties of the queryconf but of the query instead.  (And then you won't need the NOTE.)  I think you should.  Just as ORDER BY and LIMIT are global to all SELECTs in SQL.  But, does group really need to be global?  Couldn't I group by days for one of my queryconfs, by months for another, and then union them?

Actually, this raises a larger issue.  There are queryconfs, and there are queries.  Since you can specify multiple queryconfs in a query, a query is more like a manager for queryconfs, and queryconfs are more like queries.  That's a sign that at least these things could use more accurate names, but I think it also might be a sign that the queries are trying to do too much.  How about one queryconf per query?  If you want to combine multiple queries, you pass them to another function that does that for you.  Just thinking out loud:

  var query1 = new PlacesQuery({ ... });
  var query2 = new PlacesQuery({ ... });
  query1.union(query2);
  // Or maybe var query3 = query1.union(query2);
  query1.sort(key, dir).limit(100).fetch(callback);
(In reply to comment #92)

> It would be a little less work for the client if the callback were called with
> only one result each time (like Array.forEach()), i.e., if you wrote the
> for-loop for him.

Hm, I like the idea, will probably do.


> Does the query start immediately once you instantiate it?  If so, that would be
> surprising to me as someone using this API, since constructors usually don't
> have side effects like that.  I'd prefer a query.fetch(callback) or some
> similar method that would start it.

Actually if you pass a callback to the constructor it is ran immediately, otherwise you can call query.execute(callback) to run it. So both methods are supported.


> Since you allow ^ and $, how about just letting these be regexpes also?  Either
> don't make me learn non-conventions for strings, or go all the way and let me
> use regexpes.

The problem is that our sqlite does not have regexp support, sure I could make that if a regexp is defined I just don't use LIKE, but not using LIKE means being slower, so regexes here are the exception.
I did not want to invent new ways to tell how to match, since this syntax is known. Will evaluate if makes sense to allow regexes and sniff their content to use like when possible.


> >+ *   bookmarked: object
> >+ *   {
> >+ *     tags: array of strings.
> >+ *           Tagged with these tags.
> >+ *     at: object
> >+ *     {
> >+ *       folder: number.
> >+ *               Inside this folder. (non-recursive)
> >+ *       position: number.
> >+ *                 At this position. (relative to folder).
> >+ *                 If undefined or null matches all children.
> >+ *     }
> 
> As I mentioned before, I like the separate bookmarked and visited objects,
> since they're different modes, but a sub-sub-object for position seems
> unnecessary.  Why not just simple folder and position properties?

because there is no point in querying for position without querying for folder, can you give me a use case of when i need to know all bookmarks that are third in their container?


Same for the
> when, modified, and count objects.
> Similar for sort.  "sortBy" or and "sortDir" properties would be simpler.

it looks more organized to me with these small objects, otherwise is just a bag of hundreds properties to recall. Why do you think it's simpler having all seprarate properties?


> >+ *   group: string.
> >+ *          Either "tags", "containers", "days", "months", "years" or "domains".
> >+ *          Defaults to "none".
> >+ *          NOTE: Not yet implemented.
> 
> One thing that confused me is allVisits.  Grouping is controlled by this group
> property, but allVisits also mentions grouping.  It would be simpler if
> whatever grouping allVisits does could be specified as part of this group
> property instead.

I have no idea how, allVisits is not grouping, it is ungruping. Also I could want to group by date and get visits as I could want to group by date but get pages.

> >+ *   limit: number.
> >+ *          Maximum number of results to return.  Defaults to all results.
> 
> How about an offset too?  It would make paged views easier.

You already asked for it, even if I could put into a fake offset, there would be no performance gain. If it's just because of "we do it for the dev" I could do that


> >+ *   merge: string.
> >+ *          How to merge this query's results with others in the same request.
> >+ *          Valid values:
> >+ *          - "union": merge results from the 2 queries.
> >+ *          - "except": exclude current results from the previous ones.
> >+ *          - "intersect": only current results that are also in previous ones.
> 
> This is kind of cool.  So the merges are applied in the order of the
> queryconfs?  But, it's not so nice for simple cases, like union all of the
> queryconfs, since you have to write |merge: "union"| in all of them.

union is the default value, so you don't have to specify it, even if you can

And what
> happens if you give multiple queryconfs but don't define merge on some of them?

union is used as default then

> 
> >+ * NOTE: In case of multiple queries, sort, group and limit of the first query
> >+ * will be used for the global result.
> 
> You can enforce this in the API by making these properties not be properties of
> the queryconf but of the query instead.

that means two things:
- I would always have to ask user to call execute, since I cannot anymore execute automatically.
- The user has to go through a more complicated configuration, ideally this was one-line, make a query definition and pass it in... the only way I see is adding more params to the constructor...

>  But, does group really need to be global?  Couldn't I group by days for one of
> my queryconfs, by months for another, and then union them?

It could be really hard to postfilter/postprocess this kind of mixed results, you have to know where each result comes from and how to handle it. Actually  grouping is still a bit up in the air, so I don't have good answers, but having a history query grouped by date and a bookmarks query grouped by folders put toghether is giving me an headache.

> Actually, this raises a larger issue.  There are queryconfs, and there are
> queries.  Since you can specify multiple queryconfs in a query, a query is more
> like a manager for queryconfs, and queryconfs are more like queries.  That's a
> sign that at least these things could use more accurate names, but I think it
> also might be a sign that the queries are trying to do too much.  How about one
> queryconf per query?  If you want to combine multiple queries, you pass them to
> another function that does that for you.  Just thinking out loud:
> 
>   var query1 = new PlacesQuery({ ... });
>   var query2 = new PlacesQuery({ ... });
>   query1.union(query2);
>   // Or maybe var query3 = query1.union(query2);
>   query1.sort(key, dir).limit(100).fetch(callback);

This is interesting:
var query1 = new PlacesQuery({q1});
var query2 = new PlacesQuery({q2});
var query3 = new PlacesQuery({q3});
query1.union(query2).except(query3).sort(key, dir).limit(100).fetch(callback);

current syntax is instead:
new PlacesQuery([ {q1, sort: {by: key, dir: dir}, limit: 100},
                  {q2},{q3, merge: "except"} ], callback);

There is one possible problem though, current place: uris can include multiple queries, translating it would require a new [queries] = PlacesQueryConverter(placeuri) instead of allowing placeuri as a PlacesQuery argument.
But i'll evaluate the approach because it solves a couple concerns I had.
(In reply to comment #93)
> Actually if you pass a callback to the constructor it is ran immediately,
> otherwise you can call query.execute(callback) to run it. So both methods are
> supported.

Why support both?  In addition to being surprising for clients as I mentioned, allowing query execution on construction means that you (Marco) have to write more code, more tests, more documentation, and it's more for people who are using the API to learn and potentially screw up.

> The problem is that our sqlite does not have regexp support, sure I could make
> that if a regexp is defined I just don't use LIKE, but not using LIKE means
> being slower, so regexes here are the exception.

OK.  That's understandable.

> because there is no point in querying for position without querying for folder,
> can you give me a use case of when i need to know all bookmarks that are third
> in their container?

No use case, and I didn't mean to imply there is.  Even with a sub-object, clients can define position but not folder, right?  So that's not the issue.  The hiearchy doesn't seem necessary.

> it looks more organized to me with these small objects, otherwise is just a bag
> of hundreds properties to recall. Why do you think it's simpler having all
> seprarate properties?

Both extremes are bad -- hundreds of flat properties, or a hierarchy with hundreds of levels.  But in this small case, each of these sub-objects has only two properties, and in each case you don't even have to define both properties.

> > property, but allVisits also mentions grouping.  It would be simpler if
> > whatever grouping allVisits does could be specified as part of this group
> > property instead.
> 
> I have no idea how, allVisits is not grouping, it is ungruping. Also I could
> want to group by date and get visits as I could want to group by date but get
> pages.

The description of allVisits mentions grouping twice.  When I wrote that comment I was actually confused, but today I realized that what allVisits is doing is determining whether you get back a list of visits or a list of URIs, and then the allVisits description made sense to me.  I was confused and I work on Places.

So I'd suggest making a new mode called "pages" that works like the |bookmarked| and |visited| objects.  The visited object would act like allVisits == true, and the pages object would act like allVisits == false and return URIs.  That would mean |pages| would return more pages that the ones that are visited, but you could control that if you want by adding an "isVisited" property to the pages object.

> You already asked for it, even if I could put into a fake offset, there would
> be no performance gain. If it's just because of "we do it for the dev" I could
> do that

Why are we designing a new API if we're not doing it for the dev?

> > This is kind of cool.  So the merges are applied in the order of the
> > queryconfs?  But, it's not so nice for simple cases, like union all of the
> > queryconfs, since you have to write |merge: "union"| in all of them.
> 
> union is the default value, so you don't have to specify it, even if you can
> 
> And what
> > happens if you give multiple queryconfs but don't define merge on some of them?
> 
> union is used as default then

OK, how about if I wanted all intersections then.  Or something else.

> > You can enforce this in the API by making these properties not be properties of
> > the queryconf but of the query instead.
> 
> that means two things:
> - I would always have to ask user to call execute, since I cannot anymore
> execute automatically.

As I mentioned, I think you shouldn't execute in the constructor, but for this point the client would just pass these properties into the query constructor.

> - The user has to go through a more complicated configuration, ideally this was
> one-line, make a query definition and pass it in... the only way I see is
> adding more params to the constructor...

Yes, to the previous point.

But I like the method-based approach that I suggested better than trying to do everything in the constructor.

> grouping is still a bit up in the air, so I don't have good answers, but having
> a history query grouped by date and a bookmarks query grouped by folders put
> toghether is giving me an headache.

Why?  They're just nodes after all.  Just a list of nodes.

> There is one possible problem though, current place: uris can include multiple
> queries, translating it would require a new [queries] =
> PlacesQueryConverter(placeuri) instead of allowing placeuri as a PlacesQuery
> argument.

If you're thinking about query URIs while designing a new API, you're doing it wrong. :)  If we have to provide some external mapping between query URIs and the new awesome query API to keep the new query API awesome, that's a good tradeoff.
(In reply to comment #94)
> If you're thinking about query URIs while designing a new API, you're doing it
> wrong. :)  If we have to provide some external mapping between query URIs and
> the new awesome query API to keep the new query API awesome, that's a good
> tradeoff.

I think there will be a point where we will want to replace the old api with a new one, then we will need to translate existing place: uris. And also to provide a result of a bookmarks folder, if it contains a place: query, we should be able to open it and show its contents.
I'm going to implement the following changes and call it v1:

- call callback per individual result
- remove immediate execution from the ctor. explicit is better here.
- flatten the sub-sub-objects. readability is better, and so is less complex bracketry when constructing queries.
- omit support for multiple queryconfs per query. -> v2.
- omit support for grouping. -> v2

Further improvements can build on this core set of features, in subsequent bugs.

I'm also going to ship this in the Jetpack SDK instead of the core product for now. Bundling inside Jetpack gets the API into developers hands sooner and gives us the flexibility to easily respond to developer feedback.

If we want to utilize this API inside Firefox for specific feature(s), then we'll cross that bridge at that point. (Maybe we should just keep it in a user repo for now?)
blocking2.0: --- → ?
This doesn't block the release. See above comment where we're going to ship it as a module in the Jetpack SDK.
blocking2.0: ? → -
above choices looks fine to me
Attachment #447870 - Flags: feedback?(dietrich)
Comment on attachment 447870 [details] [diff] [review]
wip beta2

feedback+ on this, outside of the changes identified in adw's comments, mostly fixed (or disabled) in my upcoming patch.
Attachment #447870 - Flags: feedback+
Posted patch wip beta3 (obsolete) — Splinter Review
implements the aforementioned changes.

todo:

* fix allVisits to clearer
* allow simple bool for bookmarked/visited, instead of "bookmarked: {}
* replace .tags with .tagsArray,  don't need to raw field
Assignee: mak77 → dietrich
Attachment #447870 - Attachment is obsolete: true
Posted patch wip rc1Splinter Review
All targeted changes made. I didn't go for a full "pages" object, seemed unnecessary. Instead, I renamed the property includeAllVisits which is a much more explicit description of what it actually does.

I also added some more tests. The property tests could be more filled out, but I'm not going to block moving forward on it, since I'll probably flex it in the Jetpack tests.
Attachment #463253 - Attachment is obsolete: true
Attachment #463536 - Flags: review?(mak77)
Comment on attachment 463536 [details] [diff] [review]
wip rc1

I had just finished doing review and Firefox crashed leaving me with nothing, so could be I'll be a bit more quick in comments. Trying to recall everything.

>diff --git a/toolkit/components/places/src/PlacesQuery.jsm b/toolkit/components/places/src/PlacesQuery.jsm

>+ * NOTE: history results returned by this object may be up to two minutes behind
>+ * since it does not handle TEMP tables.  We plan to remove them, so this bad
>+ * behavior will be rectified at that point.

report bug number

>+ * - Hierarchal queries.

hierarchical?

>+ *     createdBegin: optional Date object
>+ *                   Bookmarks created after this time (included).
>+ *                   Defaults to epoch.
>+ *     createdEnd: optional Date object 
>+ *                 Bookmarks created before this time (included).
>+ *                 Defaults to now.
>+ *     modifiedBegin: optional Date object
>+ *                    Bookmarks modified after this time (included).
>+ *                    Defaults to epoch.
>+ *     modifiedEnd: optional Date object 
>+ *                  Bookmarks modified before this time (included).
>+ *                  Defaults to now.


for datetime fields I'd prefer using from/to rather than begin/end
createdFrom: XYZ, createdTo: XYZ sounds better then createdBegin: XYZ, createdEnd : XYZ

>  *   visited: object
>  *   {

  
>+ *     begin: optional Date object
>+ *            With visits after this time (included).
>+ *            Defaults to epoch.
>+ *     end: optional Date object 
>+ *          With visits before this time (included).
>+ *          Defaults to now.

begin and end look too generic now (begin of what?), what about firstVisit: XYZ, lastVisit: XYZ?

>+// Note: Multiple queryconf support is currently denied from callers
>+// as the API is not complete. However, the support is kept in the back-end
>+// so we're just wrapping the query conf in an array in two places
>+// in this ctor for now.
>+function PlacesQuery(aQueryConf)

I think we should go the way adw pointed at (making new queries and then telling them to merge(), union() etc through methods, so probably this will always take a single conf

>-      throw Cr.NS_ERROR_INVALID_ARG;  // XXX nice errors please.
>+      throw Cr.NS_ERROR_INVALID_ARG;  // TODO: nice errors please.

we should probably throw new Error("something"); (maybe with a bunch of predefined ERROR_SOMETHING strings) all around. this is not COM and throwing xpcom errors is my overlooking.
this happens in various parts of the code.

the remaining part looks ok and there is a bunch of tests.
Attachment #463536 - Flags: review?(mak77) → feedback+
(In reply to comment #102)
> we should probably throw new Error("something"); (maybe with a bunch of
> predefined ERROR_SOMETHING strings) all around. this is not COM and throwing
> xpcom errors is my overlooking.
> this happens in various parts of the code.
Although it's even better to throw Components.Error because then you can adjust the callstack to the calling site like we do in NetUtil.jsm.
Duplicate of this bug: 534983
I'm not working on this at the moment, so un-assigning.

If anyone's interested in using/testing it, the module is bundled with the Jetpack wrapper in my github account, and listed here:

https://wiki.mozilla.org/Labs/Jetpack/Modules#Browser_Internals
Assignee: dietrich → nobody
Priority: -- → P3
No assignee, updating the status.
Status: ASSIGNED → NEW
No assignee, updating the status.
No assignee, updating the status.
You need to log in before you can comment on or make changes to this bug.