Closed Bug 813833 Opened 7 years ago Closed 7 years ago

Promise-based SQLite interface

Categories

(Cloud Services :: Firefox: Common, defect)

defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED
mozilla20

People

(Reporter: gps, Assigned: gps)

References

Details

(Keywords: dev-doc-complete)

Attachments

(1 file, 9 obsolete files)

Attached patch Generic SQLite JS interface, v1 (obsolete) — Splinter Review
Taras and the Performance Team convinced me to use SQLite as the storage backend for FHR. However, I was warned that the mozIStorage* APIs are a bit of a mess. I quickly discovered this for myself.

As I was implementing the FHR storage code, I realized I was really doing a lot of boilerplate SQLite interface work. And, the low-level mozIStorage* interface code was cluttering my nice and clean JS. So, I decided to extract my SQLite work as a standalone module. This patch is the result of that.

What we have here is a truly generic and JavaScript-y API for interfacing with SQLite databases via mozIStorage. Everything is promise-based. Where there aren't async APIs exposed by mozIStorage (tsk tsk - bug 702559), I made them appear to be via promises. Once mozIStorage exposes async APIs, we should be able to change the implementation while preserving the API (downstream consumers won't need to change).

Currently the only pieces of mozIStorage that leak outside of this code are the row results (mozIStorageRow instances). I think I'm OK with it. I certainly could implement my own wrapper type that defined a dynamic getter to allow retrieving fields by numeric index or string name. At this juncture, I'm not sure it's worth it. But, if prodded, I could certainly do this.

I'm sure people outside of Services may be interested in this code. I'd love to land it in Toolkit. However, I'm trying to land things for FHR and don't want a long, drawn-out review process. I know I can land this in /services quickly. If Toolkit can provide a quick turnaround time on reviews, I'd happily land it there. Or we can file follow-up bug to "graduate" the code to Toolkit when we have time.

Anyway, rnewman technically gets final review for /services. However, I'm pretty sure I'm touching code he would like help with. So, I'm kindly requesting Blair and Marco help out with the review. I'm confident the code is too naive in areas (especially around connection open - things like add-on manager have lots of extra code here and I'm not exactly sure what needs to be ported over).

I'm also missing some APIs from mozIStorageConnection and the like. I'd be fine with implementing these as follow-ups, where needed. However, the missing APIs should be considered to ensure there is room for them without breaking the as-implemented API.
Attachment #683844 - Flags: review?(rnewman)
Attachment #683844 - Flags: feedback?(mak77)
Attachment #683844 - Flags: feedback?(bmcbride)
Duplicate of this bug: 804108
Gah. I need to do a better job of searching bugzilla before I spend time tackling a problem. At least it looks like there was no little to no harm this time. Whew.
Comment on attachment 683844 [details] [diff] [review]
Generic SQLite JS interface, v1

Review of attachment 683844 [details] [diff] [review]:
-----------------------------------------------------------------

As a reprisal for this case of duplicate bug, and since I am probably the biggest user of promises on m-c at the moment, let me commit a drive-by review :)

Generally, I would use this module. However, I have a few suggestions that I believe could improve it.

::: services/common/sqlite.js
@@ +19,5 @@
> +
> +
> +/**
> + * Represents the result of the execution of an individual sqlite statment.
> + *

I find it a little strange that we use the same class both for success and for error.
Wouldn't it be simpler to do the following?
- resolve() to an array of rows in case of success;
- reject() to a JavaScript exception, which may hold a sqlite error code and message if it is due to a sqlite error.

Among other things, this would let us use Task.jsm's asynchronous try ... catch ... with this module – according to my experience, this is a killer feature.

@@ +34,5 @@
> +  /**
> +   * Whether the statement encountered an error.
> +   */
> +  get hasError() {
> +    return this.errorCode != null || this.errorMessage != null;

That second test is a little strange.

@@ +46,5 @@
> +   * both be false.
> +   */
> +  get success() {
> +    return this.finishedReason == Ci.mozIStorageStatementCallback.REASON_FINISHED;
> +  },

Nit: I seem to remember that we do not like trailing commas.

@@ +57,5 @@
> +/**
> + * A statement result that buffers fetched rows.
> + *
> + * The mozIStorageRow instances are stored in an Array under the `rows`
> + * property.

Nit: I would rather move that line of documentation (or something equivalent) just before |this.rows = [];|

@@ +66,5 @@
> +  this.rows = [];
> +}
> +
> +BufferedSqliteStatementResult.prototype = {
> +  __proto__: SqliteStatementResult.prototype,

I am almost sure that __proto__ has been deprecated in favor of |Object.create|.

@@ +133,5 @@
> +   *
> +   *   1) This SqliteStatement instance
> +   *   2) A mozIStorageRow
> +   *
> +   * @return Promise<SqliteStatementResult>

Let me suggest the following (to match the style of documentation of OS.File):

@return {Promise} A promise completed once the execution of the statement is complete.
@resolves {SqliteStatementResult} In case of success, the final result.
@rejects {I don't know} (document the error)

@@ +135,5 @@
> +   *   2) A mozIStorageRow
> +   *
> +   * @return Promise<SqliteStatementResult>
> +   */
> +  executeStreaming: function(listener, params) {

Nit: For consistency with the implementation of promises, |listener| might also be called |onResult|. Your call.

@@ +138,5 @@
> +   */
> +  executeStreaming: function(listener, params) {
> +    return this._dispatch(params, listener);
> +  },
> +

I believe that this method can both throw an error and return a rejected promise. This might be a little too much.

@@ +142,5 @@
> +
> +  _dispatch: function(params=null, listener=null) {
> +    if (this._statement.state != this._statement.MOZ_STORAGE_STATEMENT_READY) {
> +      throw new Error("Statement is not ready for execution.");
> +    }

It seems that we could also perform this check in the constructor. Am I missing something?

@@ +182,5 @@
> +            try {
> +              listener(self, row);
> +            } catch (ex) {
> +              self._log.warn("Exception when calling row listener: " +
> +                             CommonUtils.exceptionStr(ex));

In my implementation of (a small subset of) this module, an exception in |listener| stops the asynchronous request. Not sure which is best, your call.

@@ +197,5 @@
> +        self._log.info("Error when executing SQL (" + error.result + "): " +
> +                       error.message);
> +
> +        result.errorCode = error.result;
> +        result.errorMessage = error.message;

I would have |handleError| reject |deferred| immediately. IIRC, this covers both REASON_CANCELLED and REASON_ERROR, which would make |handleCompletion| below absolutely trivial.

@@ +240,5 @@
> + *
> + * @param dbPaths
> + *        (Array) The path components to create the file from.
> + * @param dirServiceKey
> + *        (string) The key in the directory service to use to construct the path.

In the spirit of not reinventing the wheel too many times, I would suggest using a string here and referring users to OS.Path for path component manipulation.
Attachment #683844 - Flags: feedback+
(In reply to David Rajchenbach Teller [:Yoric] from comment #3)
> Nit: I seem to remember that we do not like trailing commas.

There was a time when a trailing comma produced a strict JS warning, which caused a small amount of pain. That's no longer the case.


> I am almost sure that __proto__ has been deprecated in favor of
> |Object.create|.

No, __proto__ has since been added to ES6 (adopted as a de facto standard).
@Blair: Ah, good to know, thanks.
Comment on attachment 683844 [details] [diff] [review]
Generic SQLite JS interface, v1

Review of attachment 683844 [details] [diff] [review]:
-----------------------------------------------------------------

General high-level impression: me like!

::: services/common/Makefile.in
@@ +15,5 @@
>    log4moz.js \
>    observers.js \
>    preferences.js \
>    rest.js \
> +  sqlite.js \

I wish services would use the convention for filenames used everywhere else in the tree - SQLite.jsm/Sqlite.jsm (insert bikeshelding about SQLiteWhatever vs SqliteWhatever).

::: services/common/sqlite.js
@@ +19,5 @@
> +
> +
> +/**
> + * Represents the result of the execution of an individual sqlite statment.
> + *

Agreed - this could be simplified/abstracted away. For buffered results, if there's an error I think any partial results should just be thrown away.

@@ +182,5 @@
> +            try {
> +              listener(self, row);
> +            } catch (ex) {
> +              self._log.warn("Exception when calling row listener: " +
> +                             CommonUtils.exceptionStr(ex));

I'd rather have a more explicit method of halting the request. Halting from an exception seems like it would be an accidental side-effect a lot of the time.

@@ +197,5 @@
> +        self._log.info("Error when executing SQL (" + error.result + "): " +
> +                       error.message);
> +
> +        result.errorCode = error.result;
> +        result.errorMessage = error.message;

handleError can be called multiple times - whereas I think deferred should only be rejected once (I forget if the promises API actually enforces that or not).

Which also means that result should probably hold an array of objects describing errors.

@@ +240,5 @@
> + *
> + * @param dbPaths
> + *        (Array) The path components to create the file from.
> + * @param dirServiceKey
> + *        (string) The key in the directory service to use to construct the path.

That (referring to OS.Path) would be fine as long as the default is still to use the profile dir (without needing to go though OS.Path), which is by far the most common usage.

@@ +254,5 @@
> +  this._dbPath = FileUtils.getFile(dirServiceKey, dbPaths);
> +  this._connection = null;
> +}
> +
> +SqliteConnection.prototype = {

At the very least, this also needs a wrapper for asyncClose()

And my top list of other things to cover:
* transactionInProgress
* clone()
* connectionReady
* lastInsertRowID (which is a little unsafe, but we have no alternative in some cases)
* schemaVersion
* tableExists()
* indexExists()

@@ +272,5 @@
> +      return Promise.resolve(this);
> +    }
> +
> +    try {
> +      this._connection = Services.storage.openDatabase(this._dbPath);

Need a way to specify that this should use openUnsharedDatabase() or openSpecialDatabase().

@@ +323,5 @@
> +   *        (Array or Object) Parameters to bind to the statement.
> +   *
> +   * @return Promise<BufferedSqliteStatementResult>
> +   */
> +  execute: function(sql, params=null) {

I wonder if it should be made more obvious that this uses executeBuffered.

@@ +334,5 @@
> +   * @param type
> +   *        (int) One of the TRANSACTION_* constants on this type.
> +   */
> +  beginTransaction: function(type) {
> +    this._connection.beginTransaction(type);

Shouldn't this be calling beginTransationAs() ?

(Bonus points for making type optional, and calling the right method.)
Attachment #683844 - Flags: feedback?(bmcbride) → feedback+
(In reply to Blair McBride (:Unfocused) from comment #6)
> @@ +182,5 @@
> > +            try {
> > +              listener(self, row);
> > +            } catch (ex) {
> > +              self._log.warn("Exception when calling row listener: " +
> > +                             CommonUtils.exceptionStr(ex));
> 
> I'd rather have a more explicit method of halting the request. Halting from
> an exception seems like it would be an accidental side-effect a lot of the
> time.

I believe that both make sense, so I will not insist if you and gps favor your approach.

> 
> @@ +197,5 @@
> > +        self._log.info("Error when executing SQL (" + error.result + "): " +
> > +                       error.message);
> > +
> > +        result.errorCode = error.result;
> > +        result.errorMessage = error.message;
> 
> handleError can be called multiple times - whereas I think deferred should
> only be rejected once (I forget if the promises API actually enforces that
> or not).

Indeed, this is enforced by promises.

> Which also means that result should probably hold an array of objects
> describing errors.

Ah, I had not thought about that. When do we want to receive several errors?

> @@ +240,5 @@
> > + *
> > + * @param dbPaths
> > + *        (Array) The path components to create the file from.
> > + * @param dirServiceKey
> > + *        (string) The key in the directory service to use to construct the path.
> 
> That (referring to OS.Path) would be fine as long as the default is still to
> use the profile dir (without needing to go though OS.Path), which is by far
> the most common usage.

With OS.Path, the profile dir is OS.Constants.Path.profileDir. While I agree that this covers almost all uses, special-casing the API for the purpose sounds a little awkward to me. If we want to do this (I am not sure that we do), perhaps:

@param path
       (string or Array) Either an absolute path (if a string) or an array of components starting from the profile directory.


> Shouldn't this be calling beginTransationAs() ?
> 
> (Bonus points for making type optional, and calling the right method.)

Bonus: Do we want to keep this an integer? It would be more robust if we had objects representing transaction types.
Assignee: nobody → gps
Status: NEW → ASSIGNED
gps: I will review this after you've addressed the most excellent drive-by comments and Blair's review :D
Thanks for the reviews, everyone! Lots of great feedback, as anticipated.

(In reply to David Rajchenbach Teller [:Yoric] from comment #3)
> I find it a little strange that we use the same class both for success and
> for error.

Agreed. This bothered me a bit too.

> Wouldn't it be simpler to do the following?
> - resolve() to an array of rows in case of success;
> - reject() to a JavaScript exception, which may hold a sqlite error code and
> message if it is due to a sqlite error.

I agreed with the reject() part using a regular Error or Error-derived type.

For resolve(), we need to handle both the buffered and non-buffered case, so a simple Array won't suffice. This is why I created a type to hold everything.
 
> @@ +66,5 @@
> > +  this.rows = [];
> > +}
> > +
> > +BufferedSqliteStatementResult.prototype = {
> > +  __proto__: SqliteStatementResult.prototype,
> 
> I am almost sure that __proto__ has been deprecated in favor of
> |Object.create|.

I may just kill this type altogether and put rows on SqliteStatementResult and just not set it for the streaming case.
 
> @@ +133,5 @@
> > +   *
> > +   *   1) This SqliteStatement instance
> > +   *   2) A mozIStorageRow
> > +   *
> > +   * @return Promise<SqliteStatementResult>
> 
> Let me suggest the following (to match the style of documentation of
> OS.File):
> 
> @return {Promise} A promise completed once the execution of the statement is
> complete.
> @resolves {SqliteStatementResult} In case of success, the final result.
> @rejects {I don't know} (document the error)

Do we have any documentation generators this would confuse? I stick to @return because I know popular tools handle it. That being said, we don't exactly do a good job of running doc generators over our code. Now that MDN has a write HTTP API, I'm tempted to hack up something...
 
> @@ +138,5 @@
> > +   */
> > +  executeStreaming: function(listener, params) {
> > +    return this._dispatch(params, listener);
> > +  },
> > +
> 
> I believe that this method can both throw an error and return a rejected
> promise. This might be a little too much.

I'm curious why you feel this way. I only throw exceptions when the input or state is bad.

I'm strongly opposed to returning promises in the case where the arguments to a function are bad.

In the case where state is bad, I still think throwing is warranted. But, I'm not as convicted of it, especially if the caller has little control over that. I could be convinced to return a rejected promise instead.

If I grok promises currently, if we throw, one of two things could happen:

1) If this is the "head" promise in a chain (we are creating the first promise), the exception propagates immediately to the caller, just like a regular exception.
2) If this is called inside a promise chain, the exception will be caught by the promise and the rejected handler fires. Promise chain aborted.

I don't see anything wrong with this. Perhaps I'm missing something?

> @@ +142,5 @@
> > +
> > +  _dispatch: function(params=null, listener=null) {
> > +    if (this._statement.state != this._statement.MOZ_STORAGE_STATEMENT_READY) {
> > +      throw new Error("Statement is not ready for execution.");
> > +    }
> 
> It seems that we could also perform this check in the constructor. Am I
> missing something?

This is mostly defensive programming. I'm not exactly sure when mozIStorageBaseStatement.state can mutate.

> @@ +182,5 @@
> > +            try {
> > +              listener(self, row);
> > +            } catch (ex) {
> > +              self._log.warn("Exception when calling row listener: " +
> > +                             CommonUtils.exceptionStr(ex));
> 
> In my implementation of (a small subset of) this module, an exception in
> |listener| stops the asynchronous request. Not sure which is best, your call.

I think the callback should be able to control behavior. For an exception, aborting is probably best. Perhaps we should also accept a return value with undefined implying continue.

> @@ +197,5 @@
> > +        self._log.info("Error when executing SQL (" + error.result + "): " +
> > +                       error.message);
> > +
> > +        result.errorCode = error.result;
> > +        result.errorMessage = error.message;
> 
> I would have |handleError| reject |deferred| immediately. IIRC, this covers
> both REASON_CANCELLED and REASON_ERROR, which would make |handleCompletion|
> below absolutely trivial.

Per Blair's feedback, since handleError can be called multiple times, I think we should wait and expose all error information.

> @@ +240,5 @@
> > + *
> > + * @param dbPaths
> > + *        (Array) The path components to create the file from.
> > + * @param dirServiceKey
> > + *        (string) The key in the directory service to use to construct the path.
> 
> In the spirit of not reinventing the wheel too many times, I would suggest
> using a string here and referring users to OS.Path for path component
> manipulation.

Sure. The only reason I did it this way is because I didn't feel like going through the dance to get an nsIFile. I can never remember how to do it the "right" way. But, I'll figure it out.

(In reply to Blair McBride (:Unfocused) from comment #6)
> I wish services would use the convention for filenames used everywhere else
> in the tree - SQLite.jsm/Sqlite.jsm (insert bikeshelding about
> SQLiteWhatever vs SqliteWhatever).

I'm starting to use .jsm on the new code. However, /services/common still uses .js and convention tends to win. And, things in /toolkit (like promise's core.js) still use the .js extension. My general reaction is apathy. That being said, I do prefer .jsm for modules.

> ::: services/common/sqlite.js
> @@ +19,5 @@
> > +
> > +
> > +/**
> > + * Represents the result of the execution of an individual sqlite statment.
> > + *
> 
> Agreed - this could be simplified/abstracted away. For buffered results, if
> there's an error I think any partial results should just be thrown away.

We need to consider the implications of this. If this goes away, then with the current set of APIs, callers are always passing SQL strings for execution instead of passing SQL initially and only dealing with bound parameters later. If we leave things as is, we need to create a new statement on each call. Whereas with the statement abstraction, we can re-use a statement multiple times. Presumably that has better performance. Although, if mozIStorage is already caching statements, maybe this isn't too bad.

We could add a caching layer to SQLiteConnection. If we see the same SQL again, we re-use an existing mozIStorageAsyncStatement. We would need to worry about cache size explosion and potential issues of transparent mozIStorageAsyncStatement re-use (are there any?).

Even better, we could add an API on SQLiteConnection to "register" common SQL statements. Only registered statements would have their mozIStorageAsyncStatement cached.

e.g.

conn.registerStatement("select-all-names", "SELECT * FROM NAMES");
conn.executeNamed("select-all-names").then(...)

 
> @@ +182,5 @@
> > +            try {
> > +              listener(self, row);
> > +            } catch (ex) {
> > +              self._log.warn("Exception when calling row listener: " +
> > +                             CommonUtils.exceptionStr(ex));
> 
> I'd rather have a more explicit method of halting the request. Halting from
> an exception seems like it would be an accidental side-effect a lot of the
> time.

I generally say that if your callback throws an exception, then your callback is an error and processing should be aborted. One exception to this rule is the case of multiple, independent callbacks. For those, you obviously don't want an exception to interrupt processing of other registered listeners.

I do agree that these exceptions are often accidental side-effects.

Perhaps we could special-case StopIteration from a regular Error?

API design is hard.
 
> @@ +197,5 @@
> > +        self._log.info("Error when executing SQL (" + error.result + "): " +
> > +                       error.message);
> > +
> > +        result.errorCode = error.result;
> > +        result.errorMessage = error.message;
> 
> handleError can be called multiple times - whereas I think deferred should
> only be rejected once (I forget if the promises API actually enforces that
> or not).
> 
> Which also means that result should probably hold an array of objects
> describing errors.

Good catch. I completely missed this in the IDL.

> At the very least, this also needs a wrapper for asyncClose()
> 
> And my top list of other things to cover:
> * transactionInProgress
> * clone()
> * connectionReady
> * lastInsertRowID (which is a little unsafe, but we have no alternative in
> some cases)
> * schemaVersion
> * tableExists()
> * indexExists()
>
> Need a way to specify that this should use openUnsharedDatabase() or
> openSpecialDatabase().

I'll see what I can do!

> @@ +334,5 @@
> > +   * @param type
> > +   *        (int) One of the TRANSACTION_* constants on this type.
> > +   */
> > +  beginTransaction: function(type) {
> > +    this._connection.beginTransaction(type);
> 
> Shouldn't this be calling beginTransationAs() ?
> 
> (Bonus points for making type optional, and calling the right method.)

Oops. Yup. You've exposed a hole in my test coverage which will soon be filled.
So, putting this in toolkit would require killing log4moz and CommonUtils.exceptionStr. I really need to move these pieces to toolkit. If I had infinite time...
(In reply to Gregory Szorc [:gps] from comment #9)
> > Wouldn't it be simpler to do the following?
> > - resolve() to an array of rows in case of success;
> > - reject() to a JavaScript exception, which may hold a sqlite error code and
> > message if it is due to a sqlite error.
> 
> I agreed with the reject() part using a regular Error or Error-derived type.
> 
> For resolve(), we need to handle both the buffered and non-buffered case, so
> a simple Array won't suffice. This is why I created a type to hold
> everything.

That makes sense.

> > I am almost sure that __proto__ has been deprecated in favor of
> > |Object.create|.
> 
> I may just kill this type altogether and put rows on SqliteStatementResult
> and just not set it for the streaming case.

Ok, anyway, it seems that I was wrong.

>  
> > @@ +133,5 @@
> > > +   *
> > > +   *   1) This SqliteStatement instance
> > > +   *   2) A mozIStorageRow
> > > +   *
> > > +   * @return Promise<SqliteStatementResult>
> > 
> > Let me suggest the following (to match the style of documentation of
> > OS.File):
> > 
> > @return {Promise} A promise completed once the execution of the statement is
> > complete.
> > @resolves {SqliteStatementResult} In case of success, the final result.
> > @rejects {I don't know} (document the error)
> 
> Do we have any documentation generators this would confuse? I stick to
> @return because I know popular tools handle it. That being said, we don't
> exactly do a good job of running doc generators over our code. Now that MDN
> has a write HTTP API, I'm tempted to hack up something...

That would be great. Unfortunately, we mix so many different styles that I am not optimistic.

>  
> > @@ +138,5 @@
> > > +   */
> > > +  executeStreaming: function(listener, params) {
> > > +    return this._dispatch(params, listener);
> > > +  },
> > > +
> > 
> > I believe that this method can both throw an error and return a rejected
> > promise. This might be a little too much.
> 
> I'm curious why you feel this way. I only throw exceptions when the input or
> state is bad.
> 
> I'm strongly opposed to returning promises in the case where the arguments
> to a function are bad.

That makes sense. I would suggest using TypeError if listener is not a function, btw.

> In the case where state is bad, I still think throwing is warranted. But,
> I'm not as convicted of it, especially if the caller has little control over
> that. I could be convinced to return a rejected promise instead.
> 
> If I grok promises currently, if we throw, one of two things could happen:
> 
> 1) If this is the "head" promise in a chain (we are creating the first
> promise), the exception propagates immediately to the caller, just like a
> regular exception.
> 2) If this is called inside a promise chain, the exception will be caught by
> the promise and the rejected handler fires. Promise chain aborted.

Well, throwing when you are not a handler is regular throwing. Throwing from within a handler will ensure that you call the next onReject handler. If this next handler does not throw, the rejection has been caught.

> > It seems that we could also perform this check in the constructor. Am I
> > missing something?
> 
> This is mostly defensive programming. I'm not exactly sure when
> mozIStorageBaseStatement.state can mutate.

Ok, that makes sense.

> I think the callback should be able to control behavior. For an exception,
> aborting is probably best. Perhaps we should also accept a return value with
> undefined implying continue.

What about passing an additional argument that can be used to stop the async request? This is what OS.File does for async directory iteration. The idea is that calling |iterator.close()| is roughly the equivalent of a |break| in synchronous code.

Speaking of OS.File async directory iteration, we behave as if we had a loop of promises. I believe that the resulting semantics are interesting and useful, ymmv:
- if the callback returns a promise, we wait until the promise is resolved/rejected before proceeding (the promise module offers a function |Promise.resolve()| for this purpose);
- rejecting stops the iteration with a |reject()|. The idea is that this is roughly the equivalent of a |throw| in async code.

You do not have to follow these conventions, your call.

> > I would have |handleError| reject |deferred| immediately. IIRC, this covers
> > both REASON_CANCELLED and REASON_ERROR, which would make |handleCompletion|
> > below absolutely trivial.
> 
> Per Blair's feedback, since handleError can be called multiple times, I
> think we should wait and expose all error information.

That makes sense, too. I had not realized that handleError could be called several times. Do you know in which circumstances this can happen?

> > In the spirit of not reinventing the wheel too many times, I would suggest
> > using a string here and referring users to OS.Path for path component
> > manipulation.
> 
> Sure. The only reason I did it this way is because I didn't feel like going
> through the dance to get an nsIFile. I can never remember how to do it the
> "right" way. But, I'll figure it out.

new FileUtils.File(path)

:)

> We could add a caching layer to SQLiteConnection. If we see the same SQL
> again, we re-use an existing mozIStorageAsyncStatement. We would need to
> worry about cache size explosion and potential issues of transparent
> mozIStorageAsyncStatement re-use (are there any?).

I would avoid doing this. I believe that this should be left in the hands of API clients, they know better than us/you when statements can/should be reused.

> Even better, we could add an API on SQLiteConnection to "register" common
> SQL statements. Only registered statements would have their
> mozIStorageAsyncStatement cached.
> 
> e.g.
> 
> conn.registerStatement("select-all-names", "SELECT * FROM NAMES");
> conn.executeNamed("select-all-names").then(...)

Well, you could make the Statement constructor be a method of connection, which might resolve the issue.

> Perhaps we could special-case StopIteration from a regular Error?

See above, but that also sounds like a valid idea.

> 
> API design is hard.

It is :)
(In reply to Gregory Szorc [:gps] from comment #10)
> So, putting this in toolkit would require killing log4moz and
> CommonUtils.exceptionStr.

Yesplz.


(In reply to Gregory Szorc [:gps] from comment #9)
> For resolve(), we need to handle both the buffered and non-buffered case, so
> a simple Array won't suffice. This is why I created a type to hold
> everything.

I think it depends on how pure/true to the original API design we want to keep this.

If we simplify the errors a bit, I don't think its needed - errors just get passed via reject(). And for buffered, the whole result set via resolve(). For unbuffered, resolve() doesn't need anything passed. 

That's taking the all-or-nothing kind of approach when an error occurs (or at least, halt on error for unbuffered), which is the opposite to what the original API design does. But that's not necessarily a bad deviation.


> I may just kill this type altogether and put rows on SqliteStatementResult
> and just not set it for the streaming case.

SqliteConnection.execute() would have to return a statement then.


> This is mostly defensive programming. I'm not exactly sure when
> mozIStorageBaseStatement.state can mutate.

I'm rusty on the details, so that's a question for Marco.


> > Agreed - this could be simplified/abstracted away. For buffered results, if
> > there's an error I think any partial results should just be thrown away.
> 
> We need to consider the implications of this. If this goes away, then with
> the current set of APIs, callers are always passing SQL strings for
> execution instead of passing SQL initially and only dealing with bound
> parameters later.

Sorry, this was meant to be referring to David's comment about the results objects.


> We could add a caching layer to SQLiteConnection. If we see the same SQL
> again, we re-use an existing mozIStorageAsyncStatement. We would need to
> worry about cache size explosion and potential issues of transparent
> mozIStorageAsyncStatement re-use (are there any?).

That would be handy, as it's a relatively common pattern. Needs to be entirely opt-in though. Followup fodder?


> Perhaps we could special-case StopIteration from a regular Error?

Yesplz.
(In reply to David Rajchenbach Teller [:Yoric] from comment #11)
> What about passing an additional argument that can be used to stop the async
> request? This is what OS.File does for async directory iteration. The idea
> is that calling |iterator.close()| is roughly the equivalent of a |break| in
> synchronous code.

This would be useful.
Duplicate of this bug: 813155
Attached patch Generic SQLite JS interface, v2 (obsolete) — Splinter Review
I incorporated feedback. I /think/ I got all the major points. Please let me know what I'm missing.

I still need to update some docs and increase test coverage. But, it's better than it was.
Attachment #683844 - Attachment is obsolete: true
Attachment #683844 - Flags: review?(rnewman)
Attachment #683844 - Flags: feedback?(mak77)
Attachment #685411 - Flags: feedback?(mar.castelluccio)
Attachment #685411 - Flags: feedback?(bmcbride)
Comment on attachment 685411 [details] [diff] [review]
Generic SQLite JS interface, v2

Wrong Marco...
Attachment #685411 - Flags: feedback?(mar.castelluccio) → feedback?(mak77)
Comment on attachment 685411 [details] [diff] [review]
Generic SQLite JS interface, v2

Review of attachment 685411 [details] [diff] [review]:
-----------------------------------------------------------------

Style consistency nit: Sometimes you use
  whatever: function()
And sometimes
  whatever: function ()

::: services/common/sqlite.js
@@ +100,5 @@
> +      for (let i = 0; i < params.length; i++) {
> +        this._statement.bindByIndex(i, params[i]);
> +      }
> +    } else if (params && typeof(params) == "object") {
> +      for (let [k, v] in Iterator(params)) {

Unfortunately, Iterator doesn't seem to work on Map objects. I'd really like to see Map supported at least for passing params - it's more appropriate for data storage, and we're going to be using it more and more (or at least, *I* am).

@@ +138,5 @@
> +          try {
> +            onRow(self, row);
> +          } catch (e if e instanceof StopIteration) {
> +            dontProcess = true;
> +            continue;

AFAIK, you don't need to keep fetching rows - you can safely just break out of this loop.

But you should also call mozIStoragePendingStatement.cancel() (executeAsync returns an instance of that). That will result in REASON_CANCELLED being passed to handleCompletion - the docs above say throwing an exception from onRow won't reject the promise, but I think throwing a StopIteration *should* reject it (albeit without any errors).

@@ +155,5 @@
> +
> +      handleCompletion: function(reason) {
> +        if (errors.length) {
> +          let error = new Error("Errors encountered during statement execution.");
> +          error.errors = errors;

I can haz more descriptive names?

@@ +162,5 @@
> +        }
> +
> +        switch (reason) {
> +          case Ci.mozIStorageStatementCallback.REASON_FINISHED:
> +            deferred.resolve(rows);

I realise rows will be undefined if this was unbuffered, but I think you should be more explicit about that here.

@@ +171,5 @@
> +            break;
> +
> +          // [gps] I don't think this will ever get called because handleError
> +          // should have registered an error and we would have rejected
> +          // above.

I *think* that's right (this should be unreachable). Marco will probably know better.

@@ +205,5 @@
> + * while preserving the promise-based API.
> + *
> + * The database to open is typically specified as a filesytem path.
> + * Absolute paths are preferred. Behavior for relative paths is
> + * undefined, but allowed.

Does OS.Path provide a mechanism for resolving possible relative paths to a base path? (Thinking of ensuring any relative paths are relative to the profile dir.)

@@ +232,5 @@
> + * @param special (optional)
> + *        (bool) Whether this is a special named database. Defaults to
> + *        false.
> + */
> +this.SQLiteConnection = function(path, shared=true, special=false) {

I wonder if the optional args here should be specified in an object, if there's any possibility of adding more in the future (eg, locking mode, fsync mode, required schema version, error recovery, etc).

@@ +321,5 @@
> +        }
> +      }
> +
> +      if (!this._connection.connectionReady) {
> +        this._log.warn("Connection is not ready!");

For this to be useful, it should include the DB filename/path.

Though, thinking about that, using this module more than once is going to make all of it's logging frustrating to comprehend. Including paths everywhere would make it very verbose, but just the basename (f.leafName) is generally enough to uniquely identify a DB file for most of our use cases.

@@ +371,5 @@
> +    if (!this._connection) {
> +      throw new Error("Only opened connections can be cloned.");
> +    }
> +
> +    let c = this._connection.clonse(readOnly);

clonse? :)

@@ +394,5 @@
> +   *        (string) The SQL that will eventually be executed.
> +   *
> +   * @return SQLiteStatement
> +   */
> +  getStatement: function (sql) {

Should be null-checking this._connection here (and other places too).

@@ +461,5 @@
> +   *        "CREATE TABLE" statement.
> +   *
> +   * @return Promise<string>
> +   */
> +  createTable: function (name, schema) {

If you really wanted to make createTable, tableExists, and indexExists all truely async without waiting for the XPCOM methods to become async, they're all doable via SQL statements (which is what mozStorageConnection.cpp does anyway).
  CREATE TABLE <name> (<schema>)
  SELECT name FROM sqlite_master WHERE type='table' AND name='<table-name>'
  SELECT name FROM sqlite_master WHERE type='index' AND name='<index-name>'
Attachment #685411 - Flags: feedback?(bmcbride) → feedback+
(In reply to Blair McBride (:Unfocused) from comment #17)
> Comment on attachment 685411 [details] [diff] [review]
> Generic SQLite JS interface, v2

Excellent feedback!

> Style consistency nit: Sometimes you use
>   whatever: function()
> And sometimes
>   whatever: function ()

Yeah, I'm so used to naming ALL OF THE FUNCTIONS that I'm at a loss for what to do in the absence of names. What is the preferred convention? Both appear in large numbers in the tree...
 
> ::: services/common/sqlite.js
> @@ +100,5 @@
> > +      for (let i = 0; i < params.length; i++) {
> > +        this._statement.bindByIndex(i, params[i]);
> > +      }
> > +    } else if (params && typeof(params) == "object") {
> > +      for (let [k, v] in Iterator(params)) {
> 
> Unfortunately, Iterator doesn't seem to work on Map objects. I'd really like
> to see Map supported at least for passing params - it's more appropriate for
> data storage, and we're going to be using it more and more (or at least, *I*
> am).

And (for let [k, v] of Object) doesn't work either. So, I ask the JS people: what's the point of all these key-value iteration techniques if none of them work everywhere you need them? But I digress.

I agree we should support Map. I too am starting to use it more!

> @@ +205,5 @@
> > + * while preserving the promise-based API.
> > + *
> > + * The database to open is typically specified as a filesytem path.
> > + * Absolute paths are preferred. Behavior for relative paths is
> > + * undefined, but allowed.
> 
> Does OS.Path provide a mechanism for resolving possible relative paths to a
> base path? (Thinking of ensuring any relative paths are relative to the
> profile dir.)

I agree that relative paths should be treated relative to profile. AFAICT, OS.Path doesn't have an isAbsolute or isRelative tester. I /think/ the default behavior of OS.Path is relative to getcwd() (which is typically profile directory). But, relying on that is risky. I'd rather explicitly normalize relative paths.

David: What facilities does/will OS.File provide for this use case?
 
> @@ +232,5 @@
> > + * @param special (optional)
> > + *        (bool) Whether this is a special named database. Defaults to
> > + *        false.
> > + */
> > +this.SQLiteConnection = function(path, shared=true, special=false) {
> 
> I wonder if the optional args here should be specified in an object, if
> there's any possibility of adding more in the future (eg, locking mode,
> fsync mode, required schema version, error recovery, etc).

JavaScript: why u no named arguments?

Or, since we require an explicit open(), we can make these optional parameters properties of the instance that only come into play during open(). That would keep the API simple and still feel natural. Mutation may not necessarily have an effect after open(), but we could certainly enforce that through setters if we wanted.

> @@ +321,5 @@
> > +        }
> > +      }
> > +
> > +      if (!this._connection.connectionReady) {
> > +        this._log.warn("Connection is not ready!");
> 
> For this to be useful, it should include the DB filename/path.
> 
> Though, thinking about that, using this module more than once is going to
> make all of it's logging frustrating to comprehend. Including paths
> everywhere would make it very verbose, but just the basename (f.leafName) is
> generally enough to uniquely identify a DB file for most of our use cases.
>

Good idea!

> @@ +461,5 @@
> > +   *        "CREATE TABLE" statement.
> > +   *
> > +   * @return Promise<string>
> > +   */
> > +  createTable: function (name, schema) {
> 
> If you really wanted to make createTable, tableExists, and indexExists all
> truely async without waiting for the XPCOM methods to become async, they're
> all doable via SQL statements (which is what mozStorageConnection.cpp does
> anyway).
>   CREATE TABLE <name> (<schema>)
>   SELECT name FROM sqlite_master WHERE type='table' AND name='<table-name>'
>   SELECT name FROM sqlite_master WHERE type='index' AND name='<index-name>'

I think I'll do that! I may even leave out CREATE TABLE - I'm not convinced it adds much value, especially since you need to pass in one giant schema string anyway (perhaps if it were an object of clauses - but that would be tricky to do right).
(In reply to Gregory Szorc [:gps] from comment #18)
> > Style consistency nit: Sometimes you use
> >   whatever: function()
> > And sometimes
> >   whatever: function ()
> 
> Yeah, I'm so used to naming ALL OF THE FUNCTIONS that I'm at a loss for what
> to do in the absence of names. What is the preferred convention? Both appear
> in large numbers in the tree...

This does not matter and is not worth even discussing :P
> > Does OS.Path provide a mechanism for resolving possible relative paths to a
> > base path? (Thinking of ensuring any relative paths are relative to the
> > profile dir.)
> 
> I agree that relative paths should be treated relative to profile. AFAICT,
> OS.Path doesn't have an isAbsolute or isRelative tester. I /think/ the
> default behavior of OS.Path is relative to getcwd() (which is typically
> profile directory). But, relying on that is risky. I'd rather explicitly
> normalize relative paths.

Good idea. Filed bug 816851.

> David: What facilities does/will OS.File provide for this use case?

Actually, |OS.Path.join(OS.Constants.Path.profileDir, myPath)| will do exactly what you want. If |myPath| is absolute, it will return |myPath|, and if it is relative, it will make it absolute wrt profileDir.

Once bug 810543 has landed, this will throw if we do not have a profile directory.

Btw, if you are looking for isAbsolute, OS.Path doesn't have one yet, but the result of |OS.Path.split| contains a boolean field |absolute|.

>  
> > @@ +232,5 @@
> > > + * @param special (optional)
> > > + *        (bool) Whether this is a special named database. Defaults to
> > > + *        false.
> > > + */
> > > +this.SQLiteConnection = function(path, shared=true, special=false) {
> > 
> > I wonder if the optional args here should be specified in an object, if
> > there's any possibility of adding more in the future (eg, locking mode,
> > fsync mode, required schema version, error recovery, etc).
> 
> JavaScript: why u no named arguments?

In OS.File, almost all functions take as argument an object |options|, which serves as named optional arguments.

> > Though, thinking about that, using this module more than once is going to
> > make all of it's logging frustrating to comprehend. Including paths
> > everywhere would make it very verbose, but just the basename (f.leafName) is
> > generally enough to uniquely identify a DB file for most of our use cases.
> >
> 
> Good idea!

Or OS.Path.basename :)
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #19)
> (In reply to Gregory Szorc [:gps] from comment #18)
> > > Style consistency nit: Sometimes you use
> > >   whatever: function()
> > > And sometimes
> > >   whatever: function ()
> > 
> > Yeah, I'm so used to naming ALL OF THE FUNCTIONS that I'm at a loss for what
> > to do in the absence of names. What is the preferred convention? Both appear
> > in large numbers in the tree...
> 
> This does not matter and is not worth even discussing :P

But if you are in doubt, anonymous functions want a space! :)
While here, sorry for being late in feedback, I'll try to do a pass asap.
Comment on attachment 685411 [details] [diff] [review]
Generic SQLite JS interface, v2

Review of attachment 685411 [details] [diff] [review]:
-----------------------------------------------------------------

It's good start, though I think it still needs a bit of brainstorming on paper

::: services/common/Makefile.in
@@ +15,5 @@
>    log4moz.js \
>    observers.js \
>    preferences.js \
>    rest.js \
> +  sqlite.js \

so, I don't think this should be in Services, and I agree it should be SQLite.jsm

::: services/common/sqlite.js
@@ +7,5 @@
> +this.EXPORTED_SYMBOLS = [
> +  "SQLiteConnection",
> +];
> +
> +const {interfaces: Ci, utils: Cu} = Components;

IIRC we decided against this practice, cause is more expensive than the classical assignments

@@ +14,5 @@
> +Cu.import("resource://gre/modules/FileUtils.jsm");
> +Cu.import("resource://gre/modules/Services.jsm");
> +Cu.import("resource://gre/modules/XPCOMUtils.jsm");
> +Cu.import("resource://services-common/log4moz.js");
> +Cu.import("resource://services-common/utils.js");

most of these could be lazy getters (excluding promise, Services and XPCOMUtils), and it should not depends on services-common/utils.js
we may probably add some protection about importing log4moz, so if the module import fails we define a no-op mock of it, to remove dependency until it's moved to toolkit.

@@ +33,5 @@
> + * preserved across execution, the caller is relying on undefined behavior.
> + *
> + * Instances are created through `SQLiteConnection`, not directly.
> + */
> +function SQLiteStatement(statement, sql) {

global-nit: function/method params should be prefixed with "a"

@@ +54,5 @@
> +   * for every row.
> +   *
> +   * Bound parameters can be defined as an Array of positional arguments or an
> +   * object mapping named parameters to their values. If there are no bound
> +   * parameters, the caller can pass nothing or null.

Honestly since this is new code, I'd enforce the params input being a Map.
We slowly converted backend to use named bindings since numeric bindings were very often subject to bugs (like adding/removing one param but not the bound value).
I wouldn't see that as a limitation, rather an increase in robustness.

@@ +57,5 @@
> +   * object mapping named parameters to their values. If there are no bound
> +   * parameters, the caller can pass nothing or null.
> +   *
> +   * When `onRow` is not specified, all returned rows are buffered before the
> +   * returned promise is resolved. For INSERT or UPDATE statements, this has no

what's the use-case of querying for something without an onRow handler? that should likely fail and be considered an error, than depleting resources... sqlite has sqlite3_stmt_readonly API that can tell if a prepared statement is only selecting, we can expose it somewhere eventually... but as a simple check you may just look if the sql begins with SELECT

@@ +77,5 @@
> +   * has completed fully.
> +   *
> +   * The promise will be rejected with an Error instance if the statement did
> +   * not finish execution fully. The Error may have an `errors` property. If
> +   * defined, it will be an Array of mozIStorageError instances.

If you want to properly wrap xpcom and hide it, would be better to avoid mozIStorageError

@@ +100,5 @@
> +      for (let i = 0; i < params.length; i++) {
> +        this._statement.bindByIndex(i, params[i]);
> +      }
> +    } else if (params && typeof(params) == "object") {
> +      for (let [k, v] in Iterator(params)) {

fwiw, we can iterate Map (bug 725909)

@@ +115,5 @@
> +    let rows;
> +
> +    if (!onRow) {
> +      rows = [];
> +    }

just init rows to an empty array and explicitly pass nothing in reason_finished if !onRow?

@@ +123,5 @@
> +    let self = this;
> +    this._statement.executeAsync({
> +      handleResult: function(resultSet) {
> +        let row = null;
> +        while (row = resultSet.getNextRow()) {

nit: I prefer for (let row = resultSet.getNextRow(); row; row = resultSet.getNextRow()) since you don't need row out of the loop

@@ +138,5 @@
> +          try {
> +            onRow(self, row);
> +          } catch (e if e instanceof StopIteration) {
> +            dontProcess = true;
> +            continue;

Agree with blair, you should cancel the pending statement

@@ +153,5 @@
> +        errors.push(error);
> +      },
> +
> +      handleCompletion: function(reason) {
> +        if (errors.length) {

you should just use REASON_ERROR switch branch for this, why do you have to handle it apart like this?

@@ +166,5 @@
> +            deferred.resolve(rows);
> +            break;
> +
> +          case Ci.mozIStorageStatementCallback.REASON_CANCELLED:
> +            deferred.reject(new Error("Statement was cancelled."));

I'm not totally sure here, cancelling may be done on purpose, for example I may find a row I was looking for, then I throw stopIteration and cancel the statement cause I'm done. At that point I should handle this in the reject handler... not a big deal? The point is whether we see cancelling (that is done on purpose by the developer) as a constructive (success) or destructive (failure) operation. I tend more to look at it as a resource to avoid fetching useless data, than a failure.

@@ +172,5 @@
> +
> +          // [gps] I don't think this will ever get called because handleError
> +          // should have registered an error and we would have rejected
> +          // above.
> +          case Ci.mozIStorageStatementCallback.REASON_ERROR:

yes, but I think you should handle error here, not apart

@@ +188,5 @@
> +    return deferred.promise;
> +  },
> +};
> +
> +Object.freeze(SQLiteStatement.prototype);

I think you can directly
SQLiteStatement.prototype = Object.freeze({...

@@ +227,5 @@
> + *        (string) The path of the SQLite database file or name of
> + *        the special database to open.
> + * @param shared (optional)
> + *        (bool) Whether to open a shared database handle. Defaults to
> + *        true.

unshared connections are usually faster, this should default to false.

@@ +230,5 @@
> + *        (bool) Whether to open a shared database handle. Defaults to
> + *        true.
> + * @param special (optional)
> + *        (bool) Whether this is a special named database. Defaults to
> + *        false.

I think we should rather make the code wise enough to check if "path" is a special name, just "memory" and "profile" are supported, and I doubt they can be confused with a path...

@@ +232,5 @@
> + * @param special (optional)
> + *        (bool) Whether this is a special named database. Defaults to
> + *        false.
> + */
> +this.SQLiteConnection = function(path, shared=true, special=false) {

I agree with blair that could be better to allow passing a "params" object with options that we can expand in future...

What we were evaluating in Storage was an "open" API that you pass all the needed initialization info and the schema and it hands off the connection to you once all the async initialization is done

some commonly used options to have here off my head: journal, synchronous, readOnly, locking

@@ +263,5 @@
> +   * return unexpected results if multiple statements are performed in
> +   * parallel. It is the caller's responsibility to schedule
> +   * appropriately.
> +   */
> +  get lastInsertRowID() {

well this is partially untrue, we can't rely on this cause we mixup sync and async, since this module is totally async it is realiable, until there's another connection to the same database that is writing.
Likely you can rely on this only if the connection is opened in exclusive mode, maybe we should throw otherwise.

@@ +274,5 @@
> +   *
> +   * The same caveats regarding asynchronous execution for
> +   * `lastInsertRowID` also apply here.
> +   */
> +  get affectedRows() {

indeed... honestly these seem to have more downsides than benefits and I'd leave them out in the first iteration...

@@ +288,5 @@
> +    return this._connection.schemaVersion;
> +  },
> +
> +  set schemaVersion(value) {
> +    this._connection.schemaVersion = value;

this should probably be handled differently, likely passing a schema object as a param when creating the connection, it's easy to misuse this otherwise.
something like
{ 0: [list of statements or functions to execute],
  1: [statements]
}
and the connection should take care of upgrades and downgrades (downgrades are hard, we discussed many times the possibility to have a "meta" table in the database with queries or code to be run on downgrade, but it's a large project to get right).

@@ +303,5 @@
> +   */
> +  open: function () {
> +    if (this._connection) {
> +      return Promise.resolve(this);
> +    }

what if the connection exists but is not ready? I suppose we should reject?

@@ +318,5 @@
> +        } else {
> +          this._log.info("Opening unshared database: " + this._path);
> +          this._connection = Services.storage.openUnsharedDatabase(f);
> +        }
> +      }

you should at least run "pragma TEMP_STORE = MEMORY" and if the user didn't define his own preferences, truncate journal and synchronous = full.

@@ +329,5 @@
> +      this._log.warn("Could not open database: " + CommonUtils.exceptionStr(ex));
> +      return Promise.reject(ex);
> +    }
> +
> +    return Promise.resolve(this);

I'd live if these would be fake-async, that means enqueue to main-thread. It makes really easier to find issues due to wrong assumption that open() is synchronous.

@@ +347,5 @@
> +    }
> +
> +    let deferred = Promise.defer();
> +
> +    this._connection.asyncClose({

we should actually cache the created statements somewhere, and finalize all of them here before closing the connection, otherwise this is going to fail miserably.

@@ +348,5 @@
> +
> +    let deferred = Promise.defer();
> +
> +    this._connection.asyncClose({
> +      complete: function () {

please verify, but I think this has the [function] annotation, so you can likely just pass a function to asyncClose.

@@ +371,5 @@
> +    if (!this._connection) {
> +      throw new Error("Only opened connections can be cloned.");
> +    }
> +
> +    let c = this._connection.clonse(readOnly);

considered the scary things that may happen having 2 connections writing at the same db, I'd vote to change this API to getConcurrentReadOnlyConnection(), it should clone only in read-only mode.
the javadoc should explain quite well that this is useful to improve read concurrency in WAL journal mode, and that in such cases reads can happen at any time, even concurrently to writes, so if one needs serialization has to use promises for it.

finally this should not be synchronous, cloning is cause of contention, we should make it async when possible.

@@ +397,5 @@
> +   */
> +  getStatement: function (sql) {
> +    let statement = this._connection.createAsyncStatement(sql);
> +
> +    return new SQLiteStatement(statement, sql);

I think the concept of "we create a statement and you take care of it" doesn't work with lazy devs. they will forget to finalize them, to re-use them. You know.
What we should do instead is caching ourselves the statements for re-use, also stick the lastUsed timestamp on them. when they are requested we update the timestamp. periodically we finalize() and remove all of those unused for at least X minutes.
On shutdown we go through them and finalize all before closing.
See http://mxr.mozilla.org/mozilla-central/source/storage/public/StatementCache.h for a first idea (it is still missing the expiration part, but apart that is the preferred way to use statements as of now)

@@ +418,5 @@
> +   *        (Array or Object) Parameters to bind to the statement.
> +   */
> +  execute: function (sql, params=null) {
> +    return this.getStatement(sql).execute(params);
> +  },

so... this looks useless duplication db.execute(sql) compared to db.getStatement(sql).execute() is not really a big win...

@@ +446,5 @@
> +  },
> +
> +  rollbackTransaction: function () {
> +    this._connection.rollbackTransaction();
> +  },

transactions are quite an issue, cause if you forget to close a transaction, you're dead (well, your app is). And it happens, just forget a finally or similar...
I wonder if we could rather have a .batchExecute([stmts]) that used connection.executeAsync... that will automagically handle transactions for us...

Again I'd probably leave them out for the first implementation.

@@ +461,5 @@
> +   *        "CREATE TABLE" statement.
> +   *
> +   * @return Promise<string>
> +   */
> +  createTable: function (name, schema) {

I agree with blair, if we need these create and exists methods, just implement them through an async statement, don't rely on the sync API.

Off-hand as I said I'd prefer that magic "schema" object passed when creating the connection... but if we want to tackle that later, make these async for now.

also, tableExists and indexExists are things used just before creating a table or an index... then you could just not expose them and let createXXX throw...

Finally, we need to be able to create indices, and TEMP things, the API doesn't allow to create temp stuff :(
Attachment #685411 - Flags: feedback?(mak77) → feedback+
(In reply to Marco Bonardo [:mak] from comment #23)
> so, I don't think this should be in Services, and I agree it should be
> SQLite.jsm

We all agree on that in principle.
 
> ::: services/common/sqlite.js
> @@ +7,5 @@
> > +this.EXPORTED_SYMBOLS = [
> > +  "SQLiteConnection",
> > +];
> > +
> > +const {interfaces: Ci, utils: Cu} = Components;
> 
> IIRC we decided against this practice, cause is more expensive than the
> classical assignments

Does it really matter for file-level code? I doubt you will be able to measure the difference.

> @@ +14,5 @@
> > +Cu.import("resource://gre/modules/FileUtils.jsm");
> > +Cu.import("resource://gre/modules/Services.jsm");
> > +Cu.import("resource://gre/modules/XPCOMUtils.jsm");
> > +Cu.import("resource://services-common/log4moz.js");
> > +Cu.import("resource://services-common/utils.js");
> 
> most of these could be lazy getters (excluding promise, Services and
> XPCOMUtils), and it should not depends on services-common/utils.js
> we may probably add some protection about importing log4moz, so if the
> module import fails we define a no-op mock of it, to remove dependency until
> it's moved to toolkit.

I'm curious why you feel this. I don't see this practice used very often in backend modules.

> 
> @@ +33,5 @@
> > + * preserved across execution, the caller is relying on undefined behavior.
> > + *
> > + * Instances are created through `SQLiteConnection`, not directly.
> > + */
> > +function SQLiteStatement(statement, sql) {
> 
> global-nit: function/method params should be prefixed with "a"

I thought that was only the C++ style / old JS style. There's a lot of JS written that doesn't a-prefix.

> 
> @@ +54,5 @@
> > +   * for every row.
> > +   *
> > +   * Bound parameters can be defined as an Array of positional arguments or an
> > +   * object mapping named parameters to their values. If there are no bound
> > +   * parameters, the caller can pass nothing or null.
> 
> Honestly since this is new code, I'd enforce the params input being a Map.
> We slowly converted backend to use named bindings since numeric bindings
> were very often subject to bugs (like adding/removing one param but not the
> bound value).
> I wouldn't see that as a limitation, rather an increase in robustness.

While I think we should encourage named parameters/Map, I believe we should still support Array. One use case is where you dynamically construct SQL statements and need to bind parameters. e.g.

  // In the real use case, the Array is not statically defined like this.
  let legalValues = ["foo", "bar", "baz"];

  let clauses = [];
  for (let value of legalValues) {
    clauses.push("value=?");
  }

  let sql = "SELECT * FROM widgets WHERE " + clauses.join(" OR ");
  // WHERE value=? AND value=? AND value=?

  connection.execute(sql, legalValues);

While possible to support with named arguments, it is more difficult.

> 
> @@ +57,5 @@
> > +   * object mapping named parameters to their values. If there are no bound
> > +   * parameters, the caller can pass nothing or null.
> > +   *
> > +   * When `onRow` is not specified, all returned rows are buffered before the
> > +   * returned promise is resolved. For INSERT or UPDATE statements, this has no
> 
> what's the use-case of querying for something without an onRow handler? that
> should likely fail and be considered an error, than depleting resources...
> sqlite has sqlite3_stmt_readonly API that can tell if a prepared statement
> is only selecting, we can expose it somewhere eventually... but as a simple
> check you may just look if the sql begins with SELECT

If you have a LIMIT 1, you may not want to waste your time registering a handler. Leaving onRow optional makes the API simpler to consume in simple cases. I argue that is something good API design should optimize for.

> @@ +77,5 @@
> > +   * has completed fully.
> > +   *
> > +   * The promise will be rejected with an Error instance if the statement did
> > +   * not finish execution fully. The Error may have an `errors` property. If
> > +   * defined, it will be an Array of mozIStorageError instances.
> 
> If you want to properly wrap xpcom and hide it, would be better to avoid
> mozIStorageError

I agree.

> 
> @@ +100,5 @@
> > +      for (let i = 0; i < params.length; i++) {
> > +        this._statement.bindByIndex(i, params[i]);
> > +      }
> > +    } else if (params && typeof(params) == "object") {
> > +      for (let [k, v] in Iterator(params)) {
> 
> fwiw, we can iterate Map (bug 725909)

Except there is no trivial iteration technique to iterate both Map and Object.

Map only works:

  for (let [k, v] of m)

Object only works:

  for (let [k, v] in Iterator(o)
  for (let k in m)

According to jorendorff, Map is not meant as a stand-in replacement for Object (sadly, IMO).

> @@ +288,5 @@
> > +    return this._connection.schemaVersion;
> > +  },
> > +
> > +  set schemaVersion(value) {
> > +    this._connection.schemaVersion = value;
> 
> this should probably be handled differently, likely passing a schema object
> as a param when creating the connection, it's easy to misuse this otherwise.
> something like
> { 0: [list of statements or functions to execute],
>   1: [statements]
> }
> and the connection should take care of upgrades and downgrades (downgrades
> are hard, we discussed many times the possibility to have a "meta" table in
> the database with queries or code to be run on downgrade, but it's a large
> project to get right).

While an interesting idea, this is feature creep and I don't think we should support it in the initial release.

> @@ +318,5 @@
> > +        } else {
> > +          this._log.info("Opening unshared database: " + this._path);
> > +          this._connection = Services.storage.openUnsharedDatabase(f);
> > +        }
> > +      }
> 
> you should at least run "pragma TEMP_STORE = MEMORY" and if the user didn't
> define his own preferences, truncate journal and synchronous = full.

This is some light magic. Now I'm curious whether we should just have connection "profiles" that define behavior for well-known 

> @@ +329,5 @@
> > +      this._log.warn("Could not open database: " + CommonUtils.exceptionStr(ex));
> > +      return Promise.reject(ex);
> > +    }
> > +
> > +    return Promise.resolve(this);
> 
> I'd live if these would be fake-async, that means enqueue to main-thread. It
> makes really easier to find issues due to wrong assumption that open() is
> synchronous.

Promise.{resolve,reject} are fake async. As long as a promise is returned, the callers treats it as if it is async. There could still be weird event loop spinning issues depending on the implementation. But, that's the caller's problem. When storage exposes real async APIs, we simply change this implementation to resolve in the async callback instead of immediately. The external API is preserved.

> @@ +347,5 @@
> > +    }
> > +
> > +    let deferred = Promise.defer();
> > +
> > +    this._connection.asyncClose({
> 
> we should actually cache the created statements somewhere, and finalize all
> of them here before closing the connection, otherwise this is going to fail
> miserably.

Oh really? That is... annoying. Can't finalizing statements be performed as part of a destructor somewhere?

Out of curiosity, what actually fails?

> @@ +371,5 @@
> > +    if (!this._connection) {
> > +      throw new Error("Only opened connections can be cloned.");
> > +    }
> > +
> > +    let c = this._connection.clonse(readOnly);
> 
> considered the scary things that may happen having 2 connections writing at
> the same db, I'd vote to change this API to
> getConcurrentReadOnlyConnection(), it should clone only in read-only mode.
> the javadoc should explain quite well that this is useful to improve read
> concurrency in WAL journal mode, and that in such cases reads can happen at
> any time, even concurrently to writes, so if one needs serialization has to
> use promises for it.
> 
> finally this should not be synchronous, cloning is cause of contention, we
> should make it async when possible.

Async it will be (I wasn't aware how atomic this operation was).

> 
> @@ +397,5 @@
> > +   */
> > +  getStatement: function (sql) {
> > +    let statement = this._connection.createAsyncStatement(sql);
> > +
> > +    return new SQLiteStatement(statement, sql);
> 
> I think the concept of "we create a statement and you take care of it"
> doesn't work with lazy devs. they will forget to finalize them, to re-use
> them. You know.
> What we should do instead is caching ourselves the statements for re-use,
> also stick the lastUsed timestamp on them. when they are requested we update
> the timestamp. periodically we finalize() and remove all of those unused for
> at least X minutes.
> On shutdown we go through them and finalize all before closing.
> See
> http://mxr.mozilla.org/mozilla-central/source/storage/public/StatementCache.
> h for a first idea (it is still missing the expiration part, but apart that
> is the preferred way to use statements as of now)

I agree this wrapper layer should handle the complex work of statement management, including finalization.
 
> Finally, we need to be able to create indices, and TEMP things, the API
> doesn't allow to create temp stuff :(

How does the current API not allow this? Isn't temp table creation not just "CREATE TEMPORARY TABLE" instead of "CREATE TABLE?" The only aspects of the storage API that aren't yet exposed are managing function, progress monitors, growth increment, and enableModule(). I'm not sure if any of these are important enough to make the first cut.
(In reply to Gregory Szorc [:gps] from comment #24)

> > > +function SQLiteStatement(statement, sql) {
> > 
> > global-nit: function/method params should be prefixed with "a"
> 
> I thought that was only the C++ style / old JS style. There's a lot of JS
> written that doesn't a-prefix.

Yeah, we don't do that (or many other "toolkit conventions") for new code -- it's really old (and IMO bad) style. Rock on with your usual style, Greg.

> Except there is no trivial iteration technique to iterate both Map and
> Object.

  for (let [k, v] in (("iterator" in x) ? x.iterator() : Iterator(x))) {
  }

? That worked for me in Web Console…
(In reply to Gregory Szorc [:gps] from comment #24)
> Does it really matter for file-level code? I doubt you will be able to
> measure the difference.

many small differences make a difference :)

> I'm curious why you feel this. I don't see this practice used very often in
> backend modules.

the fact people don't code for perf doesn't mean it's a bad practice.

> I thought that was only the C++ style / old JS style. There's a lot of JS
> written that doesn't a-prefix.

dunno, the coding style guide doesn't make distinction, not that I care much but it has great benefit to distinguish input from the rest. that said, nit.

> While I think we should encourage named parameters/Map, I believe we should
> still support Array. One use case is where you dynamically construct SQL
> statements and need to bind parameters. e.g.
> 
> While possible to support with named arguments, it is more difficult.

we should not take the easy path if it's more error-prone, you can easily build a map out of an array, even just using numbers as labels. I don't think it's more difficult.

> > what's the use-case of querying for something without an onRow handler? that
> > should likely fail and be considered an error, than depleting resources...
> > sqlite has sqlite3_stmt_readonly API that can tell if a prepared statement
> > is only selecting, we can expose it somewhere eventually... but as a simple
> > check you may just look if the sql begins with SELECT
> 
> If you have a LIMIT 1, you may not want to waste your time registering a
> handler. Leaving onRow optional makes the API simpler to consume in simple
> cases. I argue that is something good API design should optimize for.

fwiw, LIMIT 1 is more expensive than querying without a limit and stopping at the first fetch (provided the query returns just one result).  Many db APIs have something like a .getOne() for that specific case.

> Except there is no trivial iteration technique to iterate both Map and
> Object.

Well, I suggested accepting only Maps, so I didn't see that issue!

> > you should at least run "pragma TEMP_STORE = MEMORY" and if the user didn't
> > define his own preferences, truncate journal and synchronous = full.
> 
> This is some light magic. Now I'm curious whether we should just have
> connection "profiles" that define behavior for well-known 

intersting idea, though the "defaults" we take will likely work for 90% of the cases.

> Promise.{resolve,reject} are fake async. As long as a promise is returned,
> the callers treats it as if it is async. 

what I mean is that when I get back the promise, the work should not yet be done. It's really easy to sneak into subtle bugs otherwise (I get the promise and wrongly use some data that should not be used until the promise is fulfilled, but everything keeps working since it did the job synchronously. Then we switch to a real async backend and I have an hard to find bug). Is this what happens?

> > we should actually cache the created statements somewhere, and finalize all
> > of them here before closing the connection, otherwise this is going to fail
> > miserably.
> 
> Oh really? That is... annoying. Can't finalizing statements be performed as
> part of a destructor somewhere?
> 
> Out of curiosity, what actually fails?

Basically if you don't finalize the statements the database doesn't close, we assert like crazy on shutdown (Aborting clearly) and if it's not on shutdown we leak a bunch of stuff.
Really, we should handle this for the user.

> > Finally, we need to be able to create indices, and TEMP things, the API
> > doesn't allow to create temp stuff :(
> 
> How does the current API not allow this? Isn't temp table creation not just
> "CREATE TEMPORARY TABLE" instead of "CREATE TABLE?"

yes, I meant it doesn't handle it in the createTable API
http://mxr.mozilla.org/mozilla-central/source/storage/public/mozIStorageConnection.idl#269

you can do whatever you want with pure SQL.

> The only aspects of the
> storage API that aren't yet exposed are managing function, progress
> monitors, growth increment, and enableModule(). I'm not sure if any of these
> are important enough to make the first cut.

don't think they are. growth increment may be part of the input params to the connection fwiw.
(In reply to Marco Bonardo [:mak] from comment #26)
> > > what's the use-case of querying for something without an onRow handler? that
> > > should likely fail and be considered an error, than depleting resources...
> > > sqlite has sqlite3_stmt_readonly API that can tell if a prepared statement
> > > is only selecting, we can expose it somewhere eventually... but as a simple
> > > check you may just look if the sql begins with SELECT
> > 
> > If you have a LIMIT 1, you may not want to waste your time registering a
> > handler. Leaving onRow optional makes the API simpler to consume in simple
> > cases. I argue that is something good API design should optimize for.
> 
> fwiw, LIMIT 1 is more expensive than querying without a limit and stopping
> at the first fetch (provided the query returns just one result).  Many db
> APIs have something like a .getOne() for that specific case.

Maybe we should add this then.
 
> > Except there is no trivial iteration technique to iterate both Map and
> > Object.
> 
> Well, I suggested accepting only Maps, so I didn't see that issue!

Blair suggested handling them both. I like the idea in principle. I don't like writing a non-trivial for statement. Also, jorendorff told me that I shouldn't use .iterator() because it's changing.

> > Promise.{resolve,reject} are fake async. As long as a promise is returned,
> > the callers treats it as if it is async. 
> 
> what I mean is that when I get back the promise, the work should not yet be
> done. It's really easy to sneak into subtle bugs otherwise (I get the
> promise and wrongly use some data that should not be used until the promise
> is fulfilled, but everything keeps working since it did the job
> synchronously. Then we switch to a real async backend and I have an hard to
> find bug). Is this what happens?

It is true that if you use an API improperly there might be a bug. We /could/ set a nextTick event or do event loop spinning, etc. While defensive, that feels a bit hacky to me. I think any code with any test coverage would start failing with obvious "connection not ready" errors the moment we make this async. (I'm assuming there is test coverage of all callers.)

There has been talk of creating special builds that insert extra no-op event loop spins to isolate poorly-written JS. We could extend this to Promise.{resolve,reject} and insert random delays to find code that doesn't use .then() properly. I think I would prefer this to doing effectively the same in individual components all across the tree.
(In reply to Gregory Szorc [:gps] from comment #24)
>> so, I don't think this should be in Services, and I agree it should be
>> SQLite.jsm
>
>We all agree on that in principle.

In practice, I think this file shouldn't land in services/common. Hopefully we can sort out what that involves exactly without much delay, and to everyone's satisfaction - I am optimistic!

> > most of these could be lazy getters
> I'm curious why you feel this. I don't see this practice used very often in
> backend modules.

It helps avoid non-obvious perf/memory use side effects from importing modules. Just loading the JS for a large module can have a large impact in IO/memory constrained environments, and so doing it lazily where possible makes sense. You should use your judgement, obviously - some modules are used/loaded unconditionally by the module being imported, or are already guaranteed to be used/loaded (e.g. Services.jsm).

> > global-nit: function/method params should be prefixed with "a"
> 
> I thought that was only the C++ style / old JS style. There's a lot of JS
> written that doesn't a-prefix.

I generally agree that we should abandon the a-prefixing style guideline, FWIW.
I have what may be an API request, since often we inherit a connection from an xpcom interface, could be nice if this API could take that, so if I already have a connection instead of invoking open() i can invoke fromConnection() or something like that.

also, onRow receives directly the row, on it I still have to invoke .getResultByName or .getResultByIndex, that is quite annoying... may we js-ify this?
Where should we put this code? I think the candidates are /storage/ and somewhere in Toolkit, possibly /toolkit/modules.
(In reply to Marco Bonardo [:mak] from comment #26)
> (In reply to Gregory Szorc [:gps] from comment #24)
> > Does it really matter for file-level code? I doubt you will be able to
> > measure the difference.
> 
> many small differences make a difference :)

I asked in #jsapi and the JS team said that destructured assignment vs classical isn't relevant here. It only becomes an issue in tight loops. And, if I am reading the scrollback properly, it's only an issue if the tight loop is jitted. Apparently the engine doesn't currently optimize destructured assignment to the degree of classical. On top of that file-level, import time code always goes through the interpreter and the difference between classical and destructured for interpreted code is "not measurable."

I'm going to keep using destructured assignment for the Components foo because, well, 1 line is fewer than 3 or 4. I encourage others to reconsider their usage of destructured assignment.
(In reply to Gregory Szorc [:gps] from comment #24)
> (In reply to Marco Bonardo [:mak] from comment #23)
> > If you want to properly wrap xpcom and hide it, would be better to avoid
> > mozIStorageError
> 
> I agree.

I think I'm going to change my mind. This sounds like a good idea in principle. But, mozIStorageError is essentially a dumb container already. It has the attributes {result, message} and some constants to distinguish between result codes. If I were to implement this in JS, it would look nearly the same and just add a new object of overhead. Unless there is a compelling reason to sever the link, I'm inclined to expose the raw mozIStorageError. Although, I may just document the API (and not say they are mozIStorageError instances).
I think you should just put it in toolkit/modules.
Technically, storage is platform and not toolkit... but I think doesn't matter much since this is just a wrapper.
Depends on: 819033
Attached patch Generic SQLite JS interface, v3 (obsolete) — Splinter Review
I made some significant changes to the API based on feedback.

We now have separate types for closed and open sqlite connections. So, it's no longer possible to .open() and rely on that really be sync under the hood.

I also introduced a factory abstraction where you can obtain connections from well-defined "profiles." It's not implemented because I'm not sure what should go in here. We can kill it or populate.

The SqliteStatement type is gone. Instead, all statement management in performed in the OpenedSqliteConnection type. We have two classes of statements: one-shot and registered/named. One-shot are finalized as soon as they finish. Registered/named stick around until they are registered or the connection is closed.

Closing a connection will now abort any active transaction, cancel active statements, and finalize all statements.

There is no complicated LRU cache or anything like that. However, the current API allows one to exist in the future if we so wanted. Such a change would be transparent to callers (assuming we cache the original SQL strings for registered statements - I'm not concerned about memory bloat here). I'm inclined to not worry about resource consumption until we have data proving it is an issue. Even then, I think we can punt implementation to a follow-up. As long as the API enables a seamless transition...

Transactions are now performed with light magic. Callers pass a Task.spawn() generator function into executeTransaction(). If that is resolved, the transaction is committed. Else, it is rolled back. It is kinda like an automatic try..finally. I think it will result in some pretty clean code for dealing with transactions.

There are many things not yet implemented. I'm sure there are bugs. I still need to invent a type to represent rows. I still need to incorporate some feedback. The tests are horribly broken. close() doesn't invalidate the connection properly. But, it's a step in a new direction and I want to get feedback before I commit myself too far.

Drive-by's encouraged.
Attachment #685411 - Attachment is obsolete: true
Attachment #689504 - Flags: feedback?(rnewman)
Attachment #689504 - Flags: feedback?(mak77)
Comment on attachment 689504 [details] [diff] [review]
Generic SQLite JS interface, v3

I love this patch's filenames!
Attachment #689504 - Flags: feedback+
Comment on attachment 689504 [details] [diff] [review]
Generic SQLite JS interface, v3

Review of attachment 689504 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/modules/sqlite.jsm
@@ +6,5 @@
> +
> +this.EXPORTED_SYMBOLS = [
> +  "getUnopenedSqliteConnection",
> +  "UnopenedSqliteConnection",
> +];

I have contrasting feelings about this version of the patch.

So, I'll start from the neutrals. Either we make Cu.import case insensitive, or we keep respecting the de-facto standard, that means the module name must be ucfirst. The problem is that otherwise everyone has to remember if the module starts with an uc or lc letter, that sucks. I know now there is the trend to go lc, and I think it's negative, at least until we make import insensitive.

Now to the (imo) negative. I think it's really better to just expose one object from the module, named like the module, for some reasons:
- defineLazyModuleGetter imports only one object per module, that means either we fix it to import everything (making its syntax more complex), or everyone will start again writing their own lazy module loaders
- it greatly helps memory to know that once you import module Cat, you have a Cat object in scope
- it also helps naming conflicts in the scope, here I import sqlite.jsm and get a bunch of strangely names objects in scope

So, even if just as a forwarder to internal objects, I'd consider it.

@@ +28,5 @@
> +
> +
> +// Counts the number of created connections. Each connection's count is used
> +// for logging to distinguish connection instances.
> +let connectionCounter = 0;

it's a good idea to have a counter, would be even better if the counter would per per db name, so places.sqlite#1, cookies.sqlite#1, places.sqlite#2

@@ +41,5 @@
> + * @return UnopenedSqliteConnection
> + */
> +this.getUnopenedSqliteConnection = function(name) {
> +  throw new Error("Unknown Sqlite connection profile: " + name);
> +}

So, I was brainstorming on an helper that prebuilds params based on some specific profiles, in the form (random pseudocode)
sqlite.open(dbpath, sqlite.profiles.DISK_CACHE);
so we can expose a connectionProfiles object with some pre-built profiles for most common cases

some example profiles (more options may come in future)

DEFAULT = {
  journal: "truncate",
  sync: "full",
  readOnly: false,
  locking: "exclusive"
}

CONCURRENT_SAFE_STORE = {
  journal: "wal",
  sync: "normal",
  readOnly: false
  locking: "normal"
}

SHARED_SAFE_STORE = {
  journal: "truncate",
  sync: "full",
  readOnly: false
  locking: "normal"
}

EXCLUSIVE_MEMORY_CACHE = {
  journal: "memory",
  sync: "off",
  readOnly: "false"
  locking: "exclusive"
}

Unfortunately I have no idea how to make this "expandable" with additional options like the schema (as previously outlined), since Object.create() syntax is not exactly sugar :)
maybe something like

let params = sqlite.profiles.extend(sqlite.profiles.MEMORY_CACHE, {additional_options});
sqlite.open(dbpath, params);
or
sqlite.openWithProfile(sqlite.profiles.MEMORY_CACHE, {additional_options});

@@ +81,5 @@
> + */
> +this.UnopenedSqliteConnection = function (options) {
> +  this._log = Log4Moz.repository.getLogger("Sqlite.UnopenedSqliteConnection");
> +
> +  if ("path" in options && "special" in options) {

I think open(path) is easier than open({path: path}) if one wants just to use default options
and I still think special is a useless option... it was needed in cpp cause you can't make the same param an nsIFile or a string, but this is js, we can do better.

@@ +133,5 @@
> +        this._log.warn("Connection is not ready.");
> +        return Promise.reject(new Error("Connection is not ready."));
> +      }
> +
> +      return Promise.resolve(new OpenedSqliteConnection(connection, number));

I like that you splitted a new type for open... when we close the connection maybe we should get back an UnopenedConnection from the promise, so we can open it again? That would be useful to quickly clear memory databases.

@@ +223,5 @@
> +   * `lastInsertRowID` also apply here.
> +   */
> +  get affectedRows() {
> +    return this._connection.affectedRows;
> +  },

I still think lastInsertRowId and affectedRows are open traps ready to make their victims :) But let's be positive.
The comment should clearly state that they are safe to use only if the database has a SINGLE connection with write permissions, and that connection is handled through this module.
Exclusive locking would warranty more but has some other involvements in regards to concurrency (iirc you have to generate readonly clones before the first write, that is the reason we are not yet using it in Places).

@@ +255,5 @@
> +   */
> +  close: function () {
> +    if (!this._connection) {
> +      return Promise.resolve();
> +    }

due to bug 726990 and consequences (sync statements may still be created in the limbo after asyncClose since statements creation checks connectionReady!), I'd feel safer if you'd track internally a connection .state (OPEN, CLOSING, CLOSED, but maybe just a bool could do), and refuse to do any operation if the state is not OPEN.
Yes, clearly would be better to fix Storage, but in the meanwhile...

@@ +265,5 @@
> +      this._inProgressTransaction = null;
> +    }
> +
> +    // Cancel any in-progress statements.
> +    for (let [k, v] of this._inProgressStatements) {

I think internally I'd prefer keeping the pendingStatements naming, just for coherence with the Storage API, so we know which kind of objects we are talking to

@@ +268,5 @@
> +    // Cancel any in-progress statements.
> +    for (let [k, v] of this._inProgressStatements) {
> +      v.cancel();
> +    }
> +    this._inProgressStatements = new Map();

I don't see why this is a Map, it should just be a sparse Array since we only need to push stuff to it, delete by index and iterate... I'll say more, I think we don't even need to access it by index, since all the queries are serialized, so they finish in the order they begin, you can just have a fifo array here, if I'm not wrong

@@ +279,5 @@
> +
> +    for (let [k, v] of this._namedStatements) {
> +      v.finalize();
> +    }
> +    this._namedStatements = new Map();

I think we should not have "named" statements, rather let the user repeat the whole sql. May sound crazy but actually helps a lot having a readable statement near the code using it.
In Places we had named statements for many years, and it was a mess, you always have to go elsewhere to look what you are doing and if you are using the right name... Now we just use the SQL string as hash, and it greatly helped finding bugs.
SQL string can be hashed quite well and fast and be used as key.
Then, if the user wants to "name" those, he can just assign the SQL string to a named variable...

And I don't think it's really nice to request the user to unregister a statement... if a statement is cached is cause it's considered useful for the whole app life, and likely will run often. thus it's unlikely one needs to unregister it. Otherwise he would probably use a non cached statement.
The "drop old statements" behavior was exactly to simplify manual management, this should be a funny API to use, no need to micro-manage resources

@@ +288,5 @@
> +      complete: function () {
> +        this._log.info("Connection closed.");
> +        this._connection = null;
> +
> +        deferred.resolve();

As said should probably give back an unopened connection with the same params, ready to re-open.

@@ +478,5 @@
> +      throw new Error("A transaction is already active. Only one transaction " +
> +                      "can be active at a time.");
> +    }
> +
> +    this._connection.beginTransactionAs(type);

I like this automatic transaction management

@@ +516,5 @@
> +   */
> +  tableExists: function (name) {
> +    if (!this._namedStatements.has("_tableExists")) {
> +      this.registerStatement("_tableExists",
> +        "SELECT name FROM sqlite_master WHERE type='table' AND name=?");

this doesn't work for temp tables, you should union with sqlite_temp_master

I also don't think this should be a registered statement, it's used seldomly, basically once startup is done this becomes wasted resource

@@ +537,5 @@
> +   */
> +  indexExists: function (name) {
> +    if (!this._namedStatements.has("_indexExists")) {
> +      this.registerStatement("_indexExists",
> +        "SELECT name FROM sqlite_master WHERE type='index' AND name=?");

ditto for temp

@@ +545,5 @@
> +      function onResult(rows) {
> +        return Promise.resolve(rows.length > 0);
> +      }
> +    );
> +  },

I think would be useful to add a fieldEsists or whatever you want to call it, to check if a certain column exists in the table

the only remaining thing then will be triggerExists...

::: toolkit/modules/tests/xpcshell/test_sqlite.js
@@ +28,5 @@
> +
> +  let c = yield getConnection(name);
> +
> +  for (let [k, v] in Iterator(TABLES)) {
> +    yield c.execute("CREATE TABE " + k + "(" + v + ")");

TABE  is wrong but you probably know already!
Attachment #689504 - Flags: feedback?(mak77)
(In reply to Marco Bonardo [:mak] (Away 10-11 Dec) from comment #37)
> Now to the (imo) negative. I think it's really better to just expose one
> object from the module, named like the module, for some reasons:

I disagree. I think modules should be allowed to export as many symbols as they want. I do agree that exposing a "proxy" symbol is convenient.

> - defineLazyModuleGetter imports only one object per module, that means
> either we fix it to import everything (making its syntax more complex), or
> everyone will start again writing their own lazy module loaders

Bug 732385 (which you filed) is relevant here. Looking at the source, it should be a minimal patch.

> @@ +28,5 @@
> > +
> > +
> > +// Counts the number of created connections. Each connection's count is used
> > +// for logging to distinguish connection instances.
> > +let connectionCounter = 0;
> 
> it's a good idea to have a counter, would be even better if the counter
> would per per db name, so places.sqlite#1, cookies.sqlite#1, places.sqlite#2

Good idea!


> @@ +133,5 @@
> > +        this._log.warn("Connection is not ready.");
> > +        return Promise.reject(new Error("Connection is not ready."));
> > +      }
> > +
> > +      return Promise.resolve(new OpenedSqliteConnection(connection, number));
> 
> I like that you splitted a new type for open... when we close the connection
> maybe we should get back an UnopenedConnection from the promise, so we can
> open it again? That would be useful to quickly clear memory databases.

Yes, that would be a nice feature. I'll see about implementing it.

> 
> @@ +223,5 @@
> > +   * `lastInsertRowID` also apply here.
> > +   */
> > +  get affectedRows() {
> > +    return this._connection.affectedRows;
> > +  },
> 
> I still think lastInsertRowId and affectedRows are open traps ready to make
> their victims :) But let's be positive.
> The comment should clearly state that they are safe to use only if the
> database has a SINGLE connection with write permissions, and that connection
> is handled through this module.

Huh? These /should/ be per-connection. Is Storage doing something funky under the hood to break this property of SQLite's API?

https://www.sqlite.org/c3ref/last_insert_rowid.html

> 
> @@ +268,5 @@
> > +    // Cancel any in-progress statements.
> > +    for (let [k, v] of this._inProgressStatements) {
> > +      v.cancel();
> > +    }
> > +    this._inProgressStatements = new Map();
> 
> I don't see why this is a Map, it should just be a sparse Array since we
> only need to push stuff to it, delete by index and iterate... I'll say more,
> I think we don't even need to access it by index, since all the queries are
> serialized, so they finish in the order they begin, you can just have a fifo
> array here, if I'm not wrong

Isn't a sparse array just a map? :) Now, fifo queues are another story altogether...

I thought SQLite allowed you to issue multiple statements in parallel on the same connection? That's one reason I went with a Map. Does Storage not allow this or am I wrong about SQLite's abilities?
 
> @@ +279,5 @@
> > +
> > +    for (let [k, v] of this._namedStatements) {
> > +      v.finalize();
> > +    }
> > +    this._namedStatements = new Map();
> 
> I think we should not have "named" statements, rather let the user repeat
> the whole sql. May sound crazy but actually helps a lot having a readable
> statement near the code using it.
> In Places we had named statements for many years, and it was a mess, you
> always have to go elsewhere to look what you are doing and if you are using
> the right name... Now we just use the SQL string as hash, and it greatly
> helped finding bugs.
> SQL string can be hashed quite well and fast and be used as key.
> Then, if the user wants to "name" those, he can just assign the SQL string
> to a named variable...

You can make the "name" be the SQL statement. I'll change the function so when it receives 1 defined argument it will assume that is both the name and the SQL.

> And I don't think it's really nice to request the user to unregister a
> statement... if a statement is cached is cause it's considered useful for
> the whole app life, and likely will run often. thus it's unlikely one needs
> to unregister it.

That API is there mostly for completeness sake. I doubt many people will use it in practice. It does allow for more control, if needed.

> @@ +545,5 @@
> > +      function onResult(rows) {
> > +        return Promise.resolve(rows.length > 0);
> > +      }
> > +    );
> > +  },
> 
> I think would be useful to add a fieldEsists or whatever you want to call
> it, to check if a certain column exists in the table
> 
> the only remaining thing then will be triggerExists...

Scope creep. Follow-up bugs.
(In reply to Gregory Szorc [:gps] from comment #38)
> (In reply to Marco Bonardo [:mak] (Away 10-11 Dec) from comment #37)
> > Now to the (imo) negative. I think it's really better to just expose one
> > object from the module, named like the module, for some reasons:
> 
> I disagree. I think modules should be allowed to export as many symbols as
> they want. I do agree that exposing a "proxy" symbol is convenient.

"Allowed to" and "should" are different things. What are you suggesting concretely? Exporting getUnopenedSqliteConnection/UnopenedSqliteConnection in addition to a "SQLite" proxy object that has getUnopenedConnection/UnopenedConnection properties that point to them?
(In reply to Gregory Szorc [:gps] from comment #38)
> Huh? These /should/ be per-connection. Is Storage doing something funky
> under the hood to break this property of SQLite's API?

OOps, you are right, I was confused by our past issues with contemporary sync and async usage.  Util this API never exposes the raw connection or allows one to execute sync statement we should be fine. Sorry for confusion.

> I thought SQLite allowed you to issue multiple statements in parallel on the
> same connection? That's one reason I went with a Map. Does Storage not allow
> this or am I wrong about SQLite's abilities?

Yes but everything on a connection is serialized. this is how sqlite handles thread safety, so until you are working with a single connectio things will execute and return in a serialized way.

> That API is there mostly for completeness sake. I doubt many people will use
> it in practice. It does allow for more control, if needed.

or confuse people using it, they will start guessing if and when they have to unregister. Someone may also unregister wrongly and try to use unregistered stuff. That's why I prefer automatic management. just think of NS_RELEASE and why it's better to have automatic COMPtr (ok the example is EXTREME but, just to figure what I mean).
I've been using this version of the module for FHR work and am happy with the feature set and the API. Are there gaps? Yes. Might we make API changes? Possibly. But, perfect is the enemy of done and I think this patch as-is is better than anything in the tree currently.

Review notes:

* We still pass mozIStorageRow to onRow handlers. This is the main potential API breaking change I could see us making. Convince me wrapping mozIStorageRow is a good idea both in terms of advantages and in terms of performance. I'm not sure it is.
* I nuked the named profiles feature. Follow-up bug.
* I nuked unregisterStatement. We can add it easily if there is a use case.
* I retained Map instead of a queue for in progress statements. The overhead should be marginal and Map conceptually fits what SQLite is capable of doing. Using a queue would be leaking implementation details from storage into this module.
* Log messages now contain the DB name and connection number. It's pretty handy, IMO.
* Yes, we import log4moz.js from services-common. I believe we agreed "meh" and bug 451283 will eventually provide an answer.
* We're not looking in temp tables for tableExists and indexExists. Follow-up bug.
* lastInsertRowID and affectedRows still exist. They are documented as known foot guns.
* parameters to statements can be an Array of Object. Documentation says Object is preferred. I don't believe the module should prevent valid SQL usage.
* close() doesn't return an unopened connection object. Follow-up bug.
* The module doesn't export a symbol having the name of the module. Meh. 
* I'm sure I missed comments from previous reviews. A lot has been said about this patch!

It's worth repeating: perfect is the enemy of done. This needs to land in Firefox 20 to support FHR. Hopefully the API is stable. If not, then I guess early adopters incur some technical debt. Oh well. Very few things are perfect the first time through.
Attachment #689504 - Attachment is obsolete: true
Attachment #689504 - Flags: feedback?(rnewman)
Attachment #692107 - Flags: review?(mar.castelluccio)
Attachment #692107 - Flags: review?(bmcbride)
Attachment #692107 - Flags: feedback?(rnewman)
Comment on attachment 692107 [details] [diff] [review]
Promise-based SQLite interface, v4

Can't believe I did that again.
Attachment #692107 - Flags: review?(mar.castelluccio) → review?(mak77)
(In reply to Gregory Szorc [:gps] from comment #41)
> * The module doesn't export a symbol having the name of the module. Meh. 

I think I would prefer exporting a single SQLite symbol such that consumers would write:
new SQLite.unopenedConnection(...)

What are your thoughts?
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #43)
> (In reply to Gregory Szorc [:gps] from comment #41)
> > * The module doesn't export a symbol having the name of the module. Meh. 
> 
> I think I would prefer exporting a single SQLite symbol such that consumers
> would write:
> new SQLite.unopenedConnection(...)
> 
> What are your thoughts?

In the absence of a hierarchical modules system, I'm sympathetic to the idea of simulating namespaces by exporting Objects in the manner you described.
Now with ∞ more "Sqlite" in EXPORTED_SYMBOLS.
Attachment #692107 - Attachment is obsolete: true
Attachment #692107 - Flags: review?(mak77)
Attachment #692107 - Flags: review?(bmcbride)
Attachment #692107 - Flags: feedback?(rnewman)
Attachment #692117 - Flags: review?(mak77)
Attachment #692117 - Flags: review?(bmcbride)
Attachment #692117 - Flags: feedback?(rnewman)
Comment on attachment 692117 [details] [diff] [review]
Promise-based SQLite interface, v5

Review of attachment 692117 [details] [diff] [review]:
-----------------------------------------------------------------

I'm happy with this. Nice work, Greg!

::: toolkit/modules/Sqlite.jsm
@@ +33,5 @@
> +
> +/**
> + * Represents an unopened connection to a SQLite database.
> + *
> + * All connections are obtained by eventually calling this function.

I don't think "eventually" is the right word, here. "All `OpenedConnection`s start life as `UnopenedConnection`s" or somesuch?

@@ +56,5 @@
> + *   shared -- (bool) Whether the connection is shared among all threads.
> + *       Defaults to true.
> + *
> + *
> + * TODO options to control:

File bugs, plz.

@@ +387,5 @@
> +   * If an exception is thrown during execution of an `onRow` handler, it is
> +   * typically ignored and subsequent rows will continue to result in calls to
> +   * the `onRow` handler.
> +   *
> +   * If an `onRow` handler throws a `StopIteration` instance, the execution

Start this sentence with "The exception is if an…", because it clarifies the "typically" in the previous paragraph.

@@ +414,5 @@
> +    this._ensureOpen();
> +
> +    let statement = this._namedStatements.get(name);
> +
> +    if (!name) {

Either this should be 

  if (!statement) {

or the check should be before looking up the statement by name!

@@ +419,5 @@
> +      throw new Error("Unknown named statement. Did you forget to register " +
> +                      "it?: " + name);
> +    }
> +
> +    return this._executeStatement(statement[0], statement[1], params, onRow);

Definitely needs a check for statement, then.

@@ +438,5 @@
> +   * @param onRow optional
> +   *        (function) Callback to receive result of a single row.
> +   */
> +  execute: function (sql, params=null, onRow=null) {
> +    if (!sql || typeof(sql) != "string") {

Check for !sql is redundant.

@@ +455,5 @@
> +    this._executeStatement(statement, sql, params, onRow).then(
> +      function onResult(rows) {
> +        this._anonymousStatements.delete(index);
> +        deferred.resolve(rows);
> +      }.bind(this),

Wouldn't it be nice if there were syntactic sugar for a "thisfunction"?

@@ +651,5 @@
> +
> +      handleCompletion: function (reason) {
> +        let endTime = Date.now();
> +        self._log.debug("Stmt #" + index + " finished in " +
> +                        (endTime - startTime) + "ms");

Is this precise enough for timing?
Attachment #692117 - Flags: feedback?(rnewman) → feedback+
I can't make to this before Monday, though I'm happy my main concern was addressed: Sqlite module with even more Sqlite flavor.
(In reply to Richard Newman [:rnewman] from comment #46)
> @@ +455,5 @@
> > +    this._executeStatement(statement, sql, params, onRow).then(
> > +      function onResult(rows) {
> > +        this._anonymousStatements.delete(index);
> > +        deferred.resolve(rows);
> > +      }.bind(this),
> 
> Wouldn't it be nice if there were syntactic sugar for a "thisfunction"?

I think it's called Lua (or any language where "this" is passed in as an explicit argument and thus you can name it whatever you want) :)
 
> @@ +651,5 @@
> > +
> > +      handleCompletion: function (reason) {
> > +        let endTime = Date.now();
> > +        self._log.debug("Stmt #" + index + " finished in " +
> > +                        (endTime - startTime) + "ms");
> 
> Is this precise enough for timing?

I added that timing in the latest revision because "why not." I'm pretty sure storage is already doing things under the hood to report slow queries and the like. What I'm trying to say is I think I could cut this without losing much.
Addressed feedback from rnewman.
Attachment #692117 - Attachment is obsolete: true
Attachment #692117 - Flags: review?(mak77)
Attachment #692117 - Flags: review?(bmcbride)
Attachment #692364 - Flags: review?(mak77)
Attachment #692364 - Flags: review?(bmcbride)
Comment on attachment 692364 [details] [diff] [review]
Promise-based SQLite interface, v6

Review of attachment 692364 [details] [diff] [review]:
-----------------------------------------------------------------

Overall I'd not be scared about landing this.
And I'm fine with the perfect VS good discussion. Though I also think we needed a solid base to start building incrementally, so the long discussion has been worth it.

I still have a couple concerns (better expressed later in review comments):
1. I think logging should be enabled only in DEBUG or behind a pref.  At least until we figure the privacy and performances implications (that is probably when we'll have a toolkit logging module).
2. I still dislike the executeNamed API (even if it's better than before). Though, I'd be fine to retain it if we'd add also an executeCached(SQL) API, that imo is far easier to use properly. Even in a follow-up. Clearly before throwing out 2 APIs that do the same thing, would be good to listen to what others think.

That said, the patch since it's quite solid, so I'd be fine with landing it with the currently implemented features once my concerns are answered.

::: toolkit/modules/Makefile.in
@@ +14,5 @@
> +  $(NULL)
> +
> +TEST_DIRS += tests
> +
> +EXTRA_JS_MODULES = $(modules)

I don't understand the need for the modules var...

::: toolkit/modules/Sqlite.jsm
@@ +13,5 @@
> +Cu.import("resource://gre/modules/commonjs/promise/core.js");
> +Cu.import("resource://gre/modules/osfile.jsm");
> +Cu.import("resource://gre/modules/Services.jsm");
> +Cu.import("resource://gre/modules/XPCOMUtils.jsm");
> +Cu.import("resource://services-common/log4moz.js");

What's the performance impact of log4moz? Does it cause any main-thread IO? (I see some calls to .exists() and such)

Where does it store files? Are we logging queries with it? If so, this may be a privacy hit for users. If instead we just log operations without any user data, that's fine. Basically we should not log SQL queries with it, since one may (wrongly) hardcode values into them.

I think that, if we have any doubts about these 2 questions, would be better to enable logging only in debug builds, or behind a pref, otherwise it should be a no-op.
The other advantage would be that projects not using services may just drop the logging feature and still use this module (provided we also condition the services-common/utils.js import and usage).

@@ +16,5 @@
> +Cu.import("resource://gre/modules/XPCOMUtils.jsm");
> +Cu.import("resource://services-common/log4moz.js");
> +
> +XPCOMUtils.defineLazyModuleGetter(this, "CommonUtils",
> +                                  "resource://services-common/utils.js");

scope creep warning: Would be better to expose a way to get the stack trace in toolkit/content/debug.js and use it also in Sync... In future debug.js should also be converted to a proper module in toolkit/modules and fixed (like, it should not throw a dialog in front of release users when it asserts). Could be even better to have a single DebugUtils module managing both logging and assert and automatically disabling those in releases...

@@ +49,5 @@
> + *   special -- (bool) The name of a special database to open. This is
> + *       mutually exclusive with `path`.
> + *
> + *   shared -- (bool) Whether the connection is shared among all threads.
> + *       Defaults to true.

it's not the connection being shared among the threads, but the memory cache being shared among the connections: "If shared-cache mode is enabled and a thread establishes multiple connections to the same database, the connections share a single data and schema cache. This can significantly reduce the quantity of memory and IO required by the system."
The downside of shared mode is that the connections can't concurrently use the cache (they have to acquire locking on it), so it can be slower. It's the classical memory VS performance choice.

@@ +73,5 @@
> +    // Retains absolute paths and normalizes relative as relative to profile.
> +    this._path = OS.Path.join(OS.Constants.Path.profileDir, options.path);
> +  }
> +
> +  this._special = options.special;

I think you want to check that special is one of the supported values, throw otherwise

@@ +107,5 @@
> +          fn = Services.storage.openUnsharedDatabase;
> +        }
> +      }
> +
> +      let basename = OS.Path.basename(this._path);

what if this is _special? I suppose we want to use the special name as basename...

just FYI, connection.databaseFile.leafName may also be used at any time (though it's better to cache the basename like you do).

@@ +110,5 @@
> +
> +      let basename = OS.Path.basename(this._path);
> +      if (basename.endsWith(".sqlite")) {
> +        basename = basename.slice(0, -1 * ".sqlite".length);
> +      }

I'm not sure I like this, I may have a "mydb" and a "mydb.sqlite" files, they are different dbs but this makes them the same...

@@ +113,5 @@
> +        basename = basename.slice(0, -1 * ".sqlite".length);
> +      }
> +
> +      if (!(basename in connectionCounters)) {
> +        connectionCounters[basename] = 0;

I think a 1-based counter makes more sense in the logging case ("connection number N did this...", "We have N connections to db") we are not robots :)

@@ +158,5 @@
> + * `executeNamed` is used to execute a statement registered with the connection
> + * instance. Clients register multiple-use SQL statements via
> + * `registerStatement`. They later execute a saved statement by referencing
> + * its name. `executeNamed` should be used for all SQL statements that are
> + * used multiple times.

The downside of your approach with using the sql as name and value, is that instead of a simple:

fn1() {
  executeCached("SELECT bla, bla");
}
fn2() {
  executeCached("SELECT bla, bla");
}

where the first use will just cache it, I have to figure out the first use of the statement, registerStatement there:

fn1() {
  let sql = "SELECT bla, bla";
  registerStatement(sql, sql);
  executeNamed(sql);
}
fn2() {
  executeNamed("SELECT bla, bla");
}

and if I change order of the invocations I must move the registration. Otherwise I need to write a really long constructor that registers all the statements in a code point where they are not needed at all. Provided I have a constructor and I'm not instead just working in a const module object.
The only advantage is that I can use a name, but as I said I can do the same with a const MY_SQL = "SELECT bla, bla"; executeCached(MY_SQL);

If others agree with your registerStatement+executeNamed, may we evaluate the addition of an executeCached that uses sql as name and does automatic registration? it's quite easy to do and has (imo) great readability advantages vs manual registration.

@@ +184,5 @@
> +    let lc = level.toLowerCase();
> +    log[lc] = function (msg) {
> +      return Log4Moz.Logger.prototype[lc].call(log, "Conn #" + number + ": " + msg);
> +    }
> +  }

all of this logging init code doesn't seem to pertain here, maybe you want a global wrapper handling it?

@@ +197,5 @@
> +  this._namedStatements = new Map();
> +  this._anonymousStatements = new Map();
> +  this._anonymousCounter = 0;
> +  this._inProgressStatements = new Map();
> +  this._inProgressCounter = 0;

IIRC Map has gained a .size property lately, so you may not need a counter to track the index, unless you explicitly want a monotonic counter to have a sort of a unique id...

@@ +257,5 @@
> +
> +  /**
> +   * Close the database connection.
> +   *
> +   * This should be performed when you are finished with the database.

s/should/must/

@@ +306,5 @@
> +    this._connection.asyncClose({
> +      complete: function () {
> +        this._log.info("Closed");
> +        this._connection = null;
> +        this._open = false;

this._open should be set before invoking asyncClose. the critical point is between the invocation of asyncClose and that of complete(). Before asyncClose you can do whatever you want. After complete() we set _connection to null. _open should be used to cover the remaining timeframe between the 2.

@@ +407,5 @@
> +   */
> +  executeNamed: function (name, params=null, onRow=null) {
> +    this._ensureOpen();
> +
> +    let statement = this._namedStatements.get(name);

nit: you may deconstruct this as:
let [statement, sql] = this._namedStatements.get(name) || [];

mostly cause statement[0], statement[1] is a bit unclear.

@@ +599,5 @@
> +    let rows;
> +
> +    if (!onRow) {
> +      rows = [];
> +    }

this deserves a comment

@@ +627,5 @@
> +          try {
> +            onRow(self, row);
> +          } catch (e if e instanceof StopIteration) {
> +            userCancelled = true;
> +            pending.cancel();

iirc cancel is not immediate, that is you are cancelling the stmt but likely you may still get some result that was already buffered...
maybe would be safer to change the for loop condition to "row && !userCancelled", so we are sure we respect our own documentation above.

@@ +686,5 @@
> +    if (this._open) {
> +      return;
> +    }
> +
> +    this._open = false;

what is this needed for? I think here you may just:
if (!this._open)
  throw...
Attachment #692364 - Flags: review?(mak77)
(In reply to Marco Bonardo [:mak] from comment #50)
> in toolkit/content/debug.js and use it also in Sync... In future debug.js
> should also be converted to a proper module in toolkit/modules and fixed

I agree.

> (like, it should not throw a dialog in front of release users when it
> asserts).

...but it doesn't do that for real "release" users: http://hg.mozilla.org/mozilla-central/annotate/9de611848111/toolkit/content/debug.js#l52
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #51)
> > (like, it should not throw a dialog in front of release users when it
> > asserts).
> 
> ...but it doesn't do that for real "release" users:
> http://hg.mozilla.org/mozilla-central/annotate/9de611848111/toolkit/content/
> debug.js#l52

well, beta users are probably still too much... default and nightly should likely be the only channels... But we are going OT :)
* Removed "special" because it's apparently just a known mapping anyway. We can add it back if we need it later.
* Renamed "shared" to "shared_memory_cache" to be more explicit.
* Removed "registerStatement"
* Replaced "executeNamed" with "executeCached"
* Addressed other misc items from last review

I'm feeling optimistic about r+ on this one. No need to rush during Christmas. Just need to land this in 20.
Attachment #692364 - Attachment is obsolete: true
Attachment #692364 - Flags: review?(bmcbride)
Attachment #694983 - Flags: review?(mak77)
Fixed a bug in close(). Added a test for it.
Attachment #694983 - Attachment is obsolete: true
Attachment #694983 - Flags: review?(mak77)
Attachment #694999 - Flags: review?(mak77)
Comment on attachment 694999 [details] [diff] [review]
Promise-based SQLite interface, v8

Review of attachment 694999 [details] [diff] [review]:
-----------------------------------------------------------------

global NIT: there are various places using "for (let [k, v]", for code readability would be better to name those var properly, since stmt.finalize() is clearer than v.finalize(). Not a big deal since these are oneline loops, but it would be bad in large loops.

I'm not going to punt on logging, since it's an internal detail we can change at any time, just need a follow-up to make this module work in projects not using services/

Finally, I think a SR pass just to check naming of exposed methods and params would be good. I think it's a quite nice API btw.

::: toolkit/modules/Sqlite.jsm
@@ +22,5 @@
> +XPCOMUtils.defineLazyModuleGetter(this, "FileUtils",
> +                                  "resource://gre/modules/FileUtils.jsm");
> +
> +XPCOMUtils.defineLazyModuleGetter(this, "Task",
> +                                  "resource://gre/modules/Task.jsm");

you can likely remove the redundant newlines between defineLazyModuleGetters

@@ +55,5 @@
> + * FUTURE options to control:
> + *
> + *   pragma TEMP STORE = MEMORY
> + *   TRUNCATE JOURNAL
> + *   SYNCHRONOUS = full

Add "special" here since it has been removed from this version

@@ +71,5 @@
> +  // Retains absolute paths and normalizes relative as relative to profile.
> +  this._path = OS.Path.join(OS.Constants.Path.profileDir, options.path);
> +
> +  this._shared_memory_cache = "shared_memory_cache" in options
> +                              ? options.shared_memory_cache : true;

move the ? to the previous line (like we usually do with operators), just indent the options like
cond ?
  opt1 : opt2;

@@ +87,5 @@
> +   */
> +  open: function () {
> +    try {
> +      let file = FileUtils.File(this._path);
> +      let fn;

please openDatabaseFn or some better naming, since this is use much later and fn(file) is not much clear

@@ +97,5 @@
> +      }
> +
> +      let basename = OS.Path.basename(this._path);
> +
> +      if (!(basename in connectionCounters)) {

nit: since counters are no more initialized at 0, you can just check (!connectionCounters[basename])

@@ +273,5 @@
> +    // Cancel any in-progress statements.
> +    for (let [k, v] of this._inProgressStatements) {
> +      v.cancel();
> +    }
> +    this._inProgressStatements = new Map();

.clear() should be cheaper than creating and assigning a new Map()

@@ +279,5 @@
> +    // Next we finalize all active statements.
> +    for (let [k, v] of this._anonymousStatements) {
> +      v.finalize();
> +    }
> +    this._anonymousStatements = new Map();

ditto

@@ +284,5 @@
> +
> +    for (let [k, v] of this._cachedStatements) {
> +      v.finalize();
> +    }
> +    this._cachedStatements = new Map();

ditto

@@ +286,5 @@
> +      v.finalize();
> +    }
> +    this._cachedStatements = new Map();
> +
> +    this._open = false;

add comment above indicating this guards against operating between asyncClose and the actual close is executed. pointing to bug 726990 would even be nicer.

@@ +607,5 @@
> +          case Ci.mozIStorageStatementCallback.REASON_FINISHED:
> +            let result = rows;
> +            if (onRow) {
> +              result = null;
> +            }

add comment explaining if onRow handler is provided we don't return an array to the completion handler (I suppose you do this to not confuse the consumer with an empty array, so that he may think there are no results when instead he passed an onRow handler).
I'd probably rather init to null and invert the condition, but that's personal taste :)

@@ +615,5 @@
> +          case Ci.mozIStorageStatementCallback.REASON_CANCELLED:
> +            // It is not an error if the user explicitly requested cancel via
> +            // the onRow handler.
> +            if (userCancelled) {
> +              deferred.resolve(rows);

why instead do we return rows (an empty array) in this case? we should return something consistent with the FINISHED case, null or empty array in both cases.
test welcome :)
Attachment #694999 - Flags: review?(mak77) → review+
Comment on attachment 694999 [details] [diff] [review]
Promise-based SQLite interface, v8

Requesting SR per Marco's suggestion.
Attachment #694999 - Flags: superreview?(dtownsend+bugmail)
Comment on attachment 694999 [details] [diff] [review]
Promise-based SQLite interface, v8

Review of attachment 694999 [details] [diff] [review]:
-----------------------------------------------------------------

The UnopenedConnection object seems very strange to me, what is the point of it? Why not just have an Sqlite.Connection which you then call open() on, or just Sqlite.openConnection()? I'd like to understand that before signing off on this.

::: toolkit/modules/Sqlite.jsm
@@ +194,5 @@
> +  TRANSACTION_IMMEDIATE: Ci.mozIStorageConnection.TRANSACTION_IMMEDIATE,
> +  TRANSACTION_EXCLUSIVE: Ci.mozIStorageConnection.TRANSACTION_EXCLUSIVE,
> +
> +  get connectionReady() {
> +    return this._connection.connectionReady;

this._connection may be null, maybe you want to call _ensureOpen() first.

@@ +208,5 @@
> +   *
> +   * It is recommended to only use this within transactions (which are
> +   * handled as sequential statements via Tasks).
> +   */
> +  get lastInsertRowID() {

These always struck me as a big footgun in the sqlite API. Can we make this better and just include this information along with each statement result or is it expensive to get?

@@ +220,5 @@
> +   *
> +   * The same caveats regarding asynchronous execution for
> +   * `lastInsertRowID` also apply here.
> +   */
> +  get affectedRows() {

As above.

@@ +423,5 @@
> +  /**
> +   * Whether a transaction is currently in progress.
> +   */
> +  get transactionInProgress() {
> +    return this._connection.transactionInProgress;

this._connection may be null, maybe you want to call _ensureOpen() first.

@@ +454,5 @@
> +    this._ensureOpen();
> +
> +    if (this.transactionInProgress) {
> +      throw new Error("A transaction is already active. Only one transaction " +
> +                      "can be active at a time.");

This seems unfortunate, is this a limitation in sqlite or just this jsm? Can we add a way to wait for the transaction to end?
Attachment #694999 - Flags: superreview?(dtownsend+bugmail) → superreview-
(In reply to Dave Townsend (:Mossop) from comment #57)
> The UnopenedConnection object seems very strange to me, what is the point of
> it? Why not just have an Sqlite.Connection which you then call open() on, or
> just Sqlite.openConnection()? I'd like to understand that before signing off
> on this.

This comes down to personal preference. One camp (as implemented) likes objects that are ready to use as soon as you create instances via new X(). Another camp may prefer extra functions to be called after instantiation. I sit in the former camp when objects are part of a public API where it is desirable to hide complexity from the user. As implemented, we eliminate a required function call and reduce the surface area for potential misuse/bugs. This is a desirable trait of good API design.

Furthermore, the method as implemented more resembles a factory, which is the direction we plan to take the API (see discussion in comments about profiles, etc).

> @@ +208,5 @@
> > +   *
> > +   * It is recommended to only use this within transactions (which are
> > +   * handled as sequential statements via Tasks).
> > +   */
> > +  get lastInsertRowID() {
> 
> These always struck me as a big footgun in the sqlite API. Can we make this
> better and just include this information along with each statement result or
> is it expensive to get?

I'm not sure how expensive it is in practice. It is certainly non-0. I'd prefer to not call it every time out of principle (mostly due to concerns of overhead).

> @@ +454,5 @@
> > +    this._ensureOpen();
> > +
> > +    if (this.transactionInProgress) {
> > +      throw new Error("A transaction is already active. Only one transaction " +
> > +                      "can be active at a time.");
> 
> This seems unfortunate, is this a limitation in sqlite or just this jsm? Can
> we add a way to wait for the transaction to end?

SQLite does not supported nested transactions (and I'm not aware of any database that does).

We could certainly add queued operations to the API if we so wanted. I've already implemented said functionality in bug 812608, but not in Sqlite.jsm. I don't believe it is important enough for the first iteration of the API in this bug.
(In reply to Gregory Szorc [:gps] from comment #58)
> (In reply to Dave Townsend (:Mossop) from comment #57)
> > The UnopenedConnection object seems very strange to me, what is the point of
> > it? Why not just have an Sqlite.Connection which you then call open() on, or
> > just Sqlite.openConnection()? I'd like to understand that before signing off
> > on this.
> 
> This comes down to personal preference. One camp (as implemented) likes
> objects that are ready to use as soon as you create instances via new X().
> Another camp may prefer extra functions to be called after instantiation. I
> sit in the former camp when objects are part of a public API where it is
> desirable to hide complexity from the user. As implemented, we eliminate a
> required function call and reduce the surface area for potential
> misuse/bugs. This is a desirable trait of good API design.

This doesn't scan. You aren't eliminating any function calls, in fact you're adding them. UnopenedConnection isn't any more ready for use than a Connection object would be, you still have to call open() on it. And switching to a simple function instead of this intermediate object reduces the potential for misuse even further.

Every person who uses this API is going to do:

(new Sqlite.UnopenedConnection(options)).open();

Why not just make that:

Sqlite.openConnection(options);

UnopenedConnection is just a throwaway object so I think we should just throw it away.

> Furthermore, the method as implemented more resembles a factory, which is
> the direction we plan to take the API (see discussion in comments about
> profiles, etc).

I don't see why adding default options from profiles would need an object in the future, it could easily be done as Sqlite.openConnection(options, Sqlite.EXCLUSIVE_MEMORY_CACHE).

> > @@ +208,5 @@
> > > +   *
> > > +   * It is recommended to only use this within transactions (which are
> > > +   * handled as sequential statements via Tasks).
> > > +   */
> > > +  get lastInsertRowID() {
> > 
> > These always struck me as a big footgun in the sqlite API. Can we make this
> > better and just include this information along with each statement result or
> > is it expensive to get?
> 
> I'm not sure how expensive it is in practice. It is certainly non-0. I'd
> prefer to not call it every time out of principle (mostly due to concerns of
> overhead).
> 
> > @@ +454,5 @@
> > > +    this._ensureOpen();
> > > +
> > > +    if (this.transactionInProgress) {
> > > +      throw new Error("A transaction is already active. Only one transaction " +
> > > +                      "can be active at a time.");
> > 
> > This seems unfortunate, is this a limitation in sqlite or just this jsm? Can
> > we add a way to wait for the transaction to end?
> 
> SQLite does not supported nested transactions (and I'm not aware of any
> database that does).

It does support nested transactions (see SAVEPOINT). But I wasn't really referring to nested, I'm more interested in sequential transactions. One case I want to work:

User clicks on two settings in the UI quickly, both trigger a transaction to make some changes to the database. Without some magic in the sqlite library right now the second attempt has to do some waiting for the first attempt to complete before starting its transaction to make changes.

Presumably though here other code can just do connection.transactionInProgress.next(startNext). It has to remember to return a promise to support chaining but I guess it works for now.
(In reply to Dave Townsend (:Mossop) from comment #59)
> (In reply to Gregory Szorc [:gps] from comment #58)
> > (In reply to Dave Townsend (:Mossop) from comment #57)
> > > The UnopenedConnection object seems very strange to me, what is the point of
> > > it? Why not just have an Sqlite.Connection which you then call open() on, or
> > > just Sqlite.openConnection()? I'd like to understand that before signing off
> > > on this.
> > 
> > This comes down to personal preference. One camp (as implemented) likes
> > objects that are ready to use as soon as you create instances via new X().
> > Another camp may prefer extra functions to be called after instantiation. I
> > sit in the former camp when objects are part of a public API where it is
> > desirable to hide complexity from the user. As implemented, we eliminate a
> > required function call and reduce the surface area for potential
> > misuse/bugs. This is a desirable trait of good API design.
> 
> This doesn't scan. You aren't eliminating any function calls, in fact you're
> adding them. UnopenedConnection isn't any more ready for use than a
> Connection object would be, you still have to call open() on it. And
> switching to a simple function instead of this intermediate object reduces
> the potential for misuse even further.
> 
> Every person who uses this API is going to do:
> 
> (new Sqlite.UnopenedConnection(options)).open();

Oh, you are correct. I was confusing this API with the one I implemented for FHR, which only requires a single function call to get an opened database object.

> I'm more interested in sequential transactions. One
> case I want to work:
> 
> User clicks on two settings in the UI quickly, both trigger a transaction to
> make some changes to the database. Without some magic in the sqlite library
> right now the second attempt has to do some waiting for the first attempt to
> complete before starting its transaction to make changes.
> 
> Presumably though here other code can just do
> connection.transactionInProgress.next(startNext). It has to remember to
> return a promise to support chaining but I guess it works for now.

Yes, the current API is a bit lacking in this regard. I imagine the "enqueueOperation(func)" API I added in my FHR storage type will eventually migrate to this module. I purposefully left that code in FHR because I wanted to avoid scope creep in this module in order to meet the deadline and because I wanted to experiment with the functionality before adding it to a supposedly more stable API.

I envision we'll want to add the higher-level functions to this module:

* enqueueOperation(func) -- Takes a function that will be added to the execution queue. If there are currently no executing operations, it executes immediately. The function must return a promise that is resolved upon completion of the final operation in that function. Once that promise is resolved, the next queued function (if any) is executed.

* enqueueTransaction(func) -- Like enqueueOperation but the specified function is executed inside a transaction (which is rolled back if the function throws or doesn't return a promise). This is merely a convenience wrapper. We may also want to add named SAVEPOINT support in here somewhere.

We can do this in a follow-up bug, like all the other useful features we cut from the initial patch due to scope creep and time constraints.
> UnopenedConnection is just a throwaway object so I think we should just
> throw it away.

Separating the module from the open connection allows separation of code that creates the connection from code that uses it, both in time and space. I think that's worthy in and of itself, but besides that point:

The name of this class is a little bit of a misnomer: it's really an OpenedConnectionFactory (or similar name), which captures a set of parameters, and allows you to blindly obtain suitable opened connection instances.

I think with that name (which Greg avoided because of connotations of singleton-ness, which don't bother me), you can see how this might allow for easy per-consumer connection pooling and such -- the factory carries that logic, and so the consuming modules having to implement it but also individual modules don't overlap with others.

And if we want to share pooling logic between modules, we get to do so through OO, rather than if-statements: logic like pooling can be embedded in the factory, driven by options, without affecting other factories.

It's a tiny bit Java-ey, but it improves encapsulation, and I like that.

I agree that a convenience method on the module would be a nice addition.


> Yes, the current API is a bit lacking in this regard. I imagine the
> "enqueueOperation(func)" API I added in my FHR storage type will eventually
> migrate to this module. 
>
> …
>
> We can do this in a follow-up bug, like all the other useful features we cut
> from the initial patch due to scope creep and time constraints.

I think that's a great direction in which to move, and I feel that we should land this without significant changes and address this kind of growth in a near-term follow-up, as we learn more from finalizing FHR. I don't think there are any real problems with this code as-is, except for clarifying the naming of connection builders.
So we don't forget about it: a small fix we found tonight.

diff --git a/toolkit/modules/Sqlite.jsm b/toolkit/modules/Sqlite.jsm
--- a/toolkit/modules/Sqlite.jsm
+++ b/toolkit/modules/Sqlite.jsm
@@ -402,22 +402,26 @@ OpenedConnection.prototype = Object.free
     let index = this._anonymousCounter++;
 
     this._anonymousStatements.set(index, statement);
 
     let deferred = Promise.defer();
 
     this._executeStatement(sql, statement, params, onRow).then(
       function onResult(rows) {
+        // These statements won't stick around to be finalized in close(),
+        // so we finalize them now.
         this._anonymousStatements.delete(index);
+        statement.finalize();
         deferred.resolve(rows);
       }.bind(this),
 
       function onError(error) {
         this._anonymousStatements.delete(index);
+        statement.finalize();
         deferred.reject(error);
       }.bind(this)
     );
 
     return deferred.promise;
   },
 
   /**
https://tbpl.mozilla.org/?tree=Try&rev=1e43e6387dff

Just a reminder that this is blocking Firefox Health Report and needs to land in 20.
Attachment #694999 - Attachment is obsolete: true
Attachment #697602 - Flags: superreview?(dtownsend+bugmail)
Attachment #697602 - Flags: review+
(In reply to Richard Newman [:rnewman] from comment #61)
> > UnopenedConnection is just a throwaway object so I think we should just
> > throw it away.
> 
> Separating the module from the open connection allows separation of code
> that creates the connection from code that uses it, both in time and space.
> I think that's worthy in and of itself, but besides that point:
> 
> The name of this class is a little bit of a misnomer: it's really an
> OpenedConnectionFactory (or similar name), which captures a set of
> parameters, and allows you to blindly obtain suitable opened connection
> instances.
> 
> I think with that name (which Greg avoided because of connotations of
> singleton-ness, which don't bother me), you can see how this might allow for
> easy per-consumer connection pooling and such -- the factory carries that
> logic, and so the consuming modules having to implement it but also
> individual modules don't overlap with others.
> 
> And if we want to share pooling logic between modules, we get to do so
> through OO, rather than if-statements: logic like pooling can be embedded in
> the factory, driven by options, without affecting other factories.

I don't disagree that this might all be useful in the future, what I don't like is baking support for it into the API now, making it a little more complex on all callers, when that future may never come.

Most consumers simply want to open a connection, and that's all this module supports at the moment so for now let's just expose the simple openConnection(options) function to do that and if later we add factories to use then consumers that want to use them can switch.
Attachment #697602 - Flags: superreview?(dtownsend+bugmail) → superreview-
(In reply to Dave Townsend (:Mossop) from comment #64)
> I don't disagree that this might all be useful in the future, what I don't
> like is baking support for it into the API now, making it a little more
> complex on all callers, when that future may never come.
> 
> Most consumers simply want to open a connection, and that's all this module
> supports at the moment so for now let's just expose the simple
> openConnection(options) function to do that and if later we add factories to
> use then consumers that want to use them can switch.

In the course of this discussion, we've completely overlooked something. In the initial patch, I did implement things using the pattern Dave is not requesting:

  let connection = new Sqlite.Connection(...);
  connection.open().then(...);

However, Marco successfully convinced me (starting in comment #23) to switch it. The reason we've overlooked is that asyncOpen() is currently synchronous while it should be asynchronous. We wanted to bake a safeguard into the API such that when the semantics change, callers are protected from the change transparently.

I concede this is a marginal reason. I think the benefits of the functional programming approach of emitting a new value when object state changes far outweigh this reason (see the first half of http://clojure.org/state for a decent overview of the perspective I'm coming from).

My ultimate goal for the API is to involve dynamic prototype swapping as the state of the connection changes. e.g. when you call .close(), you swap out the .prototype of the formerly open connection instance and replace it with one that throws if any functions from the old prototype are called (i.e. you poison the prototype). This way, you don't have to bake state checking (this._ensureOpen()) into every function call. Fewer states. Less complexity. Fewer bugs.

That being said, I'm supportive of YAGNI, so if you still want me to nuke the factory, I'll do that without asking any more questions.
Flags: needinfo?(dtownsend+bugmail)
(In reply to Gregory Szorc [:gps] from comment #65)
> (In reply to Dave Townsend (:Mossop) from comment #64)
> > I don't disagree that this might all be useful in the future, what I don't
> > like is baking support for it into the API now, making it a little more
> > complex on all callers, when that future may never come.
> > 
> > Most consumers simply want to open a connection, and that's all this module
> > supports at the moment so for now let's just expose the simple
> > openConnection(options) function to do that and if later we add factories to
> > use then consumers that want to use them can switch.
> 
> In the course of this discussion, we've completely overlooked something. In
> the initial patch, I did implement things using the pattern Dave is not
> requesting:
> 
>   let connection = new Sqlite.Connection(...);
>   connection.open().then(...);
> 
> However, Marco successfully convinced me (starting in comment #23) to switch
> it. The reason we've overlooked is that asyncOpen() is currently synchronous
> while it should be asynchronous. We wanted to bake a safeguard into the API
> such that when the semantics change, callers are protected from the change
> transparently.
> 
> I concede this is a marginal reason. I think the benefits of the functional
> programming approach of emitting a new value when object state changes far
> outweigh this reason (see the first half of http://clojure.org/state for a
> decent overview of the perspective I'm coming from).

I think you're misunderstanding what I'm asking for. Instead of callers doing this with the current patch:

(new Sqlite.OpenedConnectionFactory(options)).open().then(function(connection) { ... });

I'm suggesting it works like this:

Sqlite.openConnection(options).then(function(connection) { ... });

This still allows opening the connection to become asynchronous in the future without callers having to change behaviour, something I strongly support.
Flags: needinfo?(dtownsend+bugmail)
OpenedConnectionFactory -> openConnection (with inlined open)
Attachment #697602 - Attachment is obsolete: true
Attachment #698067 - Flags: superreview?(dtownsend+bugmail)
Attachment #698067 - Flags: review+
Attachment #698067 - Flags: superreview?(dtownsend+bugmail) → superreview+
Comment on attachment 698067 [details] [diff] [review]
Promise-based SQLite interface, v10

Review of attachment 698067 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/modules/Sqlite.jsm
@@ +454,5 @@
> +                      "can be active at a time.");
> +    }
> +
> +    this._log.debug("Beginning transaction");
> +    this._connection.beginTransactionAs(type);

Follow-up fodder, but you should make creating and committing the transaction asynchronous.
https://hg.mozilla.org/mozilla-central/rev/939e3db16cbd
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Whiteboard: [fixed in services]
Target Milestone: --- → mozilla20
You need to log in before you can comment on or make changes to this bug.