Closed Bug 429986 Opened 16 years ago Closed 16 years ago

Provide an option for database to be asynchronous

Categories

(Toolkit :: Storage, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9.1a1

People

(Reporter: sdwilsh, Assigned: sdwilsh)

References

(Blocks 1 open bug, )

Details

(Keywords: dev-doc-complete)

Attachments

(2 files, 13 obsolete files)

Attached patch v0.1 (obsolete) — Splinter Review
Bug 408914 disabled it, but discussions led to us wanting it back but optional.  This is a wip patch that currently has heap corruption, and I'm not sure why.  Spent most of the day fighting it (got past the stack corruption!).

Interested parties can take a look and comment, but I'm not reading bugmail for a week or two still, so don't expect any replies.
Assignee: nobody → sdwilsh
Comment on attachment 316761 [details] [diff] [review]
v0.1

>+            Open(sqlite3_vfs *aVFS, const char *zName, sqlite3_file *file,
>+                 int flags, int *pOutFlags)
>+            {
>+                VFS *vfs = static_cast<VFS *>(aVFS->pAppData);
>+
>+                // We need to get the default io methods...
>+                sqlite3_file *defaults =
>+                    static_cast<sqlite3_file *>(sqlite3_malloc(vfs->szOsFile));
Heap corruption was here.  I should have checked the default vfs for the size.  It's amazing how I spent many hours looking through this code, but two weeks later I skim it and see what was wrong...
Attached patch Default VFS v1.0 (obsolete) — Splinter Review
I decided to go with a better approach (and follow the whole "plan to throw one away" guideline).  This is the first part of this bug that gives us a default VFS object that currently just wraps whatever sqlite would have used.  We have the flexibility to provide our own and just drop it in now with this code.

The idea is also to have the async VFS inherit from the default one and only re implement the necessary methods.  Hurray for virtual inheritance.  I suspect that the code for the async bits will be quite small.  Might have something tomorrow even - who knows.

I think this patch is ready for review, but I don't know if it's worth putting into folks' queues at this point.  I figured vlad for review, and shaver for sr.

One note though for my own future reference.  If you ever create an ADT, make sure to implement a virtual destructor otherwise your subclassed one will never be called.  There went four hours of my life...
Attachment #316761 - Attachment is obsolete: true
Comment on attachment 318963 [details] [diff] [review]
Default VFS v1.0

I was hitting a 100% reproducible crash in this earlier today in the system call of fsync, but I did a clobber and it went away.  I've been using it as my browser for a little while and it seems stable.  It also passes make check (with the exception of a network test that appears to be an unrelated failure), so I think this is ready for a first pass of reviews.  With that said, this isn't going to be in 1.9, so 1.9 work clearly comes first.
Attachment #318963 - Flags: superreview?(shaver)
Attachment #318963 - Flags: review?(vladimir)
A couple of questions that I'd like to get feedback on:

(1) For the VFS code, should it all be placed in a subfolder such as storage/src/vfs?

(2) How should errors be handled in the async case?  For example, if a write or sync fails, what do we do?  We already told sqlite that it was OK and that it worked, but it turns out that it didn't.  Maybe I'm being overly paranoid about system calls failing...
Returning the error next time the function is called is an option, but how do we clear that error state?  Additionally, what if the calls are very far apart?  The code that gets the error isn't necessarily even close to the code that would have gotten it in a synchronous environment.
Another option would be to have consumers register an error handler, but I'm not sure how to tie statement execution (which statement was running) into which operation failed.
I'm looking for suggestions here.
Status: NEW → ASSIGNED
I know I had another question, but forgot when I actually got to it...

(3) What's the best way to enable this option?  I'm thinking, since we want to not change api's for 3.1, that we add a method to mozIStorageService to open a database with options:
mozIStorageConnection openDatabaseWithOptions(in nsIFile aDatabaseFile, in mozIStorageConnectionOptions aOptions);
The mozIStorageConnectionOptions interface will have an async boolean member that can be toggled on and off.  To help with binary compatibility, we could borrow from places api's and have mozIStorageService have a method that returns default options:
mozIStroageConnectionOptions getNewConnectionOptions();
JS should just be able to pass an object specifying the options they care about, and the rest we just use the defaults.
Disregard (2).  I was thinking we'd have a background thread that would do the writes, and the main thread would just had off to it.  However, this is essentially what the PRAGMA synchronous command does in sqlite.  After dolske sent some documentation my way on sun's async IO, I realized there's a much saner way to do this without locking up the main UI.

We will still dispatch to a background thread, and while we wait for some action to complete, we continue to process events (using NS_ProcessPendingEvents).  Once it is done, we go ahead and return to sqlite.

This doesn't increase the performance of mozStorage, but it does solve the problem of the UI locking up during database writes, which I believe was the driving force for this (correct me if I'm wrong).  This approach also solves (3) - we won't even need to provide a new method call since we can just make any database that is on the main thread use this vfs by default.

Thoughts?
Nesting event loops can cause unexpected re-entrance and other joyful surprises if you don't filter carefully (and if you're filtering carefully, you're still application-modal).

I think that async callers should just get a callback with the completion result of their operation: error, "handed to sqlite", "committed" states might be good choices.  Implicit asynchrony is painful, as we've seen in the few places that we have it ("why doesn't sizeToContent work here?") and we should prefer explicit asynchrony where at all possible.  I believe that we want a mozStorage-level API, and a Gecko-level background thread, as we have for DNS f.e.
(In reply to comment #7)
> Nesting event loops can cause unexpected re-entrance and other joyful surprises
> if you don't filter carefully (and if you're filtering carefully, you're still
> application-modal).
That really stinks.  The unexpected re-entrance sounds like a bug, but perhaps I don't fully understand that thread manager.

> I think that async callers should just get a callback with the completion
> result of their operation: error, "handed to sqlite", "committed" states might
> be good choices.  Implicit asynchrony is painful, as we've seen in the few
> places that we have it ("why doesn't sizeToContent work here?") and we should
> prefer explicit asynchrony where at all possible.  I believe that we want a
> mozStorage-level API, and a Gecko-level background thread, as we have for DNS
> f.e.
Alright, then (2) and (3) are back on the table.  Like I said in comment 4, there's really no way to "tag" a given write/sync/read operation to a given statement, so telling the caller that something succeeded or not is useless information.  Now with that said, there's another possible solution here (take 3 anyone?):
Throw away this VFS stuff and run all operations on a background thread when they are called on the main thread.  All return values (for at least write statements - maybe read ones too) are handled by a callback.  The biggest problem I see with this is that I can't see this being friendly to developers who are familiar with other database API's.  I can't say I've ever seen a database api that was async with callbacks like this (but if someone has, pointing me to it would be a great help!).
The issue isn't that ThreadManager doesn't like re-entrance, it's that processing events by definition runs event handling code, and that code may already be executing and not be happy to re-enter just because something under the covers or in an observer call looked up something in a database.

3 database APIs that are async, I'm sure there are many more out there:

- AJDBC returns a futures object, on which you can add a listener for notification of completion/etc. (presumably the event is held until the listener is installed, in the case that the query runs very quickly?): http://code.google.com/p/adbcj/wiki/Tutorial

- the AIR database API (also atop sqlite) is async: http://www.insideria.com/2008/03/air-api-querying-a-local-datab.html

- the WHATWG client-side database API is async: http://www.whatwg.org/specs/web-apps/current-work/#executing (they've discussed adding a sync API as well)

Bonus nostalgia: http://www.ussg.iu.edu/hypermail/linux/kernel/0209.3/0729.html is the underlying primitive in the Linux journalling block device that we used in Lustre to ensure that we always knew when things were committed, without making everything synchronous and therefore unscalable.  Have we asked the sqlite folks about such a built-in callback mechanism?  It would be preferable to serializing all database operations above the sqlite API level, I think, because it would keep little queries from being queued up behind massive malware update chunks, etc.
SQL Server 2000 does async as well, atop the Windows message queue and via polling to see if the database operation is complete:

http://msdn.microsoft.com/en-us/library/aa936974(SQL.80).aspx

I do not, um, recommend that we follow that model.
Alrighty then!  I've added a link to my talk page on devmo for some sample code goodness as well as a proposed interface based on the sample code.  There are still holes to fill in, but feedback is welcome.  Please add it to the discussion section on the wiki.
Comment on attachment 318963 [details] [diff] [review]
Default VFS v1.0

This isn't needed for this bug anymore, but it may be useful at some point in the future.
Attachment #318963 - Attachment is obsolete: true
Attachment #318963 - Flags: superreview?(shaver)
Attachment #318963 - Flags: review?(vladimir)
Attached patch v0.2 (obsolete) — Splinter Review
very early implementation.  doesn't work yet (still need an issue worked out in the newsgroups regarding api for that to even become a possibility).  want the code somewhere else than just on my computer.
Depends on: 434750
Depends on: 434796
Depends on: 434830
No longer depends on: 434830
No longer depends on: 434750
Attached patch v0.3 (obsolete) — Splinter Review
Getting there, but not quite there yet...
Attachment #321655 - Attachment is obsolete: true
Attached patch tests v0.1 (obsolete) — Splinter Review
not nearly extensive enough yet, but these all pass (hurray!)
Depends on: 435994
Attached patch v0.4 (obsolete) — Splinter Review
Fixes some issues found in testing.  Still needs to support error handling better.
Attachment #322707 - Attachment is obsolete: true
Attached patch tests v0.2 (obsolete) — Splinter Review
I do believe I have extensive tests now with the exception of error handling (which is not yet implemented) and cancellation.  I have some issues with cancellation, and have posted to the newsgroup about it:
http://groups.google.com/group/mozilla.dev.planning/msg/1b50df1b0cb79c34
Attachment #322708 - Attachment is obsolete: true
Attached patch v0.5 (obsolete) — Splinter Review
This includes more fixes from tests that I'm about to upload, in addition to cancellation fixes.  I also added some asserts to catch any mishandling of assumptions.
Attachment #323077 - Attachment is obsolete: true
Attached patch tests v0.3 (obsolete) — Splinter Review
This adds tests for cancellation.  I'm not sure if they are quite that extensive, but I think they cover the bases fairly well.  With that said, I'm not to happy with the API - posting to the newsgroups about it.
Attachment #323078 - Attachment is obsolete: true
Attached patch v1.0 (obsolete) — Splinter Review
I think this is ready for at least a first pass of reviews.
Attachment #323163 - Attachment is obsolete: true
Attachment #323600 - Flags: superreview?(shaver)
Attachment #323600 - Flags: review?(vladimir)
Attached patch tests v1.0 (obsolete) — Splinter Review
tests to go with attachment 323600 [details] [diff] [review]
Attachment #323165 - Attachment is obsolete: true
Attachment #323601 - Flags: review?(vladimir)
Attachment #323601 - Flags: review?(shaver)
Flags: blocking1.9.1?
Priority: -- → P1
Blocks: 438342
Attached patch v1.1 (obsolete) — Splinter Review
Addresses naming concerns and review comments from shaver.
Attachment #323600 - Attachment is obsolete: true
Attachment #326809 - Flags: superreview?(shaver)
Attachment #326809 - Flags: review?(vladimir)
Attachment #323600 - Flags: superreview?(shaver)
Attachment #323600 - Flags: review?(vladimir)
Tests will be up in a bit - need to work on that some more to get better coverage.
Whiteboard: [has patch][needs review vlad][needs sr shaver]
Attached patch tests v1.1 (obsolete) — Splinter Review
test funness
Attachment #323601 - Attachment is obsolete: true
Attachment #326815 - Flags: review?(shaver)
Attachment #323601 - Flags: review?(vladimir)
Attachment #323601 - Flags: review?(shaver)
The code looks fine, but one comment on the API... right now there's:

+[scriptable, uuid(29383d00-d8c4-4ddd-9f8b-c2feb0f2fcfa)]
+interface mozIStorageStatementCallback : nsISupports {
+  void handleResult(in mozIStorageResultSet aResultSet);
+  void handleError(in mozIStorageError aError);
+  void handleCompletion(in unsigned short aReason);
+};
..
+  mozIStoragePendingStatement executeAsync(
+    [optional] in mozIStorageStatementCallback aCallback
+  );

It would be nice to use the new [function] IDL stuff here, especially given that the majority of consumers are going to be in JS.  I would suggest something like:

[scriptable, function]
interface mozIStorageStatementResultCallback {
  void handleResult(in mozIStorageResultSet aResultSet);
};

[scriptable, function]
interface mozIStorageStatementErrorCallback {
  void handleError(in mozIStorageError aError);
};

[scriptable, function]
interface mozIStorageStatementCompletionCallback {
  void handleCompletion(in unsigned short aReason);
};


and:

mozIStoragePendingStatement executeAsync
  ([optional] in mozIStorageStatementResultCallback resultCb,
   [optional] in mozIStorageStatementErrorCallback errorCb,
   [optional] in mozIStorageStatementCompletionCallback completionCb);

That way, JS callers can just use function (...) { } objects, and C++ callers can still just implement all three (or any combination) in one component -- they'd just have to pass it in 1-3 times to the function.
I agree, though [function] isn't new; it came in with the XPConnected DOM in 2001, I believe.
(In reply to comment #25)
> The code looks fine, but one comment on the API... right now there's:
> mozIStoragePendingStatement executeAsync
>   ([optional] in mozIStorageStatementResultCallback resultCb,
>    [optional] in mozIStorageStatementErrorCallback errorCb,
>    [optional] in mozIStorageStatementCompletionCallback completionCb);
OK, I'm game with this, but I think we should change the order.  Specifically, have result and completion switched, because I think people care more about completion than errors, and it is a bit more inline with the HTML 5 SQL spec.  Sound good?
(In reply to comment #27)
> (In reply to comment #25)
> > The code looks fine, but one comment on the API... right now there's:
> > mozIStoragePendingStatement executeAsync
> >   ([optional] in mozIStorageStatementResultCallback resultCb,
> >    [optional] in mozIStorageStatementErrorCallback errorCb,
> >    [optional] in mozIStorageStatementCompletionCallback completionCb);
> OK, I'm game with this, but I think we should change the order.  Specifically,
> have result and completion switched, because I think people care more about
> completion than errors, and it is a bit more inline with the HTML 5 SQL spec. 
> Sound good?

Yup, sounds fine.
I'm not actually sure about [optional], now that I think about it more, but I'm not totally against it.  I think we probably want to make it harder to silently swallow errors.
True, but I think that in an ideal world, if you don't specify an error callback, we would at least print an error somewhere... not sure where that somewhere would be, though.
Wait, for the arg order -- are you thinking (completion, result, error) or (completion, error, result)?  The latter looks odd.  Maybe better would be:

(completion, error)

where the completion callback would take (completionType, resultDataSet)... where the result would just be null if no result was there.
I'm thinking result, completion, error.

We give the results in chunks, and I don't think we want to give *all* the results in one chunk - especially with big queries.  If we didn't care about that, we could give all results at once (which is what the HTML 5 spec does).  However, we wanted to have a quick API, which is why we give results as we get them.  This should, in theory, use less memory too (in large cases for sure, smaller ones are less likely).
Also, robarnold pointed out that we should update the completion state if we cancel and we haven't yet ran the completion event (reasonable, and doable).

Also, looking over the HTML 5 spec again, their ResultSet will return an insertId and rowsAffected for INSERT/UPDATE statements.  We probably want to do that, but I don't know if we want it in this bug, or in a follow-up to get this landed and usable (and to shake out any other unforseen issues).  Awaiting feedback from reviewers.
That's ok, no?

e.g. the result/completion callback could be called like:

cb (RESULT, data);
cb (RESULT, data);
cb (RESULT, data);
cb (DONE, null);

for the partial pieces.
Other work in other bugs.  Let's get the core in.
Blocks: 442793
(In reply to comment #34)
> cb (RESULT, data);
> cb (RESULT, data);
> cb (RESULT, data);
> cb (DONE, null);
I don't like that from an API perspective because that handler now does more than one thing - sometimes it will process data, and sometimes it will handle completion.  I think the current API provides a cleaner separation of the two functionalities.

(In reply to comment #35)
> Other work in other bugs.  Let's get the core in.
Filed bug 442793 for that addition.  I want to fix the cancel stuff here because it seems wrong now (and I can tighten up my test as a result).
(In reply to comment #36)
> (In reply to comment #34)
> > cb (RESULT, data);
> > cb (RESULT, data);
> > cb (RESULT, data);
> > cb (DONE, null);
> I don't like that from an API perspective because that handler now does more
> than one thing - sometimes it will process data, and sometimes it will handle
> completion.  I think the current API provides a cleaner separation of the two
> functionalities.

Maybe, though I don't think that there is any difference between the two.  Completion is just a type of result; the callback just handles all non-error information coming back from the async call.  From a usability perspective, I think that's what you want; for non-data-returning calls, it'll just be a "This finished executing", and for data-returning calls, you'll want to know when there isn't any more data.  I don't really feel strongly about this, though it does seem simpler and cleaner than allowing the user to provide a result and a completion callback, especially since the result callback will be null in many cases.
ugh.  Note: mPendingEvents has a mutex I wasn't using except in one case...
I've updated everything short of the callback -> functions bit.  I've spent some time thinking about this, I'm not sure I like it because we go from:
stmt.executeAsync({
  handleResult: function(aResultSet)
  {
    // do something
  },
  handleError: function(aError)
  {
    // log it?
  },
  handleCompletion(aReason)
  {
    // do something
  }
});

to:
stmt.executeAsync({
  function handleResult(aResultSet)
  {
    // do something
  },
  function handleCompletion(aReason)
  {
    // do something
  },
  function handleError(aError)
  {
    // log it?
  }
});
Yes, they could declare these functions and reuse them, but there is no way to differentiate between which statement is running (nor am I sure of a good way to do this).  I'm not really sure of the win in this situation by changing it.
hrm, I'm confused.. you'd go to:

stmt.executeAsync(
  function (resultSet) { ... },
  function (reason) { ... },
  function (error) { .... }
);, no?

It would be cleaner still if it was:

stmt.executeAsync(
  function (callType, resultData) { ... },
  function (errorReason) { ... }
);

:)

Or are you just saying that you like the { handleResult: ..., handleError: ..., } etc labels?  I think it's in general cleaner to pre-declare the functions, esp since they might do a lot of work, so not sure that we should optimize for the inline case...
My point was that I don't see the win by providing an object or using three different functions.  Sure, you can pre-declare the functions, but you can pre-declare the object too.  It seems to me to make more sense to couple them together since they all operate on the same statement.
Attached patch v1.2Splinter Review
Doesn't do the callback changes, but gets everything else.
Attachment #326809 - Attachment is obsolete: true
Attachment #327983 - Flags: superreview?(shaver)
Attachment #327983 - Flags: review?(vladimir)
Attachment #326809 - Flags: superreview?(shaver)
Attachment #326809 - Flags: review?(vladimir)
Attached patch tests v1.2Splinter Review
Attachment #326815 - Attachment is obsolete: true
Attachment #327984 - Flags: review?(vladimir)
Attachment #326815 - Flags: review?(shaver)
Whiteboard: [has patch][needs review vlad][needs sr shaver] → [has patch][has review][needs sr shaver]
Comment on attachment 327983 [details] [diff] [review]
v1.2

sr=shaver
Attachment #327983 - Flags: superreview?(shaver) → superreview+
Whiteboard: [has patch][has review][needs sr shaver] → [has patch][has review][has sr]
s/PR_ATOMIC_SET/PR_AtomicSet/g since the macros are absolutely not portable...

Not gonna post a new patch for that change.
Pushed to mozilla-central:
http://hg.mozilla.org/mozilla-central/index.cgi/rev/1a1e94b4dca8
http://hg.mozilla.org/mozilla-central/index.cgi/rev/fd493cecad66
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Whiteboard: [has patch][has review][has sr]
Target Milestone: mozilla2.0 → mozilla1.9.1a1
Keywords: dev-doc-needed
Depends on: 454851
Flags: blocking1.9.1?
Blocks: 458811
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: