Closed Bug 874814 (OMTConnectionClose) Opened 6 years ago Closed 6 years ago

[Storage] Implement OMT AsyncClose()

Categories

(Toolkit :: Storage, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla26

People

(Reporter: Yoric, Assigned: Yoric)

References

(Blocks 1 open bug)

Details

(Keywords: main-thread-io, Whiteboard: [Async:P1])

Attachments

(1 file, 12 obsolete files)

18.93 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
The current implementation of AsyncClose performs closing on the main thread. This is to ensure that we have properly closed all connections before shutdown.

We could (and should) reimplement this as follows:
- AsyncClose on the async execution thread;
- terminate the execution thread once close is complete;
- during xpcom-shutdown-threads, spin until all execution threads are done.
afaik, after xpcom-shutdown-threads the threads are merged, when a thread is merged it is automatically spinned. There should be no need to do that spinning manually, unless it's for a very good reason, like exit(0) behavior.
If the close runnable is fired soon enough (before the thread merges) it should be executed.
Alias: OMTConnectionClose
Quick status report.
I have an early prototype of AsyncClose that does everything on the async execution thread. This patch fails with some tests due to unfinalized statements, and I believe that I have finally found why. There is a race condition between:
- AsyncExecute() (which dispatches an AsyncStatementFinalizer once execution is complete);
- AsyncCloseConnection (which should not be Run() before the AsyncStatementFinalizer is complete);
- the AsyncStatementFinalizer itself.

This race condition can be witnessed reproducibly in the scenarios in which the code launches AsyncExecute() and immediately AsyncClose(), as the AsyncStatementFinalizer ends up processed by the async thread after the AsyncCloseConnection.

I may have to rework slightly asyncFinalize() and AsyncCloseConnection to ensure that all async finalizers are processed properly by AsyncCloseConnection::Run() if they have not been processed before.
are you sure it's the execution that dispatches the finalizer, to me looks like it's the call to finalize() that does that.
AsyncStatementFinalizer is dispatched by asyncFinalize, that is called by finalize()...

Could be the code you are having problems with is doing something like:

stmt.executeAsync({
  ...
  onCompletion() {
    stmt.finalize();
  }
})

instead of the correct:

stmt.executeAsync({
   ...
})
stmt.finalize();

Cause I cannot see how
stmt.executeAsync();
stmt.finalize();
asyncClose();
may end up in the wrong order
I misdisagnosed again, I'm afraid.

The situation seems to be the following:
- mozStorageStatementData::finalize() is called and completes successfully;
- despite this, on the same thread, internalClose() raises an assertion failure because the same statement is not finalized.

Proceeding with the investigation.
note that, even if wrong, we probably still want to support both cases, calling finalize in the completion callback is a very common pitfall from what I saw in the past, mostly cause it looks unnatural to fire an async job and "kill" it immediately.
Status update.

Both comment 2 and comment 3 were right – we had a race condition between |ExecuteAsync()| and |AsyncClose()| (in C++) and between |finalize()|-in-a-callback and |asyncClose()| (in JS). Comment 4 was wront – I was just led in error by a function called |finalize| but that is actually unrelated to finalization.

After several attempts, the strategy we discussed over IRC is the following:
- maintain at all times the number of async statements that need to be finalized (i.e. asynchronous statements that have been initialized – possibly lazily – but that haven't been finalized yet);
- in the off-main thread AsyncCloseConnection event, loop until either that number is 0 or some timeout is exceeded – to ensure that we can process asynchronous events including ExecuteAsync and asynchronous finalization, we loop by having the event reschedule itself.

Prototype running on Try: https://tbpl.mozilla.org/?tree=Try&rev=2a0b01e415b5
Yoric asked me to comment on IRC.

My idealized opinion of how to handle this is that on the main thread we treat asyncClose as killing the connection and automatically finalizing all outstanding statements.  All calls on the connection or statements will return an error (create a new statement, run a statement that was not explicitly finalized), or silently do nothing because the operation was already implied by closing the connection (finalizing a statement as in comment 3.)  The async closer can do the sqlite3_next_stmt thing to close all outstanding statements and optionally generate warnings or dev-tool hints or what have you.  (And no async finalizers can be run after the async closer because the call to finalize on the main thread would check the main-thread-set flag that indicates the connection is shutting down and not allow finalization.)

The danger in such a strategy is that since we don't currently maintain a list of all outstanding statements, our mozStorage async-ish statements can have moot sqlite3_stmt* pointers hanging around.  This is mitigated by the fact that they should only ever be touched on the async execution thread, and the asyncClose process should make it impossible for anything to be scheduled on that thread after dispatching the async closer.  As such, there should be no one alive who would ever touch those pointers.  (People using gdb might get very confused by the state of the data structure, however.)


I think that's a preferable alternative to the looping strategy.  The looping strategy seems like it has the potential to add a lot of cognitive overhead to analyzing problems involving mozStorage.  It certainly seems like a bug in code using mozStorage if they are trying to sneak in write operations after closing the connection, and there's no benefit to stalling the close operation until their misplaced finalized calls are issues (if they get issued!)
Having said that, I don't have any fundamental opposition to the looping strategy; it just seems more complex and the things the complexity buys us are things we don't actually need/want.
forced finalization sounds good to me, we can still notify issues and be sure the closing process can proceed.
Sounds good to me.
Attached patch Loop-based version. Do not land. (obsolete) — Splinter Review
Assignee: nobody → dteller
sitrep: I have a patch that seems to work in most cases. However, when we have virtual tables, it auto-finalizes some statements that have been defined by sqlite, and this later causes a segfault when closing the database.

This can be reproduced with test_storage_fulltextindex.js. Note that, without my patch, closing the database prints warnings but succeeds.
Actually, that's not doe to virtual tables but to fts.

So, basically, I have two possibilities:
- either the async loop implemented already (which will still print the spurious warnings in fts3);
- or maintain manually a set of statements that need to be finalized, and iterate only through these statements during auto-finalization (more complex, possibly slower, but resolves the spurious warnings).
Any preference?
Flags: needinfo?(mak77)
So, uh, this is a little bit horrible, but if you look at the source to sqlite3Close, the pseudo-code is basically:

- disconnect all virtual tables, causing them to clean up their cached prepared statements
- (if invoked via sqlite3_close rather than sqlite3_close_v2): check if the connection is busy and return with SQLITE_BUSY if so.

So, we could try and close the connection to cause the virtual tables to get torn down.  *Then* we walk the list of live statements afterwards.  *Then* we close them and complain a lot.  The horrible bit is that this relies on undocumented behaviour of sqlite3Close.  (However, because of the existing APIs for virtual tables and the semantics for close, it's very unlikely that this would be changed.  The FTS3 prepared statements would need to be marked with meta-data so they could be removed from consideration for the SQLITE_BUSY-returning check.  In that case, we could probably get at the meta-data too and not try and close them.)


Alternatively, if we still liked that strategy but found the above too horrible, we could try and get the call to 'disconnectAllVtab' exposed via a different mechanism.

The 'most correct' thing to do is to maintain the set of statements to be finalized, but that seems like double-book-keeping for hygiene reasons.  (Of course, if the implementation of sqlite3Close changed substantially, we might take a trip to crash-town again, but as demonstrated, we already have unit tests that cover that.)

It might not be a bad idea ping drh/the SQLite team to see what their thoughts are on this.  I would not be surprised if they would find our theoretical invariant "it's okay to finalize the statements on the async thread without nulling out the owned statements" a little cavalier and suggest keeping a list we can definitely hold onto.
(In reply to Andrew Sutherland (:asuth) from comment #16)
> - (if invoked via sqlite3_close rather than sqlite3_close_v2): check if the
> connection is busy and return with SQLITE_BUSY if so.

Andrew, do you remember the specific reason we use sqlite3_close instead of sqlite3_close_v2, in the async case? I think it's just cause we never thought of using it so far.
I wonder if we may switch to v2 just for asyncClose, it would basically solve our problem of having to call sqlite_close at the "right" time. But surely we'd lose some control.
That wouldn't help our need to notify about unfinalized statements (for which we may still want some tracking debug code), but maybe then we don't care that much, cause we know the connection will be closed once they are destroyed, instead of never being closed.

> It might not be a bad idea ping drh/the SQLite team to see what their
> thoughts are on this.

This sounds like a good idea, before using internal undocumented behavior of the API.
Flags: needinfo?(mak77) → needinfo?(bugmail)
(In reply to Marco Bonardo [:mak] from comment #17)
> Andrew, do you remember the specific reason we use sqlite3_close instead of
> sqlite3_close_v2, in the async case? I think it's just cause we never
> thought of using it so far.

I would say historical/legacy, but...

> I wonder if we may switch to v2 just for asyncClose, it would basically
> solve our problem of having to call sqlite_close at the "right" time. But
> surely we'd lose some control.

...I think having the clear delineation of "we're closed now!" is important.  For example, it would suck if you're trying to do something clever with database migration and it turns out that when asyncCloses's callback fires that the database is not in fact closed and then when you try and migrate it things explode.


> > It might not be a bad idea ping drh/the SQLite team to see what their
> > thoughts are on this.
> 
> This sounds like a good idea, before using internal undocumented behavior of
> the API.

I'll leave this to you since I think you are the PoC and I don't know the preferred mechanism to contact them...
Flags: needinfo?(bugmail)
Flags: needinfo?(mak77)
Note that we probably lose warnings for non-finalized statements regardless, unless we wish to warn when a C++ user does

 myStatement = nullptr;
 myStatement->ExecuteAsync();
 myConnection->AsyncClose();
(In reply to David Rajchenbach Teller [:Yoric] from comment #19)
>  myStatement = nullptr;
>  myStatement->ExecuteAsync();

I think the cpp user will be warned enough by a crash in this case...
Sorry, I meant
 myConnection->ExecuteAsync(myStatement);
 myStatement = nullptr;
 myConnection->AsyncClose();
This looks a proper finalization to me, the destructor (provided it's the only owner) invokes finalize, that enqueues the finalizer before the asyncClose runnable. Am I missing something?
Flags: needinfo?(mak77)
Well, currently, the finalizer is generally enqueued after the asyncClose runnable. That's one of the reasons for which we need this async loop and/or auto-finalization mechanism.
that's not what happens there, I can't see how it would be enqueued after, you must be thinking of a different case like finalize in the onCompletion callback.
Unless I'm mistaken, we have the following:

Main thread
* ExecuteAsync() misc stuff
* Dispatches ExecuteAsync() Runnable.
* AsyncClose() misc stuff
* Enqueues AsyncClose() Runnable.
(event loop - next runnable is statement release)
* destructorAsyncFinalize()
* Dispatches LastDitchSqliteStatementFinalizer Runnable

Async thread
* ExecuteAsync() Runnable
* Performs Statement
* Release Statement on the main thread (I can't remember how, exactly)
(event loop - next runnable is AsyncClose() runnable)
* AsyncClose() Runnable
* In the new version, perform auto-finalization.
* sqlite3_close.
(event loop - next runnable is LastDitchSqliteStatementFinalizer)
* now attempt to finalize from LastDitchSqliteStatementFinalizer, too late.

Unless I'm mistaken (and my experiments seem to confirm my understanding), this means that the AsyncClose() Runnable is not the right place to warn of non-finalized statements.
So, we got a quick reply from SQLite team, mosre specifically Richard Hipp:

"The behavior that Andrew describes is stable and will not change.

I'm not going to document the behavior at this time.  However, I have added a new test module to the TH3 test suite (http://www.sqlite.org/th3/doc/trunk/www/th3.wiki) that verifies that sqlite3_close() works as described above.  The new test module has a conspicuous comment at the top that says that Firefox depends on this behavior and so it cannot change.  So there is no chance of the behavior changing in a future release.  It is safe for you to use as-is."

Based on this I think we can do the "little bit horrible thing" from comment 16.
Whiteboard: [Async:P2] → [Async:P1]
Attached patch Off-main thread closing (obsolete) — Splinter Review
Here we go, with the strategy suggested by asuth.
Attachment #778443 - Flags: review?(mak77)
Comment on attachment 778442 [details] [diff] [review]
Fixing statement management in nsCookieService

Can you please add some information about what this patch tries to do in the commit message?  It's not obvious from looking at the patch itself.

Thanks!
Attachment #778442 - Flags: review?(ehsan)
At the moment, nsCookieService get its database connection closing sequence wrong: it nulls out the statements after having called AsyncClose(). For the moment, this works by accident, but once we move to off main thread AsyncClose(), this causes a race condition between statement finalization and database close, which causes assertion failures.
Sorry, I clicked "Save Changes" before I was done.

This patch simply reorders the closing sequence in nsCookieService to null out the statements before calling AsyncClose().
Why don't you create a helper which nulls out the statements and calls AsyncClose() right away?
Well, every call to database close and/or resource cleanup (both of which need to be interleaved) in nsCookieService seems to follow a distinct pattern depending on why they close the db and/or cleanup the resources. There may be a nice helper that can fit every use, but when I wrote the patch, I could not find a nice way to do so.
OK.  In this patch, sometimes you call DestroyCachedStatements immediately before CloseDefaultDBConnection and sometimes you do things in between the two calls.  Is there a good reason why that is the case?
Flags: needinfo?(dteller)
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #34)
> OK.  In this patch, sometimes you call DestroyCachedStatements immediately
> before CloseDefaultDBConnection and sometimes you do things in between the
> two calls.  Is there a good reason why that is the case?

Note: I have just renamed both methods to more meaningful names: DestroyCachedStatements becomes CleanupCachedStatements and CloseDefaultDBConnection becomes CleanupDefaultDBConnection.

To comply with the API, we have the following constraints:
- CleanupCachedStatements must take place before any AsyncClose (or anywhere if the database cannot be opened)
- CleanupDefaultDBConnection must take place after any AsyncClose (or anywhere if the database cannot be opened).

CloseDefaultDBConnection used to do both CleanupCachedStatements and CleanupDefaultDBConnection, regardless of any AsyncClose, which was wrong (and could fail assertions). In my patch, I just cut the method in two and ensured that whatever needs to go before AsyncClose does and whatever needs to go after AsyncClose does, while keeping both method calls as close as possible to each other, given that they are meant to be used together.
Attachment #778442 - Attachment is obsolete: true
Attachment #779225 - Flags: review?(ehsan)
Flags: needinfo?(dteller)
Comment on attachment 779225 [details] [diff] [review]
Fixing statement management in nsCookieService, v2

Review of attachment 779225 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/cookie/nsCookieService.cpp
@@ +1278,5 @@
> +// asynchronously close the connection -- these must be done
> +// beforehand if necessary.
> +void
> +nsCookieService::CleanupDefaultDBConnection()
> +{

Please add a debug-only boolean member to be able to MOZ_ASSERT that CleanupCachedStatements() has been called before this.  r=me with that.
Attachment #779225 - Flags: review?(ehsan) → review+
Here we go, thanks.
Attachment #779225 - Attachment is obsolete: true
Attachment #779262 - Flags: review+
you may land the cookies part already, with a [leave open] in the whiteboard, I will go through the other review as soon as possible, but those changes are unrelated, by landing them separately we may get better testing and regressions coverage.
or split patch to a dependency bug, if you want proper separation. It could make sense exactly cause that's a correctness fix, that is needed for this change.
Comment on attachment 779262 [details] [diff] [review]
Fixing statement management in nsCookieService, v3

Moved patch to bug 897033.
Attachment #779262 - Attachment is obsolete: true
Comment on attachment 778443 [details] [diff] [review]
Off-main thread closing

Review of attachment 778443 [details] [diff] [review]:
-----------------------------------------------------------------

some first comments

::: storage/src/StorageBaseStatementInternal.cpp
@@ +47,5 @@
>      if (mStatement->mAsyncStatement) {
> +      // If mConnection->getAsyncExecutionTarget() returns null,
> +      // we have just lost a race condition with asyncClose(),
> +      // let asyncClose() win and auto-finalize the statement.
> +      if (mConnection->getAsyncExecutionTarget()) {

may this really happen?
I mean, this is a runnable that runs on the async thread, that thread still exists while it's running.
What is this check protecting?

@@ +135,5 @@
> +    // Dispatch. Note that dispatching can fail, typically if
> +    // we have a race condition with asyncClose(). It's ok,
> +    // let asyncClose() win.
> +    (void)target->Dispatch(event, NS_DISPATCH_NORMAL);
> +    mAsyncStatement = nullptr;

I think in both cases we should MOZ_ASSERT, we "accept" broken outside consumers, but we should do our best to avoid wrong internal consumers.

@@ +154,2 @@
>    }
> +  // We cannot dispatch to the background thread, presumably because it

well, we don't know if we can dispatch or not since you removed the early return in the if and the error check.. I'd like stricter checks and asserts.

::: storage/src/mozStorageConnection.cpp
@@ +347,5 @@
>    {
> +    // This code is executed on the background thread
> +    bool onAsyncThread = false;
> +    (void)mAsyncExecutionThread->IsOnCurrentThread(&onAsyncThread);
> +    MOZ_ASSERT(onAsyncThread);

the first part is unneeded in release mode, so please #ifdef DEBUG

@@ +355,5 @@
> +    (void)mConnection->ProxyDispose();
> +
> +    // Callback
> +    if (mCallbackEvent) {
> +      (void)mCallingThread->Dispatch(mCallbackEvent, NS_DISPATCH_NORMAL);

we made asyncClose main-thread only, so you can probably remove mCallingThread and friends.

@@ +367,5 @@
> +    Connection *rawConnection = nullptr;
> +    mConnection.swap(rawConnection);
> +    (void)NS_ProxyRelease(thread, NS_ISUPPORTS_CAST(mozIStorageConnection *,
> +                                                    rawConnection));
> +    NS_ProxyRelease(thread, mCallbackEvent);

what about a proper destructor?

@@ +796,5 @@
>    return mAsyncExecutionThreadShuttingDown && ConnectionReady();
>  }
>  
>  nsresult
> +Connection::ProxyDispose()

I don't like the naming and the behavior of this... could this be done in the destructor?

@@ +807,5 @@
> +  return NS_OK;
> +}
> +
> +nsresult
> +Connection::internalClose(bool aOffMainThread)

This can very easily figure by itself which thread it's running off, without the need of an argument

@@ +832,5 @@
>    if (mDatabaseFile)
>        (void)mDatabaseFile->GetNativeLeafName(leafName);
>    PR_LOG(gStorageLog, PR_LOG_NOTICE, ("Closing connection to '%s'",
>                                        leafName.get()));
>  #endif

I'm honestly not sure that PR_LOGGING is thread-safe... Could you please verify that?

::: storage/src/mozStorageConnection.h
@@ +23,5 @@
>  #include "mozIStorageCompletionCallback.h"
>  
>  #include "nsIMutableArray.h"
>  #include "mozilla/Attributes.h"
> +#include "mozilla/Atomics.h"

what is this include for? leftover or should it be moved elsewhere?
Attachment #778443 - Flags: review?(mak77)
(In reply to Marco Bonardo [:mak] from comment #41)
> Comment on attachment 778443 [details] [diff] [review]
> Off-main thread closing
> 
> Review of attachment 778443 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> some first comments
> 
> ::: storage/src/StorageBaseStatementInternal.cpp
> @@ +47,5 @@
> >      if (mStatement->mAsyncStatement) {
> > +      // If mConnection->getAsyncExecutionTarget() returns null,
> > +      // we have just lost a race condition with asyncClose(),
> > +      // let asyncClose() win and auto-finalize the statement.
> > +      if (mConnection->getAsyncExecutionTarget()) {
> 
> may this really happen?
> I mean, this is a runnable that runs on the async thread, that thread still
> exists while it's running.
> What is this check protecting?

Ah, that may have been a leftover from my trying to merge |AsyncStatementFinalizer| and |LastDitchSqliteStatementFinalizer|.

> 
> @@ +135,5 @@
> > +    // Dispatch. Note that dispatching can fail, typically if
> > +    // we have a race condition with asyncClose(). It's ok,
> > +    // let asyncClose() win.
> > +    (void)target->Dispatch(event, NS_DISPATCH_NORMAL);
> > +    mAsyncStatement = nullptr;
> 
> I think in both cases we should MOZ_ASSERT, we "accept" broken outside
> consumers, but we should do our best to avoid wrong internal consumers.

The situation I am concerned with
- Main thread calls ExecuteAsync()
- Main thread nulls out references to statement
- Main thread calls AsyncClose() without waiting for the callback to ExecuteAsync(), and immediately dispatches an AsyncCloseConnection
- At some point, ExecuteAsync releases last reference to statement on main thread, hence calling Finalize (*)
- Finalize calls asyncFinalize, which dispatches an AsyncStatementFinalizer (*)
- execution of the AsyncStatementFinalizer starts.

The execution of the AsyncCloseConnection can take place at any point before, during or after the states marked with (*). This situation arises as soon as we accept calls to AsyncClose() before the callback from ExecuteAsync(), so I'm not sure what you call a "broken outside consumer" or where I could MOZ_ASSERT.

> 
> @@ +154,2 @@
> >    }
> > +  // We cannot dispatch to the background thread, presumably because it
> 
> well, we don't know if we can dispatch or not since you removed the early
> return in the if and the error check.. I'd like stricter checks and asserts.

Sure, if you tell me which scenario is wrong :)
For the moment, I believe I just need to change a few words in the comment.

> ::: storage/src/mozStorageConnection.cpp
> @@ +347,5 @@
> >    {
> > +    // This code is executed on the background thread
> > +    bool onAsyncThread = false;
> > +    (void)mAsyncExecutionThread->IsOnCurrentThread(&onAsyncThread);
> > +    MOZ_ASSERT(onAsyncThread);
> 
> the first part is unneeded in release mode, so please #ifdef DEBUG

Sure.

> @@ +355,5 @@
> > +    (void)mConnection->ProxyDispose();
> > +
> > +    // Callback
> > +    if (mCallbackEvent) {
> > +      (void)mCallingThread->Dispatch(mCallbackEvent, NS_DISPATCH_NORMAL);
> 
> we made asyncClose main-thread only, so you can probably remove
> mCallingThread and friends.

I'm not sure I understand. This is in the code of AsyncCloseConnection::Run, which is executed on the async thread, isn't it?

> 
> @@ +367,5 @@
> > +    Connection *rawConnection = nullptr;
> > +    mConnection.swap(rawConnection);
> > +    (void)NS_ProxyRelease(thread, NS_ISUPPORTS_CAST(mozIStorageConnection *,
> > +                                                    rawConnection));
> > +    NS_ProxyRelease(thread, mCallbackEvent);
> 
> what about a proper destructor?

Certainly.

> @@ +796,5 @@
> >    return mAsyncExecutionThreadShuttingDown && ConnectionReady();
> >  }
> >  
> >  nsresult
> > +Connection::ProxyDispose()
> 
> I don't like the naming and the behavior of this... could this be done in
> the destructor?

Good idea. Are we sure that the destructor is called on the opener thread?

> @@ +807,5 @@
> > +  return NS_OK;
> > +}
> > +
> > +nsresult
> > +Connection::internalClose(bool aOffMainThread)
> 
> This can very easily figure by itself which thread it's running off, without
> the need of an argument

While this is ultimately equivalent, I find that passing the intended behavior as argument rather than extracting it from the environment is much easier to read and check.
If you want me to get rid of the argument, I will.

> @@ +832,5 @@
> >    if (mDatabaseFile)
> >        (void)mDatabaseFile->GetNativeLeafName(leafName);
> >    PR_LOG(gStorageLog, PR_LOG_NOTICE, ("Closing connection to '%s'",
> >                                        leafName.get()));
> >  #endif
> 
> I'm honestly not sure that PR_LOGGING is thread-safe... Could you please
> verify that?

I have seen it used on various threads, but I'll try and check.

> ::: storage/src/mozStorageConnection.h
> @@ +23,5 @@
> >  #include "mozIStorageCompletionCallback.h"
> >  
> >  #include "nsIMutableArray.h"
> >  #include "mozilla/Attributes.h"
> > +#include "mozilla/Atomics.h"
> 
> what is this include for? leftover or should it be moved elsewhere?

Leftover.
Attached patch Off-main thread closing, v2 (obsolete) — Splinter Review
Applied your feedback.

Ah, also, I now understand your remark « we made asyncClose main-thread only, so you can probably remove mCallingThread and friends. » I suppose I can do a NS_GetMainThread(). Do you want me to?

Try: https://tbpl.mozilla.org/?tree=Try&rev=055d8b4e1e4d
Attachment #778443 - Attachment is obsolete: true
Attachment #781170 - Flags: review?(mak77)
(In reply to David Rajchenbach Teller [:Yoric] from comment #43)
> Ah, also, I now understand your remark « we made asyncClose main-thread
> only, so you can probably remove mCallingThread and friends. » I suppose I
> can do a NS_GetMainThread(). Do you want me to?

well, it doesn't make sense to keep calling thread when we can get it at any time.

(In reply to David Rajchenbach Teller [:Yoric] from comment #42)
> The situation I am concerned with
> - Main thread calls ExecuteAsync()
> - Main thread nulls out references to statement
> - Main thread calls AsyncClose() without waiting for the callback to
> ExecuteAsync(), and immediately dispatches an AsyncCloseConnection
> - At some point, ExecuteAsync releases last reference to statement on main
> thread, hence calling Finalize (*)
> - Finalize calls asyncFinalize, which dispatches an AsyncStatementFinalizer
> (*)
> - execution of the AsyncStatementFinalizer starts.
> 
> The execution of the AsyncCloseConnection can take place at any point
> before, during or after the states marked with (*).

Shouldn't the consumer invoke finalize() before invoking asyncClose?
stmt.executeAsync()
stmt.finalize()
db.asyncClose()
any other usage than this should MOZ_ASSERT very loudly.

> I'm not sure what you call a "broken outside consumer" or
> where I could MOZ_ASSERT.

Anyone that is not smart enough to test his add-on in debug mode, on the other side we have tests, we use debug, we can and should find API misues.
nvm the finalize() comment, I misread the thing.
I suspect we may want to avoid the roundtrip to the mainthread when the async execution is the last thing keeping the statement alive.  The fact is that we should be able to distinguish misuses and properly notify consumers about those. I'll see if I can figure out a solution for the latest version of the patch.
(In reply to Marco Bonardo [:mak] from comment #45)
> nvm the finalize() comment, I misread the thing.

Just for outside readers lacking context: you were talking about JavaScript clients, I was talking about C++ clients.

> I suspect we may want to avoid the roundtrip to the mainthread when the
> async execution is the last thing keeping the statement alive.  The fact is
> that we should be able to distinguish misuses and properly notify consumers
> about those. I'll see if I can figure out a solution for the latest version
> of the patch.

I'll see if I can find a way.
A simple alternative, even if sucky, would be to check if finalize has been called OR check the async statement mRefCnt. So, what we'd request from cpp consumers is that any ownership is dropped before calling asyncClose. This is not very nice but in the end it's just to be able to notify consumers in debug mode, so it's not critical to reach perfection.
(In reply to Marco Bonardo [:mak] from comment #47)
> A simple alternative, even if sucky, would be to check if finalize has been
> called OR check the async statement mRefCnt. So, what we'd request from cpp
> consumers is that any ownership is dropped before calling asyncClose. This
> is not very nice but in the end it's just to be able to notify consumers in
> debug mode, so it's not critical to reach perfection.

To do this, we would need to add statement bookkeeping to the connection, wouldn't we?

Tracing the execution, I determined that we have the following timeline:
- async execution completes, triggering AsyncExecuteStatements::notifyComplete();
- this method dispatches a CompletionNotifier to the main thread;
- at the end of CompletionNotifier::Run(), we have a call to mStatementData.Clear() that triggers off main thread finalization.

Now, we could do the following:
- patch AsyncExecuteStatements::notifyComplete() to finalize directly the |sqlite3_statement|s of |mStatementData| whenever the mozStorageStatementData holding that statement has a refcount of 1; 
- remove the call to stmt_finalize in AsyncCloseConnection/internalClose and reintroduce the warning we had initially.

This looks a little complicated, but we can try it.
(In reply to David Rajchenbach Teller [:Yoric] from comment #48)
> To do this, we would need to add statement bookkeeping to the connection,
> wouldn't we?

So, I didn't verify this on actual code yet, can't we (ab)use ref counting to tell if the consumer dropped his ownership? Ideally, before invoking asyncClose() the consumer would drop his ownership. At that point the only owners should be internal, we know the expected refcnt and we can tell if something else didn't drop ownership. Then we can assert that, and proceed with simple automated finalization.
I'm not sure we have a good place to store that information, unless you want me to add a table of owned statements.
Comment on attachment 781170 [details] [diff] [review]
Off-main thread closing, v2

Review of attachment 781170 [details] [diff] [review]:
-----------------------------------------------------------------

This is reviewing current patch.

Now to the remaining issue.
If you executeAsync, null out your ownership, and call asyncClose, it looks like we could make AsyncExecuteStatements::Run release mStatements just before returning, that should cause the statements to be destroyed on the async thread before asyncClose runnable. We seem to try to do that (http://mxr.mozilla.org/mozilla-central/source/storage/src/mozStorageAsyncStatement.cpp#228). Looks like that would call destructorAsyncFinalize() on the async thread, but it tries to Dispatch to the same thread in such a case (http://mxr.mozilla.org/mozilla-central/source/storage/src/StorageBaseStatementInternal.cpp#143), it may probably check if it's already on that thread and just invoke Run instead of dispatching the event.
Do you think this is something you tried already? May this solve the issue you pointed out? I didn't try it, but I could. Though first I'd like to understand if you already tried such solution.

::: storage/src/StorageBaseStatementInternal.cpp
@@ +149,4 @@
>    }
> +  // We cannot dispatch to the background thread, presumably because it
> +  // is being shutdown. Since said shutdown will finalize the statement,
> +  // we just need to clean-up around here.

are the changes in StorageBaseStatementInternal really needed? Looks like we already ignore any error condition, so the patch could be simpler, but maybe I'm missing something.

::: storage/src/mozStorageConnection.cpp
@@ +336,5 @@
>                         nsIEventTarget *aCallingThread,
>                         nsIRunnable *aCallbackEvent,
>                         already_AddRefed<nsIThread> aAsyncExecutionThread)
>    : mConnection(aConnection)
>    , mCallingThread(aCallingThread)

Since we enforced asyncClose to be used only from main-thread, this can only be main-thread, so there's no need to bring it on in a property.

@@ +842,5 @@
> +    // We still have non-finalized statements. Since we have called
> +    // sqlite3_close(), all internal sqlite3 statements have been
> +    // closed so the only statements left belong to users of
> +    // mozStorage. We may now either auto-finalize them or warn
> +    // users that they should have done so in the first place.

I'd prefer a shorter comment before the if, explaining the strategy, than the effects.

// Attempt to close the database connection.  This will first disconnect
// any virtual tables and cleanly finalize all of their internal statements.
// Then it will try to actually close the connection, but it may fail, due to
// unfinalized statements. That's fine, cause we can automatically finalize
// all of the remaining statements before retrying to close.

@@ +847,5 @@
> +
> +    sqlite3_stmt *stmt = NULL;
> +    if (aOffMainThread) {
> +      // Auto-finalize statements
> +      while ((stmt = ::sqlite3_next_stmt(mDBConn, NULL))) {

I think that we could unify these code paths a little bit more. The only difference is that on main-thread we can issue a warning, while we can't off main-thread, cause we can't easily distinguish valid from invalid cases. But all of the rest, included the automatic finalization with PR_LOG reporting and the second sqlite3_close() can be done in both cases.
I'd condition only the warning, and since aOffMainThread will only be needed in DEBUG mode for the warning, I'd avoid it as an argument and just codition on onOpenedThread.

@@ +868,5 @@
> +#endif
> +      }
> +#ifdef PR_LOGGING
> +      PR_LOG(gStorageLog, PR_LOG_NOTICE, ("All statements for '%s' have been finalized",
> +                                          leafName.get()));

I think this is pointless, it's pretty clear than if there's no auto-finalizing log and no error from the second close call, all statements have been finalized...

@@ +871,5 @@
> +      PR_LOG(gStorageLog, PR_LOG_NOTICE, ("All statements for '%s' have been finalized",
> +                                          leafName.get()));
> +#endif
> +
> +      // Now attempt to close once again

nit: end comments with period

Add brief explain of the reasoning like
// Now that we auto-finalized all of the remaining statements, attempt to close once again.

@@ +873,5 @@
> +#endif
> +
> +      // Now attempt to close once again
> +
> +      srv = ::sqlite3_close(mDBConn);

nit: and no newline between comment and the command

@@ +890,5 @@
> +  }
> +
> +  if (srv != SQLITE_OK) {
> +    NS_ASSERTION(srv == SQLITE_OK,
> +                 "sqlite3_close failed. There are probably outstanding statements that are listed above!");

at this point, if we unify the paths and do automatic finalization in both cases, I think we should make this stronger, like a MOZ_ASSERT

@@ +896,1 @@
>    mDBConn = NULL;

OT: Honestly, I think we should set it to null earlier and pass a pointer to internalClose instead of using mDBConn. The reason is that ConnectionReady checks if it's null, and after a call to asyncClose it should probably return false.  But this is for a separate bug, like bug 726990. SetClosedState has a note saying it nulls mDBConn when it doesn't, maybe it should. Btw, all of this is an off-topic digression.
Attachment #781170 - Flags: review?(mak77)
(In reply to Marco Bonardo [:mak] from comment #51)
> Comment on attachment 781170 [details] [diff] [review]
> Off-main thread closing, v2
> 
> Review of attachment 781170 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This is reviewing current patch.
> 
> Now to the remaining issue.
> If you executeAsync, null out your ownership, and call asyncClose, it looks
> like we could make AsyncExecuteStatements::Run release mStatements just
> before returning, that should cause the statements to be destroyed on the
> async thread before asyncClose runnable.

It's not very far from what I had in mind, except I intended to run finalization by myself if the refcount was 1 instead of releasing. Let's try your idea.


> @@ +896,1 @@
> >    mDBConn = NULL;
> 
> OT: Honestly, I think we should set it to null earlier and pass a pointer to
> internalClose instead of using mDBConn. The reason is that ConnectionReady
> checks if it's null, and after a call to asyncClose it should probably
> return false.  But this is for a separate bug, like bug 726990.
> SetClosedState has a note saying it nulls mDBConn when it doesn't, maybe it
> should. Btw, all of this is an off-topic digression.

In that case, I'm not fixing this just yet :)
(In reply to David Rajchenbach Teller [:Yoric] from comment #52)
> It's not very far from what I had in mind, except I intended to run
> finalization by myself if the refcount was 1 instead of releasing.

yes, but there's also the fact destructorAsyncFinalize is re-dispatching even when not needed.
Attached patch Off-main thread closing, v3 (obsolete) — Splinter Review
That's just the same patch, with your feedback.
Attachment #781170 - Attachment is obsolete: true
Attachment #785753 - Flags: review?(mak77)
Attached patch Finalizing statements earlier (obsolete) — Splinter Review
Is this the kind of thing you had in mind?
Attachment #785755 - Flags: feedback?(mak77)
Attached patch Merged patch (obsolete) — Splinter Review
As discussed over IRC, here's a merged patch, minus the cleanup to StorageStatementInternal.cpp.
Try: https://tbpl.mozilla.org/?tree=Try&rev=f412c564979a
Attachment #785753 - Attachment is obsolete: true
Attachment #785755 - Attachment is obsolete: true
Attachment #785753 - Flags: review?(mak77)
Attachment #785755 - Flags: feedback?(mak77)
Attachment #785959 - Flags: review?(mak77)
Comment on attachment 785959 [details] [diff] [review]
Merged patch

Review of attachment 785959 [details] [diff] [review]:
-----------------------------------------------------------------

So, overall this looks good! I think the approach is fine and there is just some cleanup needed, as stated below.

Unfortunately I don't have enough time left for a deep check of all of the thread-safety issues that may arise from the change, so, I suggest fixing my comments, then asking for a review pass to Andrew. I will drop him a mail now to notify about this. I will check status next Monday when I'm back.

::: storage/src/mozStorageAsyncStatementExecution.cpp
@@ +149,5 @@
>      return NS_OK;
>    }
>  
>  private:
>    StatementDataArray mStatementData;

I guess you should also remove this

@@ +427,5 @@
>    for (uint32_t i = 0; i < mStatements.Length(); i++)
> +    mStatements[i].reset();
> +
> +  // Release references to the statement data as soon as possible. This
> +  // will also cause finalization of statements to be executed immediately,

s/will/should/

@@ +431,5 @@
> +  // will also cause finalization of statements to be executed immediately,
> +  // on the async thread.
> +  mStatements.Clear();
> +
> +

too many newlines here

::: storage/src/mozStorageConnection.cpp
@@ +376,4 @@
>    }
>  private:
>    nsRefPtr<Connection> mConnection;
>    nsCOMPtr<nsIEventTarget> mCallingThread;

should be removed

@@ +811,5 @@
>                   "Did not call setClosedState!");
>    }
>  
> +  bool onOpeningThread = false;
> +  (void)threadOpenedOn->IsOnCurrentThread(&onOpeningThread);

looks like you wrongly removed the assertion from here...

@@ +835,5 @@
> +  if (srv == SQLITE_BUSY) {
> +    // We still have non-finalized statements. Finalize them.
> +
> +    sqlite3_stmt *stmt = NULL;
> +    while ((stmt = ::sqlite3_next_stmt(mDBConn, NULL))) {

if we pass NULL as the second argument, and auto-finalization of stmt fails for whatever reason, won't that trap us into an infinite loop where we keep reading and trying to finalize the same statement? I guess we should pass stmt as second argument here.

@@ +845,2 @@
>  #ifdef DEBUG
> +      if (onOpeningThread) {

With the latest changes we should have ensured we arrive here with finalized statements, if the consumer did the right thing (calling finalize() from js, or dropping ownership from cpp). So we should be able to remove this condition and warn in both cases

::: storage/src/mozStorageConnection.h
@@ +23,5 @@
>  #include "mozIStorageCompletionCallback.h"
>  
>  #include "nsIMutableArray.h"
>  #include "mozilla/Attributes.h"
> +#include "mozilla/Atomics.h"

leftover

@@ +139,5 @@
>  
>    /**
> +   * Dispose resources on the right thread
> +   */
> +  nsresult ProxyDispose();

leftover
Attachment #785959 - Flags: review?(mak77) → feedback+
(In reply to Marco Bonardo [:mak] (Away Aug 07-11) from comment #58)
> @@ +811,5 @@
> >                   "Did not call setClosedState!");
> >    }
> >  
> > +  bool onOpeningThread = false;
> > +  (void)threadOpenedOn->IsOnCurrentThread(&onOpeningThread);
> 
> looks like you wrongly removed the assertion from here...

I don't think so.
The assertion ensured that this function was called from the right thread. Since it can now be called from any thread, I don't really think I should leave it here.

> @@ +835,5 @@
> > +  if (srv == SQLITE_BUSY) {
> > +    // We still have non-finalized statements. Finalize them.
> > +
> > +    sqlite3_stmt *stmt = NULL;
> > +    while ((stmt = ::sqlite3_next_stmt(mDBConn, NULL))) {
> 
> if we pass NULL as the second argument, and auto-finalization of stmt fails
> for whatever reason, won't that trap us into an infinite loop where we keep
> reading and trying to finalize the same statement? I guess we should pass
> stmt as second argument here.

Already tested: if we pass |stmt|, sqlite3 segfaults. So I need to pass |stmt| if closing failed or |NULL| if it succeeded. That can be arranged.
Attached patch Merged patch, v2 (obsolete) — Splinter Review
Applied feedback.

Try and personal testing seem to confirm that my cleanup to StorageBaseStatementInternal.cpp is actually required. Otherwise, I have a few tests that fail mysteriously (no error message, nothing meaningful in the logs, even on mozStoage:5), so I put the cleanups back into the patch.
Attachment #785959 - Attachment is obsolete: true
Attachment #786286 - Flags: review?(bugmail)
Comment on attachment 786286 [details] [diff] [review]
Merged patch, v2

Review of attachment 786286 [details] [diff] [review]:
-----------------------------------------------------------------

This really cleans a lot of mozStorage stuff up quite nicely! r=asuth conditional on not clearing mAsyncStatement as noted in the cases below (above?).

::: storage/src/StorageBaseStatementInternal.cpp
@@ +120,5 @@
>    if (!target) {
>      // If we cannot get the background thread, we have to assume it has been
> +    // shutdown (or is in the process of doing so).  Since the statement
> +    // will be auto-finalized, we just need to clean-up around here.
> +    mAsyncStatement = nullptr;

It is not safe to be nulling out mAsyncStatement here or in the case directly below this.  In both cases, it is entirely possible that there are still AsyncExecuteStatements instances active or enqueued on the asynchronous thread.  While the race window for AsyncStatement::getAsyncStatement to return a null statement and thereby violating its contract is ridiculously small, it is there.  In the event the bad race doesn't happen, the AsyncExecuteStatements will just end up re-creating the statement.

The intended contract for mAsyncStatement was that races are avoided since it only gets touched by the async thread.  I think what the logic in this case was trying to do was only null out mAsyncStatement if we were sure that the async thread was entirely dead.

I think the right course of action is to not zero out mAsyncStatement in either case.  Based on the invariant that only the async thread touches it (or it is unable to access it because of other strong invariants), there is no risk in leaving this as a potentially dangling pointer even though it might not feel like the best thing.  (Debugging tools might pick up on the dangling pointer, which is fine, since the code that did not cause finalization before calling AsyncClose was indeed in error and deserves to be yelled at.)

@@ +130,5 @@
> +    // Dispatch. Note that dispatching can fail, typically if
> +    // we have a race condition with asyncClose(). It's ok,
> +    // let asyncClose() win.
> +    (void)target->Dispatch(event, NS_DISPATCH_NORMAL);
> +    mAsyncStatement = nullptr;

As mentioned above, don't zero this, especially all the time (it used to be only on failure to dispatch.)

@@ +161,1 @@
>    mAsyncStatement = nullptr;

I would suggest moving the comment in question up to the if (target) case.  I might strengthen it a little to be something like so:
// If we can get the async execution target, then our owner forgot to finalize
// us and we are going out of scope.  It's helpful for us to automatically
// finalize the statement, otherwise the statement memory would leak until the
// connection gets closed.
//
// If we can't get the async execution target, then AsyncClose has already been
// called and our statement either has been or will be cleaned up by
// internalClose.

And before the mAsyncStatement nulling here I would say:
// This object is being destroyed; it doesn't matter if we clear this or not.
// We definitely won't step on the async execution thread; it would hold a
// reference to us so we wouldn't be getting destroyed.

Honestly I'm not sure if we should clear it or not.

::: storage/src/mozStorageConnection.cpp
@@ +371,5 @@
> +    mConnection.swap(rawConnection);
> +    (void)NS_ProxyRelease(thread,
> +                          NS_ISUPPORTS_CAST(mozIStorageConnection *,
> +                                            rawConnection));
> +    NS_ProxyRelease(thread, mCallbackEvent);

The second NS_ProxyRelease should be (void)-cast too per coding guidelines (which I forgot about until I saw the first one :)

::: storage/src/mozStorageStatementData.h
@@ +49,5 @@
>    }
> +  ~StatementData()
> +  {
> +    nsCOMPtr<nsIThread> mainThread = do_GetMainThread();
> +    NS_ProxyRelease(mainThread, mParamsArray);

This probably wants a comment along the following lines:

// It is essential that the release of the the parameters occurs on the main
// thread because the BindingParams instances hold nsIVariant references that
// may be (main thread) XPCVariant instances.  XPCVariants are cycle-collected
// and therefore their reference counts are not safe for us to manipulate on
// this thread.
//
// (The XPCVariants can get in there because mozIStorageBindingParams'
// bindByName and bindByIndex methods take nsIVariant instances for which
// XPConnect will furnish XPCVariants.  We could/should probably transform
// such variants into mozStorage's thread-safe Variant implementation.)

@@ +81,5 @@
>    /**
>     * NULLs out our sqlite3_stmt (it is held by the owner) after reseting it and
>     * clear all bindings to it.  This is expected to occur on the async thread.
>     */
> +  inline void reset()

Good call on the name change!
Attachment #786286 - Flags: review?(bugmail) → review+
(In reply to Andrew Sutherland (:asuth) from comment #61)
> 
> I would suggest moving the comment in question up to the if (target) case. 
> I might strengthen it a little to be something like so:
> // If we can get the async execution target, then our owner forgot to
> finalize
> // us and we are going out of scope.  It's helpful for us to automatically
> // finalize the statement, otherwise the statement memory would leak until
> the
> // connection gets closed.

Are you sure? This code can be triggered by a call to AsyncStatement::Finalize, so I wouldn't say that the owner forgot to finalize us.

> And before the mAsyncStatement nulling here I would say:
> // This object is being destroyed; it doesn't matter if we clear this or not.
> // We definitely won't step on the async execution thread; it would hold a
> // reference to us so we wouldn't be getting destroyed.
> 
> Honestly I'm not sure if we should clear it or not.

Ok, then let's not.

> ::: storage/src/mozStorageStatementData.h
> @@ +49,5 @@
> >    }
> > +  ~StatementData()
> > +  {
> > +    nsCOMPtr<nsIThread> mainThread = do_GetMainThread();
> > +    NS_ProxyRelease(mainThread, mParamsArray);
> 
> This probably wants a comment along the following lines:

I went for something a little less descriptive.
Attached patch Additive to merged patch (obsolete) — Splinter Review
Does this do the trick?
Attachment #789172 - Flags: review?(bugmail)
Comment on attachment 789172 [details] [diff] [review]
Additive to merged patch

r=asuth, although feel free to make any additional comment changes given the next comment, etc.

(In reply to David Rajchenbach Teller [:Yoric] from comment #62)
> (In reply to Andrew Sutherland (:asuth) from comment #61)
> > 
> > I would suggest moving the comment in question up to the if (target) case. 
> > I might strengthen it a little to be something like so:
> > // If we can get the async execution target, then our owner forgot to
> > finalize
> > // us and we are going out of scope.  It's helpful for us to automatically
> > // finalize the statement, otherwise the statement memory would leak until
> > the
> > // connection gets closed.
> 
> Are you sure? This code can be triggered by a call to
> AsyncStatement::Finalize, so I wouldn't say that the owner forgot to
> finalize us.

This comment was made on StorageBaseStatementInternal::destructorAsyncFinalize.  AsyncStatement::Finalize only calls StorageBaseStatementInternal::asyncFinalize as far as I can tell.



> > ::: storage/src/mozStorageStatementData.h
> > @@ +49,5 @@
> > >    }
> > > +  ~StatementData()
> > > +  {
> > > +    nsCOMPtr<nsIThread> mainThread = do_GetMainThread();
> > > +    NS_ProxyRelease(mainThread, mParamsArray);
> > 
> > This probably wants a comment along the following lines:
> 
> I went for something a little less descriptive.

I can get a little verbose; editing assistance is always appreciated :)
Attachment #789172 - Flags: review?(bugmail) → review+
Attached patch Merged patch, v3 (obsolete) — Splinter Review
Applying feedback on comments, merging patch.
Attachment #775300 - Attachment is obsolete: true
Attachment #786286 - Attachment is obsolete: true
Attachment #789172 - Attachment is obsolete: true
Attachment #794576 - Flags: review+
Attached patch Merged patch, v4Splinter Review
Fixing a trivial issue in a test.
Attachment #794576 - Attachment is obsolete: true
Attachment #796206 - Flags: review+
https://hg.mozilla.org/integration/fx-team/rev/c7d907403794
Flags: in-testsuite+
Keywords: checkin-needed
Whiteboard: [Async:P1] → [Async:P1][fixed-in-fx-team]
https://hg.mozilla.org/mozilla-central/rev/c7d907403794
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Whiteboard: [Async:P1][fixed-in-fx-team] → [Async:P1]
Target Milestone: --- → mozilla26
Wooooooooooo!  Thanks so much for making this happen, :Yoric!  (And to :mak for the reviews!)
Depends on: 910226
Depends on: 911256
You need to log in before you can comment on or make changes to this bug.