Closed Bug 985655 Opened 10 years ago Closed 10 years ago

[AsyncShutdown] Ensure that Sqlite.jsm doesn't shutdown before its clients

Categories

(Toolkit :: Async Tooling, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla32
Tracking Status
firefox30 - wontfix
firefox31 - wontfix
firefox32 + fixed

People

(Reporter: mak, Assigned: Yoric)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [async])

Attachments

(3 files, 13 obsolete files)

34.98 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
11.69 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
14.69 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
I think it's worth to evaluate moving Sqlite.jsm shutdown later, mostly cause it's very common for consumers to use profile-before-change for shutdown work, but if they do:
- Sqlite.jsm will try to open a connection and fail doing so cause AsyncShutdown.profileBeforeChange.addBlocker will throw
- they may not notice it didn't work as expected
- basically they are only left with profile-change-teardown, that's not enough for complex shutdown processes

There should be no downside moving to profile-before-change2, and that would allow consumers to use profileChangeTeardown and profileBeforeChange for enqueued shutdown work.
I agree with the general idea of moving Sqlite.jsm shutdown later, but I'm not convinced about profile-before-change2. I believe that a storage-shutdown with a clear policy of what should go in it and what shouldn't would be cleaner.
storage shutdown doesn't exist, it shutdowns at xpcom-shutdown and xpcom-shutdown-threads, that is definitely too late. Anyting after profile-before-change2 is too late for consumers, as well as anything before it.
if you want to fire a storage-shutdown at profile-before-change2, that would do, but wouldn't be very useful.
I meant adding a new shutdown step, with a clear policy. Our shutdown steps have a bad tendency towards scope creep and unclear ordering.

Alternatively, I dream of a way to extend AsyncShutdown with more explicit dependencies. Right now, my best idea (which is not good enough) is

e.g.:
 let storageShutdown = AsyncShutdown.profileBeforeChange.lastSlice
 storageShutdown.addBlocker(...)

 // storageShutdown is a lightweight phase, still part of profileBeforeChange,
 // but blockers enqueued without lastSlice are executed before those enqueued with lastSlice.
Ok, what about introducing a method Sqlite.addBlocker, with the same signature as the addBlocker of AsyncShutdown, meant to be called by all services that require the use of Sqlite?

With the following semantics:
- Sqlite.addBlocker() can only be called before profileBeforeChange;
- Sqlite's shutdown can only proceed once all the blockers have been resolved;
- Sqlite's shutdown remains part of profileBeforeChange.

This could be a nice second step towards the de-uglification of our shutdown dependencies.
Flags: needinfo?(mak77)
(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?") from comment #4)
> Ok, what about introducing a method Sqlite.addBlocker

I guess you meant addShutdownBlocker :)

, with the same
> signature as the addBlocker of AsyncShutdown, meant to be called by all
> services that require the use of Sqlite?
> 
> With the following semantics:
> - Sqlite.addBlocker() can only be called before profileBeforeChange;

How would we enforce this, if the consumer creates the Sqlite object at profile-before-change (this was the bookmarks.html case, for example). AsyncShutdown by listening to various events can do that, but not Sqlite.

> - Sqlite's shutdown can only proceed once all the blockers have been
> resolved;
> - Sqlite's shutdown remains part of profileBeforeChange.

So basically, this enforces users to create the Sqlite object before profileBeforeChange, though we don't have a way to enforce that.

Would Sqlite use AsyncShutdown internally? I don't think we should spread ents loop spinning in multiple components, one is more than enough...
Flags: needinfo?(mak77)
(In reply to Marco Bonardo [:mak] from comment #5)
> (In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?") from
> comment #4)
> > Ok, what about introducing a method Sqlite.addBlocker
> 
> I guess you meant addShutdownBlocker :)
> 
> , with the same
> > signature as the addBlocker of AsyncShutdown, meant to be called by all
> > services that require the use of Sqlite?
> > 
> > With the following semantics:
> > - Sqlite.addBlocker() can only be called before profileBeforeChange;
> 
> How would we enforce this, if the consumer creates the Sqlite object at
> profile-before-change (this was the bookmarks.html case, for example).
> AsyncShutdown by listening to various events can do that, but not Sqlite.

Well, Sqlite is a client of AsyncShutdown, so Sqlite is informed by AsyncShutdown of profileBeforeChange.

> > - Sqlite's shutdown can only proceed once all the blockers have been
> > resolved;
> > - Sqlite's shutdown remains part of profileBeforeChange.
> 
> So basically, this enforces users to create the Sqlite object before
> profileBeforeChange, though we don't have a way to enforce that.

What do you mean by "the Sqlite object"? This addShutdownBlocker API would be connected to the module itself, not to a specific instance of a database, so I'm not sure I understand your comment.

> Would Sqlite use AsyncShutdown internally? I don't think we should spread
> ents loop spinning in multiple components, one is more than enough...

Indeed, I'm pretty sure that I can implement addShutdownBlocker without spinning the event loop a second time... if Sqlite is already a client of AsyncShutdown. More generally, I imagine that addShutdownBlocker could become an API of AsyncShutdown, used to provide dependencies inside a phase.
Ah ok, basically Sqlite.addShutdownBlocker is a sort of an helper that collects blockers and forwards them to AsyncShutdown... I think that may work.
Though, it may be tricky for consumers to figure the right time to call it, I suspect people may just try to addShutdownBlocker exactly at profile-before-change... and you may need to do shutdown work only once you have created a connection. This makes me think it should rather be per connection (maybe one of the connection options may be a shutdown blocker).
Summary: move Sqlite.jsm shutdown to profile-before-change2 → [AsyncShutdown] Ensure that Sqlite.jsm doesn't shutdown before its clients
Until now, AsyncShutdown only exposed a heavyweight Phase mechanism designed to inject asynchronous cleanup code into a synchronous shutdown notification. This is good for refactoring existing stuff and get it to fit, more or less easily, with our synchronous shutdown mechanism.

Now, the problem with which we are dealing here is slightly simpler: Sqlite.jsm is already asynchronous, so clients need to inject async dependencies into something that is already asynchronous. Instead of attempting to artificially project this back into the synchronous world and introducing implicit (and hidden, and fragile) ordering between shutdowns, let's take this opportunity to introduce an async-on-async API that does essentially the same thing as Phase, but without spinning a nested event loop. This should be both cleaner, easier to debug and faster than AsyncShutdown phases.
Assignee: nobody → dteller
Attachment #8408189 - Flags: review?(nfroyd)
Attachment #8408189 - Flags: feedback?(mak77)
Making use of this mechanism to let clients of Sqlite register blockers.

The API is essentially the same as the addBlocker method exposed by phases of AsyncShutdown:

Sqlite.shutdown.addBlocker(
  "Foobar performing cleanup during Sqlite.jsm shutdown",
  function cleanup() {
    // Shutdown of Sqlite.jsm has started, let's clean up asynchronously
    return promise;
  },
  function status()  {
    // Return data for debugging purposes
  });
Attachment #8408191 - Flags: review?(mak77)
Attachment #8408191 - Flags: feedback?(nfroyd)
Note: I realize that this needs more tests.
Comment on attachment 8408191 [details] [diff] [review]
2. Refactoring Sqlite.jsm to let clients register lightweight blockers

Review of attachment 8408191 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/modules/Sqlite.jsm
@@ +1039,4 @@
>  this.Sqlite = {
>    openConnection: openConnection,
> +  cloneStorageConnection: cloneStorageConnection,
> +  shutdown: barrier.client,

I see now why you want a separate object...sort of.
Attachment #8408191 - Flags: feedback?(nfroyd) → feedback+
Comment on attachment 8408189 [details] [diff] [review]
1. Lightweight AsyncShutdown barrier mechanism.

Review of attachment 8408189 [details] [diff] [review]:
-----------------------------------------------------------------

Needs tests, but everything in here looks OK.

::: toolkit/modules/AsyncShutdown.jsm
@@ +340,5 @@
> +            phase: topic,
> +            conditions: state
> +	  };
> +          gCrashReporter.annotateCrashReport("AsyncShutdownTimeout",
> +          JSON.stringify(data));

Fix the indentation here, please.

@@ +355,5 @@
> +      });
> +
> +    promise = promise.then(function() {
> +      timeToCrash.reject();
> +    }.bind(this)/* No error is possible here*/);

Arrow function?  Though I don't see why the bind is necessary.

@@ +358,5 @@
> +      timeToCrash.reject();
> +    }.bind(this)/* No error is possible here*/);
> +
> +    // Now, spin the event loop
> +    promise.then(() => satisfied = true); // This promise cannot reject 

Nit: trailing whitespace.

@@ +402,5 @@
> +  this._monitors = null;
> +
> +  /**
> +   * The capability of adding blockers. This object may safely be returned
> +   * or passed to clients.

Freeze this?  And does this really need to be an object, rather than just exposing an addBlocker() function?

@@ +548,5 @@
>              " Phase: " + topic;
>        warn(msg, error);
>      });
>  
> +    promise = promise.then(() => this._monitors = null); // Memory cleanup

Does that happen on error, too?
Attachment #8408189 - Flags: review?(nfroyd) → feedback+
(In reply to Nathan Froyd (:froydnj) from comment #12)
> @@ +402,5 @@
> > +  this._monitors = null;
> > +
> > +  /**
> > +   * The capability of adding blockers. This object may safely be returned
> > +   * or passed to clients.
> 
> Freeze this?  And does this really need to be an object, rather than just
> exposing an addBlocker() function?

It doesn't strictly need to be an object, but I believe that it's clearer that you can export it to clients if it's an object than if it's an apparently normal method. Also, since Barrier is somewhat symmetric to Deferred (in Promise.jsm), that property mirrors the |promise| field of Deferred.

> 
> @@ +548,5 @@
> >              " Phase: " + topic;
> >        warn(msg, error);
> >      });
> >  
> > +    promise = promise.then(() => this._monitors = null); // Memory cleanup
> 
> Does that happen on error, too?

The previous then() guarantees that we can't have errors at this stage.
Same one, more tests and a few trivial fixes.
Attachment #8408189 - Attachment is obsolete: true
Attachment #8408189 - Flags: feedback?(mak77)
Attachment #8415262 - Flags: review?(nfroyd)
The same changes, but now with tests.
Attachment #8408191 - Attachment is obsolete: true
Attachment #8408191 - Flags: review?(mak77)
Attachment #8415264 - Flags: review?(mak77)
Attachment #8415264 - Flags: feedback?(nfroyd)
Comment on attachment 8415262 [details] [diff] [review]
1. Lightweight AsyncShutdown barrier mechanism., v2

Review of attachment 8415262 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/modules/tests/xpcshell/test_AsyncShutdown.js
@@ +39,5 @@
> + * a AsyncShutdown barrier.
> + *
> + * @return An object with the following methods:
> + *   - addBlocker() - the same method as AsyncShutdown phases and barrier clients
> + *   - wait() - trigger the resolution of the lock, 

Nit: trailing whitespace.
Attachment #8415262 - Flags: review?(nfroyd) → review+
Attachment #8415264 - Flags: feedback?(nfroyd) → feedback+
Comment on attachment 8415262 [details] [diff] [review]
1. Lightweight AsyncShutdown barrier mechanism., v2

Side-note: the name of the constructor will probably change to `AsyncShutdown.Barrier` instead of just `Barrier`.

Other side-note: it looks like we will need this mechanism also to remove the nested event loop spinning from the shutdown procedure of FHR (bug 917883).

Benjamin, what do you think of this mechanism?
Attachment #8415262 - Flags: feedback?(benjamin)
Comment on attachment 8415262 [details] [diff] [review]
1. Lightweight AsyncShutdown barrier mechanism., v2

If you want API feedback, could you write this a patch to in-tree API docs? I'd have to read the patch in detail to figure out what the API contracts are here.
Attachment #8415262 - Flags: feedback?(benjamin)
Attached file Outline of the new API (obsolete) —
Here's just the documentation.
Attachment #8416072 - Flags: feedback?(benjamin)
Attachment #8416072 - Attachment mime type: application/x-javascript → text/plain
Comment on attachment 8415262 [details] [diff] [review]
1. Lightweight AsyncShutdown barrier mechanism., v2

Review of attachment 8415262 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/modules/AsyncShutdown.jsm
@@ +162,5 @@
>    deferred.promise.then(() => timer.cancel(), () => timer.cancel());
>    return deferred;
>  }
>  
> +this.EXPORTED_SYMBOLS = ["AsyncShutdown", "Barrier"];

I'd honestly prefer if Barrier would be eported through AsyncShutdown, like new AsyncShutdown.Barrier()... It would allow to keep using defineLazyModuleGetter (or future lazyImport) more easily. Also Barrier is a so generic name that sounds possibly too much conflicting.
I believe that everybody is in agreement, I'll make this AsyncShutdown.Barrier.
Comment on attachment 8415264 [details] [diff] [review]
2. Refactoring Sqlite.jsm to let clients register lightweight blockers, v2

Review of attachment 8415264 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/modules/Sqlite.jsm
@@ +15,5 @@
>  Cu.import("resource://gre/modules/Services.jsm");
>  Cu.import("resource://gre/modules/XPCOMUtils.jsm");
>  Cu.import("resource://gre/modules/Log.jsm");
> +let {AsyncShutdown, Barrier} = Cu.import("resource://gre/modules/AsyncShutdown.jsm");
> +Cu.import("resource://gre/modules/Task.jsm");

Would be possible to do all the init work the first time a connection is opened?

When importing sqlite.jsm you are now immediatly importing AsyncShutdown, Task, creating a ConnectionRegister, a Barrier and adding an AsyncShutdown blocker...

I feel like this is a lot of stuff to do when just importing the module, add-ons may import Sqlite.jsm and never use it, and also some of our code may use Sqlite.jsm conditionally (we should usually defineLazyModuleGetter it, but it's wrong to make that assumption here).

Most of the stuff you are doing is only needed when the first connection is created, a lazyGetter would satisfy such requirement much better imo.

@@ +33,5 @@
> + *
> + * Used to ensure that we wait until all connections are closed before
> + * we shutdown.
> + */
> +let ConnectionRegister = {

for example this may be lazyGetter-ed

@@ +38,5 @@
> +  /**
> +   * A map from connection id to a promise resolved once the database
> +   * is closed.
> +   */
> +  _map: new Map(),

so, as soon as someone imports Sqlite.jsm, we create this object and a Map. I wonder if wouldn't be better to "new ConnectionRegister()" when needed.

@@ +50,5 @@
> +
> +  /**
> +   * |true| if it is still time to open connections, |false| otherwise.
> +   */
> +  get mayOpen() {

I find very confusing the ConnectionRegister.mayOpen checks in this patch, they are also almost always negated... why not going for a much simpler and more readable ConnectionRegister.isClosed property?

@@ +52,5 @@
> +   * |true| if it is still time to open connections, |false| otherwise.
> +   */
> +  get mayOpen() {
> +    return this._mayOpen;
> +  },

_isClosed: false,
get isClosed() this._isClosed,

@@ +57,5 @@
> +  _mayOpen: true,
> +
> +  delete: function(key) {
> +    return this._map.delete(key);
> +  },

maybe delete: (key) => this.map.delete(key)

@@ +72,5 @@
> +  },
> +
> +  keys: function() {
> +    return this._map.keys();
> +  }

arrow functions?

@@ +338,5 @@
> +
> +  ConnectionRegister.set(this._connectionIdentifier, this._deferredClose.promise);
> +  this._deferredClose.promise.then(
> +    () => ConnectionRegister.delete(this._connectionIdentifier)
> +  );

couldn't ConnectionRegister register a then handler and cleanup itself?

@@ +998,5 @@
> + * A light-weight synchronization barrier provided for clients who
> + * need to trigger a treatment during shutdown but before Sqlite.jsm
> + * is shutdown.
> + */
> +let barrier = new Barrier("Sqlite.jsm shutdown barrier");

what about better naming like shutdownBarrier or closeBarrier

@@ +1017,5 @@
> +    // Prevent any new opening.
> +    ConnectionRegister.close();
> +
> +    // Now, wait until all databases are closed
> +    return Promise.all(ConnectionRegister.values());

I feel like this detail should be hidden by a ConnectionRegister API, it's not really clear at all that ConnectionRegister.values() is closed-promises...

@@ +1039,4 @@
>  this.Sqlite = {
>    openConnection: openConnection,
> +  cloneStorageConnection: cloneStorageConnection,
> +  shutdown: barrier.client,

I have no idea what this is supposed to be and how it it supposed to work, shutdown = client? Either naming is relly bad or a lot of documentation is missing here :)

Is there a reason we must expose the bare shutdown blocker object rather than a nicely named method like Sqlite.addShutdownBlocker? Is this useful only to tests? I have some fear consumers may think they _must_ use it if we don't add a comment here or name it as addTestShutdownBlocker

::: toolkit/modules/tests/xpcshell/test_sqlite.js
@@ +939,5 @@
> +      dbOpened = true;
> +      Task.spawn(function*() {
> +        yield db.close();
> +        dbClosed = true;
> +      });

db.close().then may be simpler, do we really need to spawn a task for a single yield?

@@ +947,5 @@
> +  assertions.push({name: "dbOpened", value: () => dbOpened});
> +  assertions.push({name: "dbClosed", value: () => dbClosed});
> +
> +  do_print("Now shutdown Sqlite.jsm synchronously");
> +  Services.obs.notifyObservers(null, "profile-before-change", null);

would it be possible to directly call observe() on AsyncShutdown? We should try to avoid firing system wide notifications in tests, you can't predict how other subsystems may react.

@@ +961,5 @@
> +    yield getDummyDatabase("opened after shutdown");
> +  } catch (ex) {
> +    exn = ex;
> +  }
> +  do_check_true(!!exn);

please use Assert.throws(callback, expectedException) and check  you are catching the right error (by passing the msg string)
Attachment #8415264 - Flags: review?(mak77) → feedback+
(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?" - I'll be away on May 8th-12th) from comment #22)
> I believe that everybody is in agreement, I'll make this
> AsyncShutdown.Barrier.

uh sorry I had missed the previous comment about that :)
note that to extend Map you may also use __nosuchmethod__ and forward any unknown methods to the underlying Map
Comment on attachment 8416072 [details]
Outline of the new API

I believe that since we're making this generic, we should have a .removeBlocker method as well.

AsyncShutdown.profileBeforeChange will become a barrier-client?

This looks good. I really do want this documentation as a real patch which adds an in-tree .rst file.
Attachment #8416072 - Flags: feedback?(benjamin) → feedback+
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #26)
> Comment on attachment 8416072 [details]
> Outline of the new API
> 
> I believe that since we're making this generic, we should have a
> .removeBlocker method as well.

Ok, that shouldn't be too hard. There is a chance that this might also make the Sqlite.jsm code simpler, too.

> AsyncShutdown.profileBeforeChange will become a barrier-client?

Exactly.

> 
> This looks good. I really do want this documentation as a real patch which
> adds an in-tree .rst file.

Ok, will do. Thanks for the feedback.
(In reply to Marco Bonardo [:mak] from comment #23)
> ::: toolkit/modules/Sqlite.jsm
> @@ +15,5 @@
> >  Cu.import("resource://gre/modules/Services.jsm");
> >  Cu.import("resource://gre/modules/XPCOMUtils.jsm");
> >  Cu.import("resource://gre/modules/Log.jsm");
> > +let {AsyncShutdown, Barrier} = Cu.import("resource://gre/modules/AsyncShutdown.jsm");
> > +Cu.import("resource://gre/modules/Task.jsm");
> 
> Would be possible to do all the init work the first time a connection is
> opened?

[...]

I'll see what I can do. I'm not sure going fully lazy is really useful, but lazifying calls to Cu.import() might help (bug 859912). Also, bsmedberg's suggestion of adding removeBlocker made it possible to rewrite the code in a much more elegant manner.

> @@ +1039,4 @@
> >  this.Sqlite = {
> >    openConnection: openConnection,
> > +  cloneStorageConnection: cloneStorageConnection,
> > +  shutdown: barrier.client,
> 
> I have no idea what this is supposed to be and how it it supposed to work,
> shutdown = client? Either naming is relly bad or a lot of documentation is
> missing here :)
> 
> Is there a reason we must expose the bare shutdown blocker object rather
> than a nicely named method like Sqlite.addShutdownBlocker? Is this useful
> only to tests? I have some fear consumers may think they _must_ use it if we
> don't add a comment here or name it as addTestShutdownBlocker

Well, `client` has exactly two methods (used to have only one):
- addBlocker;
- removeBlocker.

The idea is to write Sqlite.shutdown.addBlocker or now Sqlite.shutdown.removeBlocker. Not very different from addShutdownBlocker, just (imho) more readable.

> @@ +947,5 @@
> > +  assertions.push({name: "dbOpened", value: () => dbOpened});
> > +  assertions.push({name: "dbClosed", value: () => dbClosed});
> > +
> > +  do_print("Now shutdown Sqlite.jsm synchronously");
> > +  Services.obs.notifyObservers(null, "profile-before-change", null);
> 
> would it be possible to directly call observe() on AsyncShutdown? We should
> try to avoid firing system wide notifications in tests, you can't predict
> how other subsystems may react.
> 
> @@ +961,5 @@
> > +    yield getDummyDatabase("opened after shutdown");
> > +  } catch (ex) {
> > +    exn = ex;
> > +  }
> > +  do_check_true(!!exn);
> 
> please use Assert.throws(callback, expectedException) and check  you are
> catching the right error (by passing the msg string)

Unfortunately, Assert.throws didn't work with async errors. Might be worth filing a followup bug.
Applied your feedback. Also, I have attempted to minimize startup-time use of Cu.import() and used removeBlocker to simplify the mechanism.
Attachment #8415264 - Attachment is obsolete: true
Attachment #8418366 - Flags: review?(mak77)
- Added removeBlocker mechanism.
- Renamed Barrier to AsyncShutdown.Barrier.

I'll add the documentation as a separate patch once I have had the time to write and format it.
Attachment #8415262 - Attachment is obsolete: true
Attachment #8418369 - Flags: review?(nfroyd)
Comment on attachment 8418369 [details] [diff] [review]
1. Lightweight AsyncShutdown barrier mechanism., v3

Review of attachment 8418369 [details] [diff] [review]:
-----------------------------------------------------------------

Looks fine; I assume the documentation is done later.  f+ only because I don't understand what _trigger is used for, so the patch looks incomplete.

::: toolkit/modules/AsyncShutdown.jsm
@@ +245,5 @@
> +      spinner.addBlocker(name, condition, state);
> +    },
> +    removeBlocker: function(condition) {
> +      spinner.removeBlocker(condition);
> +    },

Documentation needed here for later.

@@ +250,5 @@
> +    /**
> +     * Trigger the phase without having to broadcast a
> +     * notification. For testing purposes only.
> +     */
> +    get _trigger() {

This does not seem to be used in the test code?

@@ +293,5 @@
>     * before we proceed to the next runstate. If |action| returns a promise,
>     * we wait until the promise is resolved/rejected before proceeding
>     * to the next runstate.
>     */
> +  addBlocker: function(name, condition, state) {

I think you want to modify the doc comment for this function now.

@@ +296,5 @@
>     */
> +  addBlocker: function(name, condition, state) {
> +    this._barrier.client.addBlocker(name, condition, state);
> +  },
> +  removeBlocker: function(condition) {

I see that you are going to write documentation separately, but this needs documentation. :)

@@ +397,5 @@
> + */
> +function Barrier(name) {
> +  /**
> +   * The set of conditions registered by clients, as a
> +   * Map name => Set({name, condition, state}).

This documentation is not correct, the keys are functions, not the string names.  And the members of the set are {name, state}.
Attachment #8418369 - Flags: review?(nfroyd) → feedback+
(In reply to Nathan Froyd (:froydnj) from comment #31)
> @@ +250,5 @@
> > +    /**
> > +     * Trigger the phase without having to broadcast a
> > +     * notification. For testing purposes only.
> > +     */
> > +    get _trigger() {
> 
> This does not seem to be used in the test code?

This is used in the test of patch 2, at mak's request.
Going to track this since we know we're experiencing slow shutdowns now in FF29 and if this is a low enough risk to uplift, we could try getting it into a Beta next week to test it out with more users.
I'd be a bit nervous about uplifting this to Beta.
Comment on attachment 8418369 [details] [diff] [review]
1. Lightweight AsyncShutdown barrier mechanism., v3

Review of attachment 8418369 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/modules/AsyncShutdown.jsm
@@ +255,5 @@
> +      let accepted = false;
> +      try {
> +        accepted = Services.prefs.getBoolPref("toolkit.asyncshutdown.testing");
> +      } catch (ex) {
> +        // Ignore errors

imo, it should throw. it's usually better to throw in the face of the dev when he's doing something wrong, than to silently return undefined. I'd just put the plain getBoolPref here and also throw if it's false.

nit: btw, why making it a getter rather than a simple method invoking observe? fwiw if you want to protect it you may consider to seal the object, provided you are not setting private properties into it dinamically, and still, the getter protection is very weak.
(In reply to Marco Bonardo [:mak] from comment #35)
> ::: toolkit/modules/AsyncShutdown.jsm
> @@ +255,5 @@
> > +      let accepted = false;
> > +      try {
> > +        accepted = Services.prefs.getBoolPref("toolkit.asyncshutdown.testing");
> > +      } catch (ex) {
> > +        // Ignore errors
> 
> imo, it should throw. it's usually better to throw in the face of the dev
> when he's doing something wrong, than to silently return undefined. I'd just
> put the plain getBoolPref here and also throw if it's false.
> 
> nit: btw, why making it a getter rather than a simple method invoking
> observe? fwiw if you want to protect it you may consider to seal the object,
> provided you are not setting private properties into it dinamically, and
> still, the getter protection is very weak.

Well, the idea was to pretend that the method doesn't exist at all in non-testing mode, more or less as feature detection. I believe it's slightly nicer than throwing an error, although probably not by much. The only non-subjective reason to do this is that we already use the mechanism for `getPhase` somewhere else in the module. If you insist, I can change the mechanism, but in this case, I'll need to change both for consistency. Your call.
Comment on attachment 8418366 [details] [diff] [review]
2. Refactoring Sqlite.jsm to let clients register lightweight blockers, v3

Review of attachment 8418366 [details] [diff] [review]:
-----------------------------------------------------------------

Nothing blocking, though I think the code may be simplified by making Barriers itself lazy, rather than its properties.

::: toolkit/modules/Sqlite.jsm
@@ +16,5 @@
>                                    "resource://gre/modules/AsyncShutdown.jsm");
> +XPCOMUtils.defineLazyModuleGetter(this, "Promise",
> +                                  "resource://gre/modules/Promise.jsm");
> +XPCOMUtils.defineLazyModuleGetter(this, "Services",
> +                                  "resource://gre/modules/Services.jsm");

nit: I guess lazy-fying Services doesn't matter much, in general. it's small and used everywhere, like XPCOMUtils.

@@ +42,5 @@
> +/**
> + * Barriers used to ensure that Sqlite.jsm is shutdown after all
> + * its clients.
> + */
> +let Barriers = {

wouldn't have been simpler to make Barriers itself a lazyGetter (and do all of its init before returning the object) instead of making Barriers.shutdown and Barriers.connections lazy?
that would have removed the need to track _initialized
Something like

XPCOMUtils.defineLazyGetter(this, "Barriers", () => {
  // do all of the init here...
  return Object.seal({
    get connections: ,
    get shutdown:
  });
});
(or maybe in your case you should define the object locally first, since you refer to it in the Task, and then still return it at the end)
this would also protect all of the init code from abuse.

@@ +55,5 @@
> +      return;
> +    }
> +    this._initialized = true;
> +    AsyncShutdown.profileBeforeChange.addBlocker("Sqlite.jsm shutdown blocker",
> +      () => Task.spawn(function* () {

Is not this a Task.async?

@@ +67,5 @@
> +
> +        // Now, wait until all databases are closed
> +        yield Barriers.connections.wait();
> +      }),
> +  

some trailing spaces here

@@ +71,5 @@
> +  
> +      function status() {
> +        if (isClosed) {
> +          // We are waiting for the connections to close. The interesting
> +          // statis is therefore the list of connections still pending.

typo: statis

@@ +73,5 @@
> +        if (isClosed) {
> +          // We are waiting for the connections to close. The interesting
> +          // statis is therefore the list of connections still pending.
> +          return {description: "Waiting for connections to close",
> +                  status: Barriers.connections.status};

nit: for readability I love having spaces around braces when inlining objects like this
return { this: 1,
         that: 2 }

@@ +80,5 @@
> +        // We are still in the first stage: waiting for the barrier
> +        // to be lifted. The interesting status is therefore that of
> +        // the barrier.
> +        return {description: "Waiting for the barrier to be lifted",
> +            status: Barriers.shutdown.status};

indentation

@@ +528,5 @@
> +          // Now that the connection is closed, no need to keep
> +          // a blocker for Barriers.connections.
> +          Barriers.connections.client.removeBlocker(deferred.promise);
> +        } catch (ex if ex instanceof Error
> +                 && isClosed) {

nit: we usually keep && and || at the end of line in if conditions, I don't think this differs

@@ +529,5 @@
> +          // a blocker for Barriers.connections.
> +          Barriers.connections.client.removeBlocker(deferred.promise);
> +        } catch (ex if ex instanceof Error
> +                 && isClosed) {
> +          // removeBlocker throws an Error if we call it too late

is this expected or is there a bug to investigate a way to fix this? if the latter, would be nice to put that bug # here for reference.
I guess this is going over xpcom-shutdown?

@@ +1044,5 @@
>    openConnection: openConnection,
> +  cloneStorageConnection: cloneStorageConnection,
> +  get shutdown() {
> +    return Barriers.shutdown.client;
> +  }

get shutdown() Barriers.shutdown.client

::: toolkit/modules/tests/xpcshell/test_sqlite.js
@@ +910,5 @@
>  });
> +
> +
> +//
> +// -----------  This test must be kept last as it shuts down Sqlite.jsm

nit: or move it to a test_sqlite_shutdown.js test...this test starts growind a bit too much.
Attachment #8418366 - Flags: review?(mak77) → review+
(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?" - I'll be away on May 8th-12th) from comment #36)
> The only
> non-subjective reason to do this is that we already use the mechanism for
> `getPhase` somewhere else in the module. If you insist, I can change the
> mechanism, but in this case, I'll need to change both for consistency. Your
> call.

I don't care _that_ much, it's just me that prefer to have the code explode the louder as possible when something is wrong. If this code went the undefined direction, there's no reason to spend time changing all instances here. just move on.
(In reply to Marco Bonardo [:mak] from comment #37)
> nit: I guess lazy-fying Services doesn't matter much, in general. it's small
> and used everywhere, like XPCOMUtils.

counter-nit: In that case, Task.jsm, osfile.jsm and AsyncShutdown.jsm probably don't matter, as they are all imported before Sqlite.jsm :)

> @@ +42,5 @@
> > +/**
> > + * Barriers used to ensure that Sqlite.jsm is shutdown after all
> > + * its clients.
> > + */
> > +let Barriers = {
> 
> wouldn't have been simpler to make Barriers itself a lazyGetter (and do all
> of its init before returning the object) instead of making Barriers.shutdown
> and Barriers.connections lazy?

Sounds feasible,

> @@ +529,5 @@
> > +          // a blocker for Barriers.connections.
> > +          Barriers.connections.client.removeBlocker(deferred.promise);
> > +        } catch (ex if ex instanceof Error
> > +                 && isClosed) {
> > +          // removeBlocker throws an Error if we call it too late
> 
> is this expected or is there a bug to investigate a way to fix this? if the
> latter, would be nice to put that bug # here for reference.
> I guess this is going over xpcom-shutdown?

This is expected. `removeBlocker` throws an `Error` if we call it after `wait`, i.e. when removing a blocker doesn't make sense anymore.

> @@ +1044,5 @@
> >    openConnection: openConnection,
> > +  cloneStorageConnection: cloneStorageConnection,
> > +  get shutdown() {
> > +    return Barriers.shutdown.client;
> > +  }
> 
> get shutdown() Barriers.shutdown.client

Do you find this more readable?

> ::: toolkit/modules/tests/xpcshell/test_sqlite.js
> @@ +910,5 @@
> >  });
> > +
> > +
> > +//
> > +// -----------  This test must be kept last as it shuts down Sqlite.jsm
> 
> nit: or move it to a test_sqlite_shutdown.js test...this test starts growind
> a bit too much.

Fine with me.
(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?" - I'll be away on May 8th-12th) from comment #39)
> (In reply to Marco Bonardo [:mak] from comment #37)
> > nit: I guess lazy-fying Services doesn't matter much, in general. it's small
> > and used everywhere, like XPCOMUtils.
> 
> counter-nit: In that case, Task.jsm, osfile.jsm and AsyncShutdown.jsm
> probably don't matter, as they are all imported before Sqlite.jsm :)

let's say probability Services.jsm has already be imported is 99%, probability the others have, is much lower. there's not just Firefox using this code, so you can't guess an order from our code. nit nit nit.

> > get shutdown() Barriers.shutdown.client
> 
> Do you find this more readable?

it's more compact and the return doesn't add anything to it... personal taste I guess :)
Main differences with previous:
- added documentation (additional documentation coming in a following patch);
- experience using removeBlocker and writing doc shows that we always resolve the blocker when we call removeBlocker, so removeBlocker will now always resolve the blocker;
- moving the timeout check from Spinner to Barrier;
- more tests.

I have no specific plans of further changes, so I hope this is a final patch (depending on review).
Attachment #8418369 - Attachment is obsolete: true
Attachment #8420917 - Flags: review?(nfroyd)
Attached patch 3. Documentation (obsolete) — Splinter Review
Documentation patch. r? Nathan for the contents, Ted for the build.
Attachment #8416072 - Attachment is obsolete: true
Attachment #8420926 - Flags: review?(ted)
Attachment #8420926 - Flags: review?(nfroyd)
Attached patch 3. Documentation (obsolete) — Splinter Review
Same one, with the missing qref.
Attachment #8420926 - Attachment is obsolete: true
Attachment #8420926 - Flags: review?(ted)
Attachment #8420926 - Flags: review?(nfroyd)
Attachment #8420927 - Flags: review?(ted)
Attachment #8420927 - Flags: review?(nfroyd)
Comment on attachment 8420927 [details] [diff] [review]
3. Documentation

Review of attachment 8420927 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/modules/docs/AsyncShutdown.rst
@@ +14,5 @@
> +
> +Consider a service FooService. At some point during the shutdown of the process, this service needs to:
> +- inform its clients that it is about to shut down;
> +- wait until the clients have completed their final operations based on FooService (often asynchronously);
> +- only then shut itself down.

I think the first requirement here isn't really a requirement.  It's not addressed by the capabilities of Barrier that you describe below, so having three conditions here makes the two capabilities look incomplete.  I'd take out the first requirement.

@@ +24,5 @@
> +Shutdown timeouts
> +------------------
> +
> +By design, an instance of ``AsyncShutdown.Barrier`` will cause a crash
> +if it takes more than 60 seconds `awake` for its clients to lift or remove

|awake| needs to be more descriptive.

@@ +33,5 @@
> +If the CrashReporter is enabled, this crash will report:
> +- the name of the barrier that failed;
> +- for each blocker that has not been released yet:
> +  - the name of the blocker;
> +  - the state of the blocker, if a state function has been provided (see :ref:`AsyncShutdown.Barrier.state`). 

Nit: trailing whitespace.

@@ +58,5 @@
> +
> +Example 2: Simple Barrier owner
> +----------------------------
> +
> +The following snippet presents an example of a service FooService that wishes to ensure that all clients have had a chance to complete any treatment before FooService shuts down.

"treatment" sounds unusual here.  I think "outstanding operations" would be the more standard phrase.  Here and throughout the docs.
Attachment #8420927 - Flags: review?(nfroyd) → review+
Comment on attachment 8420917 [details] [diff] [review]
1. Lightweight AsyncShutdown barrier mechanism., v4

Review of attachment 8420917 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/modules/AsyncShutdown.jsm
@@ +337,5 @@
> +      crashAfterMS: DELAY_CRASH_MS
> +    });
> +
> +    // Now, spin the event loop
> +    promise.then(() => satisfied = true); // This promise cannot reject 

Nit: trailing whitespace.

::: toolkit/modules/tests/xpcshell/test_AsyncShutdown.js
@@ +39,5 @@
> + * a AsyncShutdown barrier.
> + *
> + * @return An object with the following methods:
> + *   - addBlocker() - the same method as AsyncShutdown phases and barrier clients
> + *   - wait() - trigger the resolution of the lock, 

Nit: trailing whitespace.
Attachment #8420917 - Flags: review?(nfroyd) → review+
Attachment #8420927 - Flags: review?(ted) → review+
Applied feedback.
Attachment #8420917 - Attachment is obsolete: true
Attachment #8421205 - Flags: review+
(In reply to Nathan Froyd (:froydnj) from comment #45)
> Comment on attachment 8420927 [details] [diff] [review]
> 3. Documentation
> 
> Review of attachment 8420927 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/modules/docs/AsyncShutdown.rst
> @@ +14,5 @@
> > +
> > +Consider a service FooService. At some point during the shutdown of the process, this service needs to:
> > +- inform its clients that it is about to shut down;
> > +- wait until the clients have completed their final operations based on FooService (often asynchronously);
> > +- only then shut itself down.
> 
> I think the first requirement here isn't really a requirement.  It's not
> addressed by the capabilities of Barrier that you describe below, so having
> three conditions here makes the two capabilities look incomplete.  I'd take
> out the first requirement.

I'm not sure what you mean here.

 FooService.shutdown.addBlocker("foo", doSomething);

This will trigger doSomething when FooService is about to shut down, which is #1 above.
Attached patch 3. Documentation, v2 (obsolete) — Splinter Review
Applied feedback.
Attachment #8420927 - Attachment is obsolete: true
(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?" - I'll be away on May 8th-12th) from comment #48)
> (In reply to Nathan Froyd (:froydnj) from comment #45)
> > Comment on attachment 8420927 [details] [diff] [review]
> > 3. Documentation
> > 
> > Review of attachment 8420927 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > ::: toolkit/modules/docs/AsyncShutdown.rst
> > @@ +14,5 @@
> > > +
> > > +Consider a service FooService. At some point during the shutdown of the process, this service needs to:
> > > +- inform its clients that it is about to shut down;
> > > +- wait until the clients have completed their final operations based on FooService (often asynchronously);
> > > +- only then shut itself down.
> > 
> > I think the first requirement here isn't really a requirement.  It's not
> > addressed by the capabilities of Barrier that you describe below, so having
> > three conditions here makes the two capabilities look incomplete.  I'd take
> > out the first requirement.
> 
> I'm not sure what you mean here.
> 
>  FooService.shutdown.addBlocker("foo", doSomething);
> 
> This will trigger doSomething when FooService is about to shut down, which
> is #1 above.

Ah, right.  OK, then I think stylistically, it'd be better to say something like:

"Before FooService shuts down, FooService needs to:

- inform its clients that it is about to shut down;
- wait until the clients have completed their interactions with FooService.  This waiting often needs to happen asynchronously.

``AsyncShutdown.Barrier`` addresses these requirements in the following ways:..."

Or even dispense with the requirements and just state the capabilities of AsyncShutdown.Barrier:

"``AsyncShutdown.Barrier`` provides the following capabilities:

- a capability ``client`` that may be published to users of FooService, enabling them to be notified prior to the shutdown of FooService
- methods for the owner of the barrier to wait until all clients have completed their outstanding operations..."

making the requirements implicit.
I'll land the docfixes as a followup patch.
https://hg.mozilla.org/mozilla-central/rev/01bc471d7a7f
https://hg.mozilla.org/mozilla-central/rev/a2405cd005f6
https://hg.mozilla.org/mozilla-central/rev/decf35a875bf
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [async][fixed-in-fx-team] → [async]
Target Milestone: --- → mozilla32
Let's get this nominated for uplift and into Thursday's beta so we have the most time to get this in front of that larger user population to shake out potential issues from this work and verify the fix.
Flags: needinfo?(dteller)
I'm not completely sure this would be useful. Mak?
Flags: needinfo?(dteller) → needinfo?(mak77)
I don't think uplifting this to beta is feasible. What's the driver for pushing on this? I thought we'd concluded that the 29 shutdown hangs were mostly related to Seer or bug 965309, and those are being handled separately.
If this isn't related to the shutdown hangs, then there is no need to rush.  We can let it ride the trains.
>> Slow shut downs lead to 'Firefox is already running' warning (see 966469 and 985655)

Will I have to really wait till FF version 32 for this fix to come via the release channel?  This is impacting me a few times on a daily basis on one of my machines, making me not updating Firefox on the other two.

Will I have to keep dismissing "A security and stability update for Firefox is available, Firefox 29.0.1" that keeps on showing up on these other two machines, and in more than one prompt window if Firefox is left open unattended, again till FF version 32?
(In reply to Alok from comment #61)
> >> Slow shut downs lead to 'Firefox is already running' warning (see 966469 and 985655)
> 
> Will I have to really wait till FF version 32 for this fix to come via the
> release channel?  This is impacting me a few times on a daily basis on one
> of my machines, making me not updating Firefox on the other two.

This fix is unrelated to your issue.
Comment on attachment 8421205 [details] [diff] [review]
1. Lightweight AsyncShutdown barrier mechanism., v5

Approval Request Comment
[Feature/regressing bug #]: bug 888373
[User impact if declined]: Cannot uplift bug 1008944, which fixes random shutdown freeze/crashes.
[Describe test coverage new/current, TBPL]: This has been on m-c (now Aurora) for one month and is used in many places.
[Risks and why]: None.
[String/UUID change made/needed]: None.
Attachment #8421205 - Flags: approval-mozilla-beta?
Comment on attachment 8421205 [details] [diff] [review]
1. Lightweight AsyncShutdown barrier mechanism., v5

I discussed with David IRL about this bug.
We agreed that bug 1008944 is not a topcrash (the topcrash is in experiment.jsm) and uplifting a refactoring in beta 9 is too risky.
Attachment #8421205 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: