Closed Bug 913899 Opened 11 years ago Closed 11 years ago

[OS.File] Ensure that OS.File doesn't lose data during shutdown

Categories

(Toolkit Graveyard :: OS.File, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla26

People

(Reporter: Yoric, Assigned: Yoric)

References

(Blocks 1 open bug)

Details

(Keywords: dataloss, Whiteboard: [Async:P1][for intermittent orange, see comment 85])

Attachments

(3 files, 25 obsolete files)

8.70 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
15.87 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
8.81 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
Bug 911820 shows that we need to spin the event loop on shutdown to wait for all pending operations to complete.
DXR ( http://dxr.mozilla.org/mozilla-central/search?tree=mozilla-central&q=processNextEvent&redirect=true ) seems to indicate that we have too many JS components that spin the event loop already (on shutdown, I guess, but I have only checked a few). Rather than adding yet another one, I wonder whether it might be interesting to group all these spins in a single loop.
Attached file SpinUtils.jsm (obsolete) —
Here's a first draft. In a first time, this utility will be shared between the shared OS.File thread and the thread spawned by SessionRestore, that uses OS.File.
Assignee: nobody → dteller
Attachment #801269 - Flags: feedback?(nfroyd)
Comment on attachment 801269 [details]
SpinUtils.jsm

Actually, that needs more thinking.
Attachment #801269 - Attachment is obsolete: true
Attachment #801269 - Flags: feedback?(nfroyd)
Comment on attachment 801525 [details] [diff] [review]
Spinning the event loop on OS.File shutdown

Review of attachment 801525 [details] [diff] [review]:
-----------------------------------------------------------------

r=me somewhat reluctantly.  I like the idea of SpinUtils.jsm and unifying all the places we currently spin the event loop in JS; I don't think it's particularly easy to do, though...  (Even better would be if you could provide some sort of callback interface where the callback is called after the event loop is spun appropriately.  Then maybe the spinning could even be moved into C++ and the C++ spinners could use it too.  And then eventually, we can make spinning the event loop go away entirely.)

::: toolkit/components/osfile/modules/osfile_async_front.jsm
@@ +143,5 @@
> +   * |true| once we have sent at least one message to the worker.
> +   */
> +  launched: false,
> +  /**
> +   * |true| once shutdown has startedand we need to reject any

Nit: "started and".

@@ +283,5 @@
> +  // Start shutting down the worker. Don't return until the worker has properly
> +  // finished, otherwise the worker may be killed in the middle of an operation,
> +  // which could lose data.
> +
> +  let shutdown = false; // |true| once we know that shutdown has taken place.

Can we call this something else?  Maybe |workerHasShutdown| or something, to avoid mental conflict with Scheduler.shutdown.

@@ +316,5 @@
> +    shutdown = true;
> +  });
> +
> +  while(!shutdown) {
> +    Services.tm.currentThread.processNextEvent(true);

We have to be on the main thread here, correct?  Can we just use Services.tm.mainThread?

@@ +317,5 @@
> +  });
> +
> +  while(!shutdown) {
> +    Services.tm.currentThread.processNextEvent(true);
> +  }

Blech.  We need to stop doing this, but I don't know of a better way to avoid it.
Attachment #801525 - Flags: review?(nfroyd) → review+
Attached patch Runstate dependencies (obsolete) — Splinter Review
Let's go back to the idea of providing a general-purpose mechanism. Normally, this mechanism has been designed to be sufficient for the Add-on manager, for off main thread Telemetry and for Session Restore.

Irving, can you confirm that this mechanism be sufficient for you?
Attachment #801525 - Attachment is obsolete: true
Attachment #801598 - Flags: feedback?(nfroyd)
Attachment #801598 - Flags: feedback?(irving)
Summary: [OS.File] Spin the event loop on shutdown → [OS.File] Ensure that OS.File doesn't lose data during shutdown
Comment on attachment 801598 [details] [diff] [review]
Runstate dependencies

Review of attachment 801598 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/modules/RunStateDependency.jsm
@@ +56,5 @@
> +
> +/**
> + * Register a new client for a given runstate.
> + *
> + * Registering a client twice is idempotent.

This needs clearer explanation:

 - can clients register for more than one run state? I don't see any reason why we should prevent this

So, something like: Later calls with the same runstate and name replace previous registrations for that runstate/name pair

@@ +60,5 @@
> + * Registering a client twice is idempotent.
> + *
> + * @param {object} client A client dependent of this shutdown stage.
> + * Must contain fields:
> + * - {string} runstate A runstate (e.g. "profile-before-change").

Because we can only specify run states here, I can't quite get what I need, which is inter-module dependencies (e.g. AddonManager must complete its p-b-c handing before OS.File begins)

We could hack that in a couple of ways, either by having Addon Manager loop on p-b-c and OS.File on p-b-c2, or by having OS.File notify "osfile-shutting-down" and having its clients register RunStateDependency on that instead of p-b-c

@@ +63,5 @@
> + * Must contain fields:
> + * - {string} runstate A runstate (e.g. "profile-before-change").
> + * - {string}  name The human-readable name of the client. Used
> + * for debugging/error reporting.
> + * - {function} action An action that needs to be completed

Can we pass the arguments as separate parameters rather than an object? I can see an argument for extensibility here, but I'm not sure we need it.
Attachment #801598 - Flags: feedback?(irving) → feedback-
Comment on attachment 801598 [details] [diff] [review]
Runstate dependencies

Review of attachment 801598 [details] [diff] [review]:
-----------------------------------------------------------------

Irving has good comments.  I think before this goes in we should attempt to convert several event-loop-spinning bits in JS to use this so we can assure ourselves we've caught most of the use-cases.  I recall looking at the spinners and their forms varied somewhat.

::: toolkit/modules/RunStateDependency.jsm
@@ +3,5 @@
> + * You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +/**
> + * A module used to let client register to ensure that their work is complete
> + * before we move to the next runstate (i.e. before we move to the next step

The connection between runstates and XPCOM notifications that we send during shutdown should be made more explicit, either by not inventing a separate term, or...something else.
Attachment #801598 - Flags: feedback?(nfroyd)
(In reply to :Irving Reid from comment #7)
> Comment on attachment 801598 [details] [diff] [review]
> Runstate dependencies
> 
> Review of attachment 801598 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/modules/RunStateDependency.jsm
> @@ +56,5 @@
> > +
> > +/**
> > + * Register a new client for a given runstate.
> > + *
> > + * Registering a client twice is idempotent.
> 
> This needs clearer explanation:
> 
>  - can clients register for more than one run state? I don't see any reason
> why we should prevent this
> 
> So, something like: Later calls with the same runstate and name replace
> previous registrations for that runstate/name pair

Sure.

> @@ +60,5 @@
> > + * Registering a client twice is idempotent.
> > + *
> > + * @param {object} client A client dependent of this shutdown stage.
> > + * Must contain fields:
> > + * - {string} runstate A runstate (e.g. "profile-before-change").
> 
> Because we can only specify run states here, I can't quite get what I need,
> which is inter-module dependencies (e.g. AddonManager must complete its
> p-b-c handing before OS.File begins)

I don't see the problem here. You want the AddonManager to have written all its data before profile-before-change completes, don't you? I don't see why this would require anything specific from OS.File beginning anything.

> We could hack that in a couple of ways, either by having Addon Manager loop
> on p-b-c and OS.File on p-b-c2, or by having OS.File notify
> "osfile-shutting-down" and having its clients register RunStateDependency on
> that instead of p-b-c
> 
> @@ +63,5 @@
> > + * Must contain fields:
> > + * - {string} runstate A runstate (e.g. "profile-before-change").
> > + * - {string}  name The human-readable name of the client. Used
> > + * for debugging/error reporting.
> > + * - {function} action An action that needs to be completed
> 
> Can we pass the arguments as separate parameters rather than an object? I
> can see an argument for extensibility here, but I'm not sure we need it.

The argument is for not getting the order of arguments wrong.
(In reply to David Rajchenbach Teller [:Yoric] from comment #9)
> (In reply to :Irving Reid from comment #7)
> 
> I don't see the problem here. You want the AddonManager to have written all
> its data before profile-before-change completes, don't you? I don't see why
> this would require anything specific from OS.File beginning anything.

Actually I don't really care (much) about profile-before-change, what I really want is that AddonManager has written everything before OS.File becomes unavailable. Then I don't have to worry about lack of a guarantee about what order p-b-c (or any other event) observers are notified. The lack of a framework to declare that sort of direct dependency is what leaves us needing to hack in things like profile-before-change2, and the FHR multi step shutdown and event loop to make sure it finishes before MozStorage.

> > Can we pass the arguments as separate parameters rather than an object? I
> > can see an argument for extensibility here, but I'm not sure we need it.
> 
> The argument is for not getting the order of arguments wrong.

Well, I think we'd end up pretty deep in the bicycle shed if we started arguing about named vs. positional parameters in programming languages; JS is a positional parameter language, and the vast majority of our APIs expect programmers to get the arguments in the right order.
(In reply to :Irving Reid from comment #10)
> (In reply to David Rajchenbach Teller [:Yoric] from comment #9)
> > (In reply to :Irving Reid from comment #7)
> > 
> > I don't see the problem here. You want the AddonManager to have written all
> > its data before profile-before-change completes, don't you? I don't see why
> > this would require anything specific from OS.File beginning anything.
> 
> Actually I don't really care (much) about profile-before-change, what I
> really want is that AddonManager has written everything before OS.File
> becomes unavailable. Then I don't have to worry about lack of a guarantee
> about what order p-b-c (or any other event) observers are notified. The lack
> of a framework to declare that sort of direct dependency is what leaves us
> needing to hack in things like profile-before-change2, and the FHR multi
> step shutdown and event loop to make sure it finishes before MozStorage.

I agree about profile-before-change2, but perhaps that's something we can handle in a different bug. For the add-on manager, you only need to return a promise to RunStateDependency that is resolved only once you are satisfied that all your data is written. No need to complicate the code further.

> Well, I think we'd end up pretty deep in the bicycle shed if we started
> arguing about named vs. positional parameters in programming languages; JS
> is a positional parameter language, and the vast majority of our APIs expect
> programmers to get the arguments in the right order.

Once we have several arguments with the same type, I tend to move to labelled parameters. I don't care all that much, so let's keep bikeshedding for the actual review.
(In reply to David Rajchenbach Teller [:Yoric] from comment #11)
> (In reply to :Irving Reid from comment #10)
> > (In reply to David Rajchenbach Teller [:Yoric] from comment #9)
> > > (In reply to :Irving Reid from comment #7)
> > > 
> > > I don't see the problem here. You want the AddonManager to have written all
> > > its data before profile-before-change completes, don't you? I don't see why
> > > this would require anything specific from OS.File beginning anything.
> > 
> > Actually I don't really care (much) about profile-before-change, what I
> > really want is that AddonManager has written everything before OS.File
> > becomes unavailable. Then I don't have to worry about lack of a guarantee
> > about what order p-b-c (or any other event) observers are notified. The lack
> > of a framework to declare that sort of direct dependency is what leaves us
> > needing to hack in things like profile-before-change2, and the FHR multi
> > step shutdown and event loop to make sure it finishes before MozStorage.
> 
> I agree about profile-before-change2, but perhaps that's something we can
> handle in a different bug. For the add-on manager, you only need to return a
> promise to RunStateDependency that is resolved only once you are satisfied
> that all your data is written. No need to complicate the code further.

So if OS.File registers a p-b-c shutdown event loop spinner and the addon-manager does so also, and the spinners are run in just that order, OS.File will still be available when the addon-manager's spinner is called?  That does not appear to be the case with the current code.

> > Well, I think we'd end up pretty deep in the bicycle shed if we started
> > arguing about named vs. positional parameters in programming languages; JS
> > is a positional parameter language, and the vast majority of our APIs expect
> > programmers to get the arguments in the right order.
> 
> Once we have several arguments with the same type, I tend to move to
> labelled parameters. I don't care all that much, so let's keep bikeshedding
> for the actual review.

I think JavaScript's wacko appropriation of keyword-style argument lists is fine.  We use it in OS.File and it's worked out well.  If other JS reviewer folks don't have any objections...
(In reply to Nathan Froyd (:froydnj) from comment #12)
> > I agree about profile-before-change2, but perhaps that's something we can
> > handle in a different bug. For the add-on manager, you only need to return a
> > promise to RunStateDependency that is resolved only once you are satisfied
> > that all your data is written. No need to complicate the code further.
> 
> So if OS.File registers a p-b-c shutdown event loop spinner and the
> addon-manager does so also, and the spinners are run in just that order,
> OS.File will still be available when the addon-manager's spinner is called? 
> That does not appear to be the case with the current code.

You are correct that if both register an event loop spinner, we have a problem.
I didn't intend to register a p-b-c event loop spinner in OS.File, though. If you consider it necessary, I am more than willing to introduce a runstate profile-before-change-osfile-flushed (or something such) that you could use instead of profile-before-change.
Fixed up a few minor errors in David's code to get things working with the WIP implementation for bug 911621.
Attachment #801598 - Attachment is obsolete: true
Attachment #801989 - Flags: feedback?(nfroyd)
Attachment #801989 - Flags: feedback?(dteller)
Attached patch Runstate dependencies, v2 (obsolete) — Splinter Review
Looks like we both updated the module concurrently. Here's my new version, I'll take a look at yours asap.
Attached patch Adapting nsBrowserGlue (obsolete) — Splinter Review
Reworking nsBrowserGlue to RunStateDependency instead of spinning the event loop.
Attached patch Adapting OS.File (obsolete) — Splinter Review
Here's a version of OS.File adapted to use RunStateDependency.
It uses RunStateDependency to:
1. ensure that the shutdown warnings are displayed upon shutdown;
2. flush during profile-before-change;
3. provide a runstate osfile-flush-during-profile-before-change that clients may use to synchronize with this flush.

I am rather satisfied with 1. and 2., but not with 3.
Attachment #802212 - Flags: feedback?(nfroyd)
Attachment #802212 - Flags: feedback?(irving)
Time to discuss strategy, gentlecoders.
While I like RunStateDependency, I believe that it is not mature enough for the next uplift.
Should we try and land the OS.File-spins-the-event-loop patch, with the objective of reverting it in a followup bug?
Comment on attachment 802189 [details] [diff] [review]
Adapting nsBrowserGlue

Review of attachment 802189 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/nsBrowserGlue.js
@@ +389,5 @@
>      this._isPlacesInitObserver = true;
>      os.addObserver(this, "places-database-locked", false);
>      this._isPlacesLockedObserver = true;
>      os.addObserver(this, "distribution-customization-complete", false);
>      os.addObserver(this, "places-shutdown", false);

Remove this line too, yes?

@@ +1189,2 @@
>          function onFailure() {
>            Cu.reportError("Unable to backup bookmarks.");

I wish Cu.reportError took an optional second parameter (like AddonLogger) so that we could say Cu.reportError("message", exception) with more useful structure than exception.toString() gives...
Attached patch Adapting OS.File, v2 (obsolete) — Splinter Review
Actually, flushing OS.File during profile-before-change looks much simpler than I thought. I don't think that we are going to have any conflict with AddonManager. Irving, do you concur?
Attachment #802212 - Attachment is obsolete: true
Attachment #802212 - Flags: feedback?(nfroyd)
Attachment #802212 - Flags: feedback?(irving)
Attachment #802304 - Flags: feedback?(nfroyd)
Attachment #802304 - Flags: feedback?(irving)
Attached patch Adapting SessionStore (obsolete) — Splinter Review
This change should let us ensure that the write of sessionstore.js is never killed while in progress.
Attachment #802323 - Flags: feedback?(ttaubert)
Attachment #802323 - Flags: feedback?(nfroyd)
Comment on attachment 802323 [details] [diff] [review]
Adapting SessionStore

Review of attachment 802323 [details] [diff] [review]:
-----------------------------------------------------------------

Looks reasonable to me.  Tim's the expert here, though.
Attachment #802323 - Flags: feedback?(nfroyd) → feedback+
Comment on attachment 801989 [details] [diff] [review]
updated RunStateDependency WIP patch, various bug fixes

Review of attachment 801989 [details] [diff] [review]:
-----------------------------------------------------------------

Looks sane.

::: toolkit/modules/RunStateDependency.jsm
@@ +41,5 @@
> +
> +this.EXPORTED_SYMBOLS = ["RunStateDependency"];
> +
> +// The list of runstates/topics.
> +// Used to help avoid typoes. Do not hesitate to add topics.

Nit: This is a duplicate comment.
Attachment #801989 - Flags: feedback?(nfroyd) → feedback+
Comment on attachment 802304 [details] [diff] [review]
Adapting OS.File, v2

Review of attachment 802304 [details] [diff] [review]:
-----------------------------------------------------------------

The main_test_osfile_async.js changes don't look relevant to this patch...and they're changing things (e.g. timeouts).  I'd like that split off into a separate patch or a more detailed rationale.

::: toolkit/components/osfile/modules/osfile_async_front.jsm
@@ +995,5 @@
>  OS.File.Error = OSError;
>  OS.File.DirectoryIterator = DirectoryIterator;
> +
> +
> +// Auto-flush OS.File during profile-before-change. This ensures that any I/O

How sure are you that clients using OS.File aren't current/will-be-in-the-future doing writes at profile-before-change?

@@ +1004,5 @@
> +  runstate: "profile-before-change",
> +  name: "flush I/O queued before profile-before-change",
> +  action: function() {
> +    // Wait until the latest currently enqueued promise is satisfied/rejected
> +    return Scheduler.latestPromise.then(null,

Is it technically possible to arrive here with a null |latestPromise|, isn't it?
Attachment #802304 - Flags: feedback?(nfroyd)
(In reply to Nathan Froyd (:froydnj) from comment #24)
> Comment on attachment 802304 [details] [diff] [review]
> Adapting OS.File, v2
> 
> Review of attachment 802304 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> The main_test_osfile_async.js changes don't look relevant to this
> patch...and they're changing things (e.g. timeouts).  I'd like that split
> off into a separate patch or a more detailed rationale.

These tests are testing warnings that I have reimplemented more robustly with RunStateDependency as a test bench for that module (and also because the previous implementation was based on assumptions that turned out wrong). The previous hooks were no longer applicable, so I had to rewrite the tests. I admit that I took the opportunity to rewrite them into something that I managed to debug. That can definitely move to another patch, but it has to remain in the same bug.

> ::: toolkit/components/osfile/modules/osfile_async_front.jsm
> @@ +995,5 @@
> >  OS.File.Error = OSError;
> >  OS.File.DirectoryIterator = DirectoryIterator;
> > +
> > +
> > +// Auto-flush OS.File during profile-before-change. This ensures that any I/O
> 
> How sure are you that clients using OS.File aren't
> current/will-be-in-the-future doing writes at profile-before-change?

I am currently combing m-c for clients that may be doing this already. So far, I haven't found any that does so. There is no guarantee that this won't happen, although I guess I could cook something in a followup bug to warn of such errors.

> @@ +1004,5 @@
> > +  runstate: "profile-before-change",
> > +  name: "flush I/O queued before profile-before-change",
> > +  action: function() {
> > +    // Wait until the latest currently enqueued promise is satisfied/rejected
> > +    return Scheduler.latestPromise.then(null,
> 
> Is it technically possible to arrive here with a null |latestPromise|, isn't
> it?

Yes, it is. If my reviewer's memory serves correctly, Promise.all will handle this nicely.
For testdriving purposes, the untested attached patch intends to demonstrate that RunStateDependency also works with the password manager.

I'll stop adding examples now, before this bug goes too far off-topic.
Attachment #801989 - Attachment is obsolete: true
Attachment #801989 - Flags: feedback?(dteller)
Attachment #802349 - Flags: feedback?(nfroyd)
Attachment #802349 - Flags: feedback?(mak77)
Attached patch Runstate dependencies, v3 (obsolete) — Splinter Review
Small fix: any value that is intended to be used as a promise should be promoted to a promise.
Attachment #802185 - Attachment is obsolete: true
Comment on attachment 802323 [details] [diff] [review]
Adapting SessionStore

Review of attachment 802323 [details] [diff] [review]:
-----------------------------------------------------------------

I'd rather have this in _SessionFile.jsm, though. This feels quite close to the worker implementation of async writes and the SessionSaver doesn't actually care how async writing is implemented.

::: browser/components/sessionstore/src/SessionSaver.jsm
@@ +285,5 @@
>        this.updateLastSaveTime();
>        notify(null, "sessionstore-state-write-complete");
> +    }, Cu.reportError).then(() =>
> +      this._writeInProgress = null;
> +    );

In case of the last write before shutdown failing, we never reset _writeInProgress. Does that mean we would never quit? Should the rejection handler null _writeInProgress was well?

@@ +305,5 @@
> +RunStateDependency.registerClient({
> +  runstate: "profile-before-change",
> +  name: "Ensure that sessionstore.js write is not interrupted",
> +  action: function() {
> +    return SessionStoreInternal._writeInProgress;

Can you please add a proper getter to SessionStoreInternal? I think that would look better than having it access the flag directly.

Also, this needs to be SessionSaverInternal.
Attachment #802323 - Flags: feedback?(ttaubert) → feedback+
Applied feedback. Marking "do not land" as we'll land this as part of bug 914581.
Attachment #802323 - Attachment is obsolete: true
Comment on attachment 802368 [details] [diff] [review]
Adapting SessionStore, v2 (do not land)

Review of attachment 802368 [details] [diff] [review]:
-----------------------------------------------------------------

This patch still contains the SessionSaver.jsm changes... Also, should writeInProgres maybe rather be 'writesInProgress' and be a counter? In theory, there's nothing that prevents me from calling .write() three times in a row, right?
Comment on attachment 802356 [details] [diff] [review]
Runstate dependencies, v3

Review of attachment 802356 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/modules/RunStateDependency.jsm
@@ +56,5 @@
> + * An association from {string} name -> {string} description
> + */
> +let Runstates = new Map();
> +
> +let RunStateDependency = {};

Needs to be "this.RunStateDependency" to avoid breakage with B2G's hacked module loader

@@ +61,5 @@
> +
> +/**
> + * Register a new client for a given runstate.
> + *
> + * Registering a client twice (with the same runstate) is idempotent.

"Subsequent registrations for the same runstate / client name pair discard the old action and replace it with the new one."

@@ +94,5 @@
> + * Must contain fields:
> + * - {string} name
> + * - {Promise} promise
> + */
> +RunStateDependency.registerRunState = function(runstate) {

This could cause problems with lazy module loading. If Module1 does XPCOMUtils.defineLazyModuleGetter("Module2"), and then tries to register a dependency on Module2's run state, it's possible that Module2 hasn't loaded and registered yet.

Aside from that, I see how this lets us register "run states" that don't depend on Observer notifications, with the run state triggered by resolving the supplied promise. To me, using a Promise for this feels a bit clumsy; why not use traditional observe() where necessary and an API call for non-Observer triggers, like:

Spinner.observe = function(subject, topic) {
  Services.observer.removeObserver(this, topic);
  this.runDependents();
}

Spinner.runDependents = function() {
  ... the rest of your current observe() method
}

RunStateDependency.runDependents = function(runstate) {
  // handle unknown runstate and known runstate with no spinner
  let spinner = spinners.get(runstate);
  spinner.runDependents();
}

If we want to get fancy about delaying registering the Observer until a client is registered for the Spinner, we could make this a bit more complicated. I'm not sure that's really necessary.

@@ +123,5 @@
> + * @param {string} topic The runstate.
> + */
> +function Spinner(runstate) {
> +  this._runstate = runstate;
> +  this._clients = new Set(); // set to |null| once it is too late to register

I prefer my version with this as a Map to automatically apply the "one action per runstate / clientName" constraint
Comment on attachment 802304 [details] [diff] [review]
Adapting OS.File, v2

Review of attachment 802304 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/osfile/modules/osfile_async_front.jsm
@@ +268,4 @@
>   */
> +let warnAboutUnclosedFiles = {
> +  runstate: "web-workers-shutdown",
> +  name: "warn about unclosed files",

While DXR makes it relatively easy to find these strings, I'd like to establish a tradition of using the client module name here, something like

name: "OS.File: warn about unclosed files"

(or just name: "OS.File" if it's the only run state dependency registered by this module)

@@ +280,5 @@
> +    // Send a "System_shutdown" message to the worker.
> +    return Scheduler.post("System_shutdown").then(function onSuccess(opened) {
> +      let msg = "";
> +      if (opened.openedFiles.length > 0) {
> +        msg += "The following files are still opened:\n" +

s/still opened/still open

@@ +287,5 @@
> +      if (msg) {
> +        msg += "\n";
> +      }
> +      if (opened.openedDirectoryIterators.length > 0) {
> +        msg += "The following directory iterators are still opened:\n" +

s/still opened/still open

@@ +292,5 @@
> +          opened.openedDirectoryIterators.join("\n");
> +      }
> +      // Only log if file descriptors leaks detected.
> +      if (msg) {
> +        LOG("WARNING: File descriptors leaks detected.\n" + msg);

I don't like inserting newlines in log messages; it can make the logs difficult to parse for automatic scanners. I'd put them all on one line unless you think there will be a lot; in that case log one message for each.

@@ +303,5 @@
>  
>  // Attaching an observer for PREF_OSFILE_TEST_SHUTDOWN_OBSERVER to enable or
>  // disable the test shutdown event observer.
>  // Note: By default the PREF_OSFILE_TEST_SHUTDOWN_OBSERVER is unset.
>  // Note: This is meant to be used for testing purposes only.

Can we get this test harness out of the production code and just call RunStateDependency.observe("whatever") directly from your test case to drive the shutdown tests through the module's normal listener? Even if you can't drive all the tests using the module's existing shutdown dependency registration, you can probably move all the code below into the test .js if you just export |warnAboutUnclosedFiles| from the module; then you can get rid of the pref listeners.

Even better, you could decouple the OS.File shutdown warning tests from RunStateDependency completely. For the AddonManager tests, I just have the test harness load the module, call the shutdown() function that *would* have given the promise to the RunStateDependency spinner, and then spin the event loop in the test harness.

@@ +998,5 @@
> +
> +// Auto-flush OS.File during profile-before-change. This ensures that any I/O
> +// that has been queued *before* profile-before-change is properly completed.
> +// To ensure that I/O queued *during* profile-before-change is completed,
> +// clients should register using RunStateDependency.

I would prefer not to have two separate stages in OS.File shutdown, one where we guarantee I/O and another where we forbid it, with clients needing to worry about whether they fall into the gap.

I have a philosophical rant forming in my mind about this, which I will post as a separate comment.
Attachment #802304 - Flags: feedback?(irving) → feedback-
(In reply to Tim Taubert [:ttaubert] from comment #30)
> Comment on attachment 802368 [details] [diff] [review]
> Adapting SessionStore, v2 (do not land)
> 
> Review of attachment 802368 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This patch still contains the SessionSaver.jsm changes... Also, should
> writeInProgres maybe rather be 'writesInProgress' and be a counter? In
> theory, there's nothing that prevents me from calling .write() three times
> in a row, right?

We hold the latest promise, so calling .write() three times strictly in a row should be ok. But you are right, there is an issue here, due to the fact that some unlanded version of SessionStore.jsm (e.g. bug 894595) could theoretically decide to start at an inconvenient moment, while the RunStateDependency mechanism is spinning the event loop and somehow enqueue the promise after we our RunStateDependency client has been satisfied, theoretically ending up with a write that finishes too late. This involves several narrow windows, but it's possible. To fix this, we should introduce a SessionFile.close operation that prevents further writes from being enqueued. Let's move that conversation to bug 914581.
As our I/O gets more and more Promise-based, it will get harder and harder for developers to figure out exactly when things will happen, and I don't think we should impose the burden of understanding the exact event loop behaviour of our Promise mechanisms, and depending on that (possibly changing) behaviour to make sure that their code runs at the right time.

In the current environment, with shutdown events sequenced by Observer notifications, I don't think there's any other way than to force all async I/O clients to have a RunStateDependency on profile-before-change, unless the developer goes to great lengths to make sure they never accidentally fall back into the event loop (because of a promise or a callback) before starting another operation.

Even if we have OS.File do its auto-flush in profile-before-change2 (or even in web-workers-shutdown) we're still at the mercy of event sequencing if any client doesn't make sure they've at least started all their I/O during profile-before-change. Without that, clients could have arbitrary I/O failures because either OS.File is denying requests due to shutdown, or request completion is no longer guaranteed because the flush has already happened.

My head is full of ideas about how to directly sequence the module dependencies, with a directed graph of "this has to be shut down before that" driving an async shutdown process; by letting each service explicitly declare what it depends on we can make the mystery of "what should I observe in order to shut down safely" a thing of the past.

I suspect we won't reach that golden place any time soon, so I think our medium term goal should be:

* any module that does async I/O (either OS.File or async Storage) must have RunStateDependency on an Observer event that triggers before the underlying service becomes unavailable (profile-before-change would be fine, but we could also use service-specific events)

* Other module shutdown dependencies should be investigated (e.g. I think the addon blocklist service asynchronously calls AddonManager)

* OS.File and Storage should forbid new calls before flushing in-progress work, so that it's not possible for a client to submit requests that may never complete.
(In reply to :Irving Reid from comment #31)
> This could cause problems with lazy module loading. If Module1 does
> XPCOMUtils.defineLazyModuleGetter("Module2"), and then tries to register a
> dependency on Module2's run state, it's possible that Module2 hasn't loaded
> and registered yet.

This should be easy to find out/debug.

> Aside from that, I see how this lets us register "run states" that don't
> depend on Observer notifications, with the run state triggered by resolving
> the supplied promise. To me, using a Promise for this feels a bit clumsy;
> why not use traditional observe() where necessary and an API call for
> non-Observer triggers, like:

I concur that using promises to register run states is a bad idea. For the moment, let's keep only observer notifications, there will be time to add a more powerful extension mechanism later.

> @@ +123,5 @@
> > + * @param {string} topic The runstate.
> > + */
> > +function Spinner(runstate) {
> > +  this._runstate = runstate;
> > +  this._clients = new Set(); // set to |null| once it is too late to register
> 
> I prefer my version with this as a Map to automatically apply the "one
> action per runstate / clientName" constraint

Whereas mine is one action per runstate / client object. I have no strong preference here.
(In reply to :Irving Reid from comment #32)
> Comment on attachment 802304 [details] [diff] [review]
> Adapting OS.File, v2
> 
> Review of attachment 802304 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/components/osfile/modules/osfile_async_front.jsm
> @@ +268,4 @@
> >   */
> > +let warnAboutUnclosedFiles = {
> > +  runstate: "web-workers-shutdown",
> > +  name: "warn about unclosed files",
> 
> While DXR makes it relatively easy to find these strings, I'd like to
> establish a tradition of using the client module name here, something like
> 
> name: "OS.File: warn about unclosed files"

Good idea.

> (or just name: "OS.File" if it's the only run state dependency registered by
> this module)

(it's not)

> I don't like inserting newlines in log messages; it can make the logs
> difficult to parse for automatic scanners. I'd put them all on one line
> unless you think there will be a lot; in that case log one message for each.

Let's keep that part of the review for another bug. The warning mechanism has been present in OS.File for some time, I just rewrote it without changing its features or API (well, truth be told, I slightly simplified the API, but I don't want to spend too much time on this right now).

[...]

> @@ +998,5 @@
> > +
> > +// Auto-flush OS.File during profile-before-change. This ensures that any I/O
> > +// that has been queued *before* profile-before-change is properly completed.
> > +// To ensure that I/O queued *during* profile-before-change is completed,
> > +// clients should register using RunStateDependency.
> 
> I would prefer not to have two separate stages in OS.File shutdown, one
> where we guarantee I/O and another where we forbid it, with clients needing
> to worry about whether they fall into the gap.

At that stage, we have not started shutting down OS.File. OS.File actually shuts down at some point of xpcom-shutdown. Here, we flush to give a fighting chance to OS.File clients that had queued I/O before profile-before-change, without realizing that the I/O could possibly be executed after profile-before-change.
(In reply to :Irving Reid from comment #34)
> I suspect we won't reach that golden place any time soon, so I think our
> medium term goal should be:
> 
> * any module that does async I/O (either OS.File or async Storage) must have
> RunStateDependency on an Observer event that triggers before the underlying
> service becomes unavailable (profile-before-change would be fine, but we
> could also use service-specific events)

This is also what I had in mind, at least for critical data.
Example of service-specific events: places-shutdown for everything Places-related.

> * Other module shutdown dependencies should be investigated (e.g. I think
> the addon blocklist service asynchronously calls AddonManager)

Agreed. This is a necessity.

> * OS.File and Storage should forbid new calls before flushing in-progress
> work, so that it's not possible for a client to submit requests that may
> never complete.

This is mostly what I had in mind (either completely forbidding submitting requests or at least printing out warnings), although the details need to be worked out, given that I intend to use OS.File during profile-before-change2. So, I'll be glad to file a followup bug.

I would add the following point:

* Progressively work on replacing event loop spinning with asynchronous observers (https://wiki.mozilla.org/Snappy/AsyncShutdown, bug 722648, bug 819060), once said mechanism is implemented.
Attached patch Runstate dependencies, v4 (obsolete) — Splinter Review
New version of RunStateDependency.jsm:
- registerRunState has grown back into something weaker that takes notification topics, rather than promises - there will be time to add more extension mechanisms later if need arises;
- a few minor fixes.

I have come to the conclusion that we will want to land this before the uplift (or shortly afterwards and uplift it), in particular for bug 911820, and also to avoid potential (minor) data loss in Session Restore. This means that we'll need to freeze features right about now, and proceed quickly to review and super-review.

We need to decide whether we can wait for v2 to start thinking about how to best detect and report misbehaving clients (typically, clients that somehow lock by spinning the event loop themselves, failing to resolve their promise or attempting to nest runstates). I believe that we can afford to wait until we have a little more experience using the module before we start designing and implementing such control, provided that we do not advertise widely this module before we are ready.

If we agree, I believe that we're basically ready for review.
Attachment #802356 - Attachment is obsolete: true
Attachment #802612 - Flags: feedback?
Attachment #802612 - Flags: feedback? → feedback?(nfroyd)
Attached patch Adapting OS.File, v3 (obsolete) — Splinter Review
Trivial changes. Ready for review.
Attachment #802304 - Attachment is obsolete: true
Attachment #802892 - Flags: review?(nfroyd)
Comment on attachment 802892 [details] [diff] [review]
Adapting OS.File, v3

Review of attachment 802892 [details] [diff] [review]:
-----------------------------------------------------------------

Giving this an f+ because I want to go through all this stuff today to make sure everything makes sense.  I think I was just handing out +s like candy yesterday.

::: toolkit/components/osfile/modules/osfile_async_front.jsm
@@ +148,5 @@
> +
> +  /**
> +   * The latest promise returned.
> +   */
> +  latestPromise: null,

This null can still cause problems for the profile-before-change runstate dependency; is it possible that this can just be a null promise (Promise.resolve())?  If that's not feasible, then you need to null-check Scheduler.latestPromise.

@@ +321,5 @@
> +      };
> +      Services.obs.addObserver(observer, TOPIC, false);
> +      RunStateDependency.registerRunState({
> +          topic: TOPIC,
> +          promise: deferred.promise

This should be |action| and a function returning |deferred.promise|, yes?

This should probably also have |name:|.

@@ +1004,5 @@
> +  runstate: "profile-before-change",
> +  name: "OS.File: flush I/O queued before profile-before-change",
> +  action: function() {
> +    // Wait until the latest currently enqueued promise is satisfied/rejected
> +    return Scheduler.latestPromise.then(null,

Or you could check Scheduler.launched here.
Attachment #802892 - Flags: review?(nfroyd) → feedback+
(In reply to Nathan Froyd (:froydnj) from comment #40)
> Comment on attachment 802892 [details] [diff] [review]
> Adapting OS.File, v3
> 
> Review of attachment 802892 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Giving this an f+ because I want to go through all this stuff today to make
> sure everything makes sense.  I think I was just handing out +s like candy
> yesterday.
> 
> ::: toolkit/components/osfile/modules/osfile_async_front.jsm
> @@ +148,5 @@
> > +
> > +  /**
> > +   * The latest promise returned.
> > +   */
> > +  latestPromise: null,
> 
> This null can still cause problems for the profile-before-change runstate
> dependency; is it possible that this can just be a null promise
> (Promise.resolve())?  If that's not feasible, then you need to null-check
> Scheduler.latestPromise.

I don't think it's an issue now that runstate promotes non-promise values to promises, but this this can definitely be replaced with Promise.resolve() to ease our minds.

> 
> @@ +321,5 @@
> > +      };
> > +      Services.obs.addObserver(observer, TOPIC, false);
> > +      RunStateDependency.registerRunState({
> > +          topic: TOPIC,
> > +          promise: deferred.promise
> 
> This should be |action| and a function returning |deferred.promise|, yes?
> 
> This should probably also have |name:|.

Oh, forgot to remove the promise. In the latest version of RunStateDependency, we only care about field |topic| (which replaces field |name|).

> 
> @@ +1004,5 @@
> > +  runstate: "profile-before-change",
> > +  name: "OS.File: flush I/O queued before profile-before-change",
> > +  action: function() {
> > +    // Wait until the latest currently enqueued promise is satisfied/rejected
> > +    return Scheduler.latestPromise.then(null,
> 
> Or you could check Scheduler.launched here.
Attached patch Adapting OS.File, v4 (obsolete) — Splinter Review
Applied feedback. Not putting r? yet because we'll want the RunStateDependency.jsm to be reviewed first anyway.
Attachment #802892 - Attachment is obsolete: true
The design here definitely needs to be superreviewed by someone familiar with shutdown before we land this.
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #43)
> The design here definitely needs to be superreviewed by someone familiar
> with shutdown before we land this.

Do you have someone to suggest?
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #45)
> bsmedberg, ideally, I think.

I agree with this and was going to suggest it myself. :)
Attached patch Runstate dependencies, v5 (obsolete) — Splinter Review
Trivial cleanup.
Also, putting bsmedberg into the loop.
Attachment #802612 - Attachment is obsolete: true
Attachment #802612 - Flags: feedback?(nfroyd)
Attachment #803076 - Flags: feedback?(benjamin)
Comment on attachment 803076 [details] [diff] [review]
Runstate dependencies, v5

I think the general approach is good. The API is oddly named and probably could be clearer.

Naming:
"runstate" is a neologism in the platform. If we're specifically scoping this to shutdown notifications, I think we should call it StateTracker or ShutdownBlocker. If we're not scoping this to observer notifications, then the API should be different anyway...

API:
Currently there is a function (client.action) which then optionally returns a promise. The idea behind this (I think?) is that clients would no longer register directly with the observer service, but would instead call .registerClient? This seems to conflate two things:

* being notified when a particular phase happens
* delaying the end of the phase (via a promise) until some async activity is complete

Perhaps we should separate those:

StateTracker.profileBeforeChange.observe(function() {
  StateTracker.profileBeforeChange.wait(promise);
});

This would allow you to extend this to other scenarios, such as delaying idle observers:

StateTracker.idleObserver.wait(promise); // don't fire idle observers until this promise is resolved

It would also allow clients to block profileBeforeChange or other notifications without worrying about whether we are currently running normally or shutting down.
Attachment #803076 - Flags: feedback?(benjamin)
Notes from today's meeting. If I have missed something, please don't hesitate to comment.

Our long-term objective is to move towards an Async Shutdown mechanism (see https://wiki.mozilla.org/Snappy/AsyncShutdown) that does not rely upon spinning the event loop (see bug 722648, bug 819060). So,  any API should be designed so that eventually moving to such mechanisms should not require a complete reengineering of the clients.

** V1

For v1, we will go for the most minimal primitives, i.e. separate |observe| and |wait|, as suggested by bsmedberg.

Also, for v1 (and the coming uplift), we will only use the service for AddonManager, Session Restore and OS.File itself, and we will neither port other clients to the service nor suggest that other clients should be ported for the service, this will give us some time to gather experience on the service before we release it in the wild. Side-note: in v1, module name will certainly be prefixed with "_".

For the moment, we have our eyes on the following list of notifications/states: places-shutdown, storage-shutdown, profile-before-change, profile-before-change2, xpcom-shutdown. For v1, we will only cover profile-before-change.

We need to be aware that misbehaving clients will appear sooner or later, and will find a way to use this mechanism to cause deadlocks. Tracking them will not be easy. For v1, we will content ourselves with logging something after a timeout.

Oh, finally, name "run-state" does not seem to be a winner. We still need to bikeshed a replacement.

** Longer-term developments

* Eventually, we might decide to rely upon some declarative mechanism (side-note: perhaps an extension of https://developer.mozilla.org/en-US/docs/XPCOM/Receiving_startup_notifications) to track these reliably, but this is long-term.

* Eventually, if we detect that clients are locking Firefox, we will want to crash Firefox (preferably with meaningful logs) rather than letting Firefox spin and drain battery forever. This requires being smart about timeouts, as we do not want to crash Firefox if the computer is sleeping. A possible simple hack to detect sleeping computer is measuring the duration of intervals created with setInterval. Not for v1, but we should start thinking about this before we make this API public.

* Eventually, we may want to observe the stack to detect misbehaving clients.

* Do we want the implementation to remain a jsm or do we want to move this to XPCOM? XPCOM would let us access the same mechanism from C++, which would be a big gain. Such a refactoring can be made without changing the API.

** Additional discussions

* Are (global) notifications the right support for this mechanism, or do we want something more localized (e.g. each service could provide runstate objects for its various initialization and deinitialization phases)? To be discussed further.

* Concern has been expressed about the use of promises and the risk that they might hold onto state for the duration of the whole session, hence hoarding valuable memory. To be discussed further.
For the moment, I am going with "Phase" instead of "RunState". I hope this satisfies everyone.

> For v1, we will go for the most minimal primitives, i.e. separate |observe| and |wait|, as suggested by bsmedberg.

I gave some more thought on the "one primitive" vs. "two primitives" debate. Given that there was no strong opinion on which direction we wanted to head and that both choices ultimately offer the same expressiveness, I went through a few scenarios which rather convinced me that "one primitive" is a better choice, so my next iteration will offer a single primitive.

* Offering a single higher-level primitive gives us more refactoring flexibility. Since we intend to eventually refactor the mechanism to get rid of event loop spinning wherever we can, I believe that this is important. 

* Offering a single higher-level primitive also gives us more information that we will be able to use for tooling/debugging purposes, which is also important.

* We do not yet have a usage scenario for the "two primitives" that does not require us to use these two primitives together and reconstitute the single primitive.

* As suggested by Irving, we can hope that the single primitive encourages clients to do things lazily/asynchronously, which should generally be better for responsiveness and memory hoarding.

* The "two primitives" version makes it easier to get things wrong.

Example 1:

 Phase.observe("profile-before-change2", function() {
   // ...
   Profile.wait("profile-before-change", promise);
      // Oops, wrong phase, deadlock or runtime error
 });

by opposition to

 Phase.addCondition("profile-before-change2", function() {
    // ...
    return promise; // Cannot get the wrong phase
 });


Example 2:

 Phase.observe("profile-before-change", function() {
   let promise = //...
   promise.then(function() {                       // Execute on next tick
     Phase.wait("profile-before-change", promise);
       // Oops, this *might* be too late, non-deterministically.
   });
 });


by opposition to

 Phase.addCondition("profile-before-change2", function() {
    // ...
    return promise; // Same tick, no chance to have left the event loop spinner
 });
Attached patch Phase.jsm (obsolete) — Splinter Review
Here we go with the all-new "Phase" module. See comment 50 for the rationale behind the latest changes. I took the opportunity to add a few tests, too.
Attachment #802189 - Attachment is obsolete: true
Attachment #802349 - Attachment is obsolete: true
Attachment #802368 - Attachment is obsolete: true
Attachment #802959 - Attachment is obsolete: true
Attachment #803076 - Attachment is obsolete: true
Attachment #802349 - Flags: feedback?(nfroyd)
Attachment #802349 - Flags: feedback?(mak77)
Attachment #803685 - Flags: review?(nfroyd)
Attached patch Adapting OS.File, v5 (obsolete) — Splinter Review
Adaptation of OS.File. No major change here.
Attachment #803687 - Flags: review?(nfroyd)
Attached patch Adapting OS.File tests (obsolete) — Splinter Review
Adapting the relevant test of OS.File. Now that the shutdown warning is handled with phases, we had to be a little less heavy-handed with the test. Which is probably good since that test was intermittently failing anyway (bug 866293).
Attachment #803693 - Flags: review?(nfroyd)
Comment on attachment 803685 [details] [diff] [review]
Phase.jsm

Review of attachment 803685 [details] [diff] [review]:
-----------------------------------------------------------------

test_phase.js needs some tests for error cases.  (Fields missing, wrongly typed fields, multiple registration of clients, etc.  If you can do the "phase has already completed" error path, that'd be good too.)

::: toolkit/modules/Phase.jsm
@@ +5,5 @@
> +/**
> + * Representation of a Phase during the shutdown of the application.
> + *
> + * Firefox shutdown is composed of phases that take place
> + * sequentially. In particular, each shutdown phase typically remove

Nit: "removes".

@@ +10,5 @@
> + * some capabilities from the application. For instance, at the end of
> + * phase "profile-before-change", no service is permitted to write to
> + * the profile directory (with the exception of Telemetry). In
> + * particular, this means that any pending profile-bound I/O must be
> + * completed during phase "profile-before-change", or, in other words,

I'd say "completed before the end of phase "profile-before-change" and drop the "in other words" bit.

@@ +13,5 @@
> + * particular, this means that any pending profile-bound I/O must be
> + * completed during phase "profile-before-change", or, in other words,
> + * that the completion of such I/O is a Condition for the completion
> + * of phase "profile-before-change". Indeed, since I/O generally takes
> + * place off the main thread, failing to add this Condition can (and

Nit: no capitalization on Condition.  And this sentence is going to need tweaking.

@@ +21,5 @@
> + * ensure that all add-ons have safely written their data to disk, before
> + * writing its own data. Since the data is saved to the profile, this must
> + * be completed during phase "profile-before-change".
> + *
> + * Phase.addCondition({

I like painting bikesheds.  I don't think addCondition is the right terminology here; addBlocker sounds better IMHO.

@@ +24,5 @@
> + *
> + * Phase.addCondition({
> + *   phase: "profile-before-change",
> + *   name: "Add-on manager: shutting down",
> + *   action: function() {

Onto the bikeshed door; I don't think |action| is the right term here, but I'm not sure what a better one is (waiter?).

@@ +40,5 @@
> +"use strict";
> +
> +let Cu = Components.utils;
> +let Cc = Components.classes;
> +let Ci = Components.interfaces;

const'ify these?

@@ +49,5 @@
> +const DELAY_WARNING_MS = 10 * 1000;
> +
> +function debug(...args) {
> +  dump("*** Phase: " + args.join(" ") + "\n");
> +}

Leftover code?

@@ +126,5 @@
> +   * following fields:
> +   * - {string} topic The notification topic for this Phase.
> +   * @see {https://developer.mozilla.org/en-US/docs/Observer_Notifications}
> +   */
> +  registerPhase: function(phase) {

It's not obvious to me that we want this to be an exposed API at the moment.  Anybody who wants to add a phase should probably be requesting review to make sure the phase makes sense; we don't want the phases proliferating like observer topics.  This should be moved to an internal function.  We can expose it later.

I like Benjamin's Phase.profileBeforeChange.<APICall> notation.  Not sure whether that's too cute and hides the observer topic-ness of what we're doing, though.

@@ +168,5 @@
> +   * to the next runstate.
> +   */
> +  addCondition: function(condition) {
> +    if (!("name" in condition)) {
> +      throw new TypeError("Expected field |name|");

All this checking for the correctness of the form of |condition| should probably be moved to registerCondition, since that's the external interface.

@@ +177,5 @@
> +    if (typeof condition.action != "function") {
> +      throw new TypeError("Field |action| must be a method");
> +    }
> +    if (!this._conditions) {
> +      throw new Error("Phase " + this._runstate.name +

Where does this._runstate get set?  Should it be this._phase.topic?  this._phase.description?  (.description is otherwise unused, AFAICS)

@@ +191,5 @@
> +
> +    // Avoid keeping long-term memory leases.
> +    registeredSpinners.delete(topic);
> +
> +    let satisfied = false; // |true| once we have satisfied all conditions

Let's move this closer to the point of use.
Attachment #803685 - Flags: review?(nfroyd) → feedback+
Comment on attachment 803687 [details] [diff] [review]
Adapting OS.File, v5

Review of attachment 803687 [details] [diff] [review]:
-----------------------------------------------------------------

Looks right, but we still have to figure out the Phase interface.

::: toolkit/components/osfile/modules/osfile_async_front.jsm
@@ +148,5 @@
> +
> +  /**
> +   * The latest promise returned.
> +   */
> +  latestPromise: Promise.resolve("OS.File scheduler hasn't been launched yet"),

Yay!
Attachment #803687 - Flags: review?(nfroyd) → review+
(In reply to Nathan Froyd (:froydnj) from comment #55)
> Looks right, but we still have to figure out the Phase interface.

When writing that, it strikes me that the module name should be ShutdownPhase or similar; Phase is too generic.
Comment on attachment 803693 [details] [diff] [review]
Adapting OS.File tests

Review of attachment 803693 [details] [diff] [review]:
-----------------------------------------------------------------

rs=me; I'm going to assume you know what you're doing, since this is test-only and a very muddled set of changes.
Attachment #803693 - Flags: review?(nfroyd) → review+
Comment on attachment 803685 [details] [diff] [review]
Phase.jsm

Review of attachment 803685 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/modules/Phase.jsm
@@ +96,5 @@
> +   * - {string} phase The topic of the phase. The phase must have previously
> +   * been registered with method |registerPhase|.
> +   * - {string} name The human-readable name of the condition. Used
> +   * for debugging/error reporting. Please make sure that the name
> +   * respects the following model: "Some Service: some action in progress";

add an example, such as

* e.g. "OS.File: flushing pending I/O operations"

@@ +124,5 @@
> +   *
> +   * @param {object} phase The phase to register, as an object with the
> +   * following fields:
> +   * - {string} topic The notification topic for this Phase.
> +   * @see {https://developer.mozilla.org/en-US/docs/Observer_Notifications}

Wrapping a single string parameter in an object is going a little too far, I think.

@@ +126,5 @@
> +   * following fields:
> +   * - {string} topic The notification topic for this Phase.
> +   * @see {https://developer.mozilla.org/en-US/docs/Observer_Notifications}
> +   */
> +  registerPhase: function(phase) {

Observer notification topic names don't need to be parseable as JS identifiers, so we need to keep them as strings in the API.

If we do keep registerPhase, switching to _registerPhase is fine, don't add the complexity of a preference.
Applied feedback, except when Nathan and Irving's feedback went in opposite directions :)
Attachment #803685 - Attachment is obsolete: true
Attachment #803817 - Flags: review?(nfroyd)
Attachment #803817 - Flags: feedback?(irving)
Attached patch Adapting OS.File tests, v2 (obsolete) — Splinter Review
Propagated latest changes.
Attachment #803693 - Attachment is obsolete: true
Attachment #803821 - Flags: review+
Attached patch Adapting OS.File, v6 (obsolete) — Splinter Review
Propagating latest changes
Attachment #803687 - Attachment is obsolete: true
Attachment #803822 - Flags: review+
> * Offering a single higher-level primitive also gives us more information
> that we will be able to use for tooling/debugging purposes, which is also
> important.
> 
> * We do not yet have a usage scenario for the "two primitives" that does not
> require us to use these two primitives together and reconstitute the single
> primitive.

I absolutely do. Take the I/O which happens writing session restore. This happens periodically. When we start writing this data, ideally we'd just be able to do

ShutdownPhase.profileClosing.wait(promiseThatIFinishedWritingThisSessionData);

If session restore also wanted to do a last-ditch session write on shutdown, it could also register to be notified about shutdown. But that's separate!

> * As suggested by Irving, we can hope that the single primitive encourages
> clients to do things lazily/asynchronously, which should generally be better
> for responsiveness and memory hoarding

I'm not sure what this means. I don't think in general that we should be encouraging code to only write state at shutdown. Crashes or power failures are common enough that dirty data should be written regularly.

> * The "two primitives" version makes it easier to get things wrong.
> 
> Example 1:
> 
>  Phase.observe("profile-before-change2", function() {
>    // ...
>    Profile.wait("profile-before-change", promise);
>       // Oops, wrong phase, deadlock or runtime error
>  });

This should assert/throw an exception. You can't wait for something that's already passed.

> Example 2:
> 
>  Phase.observe("profile-before-change", function() {
>    let promise = //...
>    promise.then(function() {                       // Execute on next tick
>      Phase.wait("profile-before-change", promise);
>        // Oops, this *might* be too late, non-deterministically.
>    });
>  });

This is just a programming error, but a similar pattern should work:

Phase.profileShutdown.observe(function() {
  let promise = //...
  Phase.profileShutdown.wait(promise);
});

I think for the immediate term, if you want to just have the .observe API and accept promises from the callback, that's ok. But long-term I do think we'll want the .wait API also.

I do feel strongly that the API should not expose topic names at all. I really do want it to be something like

ShutdownPhase.profileShutdown.observe(function);

For debugging, we should probably just add another parameter, rather than passing an object:

ShutdownPhase.profileShutdown.observe(function, "debugstring");
Comment on attachment 803687 [details] [diff] [review]
Adapting OS.File, v5

Review of attachment 803687 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/osfile/modules/osfile_async_front.jsm
@@ +994,5 @@
> +// that has been queued *before* profile-before-change is properly completed.
> +// To ensure that I/O queued *during* profile-before-change is completed,
> +// clients should register using Phase.addCondition.
> +Phase.addCondition({
> +  phase: "profile-before-change",

I'm going to repeat my earlier comments about having an interval between when OS.File I/O can be safely queued and left to complete, and when attempts to start new I/O operations will throw synchronous errors to let the caller know that async I/O is unavailable.

If you're going to put in an explicit wait for pending I/O to complete, do it at the same time as you close OS.File for future I/O. At first glance it looks like immediately before you post the System Shutdown message would be the right place.
Attachment #803687 - Attachment is obsolete: false
Attachment #803687 - Flags: review+ → review?(nfroyd)
Comment on attachment 803687 [details] [diff] [review]
Adapting OS.File, v5

Review of attachment 803687 [details] [diff] [review]:
-----------------------------------------------------------------

Mmm, I thought about that, and then convinced myself that it wasn't going to be a problem.  I think I was wrong.  For avoidance of doubt, IIUC, Irving's postulating a situation like:

- OS.File waits for pending operations;
- Other clients at profile-before-change invoke OS.File I/O;
- Nothing forces those new requests to complete before the end of profile-before-change;
- Data loss.

Is that correct, Irving?

Which seems reasonable enough.  David, do you have a rebuttal to why this can't happen?
Attachment #803687 - Flags: review?(nfroyd)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #62)
> > * We do not yet have a usage scenario for the "two primitives" that does not
> > require us to use these two primitives together and reconstitute the single
> > primitive.
> 
> I absolutely do. Take the I/O which happens writing session restore. This
> happens periodically. When we start writing this data, ideally we'd just be
> able to do
>
> ShutdownPhase.profileClosing.
> wait(promiseThatIFinishedWritingThisSessionData);
> 
> If session restore also wanted to do a last-ditch session write on shutdown,
> it could also register to be notified about shutdown. But that's separate!

I have written a patch to make Session Restore shutdown-safe (using the single primitive), and my experience seems to indicate otherwise. We want two guarantees:
1. our final write to sessionstore.js must be completed;
2. it must be our final write to sessionstore.js.

With two primitives, to obtain 1, I would use |wait| and to obtain 2, I would use |observe|. Furthermore, I want to avoid a race condition between 1 and 2, so I basically would use both atomically.

> > * As suggested by Irving, we can hope that the single primitive encourages
> > clients to do things lazily/asynchronously, which should generally be better
> > for responsiveness and memory hoarding
> 
> I'm not sure what this means. I don't think in general that we should be
> encouraging code to only write state at shutdown. Crashes or power failures
> are common enough that dirty data should be written regularly.

Actually, if I understand correctly, Irving was actually talking about the promises themselves and memory hoarding.

> > * The "two primitives" version makes it easier to get things wrong.
> > 
> > Example 1:
> > 
> >  Phase.observe("profile-before-change2", function() {
> >    // ...
> >    Profile.wait("profile-before-change", promise);
> >       // Oops, wrong phase, deadlock or runtime error
> >  });
> 
> This should assert/throw an exception. You can't wait for something that's
> already passed.

Certainly. I just suggest that an API that makes the correct usage simpler than the incorrect one is to be preferred.

> 
> > Example 2:
> > 
> >  Phase.observe("profile-before-change", function() {
> >    let promise = //...
> >    promise.then(function() {                       // Execute on next tick
> >      Phase.wait("profile-before-change", promise);
> >        // Oops, this *might* be too late, non-deterministically.
> >    });
> >  });
> 
> This is just a programming error, but a similar pattern should work:

Sure, it's a programming error, just one that's going to be more difficult to both explain and catch. So, as above, I prefer the API that encourages the correct use.

> I think for the immediate term, if you want to just have the .observe API
> and accept promises from the callback, that's ok. But long-term I do think
> we'll want the .wait API also.

Fine with me.

> I do feel strongly that the API should not expose topic names at all. I
> really do want it to be something like
> 
> ShutdownPhase.profileShutdown.observe(function);

Ok, I'll get started on taht.

> For debugging, we should probably just add another parameter, rather than
> passing an object:
> 
> ShutdownPhase.profileShutdown.observe(function, "debugstring");

If I understand correctly, you suggest that the debugging information should not be added until we know that we need it. I believe that this is a bad idea and we will find ourselves lacking any meaningful debug info when we need it most.

Or I misunderstand and you just don't like objects-as-labelled-arguments style, in which case I'll just get rid of it if we only have a single string.

On this, I don't quite agree. If we wait until we need debugging to require the details for debugging
(In reply to :Irving Reid from comment #63)
> Comment on attachment 803687 [details] [diff] [review]
> Adapting OS.File, v5
> 
> Review of attachment 803687 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/components/osfile/modules/osfile_async_front.jsm
> @@ +994,5 @@
> > +// that has been queued *before* profile-before-change is properly completed.
> > +// To ensure that I/O queued *during* profile-before-change is completed,
> > +// clients should register using Phase.addCondition.
> > +Phase.addCondition({
> > +  phase: "profile-before-change",
> 
> I'm going to repeat my earlier comments about having an interval between
> when OS.File I/O can be safely queued and left to complete, and when
> attempts to start new I/O operations will throw synchronous errors to let
> the caller know that async I/O is unavailable.
> 
> If you're going to put in an explicit wait for pending I/O to complete, do
> it at the same time as you close OS.File for future I/O. At first glance it
> looks like immediately before you post the System Shutdown message would be
> the right place.

Ah, sorry about that, I had implemented rejection at some point, then it seems that I lost the patch. I'll add this in v7.
Comment on attachment 803817 [details] [diff] [review]
AsyncShutdown.jsm (was Phase.jsm, was ...)

Review of attachment 803817 [details] [diff] [review]:
-----------------------------------------------------------------

I'll update the Addon Manager patch in bug 911621 to use this module and see how it works.

::: toolkit/modules/AsyncShutdown.jsm
@@ +222,5 @@
> +          let msg = "A phase completion condition is" +
> +            " taking too long to complete." +
> +            " Condition: " + cond.name +
> +            " Phase: " + topic;
> +          warn(msg);

I wish we could get telemetry (or some other kind of feedback) on this, but it's likely to happen too late.
Attachment #803817 - Flags: feedback?(irving) → feedback+
Comment on attachment 803817 [details] [diff] [review]
AsyncShutdown.jsm (was Phase.jsm, was ...)

Review of attachment 803817 [details] [diff] [review]:
-----------------------------------------------------------------

Please take Benjamin's feedback into account or convince him why he is wrong.

::: toolkit/modules/AsyncShutdown.jsm
@@ +118,5 @@
> +    return spinner.addBlocker(blocker);
> +  },
> +
> +  /**
> +   * Access function registerPhase. For testing purposes only.

Actually, you don't need to expose this.  You can have this module register well-known testing phases iff a pref is appropriately set before import.  That will make it easier to support Benjamin's strong preference for topics being a property of AsyncShutdown.

@@ +138,5 @@
> +/**
> + * Register a new phase.
> + *
> + * This method is used for testing purposes only. If you wish to register a
> + * new phase in production code, you should request

...a cookie? :)

@@ +222,5 @@
> +          let msg = "A phase completion condition is" +
> +            " taking too long to complete." +
> +            " Condition: " + cond.name +
> +            " Phase: " + topic;
> +          warn(msg);

We could record telemetry for most cases here, like Irving said.  For some topics, we won't get anything, but I think that's OK; we'll catch the biggest offenders at profile-before-change, which is before telemetry writes.  I think that should be a followup bug; if it's done soon enough, we can uplift it to Aurora in the next week or so.

@@ +278,5 @@
> +
> +// List of well-known runstates
> +// Ideally, runstates should be registered from the component that decides
> +// when they start/stop. For compatibility with existing startup/shutdown
> +// mechanisms, we register a few runstates here.

All this should define appropriate properties on AsyncShutdown; the properties could be named with a regexp like:

s/-([a-z])/\u\1/g

(I think that's the right perl to turn things uppercase).  Or you can give them friendlier names.

::: toolkit/modules/tests/xpcshell/test_phase.js
@@ +36,5 @@
> + * Generate a unique notification topic.
> + */
> +function getUniqueTopic() {
> +  let PREFIX = "testing-phases-";
> +  return PREFIX + Math.random();

This seems likely (but yet unlikely) to cause problems by registering identically-named phases at inconvenient times.  Let's just replace this with a counter that increments on every getUniqueTopic call.

@@ +143,5 @@
> +  } catch (ex) {
> +    do_print("Found exception: " + ex);
> +    exn = ex;
> +  }
> +  do_check_true(!!exn);

There is a do_check_throws in xpcshell; please use that instead of all these try/catch.
Attachment #803817 - Flags: review?(nfroyd) → feedback+
(In reply to Nathan Froyd (:froydnj) from comment #68)
> Comment on attachment 803817 [details] [diff] [review]
> AsyncShutdown.jsm (was Phase.jsm, was ...)
> 
> Review of attachment 803817 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Please take Benjamin's feedback into account or convince him why he is wrong.
> 
> ::: toolkit/modules/AsyncShutdown.jsm
> @@ +118,5 @@
> > +    return spinner.addBlocker(blocker);
> > +  },
> > +
> > +  /**
> > +   * Access function registerPhase. For testing purposes only.
> 
> Actually, you don't need to expose this.  You can have this module register
> well-known testing phases iff a pref is appropriately set before import. 
> That will make it easier to support Benjamin's strong preference for topics
> being a property of AsyncShutdown.

I'm strongly opposed to pref-driven test harness code included in the shipping module. I'd much prefer an approach like I took for the DeferredSave.jsm tests - Cu.import("...AsyncShutdown.jsm") and inject mocks or test settings directly from your test code.

> @@ +278,5 @@
> > +
> > +// List of well-known runstates
> > +// Ideally, runstates should be registered from the component that decides
> > +// when they start/stop. For compatibility with existing startup/shutdown
> > +// mechanisms, we register a few runstates here.
> 
> All this should define appropriate properties on AsyncShutdown; the
> properties could be named with a regexp like:
> 
> s/-([a-z])/\u\1/g
> 
> (I think that's the right perl to turn things uppercase).  Or you can give
> them friendlier names.

I think this is unnecessary over-design. We've lived with string names for Observer topics up to now, let's just stick with those names until we actually change the underlying dependency mechanism. Otherwise everyone will have to learn the mapping from our names to "profile-before-change" or whatever anyway, since that's how all the other code and documentation talks about event sequences.
(In reply to :Irving Reid from comment #67)
> Comment on attachment 803817 [details] [diff] [review]
> AsyncShutdown.jsm (was Phase.jsm, was ...)
> 
> Review of attachment 803817 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I'll update the Addon Manager patch in bug 911621 to use this module and see
> how it works.
> 
> ::: toolkit/modules/AsyncShutdown.jsm
> @@ +222,5 @@
> > +          let msg = "A phase completion condition is" +
> > +            " taking too long to complete." +
> > +            " Condition: " + cond.name +
> > +            " Phase: " + topic;
> > +          warn(msg);
> 
> I wish we could get telemetry (or some other kind of feedback) on this, but
> it's likely to happen too late.

Filed as bug 915861.
(In reply to :Irving Reid from comment #69)
> I'm strongly opposed to pref-driven test harness code included in the
> shipping module. I'd much prefer an approach like I took for the
> DeferredSave.jsm tests - Cu.import("...AsyncShutdown.jsm") and inject mocks
> or test settings directly from your test code.

For the moment, I have used a preference, as we do in Session Restore. I have no strong preference (ha!) on this, but I generally like to Object.freeze(...) anything I export to the public. I'll take a look at DeferredSave.jsm and see if I can reuse your design.

[...]
> I think this is unnecessary over-design. We've lived with string names for
> Observer topics up to now, let's just stick with those names until we
> actually change the underlying dependency mechanism. Otherwise everyone will
> have to learn the mapping from our names to "profile-before-change" or
> whatever anyway, since that's how all the other code and documentation talks
> about event sequences.

I believe that the regexp is a little overdesigned. For camelCased property names vs. notification strings, I have no strong preference, so I'll just bow down to my super-reviewer and call this a night.
Attached patch AsyncShutdown.jsm, v2 (obsolete) — Splinter Review
Applied feedback, going to bed.
Attachment #803817 - Attachment is obsolete: true
Attachment #803987 - Flags: review?(nfroyd)
Attached patch Adapting OS.File, v7 (obsolete) — Splinter Review
Propagated changes, plus OS.File now rejects I/O operations once we have passed xpcom-shutdown-threads.

I consider also logging these rejections, but that may have to wait for a followup bug.
Attachment #803687 - Attachment is obsolete: true
Attachment #803821 - Attachment is obsolete: true
Attachment #803822 - Attachment is obsolete: true
Attachment #803988 - Flags: review?(nfroyd)
Propagated changes, I'm keeping the r+.
Attachment #803990 - Flags: review+
Attachment #803987 - Flags: feedback?(irving)
Attachment #803988 - Flags: feedback?(irving)
Comment on attachment 803988 [details] [diff] [review]
Adapting OS.File, v7

Review of attachment 803988 [details] [diff] [review]:
-----------------------------------------------------------------

I see no xpcom-shutdown-ish bits here.  Did you forget to qref?
Attachment #803988 - Flags: review?(nfroyd)
Actually, this is in phase webWorkersShutdown, lines 263+. I realize that I should certainly document argument |shutdown| of the function, so give me a second.
Attached patch Adapting OS.File, v8 (obsolete) — Splinter Review
Same one, with a slight code cleanup, and more documentation.
Attachment #803988 - Attachment is obsolete: true
Attachment #803988 - Flags: feedback?(irving)
Attachment #804425 - Flags: review?(nfroyd)
Attachment #804425 - Flags: feedback?(irving)
Attachment #804425 - Flags: review?(nfroyd) → review+
Comment on attachment 803987 [details] [diff] [review]
AsyncShutdown.jsm, v2

Review of attachment 803987 [details] [diff] [review]:
-----------------------------------------------------------------

I think this is good.  Benjamin will need to look at this; I think you responded to his points in comment 65.

::: toolkit/modules/AsyncShutdown.jsm
@@ +13,5 @@
> + * Firefox shutdown is composed of phases that take place
> + * sequentially. Typically, each shutdown phase removes some
> + * capabilities from the application. For instance, at the end of
> + * phase profileBeforeChange, no service is permitted to write to
> + * the profile directory (with the exception of

Nit: this looks like it needs a paragraph reflow.

::: toolkit/modules/tests/xpcshell/test_phase.js
@@ +95,5 @@
> +  Services.obs.notifyObservers(null, topic, null);
> +  do_check_true(outParams.every((x) => x.isFinished));
> +});
> +
> +function get_exn(f) {

Sigh, apparently everybody has their own do_check_throws, but there's no common thing in xpcshell...
Attachment #803987 - Flags: superreview?(benjamin)
Attachment #803987 - Flags: review?(nfroyd)
Attachment #803987 - Flags: review+
Comment on attachment 804425 [details] [diff] [review]
Adapting OS.File, v8

Review of attachment 804425 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/osfile/modules/osfile_async_front.jsm
@@ +160,5 @@
>    post: function post(...args) {
> +    this.launched = true;
> +
> +    if (this.shutdown) {
> +      return Promise.reject(new Error("OS.File has been shutdown."));

nit: "shut down"
Attachment #804425 - Flags: feedback?(irving) → feedback+
Comment on attachment 803987 [details] [diff] [review]
AsyncShutdown.jsm, v2

Review of attachment 803987 [details] [diff] [review]:
-----------------------------------------------------------------

Works fine in tests with my updated patch for bug 911621.
Attachment #803987 - Flags: feedback?(irving) → feedback+
Comment on attachment 803987 [details] [diff] [review]
AsyncShutdown.jsm, v2

My only concern is this hunk:

+        if (typeof condition == "function") {
+          condition = condition(topic);
+        }
+        condition = Promise.resolve(condition);

Since we document above that blockers don't *have* to return a promise, does Promise.resolve do something useful with undefined?

Per IRC, Promise.resolve(undefined) works. So just add a comment so that the code is readable.
Attachment #803987 - Flags: superreview?(benjamin) → superreview+
Applied feedback.
Attachment #803987 - Attachment is obsolete: true
Attachment #804670 - Flags: superreview+
Attachment #804670 - Flags: review+
Applying feedback.
Attachment #804425 - Attachment is obsolete: true
Attachment #804673 - Flags: review+
We have the following in xpcshell
TEST-UNEXPECTED-FAIL | /builds/slave/talos-slave/test/build/tests/xpcshell/tests/toolkit/components/osfile/tests/xpcshell/test_logging.js | Test timed out

Investigation indicates that this is just a new form of bug 845190, so I'll mark this checkin-needed regardless – and I'll fix bug 845190 asap.
Whiteboard: [Async:P1] → [Async:P1][for intermittent orange, see comment 85]
https://hg.mozilla.org/mozilla-central/rev/f9c7040dddba
https://hg.mozilla.org/mozilla-central/rev/7b1b70e7efe0
https://hg.mozilla.org/mozilla-central/rev/765efd2602e8
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
Actually, we'll defer dev-doc-needed until we are satisfied that the module won't change too much.
Keywords: dev-doc-needed
Depends on: 981556
Product: Toolkit → Toolkit Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: