User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0
Build ID: 20130409194949
Steps to reproduce:
I have an extension that uses IndexedDB for local storage. If the extension is in the process of saving to the IndexedDB (asynchronously) when the browser closes, the save does not complete and the database is left in an incomplete or broken state.
Actual results:
The extension kicks off a number of stores to IndexedDB with objectStore.add(). Before these complete, the user closes the browser with the close button or File > Exit. The pending writes to the IndexedDB never complete.
Expected results:
The pending writes should have completed.
We abort any transactions that are in progress, yes, and roll the database back to the last complete transaction. That shouldn't leave the database "in an incomplete or broken state" though. Can you describe what you mean a little more clearly?
Well, broken is in the eye of the beholder, I suppose. In my case, I'm writing out a number of transactions, and the database isn't usable unless they all get written out. So the database isn't "broken" in the database sense, but is unusable to my extension.
Regardless, it's a problem that the pending transactions get aborted. This makes it impossible to guarantee that a transaction will get written out to the database, since the user can always close the browser before the transaction has completed. And since IndexedDB is specifically intended for large amounts of data (and hence it takes a significant time to write out the data) it's not unlikely that this will happen. Given that you cannot guarantee that any particular transaction will actually get written to the database, what use is IndexedDB?
(got to this via a link from the comment on https://hacks.mozilla.org/2013/05/building-a-notes-app-with-indexeddb-redis-and-node-js/)
It sounds like you are creating multiple independent transactions when your program logic requires just one. There are (bad) things you can do to make sure application shutdown is delayed until your transactions have run, but an application crash could still break your application. I assume you are calling .transaction(..., "readwrite") more than once? That would also explain how you or the user could experience this problem: each write transaction is going to involve at least one fsync(). On non-SSD storage, this can easily take hundreds of milliseconds, or even more on systems with a lot of pending I/O. And each transaction delays all the transactions queued behind it, so with a large number of transactions or a lot of system I/O, you could be looking at many seconds.
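As a rough illustration of the single-transaction suggestion, here is a minimal sketch that writes a whole batch inside one "readwrite" transaction, so that an abort at shutdown rolls back either all of the records or none of them. The function name, `storeName`, `records`, and the `onDone` callback are hypothetical, and `db` is assumed to be an already-open IDBDatabase; this is not taken from the extension's code.

```javascript
// Sketch only: saveAllInOneTransaction, storeName, records and onDone are
// hypothetical names. `db` is assumed to be an already-open IDBDatabase.
function saveAllInOneTransaction(db, storeName, records, onDone) {
  var finished = false;

  function finish(error) {
    // An error is normally followed by an abort; report only once.
    if (finished) { return; }
    finished = true;
    onDone(error || null);
  }

  // One transaction for the whole batch, not one per record.
  var tx = db.transaction([storeName], "readwrite");
  var store = tx.objectStore(storeName);

  // The batch only counts as saved once the transaction itself completes.
  tx.oncomplete = function () { finish(null); };
  tx.onabort = function () { finish(tx.error || new Error("transaction aborted")); };
  tx.onerror = function () { finish(tx.error || new Error("transaction failed")); };

  records.forEach(function (record) {
    // put() overwrites an existing key; add() would fail on duplicates.
    store.put(record);
  });

  return tx;
}
```

If the browser exits before `oncomplete` fires, none of the records in the batch are kept, which at least avoids the partially written state described earlier in this report.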
There is currently an effort to improve shutdown time by having Firefox simply terminate without cleaning up, so the best thing your extension can do is provide some UI that lets the user know it's active and that they shouldn't shut down yet. For example, make your icon spin, use the notifications API to show when a save starts and stops, have a XUL 'panel' hover somewhere, etc.
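One way to drive such an indicator is to count outstanding saves and flip the UI only when the count moves between zero and non-zero. The sketch below assumes a hypothetical `createBusyTracker` helper and an `onBusyChange` callback that would start or stop whatever indicator the extension uses; neither comes from an existing API.

```javascript
// Sketch only: createBusyTracker and onBusyChange are hypothetical names.
// onBusyChange(true) is called when the first save starts and
// onBusyChange(false) when the last outstanding save finishes.
function createBusyTracker(onBusyChange) {
  var pending = 0;
  return {
    begin: function () {
      pending += 1;
      if (pending === 1) { onBusyChange(true); }
    },
    end: function () {
      if (pending === 0) { return; } // ignore unmatched end() calls
      pending -= 1;
      if (pending === 0) { onBusyChange(false); }
    },
    isBusy: function () { return pending > 0; }
  };
}
```

An extension could call `begin()` just before opening a write transaction and `end()` from that transaction's completion, error, or abort handling.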
Andrew, thank you for your response. I have in fact done some bundling of transactions to reduce the impact of a shutdown in the middle of saving. The use of a UI indicator that the extension is still "busy" is certainly possible, but that sort of design is fragile and not effective for naive users who don't understand the implications. I certainly wouldn't want to rely on it for successful operation of my extension.
Nonetheless, I still think this is a serious problem that needs to be addressed. The Firefox developers seem to be pushing towards more asynchronous operations (which is fine). However, the default design principle for asynchronous operations is that they will yield some kind of event back to the originating thread, indicating success or failure. What's been created here violates that principle, and is not -- so far as I can tell -- documented anywhere. This creates a problem for every extension that uses any asynchronous operation (which seems like a growing pool of affected users).
Also, I think the use case that bit me -- "The browser is shutting down, so let me save the latest state" -- is not uncommon. It seems to me you've eliminated any way to reliably save information on browser shutdown, and that's a significant shortcoming.
I don't know if the same mechanism applies to user-generated operations, but I've several times had a problem where a bookmark I've created just before closing the browser (e.g., "Oops, gotta go, let me bookmark and quit") has failed to be saved.
The perf team's theory is that slow shutdowns are bad for the user experience, especially when we're trying to restart quickly to upgrade the user's Firefox and they suddenly find themselves waiting several seconds or more. Data loss is not acceptable either, which is why there is a recent (and still ongoing) conversation about ordering shutdown observers at https://groups.google.com/forum/?fromgroups#!topic/mozilla.dev.platform/NVDVrKauzLs.
I think a key development principle for all webbish code is that stuff can crash at any time or the browser tab can get closed. You haven't said what the extension you are developing does, but usually there's a clear delineation of when something interesting has happened that should not be lost. And usually saving that state can happen much faster than the user can generate new states.
But not always, so I do agree that it's worth slowing shutdown to save out explicit user actions, but there is also a burden to try and make things as fast as possible. In the "Storage in Gecko" thread (still recent-ish) at https://groups.google.com/d/msg/mozilla.dev.platform/vYbQqkqGzlo/YYIvd-rB9ToJ it's been decided that IndexedDB is not appropriate for all situations, and that in many cases just dumping a JSON object to disk in an atomic file write is the way to go. That might be the right thing for your extension since writing out one continuous file should end up as a very small number of seeks whereas if you have to do a lot of IndexedDB stuff, there could be a huge number of seeks.
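The atomic-file-write idea generally means writing the JSON to a temporary file, flushing it, and then renaming it over the real file, so a reader sees either the old state or the new one. The sketch below uses Node.js's `fs` module purely so that it can be run outside a browser; an extension would have to use whatever file API its platform provides. The function name and the `.tmp` suffix are assumptions made for this example.

```javascript
// Sketch only: uses Node.js's fs module for illustration, not an extension API.
// writeJsonAtomically and the ".tmp" suffix are hypothetical choices.
var fs = require("fs");

function writeJsonAtomically(path, state) {
  var tmpPath = path + ".tmp";
  var fd = fs.openSync(tmpPath, "w");
  try {
    // Write the complete new state to a temporary file first.
    fs.writeSync(fd, JSON.stringify(state));
    // Flush the contents to disk before the rename makes them visible.
    fs.fsyncSync(fd);
  } finally {
    fs.closeSync(fd);
  }
  // Replace the old file in a single step.
  fs.renameSync(tmpPath, path);
}
```

If the process dies before the rename, the previous file is still intact; only a leftover temporary file remains.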
Is there public code I could look at? Or could you describe what your extension is doing or trying to save, how the schema for that is structured, and why you have multiple transactions?
It isn't reviewed yet, but my extension is an RSS reader called "Rsstler", so you can grab the XPI from AMO if you're interested. In my case, I want to save off the state of all the feeds (primarily the unread feed items) when the user finishes. That state changes continuously as the user reads items and as the background scanning finds new items in the feeds. I'm also modifying bookmarks, which is another asynchronous operation. (And of course I could be using many other extensions which are also performing asynchronous operations.)
You're right that I can do various engineering to limit the data loss / problems if an asynchronous call is silently terminated, and now that I understand the behavior I'm working on this. (Although I have concerns about the possible browser performance & system impacts if I'm constantly updating an IndexedDB with significant amounts of data.) However, I still think this behavior is wrong, especially for IndexedDB.
Look at the documentation for IndexedDB:
"IndexedDB is a way for you to persistently store data inside a user's browser... IndexedDB is useful for applications that store a large amount of data..."
I think a lot of people are going to read that and think, "Ah, so IndexedDB is a good way to store data between user sessions. When I start up my extension I can read in the data from the last session, and when my extension shuts down, I'll save the data for the next session." (I know I did.)
If the documentation said:
"IndexedDB is a way for you to persistently store data inside a user's browser. However, you should be aware that anything you try to save -- particularly if it's a large amount of data -- might not actually get saved if the user closes his browser too soon. In fact, no matter how you try to use IndexedDB, you cannot guarantee that you're actually going to save any data."
I'm not sure how many people would be keen to use it! Worse, you're going to drive people to use synchronous methods, which is probably not what you want.