Last Comment Bug 751802 - Intermittent IndexedDB write failures
: Intermittent IndexedDB write failures
Status: UNCONFIRMED
:
Product: Core
Classification: Components
Component: DOM: IndexedDB (show other bugs)
: 11 Branch
: x86 Mac OS X
: -- normal with 2 votes (vote)
: ---
Assigned To: Nobody; OK to take it and work on it
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-05-03 20:47 PDT by Duncan Bayne
Modified: 2014-12-04 08:24 PST (History)
8 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
intermittently failing test case (719.77 KB, application/octet-stream)
2012-05-09 22:30 PDT, daniel.r.heath
no flags Details

Description Duncan Bayne 2012-05-03 20:47:57 PDT
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.168 Safari/535.19

Steps to reproduce:

We have built an application that stores several megabytes of information in IndexedDB storage.  We run automated tests against this application using:

selenium-webdriver (2.21.2)
cucumber (0.6.1)



Actual results:

Normally, the tests drive the application successfully.  Rarely - perhaps once every few hundred writes? - an attempt to write to the IndexedDB fails with an exception like the following:

Exception... "The operation failed because the requested database object could 
not be found. For example, an object store did not exist but was being opened."  
code: "3" nsresult: "0x80660003 (NS_ERROR_DOM_INDEXEDDB_NOT_FOUND_ERR)"

We have also observed this issue in production, albeit very rarely.  When this happens on a user machine, all successive attempts to use the IndexedDB fails.  

We can work around this by deleting the IndexedDB files from the machine, but this causes the user to lose all data that have not been sent to our server.

(We're quite keen to help fix this so if there's anything we can do - e.g. running debug builds etc., please let me know).


Expected results:

The write should have succeeded without raising an error - like it does the other 99% of the time :-)
Comment 1 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-05-04 00:27:42 PDT
What does the actual code look like, if you can share it?
Comment 2 Jonas Sicking (:sicking) PTO Until July 5th 2012-05-04 00:31:03 PDT
Yeah, I think we'll need to see some code here. There isn't enough to go on otherwise.

If you can't share it in public (as in, if you can't share it in this bug), you can send me an email directly.
Comment 3 Duncan Bayne 2012-05-07 17:24:36 PDT
Sure thing.  The source is CoffeeScript:

    _writePage: (storeName, startIndex, endIndex, binder) ->
      writeTransaction = @connection.transaction([storeName], @idbTransaction.READ_WRITE)
      store = writeTransaction.objectStore(storeName)
      for index in [startIndex...endIndex] when (item = binder.list[index])?
        writeRequest = store.put(item)
        writeRequest.onerror = binder.failCallback()
        writeRequest.onsuccess = binder.successCallback()
      if endIndex >= binder.list.length
        binder.finishedRegisteringCallbacks()
        return
      setTimeout((=> @_writePage(storeName, endIndex, endIndex + @WRITE_EACH_PAGE_SIZE, binder)), @WRITE_EACH_PAGE_DELAY)
      null

... which compiles to the following JS:

    ({
      _writePage: function(storeName, startIndex, endIndex, binder) {
        var index, item, store, writeRequest, writeTransaction,
          _this = this;
        writeTransaction = this.connection.transaction([storeName], this.idbTransaction.READ_WRITE);
        store = writeTransaction.objectStore(storeName);
        for (index = startIndex; startIndex <= endIndex ? index < endIndex : index > endIndex; startIndex <= endIndex ? index++ : index--) {
          if (!((item = binder.list[index]) != null)) continue;
          writeRequest = store.put(item);
          writeRequest.onerror = binder.failCallback();
          writeRequest.onsuccess = binder.successCallback();
        }
        if (endIndex >= binder.list.length) {
          binder.finishedRegisteringCallbacks();
          return;
        }
        setTimeout((function() {
          return _this._writePage(storeName, endIndex, endIndex + _this.WRITE_EACH_PAGE_SIZE, binder);
        }), this.WRITE_EACH_PAGE_DELAY);
        return null;
      }
    });
    
The rough idea is that we write a 'page' of data to the IndexedDB, then use setTimeout() to write the next page.  This is so that we can write a lot of data to the IndexedDB without blocking, but the failure occurs even during small writes.
Comment 4 Jonas Sicking (:sicking) PTO Until July 5th 2012-05-07 22:05:25 PDT
So it's the store.put call that throws an exception on rare occasions?

Or is it the writeTransaction.objectStore call that fails?

Either way, would you be able to provide a working testcase? I can't see anything wrong with the code, so we likely need to try to reproduce the bug here.
Comment 5 daniel.r.heath 2012-05-09 22:24:14 PDT
Hi Jonas,

I work with Duncan - he's just had a kid and won't be back online for a little while.

We haven't been able to get something that fails 100% of the time, but I'm putting together a test case which fails intermittently on FF 12, OSX Snow Leopard.
Comment 6 daniel.r.heath 2012-05-09 22:30:04 PDT
Created attachment 622626 [details]
intermittently failing test case

I hit 'save' from a live test-case which fails intermittently on FF12.

It seems firefox doesn't allow indexeddb access from html on the local filesys (also a bug?) so you'll need to use some sort of basic server to run it.
Comment 7 Jonas Sicking (:sicking) PTO Until July 5th 2012-05-10 00:20:21 PDT
A smaller testcase would be hugely helpful. This is a lot of code to read and understand. But we'll give it a try and see how far we get.
Comment 8 Matthew Riddle 2012-05-15 15:28:26 PDT
I was able to replicate the error constantly at one point and it was failing because the database would be opened with the correct version yet not have any of our previously migrated data or object stores.  I am still unable to replicate the "trigger" that caused the database to go into this state though.

The error was occurring due to us trying to find objects within object stores which did not exist. 

The temporary work around was to add a check after the database was setup (with expected version) to ensure it had the object stores mentioned in our migrations. 

If not close, drop and recreate the database. Running the migrations again. This seems to have fixed our intermittent failures but we are losing all previous data stored in the IndexedDB as a result.
Comment 9 Jonas Sicking (:sicking) PTO Until July 5th 2012-05-15 17:05:27 PDT
Matthew: Are you sure you were seeing the same problem as the one this bug was originally filed as? I.e. are you working on the same product or something using the same library?

If you're just happening to see intermittent failures with opening/migrating databases, which may or may not be related to the problem filed in comment 3 and comment 6, you should probably file a new bug.
Comment 10 Matthew Riddle 2012-05-15 17:30:23 PDT
Hi Jonas, thank you for the quick reply. I am sorry, I should have said who I was. I am working on the same product as Daniel and Duncan and trying to resolve the same problem.
Comment 11 Jonas Sicking (:sicking) PTO Until July 5th 2012-05-16 11:39:37 PDT
Ah, sorry, thanks for jumping in!

What do you mean by "previously *migrated* data" (emphasis mine). I.e. how are you "migrating" the data?

The way IndexedDB works when you call |req = indexedDB.open("mydb", 5)| is:

If a database called "mydb" exists, but has a version number *higher* than 5, the open fails and an error event is fired on |req|.

If a database called "mydb" exists and has version 5, the open succeeds and a success event is fired on |req|. req.result is set to the opened database.

If a database called "mydb" exists and has version lower than 5, a upgradeneeded event is fired on |req|. event.oldVersion is set to the old version number of the database, event.newVersion is set to 5, req.result is set to the database and database.version is set to 5. The page is responsible for upgrading the database contents from oldVersion to 5. I.e. you are responsible for creating whatever objectStores/indexes are needed, and modify any data in existing objectStores.

If a database called "mydb" doesn't exist, a upgradeneeded event is fired on |req|. event.oldVersion is set to 0, event.newVersion is set to 5, req.result is set to the database and database.version is set to 5. The page is responsible for upgrading the database contents to 5. I.e. you are responsible for creating whatever objectStores/indexes are needed.
Comment 12 Jonas Sicking (:sicking) PTO Until July 5th 2012-05-16 11:45:37 PDT
I forgot a couple of things in the previous comment. At the end of the last two cases we additionally do the following:

If the upgrade transaction is successfully committed, a success event is fired on |req|, otherwise a error event is fired on |req|.
Comment 13 Matthew Riddle 2012-05-16 15:43:56 PDT
Thanks for the detailed reply Jonas! What I meant by 'migrated data' is when the user opens the database we want to ensure they have the correct structure before we continue, if they don't we run migrations. Here is a code snippet of how we open the database and migrate the database.

The code format is CoffeeScript. I provided it in this format to save on clutter, if javascript is easier to read paste it on this site to compile it. http://js2coffee.org/

 _open: ->
    dbOpenRequest = @indexedDB.open(@databaseName, DSC.repositories.migrations.last().version)
    dbOpenRequest.onerror = -> console.log "database error!!"
    dbOpenRequest.onsuccess = (e) =>
      console.log("successfully opened database")
      @connection = e.target.result
      @connection.onerror = -> console.log "database error!!"
      @_ready()
    dbOpenRequest.onupgradeneeded = (e) =>
      @connection = e.target.result
      @_runMigrations(e.oldVersion, e.newVersion, e.target.result)

  _runMigrations: (oldVersion, newVersion, transaction) ->
    for migration in DSC.repositories.migrations
      if migration.version > oldVersion
        migration.migrate.apply(this, [transaction])

DSC.repositories.migrations = [
  {
    version: 1
    migrate: (transaction) ->
      @_deleteStoreIfExists('pois')
      @connection.createObjectStore("pois", keyPath: "id")
  },
  {
    version: 2
    migrate: (transaction) ->
      @_deleteStoreIfExists('publishedPois')
      @connection.createObjectStore("publishedPois", keyPath: "id")
  },
  {
    version: 3
    migrate: (transaction) ->
      @_deleteStoreIfExists('metadata')
      @connection.createObjectStore("metadata", keyPath: "id")
  },
  # DB: deliberately deleted migration 4 due to incompatibility of an index that we don't use any more.
  {
    version: 5
    migrate: (transaction) ->
      @_deleteStoreIfExists('places')
      @connection.createObjectStore("places", keyPath: "id")
  },
  {
    version: 6
    migrate: (transaction) ->
      @_deleteStoreIfExists('contentPacks')
      @connection.createObjectStore("contentPacks", keyPath: "id")
  },
  {
    version: 7
    migrate: (transaction) ->
      @_deleteStoreIfExists('maptiles')
      @connection.createObjectStore("maptiles", keyPath: "path")
  },
  {
    version: 8
    migrate: (transaction) ->
      @_deleteStoreIfExists('maptiles')
      store = @connection.createObjectStore("maptiles", autoIncrement: true) #don't create an explicit key index
      index = store.createIndex('path', 'path', multiEntry: true)
      console.log(index,'maptiles path index was created')
  }

  ]

What we get at the end of this code is an open database connection at version 8 with our object stores. 
However, there is an error that occurs on occasion which removes all objectStores and data but keeps the version number, causing the error mentioned at the beginning of this thread. 

I hope that adds some clarity. Thank you again for your help in this matter Jonas, we really appreciate it.
Comment 14 Jonas Sicking (:sicking) PTO Until July 5th 2012-05-16 17:40:57 PDT
I still can't see any bugs in the code, so it's definitely looking like a problem on our end. I still have some questions though.

A) Do you know which of the lines in comment 3 that is throwing the exception? Is it
   the connection.transaction call, the writeTransaction.objectStore call or the
   store.put call?

B) Do you know if this happens to a database that was opened and the upgradeneeded
   migration code ran, or does it happen to a already-existing-and-in-version-8
   database? I.e. are the steps
    1) you call indexedDB.open(..., 8);
    2) firefox fires onupgradeneeded
    3) you run the migration code
    4) firefox fires onsuccess
    5) code in comment 3 runs and throws an exception
   or are the steps
    1) you call indexedDB.open(..., 8);
    2) firefox fires onupgradeneeded
    3) you run the migration code
    4) code in comment 3 runs fine
    5) page is closed and so is the database
    6) page is reopened
    7) you again call indexedDB.open(..., 8);
    8) firefox fires onsuccess (since the database is already migrated)
    9) code in comment 3 runs and throws an exception

C) Why do you need all the deleteStoreIfExists calls in the migration code? The
   objectStore with the given name should *never* exist except in the last step
   where you are re-creating an already existing database.

D) When it fails, are *all* objectStores missing? I.e. it's an empty database with
   version 8? Or are just some objectStores missing?
Comment 15 Matthew Riddle 2012-05-16 19:10:11 PDT
To answer your questions:
- A: it is when we are trying to get a transaction on the object store
   h=this.connection.transaction(['obj_store_name'],IDBTransaction.READ_ONLY)

- B: In our tests (reproducible problem), when our tests run, the problem seems to occur when we first open the database in a new profile. This implies that we have attempted to open a non-existent database which should execute all the migrations. So, your first set of steps shows the problem, every other time that we open the database, the version has been correctly set to '8', so onupgradeneeded is not run and as a result there are no object stores.

- C: Well, I guess we don't. They are really only there for developer convenience when developing the migrations. The code of that method is below. I don't think that it should be problematic, but you are right that it does add one more variable.

  _deleteStoreIfExists: (objectStore) ->
    if @connection.objectStoreNames.contains(objectStore)
      @connection.deleteObjectStore(objectStore)

- D: Yes, all objectStores are missing and it is version '8'.

We also think that we have seen this problem in the wild with a couple of our users, and it seems to be very rare. On these occasions it may have occurred at the time of a firefox version upgrade (not sure). In these cases it seems to have actually thrown away their currently existing object stores.

We also have these automated tests running against Chrome and have not had these problems occur. With the previous version of the Firefox API (set version events), we also don't think that we had these problems.
Comment 16 Jonas Sicking (:sicking) PTO Until July 5th 2012-05-16 22:49:38 PDT
(In reply to Matthew Riddle from comment #15)
> - B: In our tests (reproducible problem), when our tests run, the problem
> seems to occur when we first open the database in a new profile. This
> implies that we have attempted to open a non-existent database which should
> execute all the migrations. So, your first set of steps shows the problem,
> every other time that we open the database, the version has been correctly
> set to '8', so onupgradeneeded is not run and as a result there are no
> object stores.

I don't understand the above.

On one hand you are saying "the problem seems to occur when we first open the database in a new profile", which I interpret as meaning as it's failing in the pattern described by the first set of steps.

But then you say "every other time that we open the database, the version has been correctly set to '8', so onupgradeneeded is not run and as a result there are no object stores".

When the database is in version 8, the objects stores should already be existing, so onupgradeeded shouldn't need to run. But the object stores should already be there. So here you seem to be saying that it's failing using the pattern described by the second set of steps.

To be clear, I definitely think this is a problem on our end. I'm just trying to figure out where exactly things go wrong.

I also have a third possible pattern that it could be failing using:

    1) you call indexedDB.open(..., 8);
    2) firefox fires onupgradeneeded
    3) you run the migration code
    4) firefox fires *onerror* (i.e. we fail to properly execute the create/migrate
       database operations).
    5) You *still* attempt to call db.transaction(...) on the database handed to
       onupgraneeded, resulting in the seen exception.

Another way to put the question is this:

i) Do the object stores disappear between you successfully create them in
   onupgradeneeded and when the onsuccess event is been fired?
ii) Do the object stores disappear after you successfully create the database and
    then close it, such that when you reopen the database the object stores aren't
    there.
iii) Does the database create/migrate code fail to successfully create the
     database?

Based on the error you are seeing, I don't think it's iii), but I wanted to ask anyway.

I hope that makes the question more clear?
Comment 17 Jonas Sicking (:sicking) PTO Until July 5th 2012-05-16 22:54:05 PDT
Oh, and if the answer is that it's failing according to the second pattern (ii), then does reopening the database multiple time result in the same behavior? I.e. every time you reopen the database it has version 8 but the object stores aren't there?

If so, would you be able to attach the database here or send it to me over email? You can find the database in 
<profile-directory>/indexedDB/http+++<domain>/<hashed string>.sqlite

See here for finding your profile directory: http://support.mozilla.org/en-US/kb/Profiles
Comment 18 Matthew Riddle 2012-05-17 05:01:46 PDT
Hi Jonas, sorry for the confusion.

In reply to Comment 16
-------
What I was trying to say was your first scenario was correct, I tried to add more detail and made a mess of my explanation. 

To answer your questions

i) Yes the object stores disappear somewhere between us successfully creating them in the onupgradneeded and the onsuccess event being fired.

ii) No, the error occurs before we close the database.

iii) No, The onerror handlers for creation were never called and I added some code to throw an exception if one of my objectstores did not exist after creation immediately after each migration (didn't include it in snippet in comment 13).

In reply to Comment 17
------------
The database didn't fail according to the second pattern however we are seeing the behaviour you mentioned. Yes, "every time you(we) reopen the database it has version 8 but the object stores aren't there"

I haven't got a database that is in the 'corrupted' state currently but if the error occurs again resulting in it I will attach a copy.

Thanks
Comment 19 Jonas Sicking (:sicking) PTO Until July 5th 2012-05-17 11:58:04 PDT
I have one theory, but it seems like a bit of a stretch given the behavior you are seeing.

We currently can on occasion fail random database requests (known issue that should be fixed very soon). We additionally have another bug which causes failures to create objectStores and indexes to not raise error events (also being fixed).

That could explain the behavior if only some object stores disappeared. And only if they disappeared after the database was closed and reopened. It shouldn't cause the object stores to go away while the database is open between the create/migrate code and the success event.

Using the attached test-case, what are the steps to reproduce? Do we just run the testcase until failures show up? Again, providing a smaller testcase would help a lot!
Comment 20 Matthew Riddle 2012-05-17 14:56:26 PDT
I am in the process of creating a smaller test case which will be straight forward to run and reproduce the problem (hopefully). I doubt I will be able to get it failing consistently though.

I am writing a test in Jasmine, would it be useful for me to provide that as well or just the code the steps that triggers the error.

 If it's convenient for you I could create a small Ruby on Rails app with the test case and put it on GitHub so that you can run the test case until it fails.
Comment 21 Jonas Sicking (:sicking) PTO Until July 5th 2012-05-17 22:31:41 PDT
I'm not really familiar with Jasmine, so I don't have an opinion there.

I agree it seems unlikely that we can get it failing consistently, so don't worry about that.

I'm not sure that it's worth putting anything up on github. I suspect we'll want to add diagnosing code so a zipfile is probably the best solution. That way we can modify as needed and upload to servers we have access to.
Comment 22 Matthew Riddle 2012-05-23 14:26:22 PDT
Hi Jonas,

We setup a few small test cases and ran them 2,000 times over the weekend but have yet to see any fail. We are still working towards a smaller test case and will get back to you when we have something that can replicate the error above.

Regards,

Note You need to log in before you can comment on or make changes to this bug.