Closed Bug 986176 Opened 10 years ago Closed 10 years ago

DataCloneError when getting values from IndexedDB. possibly related to Blob rollbacks

Categories

(Core :: Storage: IndexedDB, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: asuth, Unassigned)

References

Details

In the e-mail app I've gotten 3 separate reports of DataCloneErrors that occur when attempting to load mail body blocks from disk.  One is bug 984148 circa March 14th, one is https://bugzilla.mozilla.org/show_bug.cgi?id=925480#c4 circa March 15th, one is a log shown to me when trying to understand if something was already known, also circa March 15th.

Email's fatal error handler reports this as:
indexedDB error: DataCloneError from request as part of readonly transaction on object store "bodyBlocks"

Callstack confirms it was a readonly transaction performing a simple get().

I did some preliminary investigation and attempted to reproduce.  I was unable to reproduce, but I have some theories.  I wanted to log them here and also properly check with the IndexedDB team if there are known issues that would cause this that aren't tracked in bugzilla.  I've started watching the component now to better keep track of what's going on.


== Relevant facts:

- We sometimes store Uint8Array instances in the bodies.  Specifically, when we fetch a snippet for the body and the fetch did not end up retrieving the entire body.  We do modify the prototype of the Uint8Array so it looks like a node Buffer object.

- We sometimes store Blobs in the bodies.  Specifically, if a user downloads embedded images, we put these in IndexedDB.  The user must manually push a button for each e-mail to cause this to happen.  Downloaded attachments are not stored in IndexedDB; they go in DeviceStorage and we just store the device storage name and relative path as two strings in a list.

- When doing periodic sync, the email app will window.close() itself if it started itself up using a mozAlarm and the user has not made us visible.  We appear to somewhat reliably close ourself while we have an outstanding IndexedDB write transaction despite explicitly having logic in place that tries to avoid this.  The log message is:

03-17 15:53:26.699 I/GeckoDump(  873): LOG: mail sync complete, closing mail app 13420.936635999999
03-17 15:53:26.759 E/GeckoConsole(  137): [JavaScript Error: "IndexedDB UnknownErr: IDBTransaction.cpp:863"]

This message appears to be fired as a result of "database->IsInvalidated()" returning true.

http://dxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBTransaction.cpp#863

I believe this results in a rollback of the transaction.  We will obviously fix the e-mail app to stop doing this since it's very dumb.


== Partially investigated theories / complications:

- I looked for obvious structured clone byte-stream changes and regressions, but did not find any.  This included me using a March 1st build and then upgrading and verifying there were no regressions.

- It seemed like a lot of the potential DataClone failures relate to Blobs.  So it seems possible that the e-mail app ridiculously closing itself while a transaction is in flight might be exercising edge cases in the Blob handling code's rollback logic.  The odd thing is that our periodic sync mechanism should not be causing new Blobs to be added to the database.  What is possible is that a block overflow might cause us to split a block and end up writing it to a new key/value.  To state explicitly, previously we had put({ BlobA }, 'foo').  Then we decide to put({}, 'foo'), put({ BlobA }, 'bar').
A database that this reproduces on would be *incredibly* helpful.
:gerard-majax saved off the database at my request.  Unfortunately, the e-mail database is a horrible, horrible concentration of potentially private information which is why I did not request it from him initially and tried to reproduce.  It's got credentials, headers, mail bodies.  :gerard-majax, any chance you were only using an unimportant test account in the mail app?
Flags: needinfo?(lissyx+mozillians)
Yeah, I figured as much.  If there are circumstances under which he would be willing to let us poke at it (perhaps on his machine with him standing over our shoulders) we could probably arrange them at some point.
(In reply to Andrew Sutherland (:asuth) from comment #2)
> :gerard-majax saved off the database at my request.  Unfortunately, the
> e-mail database is a horrible, horrible concentration of potentially private
> information which is why I did not request it from him initially and tried
> to reproduce.  It's got credentials, headers, mail bodies.  :gerard-majax,
> any chance you were only using an unimportant test account in the mail app?

No, it contains all my sensitive email accounts. I can share the db privately, provided you generate a gpg key I can use to encrypt it and send it by email.
Flags: needinfo?(lissyx+mozillians)
Flags: needinfo?(khuey)
Flags: needinfo?(bugmail)
All the data is encoded internally too so we wouldn't be able to read anything accidentally while debugging (we would have to explicitly run everything through the decoder, pinky-swear we won't!). Encrypting sounds like a fine idea though.
I've leave it to Kyle/Ben/Jan to investigate the database, but y'all, please let me know if you would like any assistance to make it easier to locate the corrupt key/value from the e-mail app.  I expect the safest (privacy-wise) solution is to walk the "bodyBlocks" table value by value until an exception is generated.  In practice a headless getAll() may be just as safe as long as you don't inspect into the other values.

https://github.com/benturner/idbbrowser might work but the UI seems to want to display all result rows, so is bad for privacy visually.  Also, the xul tooltips crash gnome-shell for me on a multi-monitor ubuntu system, so I don't like that.
Flags: needinfo?(bugmail)
Jan, do you have cycles to investigate this?
Flags: needinfo?(khuey) → needinfo?(Jan.Varga)
I have the db in question, will try to take a look soon.
I did a little poking. The sqlite file is healthy afaict. Plenty of data. We seem to have orphaned blobs though. The files exist in the filesystem but the 'file' table in the database is empty and no 'object_data' entries have a 'file_ids' value. Jan and I theorize that one of the 'object_data' entries must embed a file id in its structured clone data but of course we can't find it without 'file_ids' info.
Do the orphaned Blobs explain the DataCloneError?  Is the DataCloneError not reproducible on your (:bent's) end?
Should we resolve this WFM now that the SQLite truncate rollback-on-open bug has been addressed, or do we still think it's possible for such an inconsistent state to be plausibly reached?
Let's hope so.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.