Closed
Bug 1078438
Opened 10 years ago
Closed 10 years ago
Contacts/SMS apps are sometimes empty at launch
Categories
(Core :: Storage: IndexedDB, defect)
Core
Storage: IndexedDB
Tracking
()
RESOLVED
FIXED
mozilla36
People
(Reporter: rik, Assigned: bent.mozilla)
References
Details
Attachments
(2 files, 4 obsolete files)
61.58 KB,
text/x-log
|
Details | |
7.56 KB,
patch
|
janv
:
review+
|
Details | Diff | Splinter Review |
On my dogfooding flame, after today's update, I'm getting an empty Contacts app. This is the error I'm seeing at launch: W/GeckoConsole( 297): [JavaScript Error: "IndexedDB UnknownErr: ActorsParent.cpp:5510"] W/GeckoConsole( 297): [JavaScript Error: "TypeError: aFailureCb is not a function" {file: "resource://gre/modules/ContactDB.jsm" line: 1060}] Before that update, since the middle of last week, only ~40% of my contacts would show up (stops at a contact starting with H) with this error: W/GeckoConsole( 287): [JavaScript Error: “IndexedDB UnknownErr: ActorsParent.cpp:5510”] E/Communications( 3866): Content JS ERROR: Error getting first contacts
Assignee | ||
Comment 1•10 years ago
|
||
Hm, any chance you can email me your contacts database+files?
That error just means that the other process crashed though, right?
Comment 3•10 years ago
|
||
I'm not getting an empty Contacts app, but I also noticed similar errors in logcat today (only line numbers are different): E/QCALOG ( 306): [MessageQ] ProcessNewMessage: [XT-CS] unknown deliver target [OS-Agent] E/QCALOG ( 306): [MessageQ] ProcessNewMessage: [XTWWAN-PE] unknown deliver target [OS-Agent] I/Gecko ( 208): I/Gecko ( 208): ###!!! [Parent][DispatchAsyncMessage] Error: (msgtype=0x120005,name=PBackground::Msg_PBlobConstructor) Value error: message was deserialized, but contained an illegal value I/Gecko ( 208): I/Gecko ( 208): IPDL protocol error: could not look up PBlob I/Gecko ( 208): IPDL protocol error: Error deserializing 'PBlobParent' I/Gecko ( 208): I/Gecko ( 208): ###!!! [Parent][DispatchAsyncMessage] Error: (msgtype=0x60004,name=PBackgroundIDBDatabase::Msg_PBackgroundIDBDatabaseFileConstructor) Value error: message was deserialized, but contained an illegal value I/Gecko ( 208): I/Gecko ( 208): I/Gecko ( 208): ###!!! [Parent][MessageChannel] Error: Channel error: cannot send/recv I/Gecko ( 208): W/GeckoConsole( 208): [JavaScript Error: "IndexedDB UnknownErr: ActorsParent.cpp:7692"] W/GeckoConsole( 208): [JavaScript Error: "IndexedDB UnknownErr: ActorsParent.cpp:7692"] W/GeckoConsole( 208): [JavaScript Error: "IndexedDB UnknownErr: ActorsParent.cpp:7692"] W/GeckoConsole( 208): [JavaScript Error: "IndexedDB UnknownErr: ActorsParent.cpp:6065"] W/AudioFlinger( 204): Thread AudioOut_2 cannot connect to the power manager service E/AudioFlinger( 204): no wake lock to update! I/Gecko ( 1672): ###################################### forms.js loaded I/Gecko ( 1672): ############################### browserElementPanning.js loaded I/Gecko ( 1672): ######################## BrowserElementChildPreload.js loaded D/wpa_supplicant( 992): RX ctrl_iface - hexdump(len=11): 53 49 47 4e 41 4c 5f 50 4f 4c 4c D/wpa_supplicant( 992): wlan0: Control interface command 'SIGNAL_POLL' D/wpa_supplicant( 992): nl80211: survey data missing! E/QCALOG ( 306): [MessageQ] ProcessNewMessage: [XTWiFi-PE] unknown deliver target [OS-Agent] This log output was either a manifestation of bug 1079313 or one of bug 1079322, can't remember.
Reporter | ||
Comment 4•10 years ago
|
||
Just sent the .sqlite and .sqlite-journal files to Ben.
Comment 5•10 years ago
|
||
I get the same issue (W/GeckoConsole( 297): [JavaScript Error: "IndexedDB UnknownErr: ActorsParent.cpp:5510"]) when launching the SMS App from my 2.0 backup, but the consequences is that the SMS app does not load :) so basically same thing. I can send my db too if necessary, ben.
Comment 6•10 years ago
|
||
I think we have several bugs filled with this problem. Gabriele did you opened a bug related to this problem too? (Cannot find it)
Flags: needinfo?(gsvelto)
Comment 7•10 years ago
|
||
Yes, there's plenty of fallout from recent IndexedDB changes. The most similar is bug 1082505, :khuey suggested trying the fix from bug 1082543 but I haven't had time to do it yet.
Flags: needinfo?(gsvelto)
Comment hidden (obsolete) |
Assignee | ||
Comment 9•10 years ago
|
||
Hrm. So I just looked at the database, it looks like your 'getallcache' obejctstore is empty. This may be a dupe of 1072773.
Comment hidden (obsolete) |
Comment 11•10 years ago
|
||
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #9) > Hrm. So I just looked at the database, it looks like your 'getallcache' > obejctstore is empty. This may be a dupe of 1072773. But I have a similar issue with the SMS app too. From time to time, I have the same error, and the SMS app stays blank. Same IndexedDB error in the log. But simply restarting the app then works. I'll try tomorrow with a newer build and I can send you the SMS DB if necessary.
Updated•10 years ago
|
Flags: needinfo?(felash)
Comment 12•10 years ago
|
||
Flags: needinfo?(felash)
Comment 13•10 years ago
|
||
I think we still have the same error in the latest build. It prevented from booting after a full flash. (see boot.log attachment) Then I rebooted using "adb reboot" and it correctly booted. I then flashed my own SMS and Contacts db to the phone, then rebooted (I can provide all of this if you want). Then I launched the SMS app, the DB migration likely started (my db comes from FxOS v1.3). But it worked fine. So I don't have the issue with the same STR right now (I tried several times to push my db). This is weird. I'll report if I see this again.
Assignee | ||
Comment 14•10 years ago
|
||
Does SMS use a separate objectStore for a cache like contacts? One that it might not always include in the same transaction?
Comment 15•10 years ago
|
||
SMS actually uses 3 databases: * the one in mozContacts * the one in mozMobileMessaging * one managed by asyncStorage to store drafts Each database are independant, and for sure they all use different transactions when we write to them. But we use no database to cache stuff yet, they all have different responsibilities. But as I said, I didn't reproduce the issue in a recent build in SMS :/
Comment 16•10 years ago
|
||
Sorry, just reproduced on latest nightly.
Updated•10 years ago
|
Summary: Contact app is empty at launch → Contacts/SMS apps are sometimes empty at launch
Assignee | ||
Comment 17•10 years ago
|
||
Bug 1087464 may fix this too.
Comment 19•10 years ago
|
||
I need to check with a newer gecko. Let me update my phones and I'll report.
Assignee | ||
Comment 20•10 years ago
|
||
Thanks!
Comment 21•10 years ago
|
||
yeah, just got it on latest build for Buri :( It seems to happen after a reboot only, but I get it quite consistently on this build.
Flags: needinfo?(felash)
Comment 22•10 years ago
|
||
"quite" consistently, not 100%. Of 3 reboots, I got the issue twice.
Comment 24•10 years ago
|
||
0-29 10:05:32.940 1222 1257 W GeckoConsole: [JavaScript Error: "IndexedDB UnknownErr: ActorsParent.cpp:415"] 10-29 10:05:33.060 1481 1481 E Messages: Content JS ERROR: Reading the database. Error: InternalError This is a filtered logcat, do you want an unfiltered log?
Flags: needinfo?(felash)
Comment 25•10 years ago
|
||
I checked the unfiltered log, I have nothing better.
Attaching gdb, breaking at http://mxr.mozilla.org/mozilla-central/source/dom/indexedDB/ActorsParent.cpp#415, and printing aResultCode and the stack (bt) would help.
Assignee | ||
Comment 27•10 years ago
|
||
Julien, also, if you don't could you email me the contacts db and files folder? I promise not to post it on dropbox! :) Though if you're uncomfortable with that I can try to give you a better logging patch to make this more debuggable.
Flags: needinfo?(felash)
Comment 28•10 years ago
|
||
Unfortunately I don't do my own build these days so gdb is not useful. I'm sending to you the data I use.
Flags: needinfo?(felash)
Assignee | ||
Comment 29•10 years ago
|
||
Ok, I know what is going on here now. This isn't actually caused by bug 994190, we're just losing a race here.
No longer blocks: IndexedDB-on-PBackground
Assignee | ||
Comment 30•10 years ago
|
||
Ok, this patch fixes it locally for me. I'll try to chat with the SQLite team to see if there are better ways to do this.
Assignee: nobody → bent.mozilla
Status: NEW → ASSIGNED
Assignee | ||
Comment 31•10 years ago
|
||
Comment on attachment 8517846 [details] [diff] [review] Patch, v1 Review of attachment 8517846 [details] [diff] [review]: ----------------------------------------------------------------- Jan, how does this look?
Attachment #8517846 -
Flags: review?(Jan.Varga)
Comment 32•10 years ago
|
||
Comment on attachment 8517846 [details] [diff] [review] Patch, v1 Review of attachment 8517846 [details] [diff] [review]: ----------------------------------------------------------------- looks good - if there's no other way to fix it
Attachment #8517846 -
Flags: review?(Jan.Varga) → review+
Assignee | ||
Comment 33•10 years ago
|
||
The SQLite folks think that this will be a bit safer. Jan, how does this look?
Attachment #8517846 -
Attachment is obsolete: true
Attachment #8518447 -
Flags: review?(Jan.Varga)
Assignee | ||
Comment 34•10 years ago
|
||
Comment on attachment 8518447 [details] [diff] [review] Patch, v2 This one doesn't seem to work as well... Let me tweak it a bit more.
Attachment #8518447 -
Flags: review?(Jan.Varga)
Assignee | ||
Updated•10 years ago
|
Attachment #8518447 -
Attachment is obsolete: true
Assignee | ||
Comment 35•10 years ago
|
||
Comment on attachment 8517846 [details] [diff] [review] Patch, v1 This one works. v2 doesn't.
Attachment #8517846 -
Attachment is obsolete: false
Assignee | ||
Comment 36•10 years ago
|
||
This is basically the same as v1 but simplified a bit, and with beefier comments.
Attachment #8517846 -
Attachment is obsolete: true
Attachment #8519026 -
Flags: review?(Jan.Varga)
Comment 37•10 years ago
|
||
Comment on attachment 8519026 [details] [diff] [review] Patch, v1.1 Review of attachment 8519026 [details] [diff] [review]: ----------------------------------------------------------------- Looks good, I wonder if we have other bugs filed which are caused by the same issue.
Attachment #8519026 -
Flags: review?(Jan.Varga) → review+
Assignee | ||
Comment 38•10 years ago
|
||
This is basically the same patch as v1 but we have an additional failsafe in mozStorageConnection.cpp. Mak, I just need you to review the change to mozStorageConnection. We were being too conservative there before and not letting connections rollback to drop their locks.
Attachment #8519026 -
Attachment is obsolete: true
Attachment #8519082 -
Flags: review?(mak77)
Attachment #8519082 -
Flags: review?(Jan.Varga)
Updated•10 years ago
|
Attachment #8519082 -
Flags: review?(Jan.Varga) → review+
Comment 39•10 years ago
|
||
Comment on attachment 8519082 [details] [diff] [review] Patch, v1.2 Review of attachment 8519082 [details] [diff] [review]: ----------------------------------------------------------------- ::: storage/src/mozStorageConnection.cpp @@ +977,5 @@ > > srv = WaitForUnlockNotify(aNativeConnection); > + if (MOZ_UNLIKELY(srv != SQLITE_OK)) { > + NS_WARNING("WaitForUnlockNotify detected deadlock, yielding to let " > + "another connection drop its locks!"); is it possible a few yields won't be enough to exit the deadlock and we will end up looping here indefinitely or for a very long time? I was thinking maybe we should still check a time diff here, and after a meaningful amount of time (a few seconds?) MOZ_ASSERT and still break out of the loop.
Assignee | ||
Comment 40•10 years ago
|
||
I'm not sure how I feel about a timeout like that... It just depends on how fast a machine's cpu and disk are. Sometimes you'll just get unlucky and get an error, but it might be rare so writing code that handles that error will be challenging. Also, the sqlite3_unlock_notify() that we call today waits forever with no timeout, right?
Comment 41•10 years ago
|
||
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #40) > be challenging. Also, the sqlite3_unlock_notify() that we call today waits > forever with no timeout, right? if sqlite3_unlock_notify detects a deadlock it returns SQLITE_LOCKED but we wait the unlock notification only if it returns SQLITE_OK. Afaict your code is handling the SQLITE_LOCKED (or other error) case in which we were unable to install a notification callback at all, most likely because Sqlite detected a deadlock across 2 connections. That deadlock wouldn't be solved just by waiting the shared lock an so Sqlite doesn't even try to install the callback. This is how I interpreted the Deadlock Detection section at https://www.sqlite.org/c3ref/unlock_notify.html and our implementation of WaitForUnlockNotify. So the problem I see there, is that at one of the next loops, the deadlock could be solved by one of the 2 connections, but that's not mandatory, it could be a real deadlock that will never solve. The previous break was then just giving up and returning an error, to avoid entering the deadlock, while with your change we could be trapped indefinitely in the loop. Did I misinterpreted something about your proposal?
Assignee | ||
Comment 42•10 years ago
|
||
(In reply to Marco Bonardo [::mak] (needinfo? me) from comment #41) > This is how I interpreted the Deadlock Detection Yep, that is correct. > Did I misinterpreted something about your proposal? Nope, I think you nailed the mozStorage part. The only thing you're missing is that we explicitly ROLLBACK the *other* connection in indexedDB (see the ActorsParent.cpp changes in the patch) when this happens so that we're guaranteed to never deadlock forever. Of course, that is only true for indexedDB connections, but as far as I know the only time we ever use the unlock_notify mechanism is with indexedDB anyway. However, in the meantime, I have figured out another way around this problem that won't require any mozStorage changes, so I don't think we have to worry about this at all anymore. Sorry!
Assignee | ||
Comment 43•10 years ago
|
||
Nuke the table locks from orbit.
Attachment #8519082 -
Attachment is obsolete: true
Attachment #8519082 -
Flags: review?(mak77)
Attachment #8520703 -
Flags: review?(Jan.Varga)
Comment 44•10 years ago
|
||
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #42) > Nope, I think you nailed the mozStorage part. The only thing you're missing > is that we explicitly ROLLBACK the *other* connection in indexedDB (see the > ActorsParent.cpp changes in the patch) when this happens so that we're > guaranteed to never deadlock forever. Of course, that is only true for > indexedDB connections, but as far as I know the only time we ever use the > unlock_notify mechanism is with indexedDB anyway. Yes my point was not specifically about indexedDB, rather about other consumers that could cause trouble to Firefox. It's true the possibilities to hit this kind of deadlock are low, but the add-ons ecosystem is very unpredictable, so I would have preferred to not risk the possibility of an infinite loop here.
Comment 45•10 years ago
|
||
Comment on attachment 8520703 [details] [diff] [review] Patch, v2 Review of attachment 8520703 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/indexedDB/ActorsParent.cpp @@ +2204,5 @@ > + // ourselves and we know we will never allow a write transaction to modify > + // an object store that a read transaction is in the process of using. > + "PRAGMA read_uncommitted = TRUE;" > + // No more PRAGMAs. > + ; This looks so scary and awesome at the same time :)
Attachment #8520703 -
Flags: review?(Jan.Varga) → review+
Assignee | ||
Comment 46•10 years ago
|
||
For those following along, this fix works fine, but exposes us to another SQLite bug that the SQLite team is testing a fix for right now. Once that fix is available we can land both.
Assignee | ||
Updated•10 years ago
|
Depends on: SQLite3.8.7.2
Assignee | ||
Comment 47•10 years ago
|
||
Ok, we have the new SQLite version ready in bug 1101217.
Assignee | ||
Comment 48•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/577bc6a54c1d
https://hg.mozilla.org/mozilla-central/rev/577bc6a54c1d
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
Comment 50•10 years ago
|
||
\o/ thanks
You need to log in
before you can comment on or make changes to this bug.
Description
•