Closed Bug 1541776 (idb-shutdown-hangs) Opened 7 months ago Closed 8 days ago

Crash in [@ mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownTimedOut]

Categories

(Core :: DOM: IndexedDB, defect, P1, critical)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- unaffected
firefox67 --- unaffected
firefox67.0.1 --- unaffected
firefox68 --- wontfix
firefox69 --- wontfix
firefox70 + fixed
firefox71 --- fixed

People

(Reporter: janv, Assigned: janv)

References

(Blocks 1 open bug)

Details

(Keywords: crash, regression, topcrash)

Crash Data

This bug is for crash report bp-82654c09-3450-4241-90d7-4a1b80190404.

Top 10 frames of crashing thread:

0 xul.dll void mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownTimedOut dom/indexedDB/ActorsParent.cpp:16304
1 xul.dll static void mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<unnamed-tag>::operator dom/indexedDB/ActorsParent.cpp:16231
2 xul.dll static void mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<unnamed-tag>::__invoke dom/indexedDB/ActorsParent.cpp:16231
3 xul.dll nsresult nsTimerEvent::Run xpcom/threads/TimerThread.cpp:260
4 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1180
5 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:482
6 xul.dll static bool mozilla::SpinEventLoopUntil<mozilla::ProcessFailureBehavior::ReportToCaller, `lambda at z:/task_1554323631/build/src/dom/indexedDB/ActorsParent.cpp:16241:3'> xpcom/threads/nsThreadUtils.h:348
7 xul.dll void mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownWorkThreads dom/indexedDB/ActorsParent.cpp:16241
8 xul.dll void mozilla::dom::quota::QuotaManager::Shutdown dom/quota/ActorsParent.cpp:3344
9 xul.dll static void mozilla::dom::quota::QuotaManager::ShutdownInstance dom/quota/ActorsParent.cpp:2958

Blocks: 1541370

I wonder why I don't see the new "IndexedDBShutdownTimeout" annotation in the crash report.

https://crash-stats.mozilla.com/report/index/82654c09-3450-4241-90d7-4a1b80190404#tab-metadata
IndexedDBShutdownTimeout: gFactoryOps: 26

https://crash-stats.mozilla.com/report/index/0676ba42-e2d5-4b17-93c4-2dbd60190404#tab-metadata
IndexedDBShutdownTimeout: gFactoryOps: 26

https://crash-stats.mozilla.com/report/index/5569a872-c513-41ff-ad3c-30ce50190404#tab-metadata
IndexedDBShutdownTimeout: gFactoryOps: 26

Note that I didn't find gLiveDatabaseHashtable & mCurrentMaintenance in the reports so they should be empty. Besides, all of three reports show that there are 26 Ops which seems to be dead already but without being cleaning up from the global hashtable.

(In reply to Tom Tung [:tt, :ttung] from comment #2)

Note that I didn't find gLiveDatabaseHashtable & mCurrentMaintenance in the reports so they should be empty. Besides, all of three reports show that there are 26 Ops which seems to be dead already but without being cleaning up from the global hashtable.

Yes, from the global array.

Do you know what can be causing this ?

Crash Signature: [@ mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownTimedOut] → [@ mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownTimedOut] [@ mozilla::dom::indexedDB::(anonymous namespace)::QuotaClient::ShutdownTimedOut]

The priority flag is not set for this bug.
:overholt, could you have a look please?

Flags: needinfo?(overholt)
Flags: needinfo?(overholt)
Priority: -- → P2
Depends on: 1544870

It seems there is something strange.

https://crash-stats.mozilla.com/report/index/fe5a864c-3a7e-4d94-a7c7-369700190408#allthreads

The metadata doesn't show anything and on the QuotaManager IO thread is initializing so that I am assuming it shouldn't wait on SpinEventLoopUntil and thus the shutdown timer on IDB shouldn't expire before canceled. However, we have these things on the stack for the report.

Also, the shutdown timer on IDB is 50s, that means the initialization time for the report is more than that if there is no anything unexpected.

This is spiking up in 68 since it went to beta, is there anything we can do here?

Flags: needinfo?(shes050117)

Nevermind, I see Jan's working on tracking that down in another bug.

Flags: needinfo?(shes050117)
Crash Signature: [@ mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownTimedOut] [@ mozilla::dom::indexedDB::(anonymous namespace)::QuotaClient::ShutdownTimedOut] → [@ mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownTimedOut] [@ mozilla::dom::indexedDB::(anonymous namespace)::QuotaClient::ShutdownTimedOut] [@ mozilla::dom::indexedDB::(anonymous namespace)::QuotaClient::ShutdownWorkThreads::$_3:…
Duplicate of this bug: 1554535
Crash Signature: namespace)::QuotaClient::ShutdownWorkThreads::$_3::__invoke] [@ mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<T>::__invoke] → namespace)::QuotaClient::ShutdownWorkThreads::$_3::__invoke] [@ mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<T>::__invoke] [@ mozilla::dom::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<T>::__invoke]
Crash Signature: namespace)::QuotaClient::ShutdownWorkThreads::$_3::__invoke] [@ mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<T>::__invoke] [@ mozilla::dom::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<T>::__invoke] → namespace)::QuotaClient::ShutdownWorkThreads::$_3::__invoke] [@ mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<T>::__invoke]

[@ mozilla::dom::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<T>::__invoke] is bug 1541928

Crash Signature: namespace)::QuotaClient::ShutdownWorkThreads::$_3::__invoke] [@ mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<T>::__invoke] → namespace)::QuotaClient::ShutdownWorkThreads::$_3::__invoke] [@ mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<T>::__invoke] [@ mozilla::dom::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<T>::__invoke]
Crash Signature: namespace)::QuotaClient::ShutdownWorkThreads::$_3::__invoke] [@ mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<T>::__invoke] [@ mozilla::dom::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<T>::__invoke] → namespace)::QuotaClient::ShutdownWorkThreads::$_3::__invoke] [@ mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownWorkThreads::<T>::__invoke]

I read some of the reports today while I was building gecko. Although they timeouted on the background thread (because some requests probably haven't been finished yet), their QuotaManager IO thread is in storage initialization. I reckon these IDB requests were waiting for jumping to the QuotaManager IO thread and QM IO thread is doing initialization request from some component (e.g. LS).

IIUC, based on these reports, their initialization times are at least greater than 50s which is way more than what we collected from the telemetry for the successful cases.

If that is true, then we should probably make initialization cancelable or somehow prevent it from initializing that long.

Yeah, that's on my TODO list too, but I want to get some data from bug 1542478 first.

Looks pretty bad in beta.

Keywords: topcrash

Julien, this is not a new issue, we just had to create separate crash signatures for IndexedDB and LocalStorage and also filter out shutdown hangs/crashes that are not related to QuotaManager/IndexedDB/LocalStorage.
The original issue (shutdown hangs) was tracked in bug 1542485. If you look at the graph for beta in that bug, it went down rapidly.
I'm sorry it may look confusing at first glance, but we had to create these new crash signatures to be able to focus on specific areas in the code and eventually fix these shutdown hangs. So here's a simple outline for checking the issue (shutdown hangs) in different versions of FF:
Nightly and Beta - bug 1541776
Release - bug 1542485
ESR - bug 1487194

Ok, so far there are 3 crash reports with the new crash annotation and all 3 report this:
gFactoryOps: 8 (persistent*aaaaaa*DatabaseOpenPending, persistent*aaaaaa*DatabaseWorkOpen)

"aaaaaa" is probably "chrome"

"DatabaseOpenPending" means that a previous factory request hasn't finished yet so the factory op must wait
"DatabaseWorkOpen" means that the factory op is doing work on QuotaManager I/O thread

I'm going to add some "IsShuttingDown" checks to temporary storage initialization and we will see.

However, I'm not sure how "persistent" can trigger temporary storage initialization.

Ok, so it seems that something else triggered temporary storage initialization and these factory ops are just waiting to be processed on the QuotaManager I/O thread.

Depends on: 1556730

It seems bug 1556730 has helped to reduce these crashes (shutdown hangs).

Alias: idb-shutdown-hangs
Depends on: 1557820
Assignee: nobody → jvarga
Status: NEW → ASSIGNED
Depends on: 1542541
Depends on: 1557997
Depends on: 1542478
Priority: P2 → P1

Volume looks much lower on 68 starting with 68.0b8

7000 crashes per week on 68 release is still quite high.
Jan, is still something you or someone else could look at again? Or is that being done in the dependencies?

Flags: needinfo?(jvarga)
Keywords: topcrash
Depends on: 1581127
Flags: needinfo?(jvarga)

It seems bug 1581127 helped with reducing IDB shutdown hangs.

One of the remaining problems that block IDB shutdown is database integrity check which runs as part of database maintenance, see this crash report (especially the IndexedDB Mnt #1 thread):
https://crash-stats.mozilla.org/report/index/5141c5be-9827-4626-a798-1cdc50190923#allthreads

I guess that the integrity check doesn't use the progress handler, so we can't interrupt it when the shutdown is detected.

Another problem is that looping over stored files in FileManager::GetUsage can take long time, see this crash report (especially the QuotaManager IO thread):
https://crash-stats.mozilla.org/report/index/9aa4231a-d474-4758-ad71-5ab120190923#allthreads

Another problem is that storage upgrade doesn't check if shutdown is in progress, see this crash report (especially the QuotaManager IO thread):
https://crash-stats.mozilla.org/report/index/70b98a32-59d5-4e03-a66c-720b50190925#allthreads

Depends on: 1584323

Another problem is that quota manager clears origins and that operation takes long time, see this crash report (especially the QuotaManager IO thread):
https://crash-stats.mozilla.org/report/index/1f05c3f4-d992-4227-b685-c2f790190925#allthreads

We probably can't just abort that operation because it can be a privacy related clear. It needs to be investigated more if we see many shutdown hangs like this.

The clearing is also slowed down by clearing the shadow database of LocalStorage. See this crash report (especially the QuotaManager IO thread):
https://crash-stats.mozilla.org/report/index/a0ef1059-d9a8-49b5-bb29-2fd490190926#allthreads

The shadow database will be removed at some point.

CreateLocalStorageArchiveConnectionFromWebAppsStore can also block shutdown. See this crash report (especially the QuotaManager IO thread):
https://crash-stats.mozilla.org/report/index/4b69e49b-906c-43ca-bf47-56d310190926#allthreads

Another problem is that looping over IDB databases in QuotaClient::GetDatabaseFilenames can take long time, see this crash report (especially the QuotaManager IO thread):
https://crash-stats.mozilla.org/report/index/26b4d0f9-64fd-4856-a954-e813e0190930#allthreads

Another problem is that removing files in a loop in QuotaManager::CheckTemporaryStorageLimits can take long time, see this crash report (especially the QuotaManager IO thread):
https://crash-stats.mozilla.org/report/index/43b5f476-773e-496b-b707-5fc060191002#allthreads

(In reply to Jan Varga [:janv] from comment #21)

It seems bug 1581127 helped with reducing IDB shutdown hangs.

Looking at the recent crash data and according to Jan's comment above, I think we should set status-70 to fixed.
(I am stealing Jan's honor to comment here :P )

Flags: needinfo?(lhenry)
Flags: needinfo?(lhenry)

Marking as fixed given low frequency. Remaining issues are tracked in bug 1588498.

Status: ASSIGNED → RESOLVED
Closed: 8 days ago
Resolution: --- → FIXED
See Also: → 1588498
Target Milestone: --- → mozilla71
You need to log in before you can comment on or make changes to this bug.