Closed
Bug 1160459
Opened 10 years ago
Closed 9 years ago
shutdown hang in mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownWorkThreads()
Categories
(Core :: Storage: IndexedDB, defect, P3)
People
(Reporter: jimm, Unassigned)
References
Details
20:36:32 INFO - 4 nss3.dll!PR_Wait [prmon.c:55826466dd7b : 294 + 0xd]
20:36:32 INFO - rip = 0x000007f9d3e4366b rsp = 0x000000094041ecf0
20:36:32 INFO - rbp = 0x0000000940812438
20:36:32 INFO - Found by: call frame info
20:36:32 INFO - 5 xul.dll!nsEventQueue::GetEvent(bool,nsIRunnable * *) [nsEventQueue.cpp:55826466dd7b : 67 + 0x10]
20:36:32 INFO - rip = 0x000007f9cda733a0 rsp = 0x000000094041ed30
20:36:32 INFO - rbp = 0x0000000940812438
20:36:32 INFO - Found by: call frame info
20:36:32 INFO - 6 xul.dll!nsThread::ProcessNextEvent(bool,bool *) [nsThread.cpp:55826466dd7b : 857 + 0x15]
20:36:32 INFO - rip = 0x000007f9cda758f5 rsp = 0x000000094041ed60
20:36:32 INFO - rbp = 0x0000000940812438
20:36:32 INFO - Found by: call frame info
20:36:32 INFO - 7 xul.dll!NS_ProcessNextEvent(nsIThread *,bool) [nsThreadUtils.cpp:55826466dd7b : 265 + 0xc]
20:36:32 INFO - rip = 0x000007f9cda91aff rsp = 0x000000094041ef40
20:36:32 INFO - rbp = 0x0000000940812438
20:36:32 INFO - Found by: call frame info
20:36:32 INFO - 8 xul.dll!mozilla::dom::indexedDB::`anonymous namespace'::QuotaClient::ShutdownWorkThreads() [ActorsParent.cpp:55826466dd7b : 15103 + 0x9]
20:36:32 INFO - rip = 0x000007f9cea6bf9a rsp = 0x000000094041ef70
20:36:32 INFO - rbp = 0x0000000940812438
20:36:32 INFO - Found by: call frame info
20:36:32 INFO - 9 xul.dll!mozilla::dom::quota::QuotaManager::Observe(nsISupports *,char const *,wchar_t const *) [QuotaManager.cpp:55826466dd7b : 2859 + 0x11]
20:36:32 INFO - rip = 0x000007f9ce998b49 rsp = 0x000000094041efa0
20:36:32 INFO - rbp = 0x0000000940812438
20:36:32 INFO - Found by: call frame info
20:36:32 INFO - 10 xul.dll!nsObserverList::NotifyObservers(nsISupports *,char const *,wchar_t const *) [nsObserverList.cpp:55826466dd7b : 113 + 0x13]
20:36:32 INFO - rip = 0x000007f9cda4d4bb rsp = 0x000000094041f1d0
20:36:32 INFO - rbp = 0x0000000940812438
20:36:32 INFO - Found by: call frame info
20:36:32 INFO - 11 xul.dll!nsObserverService::NotifyObservers(nsISupports *,char const *,wchar_t const *) [nsObserverService.cpp:55826466dd7b : 334 + 0x10]
20:36:32 INFO - rip = 0x000007f9cda4d5ab rsp = 0x000000094041f210
20:36:32 INFO - rbp = 0x0000000940812438
20:36:32 INFO - Found by: call frame info
20:36:32 INFO - 12 xul.dll!nsXREDirProvider::DoShutdown() [nsXREDirProvider.cpp:55826466dd7b : 902 + 0x18]
20:36:32 INFO - rip = 0x000007f9cf1d0e78 rsp = 0x000000094041f240
20:36:32 INFO - rbp = 0x0000000940812438
20:36:32 INFO - Found by: call frame info
20:36:32 INFO - 13 xul.dll!ScopedXPCOMStartup::~ScopedXPCOMStartup() [nsAppRunner.cpp:55826466dd7b : 1318 + 0xb]
20:36:32 INFO - rip = 0x000007f9cf1c7a93 rsp = 0x000000094041f270
20:36:32 INFO - rbp = 0x0000000940812438
20:36:32 INFO - Found by: call frame info
20:36:32 INFO - 14 xul.dll!XREMain::XRE_main(int,char * * const,nsXREAppData const *) [nsAppRunner.cpp:55826466dd7b : 4177 + 0x14]
20:36:32 INFO - rip = 0x000007f9cf1cc7ea rsp = 0x000000094041f2a0
20:36:32 INFO - rbp = 0x0000000940812438
20:36:32 INFO - Found by: call frame info
20:36:32 INFO - 15 xul.dll!XRE_main [nsAppRunner.cpp:55826466dd7b : 4240 + 0x11]
20:36:32 INFO - rip = 0x000007f9cf1ce960 rsp = 0x000000094041f320
20:36:32 INFO - rbp = 0x0000000940812438
20:36:32 INFO - Found by: call frame info
20:36:32 INFO - 16 firefox.exe!do_main [nsBrowserApp.cpp:55826466dd7b : 214 + 0x17]
20:36:32 INFO - rip = 0x000007f71e131a0a rsp = 0x000000094041f4d0
20:36:32 INFO - rbp = 0x0000000940812438
Reporter
Comment 1•10 years ago
We're currently getting about 30 or 40 of these a day while running tests.
Reporter
Updated•10 years ago
Component: General → DOM: IndexedDB
Comment 2•10 years ago
This looks like nested event loops. Could we perhaps use AsyncShutdown instead?
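To illustrate the concern, here is a toy Python model (not Gecko code) of the pattern visible in frames 5 to 8 of the stack: the shutdown routine spins a nested event loop until some completion condition holds. `EventLoop`, `shutdown_by_spinning`, and `max_spins` are hypothetical names invented for this sketch. The real loop blocks in `PR_Wait` when the queue is empty; the sketch returns `False` at that point instead, so the stuck case can be observed.

```python
from collections import deque


class EventLoop:
    """Toy single-threaded event queue; a stand-in, not Gecko's nsThread."""

    def __init__(self):
        self.queue = deque()

    def dispatch(self, runnable):
        self.queue.append(runnable)

    def process_next_event(self):
        """Run one queued runnable; return False if the queue was empty."""
        if not self.queue:
            return False
        self.queue.popleft()()
        return True


def shutdown_by_spinning(loop, is_done, max_spins=1000):
    """Nested-loop style shutdown: spin the loop until is_done() is true.

    Returns True on a clean shutdown. Returns False if the queue drains
    while is_done() is still false, which is where a blocking loop would
    wait forever.
    """
    for _ in range(max_spins):
        if is_done():
            return True
        if not loop.process_next_event():
            return False  # nothing left to run and still not done: a hang
    return is_done()


# Case 1: the completion notification is already queued.
loop = EventLoop()
state = {"done": False}
loop.dispatch(lambda: state.update(done=True))
clean = shutdown_by_spinning(loop, lambda: state["done"])

# Case 2: the completion notification never arrives.
stuck_loop = EventLoop()
stuck = shutdown_by_spinning(stuck_loop, lambda: False)
```

In the first case the spin exits as soon as the queued runnable sets the flag. In the second, nothing will ever set it, which corresponds to the main thread sitting in `PR_Wait` in the stack above. The sketch models only the nested spin, not any asynchronous alternative.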
Updated•10 years ago
tracking-e10s:
--- → ?
Comment 3•10 years ago
The recent surge in the intermittent looks like it could be a regression from bug 1131766. Can you look at this, Kyle, as Ben is on PTO?
Flags: needinfo?(khuey)
I was out that week too :P This is janv territory anyways.
Flags: needinfo?(khuey) → needinfo?(Jan.Varga)
Hm, where is the data here? Is it only happening on windows? (/me assumes windows because jimm filed)
Flags: needinfo?(bent.mozilla) → needinfo?(jmathies)
Reporter
Comment 7•9 years ago
(In reply to Ben Turner [:bent] (use the needinfo flag!) from comment #6)
> Hm, where is the data here? Is it only happening on windows? (/me assumes
> windows because jimm filed)
This is a test-only failure afaict; see the bug this one blocks, bug 1121145. It looks like it is all Windows.
Flags: needinfo?(jmathies)
Comment 8•9 years ago
This is probably caused by bug 1180978.
Depends on: 1180978
Flags: needinfo?(Jan.Varga)
Reporter
Updated•9 years ago
Flags: needinfo?(mrbkap)
Comment 9•9 years ago
FYI, I'm strongly considering hiding various Windows mochitest-bc suites affected by bug 1121145. What can we do to prioritize getting this (and/or deps it has) fixed?
Flags: needinfo?(jmathies)
Flags: needinfo?(Jan.Varga)
We can land bug 1180978 without the assertion and see what happens.
Reporter
Comment 11•9 years ago
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #9)
> FYI, I'm strongly considering hiding various Windows mochitest-bc suites
> affected by bug 1121145. What can we do to prioritize getting this (and/or
> deps it has) fixed?
Sounds like bug 1180978 needs to be made a priority, assuming it's the cause.
Flags: needinfo?(jmathies)
Reporter
Comment 12•9 years ago
(In reply to Jim Mathies [:jimm] from comment #11)
> (In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #9)
> > FYI, I'm strongly considering hiding various Windows mochitest-bc suites
> > affected by bug 1121145. What can we do to prioritize getting this (and/or
> > deps it has) fixed?
>
> Sounds like bug 1180978 needs to be made a priority, assuming it's the cause.
Hmm, not looking very promising, but let's see how it goes today: bug 1180978 landed on inbound prior to a test failure report on inbound in bug 1121145.
That doesn't seem to have helped, unfortunately.
Looking at http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-win32-pgo/1437424212/mozilla-inbound_win7-ix_test_pgo-mochitest-e10s-browser-chrome-2-bm112-tests1-windows-build222.txt.gz, the main thread is blocked on the IDB background thread, which is blocked waiting for the ConnectionPool to shut down. There's clearly a connection thread outstanding (thread 55), but it's not doing anything.
I'm inclined to stick a fatal assertion in http://hg.mozilla.org/mozilla-central/annotate/2ddec2dedced/dom/indexedDB/ActorsParent.cpp#l11008 that checks to see if we are shutting down. What do you think, janv?
Comment 14•9 years ago
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) (UTC+8 July 17-25, expect delays) from comment #13)
> I'm inclined to stick a fatal assertion in
> http://hg.mozilla.org/mozilla-central/annotate/2ddec2dedced/dom/indexedDB/
> ActorsParent.cpp#l11008 that checks to see if we are shutting down. What do
> you think janv?
Ok, sounds good.
Flags: needinfo?(Jan.Varga)
Now that I think about it more I don't think that's correct. I think we'll end up in that path in the testcase from bug 1180978. Perhaps I should try just stacking a ton of blocked transactions on a readwrite and seeing what happens if we shut down.
Trying to reproduce this locally ... is it expected that ./mach mochitest -f browser --e10s starts up a new browser for each directory? Does that match the tinderbox behavior?
Flags: needinfo?(jmathies)
Comment 17•9 years ago
For mochitest-bc and mochitest-dt, I'd expect that, yes. We use run-by-dir on them.
Ok, there's no need to hide the whole test suite then because bug 1121145 appears to always happen in the customizableui directory.
Unfortunately customizableui doesn't actually use IndexedDB itself (at least not directly) so figuring out what's going on is non-trivial ...
Reporter
Comment 19•9 years ago
Looks like Ryan answered the question and that's good since I have no info on our "run by directory" practices in automation.
Flags: needinfo?(jmathies)
Reporter
Updated•9 years ago
Flags: needinfo?(mrbkap)
Comment 20•9 years ago
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #18)
> Ok, there's no need to hide the whole test suite then because bug 1121145
> appears to always happen in the customizableui directory.
Confirmed on Try that skipping the customizableui directory on Windows e10s makes the failures go away. Will try to bisect it down next.
Comment 21•9 years ago
Disabling half the tests in the directory led to all green regardless of which half were disabled :\
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b6a4ed2ff82c
https://treeherder.mozilla.org/#/jobs?repo=try&revision=e2042048e895
Not sure where we go from here.
Reporter
Comment 22•9 years ago
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #21)
> Disabling half the tests in the directory led to all green regardless of
> which half were disabled :\
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=b6a4ed2ff82c
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=e2042048e895
>
> Not sure where we go from here.
I would take one of these pushes and add back small blocks of disabled tests for pushing to try. Hopefully we'd find a block with a test in it that triggers the failure.
Comment 23•9 years ago
I took the green https://treeherder.mozilla.org/#/jobs?repo=try&revision=e2042048e895 push and split the disabled block in half. Both were orange.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a85e685ba2ae
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b0c5f7b76b74
I think I've done what I can do here, sorry.
status-firefox41:
--- → affected
status-firefox42:
--- → affected
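One hypothetical model that would be consistent with the Try results in comments 21 and 23 is a cumulative effect: the hang appears only once enough tests have run in the same browser session, regardless of which ones. The thread does not establish this. The test names, the count of 8, and the threshold of 5 below are all invented for illustration.

```python
def run_hangs(enabled, threshold):
    """Hypothetical oracle: shutdown hangs once enough tests ran in one session."""
    return len(enabled) >= threshold


# Placeholder names, not real customizableui tests.
tests = [f"test_{i}" for i in range(8)]
first_half, second_half = tests[:4], tests[4:]
THRESHOLD = 5  # assumed; the thread gives no such number

results = {
    "all enabled": run_hangs(tests, THRESHOLD),
    # Comment 21: disable either half of the directory.
    "first half disabled": run_hangs(second_half, THRESHOLD),
    "second half disabled": run_hangs(first_half, THRESHOLD),
    # Comment 23: starting from a green push, re-enable half of the disabled block.
    "first quarter of second half disabled": run_hangs(
        first_half + second_half[2:], THRESHOLD
    ),
    "second quarter of second half disabled": run_hangs(
        first_half + second_half[:2], THRESHOLD
    ),
}

for label, hangs in results.items():
    print(f"{label}: {'orange' if hangs else 'green'}")
```

Under this assumption no single test is the trigger, which would explain why halving the disabled set never isolates one. Other explanations, such as an interaction between specific tests, are equally possible.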
Updated•9 years ago
Priority: -- → P3
Comment 24•9 years ago
This appears to have gone away on its own somewhere along the way.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME