Closed
Bug 1156063
Opened 10 years ago
Closed 10 years ago
Intermittent application crashed [@ mozilla::dom::indexedDB::::ConnectionPool::Start] in various tests
Categories
(Core :: Storage: IndexedDB, defect)
Tracking
()
RESOLVED
FIXED
mozilla40
People
(Reporter: ehsan.akhgari, Assigned: bent.mozilla)
References
Details
(Keywords: intermittent-failure)
Attachments
(2 files)
9.88 KB,
patch
|
janv
:
review+
|
Details | Diff | Splinter Review |
8.96 KB,
patch
|
lizzard
:
approval-mozilla-aurora+
lizzard
:
approval-mozilla-esr38+
jocheng
:
approval-mozilla-b2g37+
|
Details | Diff | Splinter Review |
See for example <http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux/1429438111/mozilla-inbound_ubuntu32_hw_test-other_nol64-bm104-tests1-linux-build488.txt.gz>:
03:53:11 INFO - PROCESS-CRASH | sessionrestore_no_auto_restore | application crashed [@ mozilla::dom::indexedDB::::ConnectionPool::Start]
03:53:11 INFO - Crash dump filename: /tmp/tmprWC7Kh/profile/minidumps/12ef4634-0c48-15c8-5d41fe14-2dd3bd19.dmp
03:53:11 INFO - Operating system: Linux
03:53:11 INFO - 0.0.0 Linux 3.2.0-76-generic-pae #111-Ubuntu SMP Tue Jan 13 22:34:29 UTC 2015 i686
03:53:11 INFO - CPU: x86
03:53:11 INFO - GenuineIntel family 6 model 30 stepping 5
03:53:11 INFO - 8 CPUs
03:53:11 INFO - Crash reason: SIGSEGV
03:53:11 INFO - Crash address: 0x58
03:53:11 INFO - Thread 29 (crashed)
03:53:11 INFO - 0 libxul.so!mozilla::dom::indexedDB::::ConnectionPool::Start [ActorsParent.cpp:7953d3dd62ff : 9487 + 0x3]
03:53:11 INFO - eip = 0xb432aac5 esp = 0xa2bfefb0 ebp = 0xa2bff038 ebx = 0xb6e60538
03:53:11 INFO - esi = 0x9a454b00 edi = 0x99137ee0 eax = 0xa2ca5000 ecx = 0x00000000
03:53:11 INFO - edx = 0xa2ca5000 efl = 0x00210202
03:53:11 INFO - Found by: given as instruction pointer in context
03:53:11 INFO - 1 libxul.so!mozilla::dom::indexedDB::::OpenDatabaseOp::DispatchToWorkThread [ActorsParent.cpp:7953d3dd62ff : 17793 + 0x26]
03:53:11 INFO - eip = 0xb432b1c5 esp = 0xa2bff040 ebp = 0xa2bff098 ebx = 0xb6e60538
03:53:11 INFO - esi = 0x9a454b00 edi = 0x99137ee0
03:53:11 INFO - Found by: call frame info
03:53:11 INFO - 2 libxul.so!mozilla::dom::indexedDB::::FactoryOp::Run [ActorsParent.cpp:7953d3dd62ff : 17048 + 0x8]
03:53:11 INFO - eip = 0xb432e0cf esp = 0xa2bff0a0 ebp = 0xa2bff0f8 ebx = 0xb6e60538
03:53:11 INFO - esi = 0x9a454b00 edi = 0x96ff3420
03:53:11 INFO - Found by: call frame info
03:53:11 INFO - 3 libxul.so!mozilla::dom::indexedDB::::WaitForTransactionsHelper::Run [ActorsParent.cpp:7953d3dd62ff : 11404 + 0x5]
03:53:11 INFO - eip = 0xb431f598 esp = 0xa2bff100 ebp = 0xa2bff148 ebx = 0xb6e60538
03:53:11 INFO - esi = 0xa2c70510 edi = 0x96ff3420
03:53:11 INFO - Found by: call frame info
03:53:11 INFO - 4 libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:7953d3dd62ff : 866 + 0x5]
03:53:11 INFO - eip = 0xb3422b7a esp = 0xa2bff150 ebp = 0xa2bff1b8 ebx = 0xb6e60538
03:53:11 INFO - esi = 0xa2c70510 edi = 0x00000000
03:53:11 INFO - Found by: call frame info
03:53:11 INFO - 5 libxul.so!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:7953d3dd62ff : 265 + 0xf]
03:53:11 INFO - eip = 0xb34380df esp = 0xa2bff1c0 ebp = 0xa2bff1f8 ebx = 0xb6e60538
03:53:11 INFO - esi = 0xa2cbe610 edi = 0xa2e6be80
03:53:11 INFO - Found by: call frame info
03:53:11 INFO - 6 libxul.so!mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) [MessagePump.cpp:7953d3dd62ff : 368 + 0xb]
03:53:11 INFO - eip = 0xb360df45 esp = 0xa2bff200 ebp = 0xa2bff248 ebx = 0xb6e60538
03:53:11 INFO - esi = 0xa2cbe610 edi = 0xa2e6be80
03:53:11 INFO - Found by: call frame info
03:53:11 INFO - 7 libxul.so!MessageLoop::RunInternal() [message_loop.cc:7953d3dd62ff : 233 + 0x5]
03:53:11 INFO - eip = 0xb35f9a74 esp = 0xa2bff250 ebp = 0xa2bff268 ebx = 0xb6e60538
03:53:11 INFO - esi = 0xa2e6be80 edi = 0xa2e6be80
03:53:11 INFO - Found by: call frame info
03:53:11 INFO - 8 libxul.so!MessageLoop::Run() [message_loop.cc:7953d3dd62ff : 226 + 0x7]
03:53:11 INFO - eip = 0xb35f9ba2 esp = 0xa2bff270 ebp = 0xa2bff298 ebx = 0xb6e60538
03:53:11 INFO - esi = 0xa2e6be80 edi = 0xa2e6be80
03:53:11 INFO - Found by: call frame info
03:53:11 INFO - 9 libxul.so!nsThread::ThreadFunc(void*) [nsThread.cpp:7953d3dd62ff : 364 + 0x7]
03:53:11 INFO - eip = 0xb3423ea4 esp = 0xa2bff2a0 ebp = 0xa2bff2e8 ebx = 0xb6e60538
03:53:11 INFO - esi = 0xa2c70510 edi = 0xa2e6be80
03:53:11 INFO - Found by: call frame info
03:53:11 INFO - 10 libnspr4.so!_pt_root [ptthread.c:7953d3dd62ff : 212 + 0x8]
03:53:11 INFO - eip = 0xb73c44d9 esp = 0xa2bff2f0 ebp = 0xa2bff328 ebx = 0xb73d85ac
03:53:11 INFO - esi = 0xa2ef6ec0 edi = 0x00006c4e
03:53:11 INFO - Found by: call frame info
03:53:11 INFO - 11 libpthread-2.15.so + 0x6d4b
03:53:11 INFO - eip = 0xb76cdd4c esp = 0xa2bff330 ebp = 0xa2bff428 ebx = 0xb76deff4
03:53:11 INFO - esi = 0x00000000 edi = 0x003d0f00
03:53:11 INFO - Found by: call frame info
03:53:11 INFO - 12 libc-2.15.so + 0xef8bd
03:53:11 INFO - eip = 0xb74cd8be esp = 0xa2bff430 ebp = 0x00000000
03:53:11 INFO - Found by: previous frame's frame pointer
It seems that gConnectionPool is null here. In some places we null check it before we dereference and in some places we don't. It's not obvious to me which should happen here but the crash suggests a null check. Ben, what do you think?
Flags: needinfo?(bent.mozilla)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Reporter | ||
Comment 12•10 years ago
|
||
Note that this started to happen when bug 1150683 landed on inbound), which is interesting because it added code that activates this JSM <https://hg.mozilla.org/integration/mozilla-inbound/file/c6805afff48c/dom/push/PushService.jsm> which accesses IDB.
Comment 13•10 years ago
|
||
Other jsm's do this without problem. Maybe the shutdown isn't working right?
Assignee | ||
Comment 14•10 years ago
|
||
Jan, I think this should consolidate our shutdown code a little better.
This may also fix bug 1140913.
Assignee: nobody → bent.mozilla
Status: NEW → ASSIGNED
Flags: needinfo?(bent.mozilla)
Attachment #8595054 -
Flags: review?(Jan.Varga)
Assignee | ||
Comment 15•10 years ago
|
||
Comment 16•10 years ago
|
||
Comment on attachment 8595054 [details] [diff] [review]
Patch, v1
Review of attachment 8595054 [details] [diff] [review]:
-----------------------------------------------------------------
r=me but please double check the case below
::: dom/indexedDB/ActorsParent.cpp
@@ +17902,5 @@
> unused <<
> PBackgroundIDBFactoryRequestParent::Send__delete__(this, response);
> }
>
> + CleanupBackgroundThreadObjects(NS_FAILED(mResultCode));
Ben, are you sure this won't call mDatabase->Invalidate() if Factory::ActorDestory() was already called ?
I'm asking because I ran into a case some time ago when a factory actor was already destroyed and it was the last actor so the global hash tables were destroyed too.
Database::Invalidate() need to access those hash tables.
Attachment #8595054 -
Flags: review?(Jan.Varga) → review+
Comment 17•10 years ago
|
||
Assignee | ||
Comment 18•10 years ago
|
||
(In reply to Jan Varga [:janv] from comment #16)
> Ben, are you sure this won't call mDatabase->Invalidate() if
> Factory::ActorDestory() was already called ?
Hm. Well, if Factory::ActorDestroy was already called then OpenDatabaseOp::FactoryDestroy should have been called too, right?
And if that's the case then we already would have invalidated the database, and a second call to Invalidate() is just a no-op.
Does that match your understanding too?
Assignee | ||
Updated•10 years ago
|
Flags: needinfo?(Jan.Varga)
Comment 19•10 years ago
|
||
Comment 20•10 years ago
|
||
(In reply to Ben Turner [:bent] (use the needinfo flag!) from comment #18)
> (In reply to Jan Varga [:janv] from comment #16)
> > Ben, are you sure this won't call mDatabase->Invalidate() if
> > Factory::ActorDestory() was already called ?
>
> Hm. Well, if Factory::ActorDestroy was already called then
> OpenDatabaseOp::FactoryDestroy should have been called too, right?
>
> And if that's the case then we already would have invalidated the database,
> and a second call to Invalidate() is just a no-op.
>
> Does that match your understanding too?
It seems you are right. Thanks for checking.
Flags: needinfo?(Jan.Varga)
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
status-firefox40:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
Comment 22•10 years ago
|
||
Ben, should we backport this to the b2g 2.2 (gecko 37) branch?
Flags: needinfo?(bent.mozilla)
Assignee | ||
Comment 23•10 years ago
|
||
Comment on attachment 8595054 [details] [diff] [review]
Patch, v1
[Approval Request Comment]
Bug caused by (feature/regressing bug #): 994190
User impact if declined: Shutdown leak of parent process, which is really not a big deal and probably not even noticeable. I think this is only useful for backporting so that automated tests are intermittently orange a bit less.
Testing completed: Caught by automated tests
Risk to taking this patch (and alternatives if risky): Low risk, this just delays creating a thread pool until we actually need it.
String or UUID changes made by this patch: None
Flags: needinfo?(bent.mozilla)
Attachment #8595054 -
Flags: approval-mozilla-beta?
Attachment #8595054 -
Flags: approval-mozilla-b2g37?
Attachment #8595054 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 24•10 years ago
|
||
(In reply to Andrew Overholt [:overholt] from comment #22)
> Ben, should we backport this to the b2g 2.2 (gecko 37) branch?
The types are different there but the pattern is the same. But this only matters for sheriff sanity, this leak is not worth worrying about.
Assignee | ||
Comment 25•10 years ago
|
||
Comment on attachment 8595054 [details] [diff] [review]
Patch, v1
Oops, that approval comment was for a different bug :(
Attachment #8595054 -
Flags: approval-mozilla-beta?
Attachment #8595054 -
Flags: approval-mozilla-b2g37?
Attachment #8595054 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 26•10 years ago
|
||
Comment on attachment 8595054 [details] [diff] [review]
Patch, v1
[Approval Request Comment]
Bug caused by (feature/regressing bug #): 994190
User impact if declined: Rare crash on shutdown.
Testing completed: Caught by automated tests
Risk to taking this patch (and alternatives if risky): There's moderate risk of regressions here, but so far I don't think we've seen anything worrying on trunk. This consolodates two separate code paths so that it's easier to reason about and be correct. I'd like to get it on our long-lived branches at least so that we don't have a known shutdown crash there.
String or UUID changes made by this patch: None
Attachment #8595054 -
Flags: approval-mozilla-beta?
Attachment #8595054 -
Flags: approval-mozilla-b2g37?
Attachment #8595054 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 27•10 years ago
|
||
Hrm, maybe bug 1157029 is fallout from this.
Assignee | ||
Comment 28•10 years ago
|
||
Comment on attachment 8595054 [details] [diff] [review]
Patch, v1
Yeah, this had problems. Need an additional fix here.
Attachment #8595054 -
Flags: approval-mozilla-beta?
Attachment #8595054 -
Flags: approval-mozilla-b2g37?
Attachment #8595054 -
Flags: approval-mozilla-aurora?
Updated•10 years ago
|
status-b2g-v2.2:
--- → ?
status-b2g-master:
--- → fixed
status-firefox38:
--- → ?
status-firefox39:
--- → ?
status-firefox-esr31:
--- → unaffected
Comment 30•10 years ago
|
||
This missed Fx38 due to the lack of a rebased patch. Hopefully we can still get this on esr38...
status-firefox38.0.5:
--- → wontfix
status-firefox-esr38:
--- → affected
Comment 32•10 years ago
|
||
Comment on attachment 8601110 [details] [diff] [review]
Branch patch
[Approval Request Comment]
Bug caused by (feature/regressing bug #): 994190
User impact if declined: Rare crash on shutdown.
Testing completed: Caught by automated tests
Risk to taking this patch (and alternatives if risky): There's moderate risk of regressions here, but so far I don't think we've seen anything worrying on trunk. This consolodates two separate code paths so that it's easier to reason about and be correct. I'd like to get it on our long-lived branches at least so that we don't have a known shutdown crash there.
String or UUID changes made by this patch: None
Attachment #8601110 -
Flags: approval-mozilla-esr38?
Attachment #8601110 -
Flags: approval-mozilla-b2g37?
Attachment #8601110 -
Flags: approval-mozilla-aurora?
Comment 33•10 years ago
|
||
Comment on attachment 8601110 [details] [diff] [review]
Branch patch
Approved for uplift to aurora. Maybe this will eliminate some shutdown crashes, and we still have time to back this out before the merge to beta if there are problems.
Attachment #8601110 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 34•10 years ago
|
||
Comment 35•10 years ago
|
||
Backed out for ASAN (at least) crashes.
https://hg.mozilla.org/releases/mozilla-aurora/rev/5822851d2911
https://treeherder.mozilla.org/logviewer.html#?job_id=807159&repo=mozilla-aurora
Flags: needinfo?(bent.mozilla)
Assignee | ||
Comment 36•10 years ago
|
||
This patch is fine and will reland with the others soon.
Flags: needinfo?(bent.mozilla)
Assignee | ||
Comment 37•10 years ago
|
||
Comment 38•10 years ago
|
||
Comment on attachment 8601110 [details] [diff] [review]
Branch patch
Keeping the risk in mind and backing out if causing any side effect.
Attachment #8601110 -
Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
Comment 39•10 years ago
|
||
Comment 40•10 years ago
|
||
Comment on attachment 8601110 [details] [diff] [review]
Branch patch
Approved for uplift to esr38 as this fixes a shutdown hang/crash.
Attachment #8601110 -
Flags: approval-mozilla-esr38? → approval-mozilla-esr38+
Comment 41•10 years ago
|
||
You need to log in
before you can comment on or make changes to this bug.
Description
•