If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

Intermittent 914521.html | Exited with code -11 during test run (after Assertion failure: !mAsyncExecutionThread (AsyncClose has not been invoked on this connection!), at mozStorageConnection.cpp:491

RESOLVED FIXED in Firefox 38

Status

()

Toolkit
Storage
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: KWierso, Assigned: Ben Turner (not reading bugmail, use the needinfo flag!))

Tracking

({intermittent-failure})

unspecified
mozilla40
intermittent-failure
Points:
---
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox38 fixed, firefox39 fixed, firefox40 fixed, firefox-esr31 unaffected)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

3 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=48602887&tree=Mozilla-Inbound
Rev5 MacOSX Mountain Lion 10.8 mozilla-inbound debug test crashtest on 2014-09-22 11:48:17 PDT for push caadbd3acebf

slave: talos-mtnlion-r5-006



11:59:41     INFO -  ++DOMWINDOW == 132 (0x12063cc00) [pid = 1582] [serial = 1007] [outer = 0x0]
11:59:41     INFO -  ++DOMWINDOW == 133 (0x12076ec00) [pid = 1582] [serial = 1008] [outer = 0x12063cc00]
11:59:41     INFO -  JavaScript strict warning: , line 0: TypeError: setting a property that has only a getter
11:59:41     INFO -  JavaScript strict warning: , line 0: TypeError: setting a property that has only a getter
11:59:41     INFO -  JavaScript strict warning: , line 0: TypeError: setting a property that has only a getter
11:59:41     INFO -  ++DOCSHELL 0x11e078000 == 9 [pid = 1582] [id = 63]
11:59:41     INFO -  ++DOMWINDOW == 134 (0x120e37c00) [pid = 1582] [serial = 1009] [outer = 0x0]
11:59:41     INFO -  ++DOCSHELL 0x11e079000 == 10 [pid = 1582] [id = 64]
11:59:41     INFO -  ++DOMWINDOW == 135 (0x120e38400) [pid = 1582] [serial = 1010] [outer = 0x0]
11:59:41     INFO -  JavaScript strict warning: , line 0: TypeError: setting a property that has only a getter
11:59:41     INFO -  JavaScript strict warning: , line 0: TypeError: setting a property that has only a getter
11:59:41     INFO -  JavaScript strict warning: , line 0: TypeError: setting a property that has only a getter
11:59:41     INFO -  JavaScript strict warning: , line 0: TypeError: setting a property that has only a getter
11:59:41     INFO -  JavaScript strict warning: , line 0: TypeError: setting a property that has only a getter
11:59:41     INFO -  JavaScript strict warning: , line 0: TypeError: setting a property that has only a getter
11:59:41     INFO -  [1582] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040111: file /builds/slave/m-in-osx64-d-00000000000000000/build/content/base/src/nsFrameLoader.cpp, line 287
11:59:41     INFO -  ++DOCSHELL 0x11ed17800 == 11 [pid = 1582] [id = 65]
11:59:41     INFO -  ++DOMWINDOW == 136 (0x120fb3800) [pid = 1582] [serial = 1011] [outer = 0x0]
11:59:41     INFO -  [1582] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040111: file /builds/slave/m-in-osx64-d-00000000000000000/build/content/base/src/nsFrameLoader.cpp, line 287
11:59:41     INFO -  ++DOMWINDOW == 137 (0x121e28400) [pid = 1582] [serial = 1012] [outer = 0x120fb3800]
11:59:41     INFO -  [1582] WARNING: NS_ENSURE_TRUE(nsContentUtils::GetCurrentJSContext()) failed: file /builds/slave/m-in-osx64-d-00000000000000000/build/docshell/base/nsDocShell.cpp, line 9009
11:59:41     INFO -  [1582] WARNING: NS_ENSURE_TRUE(globalObject && globalObject->GetGlobalJSObject()) failed: file /builds/slave/m-in-osx64-d-00000000000000000/build/content/html/document/src/nsHTMLContentSink.cpp, line 741
11:59:41     INFO -  [1582] WARNING: Subdocument container has no frame: file /builds/slave/m-in-osx64-d-00000000000000000/build/layout/base/nsDocumentViewer.cpp, line 2515
11:59:41     INFO -  ++DOMWINDOW == 138 (0x11d914000) [pid = 1582] [serial = 1013] [outer = 0x120e37c00]
11:59:41     INFO -  [1582] WARNING: NS_ENSURE_TRUE(globalObject && globalObject->GetGlobalJSObject()) failed: file /builds/slave/m-in-osx64-d-00000000000000000/build/content/html/document/src/nsHTMLContentSink.cpp, line 741
11:59:41     INFO -  [1582] WARNING: Subdocument container has no frame: file /builds/slave/m-in-osx64-d-00000000000000000/build/layout/base/nsDocumentViewer.cpp, line 2515
11:59:41     INFO -  ++DOMWINDOW == 139 (0x121fc1800) [pid = 1582] [serial = 1014] [outer = 0x120e38400]
11:59:41     INFO -  ++DOMWINDOW == 140 (0x121fc3400) [pid = 1582] [serial = 1015] [outer = 0x120fb3800]
11:59:41     INFO -  ++DOMWINDOW == 141 (0x11adea400) [pid = 1582] [serial = 1016] [outer = 0x120fb3800]
11:59:41     INFO -  [1582] WARNING: NS_ENSURE_TRUE(mMutable) failed: file /builds/slave/m-in-osx64-d-00000000000000000/build/netwerk/base/src/nsSimpleURI.cpp, line 265
11:59:41     INFO -  OpenGL version detected: 210
11:59:41     INFO -  OpenGL vendor: Intel Inc.
11:59:41     INFO -  OpenGL renderer: Intel HD Graphics 3000 OpenGL Engine
11:59:41     INFO -  [1582] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040111: file /builds/slave/m-in-osx64-d-00000000000000000/build/dom/events/ContentEventHandler.cpp, line 110
11:59:41     INFO -  [1582] WARNING: no buffer: 'mDTBuffer', file /builds/slave/m-in-osx64-d-00000000000000000/build/gfx/layers/RotatedBuffer.cpp, line 351
11:59:41     INFO -  --DOCSHELL 0x1143a6800 == 10 [pid = 1582] [id = 60]
11:59:41     INFO -  --DOCSHELL 0x1143b2000 == 9 [pid = 1582] [id = 61]
11:59:41     INFO -  JavaScript error: resource://gre/components/nsHandlerService.js, line 391: NS_ERROR_FILE_NOT_FOUND: Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIRDFRemoteDataSource.Flush]
11:59:41     INFO -  JavaScript error: resource://gre/components/nsHandlerService.js, line 391: NS_ERROR_FILE_NOT_FOUND: Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIRDFRemoteDataSource.Flush]
11:59:41     INFO -  Assertion failure: !mAsyncExecutionThread (AsyncClose has not been invoked on this connection!), at /builds/slave/m-in-osx64-d-00000000000000000/build/storage/src/mozStorageConnection.cpp:491
11:59:41     INFO -  nsTArray_Impl<nsRefPtr<mozilla::storage::Connection>, nsTArrayInfallibleAllocator>::RemoveElementsAt(unsigned long, unsigned long) [xpcom/glue/nsTArray.h:1672]
11:59:41     INFO -  mozilla::storage::Service::unregisterConnection(mozilla::storage::Connection*) [xpcom/glue/Mutex.h:169]
11:59:41     INFO -  mozilla::storage::Connection::Release() [storage/src/mozStorageConnection.cpp:524]
11:59:41     INFO -  nsProxyReleaseEvent::Run() [xpcom/glue/nsProxyRelease.cpp:18]
11:59:41     INFO -  nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:823]
11:59:41     INFO -  NS_ProcessPendingEvents(nsIThread*, unsigned int) [xpcom/glue/nsThreadUtils.cpp:207]
11:59:41     INFO -  nsBaseAppShell::NativeEventCallback() [widget/xpwidgets/nsBaseAppShell.cpp:99]
11:59:41     INFO -  nsAppShell::ProcessGeckoEvents(void*) [widget/cocoa/nsAppShell.mm:375]
11:59:41     INFO -  CoreFoundation + 0x12841
11:59:41     INFO -  CoreFoundation + 0x1222d
11:59:41     INFO -  CoreFoundation + 0x354e5
11:59:41     INFO -  CoreFoundation + 0x34dd2
11:59:41     INFO -  HIToolbox + 0x5f774
11:59:41     INFO -  HIToolbox + 0x5f512
11:59:41     INFO -  HIToolbox + 0x5f3a3
11:59:41     INFO -  AppKit + 0x156fa3
11:59:41     INFO -  -AppKit + 0x156862
11:59:41     INFO -  -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] [widget/cocoa/nsAppShell.mm:129]
11:59:41     INFO -  -AppKit + 0x14dc03
11:59:41     INFO -  nsAppShell::Run() [xpcom/glue/nsCOMPtr.h:643]
11:59:41     INFO -  nsAppStartup::Run() [toolkit/components/startup/nsAppStartup.cpp:280]
11:59:41     INFO -  XREMain::XRE_mainRun() [toolkit/xre/nsAppRunner.cpp:4159]
11:59:41     INFO -  XREMain::XRE_main(int, char**, nsXREAppData const*) [toolkit/xre/nsAppRunner.cpp:4230]
11:59:41     INFO -  XRE_main [toolkit/xre/nsAppRunner.cpp:4444]
11:59:41     INFO -  main [browser/app/nsBrowserApp.cpp:282]
11:59:41     INFO -  ** Unknown exception behavior: 0
11:59:45     INFO -  TEST-INFO | Main app process: killed by out-of-range signal, number 117
11:59:45  WARNING -  TEST-UNEXPECTED-FAIL | file:///builds/slave/talos-slave/test/build/tests/reftest/tests/docshell/base/crashtests/914521.html | Exited with code -11 during test run
11:59:45     INFO -  INFO | automation.py | Application ran for: 0:01:50.935583
11:59:45     INFO -  Automation Error: PID log not found: /var/folders/sh/x3_cwy_x189_cj3v8yxlp9m000000w/T/tmpeXdqo4pidlog
11:59:45     INFO -  WARNING | leakcheck | refcount logging is off, so leaks can't be detected!
11:59:45     INFO -  REFTEST INFO | runreftest.py | Running tests: end.
11:59:46    ERROR - Return code: 1
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
This is a storage bug.
Assignee: nobody → bent.mozilla
Component: Document Navigation → Storage
OS: Mac OS X → All
Product: Core → Toolkit
Hardware: x86_64 → All
Created attachment 8585314 [details] [diff] [review]
Patch, v1

I need this to land bug 866846 and its siblings.
Attachment #8585314 - Flags: review?(bugmail)
The problem is that AsyncOpen doesn't handle failure correctly. It notifies the caller but doesn't properly shut down the half-opened connection that it created.
Status: NEW → ASSIGNED
Comment on attachment 8585314 [details] [diff] [review]
Patch, v1

Hm, yes, it does seem like it would be hard for the caller to clean-up the object properly without a reference to it.

Good eye finding this.
Attachment #8585314 - Flags: review?(bugmail) → review+
Huh, but your try build at https://treeherder.mozilla.org/#/jobs?repo=try&revision=0e2144a7057a with this patch still resulted in the same failure on dt4.  This patch still seems correct, but it's probably best if I don't try and helpfully land this for you since I'm about to turn into a pumpkin.
Is it possible your closeRunnable ends up running after we reach the assertion point?
Maybe you could try using NS_DISPATCH_SYNC. Or, since you are already on the async thread, directly create an AsyncCloseConnection object, .forget() the execution thread to it, an invoke its Run(). The only difference will be we don't call SetClosedState, but that's not critical since this connection won't work.
this is provided you verified the problem is with Sqlite.jsm (currently the only consumer of openAsyncDatabase), cause, as Andrew said, any connection using an async statement must invoke asyncClose, also synchronous ones.
Is this failure reproducible locally?
Created attachment 8585543 [details] [diff] [review]
Patch, v1.1

Slightly improved. Depending on where in initializeInternal we fail sometimes mDBConn is null and sometimes it's not. (Aside: This seems kinda dumb...)
Attachment #8585314 - Attachment is obsolete: true
Attachment #8585543 - Flags: review?(bugmail)
(In reply to Marco Bonardo [::mak] from comment #11)
> Is it possible your closeRunnable ends up running after we reach the
> assertion point?

No, the assertion is in the destructor, and we're holding references throughout this whole async close sequence. It's just that sometimes mDBConn is nulled out in initializeInternal, and that caused a short-circuit return in AsyncClose.
(In reply to Marco Bonardo [::mak] from comment #12)
> this is provided you verified the problem is with Sqlite.jsm (currently the
> only consumer of openAsyncDatabase)

Yeah, this test uses SQLite.jsm via StorageActors (http://mxr.mozilla.org/mozilla-central/source/toolkit/devtools/server/actors/storage.js)
Comment on attachment 8585543 [details] [diff] [review]
Patch, v1.1

(In reply to Ben Turner <unresponsive until 3/30> (use the needinfo flag!) from comment #13)
> (Aside: This seems kinda dumb...)

Yeah, the mozStorage state machine is too complex leading to a variety of edge-cases like this.  Which are eventually caught this way.  It would be great if there were engineering resources to engage in a more protracted cleanup, but most changes to mozStorage are incidental/fires being put out.  Maybe someone has an intern soon? :)
Attachment #8585543 - Flags: review?(bugmail) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/d4fdf509f242
(Reporter)

Comment 18

3 years ago
Backed out for frequent bc1 oranges
https://treeherder.mozilla.org/logviewer.html#?job_id=8256324&repo=mozilla-inbound

https://hg.mozilla.org/integration/mozilla-inbound/rev/93bab937c624
Flags: needinfo?(bent.mozilla)
Flags: needinfo?(bent.mozilla)
https://hg.mozilla.org/integration/mozilla-inbound/rev/ce2692d64bcf
https://hg.mozilla.org/mozilla-central/rev/ce2692d64bcf
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
status-firefox40: --- → fixed
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
Can we nominate this for Aurora/Beta as well?
status-firefox38: --- → affected
status-firefox39: --- → affected
status-firefox-esr31: --- → unaffected
Flags: needinfo?(bent.mozilla)
Comment on attachment 8585543 [details] [diff] [review]
Patch, v1.1

Approval Request Comment
[Feature/regressing bug #]: Unclear, probably a very old bug
[User impact if declined]: In rare circumstances connections will keep their threads alive until app shutdown, and DEBUG builds will assert.
[Describe test coverage new/current, TreeHerder]: The assertion failure is hit on tinderbox
[Risks and why]: This seems like a pretty safe patch, especially since it handles a condition that is only rarely hit.
[String/UUID change made/needed]: None
Flags: needinfo?(bent.mozilla)
Attachment #8585543 - Flags: approval-mozilla-beta?
Attachment #8585543 - Flags: approval-mozilla-aurora?
Comment on attachment 8585543 [details] [diff] [review]
Patch, v1.1

Actually I don't know that it's worth the (small) risk for beta... Ryan, maybe you could clarify whether or not this is a big pain point for sheriffs?
Attachment #8585543 - Flags: approval-mozilla-beta?
I'm fine deferring to your judgment on that. Just trying to get 38 as solid as possible since we've got it for the next year.
Andrew, what do you think about the safety of this for beta?
Flags: needinfo?(bugmail)
I think this is safe for uplift.

Rationale:
- Because the connection was never reported to the caller, the permutation-space for things that can happen and interact and reorder is extremely constrained.  The one edge-case expected was the one in comment 13 and this patch explicitly caught that.  (We've had other changes that can result in things bouncing around threads different for release.  Though they have generally been safe/an improvement, this patch has none of that.)
- It wouldn't be surprising for use of Sqlite.jsm to pick up in extensions, so it wouldn't be surprising for the number of impacted users to increase over time.
Flags: needinfo?(bugmail)
Attachment #8585543 - Flags: approval-mozilla-beta?
Ok, thanks!
Comment on attachment 8585543 [details] [diff] [review]
Patch, v1.1

should be in 38 beta 2
Attachment #8585543 - Flags: approval-mozilla-beta?
Attachment #8585543 - Flags: approval-mozilla-beta+
Attachment #8585543 - Flags: approval-mozilla-aurora?
Attachment #8585543 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/6de4a352d5f2
status-firefox39: affected → fixed
https://hg.mozilla.org/releases/mozilla-beta/rev/0e9a4f42d12a
status-firefox38: affected → fixed
You need to log in before you can comment on or make changes to this bug.