Closed Bug 878411 Opened 7 years ago Closed 7 years ago

crash in mozilla::storage::Connection::internalClose @ sqlite3WalClose

Categories

(Toolkit :: Storage, defect, critical)

24 Branch
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla24
Tracking Status
firefox23 --- unaffected
firefox24 + verified

People

(Reporter: scoobidiver, Unassigned)

References

Details

(Keywords: crash, regression, topcrash)

Crash Data

Attachments

(1 file, 4 obsolete files)

It first showed up in 24.0a1/20130531 and is currently #2 top crasher in this build. The regression range is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=f66d956d188e&tochange=3c6f2394995d
It's likely a regression from bug 874171.

Signature 	@0x0 | sqlite3WalClose More Reports Search
UUID	6776dee1-726f-4648-880e-79cac2130601
Date Processed	2013-06-01 09:56:05
Uptime	15
Last Crash	29 seconds before submission
Install Age	1.5 minutes since version was first installed.
Install Time	2013-06-01 09:54:31
Product	Firefox
Version	24.0a1
Build ID	20130531031002
Release Channel	nightly
OS	Windows NT
OS Version	6.1.7601 Service Pack 1
Build Architecture	x86
Build Architecture Info	GenuineIntel family 6 model 37 stepping 2
Crash Reason	EXCEPTION_ACCESS_VIOLATION_EXEC
Crash Address	0x0
App Notes 	
AdapterVendorID: 0x1002, AdapterDeviceID: 0x9552, AdapterSubsysID: 04341028, AdapterDriverVersion: 8.692.1.0
D3D10 Layers? D3D10 Layers- D3D9 Layers? D3D9 Layers+ 
Processor Notes 	sp-processor09_phx1_mozilla_com_4386:2012
EMCheckCompatibility	True
Adapter Vendor ID	0x1002
Adapter Device ID	0x9552
Total Virtual Memory	4294836224
Available Virtual Memory	3827376128
System Memory Use Percentage	38
Available Page File	6518530048
Available Physical Memory	2535256064

Frame 	Module 	Signature 	Source
0 		@0x0 	
1 	nss3.dll 	sqlite3WalClose 	db/sqlite3/src/sqlite3.c:47343
2 	nss3.dll 	sqlite3PagerClose 	db/sqlite3/src/sqlite3.c:42290
3 	nss3.dll 	sqlite3BtreeClose 	db/sqlite3/src/sqlite3.c:51648
4 	nss3.dll 	sqlite3LeaveMutexAndCloseZombie 	db/sqlite3/src/sqlite3.c:115699
5 	nss3.dll 	sqlite3Close 	db/sqlite3/src/sqlite3.c:115642
6 	nss3.dll 	sqlite3_close 	db/sqlite3/src/sqlite3.c:115655
7 	xul.dll 	mozilla::storage::Connection::internalClose 	storage/src/mozStorageConnection.cpp:764
8 	xul.dll 	mozilla::storage::Connection::Close 	storage/src/mozStorageConnection.cpp:949
9 	xul.dll 	mozilla::storage::Connection::~Connection 	storage/src/mozStorageConnection.cpp:413
10 	xul.dll 	mozilla::storage::Connection::Release 	storage/src/mozStorageConnection.cpp:441
11 	xul.dll 	nsCOMPtr<nsIMediaDevice>::`scalar deleting destructor' 	
12 	xul.dll 	nsTArray_Impl<nsCOMPtr<nsITimer>,nsTArrayInfallibleAllocator>::DestructRange 	obj-firefox/dist/include/nsTArray.h:1393
13 	xul.dll 	nsTArray_Impl<nsRefPtr<mozilla::dom::devicestorage::DeviceStorageRequestParent:: 	obj-firefox/dist/include/nsTArray.h:1110
14 	xul.dll 	nsTArray_Impl<nsRefPtr<mozilla::storage::Connection>,nsTArrayInfallibleAllocator 	obj-firefox/dist/include/nsTArray.h:1135
15 	xul.dll 	mozilla::storage::Service::unregisterConnection 	storage/src/mozStorageService.cpp:353
16 	xul.dll 	mozilla::storage::Connection::Release 	storage/src/mozStorageConnection.cpp:436
17 	xul.dll 	mozilla::storage::Statement::~Statement 	storage/src/mozStorageStatement.cpp:248
18 	xul.dll 	mozilla::storage::Statement::Release 	storage/src/mozStorageStatement.cpp:254
19 	xul.dll 	DoDeferredRelease<nsISupports*> 	js/xpconnect/src/XPCJSRuntime.cpp:628
20 	xul.dll 	XPCJSRuntime::GCCallback 	js/xpconnect/src/XPCJSRuntime.cpp:828
21 	mozjs.dll 	Collect 	js/src/jsgc.cpp:4585
22 	mozjs.dll 	js::GC 	js/src/jsgc.cpp:4602
23 	mozjs.dll 	js::DestroyContext 	js/src/jscntxt.cpp:388
24 	mozjs.dll 	JS_DestroyContext 	js/src/jsapi.cpp:1284
25 	xul.dll 	mozJSComponentLoader::UnloadModules 	js/xpconnect/loader/mozJSComponentLoader.cpp:1156
26 	xul.dll 	mozJSComponentLoader::Observe 	js/xpconnect/loader/mozJSComponentLoader.cpp:1470
27 	xul.dll 	mozilla::ShutdownXPCOM 	xpcom/build/nsXPComInit.cpp:672
28 	xul.dll 	ScopedXPCOMStartup::~ScopedXPCOMStartup 	toolkit/xre/nsAppRunner.cpp:1129
29 	xul.dll 	XREMain::XRE_main 	toolkit/xre/nsAppRunner.cpp:3949
30 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:4137
31 	firefox.exe 	do_main 	browser/app/nsBrowserApp.cpp:272
32 	firefox.exe 	NS_internal_main 	browser/app/nsBrowserApp.cpp:632
33 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:105
34 	firefox.exe 	__tmainCRTStartup 	crtexe.c:552
35 	kernel32.dll 	BaseThreadInitThunk 	
36 	ntdll.dll 	__RtlUserThreadStart 	
37 	ntdll.dll 	_RtlUserThreadStart 	

More reports at:
https://crash-stats.mozilla.com/report/list?signature=%400x0+|+sqlite3WalClose
Crash Signature: [@ @0x0 | sqlite3WalClose] → [@ @0x0 | sqlite3WalClose] [@ @0x0 | sqlite3WalCheckpoint ]
This following is a guess, but it does appear to cover the facts:

SQLite version 3.7.17 added two new virtual methods to the sqlite3_io_methods object, xFetch and xUnfetch.  These methods are only called if the version number on the sqlite3_io_methods object is 3 or greater.  Legacy VFSes that do not define these methods should have version numbers of 2 or less.

However, TelemetryVFS found in the storage/src/TelemetryVFS.cpp source file of Firefox appears to be copying the version number from a sub-VFS, but then fails to set xFetch and xUnfetch.  (See the code beginning at approximately line 337).  The TelemetryVFS implementation appears to set those two methods to NULL, which explains the 0x0 frames on the stack traces as sqlite3WalClose() calls xUnfetch().

Suggested fixes:

(1) Modify TelemetryVFS so that pNew->iVersion is never more than 2.

(2) Modify TelemetryVFS to initialize pNew->xFetch and pNew->xUnfetch.  Suggested no-op implementations can be seen at http://www.sqlite.org/src/artifact/b4ad71336fd?ln=154-161

(3) Recompile with -DSQLITE_MAX_MMAP_SIZE=0.  That compile-time option will disable the new memory-mapping code in SQLite 3.7.17 which will prevent xFetch and xUnfetch from ever being called.  This might be a "quick fix" to verify this really is the source of the problem.

Moving forward, we will modify SQLite so that it never invokes xFetch() or xUnfetch() if they are NULL pointers.  That will also solve the problem.  But it will be a while yet before that change makes it into an official SQLite release.  On the other hand, if Firefox needs an emergency 3.7.17.1 release of SQLite with this fix, please let us know and we will make it happen.
I believe this implements proposal 2 of comment 1.
Attachment #757005 - Flags: review?(mak77)
Comment on attachment 757005 [details] [diff] [review]
always define xFetch and xUnfetch methods in TelemetryVFS files

I think the conditionals inside the xFetch and xUnfetch functions should read:

    if (p->pReal->pMethods->iVersion >= 3) {

Also, perhaps reduce both functions to file scope by adding "static"?  But all that said, as long as FF is statically linked with SQLite (versus using a system library SQLite as some Linux distributions prefer) it should work either way.
(In reply to D. Richard Hipp from comment #3)
> Comment on attachment 757005 [details] [diff] [review]
> always define xFetch and xUnfetch methods in TelemetryVFS files
> 
> I think the conditionals inside the xFetch and xUnfetch functions should
> read:
> 
>     if (p->pReal->pMethods->iVersion >= 3) {

Mmm, that's a good point.  I guess it's unlikely that the "real" filesystem will have non-NULL xFetch and xUnfetch methods?  Setting the xFetch and xUnfetch fields should also be dependent on iVersion >= 3, too.

> Also, perhaps reduce both functions to file scope by adding "static"?

They are enclosed in an anonymous namespace, which has roughly the same effect.
Here's a different version that only sets xFetch and xUnfetch if the underlying file has
version 3 or greater.  I left in the NULL checks as a safety measure.
Attachment #757005 - Attachment is obsolete: true
Attachment #757005 - Flags: review?(mak77)
Attachment #757007 - Flags: review?(mak77)
Comment on attachment 757007 [details] [diff] [review]
define xFetch and xUnfetch methods in TelemetryVFS file objects if necessary

Review of attachment 757007 [details] [diff] [review]:
-----------------------------------------------------------------

I think we should stop unconditionally setting our version number based on the underlying iomethods version. Otherwise we're going to have this same problem next time sqlite changes.

Let's instead hardcode the latest version we support (3), always define xFetch/xUnfetch, add a warning there if the underlying version is newer, and then properly check the underlying iomethod version in our xFetch/xUnfetch implementations before calling them.

mak can of course override me...
Attachment #757007 - Flags: review?(mak77)
Attachment #757007 - Flags: review-
Attachment #757007 - Flags: feedback?(mak77)
(In reply to ben turner [:bent] from comment #6)
> I think we should stop unconditionally setting our version number based on
> the underlying iomethods version. Otherwise we're going to have this same
> problem next time sqlite changes.
> 
> Let's instead hardcode the latest version we support (3), always define
> xFetch/xUnfetch, add a warning there if the underlying version is newer

I agree, please move the version to a define and s/warning/MOZ_ASSERT/.
We have a similar MOZ_ASSERT check for the vfs version at http://hg.mozilla.org/mozilla-central/diff/46c47be9d44e/storage/src/TelemetryVFS.cpp
Looks like when I introduced the check for the vfs version I missed to add one for the io methods version :(
Attachment #757007 - Flags: feedback?(mak77)
OK, here's a version that always defines our methods as version 3.
Attachment #757007 - Attachment is obsolete: true
Attachment #757924 - Flags: review?(mak77)
Comment on attachment 757924 [details] [diff] [review]
define xFetch and xUnfetch methods in TelemetryVFS file objects if necessary

Review of attachment 757924 [details] [diff] [review]:
-----------------------------------------------------------------

::: storage/src/TelemetryVFS.cpp
@@ +294,5 @@
> +int
> +xFetch(sqlite3_file *pFile, sqlite3_int64 iOff, int iAmt, void **pp)
> +{
> +  telemetry_file *p = (telemetry_file *)pFile;
> +  if (p->pReal->pMethods->xFetch) {

I still think you should not null check here but instead compare the version to 3. In xUnfetch too.

@@ +358,5 @@
>      sqlite3_io_methods *pNew = new sqlite3_io_methods;
>      const sqlite3_io_methods *pSub = p->pReal->pMethods;
>      memset(pNew, 0, sizeof(*pNew));
> +    pNew->iVersion = 3;
> +    MOZ_ASSERT(pNew->iVersion == pSub->iVersion);

You should do <= here, right?
(In reply to ben turner [:bent] from comment #10)
> @@ +358,5 @@
> >      sqlite3_io_methods *pNew = new sqlite3_io_methods;
> >      const sqlite3_io_methods *pSub = p->pReal->pMethods;
> >      memset(pNew, 0, sizeof(*pNew));
> > +    pNew->iVersion = 3;
> > +    MOZ_ASSERT(pNew->iVersion == pSub->iVersion);
> 
> You should do <= here, right?

I think the scope is that if we push a new version of SQLite without properly bumping the version (by adding or modifying the needed methods) this will abort. In spite of that == is a stricter check and sounds better than <= to me.
If in future we find ourselves not being in need to update the VFS (with valid reason), then we will be able to relax the request, but at least there's no risk we forget to bump.
(In reply to ben turner [:bent] from comment #10)
> ::: storage/src/TelemetryVFS.cpp
> @@ +294,5 @@
> > +int
> > +xFetch(sqlite3_file *pFile, sqlite3_int64 iOff, int iAmt, void **pp)
> > +{
> > +  telemetry_file *p = (telemetry_file *)pFile;
> > +  if (p->pReal->pMethods->xFetch) {
> 
> I still think you should not null check here but instead compare the version
> to 3. In xUnfetch too.

Doing the version check is strictly for defensive purposes, right?  Because if we're ever executing this code, we'll have already gotten past the MOZ_ASSERT that guarantees the sub-methods are at version 3 and therefore has an xFetch field.

I suppose we could be dealing with a buggy library in the field and the MOZ_ASSERT wouldn't protect us, therefore we should do the check.  Is that your reasoning here?

But I don't understand why the same reasoning doesn't apply to the null check: we've written code with null xFetch fields.  There's no reason that a hypothetical buggy library couldn't have a null xFetch field too...  And if the modifications suggested in comment 1 are made (to null check xFetch &co in SQLite itself), it seems like we should be doing the same, right?

> @@ +358,5 @@
> >      sqlite3_io_methods *pNew = new sqlite3_io_methods;
> >      const sqlite3_io_methods *pSub = p->pReal->pMethods;
> >      memset(pNew, 0, sizeof(*pNew));
> > +    pNew->iVersion = 3;
> > +    MOZ_ASSERT(pNew->iVersion == pSub->iVersion);
> 
> You should do <= here, right?

I think the stricter check is better.  Perhaps we should have a comment here that if you're upgrading the version, you should also add passthrough methods for any additional io methods added in that version bump.
(In reply to Nathan Froyd (:froydnj) from comment #12)
> Perhaps we should have a comment here
> that if you're upgrading the version, you should also add passthrough
> methods for any additional io methods added in that version bump.

This is a good idea, and while there please also correct the old existing comment on the other MOZ_ASSERT, since as it is now, looks like you can just bump the version and forget, instead you must verify what changed and apply those changes to the vfs. Both comments may read similarly.

Personally I don't have a preference for the check-version VS null-check, both solutions look valid.
My concern is that we're assuming things about the underlying VFS which we can't control.

First, we're assuming that all underlying VFS will be version 3. Can we assume that? Why do we care? All we should care about I think is that we have implemented a VFS that supports version 3.

Second, the SQLite API has one relevant requiremnet at the moment: If your VFS is at version 3 or higher then you have xFetch/xUnfetch methods that will be called. That's it. IMO we should just check that prescribed condition instead of inventing a new one (non-null xFetch/xUnfetch means we will call it regardless of version).
(In reply to ben turner [:bent] from comment #14)
> First, we're assuming that all underlying VFS will be version 3. Can we
> assume that? Why do we care? All we should care about I think is that we
> have implemented a VFS that supports version 3.

Historically we have often assumed requirements about out SQLite library, to be able to use latest improvements to the library itself. And we base our design decisions over that, for example we base some design patterns on being able to use WAL, a third library may have disabled it. I don't think we should spend time designing for unknown exotic systems.
(In reply to ben turner [:bent] from comment #14)
> My concern is that we're assuming things about the underlying VFS which we
> can't control.
> 
> First, we're assuming that all underlying VFS will be version 3. Can we
> assume that? Why do we care? All we should care about I think is that we
> have implemented a VFS that supports version 3.

We do check for the underlying VFS:

http://mxr.mozilla.org/mozilla-central/source/storage/src/TelemetryVFS.cpp#478

Since those VFS names are defined by SQLite itself, I think we are justified in assuming they are at the correct version.  I suppose it's possible that somebody could replace the basic VFSes...maybe we should defend against that, (?) but I don't think this bug is the place to start the discussion.

And like mak says, we have assumed requirements about the underlying SQLite generally elsewhere.

> Second, the SQLite API has one relevant requiremnet at the moment: If your
> VFS is at version 3 or higher then you have xFetch/xUnfetch methods that
> will be called. That's it. IMO we should just check that prescribed
> condition instead of inventing a new one (non-null xFetch/xUnfetch means we
> will call it regardless of version).

Well, if we can assume that the underlying VFSes are from SQLite itself, then we can just call xFetch/xUnfetch directly, no null checks or version checks required.

The most consistent course of action with the current code is:

1) Add the comments from comment 13.
2) Add an assertion for the io methods similar to the VFS check, with == (underlying VFS must be version 3).
3) Our custom xFetch/xUnfetch methods unconditionally call the appropriate underlying method.  Even though Ben argues this makes potentially unwarranted assumptions, it is consistent with all the version 2 io methods and version 2/3 VFS methods we already have in this file.

I can see the goodness of version checks in the individual methods, but I'm skeptical of their actual effectiveness.

Does that sound reasonable?
Yeah, I don't care too strongly here, I just don't see why we'd add hurdles when we don't need to.
Here's something that implements comment 16.  I took the liberty of tweaking the comments
for the VFS bits.
Attachment #757924 - Attachment is obsolete: true
Attachment #757924 - Flags: review?(mak77)
Attachment #758540 - Flags: review?(mak77)
Comment on attachment 758540 [details] [diff] [review]
define xFetch and xUnfetch methods in TelemetryVFS file objects

Review of attachment 758540 [details] [diff] [review]:
-----------------------------------------------------------------

::: storage/src/TelemetryVFS.cpp
@@ +373,5 @@
> +    pNew->xShmBarrier = pSub->xShmBarrier ? xShmBarrier : 0;
> +    pNew->xShmUnmap = pSub->xShmUnmap ? xShmUnmap : 0;
> +    // Methods added in version 3.
> +    // SQLite 3.7.17 calls these methods without checking for NULL first,
> +    // so we always define them.

xFetch and xUnfetch are defined through a no-op (apart tracing) method when not supported, while other methods like xShmMap may be defined as a null pointer. That's likely the reason a null check was not added for xFetch and xUnfetch, while it was used for other methods.
I was wondering if it would be worth to protectively check for null, in case this may change in future, but since OSTRACE is done regardless, that's less likely to change. Just in case we may add a MOZ_ASSERT(pSub->xFetch) to be notified if that will ever change.
Attachment #758540 - Flags: review?(mak77) → review+
Landed in: https://hg.mozilla.org/integration/mozilla-inbound/rev/ad781d8dcdc6
Backout in: https://hg.mozilla.org/integration/mozilla-inbound/rev/197943dd42b7

Looks like the underlying file objects are still something < 3?  Guess bent's version check isn't looking so bad after all. :)
ok, so looking at IOMETHODS() macro in sqlite.c, some are indeed on older versions... so yes, let's convert the assert to <= 3
(warning, slow to load) http://mxr.mozilla.org/mozilla-central/source/db/sqlite3/src/sqlite3.c#28004
While SQLite verifies the version (through bUseFetch) before using xFetch and xUnfetch, let's still add MOZ_ASSERT(p->pReal->pMethods->iVersion == 3) inside those.
This should basically cover all of the previous suggestions.
(In reply to Marco Bonardo [:mak] from comment #21)
> ok, so looking at IOMETHODS() macro in sqlite.c, some are indeed on older
> versions... so yes, let's convert the assert to <= 3
> (warning, slow to load)
> http://mxr.mozilla.org/mozilla-central/source/db/sqlite3/src/sqlite3.c#28004
> While SQLite verifies the version (through bUseFetch) before using xFetch
> and xUnfetch, let's still add MOZ_ASSERT(p->pReal->pMethods->iVersion == 3)
> inside those.
> This should basically cover all of the previous suggestions.

I don't think those should be MOZ_ASSERT, they should be checks for iVersion >= 3.  Otherwise, we have to stub the functions.

In a similar fashion, I think the xShm* methods that we were defining for iVersion >= 2 now need to do version checks themselves.  Before, we would only have xShm* methods if the underlying io methods were >= 2.  Now, since our io methods are always version 3, we need to unconditionally define those methods and do the check in our methods before calling down into the lower-level io methods.
(In reply to Nathan Froyd (:froydnj) from comment #22)
> I don't think those should be MOZ_ASSERT, they should be checks for iVersion
> >= 3.  Otherwise, we have to stub the functions.

SQLite defines those functions in the macro regardless of the io methods version, it just doesn't call them, based on the version.

> In a similar fashion, I think the xShm* methods that we were defining for
> iVersion >= 2 now need to do version checks themselves.

Those methods are again always defined in the macro, though they can be defined to a null pointer, that is why SQLite null-checks them, and we do the same before adding pointers. For these SQLite checks both the version and null.

I won't be again even stricter checks, I suggested MOZ_ASSERT cause, as the things are right now, hitting those would be a coding error not a runtime error. Whether this may change in future is out of my knowledge.
Don't have strong feelings about the approach, even if you want to duplicate SQLite checks on our side, it's not required though.
The assertion added here is incorrect, and I'm going to take it out.
Nevermind, this whole patch is busted, and it has already been backed out. :(
OK, new patch with slightly more lenient asserts.  Try run:

https://tbpl.mozilla.org/?tree=Try&rev=1a3d27ec65f6
Attachment #758540 - Attachment is obsolete: true
Attachment #759147 - Flags: review?(mak77)
Comment on attachment 759147 [details] [diff] [review]
define xFetch and xUnfetch methods in TelemetryVFS file objects

Review of attachment 759147 [details] [diff] [review]:
-----------------------------------------------------------------

::: storage/src/TelemetryVFS.cpp
@@ +303,5 @@
> +xUnfetch(sqlite3_file *pFile, sqlite3_int64 iOff, void *pResOut)
> +{
> +  telemetry_file *p = (telemetry_file *)pFile;
> +  MOZ_ASSERT(p->pReal->pMethods->iVersion >= 3);
> +  return p->pReal->pMethods->xUnfetch(p->pReal, iOff, pResOut); 

trailing space
Attachment #759147 - Flags: review?(mak77) → review+
https://hg.mozilla.org/mozilla-central/rev/dc8e78ed8c44
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla24
No crashes in the last 4 weeks. I guess we can call this verified fixed.
Status: RESOLVED → VERIFIED
Depends on: 926181
You need to log in before you can comment on or make changes to this bug.