Closed Bug 1070447 Opened 10 years ago Closed 10 years ago

Sqlite causing crash on Windows XP, possibly other OSes

Categories

(Firefox :: Untriaged, defect)

32 Branch
x86
Windows XP
defect
Not set
critical

Tracking

()

RESOLVED DUPLICATE of bug 1005991

People

(Reporter: ray_engelking, Unassigned)

Details

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0
Build ID: 20140917194002

Steps to reproduce:

I have built a very successful extension that uses a sqlite database from xul; it is called pin4ever (download at pin4ever.com) and it backs up Pinterest pins among other things.  It opens a connection to a local file, and then runs sql statements to that file:

get dbService() {	return Components.classes["@mozilla.org/storageservice;1"].getService(Components.interfaces.mozIStorageService);},
dbConnection=dbService.openDatabase(dbFile);

let statement=Pin4EverSql.dbConnection.createStatement(sql)
statement.params.blah_blah_blah=blah_blah_blah
statement.executeAsync

It keeps the connection open and runs thousands of statements, depending on the size of the Pinterest account (although the crash does not seem to be related to the size), and on some systems it causes Firefox to crash on the executeAsync; it does this at random times, on entirely different sql statements, both INSERTs and regular single SELECTs, is not very reproducable at all, and I have tried many things to eliminate this problem and it definitely occurs with sqlite on XP machines.  On my Windows 7 and Mac OSX Mavericks, I can't even get it to crash, even with the exact same data and parameters, with versions all the same, and clean profiles.  I did manage to cut the crashes way down by combining statements into a batch where ever possible, but it still is occurring, even on single statements.  I have even imaged an XP machine that it was crashing on and run the image on VMWare on my windows 7 computer, and still had the same exact crashing occurring in the virtual machine as on the actual machine.  Once it starts occurring it seems to get progressively worse.  Creating a new sqlite database along with a clean Firefox profile, seems to help restore the machine to pre-error conditions, so not sure where the problem is.  I have also tried taking the sqlite database from a crashed XP machine and putting it on my Windows 7 machine, and working with the same extension, and same version of Firefox and can not get it to crash.  The machines don't appear to be running out of RAM either. 


Actual results:

Here are crash logs from one XP machine, which on it was rather difficult to get to crash:
https://crash-stats.mozilla.com/report/index/1d5cf252-cb96-4901-8601-1a8972140723

Here are crash logs from another older XP machine, that crashes more frequently (the one I also imaged and did tests in a VM):
https://crash-stats.mozilla.com/report/index/bp-9b8f92c4-8c9d-4a20-9515-e8e582140919
https://crash-stats.mozilla.com/report/index/bp-efc183dc-0e35-4f17-995e-53fde2140916
https://crash-stats.mozilla.com/report/index/bp-11b840a8-3f8a-4bad-ac40-e93172140916
https://crash-stats.mozilla.com/report/index/bp-03093625-a0e5-4bc1-a0f7-be15d2140916
https://crash-stats.mozilla.com/report/index/bp-c95e0984-ba94-49e2-b763-7f3922140915

They all appear to have EXCEPTION_ACCESS_VIOLATION_READ as the Crash Reason.  We have many customers reporting the same problem.  This problem seemed to have started occurring in February of this year.  We have kept waiting hoping that newer version would correct this problem, but they have not.
Severity: normal → critical
OS: Windows 7 → Windows XP
Hardware: x86_64 → x86
Ray, can you provide a copy of the problematic DB and more detailed steps regarding how to reproduce after installing the add-on?

Marco, do you have ideas about what's going on here and/or how to debug this further?
Flags: needinfo?(ray_engelking)
Flags: needinfo?(mak77)
off-hand it looks like a thread-safety issue. And basically this is a duplicate of bug 1005991, we indeed have some code doing threaded stuff in a not very safe way. Unfortunately the change there is not trivial and I never finished that patch due to lack of time. Hope to have some time to work on it soon.

You might try to make your code a little bit more correct and see if that helps, for example you should use mozIStorageAsyncConnection and not mozIStorageConnection.
For example something you are doing wrong is using createStatement and then executeasyncStatement. you should have used createAsyncStatement. mozIStorageAsyncConnection will ensure you cannot do these mistakes. you can open one through openAsyncDatabase but I'd rather suggest using Sqlite.jsm that will do most of the work for you.

I'm duping this bug there. Since you can likely not access that bug, feel free to keep posting here, whether the suggested solution helps any way.
Status: UNCONFIRMED → RESOLVED
Closed: 10 years ago
Flags: needinfo?(mak77)
Resolution: --- → DUPLICATE
The only way I was able to get a crash was to download the pin4ever extension onto an older windows XP computer, and then run multiple backups on it of a Pinterest account that had over 10,000 "pins", including backing up thousands of "Followers" and "Followings" (you have to check those off to backup, as they are unchecked by default).  Somewhere along the way, due to some unknown combination of Pinterest returning data, and the database being updated with new data, a crash will occur.  The crashes never occur at the same point in the data, even when starting from scratch, with a new profile, and the exact same starting point (I know this because I was running it from an image in VMWare of the older XP machine and would start from the same snapshot).  The only thing that would be different would be the pages being returned from Pinterest (Pinterest regularly returns all kinds of partial pages and errors, even to users just using Pinterest without any addons).

I can run the same exact test from antoher XP machine or Windows 7 machine, using the crashed database, and never get a crash, so I don't think the database is being corrupted.


 
(In reply to :Gijs Kruitbosch from comment #1)
> Ray, can you provide a copy of the problematic DB and more detailed steps
> regarding how to reproduce after installing the add-on?
> 
> Marco, do you have ideas about what's going on here and/or how to debug this
> further?
Flags: needinfo?(ray_engelking)
Some of the async statements that you mentioned are not documented anywhere that I can see, and according to the current documentation, I am doing things correctly, but you are the expert so I would like to try what you suggest...at this point I will try anything.  However, I can't find any documentation on the openAsyncDatabase statement.  Is there documentation somewhere, or example code for using openAsyncDatabase, as it apparently requires some additional parameters.

I also would not mind trying the Sqlite.jsm, although the documentation on that seems even more sparse.

(In reply to Marco Bonardo [::mak] (needinfo? me) from comment #2)
> off-hand it looks like a thread-safety issue. And basically this is a
> duplicate of bug 1005991, we indeed have some code doing threaded stuff in a
> not very safe way. Unfortunately the change there is not trivial and I never
> finished that patch due to lack of time. Hope to have some time to work on
> it soon.
> 
> You might try to make your code a little bit more correct and see if that
> helps, for example you should use mozIStorageAsyncConnection and not
> mozIStorageConnection.
> For example something you are doing wrong is using createStatement and then
> executeasyncStatement. you should have used createAsyncStatement.
> mozIStorageAsyncConnection will ensure you cannot do these mistakes. you can
> open one through openAsyncDatabase but I'd rather suggest using Sqlite.jsm
> that will do most of the work for you.
> 
> I'm duping this bug there. Since you can likely not access that bug, feel
> free to keep posting here, whether the suggested solution helps any way.
> 
> *** This bug has been marked as a duplicate of bug 1005991 ***
Hi Marco, why can I not view the duplicate bug that this is linked to?  Is there any time frame for getting that bug fixed, as this bug seemed to just appear back in Feb of this year.  Also, I am still trying to convert over using Sqlite.jsm, and it is proving very difficult for me.  The first link you sent me on Sqlite.jsm does not even have a basic example of how to pull data from a select statement.  Plus, I am not sure the best way to use Sqlite.jsm, is it better to keep opening a connection and closing it, or just leave a connection open to share, and can two connections be open to the same file at once?  Thanks for all your help.
(In reply to Ray Engelking from comment #6)
> Hi Marco, why can I not view the duplicate bug that this is linked to?Is
> there any time frame for getting that bug fixed, as this bug seemed to just
> appear back in Feb of this year.

Asap, unfortunately the resources are limited and I'm working on other priorities atm.

> Also, I am still trying to convert over
> using Sqlite.jsm, and it is proving very difficult for me.  The first link
> you sent me on Sqlite.jsm does not even have a basic example of how to pull
> data from a select statement.

I'm sorry, we need to improve the documentation, or you could do once you figure what's missing too, and that would be very welcome!
you can look at our code, for example
http://mxr.mozilla.org/mozilla-central/search?string=conn.executeCached
you can even look at the Sqlite.jsm test
http://mxr.mozilla.org/mozilla-central/source/toolkit/modules/tests/xpcshell/test_sqlite.js

> Plus, I am not sure the best way to use
> Sqlite.jsm, is it better to keep opening a connection and closing it, or
> just leave a connection open to share

It depends on the scope, if it's used for a one-shot query (for example I have an add-on that on request does db cleanup) then it's better to open and close, otherwise it's better to keep the connection around and close it at shutdown (See Sqlite.shutdown.addBlocker)

> and can two connections be open to
> the same file at once?

Absolutely yes, and they will work concurrently (one can read even if the other one is writing) if you use WAL journaling for your database. This is just a wrapper around mozStorage, so whatever you can do with one can be done with the other one.
Thanks for all the info.  I am still getting stuck converting over as I was using lastInsertRowID in a few places and I have searched everywhere in the links you gave me, and all I could find is that it is discontinued, but no alternative.  I need to get the id of new INSERTs, please help.
I ended up opening an extra connection for the insert statement hopefully to get a separate mutex, and then doing a SELECT last_insert_rowid() to get the id.. not sure if this is the best way to do it though, or if Sqlite.jsm provides something better.
last_insert_rowid is not reliable for our API, so I'd suggest to SELECT id FROM table ORDER BY rowid DESC LIMIT 1 (or select max(rowid) if you use the rowid).
you don't need to open a separate connection, indeed if you do it's more likely you will get the wrong result since the connections would be concurrent (one could read while the other one writes)
that seems like a good way to do it, other than I realized I am also using some "INSERT OR REPLACE" statements, which your technique would not work with because the last insert or replace may not be the highest row_id. The "SELECT last_insert_rowid() AS lastInsertRowID" does not have this problem. Either way, if another thread is writing, I believe, even with your suggested method, it still will get the wrong number.  Plus, if AUTOINCREMENT is not being used with the ROWID, it is possible for the ROWID to be reused and not be higher, so your suggestion would fail in that case too.  The good thing about your suggestion is it does give a table specific id, not the last id regardless of table.

You are right though, I double checked and opening another connection won't help.
Although the sqlite.org reference does say "If a separate thread performs a new INSERT on the same database connection while the sqlite3_last_insert_rowid() function is running and thus changes the last insert rowid, then the value returned by sqlite3_last_insert_rowid() is unpredictable and might not equal either the old or the new last insert rowid."

"on the same database connection" is the key here... that makes it sound like opening a new connection will help.  I'll do a test to see if that is the case, and see it helps and report back here.
last_insert_rowid is per connection. But it's sort of a footgun in our code cause we have both a synchronous and an asynchronous API acting on a single connection. If you would use both APIs, then you enter the multiple threads on single connection case that is unpredictable. That's why we don't expose it.

Though if you use *only* Sqlite.jsm, then "SELECT last_insert_rowid()" should be reliable, cause everything runs in a single thread. if you start mixing it with legacy synchronous mozStorage API, you will have issues.

With multiple connections, last_insert_rowid is for each connection, so it should be reliable in the connection scope.

I also think INSERT OR REPLACE changes the rowid of a replaced row so it should still assign the higher rowid.

Btw, how are you using multiple threads from js? Since I don't know well what you are trying to do and how, it's hard to give suggestions.
Good news, I did a test and it appears that the last_insert_rowid() is per connection.
Here's the test:
Task.spawn(function() {
	try {
		let dbFile=Pin4EverLogin.getLocalDirectory();
		dbFile.append("pin4ever.sqlite");
		dbConnection1=yield Sqlite.openConnection({path: dbFile.path});
		dbConnection2=yield Sqlite.openConnection({path: dbFile.path});
		dbConnection3=yield Sqlite.openConnection({path: dbFile.path});
		dbConnection4=yield Sqlite.openConnection({path: dbFile.path});

		let result1=yield dbConnection1.execute("INSERT INTO accounts (username) VALUES('testing1')");
		let result2=yield dbConnection2.execute("INSERT INTO accounts (username) VALUES('testing2')");
		let result3=yield dbConnection3.execute("INSERT INTO accounts (username) VALUES('testing3')");
		let result4=yield dbConnection4.execute("INSERT INTO accounts (username) VALUES('testing4')");

		result1=yield dbConnection1.execute("SELECT last_insert_rowid() AS lastInsertRowID");
		result2=yield dbConnection2.execute("SELECT last_insert_rowid() AS lastInsertRowID");
		result3=yield dbConnection3.execute("SELECT last_insert_rowid() AS lastInsertRowID");
		result4=yield dbConnection4.execute("SELECT last_insert_rowid() AS lastInsertRowID");

		let lastInsertRowID1=result1[0].getResultByName("lastInsertRowID");
		let lastInsertRowID2=result2[0].getResultByName("lastInsertRowID");
		let lastInsertRowID3=result3[0].getResultByName("lastInsertRowID");
		let lastInsertRowID4=result4[0].getResultByName("lastInsertRowID");
		dump("lastInsertRowID1:"+lastInsertRowID1+"\n");
		dump("lastInsertRowID2:"+lastInsertRowID2+"\n");
		dump("lastInsertRowID3:"+lastInsertRowID3+"\n");
		dump("lastInsertRowID4:"+lastInsertRowID4+"\n");
		
		result1=yield dbConnection1.execute("SELECT last_insert_rowid() AS lastInsertRowID");
		result2=yield dbConnection2.execute("SELECT last_insert_rowid() AS lastInsertRowID");
		result3=yield dbConnection3.execute("SELECT last_insert_rowid() AS lastInsertRowID");
		result4=yield dbConnection4.execute("SELECT last_insert_rowid() AS lastInsertRowID");
		
		lastInsertRowID1=result1[0].getResultByName("lastInsertRowID");
		lastInsertRowID2=result2[0].getResultByName("lastInsertRowID");
		lastInsertRowID3=result3[0].getResultByName("lastInsertRowID");
		lastInsertRowID4=result4[0].getResultByName("lastInsertRowID");
		dump("lastInsertRowID1:"+lastInsertRowID1+"\n");
		dump("lastInsertRowID2:"+lastInsertRowID2+"\n");
		dump("lastInsertRowID3:"+lastInsertRowID3+"\n");
		dump("lastInsertRowID4:"+lastInsertRowID4+"\n");

	} catch (err) {
		Pin4EverLogin.logE("testDB sql error:"+err.message);
	} finally {
		yield dbConnection1.close();
		yield dbConnection2.close();
		yield dbConnection3.close();
		yield dbConnection4.close();
	}
});

and here is the result:

lastInsertRowID1:25
lastInsertRowID2:26
lastInsertRowID3:27
lastInsertRowID4:28
lastInsertRowID1:25
lastInsertRowID2:26
lastInsertRowID3:27
lastInsertRowID4:28
(In reply to Marco Bonardo [::mak] (needinfo? me) from comment #13)
> last_insert_rowid is per connection. But it's sort of a footgun in our code
> cause we have both a synchronous and an asynchronous API acting on a single
> connection. If you would use both APIs, then you enter the multiple threads
> on single connection case that is unpredictable. That's why we don't expose
> it.
> 
> Though if you use *only* Sqlite.jsm, then "SELECT last_insert_rowid()"
> should be reliable, cause everything runs in a single thread. if you start
> mixing it with legacy synchronous mozStorage API, you will have issues.
> 
> With multiple connections, last_insert_rowid is for each connection, so it
> should be reliable in the connection scope.
> 
> I also think INSERT OR REPLACE changes the rowid of a replaced row so it
> should still assign the higher rowid.
> 
> Btw, how are you using multiple threads from js? Since I don't know well
> what you are trying to do and how, it's hard to give suggestions.

I almost have everything converted over to pure Sqlite.jsm, just needed to figure out that last_insert_rowid part.  Thank you very much for your help, sorry for taking up so much of your time!
Just an update, the conversion of everything over to the Sqlite.jsm appears to have eliminated the crashing on every machine no matter how large a Pinterest account we test it on, so THANK YOU!

I wouldn't mind updating the Sqlite.jsm with some of the things that I have learned; how to I go about that?
To clarify, updating the Sqlite.jsm documentation examples.
you can just create an account on mdn and submit changes
You need to log in before you can comment on or make changes to this bug.