Created attachment 439351 [details] [diff] [review] patch v1.0 We call COMMIT TRANSACTION both for sync and async statements, the connection is tracking whether we have already created a transaction or not through mTransactionInProgress. if COMMIT TRANSACTION fails though, we don't reset mTransactionInProgress, then any further attempt to call BeginTransactionAs will fail. Also, SQLite docs seem to point out that if COMMIT TRANSACTION fails with certain return values, we should always rollback. This patch implements an half-approach, commit still resets mTransactionInProgress only on success, but the helper in case of a recoverable error tries to rollback. Rollback should keep retrying on busy and finally reset the flag if it succeeds. Resetting mTransactionInProgress regardless in transactionCommit would be still wrong since in case of failure the transaction could still be there and the user could still rollback it. My first impl was just getting the error in transactionCommit and directly ExecuteSimpleSQL("ROLLBACK TRANSACTION"), but Shawn thinks it's better to put this logic in the helper. Shawn suggested to get some feedback from Asuth We are basically trying to debug what happens in a bunch of Places tests, where an explicit mozStorageTransaction.commit() call fails with NS_ERROR_FAILURE. Looks like this could be fired as a generic storage error, but also if one tries to commit a transaction that does not exist (how could that happen?!), we still have not found the underlying reason, but returning a better error could help claryfying that (thus the DEBUG warning on generic error and the NS_ERROR_UNEXPECTED change). PS: I'm not sure i can make a test where i can cause a wanted failure of COMMIT TRANSACTION.
hm, actually re-reading the docs for the third time i noticed a WITHIN: "If certain kinds of errors occur _within_ a transaction". So looks like the rollback request is not on commit, but on other queries that run inside the transaction (is commit included though?). This could be a bit harder to manage. The above phrase "When COMMIT fails in this way, the transaction remains active and the COMMIT can be retried later after the reader has had a chance to clear." sounds like suggesting to do for commit the same we do for rollback (why don't we retry now?)
finally, even if we actually retry, will this give a meaningful success return value to the .Commit() caller? (i'm unsure how PR_Sleep is meant to work)
Comment on attachment 439351 [details] [diff] [review] patch v1.0 You are checking for NS_ERROR_ABORT twice. There is a typo on "explitic". You are probably getting bit by SQLITE_BUSY and Places sketchy use of queries on the main thread. Since both commits and rollbacks will fail in the busy case, I don't think trying to rollback in case of busy makes much sense. You probably want to busy-wait just like we do if we encounter a busy during the processing of asynchronous statements.
well SQLITE_BUSY would not give us an NS_ERROR_FAILURE exception :) btw, i just discovered a case where we don't use the transaction helper, that does not try to commit if creating the transaction fails. But this is another bug that i'll file. I should also split out the errors change to a new bug and this bug will just be about the fact if COMMIT TRANSACTION fails, we won't anymore create any transaction.
(In reply to comment #4) > well SQLITE_BUSY would not give us an NS_ERROR_FAILURE exception :) I'm not sure what you are referring to. With your patch, mozStorageTransaction::Commit will still return NS_ERROR_STORAGE_BUSY in the SQLITE_BUSY case, but now has the potential for the highly amusing following scenario: - Healthy transaction's operations complete on the async thread - Some query on the main thread sneaks in. - Automatic COMMIT (for the serious of healthy actions) fails with SQLITE_BUSY on the async thread... - Query on the main thread completes - ...automatic ROLLBACK happens on the async thread because of the SQLITE_BUSY and succeeds now because the queries from the main thread are all done. - Weird database state! In general, all of this code is only going to matter to places, so I'm not majorly concerned, but you probably should consider using systemtap to induce fault injection or some direct SQLite method assuming it supports fault injection.
(In reply to comment #5) > (In reply to comment #4) > > well SQLITE_BUSY would not give us an NS_ERROR_FAILURE exception :) > > I'm not sure what you are referring to. With your patch, > mozStorageTransaction::Commit will still return NS_ERROR_STORAGE_BUSY in the > SQLITE_BUSY case yes, we get NS_ERROR_FAILURE instead, so we don't fail on NS_ERROR_STORAGE_BUSY. I have a small patch that could prevent some race condition on our side, maybe it will be enough. About this bug, after comment 1 i was already not thinking anymore to auto rollback here, and as you point out with our "fancy" sync/async situation it would be a bigger issue. Still the fact any commit failure will kill all future transactions needs to be handled somehow...
Comment on attachment 439351 [details] [diff] [review] patch v1.0 ok, i'm splitting error changes to another bug that could get a quicker resolution, and leaving this to a better solution.
Attachment #439351 - Attachment is obsolete: true
So, if you have ideas on how this could be fixed, i can eventually put up a patch, otherwise i dunno how Storage wants to handle the case, thus for now i'm unassigning from me.
Assignee: mak77 → nobody
Apart the originally reported issue here, thus that after a failure we stop trying to create transactions, we are also not handling SQLITE_BUSY in any way: from http://www.sqlite.org/lang_transaction.html "The explicit COMMIT command runs immediately, even if there are pending SELECT statements. However, if there are pending write operations, the COMMIT command will fail with an error code SQLITE_BUSY. An attempt to execute COMMIT might also result in an SQLITE_BUSY return code if an another thread or process has a shared lock on the database that prevented the database from being updated. When COMMIT fails in this way, the transaction remains active and the COMMIT can be retried later after the reader has had a chance to clear." See also bug 860357.
You need to log in before you can comment on or make changes to this bug.