Closed Bug 1090961 Opened 10 years ago Closed 10 years ago

Bookmarks.jsm and History.jsm need a solution for concurrent Sqlite transactions

Categories

(Toolkit :: Places, defect)

defect
Not set
normal
Points:
5

Tracking

()

RESOLVED FIXED
mozilla37
Iteration:
37.2

People

(Reporter: mak, Assigned: mak)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 1 obsolete file)

The short story here is that Sqlite.jsm executeTransaction throws everytime it tries to create a transaction if one is already running.

I filed Bug 1079180 about this, but didn't find a solution, though at that time my use-case was weak.

Now the problem is the Sqlite.jsm behavior is problematic, cause it might make the bookmarks API unreliable.
Suppose code A is restoring bookmarks, while code B is trying to create a new bookmark. B will very likely throw, without a good reason.

The solutions I analyzed in https://bugzilla.mozilla.org/show_bug.cgi?id=1079180#c1 are still valid, but any solution so far has downsides.

We cannot just use savepoints cause we share the connection with add-ons and other code. Mixing up savepoints and transactions breaks atomicity (a commit clears any pending savepoint) and if a RELEASE fails, we don't know if it failed due to a real error (and thus we should rollback) or cause a commit cleared the savepoints stack.
We could make Sqlite ignore transaction errors, but we'd sacrifice atomicity again, we could not completely rollback an operation in case of error.
We could implement a queue, so that if a transaction is in progress we enqueue the operation and retry later, but that will have performance hits, and what if an add-on forgets an open transaction? we'd wait forever.
Blocks: asyncHistory
Summary: Bookmarks.jsm needs a solution for concurrent Sqlite transactions → Bookmarks.jsm and History.jsm needs a solution for concurrent Sqlite transactions
I'd vote for the queue. It's just a few lines of code, and it's simple to understand.
This is very problematic for History.jsm too, and for any Sqlite.jsm consumer sharing a connection through multiple components/consumers.

For pure Sqlite.jsm solutions, we could either use savepoints (and ensure the consumer doesn't issue a BEGIN query) or implement an internal queue (if a transaction is running and another one is requested, we can add it to a queue and run it later).

For wrapped connections it's harder, cause even the initial BEGIN could fail if something else has a transaction ongoing. Tracking external transactions is more complex, we should expose sqlite3_commit_hook and sqlite3_rollback_hook, in case a transaction is ongoing register with both a callback that will unregister itself and try to process the queue.

The risk of using a queue is that someone opens a transaction and forgets to commit it, for which likely we need a timeout, after which we'll proceed without a transaction.

This may take a bit of time to get right, so in the meanwhile a simple workaround (that will throw away atomicity) would be to make wrapped connection ignore BEGIN TRANSACTION failures, and just proceed without a transaction.
OK, so, queue + ignoring failures for wrapped connections, sounds like the best path forward for now.  In future we can improve the queue with hooks and timeouts.

Unfortunately a queue doesn't help if I have a a very long transaction (like restoring 1 thousands bookmarks) and I'm trying to add a single bookmark... the latter will wait in queue for a long time even if it's a very simple operation :(
well, we don't know that a transaction is running (cause maybe history.cpp created one), so we will know than just when we'll try to BEGIN TRANSACTION and it will throw... what to do at this point? we cannot interrupt the operation cause it would be for causes we can't control, if we enqueue the operation we don't know when the current transaction will finish (we need the hooks)
we can at a maximum reportError the problem, we cannot predict if a transaction already exists, and if it exists is not an error (remember we wrapped a connection we don't know anything about)
Fair enough. Then bug 1080457, to help us keep track of transactions that should not have taken place concurrently.
Marco, can you please estimate this and set qe+/-?
Flags: needinfo?(mak77)
Flags: firefox-backlog+
David, what do you think about picking this up? That's of course given your current workload allows it. It's blocking a bunch of stuff so it would be nice to prioritize it next.
Flags: needinfo?(dteller)
Not during Q4, I'm afraid. My plate is really full.
Flags: needinfo?(dteller)
(In reply to David Rajchenbach-Teller [:Yoric] (away until November 17th - use "needinfo") from comment #11)
> Not during Q4, I'm afraid. My plate is really full.

Ok, thanks for the quick response!
Making this "perfect" would require deeper changes, also involving Task.jsm due to an underlying architectural debit (see bug 1091446) that disallows us from dectecting nested transactions.

We can make this work like PlacesTransactions though, in the sense we will put a big warning about the fact nested transactions will completely block the queue, and trust consumers until we add something that allows us to fix the underlying miss.

Or we could just workaround the problem ignoring nested transactions detection completely for wrapped connections. this would be easier to do, we'd lose some more atomicity (note we lose it in any case cause the connection is wrapped, so something else can issue transactions on it).
Points: --- → 5
Flags: needinfo?(mak77)
Flags: qe-verify-
Summary: Bookmarks.jsm and History.jsm needs a solution for concurrent Sqlite transactions → Bookmarks.jsm and History.jsm need a solution for concurrent Sqlite transactions
Assignee: nobody → mak77
Status: NEW → ASSIGNED
Iteration: --- → 36.3
Attached file MozReview Request: bz://1090961/mak (obsolete) —
Attachment #8526667 - Flags: review?(dteller)
/r/879 - Bug 1090961 - Enqueue Sqlite.jsm transactions. r=Yoric

Pull down this commit:

hg pull review -r 22e84b7a58f47236aeae13c64da3f290eb47e317
this doesn't yet include a solution for bug 1091446. I have a solution for places transactions at hand, not yet one for Sqlite.jsm.
Actually, I must recheck if this implementation works fine in case or more than 2 consumers, cause both could end up yielding on the same promise and then trying to run concurrently...
Iteration: 36.3 → 37.1
https://reviewboard.mozilla.org/r/877/#review533

::: toolkit/modules/Sqlite.jsm
(Diff revision 1)
> -  this._log = Log.repository.getLoggerWithMessagePrefix("Sqlite.Connection." +
> +  this._log = Log.repository.getLoggerWithMessagePrefix("Sqlite.Connection",

That change doesn't look useful.

::: toolkit/modules/Sqlite.jsm
(Diff revision 1)
> -      // purpose: catching errors in statement execution.
> +      // purpose: catching errors in statement execution.      

Nit: whitespace.

::: toolkit/modules/Sqlite.jsm
(Diff revision 1)
> +          yield this._inProgressTransaction.promise;

This will fail if you have several transactions queued here.

You need to do something along the lines of:

// In the constructor
this._transactionQueue = Promise.resolve();

// Acquiring the queue
let queue = this._transactionQueue;
let promise = Task.spawn(function*() {
  // ...
});
this._transactionQueue = promise.then(null, () => {});
return promise.then(null, null);

Since we do that in OS.File already, this sounds like something we may want to put in a module at some point.

::: toolkit/modules/Sqlite.jsm
(Diff revision 1)
> +        if (!wrappedConnections.has(this._identifier))

I don't really understand these lines.

::: toolkit/modules/Sqlite.jsm
(Diff revision 1)
> +        // In such a case we'll just continue without a transaction.

That comment is not very clear.

::: toolkit/modules/tests/xpcshell/test_sqlite.js
(Diff revision 1)
> -add_task(function test_detect_multiple_transactions() {
> +add_task(function test_multiple_transactions() {

Let's make it `function*`.

::: toolkit/modules/tests/xpcshell/test_sqlite.js
(Diff revision 1)
> -      });
> +  });

Could you rather make it 10 concurrent transactions each with several `yield`, to be on the safe side?

::: toolkit/modules/tests/xpcshell/test_sqlite.js
(Diff revision 1)
> +add_task(function test_wrapped_connection_transaction() {

Nit: A comment explaining what the test does might be useful.
Also, `function*`.

::: toolkit/modules/tests/xpcshell/test_sqlite.js
(Diff revision 1)
> +  yield c.executeSimpleSQLAsync("BEGIN");

What's the expected behavior of

yield c.executeSimpleSQLAsync("BEGIN");
yield c.executeSimpleSQLAsync("BEGIN");
 // ...
(In reply to David Rajchenbach-Teller [:Yoric] (away until November 17th - use "needinfo") from comment #18)
> https://reviewboard.mozilla.org/r/877/#review533
> 
> ::: toolkit/modules/Sqlite.jsm
> (Diff revision 1)
> > -  this._log = Log.repository.getLoggerWithMessagePrefix("Sqlite.Connection." +
> > +  this._log = Log.repository.getLoggerWithMessagePrefix("Sqlite.Connection",
> 
> That change doesn't look useful.

well, it is actually a bug fix cause currently we print "undefined". the method expects 2 arguments and we were providing just one 

> ::: toolkit/modules/Sqlite.jsm
> (Diff revision 1)
> > +        if (!wrappedConnections.has(this._identifier))
> 
> I don't really understand these lines.
> 
> ::: toolkit/modules/Sqlite.jsm
> (Diff revision 1)
> > +        // In such a case we'll just continue without a transaction.
> 
> That comment is not very clear.

The comment states we continue without a transaction and it's what happens... what do you suggest?
we cannot in any way handle this error for wrapped connections, no way.

> ::: toolkit/modules/tests/xpcshell/test_sqlite.js
> (Diff revision 1)
> > +  yield c.executeSimpleSQLAsync("BEGIN");
> 
> What's the expected behavior of
> 
> yield c.executeSimpleSQLAsync("BEGIN");
> yield c.executeSimpleSQLAsync("BEGIN");
>  // ...

it would throw. transactions cannot be nested.
Attachment #8526667 - Flags: review?(dteller)
/r/879 - Bug 1090961 - Enqueue Sqlite.jsm transactions. r=Yoric

Pull down this commit:

hg pull review -r cc6ada5c75b53ab064dc351307fbd59f9886ad6e
For the transactions queue I got some help by Paolo, it now supports a timeout to avoid hanging indefinitely
Iteration: 37.1 → 37.2
https://reviewboard.mozilla.org/r/877/#review791

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
> -    let onRollback = this._finalize.bind(this, this._deferredClose);
> +    return this._transactionQueue.then(this._finalize.bind(this));

Nit: () => this._finalize()

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
> -  this._inProgressTransaction = null;
> +  this._inProgressTransaction = false;

If this is a boolean, can you prefix with `is` or `has`?

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
>      // performing finalization.

I have the impression that this statement is not true anymore. I don't see a ROLLBACK anywhere.

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
> -  this._deferredClose = Promise.defer();
> +  this._deferredClose = PromiseUtils.defer();

Could you take the opportunity to document this property?

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
> +  this._transactionQueue = Promise.resolve();

Could you document this property?

e.g. Can it reject?

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
> -    Task.spawn(function doTransaction() {
> +      if (this._closeRequested)

if (...) {
}

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
> +            if (!wrappedConnections.has(this._identifier))

Braces, please.

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
> +          } catch (ex) {

Is there any way to be more specific and catch only the "right" kind of error?

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
> +              throw ex;

Could you log this? Is there a way to find out which statements are being executed externally? If so, that would be a nice addition to the log, either immediately or as a followup.

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
> -        // It's possible that a request to close the connection caused the
> +            // It's possible that a closed connection caused the error.

That's a bit ambiguous. Do you mean that one of the possible reasons for throwing here is that `this` connection may be closed?

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
> +          if (this._inProgressTransaction) {

In which case can this be false?

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
> -        this._inProgressTransaction = null;
> +        setTimeout(() => reject(new Error("Deadlock detected")),

While it's very likely to be a deadlock, there is no certainty. I'd prefer something along the lines of "Transaction timeout, most likely caused by a timeout".

Also, this begs for documentation.

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
> -    return deferred.promise;
> +    this._transactionQueue = promise.catch(ex => { console.error(ex) });

Ok, that's better than `Cu.reportError`.

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
> -  let deferred = Promise.defer();
> +  

Nit: Whitespace.

::: toolkit/modules/Sqlite.jsm
(Diff revision 2)
> +   * *****************************************************************************

As a side-note, we might be able to detect simple cases, if we find out that this is a common happenstance.

::: toolkit/modules/tests/xpcshell/test_sqlite.js
(Diff revision 2)
> -  let errored = false;
> +  let promise = c.executeTransaction(function transaction(conn) {

`function*`?

::: toolkit/modules/tests/xpcshell/test_sqlite.js
(Diff revision 2)
> +  let errored = false;

This looks like a good place to use `Assert.rejects`.

::: toolkit/modules/tests/xpcshell/test_sqlite.js
(Diff revision 2)
> -  do_check_eq(rows.length, 1);
> +  do_check_eq(rows.length, 20);

It might be interesting to check whether all values have been added.
Attachment #8526667 - Flags: review?(dteller) → feedback+
https://reviewboard.mozilla.org/r/877/#review831

> Is there any way to be more specific and catch only the "right" kind of error?

unfortunately beginTransaction returns very generic NS_ERROR_FAILURE that is the most common error we return around. I don't think we are going to gain anything by filtering on that :(

> Could you log this? Is there a way to find out which statements are being executed externally? If so, that would be a nice addition to the log, either immediately or as a followup.

Not off-hand. but one can use NSPRLOG to check all executed statements.

> In which case can this be false?

I forgot to set _hasInProgresstransaction = false in the wrapped connections case, now it should be clearer.

> As a side-note, we might be able to detect simple cases, if we find out that this is a common happenstance.

Based on discussions I had with Paolo, we don't seem to be interested in complicating the Task/Promise code to detect simple cases when in any case the possibilities to break stuff are so wide.

> It might be interesting to check whether all values have been added.

I don't think it's the goal of this test to check Sqlite properly executes queries
Attachment #8526667 - Flags: feedback+ → review?(dteller)
/r/879 - Bug 1090961 - Enqueue Sqlite.jsm transactions. r=Yoric

Pull down this commit:

hg pull review -r ee25637fc05237e3c9fc7c4a171c384dbd6adba8
https://reviewboard.mozilla.org/r/877/#review835

::: toolkit/modules/Sqlite.jsm
(Diff revisions 2 - 3)
> +              this._hasInProgressTransaction = false;

There is something I don't understand.
If `this._hasInProgressTransaction` was already `true` before we called `executeTransaction`, this line is going to set it to `false` although the transaction has not been completed.

Can you explain me why this should work?

Also, I have the intuition that it's actually `if (!wrapConnections.has(...))`, no?
https://reviewboard.mozilla.org/r/877/#review837

> There is something I don't understand.
> If `this._hasInProgressTransaction` was already `true` before we called `executeTransaction`, this line is going to set it to `false` although the transaction has not been completed.
> 
> Can you explain me why this should work?
> 
> Also, I have the intuition that it's actually `if (!wrapConnections.has(...))`, no?

if the connection is wrapped and BEGIN TRANSACTION failed, means someone else had a transaction ongoing. Thus we are not managing this transaction. In such a case it makes sense that _hasInProgressTransaction is false for 2 reasons:
1. it's not OUR transaction, so we should not lie to the consumer (fwiw, the hasTransaction API we expose is pointless with enqueuing)
2. we should not try to commit or rollback since it's not our transaction, we check _hasInProgressTransaction to avoid doing that.
https://reviewboard.mozilla.org/r/877/#review839

::: toolkit/modules/Sqlite.jsm
(Diff revisions 2 - 3)
> +              this._hasInProgressTransaction = false;

Let me expand.

We set `this._hasInProgressTransaction = true` without checking whether it is already `true`. If I understand the state, it should always be `false` at that state, but that's pretty much not documented/checked, so an assertion would clarify that situation a lot.

Also, I didn't understand the comment `"The transaction is not handled by us"` when I first read the code, so my guess is that I won't be the only one. Perhaps something along the lines of `"Apparently, a transaction was started by a client of the same connection that doesn't use Sqlite.jsm (e.g. in C++). The best we can do is proceed without a transaction and hope for the best."`? Also, let's log this, it might help us find out about such conflicts.

Also, `this._hasInProgressTransaction` is used for two things: informing clients of `get transactionInProgress()` that we indeed have a transaction in progress, and finding out whether `executeTransaction` could acquire a transaction or had to fallback to non-transaction mode. While I believe that both are indeed equal, I would find it easier to trust if we had a local variable for the latter use. This way, we wouldn't have switch the brain to figuring-out-async-state-machin-mode to understand the fallback case.

::: toolkit/modules/Sqlite.jsm
(Diff revisions 2 - 3)
>      this._transactionQueue = promise.catch(ex => { console.error(ex) });

By the way, this is the tricky part of the transaction queue, so I'd like a comment to document the fact that we are atomically updating `this._transactionQueue`, before anyone else has a chance to enqueue another transaction.
https://reviewboard.mozilla.org/r/877/#review841

::: toolkit/modules/Sqlite.jsm
(Diff revision 3)
> +          } catch (ex) {

As a side-note, at low-level, this is probably a `SQLITE_BUSY` error, right? If so, it would be interesting (in a followup bug), to expose this to mozStorage and Sqlite.jsm.
(In reply to David Rajchenbach-Teller [:Yoric] (hard to reach until December 10th - use "needinfo") from comment #28)
> https://reviewboard.mozilla.org/r/877/#review841
> 
> ::: toolkit/modules/Sqlite.jsm
> (Diff revision 3)
> > +          } catch (ex) {
> 
> As a side-note, at low-level, this is probably a `SQLITE_BUSY` error, right?
> If so, it would be interesting (in a followup bug), to expose this to
> mozStorage and Sqlite.jsm.

No, I think it's a generic SQLITE_ERROR, but I didn't verify. SQLITE_BUSY wouldn't make much sense.
(In reply to David Rajchenbach-Teller [:Yoric] (hard to reach until December 10th - use "needinfo") from comment #27)
> We set `this._hasInProgressTransaction = true` without checking whether it
> is already `true`. If I understand the state, it should always be `false` at
> that state, but that's pretty much not documented/checked, so an assertion
> would clarify that situation a lot.

Yes it should never be true.
Which kind of "assertion" do you expect there? should we throw?

> Also, I didn't understand the comment `"The transaction is not handled by
> us"` when I first read the code, so my guess is that I won't be the only
> one. Perhaps something along the lines of `"Apparently, a transaction was
> started by a client of the same connection that doesn't use Sqlite.jsm (e.g.
> in C++). The best we can do is proceed without a transaction and hope for
> the best."`? Also, let's log this, it might help us find out about such
> conflicts.

Sure, I can improve the comment.

> Also, `this._hasInProgressTransaction` is used for two things: informing
> clients of `get transactionInProgress()` that we indeed have a transaction
> in progress, and finding out whether `executeTransaction` could acquire a
> transaction or had to fallback to non-transaction mode.

Nope, transactions are enqueued, so there's no possibility we cannot acquire a transaction.
And as I said, adding transactionInProgress API was a bad idea, I'd make it always return false to be honest. I don't care what it returns and I don't think consumers will make any good use of it.
I'm evaluating to file a bug to Deprecate.warning from it.

> ::: toolkit/modules/Sqlite.jsm
> (Diff revisions 2 - 3)
> >      this._transactionQueue = promise.catch(ex => { console.error(ex) });
> 
> By the way, this is the tricky part of the transaction queue, so I'd like a
> comment to document the fact that we are atomically updating
> `this._transactionQueue`, before anyone else has a chance to enqueue another
> transaction.

ok.
To be clear, having transactionInProgress return true when we actually were not able to open a transaction (due to wrapping) would make it worse, imo. So that's why I think makes more sense to use a single property to track both.
/r/879 - Bug 1090961 - Enqueue Sqlite.jsm transactions. r=Yoric

Pull down this commit:

hg pull review -r f983a070064fd8511f948be8bf54407659ce7030
(In reply to Marco Bonardo [::mak] (needinfo? me) from comment #30)
> (In reply to David Rajchenbach-Teller [:Yoric] (hard to reach until December
> 10th - use "needinfo") from comment #27)
> > We set `this._hasInProgressTransaction = true` without checking whether it
> > is already `true`. If I understand the state, it should always be `false` at
> > that state, but that's pretty much not documented/checked, so an assertion
> > would clarify that situation a lot.
> 
> Yes it should never be true.
> Which kind of "assertion" do you expect there? should we throw?

I'd say `Warning.fail`, but that module doesn't exist yet :)
So, yes, I guess log and throw.

> > Also, `this._hasInProgressTransaction` is used for two things: informing
> > clients of `get transactionInProgress()` that we indeed have a transaction
> > in progress, and finding out whether `executeTransaction` could acquire a
> > transaction or had to fallback to non-transaction mode.
> 
> Nope, transactions are enqueued, so there's no possibility we cannot acquire
> a transaction.

Well, yes, there is a possibility: if the connection is wrapped and we cannot `BEGIN TRANSACTION`.

> And as I said, adding transactionInProgress API was a bad idea, I'd make it
> always return false to be honest. I don't care what it returns and I don't
> think consumers will make any good use of it.
> I'm evaluating to file a bug to Deprecate.warning from it.

That sounds good. Would the bug also replace `this._hasInProgressTransaction` with a local varable?
Attachment #8526667 - Flags: review?(dteller) → review+
https://reviewboard.mozilla.org/r/877/#review843

Trivial nit, then ship it!

::: toolkit/modules/Sqlite.jsm
(Diff revisions 3 - 4)
> +          console.error("Unexpected transaction in progress when trying to start a new one.");

Nit: Braces.
https://hg.mozilla.org/integration/fx-team/rev/00b588dac8bc
Flags: in-testsuite+
Target Milestone: --- → mozilla37
https://hg.mozilla.org/mozilla-central/rev/00b588dac8bc
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Attachment #8526667 - Attachment is obsolete: true
Attachment #8618499 - Flags: review+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: