Closed Bug 1624983 Opened 5 years ago Closed 5 years ago

abort() pending read transactions in remote settings' Database.jsm when we shutdown to enable faster shutdown

Categories

(Firefox :: Remote Settings Client, defect)

Product:

Component:

Type:

defect

Priority:

Not set

Severity:

normal

Tracking

()

Status:

RESOLVED FIXED

Milestone:

Firefox 76

Tracking Flags:

Tracking

Status

firefox74

---

wontfix

firefox75

---

wontfix

firefox76

---

fixed

People

(Reporter: Gijs, Assigned: Gijs)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [fxperf:p1])

Attachments

(2 files)

Bug 1624983 - abort pending read transactions from the DB when quitting, r?asuth!,leplatrem! 5 years ago :Gijs (he/him) 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1624983 - follow-up: add missing Array.from call 5 years ago :Gijs (he/him) 47 bytes, text/x-phabricator-request		Details \| Review

Assignee

Description

•

5 years ago

Per bug 1621759 comment 6:

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #6)

(In reply to :Gijs (he/him) from comment #3)

It feels like read requests to indexeddb shouldn't be blocking shutdown (because performance), but I'm unclear on whether, per bug 1545438 comment 26, we need to do that at least for now to avoid a much longer hang (I think so.). :asuth, can you clarify whether I'm right in thinking we're better off blocking the completion of profile-before-change on those IDB requests completing for now?

If you're not going to wait for the reads, it's appropriate to invoke abort() on any IDBTransaction instances that have already been created and then call close().

It seems like we could do this fairly easily in Database.jsm by making executeIDB take an option to indicate the transaction can be aborted, and then adding that option to callers that do read-only transactions. executeIDB can use that option to keep track of pending read transactions, and when the async shutdown promise getter trips we can call abort on all those transactions. Per the spec, it seems this should cause an abort event which executeIDB can listen for, and it can reject its promise with an error in that case.

After the changes in bug 1598924 doing so should automatically throw and stop processing whatever we were doing, and then call .close(). I think this should always be fine for reads on the RS databases, though we should perhaps audit the callsites of the read-only methods from the previous paragraph to ensure there aren't problematic consumer patterns where they do multiple writes interspersed with reads that aren't in the same transaction - of course, that would already be bad, because both the writes and reads can already be aborted/throw for other reasons.

Assignee

Updated

•

5 years ago

See Also: → 1621759

Assignee

Updated

•

5 years ago

Whiteboard: [fxperf]

Assignee

Updated

•

5 years ago

Blocks: 1624878

Assignee

Updated

•

5 years ago

Assignee: nobody → gijskruitbosch+bugs

Status: NEW → ASSIGNED

Assignee

Comment 1

•

5 years ago

Attached file Bug 1624983 - abort pending read transactions from the DB when quitting, r?asuth!,leplatrem! — Details

Comment 2

•

5 years ago

Pushed by gijskruitbosch@gmail.com: https://hg.mozilla.org/integration/autoland/rev/15a52d0bf116 abort pending read transactions from the DB when quitting, r=asuth,leplatrem

Comment 3

•

5 years ago

Backout by csabou@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c012372b0db2 Backed out changeset 15a52d0bf116 for bc failures on browser_RecipeRunner.js

Cosmin Sabou [:CosminS]

Comment 4

•

5 years ago

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception&revision=15a52d0bf11640e0fbbbd30c09d7a7e822e0524a&selectedJob=295107068

Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=295107068&repo=autoland
https://treeherder.mozilla.org/logviewer.html#?job_id=295096429&repo=autoland
https://treeherder.mozilla.org/logviewer.html#?job_id=295096602&repo=autoland

Backout link: https://hg.mozilla.org/integration/autoland/rev/c012372b0db2

Flags: needinfo?(gijskruitbosch+bugs)

Assignee

Comment 5

•

5 years ago

Note: there were also xpcshell failures:

TEST-UNEXPECTED-FAIL | xpcshell-legacyconfig.ini:toolkit/components/search/tests/xpcshell/test_selectedEngine.js | test_fallback_kept_after_restart - [test_fallback_kept_after_restart : 265] A promise chain failed to handle a rejection: A mutation operation was attempted on a database that did not allow mutations. - stack: ensureShutdownBlocker/<@resource://services-settings/Database.jsm:534:21

Assignee

Comment 6

•

5 years ago

The issues with browser_RecipeRunner.js are easy to fix - there we can just await the calls to db.close() (which they're not right now), and then we stop trying to reuse a db we're in the middle of closing.

However, fundamentally all the methods on Database (now InternalDatabase) share their access to _idb. And all of the ones that want that object call await this.open(). That's great. The problem is that a bunch of the operations they then proceed to do are asynchronous, and so if there are multiple consumers using the DB via remotesettings, one of them can end up calling this.db.close() while one of the others (using the same db!) is mid-operation. This does not go well.

Mathieu, how do you want this part of remote settings to work? Should we somehow keep track of pending operations and make Database.close a no-op as long as at least one operation is still ongoing?

Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(mathieu)

Andrew Sutherland [:asuth] (he/him)

Comment 9

•

5 years ago

As a general FYI, the IndexedDB backend will automatically close the underlying database connections if they're idle for a while and reopen it on demand. It would probably be fine to keep the database around once the consumer triggers the open, and then hold onto it until shutdown as long as there's an expectation that there will be future database accesses.

Also, re: database closing, note that IDB's close() doesn't actually take effect "until all outstanding transactions have completed" so as long as things are happening within a transaction, things should generally keep working.

Assignee

Comment 10

•

5 years ago

Ah, I think I've figured it out. I think this is a data race in my patch. This happens:

call A calls close on the DB (which async-awaits the cached _idb ref)
call B asks for a DB (which also async-awaits the cached _idb ref)

(microtask checkpoint runs, and A+B get their db)

call A closes the DB
call B tries to start a transaction on the closed DB

If calling close clears the this._idb property sync, instead of only doing so after await'ing it, this problem should go away, because then in step 2, B gets a fresh ref.

If the microtask queue goes the "other way", so B runs before A, this already can't happen:

call B asks for a DB (awaits this._idb)
call A calls close, clearing _idb and await'ing it

(microtask checkpoint runs, and A+B get their db)

call B starts a transaction on the DB.
call A calls close, which per Andrew in comment #9 will wait for the transaction to finish.

Assignee

Comment 11

•

5 years ago

So apparently comment #10, while sounding plausible, is still wrong - it's possible for the close to happen between when we call open() (getting a cached promise) and when the promise resolves and we actually try to use said DB...

Assignee

Comment 12

•

5 years ago

(In reply to :Gijs (he/him) from comment #11)

So apparently comment #10, while sounding plausible, is still wrong - it's possible for the close to happen between when we call open() (getting a cached promise) and when the promise resolves and we actually try to use said DB...

OK, this took me a surprising amount of time to work out, but it's because open is declared async. See also this (to me, surprising) minimal example: https://jsbin.com/ceyiquxeqe/edit?html,js,output

Declaring open as a sync function (that just returns a promise) makes this issue go away. The fact that we take an extra turn through the event loop / extra microtask queue to get the DB if open is declared async is... surprising, I think, and feels fragile.

I'm not really sure what to do about that though, short of re-architecting Database.jsm and remote settings to somehow never call close, or keep track of pending transactions and delaying close until all pending transactions have finished...

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #9)

As a general FYI, the IndexedDB backend will automatically close the underlying database connections if they're idle for a while and reopen it on demand. It would probably be fine to keep the database around once the consumer triggers the open, and then hold onto it until shutdown as long as there's an expectation that there will be future database accesses.

I would do this (ie just not call close) if it wasn't for the other issues that we've had in this space, where IDB seems to take 30 seconds, even during shutdown, to decide the connection is idle and to close it...

That said, not all the operations in RS currently call close when done, so this generally feels like it's already messy.

Andrew Sutherland [:asuth] (he/him)

Comment 13

•

5 years ago

(In reply to :Gijs (he/him) from comment #12)

I would do this (ie just not call close) if it wasn't for the other issues that we've had in this space, where IDB seems to take 30 seconds, even during shutdown, to decide the connection is idle and to close it...

To clarify, I meant calling close (and abort) at your component's shutdown, not never calling close.

Florian Quèze [:florian]

Updated

•

5 years ago

Whiteboard: [fxperf] → [fxperf:p1]

Assignee

Comment 15

•

5 years ago

If I can't manage to get this into 76 before we branch I'll probably be looking to uplift.

status-firefox74: --- → wontfix

status-firefox75: --- → wontfix

Assignee

Updated

•

5 years ago

Flags: needinfo?(mathieu)

Assignee

Comment 16

•

5 years ago

Andrew, Mathieu: just to check, this should now be back in your queues...

Flags: needinfo?(mathieu)

Flags: needinfo?(bugmail)

Assignee

Comment 18

•

5 years ago

I'm now running into this assertion failure:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a99771ff5245ab7e6acd02e4b8320a3bda05270a&selectedJob=295911786

GECKO(1226) | Assertion failure: data.mRecursionDepth > mBaseRecursionDepth, at /builds/worker/checkouts/gecko/xpcom/base/CycleCollectedJSContext.cpp:546

and not 100% sure what to do about that...

Flags: needinfo?(mathieu)

Assignee

Updated

•

5 years ago

See Also: → 1626935

Assignee

Updated

•

5 years ago

Flags: needinfo?(bugmail)

Comment 19

•

5 years ago

Pushed by gijskruitbosch@gmail.com: https://hg.mozilla.org/integration/autoland/rev/7312dfafef58 abort pending read transactions from the DB when quitting, r=asuth,leplatrem

Assignee

Comment 21

•

5 years ago

Attached file Bug 1624983 - follow-up: add missing Array.from call — Details

Comment 22

•

5 years ago

Pushed by malexandru@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/825f035018c2 follow-up: add missing Array.from call a=fix

Stefan Hindli [:stefan_hindli]

Comment 23

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/7312dfafef58
https://hg.mozilla.org/mozilla-central/rev/825f035018c2

Status: ASSIGNED → RESOLVED

Closed: 5 years ago

status-firefox76: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → Firefox 76

Noemi Erli[:noemi_erli]

Comment 24

•

5 years ago

This is not fixed as it continued appearing after the last follow-up:
https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=success%2Ctestfailed%2Cbusted%2Cexception&searchStr=xpc&revision=09fc2941dd6844a6404834ec3bd76ed4f2e44102&selectedJob=296019366
https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=success%2Ctestfailed%2Cbusted%2Cexception&searchStr=xpc&tochange=7f069649d6cf0d7d8cba14cfd987e835509b22fa&fromchange=825f035018c2a7c4526ad68909d96ddd585b7a71

Gijs, could you please take a look at this?

Flags: needinfo?(gijskruitbosch+bugs)

Assignee

Updated

•

5 years ago

Regressions: 1627179

Assignee

Comment 25

•

5 years ago

(In reply to Noemi Erli[:noemi_erli] from comment #24)

This is not fixed as it continued appearing after the last follow-up:
https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=success%2Ctestfailed%2Cbusted%2Cexception&searchStr=xpc&revision=09fc2941dd6844a6404834ec3bd76ed4f2e44102&selectedJob=296019366
https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=success%2Ctestfailed%2Cbusted%2Cexception&searchStr=xpc&tochange=7f069649d6cf0d7d8cba14cfd987e835509b22fa&fromchange=825f035018c2a7c4526ad68909d96ddd585b7a71

Gijs, could you please take a look at this?

Thanks for the heads-up, I filed bug 1627179 for these failures.

Flags: needinfo?(gijskruitbosch+bugs)

Assignee

Updated

•

5 years ago

Blocks: 1621759

Assignee

Updated

•

5 years ago

See Also: 1621759 →

Comment hidden (Intermittent Failures Robot)

Comment hidden (Intermittent Failures Robot)

Assignee

Updated

•

5 years ago

Blocks: 1629005

You need to log in before you can comment on or make changes to this bug.