Open Bug 1806358 Opened 2 years ago Updated 2 years ago

Failed to import bookmarks from chrome://browser/content/default-bookmarks.html: Error: places.sqlite#1: cannot execute operation Bookmarks.jsm: insertBookmarkTree, the connection is already closing"

Categories

(Toolkit :: Places, defect, P3)

defect

Tracking

()

People

(Reporter: whimboo, Unassigned, NeedInfo)

References

(Blocks 1 open bug)

Details

Several tests in CI that are using Marionette as underlying harness showing the following failure during shutdown which is causing task timeouts due to a lot of restarts of Firefox (bug 1589796):

https://treeherder.mozilla.org/logviewer?job_id=400022129&repo=try&lineNumber=58642-58660

task 2022-12-18T14:32:32.823Z] 14:32:32     INFO - 1671373952821	Marionette	DEBUG	Marionette stopped listening
[task 2022-12-18T14:32:32.860Z] 14:32:32     INFO - console.error: (new TypeError("NetworkError: Network request failed", "resource://services-settings/Utils.jsm", 230))
[task 2022-12-18T14:32:40.260Z] 14:32:40     INFO - console.error: "Failed to import bookmarks from chrome://browser/content/default-bookmarks.html: Error: places.sqlite#1: cannot execute operation Bookmarks.jsm: insertBookmarkTree, the connection is already closing"
[task 2022-12-18T14:32:42.135Z] 14:32:42     INFO - JavaScript error: resource://gre/modules/AsyncShutdown.jsm, line 718: Error: Phase "IOUtils: waiting for profileBeforeChange IO to complete" is finished, it is too late to register completion condition "JSON store: writing data for 'shieldpreferenceexperiments'"
[task 2022-12-18T14:32:42.875Z] 14:32:42     INFO - WARNING: At least one completion condition is taking too long to complete. Conditions: [{"name":"Places Connection shutdown","state":{"PlacesShutdownProgress":5},"filename":"/builds/worker/checkouts/gecko/toolkit/components/places/Database.cpp","lineNumber":470,"stack":""},{"name":"Remote Settings profile-before-change","state":"Remaining: 5 callbacks (importJSONDump-main-anti-tracking-url-decoration-1, importJSONDump-main-hijack-blocklists-2, importJSONDump-main-doh-providers-3, importJSONDump-main-search-telemetry-v2-4, importJSONDump-main-search-config-5).","filename":"resource://services-settings/RemoteSettingsWorker.jsm","lineNumber":203,"stack":["resource://services-settings/RemoteSettingsWorker.jsm:null:203","resource://services-settings/RemoteSettingsClient.jsm:_importJSONDump:938","resource://services-settings/RemoteSettingsClient.jsm:get/this._importingPromise<:443","resource://services-settings/RemoteSettingsClient.jsm:get:458"]}] Barrier: profile-before-change
[task 2022-12-18T14:32:57.496Z] 14:32:57     INFO - console.log: "RemoteSettingsWorker error: Error: Can't import when we've started shutting down."
[task 2022-12-18T14:32:57.497Z] 14:32:57     INFO - console.error: (new RemoteSettingsWorkerError("Error: Can't import when we've started shutting down.", "resource://services-settings/RemoteSettingsWorker.jsm", 48))
[task 2022-12-18T14:32:58.247Z] 14:32:58     INFO - JavaScript error: resource://activity-stream/lib/PlacesFeed.jsm, line 187: NS_ERROR_UNEXPECTED: Component returned failure code: 0x8000ffff (NS_ERROR_UNEXPECTED) [nsINavBookmarksService.addObserver]
[task 2022-12-18T14:32:58.248Z] 14:32:58     INFO - JavaScript error: resource://gre/modules/ExtensionSettingsStore.jsm, line 122: Error: The ExtensionSettingsStore was accessed before the initialize promise resolved.
[task 2022-12-18T14:32:59.933Z] 14:32:59     INFO - console.log: "RemoteSettingsWorker error: Error: Can't import when we've started shutting down."
[task 2022-12-18T14:32:59.934Z] 14:32:59     INFO - console.error: (new RemoteSettingsWorkerError("Error: Can't import when we've started shutting down.", "resource://services-settings/RemoteSettingsWorker.jsm", 48))
[task 2022-12-18T14:32:59.936Z] 14:32:59     INFO - console.log: "RemoteSettingsWorker error: Error: Can't import when we've started shutting down."
[task 2022-12-18T14:32:59.937Z] 14:32:59     INFO - console.log: "RemoteSettingsWorker error: Error: Can't import when we've started shutting down."
[task 2022-12-18T14:32:59.938Z] 14:32:59     INFO - console.error: (new RemoteSettingsWorkerError("Error: Can't import when we've started shutting down.", "resource://services-settings/RemoteSettingsWorker.jsm", 48))
[task 2022-12-18T14:32:59.939Z] 14:32:59     INFO - console.error: (new RemoteSettingsWorkerError("Error: Can't import when we've started shutting down.", "resource://services-settings/RemoteSettingsWorker.jsm", 48))
[task 2022-12-18T14:32:59.939Z] 14:32:59     INFO - console.log: "RemoteSettingsWorker error: Error: Can't import when we've started shutting down."
[task 2022-12-18T14:32:59.940Z] 14:32:59     INFO - console.error: (new Error("IndexedDB: main/doh-config getLastModified() IndexedDB:  execute() The application is shutting down", "resource://services-settings/IDBHelpers.jsm", 18))
[task 2022-12-18T14:32:59.945Z] 14:32:59     INFO - console.error: (new RemoteSettingsWorkerError("Error: Can't import when we've started shutting down.", "resource://services-settings/RemoteSettingsWorker.jsm", 48))

This is also adding a good amount of delay for each and every shutdown. But not sure if this is related to RemoteSettings. Marco could you please have a look?

Flags: needinfo?(mak)

Something interesting here is that these tests are running Firefox only for a second or two, and then shutting it down. So it is likely that Firefox/Gecko hasn't been given a chance to fully start, and the side effect is that these tests are additionally stress testing the "startup & shutdown straight away" case.

Most of the errors/warnings are repeated for the rest of the shutdowns, but these two appear to be additions compared to the other shutdowns:

INFO - JavaScript error: resource://gre/modules/AsyncShutdown.jsm, line 718: Error: Phase "IOUtils: waiting for profileBeforeChange IO to complete" is finished, it is too late to register completion condition "JSON store: writing data for 'shieldpreferenceexperiments'"
INFO - WARNING: At least one completion condition is taking too long to complete. Conditions: [{"name":"Places Connection shutdown","state":{"PlacesShutdownProgress":5},"filename":"/builds/worker/checkouts/gecko/toolkit/components/places/Database.cpp","lineNumber":470,"stack":""},{"name":"Remote Settings profile-before-change","state":"Remaining: 5 callbacks (importJSONDump-main-anti-tracking-url-decoration-1, importJSONDump-main-hijack-blocklists-2, importJSONDump-main-doh-providers-3, importJSONDump-main-search-telemetry-v2-4, importJSONDump-main-search-config-5).","filename":"resource://services-settings/RemoteSettingsWorker.jsm","lineNumber":203,"stack":["resource://services-settings/RemoteSettingsWorker.jsm:null:203","resource://services-settings/RemoteSettingsClient.jsm:_importJSONDump:938","resource://services-settings/RemoteSettingsClient.jsm:get/this._importingPromise<:443","resource://services-settings/RemoteSettingsClient.jsm:get:458"]}] Barrier: profile-before-change

Marionette awaits the browser-idle-startup-tasks-finished observer notification. So if parts of Firefox haven't been started yet I assume that observer notification is getting send out too early?

Ok, it looks like it isn't guaranteed that places has finished starting/initialising before that point gets hit, hence the statement about the bookmark migration failing.

These warnings on shutdown are sort of expected in these cases, we have many cases in which services/components try to initialize too late when shutdown already started, and we're fixing them as soon as possible. The warnings may be a symptom of that, or may be expected (failing as expected and the failure is handled).

What may be causing very long shutdowns is AsyncShutdown when we don't properly handle it. For example we may fail to add a blocker (because it's too late) but we don't handle the addBlocker exception... if that blocker contained code that was unblocking another blocker, then we wait 1 minute and then crash. Or a blocker may be waiting on a promise that never resolves (because maybe it invoked a method with a callback that doesn't callback in case of error).
The timeout happens when you see "At least one completion condition is taking too long to complete". One thing you could do is collecting a list of the ones that happen more often. One issue seems to be with RemoteSettings, the other one with Places. I have a wip patch for Sqlite in bug 1805529 that maybe will help the Places case, but I can't be sure.
Unfortunately each consumer must be identified and fixed on its own, there's no silver bullet.

One workaround you could apply to Marionette is reducing the async shutdown timeout, by setting toolkit.asyncshutdown.crash_timeout to a value smaller than 60000 (the minute above), 10s may be enough.
Just ensure these crashes are not reported to crash-stats.

Flags: needinfo?(mak)

And yes, I suggest filing a bug for the RemoteSettings asyncshutdown timeouts, it's likely RemoteSettings is starting up during shutdown, while it could probably avoid doing any work (nsIAppStartup.isInOrBeyondShutdownPhase() can be used to avoid doing late init work)

(In reply to Marco Bonardo [:mak] (OoO until Jan 2 - Slack ping for urgency) from comment #4)

One workaround you could apply to Marionette is reducing the async shutdown timeout, by setting toolkit.asyncshutdown.crash_timeout to a value smaller than 60000 (the minute above), 10s may be enough. Just ensure these crashes are not reported to crash-stats.

That is something that we cannot do. Several code in Gecko relies on the default value of toolkit.asyncshutdown.crash_timeout and Marionette actually takes this value into account when waiting for the application to shutdown. Only if the shutdown doesn't happen in that time + 10s the Marionette harness will raise an error. It means that we want Firefox to shutdown itself and check for long shutdown times.

(In reply to Marco Bonardo [:mak] (OoO until Jan 2 - Slack ping for urgency) from comment #5)

And yes, I suggest filing a bug for the RemoteSettings asyncshutdown timeouts, it's likely RemoteSettings is starting up during shutdown, while it could probably avoid doing any work (nsIAppStartup.isInOrBeyondShutdownPhase() can be used to avoid doing late init work)

Filed bug 1808168 for the Remote Settings client.

Depends on: 1805529

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #6)

That is something that we cannot do. Several code in Gecko relies on the default value of toolkit.asyncshutdown.crash_timeout and Marionette actually takes this value into account when waiting for the application to shutdown. Only if the shutdown doesn't happen in that time + 10s the Marionette harness will raise an error. It means that we want Firefox to shutdown itself and check for long shutdown times.

Which code in Gecko relies on toolkit.asyncshutdown.crash_timeout? Any code is expected to shutdown ASAP and take less time possible to do so, the crash timeout is a catch-all for reporting to us misbehaving code through crash-stats.
I see the will to test real-world cases, but some of the shutdowns here don't seem normal use-cases, especially if Firefox is terminated even before any ui is usable.
That said, I'm surely happy if we can fix some of the components doing late startup.

Severity: -- → S4
Priority: -- → P3
Blocks: 1809845

(In reply to Marco Bonardo [:mak] from comment #7)

Which code in Gecko relies on toolkit.asyncshutdown.crash_timeout? Any code is expected to shutdown ASAP and take less time possible to do so, the crash timeout is a catch-all for reporting to us misbehaving code through crash-stats.

Correct. There is IPC code for content process shutdown which doesn't directly depend on the pref's value but sets similar values. See bug 1783217 comment 52 for the still to be answered question. If there are issues with that please mention it over there.

I see the will to test real-world cases, but some of the shutdowns here don't seem normal use-cases, especially if Firefox is terminated even before any ui is usable.
That said, I'm surely happy if we can fix some of the components doing late startup.

Well, as spoken with various developers allowing Marionette to start running the tests should be fine when we receive the browser-idle-startup-tasks-finished observer. Or what's wrong with that?

Flags: needinfo?(mak)
You need to log in before you can comment on or make changes to this bug.