Closed Bug 1355414 Opened 7 years ago Closed 7 years ago

places.sqlite migration fails if an application has never used the bookmarks service

Categories

(Toolkit :: Places, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
thunderbird_esr45 --- unaffected
thunderbird_esr52 ? affected
firefox52 --- unaffected
firefox-esr52 --- fixed
firefox53 --- unaffected
firefox54 --- fixed
firefox55 --- fixed

People

(Reporter: mak, Assigned: mak)

References

Details

(Whiteboard: [fxsearch])

Attachments

(1 file)

[Tracking Requested - why for this release]:

[Tracking Requested - why for this release]:

[Tracking Requested - why for this release]:

[Tracking Requested - why for this release]:

Thunderbird hit a schema migration problem due to the fact it doesn't use the bookmarks service.
In particulare migrateV35Up() tries to create a mobile root, and in doing so it tries to read previous roots to position the new one.
That's not going to work in TB because there are no roots at all, nothing ever created them.

We need to also add a schema migration test for this case.
Assignee: nobody → mak77
Status: NEW → ASSIGNED
Actually the fact we cannot remove a corrupt database file is a consequence of various facts:
1. some of the migrators run async statements because they are expensive operations
2. but we invoke Close() because the database is corrupt and we need to delete it synchronously and hand back a working one to the consumer, so we cannot wait for asyncClose
3. but Close throws because we didn't invoke asyncClose
4. the connection doesn't get closed, the file is locked and cannot be removed, everyone is sad

To fix that, the only way would be to detect close failing, try asyncClose with event loop spinning. It's not great, we could even decide to do that in mozStorage for everyone, so MOZ_ASSERT if someone calls close instead of asyncClose, and fallback to asyncClose with spinning. It's not great but the opposite (leaving the connection open) is not great as well. We could also still throw, but after closing the connection and then Places could ignore the error and assume close worked.

Andrew, what do you think, could we asyncClose() and spin in mozStorage close() method if asyncClose was not invoked?
Flags: needinfo?(bugmail)
(In reply to Marco Bonardo [::mak] from comment #1)
> Andrew, what do you think, could we asyncClose() and spin in mozStorage
> close() method if asyncClose was not invoked?

Building on your proposal, I suggest:

- Add a new method called "spinningSynchronousClose()" that does what you propose and making it very clear what's going on to anyone copying or reading the code.  This does not assert.  Especially in a WebExtensions-only world where front-end/back-end development are hopefully now more integrated, I think it's reasonable to trust callers to either know what they're doing or trust that we should be able to fix the problem in-tree if there's mis-use.

- In event of close() where an asyncClose() ideally would have happened, do a MOZ_DIAGNOSTIC_ASSERT() that will explode in both debug and release builds for nightly (and previously in aurora) but not beta/release, calling spinningSynchronousClose() if we survive the explosion.
Flags: needinfo?(bugmail)
Uh, and we may want to do the XPConnect stack dump trick proposed in the other bug too, or maybe delay being so aggressive about the MOZ_DIAGNOSTIC_ASSERT until such time as we're in WebExtesions-only land.  Realistically, if there are misbehaving extensions out there right now, they're not super likely to get massively overhauled right before they're retired.
(In reply to Andrew Sutherland [:asuth] from comment #2)
> - Add a new method called "spinningSynchronousClose()" that does what you
> propose and making it very clear what's going on to anyone copying or
> reading the code.

You mean a public API, right?

> - In event of close() where an asyncClose() ideally would have happened, do
> a MOZ_DIAGNOSTIC_ASSERT() that will explode in both debug and release builds
> for nightly (and previously in aurora) but not beta/release, calling
> spinningSynchronousClose() if we survive the explosion.

SGTM, I will file a separate storage bug for this.
Depends on: 1355561
(In reply to Marco Bonardo [::mak] from comment #4)
> You mean a public API, right?

Yes.
Let's start taking this, I'll work on bug 1355561 separately since it's different issues, even if they happen in related situations.
This should be uplifted to Thunderbird ESR52. It's less likely to affect Firefox since every profile touches bookmarks on startup.
note that bug 1355561 is instead more critical for Firefox, since some of our users may be stuck in a corrupt database loop. This would also explain some of the reports I've seen in the last years, where we were expecting a corrupt db to be replaced, but it didn't happen.
ehr, on the other side, thanks to the forced crash, this one will also fix a db corruption loop.
With bug 1355561 we could potentially avoid the forced crash.
Comment on attachment 8857342 [details]
Bug 1355414 - places.sqlite schema migration fails if an application has never used the bookmarks service.

https://reviewboard.mozilla.org/r/129328/#review132046

::: toolkit/components/places/Database.cpp:644
(Diff revision 1)
> +Database::ForceCrashAndReplaceDatabase()
> +{
> +  Preferences::SetBool(PREF_FORCE_DATABASE_REPLACEMENT, true);
> +  // We could force an application restart here, but we'd like to get these
> +  // cases reported to us, so let's force a crash instead.
> +  MOZ_CRASH("places.sqlite is corrupt and locked, app can't work properly");

Since there are three potential causes for crash (the three call sites), wouldn't it be useful to pass a reason parameter to ForceCrashAndReplaceDatabase so it can get appended to the crash signature? That would inform what further mitigations we might consider.

::: toolkit/components/places/Database.cpp:1926
(Diff revision 1)
>  Database::MigrateV35Up() {
>    MOZ_ASSERT(NS_IsMainThread());
>  
>    int64_t mobileRootId = CreateMobileRoot();
> -  if (mobileRootId <= 0) return NS_ERROR_FAILURE;
> +  if (mobileRootId <= 0)  {
> +    // Either the schema is broken or there isn't any root. This can happen if

"This can happen" -> "The latter can happen"

It's a bit ambiguous what 'this' refers to ("broken schema" or "no root").
Attachment #8857342 - Flags: review?(past) → review+
Pushed by mak77@bonardo.net:
https://hg.mozilla.org/integration/autoland/rev/a121d68c6eec
places.sqlite schema migration fails if an application has never used the bookmarks service. r=past
https://hg.mozilla.org/mozilla-central/rev/a121d68c6eec
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Comment on attachment 8857342 [details]
Bug 1355414 - places.sqlite schema migration fails if an application has never used the bookmarks service.

Approval Request Comment
[Feature/Bug causing the regression]: new Sync mobile root in Firefox 51
[User impact if declined]: In case of a places.sqlite corruption the user is stuck with the corrupt database, breaking a good part of the browser features. For other apps like thunderbird this fixes an almost certain corruption.
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: no
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: no
[Why is the change risky/not risky?]: this touches an old migration we already went through, and code that is only touched in case of a corrupt database.
[String changes made/needed]: none
Attachment #8857342 - Flags: approval-mozilla-esr52?
Attachment #8857342 - Flags: approval-mozilla-aurora?
do we need any other uplifts for TB? Will you take care of those?
Flags: needinfo?(jorgk)
If it gets uplifted to M-ESR52, then I don't have to do anything. If that is declined, I'll put it onto the TB release branch.

Thanks for addressing it quickly!
Flags: needinfo?(jorgk)
Depends on: 1356812
Comment on attachment 8857342 [details]
Bug 1355414 - places.sqlite schema migration fails if an application has never used the bookmarks service.

Fix a database corrupt issue related to bookmark sync. Aurora54+.
Attachment #8857342 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Hm, I actually wonder if it would be better to add a call to savePrefFile() before crashing, I fear the pref file may not be saved otherwise... it sounds like it would be safer.
I'll probably land that as a followup changeset here, it would be a safe change also for branches.
Woudl that be fine with you or do you prefer a separate bug?
Flags: needinfo?(gchang)
Hi ::mak,
Can you file a separate bug to track?
Flags: needinfo?(gchang) → needinfo?(mak77)
Depends on: 1357366
filed bug 1357366.
Flags: needinfo?(mak77)
landing this on ESR52, will also require landing the follow-up in bug 1357366.
that follow-up bug should also land in 54, since this one already landed there.
Comment on attachment 8857342 [details]
Bug 1355414 - places.sqlite schema migration fails if an application has never used the bookmarks service.

Corrupt SQl.lite database is a severe problem, also Thunderbird team needs this, ESR52.2+
Attachment #8857342 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: