content-prefs.sqlite is corrupted(and site specific zoom level is lost)when downgrade from Nightly35.0a1 to Aurora34.0a2 and earlier

RESOLVED FIXED in Firefox 34

Status

()

defect
--
major
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: alice0775, Assigned: tomasz)

Tracking

({dataloss, regression})

35 Branch
mozilla36
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox32 wontfix, firefox33 wontfix, firefox34+ fixed, firefox35 fixed, firefox36 fixed, firefox-esr31 wontfix)

Details

Attachments

(2 attachments, 4 obsolete attachments)

[Tracking Requested - why for this release]:

Steps To Reproduce:
1. Start Nightly35.0a1 and quit
2. Start Firefox esr31.0 with same user profile and quit

Actual Results:
content-prefs.sqlite is corrupted.
content-prefs.sqlite.corrupt is produced.
And site specific zoom level is lost

Expected Results:
No  corrupt.
site specific zoom level should be kept between versions
Summary: content-prefs.sqlite is corrupted(and site specific zoom level is lost)when downgrade from Nightlt35.0a1 to Aurora34.0a2 and earlier → content-prefs.sqlite is corrupted(and site specific zoom level is lost)when downgrade from Nightly35.0a1 to Aurora34.0a2 and earlier
Regression window(fx)
Good:
https://hg.mozilla.org/integration/fx-team/rev/121504802da7
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0 ID:20140923033544
Bad:
https://hg.mozilla.org/integration/fx-team/rev/f3912d0ff184
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0 ID:20140923064943
Pushlog:
https://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=121504802da7&tochange=f3912d0ff184

Regressed by:
9910f35506c6	Tomasz Kołodziejski — Bug 1058435 - Add removeAllDomainsSince to ContentPrefService2. r=adw
Blocks: 1058435
So the console is saying:

error migrating DB: no migrator function from version 4 to version 3; backing up and recreating

Which is exactly what is happening. It looks like the right way to proceed is to remove this check and if we see that there's a new version of db we shouldn't try to migrate back because it's impossible (firefox, say, 31 don't have the knowledge as how to migrate back from db schema introduced in newer firefox 34).
Assignee: nobody → tkolodziejski
Status: NEW → ASSIGNED
the usual migration schema we follow is to create incremental migrator functions that change the db in a way that it always stays compatible at least with the previous version.
The migrator functions should also be built so that they can run on an already migrated db.
On downgrade we just downgrade the schema version, so when the user will update again, the migrator function runs again (and it can fill up holes created during the downgrade time).
Now, I don't recall if content-prefs follow this schema, but it would be sane if it would.
Posted patch content-prefs-corrupted.patch (obsolete) — Splinter Review
It looks like I might have read that here: http://hg.mozilla.org/mozilla-central/file/14665b1de5ee/toolkit/components/places/Database.cpp#l627. To make my migrator function idempotent I have to add try-catch around the "alter table add column". Otherwise, if there already is a column like that, it throws.

In my case migration 1 to 2 is completely breaking because it is rename. But I guess we don't care anymore about such an old migration.

I added a proposed patch. It's missing tests because they are always complicated to write so it'll take me some time.

In the meantime, what do we do with the fact that we can't push this patch into the previous firefox? (and so fix the bug itself but we'll just fix it for the future)
Flags: needinfo?(mak77)
(In reply to Tomasz Kołodziejski [:tomasz] from comment #4)
> Created attachment 8498276 [details] [diff] [review]
> content-prefs-corrupted.patch
> 
> It looks like I might have read that here:
> http://hg.mozilla.org/mozilla-central/file/14665b1de5ee/toolkit/components/
> places/Database.cpp#l627. To make my migrator function idempotent I have to
> add try-catch around the "alter table add column". Otherwise, if there
> already is a column like that, it throws.

you might first check if the column exists (it's enough to prepare a synchronous  SELECT column FROM table stmt, without even executing it)

> In my case migration 1 to 2 is completely breaking because it is rename. But
> I guess we don't care anymore about such an old migration.

Right

> In the meantime, what do we do with the fact that we can't push this patch
> into the previous firefox? (and so fix the bug itself but we'll just fix it
> for the future)

If the bug only happened in Nightly, we just don't do anything. Nightly is supposed to be unstable and users using it are aware dataloss is behind the corner.

But the patch that caused this should not make Aurora, or a fix for the issue should be pushed before we merge to Aurora (that is in less than 2 weeks from now)

So, it would be saner to push your fix and write the test later if it's going to take more time than a week.
Flags: needinfo?(mak77)
I'll make the tests ready soon.

So the bug was there like forever (the migrator was broken -- when it saw version bigger than current it assumed the db is corrupt) but we didn't do any migrations for this code in a long time. When I added migrations it just got noticed.

So to do it right we'd have to fix the migrations first and then wait a couple of firefox versions (so that nobody wants to migrate back) and merge my migrations 1058435. Can we do it better?
Flags: needinfo?(mak77)
(In reply to Tomasz Kołodziejski [:tomasz] from comment #6)
> So to do it right we'd have to fix the migrations first and then wait a
> couple of firefox versions (so that nobody wants to migrate back) and merge
> my migrations 1058435. Can we do it better?

we can fix the migrator in Nightly 35 and uplift the fix to Aurora 34, we don't care that much about people downgrading by 2 versions. What matters is people having to downgrade by one versio,n cause the new update may cause them so bad issues that they can't use the browser anymore and we have to make a dot release (worst case scenario).

The alternative is to backout bug 1058435, fix the migrator in nightly and aurora, and after the merge reland bug 1058435, that is basically the 2 versions you suggested.

The decision depends on the priority of the fixes in bug 1058435, that I can't decide, for that you should ask to a project manager or Gavin (the module owner).

off-hand, since one version should be enough, I'd go for fixing the migrator in Nightly 35 and uplift the fix to Aurora 34.
Flags: needinfo?(mak77)
(In reply to Marco Bonardo [::mak] (needinfo? me) from comment #7)
> off-hand, since one version should be enough, I'd go for fixing the migrator
> in Nightly 35 and uplift the fix to Aurora 34.

Let's do this.
Posted patch content-prefs-corrupted.patch (obsolete) — Splinter Review
I did another investigation of this bug and here is the result:

1. Migrator already got fixed by bug 1058435. This is because we're iterating there migrations oldVersion -> oldVersion+1 -> ... -> newVersion. So in case we migrate back nothing bad happens.
2. I added a comment and some basic test in this patch.

So I guess I should extract the migrator fix to this bug, right?
Attachment #8498276 - Attachment is obsolete: true
Attachment #8500602 - Flags: review?(MattN+bmo)
Posted patch migrator-aurora.patch (obsolete) — Splinter Review
Since patch in bug 1058435 already solves the issue I extracted the migrator changes so that we can uplift them to aurora.

Matt please have a look and please help with the uplift.
Attachment #8500602 - Attachment is obsolete: true
Attachment #8500602 - Flags: review?(MattN+bmo)
Attachment #8500695 - Flags: review?(MattN+bmo)
Comment on attachment 8500695 [details] [diff] [review]
migrator-aurora.patch

rebalancing review load to mark/marco
Attachment #8500695 - Flags: review?(mhammond)
Attachment #8500695 - Flags: review?(mak77)
Attachment #8500695 - Flags: review?(MattN+bmo)
Updating the flags to reflect my understanding of the situation:

* 35 - not affected, even though the regressing change was made in this release.
* 34 - affected and will be fixed by this patch.
* earlier - all affected, but we are living with it.

[Tracking Requested - why for this release]:
See comments above - this fix is for 34 to prevent a change in 35 from causing loss of these content preference should the user then fall back to version 34.  If the user uses 35 then falls back to lower than 34, the dataloss will remain.
Comment on attachment 8500695 [details] [diff] [review]
migrator-aurora.patch

Review of attachment 8500695 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM.
Attachment #8500695 - Flags: review?(mhammond) → review+
Comment on attachment 8500695 [details] [diff] [review]
migrator-aurora.patch

Review of attachment 8500695 [details] [diff] [review]:
-----------------------------------------------------------------

my comments also apply to the version we have in mozilla-central... I think it's worth to update both.

Note that this looks wrong in mozilla-central:

_dbMigrate3To4: function ContentPrefService__dbMigrate3To4(aDBConnection) {
   aDBConnection.executeSimpleSQL("ALTER TABLE prefs ADD COLUMN timestamp INTEGER NOT NULL DEFAULT 0");

if we reapply this migrator, the column will already exist and this will throw, forcing us to replace the db, again.
we MUST first check if column timestamp exists, and only if it doesn't, try to add it.

It may be worth adding a warning comment above migrators explaining the queries must be able to apply to already migrated databases to support the downgrade/upgrade case.

::: toolkit/components/contentprefs/nsContentPrefService.js
@@ +1178,5 @@
> +      */
> +      if (aOldVersion == 0) {
> +        this._dbCreateSchema(aDBConnection);
> +      } else {
> +        for (let i = aOldVersion; i < aNewVersion; i++) {

so, the fact this will be skipped in the downgrade case is a little too much implicit, it's pretty clear but someone touching this code in future could overlook that.

please add a comment above the loop, clarifying that in case of downgrade we don't run any migration, but we still set the schemaVersion to the old one, so that on the next upgrade migration functions will be reapplied.

@@ +1205,1 @@
>      aDBConnection.executeSimpleSQL("DROP TABLE groupers");

"this should be DROP TABLE IF EXISTS groupers" cause we stated migrators must be able to reapply
In practice is unlikely we hit this code.
Attachment #8500695 - Flags: review?(mak77) → review+
Posted patch Beta 34 patchSplinter Review
I addressed the issues and here's the updated patch for aurora. The issues with fx-team version are addressed in an attachment that I made obsolete by accident. I'll upload it once again later.

For people (possibly myself) from the future: You may also want to read http://hg.mozilla.org/mozilla-central/file/e4cfacb76830/toolkit/components/places/Database.cpp to see other migrations.
Attachment #8500695 - Attachment is obsolete: true
Attachment #8503401 - Flags: review+
This fix (and the Nightly fix) need to land sooner than later. Now this must already be uplifted to Beta.
Flags: needinfo?(tkolodziejski)
I don't have the uplifting powers. I don't even have the commit access and I've never asked for an uplift before . I thought that the updated patch is ready and because of comment 12 someone will take care of doing it.

It looks that it was not the case. Since :adw, who's my mentor, is not here, can you guide me as what I should do now?
Flags: needinfo?(tkolodziejski) → needinfo?(mak77)
In comment 15 you should have set checkin-needed.
Keywords: checkin-needed
Comment on attachment 8503401 [details] [diff] [review]
Beta 34 patch

(In reply to Matthew N. [:MattN] (behind on reviews - focused on Loop) from comment #18)
> In comment 15 you should have set checkin-needed.

Sorry, I thought there was a portion that was supposed to land on m-c too but I guess not anymore.

Approval Request Comment
[Feature/regressing bug #]: bug 1058435
[User impact if declined]: Users will lose content preferences if they downgrade Firefox
[Describe test coverage new/current, TBPL]: No test. Manual test by Tomasz only. This isn't landing on m-c/m-a since bug 1058435 is already there.
[Risks and why]: Low risk of failed migrations due to a bug. This is mostly zoom settings, spell check language, and last download/upload dir..
[String/UUID change made/needed]: None
Attachment #8503401 - Flags: approval-mozilla-beta?
Flags: needinfo?(mak77)
Tomasz, which patch should be unobsoleted? I guess something will land on m-c after all per comment 17.
Keywords: checkin-needed
OS: Windows 7 → All
Hardware: x86_64 → All
Attachment #8503401 - Attachment description: migrator-aurora.patch → Beta 34 patch
Attachment #8503401 - Attachment filename: migrator-aurora.patch → migrator-beta.patch
(In reply to Matthew N. [:MattN] (behind on reviews - focused on Loop) from comment #19)
> Sorry, I thought there was a portion that was supposed to land on m-c too
> but I guess not anymore.

There should indeed be an additional patch for Nightly and Aurora. though It's not a good idea to wait for that before uplifting this part.
I'd even suggest to move the nightly/aurora part to a separate bug.
Yeah, I talked to Tomasz and he is working on a patch for 35+.
Posted patch m-c 36 & Aurora 35.patch (obsolete) — Splinter Review
I'm sorry for the confusion. I just wasn't fully aware of the uplifting process procedures. Thanks :MattN for helping!

So this patch addresses the issues raised in comment 14, thanks :mak. It should land in the same version as the one that introduced new migration (bug 1058435), that is (right now) Aurora 35.

Please have a look at it :MattN. Once this is ready we need to check it in mozilla central and then uplift to Aurora 35.

I also tested it manually like this:
1. create new profile
2. change zoom on your mozilla.com
2. go to profile folder
3. sqlite3 content-prefs.sqlite
4. run: PRAGMA user_version = 1000/3
5. open firefox again
6. expect zoom to be there and PRAGMA user_version to be back to 4.

This shouldn't be that difficult to write tests for this but when I tried I had a problem with the: 1. create db with current version 2. set the user_version to some strange number 3. force ContentPrefService migration to run again. We may want to make it (write tests for this downgrade/upgrade) a follow-up bug.
Attachment #8505305 - Flags: review?(MattN+bmo)
Comment on attachment 8505305 [details] [diff] [review]
m-c 36 & Aurora 35.patch

Review of attachment 8505305 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8505305 - Flags: review?(MattN+bmo) → review+
Attachment #8505305 - Attachment description: Aurora 35.patch → m-c 36 & Aurora 35.patch
Tomasz, can you provide a Try push?
Keywords: checkin-needed
Comment on attachment 8505305 [details] [diff] [review]
m-c 36 & Aurora 35.patch

Review of attachment 8505305 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/contentprefs/nsContentPrefService.js
@@ +1221,5 @@
>  
>    _dbMigrate3To4: function ContentPrefService__dbMigrate3To4(aDBConnection) {
> +    // Add timestamp column if it does not exist yet. This operation is idempotent.
> +    try {
> +      aDBConnection.createStatement("SELECT timestamp FROM prefs");

you should .finalize() the created statement, so
let stmt = aDBConnection.createStatement("SELECT timestamp FROM prefs");
stmt.finalize();
thank you, now it looks great!
(In reply to Tomasz Kołodziejski [:tomasz] from comment #27)
> Created attachment 8505577 [details] [diff] [review]
> m-c 36 & Aurora 35.patch
> 
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=3e858e9bdc8d
> (https://tbpl.mozilla.org/?tree=Try&rev=3e858e9bdc8d)

- * Migrations should follow a template rules in bug 1074817 comment 3 which are:
+ * Migrations should follow the template rules in bug 1074817 comment 3 which are:
Comment on attachment 8505577 [details] [diff] [review]
m-c 36 & Aurora 35.patch

https://hg.mozilla.org/integration/fx-team/rev/42a182f0431f

Tomasz, can you request approval‑mozilla‑aurora please?
Attachment #8505577 - Flags: checkin+
Flags: needinfo?(tkolodziejski)
https://hg.mozilla.org/mozilla-central/rev/42a182f0431f
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
Comment on attachment 8505577 [details] [diff] [review]
m-c 36 & Aurora 35.patch

Approval Request Comment
[Feature/regressing bug #]: Bug 1058435. This code introduced new migrations that were not following the migrations pattern (so that downgrade and re-upgrade works).
[User impact if declined]: Users will lose content preferences if they downgrade Firefox and then upgrade again.
[Describe test coverage new/current, TBPL]: No test for reupgrade. Manual testing done by me described in comment 23.
[Risks and why]: Low risk of failed migrations due to a bug. This is mostly zoom settings, spell check language, and last download/upload dir.
[String/UUID change made/needed]: None.
Flags: needinfo?(tkolodziejski)
Attachment #8505577 - Flags: approval-mozilla-aurora?
Comment on attachment 8503401 [details] [diff] [review]
Beta 34 patch

Given the current state of Firefox 33 and that some users are choosing to downgrade, I think this is quite important. Beta+
Attachment #8503401 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment on attachment 8505577 [details] [diff] [review]
m-c 36 & Aurora 35.patch

Aurora+
Attachment #8505577 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Adding checkin-needed for uplifts to Aurora and Beta.
(In reply to Matthew N. [:MattN] (behind on reviews - focused on Loop) from comment #35)
> Adding checkin-needed for uplifts to Aurora and Beta.

Not really necessary. We have bug queries for them.
Keywords: checkin-needed
You need to log in before you can comment on or make changes to this bug.