Bug 1687685 Comment 19 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

Original comment by Andrew Sutherland [:asuth] (he/him) on 2021-01-27 12:08:51 PST

(In reply to Jan Varga [:janv] from comment #17)
> We could implement an intermediate step regarding QM storage upgrades and improving the success rate even more. If the storage initialization fails in one of the upgrades for storage version < 2.2, then entire storage/ directory and storage.sqlite would be deleted.

The v2.3 schema was introduced in Firefox 70 in bug 1563023.  I think it'd be appropriate to wipe storage in any all-or-nothing version upgrade failure given that v70 pre-dates v72 for non-ESR and we would expect ESR68 to have already upgraded to ESR78.  It would then be a subsequent enhancement to remove support for older upgrade versions and have them move to using the same storage{.sqlite} clearing mechanism.

In the end state, we'd end up with the startup check deciding on versions sorta like:
- **Too Old**: The storage is just too old, clear and start from scratch.
- **Best Effort: All or Nothing**:  The storage is old but we'll still try and update the data, but if we can't, we'll clear and start from scratch.
- **Current Migration: Clear Broken Origins**: For all new upgrades from here on out on Beta and Release, we'll clear origins that fail to upgrade.
- **Nightly Active Development: Retain Broken Origins**: For recently landed upgrades on nightly perhaps we initially mark the upgrade as one that will not clear the origin, instead letting us ask for help investigating from people.  This could perhaps trigger the tab to open up about:quotamanager at most once (ex: guarded by a pref that gets set to the new version upgrade that broke).  The idea here is that a newly landed upgrade shouldn't clear everyone's data, instead giving the opportunity to back things out and/or address the problems.  This then advances to the next situation:
- **Nightly Stabilized: Clear Broken Origins**: After we think we've successfully landed the upgrade on nightly or we're down to a long tail of errors, we land a minor change to switch to clearing the broken origins, but we still have the detailed _TRY telemetry logging so the problems are known.

I understand Jan's proposal above to be that we keep everything in "Best Effort" for now and basically nothing in "Too Old", but I think my general argument is that it's appropriate for binary size, general complexity, and general sanity reasons reasons to start moving things into "Too Old" once we think the total clearing logic is sound.

Revision 1 by Andrew Sutherland [:asuth] (he/him) on 2021-01-27 12:10:46 PST

(In reply to Jan Varga [:janv] from comment #17)
> We could implement an intermediate step regarding QM storage upgrades and improving the success rate even more. If the storage initialization fails in one of the upgrades for storage version < 2.2, then entire storage/ directory and storage.sqlite would be deleted.

The v2.3 schema was introduced in Firefox 70 in bug 1563023.  I think it'd be appropriate to wipe storage in any all-or-nothing version upgrade failure given that v70 pre-dates v72 for non-ESR and we would expect ESR68 to have already upgraded to ESR78.  It would then be a subsequent enhancement to remove support for older upgrade versions and have them move to using the same storage{.sqlite} clearing mechanism.

In the end state, we'd end up with the startup check deciding on versions sorta like:
- **Too Old**: The storage is just too old, clear and start from scratch.
- **Best Effort: All or Nothing**:  The storage is old but we'll still try and update the data, but if we can't, we'll clear and start from scratch.
- **Current Migration: Clear Broken Origins**: For all new upgrades from here on out on Beta and Release, we'll clear origins that fail to upgrade.
- **Nightly Active Development: Retain Broken Origins**: For recently landed upgrades on nightly perhaps we initially mark the upgrade as one that will not clear the origin, instead letting us ask for help investigating from people.  This could perhaps trigger the tab to open up about:quotamanager at most once (ex: guarded by a pref that gets set to the new version upgrade that broke).  The idea here is that a newly landed upgrade shouldn't clear everyone's data, instead giving the opportunity to back things out and/or address the problems.  This then advances to the next situation:
- **Nightly Stabilized: Clear Broken Origins**: After we think we've successfully landed the upgrade on nightly or we're down to a long tail of errors, we land a minor change to switch to clearing the broken origins, but we still have the detailed _TRY telemetry logging so the problems are known.

I understand Jan's proposal above to be that we keep everything in "Best Effort" for now and basically nothing in "Too Old", but I think my general argument is that it's appropriate for binary size, general complexity, and general sanity reasons to start moving things into "Too Old" once we think the total clearing logic is sound.
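
As a rough illustration of the tiers listed above, the startup decision could be sketched as a small lookup from the stored storage version and the release channel to a failure policy. This is only a sketch, not QuotaManager code: the names `Channel`, `FailurePolicy`, and `policy_for_upgrade`, and the version thresholds, are all hypothetical placeholders chosen for the example, not values taken from the tree.

```python
from enum import Enum, auto


class Channel(Enum):
    NIGHTLY = auto()
    BETA = auto()
    RELEASE = auto()


class FailurePolicy(Enum):
    TOO_OLD = auto()                # clear everything without attempting the upgrade
    ALL_OR_NOTHING = auto()         # attempt the upgrade; clear everything if it fails
    CLEAR_BROKEN_ORIGINS = auto()   # clear only the origins that fail to upgrade
    RETAIN_BROKEN_ORIGINS = auto()  # keep failing origins around for investigation


# Hypothetical thresholds, expressed as (major, minor) storage versions.
OLDEST_UPGRADABLE = (2, 0)   # anything older is treated as "Too Old"
FIRST_PER_ORIGIN = (2, 3)    # upgrades from here on can fail per origin
# Upgrades (keyed by the version they start from) that only recently landed.
UNSTABILIZED_ON_NIGHTLY = frozenset({(2, 3)})


def policy_for_upgrade(from_version, channel, unstabilized=UNSTABILIZED_ON_NIGHTLY):
    """Pick the failure policy for the upgrade step starting at from_version."""
    if from_version < OLDEST_UPGRADABLE:
        return FailurePolicy.TOO_OLD
    if from_version < FIRST_PER_ORIGIN:
        return FailurePolicy.ALL_OR_NOTHING
    if channel is Channel.NIGHTLY and from_version in unstabilized:
        # Newly landed upgrade on nightly: do not clear anyone's data yet.
        return FailurePolicy.RETAIN_BROKEN_ORIGINS
    # Beta/Release, or a nightly upgrade that has since stabilized.
    return FailurePolicy.CLEAR_BROKEN_ORIGINS
```

In this sketch, moving an upgrade from "Nightly Active Development" to "Nightly Stabilized" amounts to dropping its version from the `unstabilized` set, and moving versions into "Too Old" amounts to raising `OLDEST_UPGRADABLE`, after which the corresponding upgrade code could be removed.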
