Closed
Bug 1410270
Opened 7 years ago
Closed 6 years ago
Run Places maintenance for certain Sync bookmark errors
Categories
(Firefox :: Sync, enhancement, P2)
Firefox
Sync
Tracking
()
RESOLVED
FIXED
Firefox 61
Tracking | Status | |
---|---|---|
firefox61 | --- | fixed |
People
(Reporter: lina, Assigned: lina)
References
Details
Attachments
(1 file)
Thom suggested that we can have Sync immediately run Places maintenance if it sees records with missing or invalid GUIDs, or tombstones for items that aren't deleted.
Comment 1•7 years ago
|
||
If immediately sounds problematic, maybe just "soon"?
Assignee | ||
Comment 2•7 years ago
|
||
(In reply to Thom Chiovoloni [:tcsc] from comment #1) > If immediately sounds problematic, maybe just "soon"? Makes sense. I was thinking we could even let the current sync fail, and run `PlacesDBUtils.maintenanceOnIdle` in the `catch` handler if we got a non-server error. If maintenance was able to fix the problem, the next bookmark sync should succeed.
Updated•7 years ago
|
Priority: -- → P3
Updated•6 years ago
|
Priority: P3 → P2
Assignee | ||
Comment 3•6 years ago
|
||
WDYT is the best way to record telemetry for this, Mark? I'm thinking two fields in the ping, or two separate Boolean histograms: 1. Did we run maintenance this sync? (This would be our denominator, the number of maintenance attempts). 2. If we ran maintenance last sync, did the current sync succeed? I'm guessing we'll want a pref or some kind of way to keep track of (2)? At first, I was thinking we could just include (1) in the sync ping, and determine if maintenance succeeded when we build our dashboards...but IIUC finding the "next success" after a failure where we ran maintenance would require trawling through every sync ping, since we're looking for successes, and ~99% of our syncs succeed. So it seems like we'd be doing a lot of work to reconstruct info that the client already knows...but maybe not?
Flags: needinfo?(markh)
Comment 4•6 years ago
|
||
(In reply to Kit Cambridge (they/them) [:kitcambridge] from comment #3) > finding the "next success" after a failure where we ran maintenance would > require trawling through every sync ping, since we're looking for successes, > and ~99% of our syncs succeed. So it seems like we'd be doing a lot of work > to reconstruct info that the client already knows... While I've had to do something similar in the past (find nearest row to a row with some property), it definitely makes the query messier and increases run time. So if having (2) is plausible without much work, it gets my vote.
Comment 5•6 years ago
|
||
(In reply to Kit Cambridge (they/them) [:kitcambridge] from comment #3) > WDYT is the best way to record telemetry for this, Mark? I'm thinking two > fields in the ping, or two separate Boolean histograms: > > 1. Did we run maintenance this sync? (This would be our denominator, the > number of maintenance attempts). > 2. If we ran maintenance last sync, did the current sync succeed? As Leif says, let's go with 2 just to help with analysis. > I'm guessing we'll want a pref or some kind of way to keep track of (2)? I don't think it would hurt to miss some of these, so could we just keep a flag in memory and accept the fact that if the last sync in the session was the repair we just lose the ability to report how it went?
Flags: needinfo?(markh)
Comment hidden (mozreview-request) |
Assignee | ||
Comment 7•6 years ago
|
||
mozreview-review |
Comment on attachment 8967110 [details] Bug 1410270 - Run Places maintenance for certain Sync bookmark errors. https://reviewboard.mozilla.org/r/235760/#review241556 ::: services/sync/modules/engines.js:841 (Diff revision 1) > throw new Error("engine does not implement _sync method"); > } > > - return this._notify("sync", this.name, this._sync)(); > + return this._notify("sync", this.name, async () => { > + await this._sync(); > + return this; This is really unfortunate, do you think there's a better way to pass success info to the `weave:engine:sync:finish` notification, or to pass along side-channel info like this in general? I tried using `engineImpl` at first, but that complicates tests in the cases we unregister the built-in engine. We could also return a POJO like `{ succeededAfterMaintenance }`, but that seems icky, since it's just copying a property that's already on the engine, and doesn't immediately map to the fact that we return it so that it can also be passed to the observer... ::: services/sync/modules/engines/bookmarks.js:420 (Diff revision 1) > + { elapsedSinceMaintenance }); > + await PlacesDBUtils.maintenanceOnIdle(); > + this._ranMaintenanceOnLastSync = true; > + throw new RecoverableError( > + "Bookmark sync failed; ran maintenance to recover", > + { originalError: ex, ranMaintenance: true }); I'm not really happy with this "re-wrap the original exception but also flag that we ran maintenance so we can tell telemetry" pattern, either. :-/
Assignee | ||
Updated•6 years ago
|
Assignee: nobody → kit
Status: NEW → ASSIGNED
Comment 8•6 years ago
|
||
mozreview-review |
Comment on attachment 8967110 [details] Bug 1410270 - Run Places maintenance for certain Sync bookmark errors. https://reviewboard.mozilla.org/r/235760/#review241620 I don't like the changes to the schema (around error), and I kind of feel like the right place to report this in telemetry is via event telemetry. I'd accept a non-event telemetry solution that doesn't add properties to the error types though, if you have a reason events wouldn't work for it. ::: services/sync/modules/engines.js:841 (Diff revision 1) > throw new Error("engine does not implement _sync method"); > } > > - return this._notify("sync", this.name, this._sync)(); > + return this._notify("sync", this.name, async () => { > + await this._sync(); > + return this; Yes, I agree that this is a bit crazy... Is there a reason we wouldn't want to use event telemetry here? (We might need more than one event type which isn't ideal, but might be better than what we have here). ::: services/sync/modules/engines/bookmarks.js:39 (Diff revision 1) > PlacesSyncUtils.bookmarks.DESCRIPTION_ANNO, > PlacesSyncUtils.bookmarks.SIDEBAR_ANNO, PlacesUtils.LMANNO_FEEDURI, > PlacesUtils.LMANNO_SITEURI, > ]); > > +const PLACES_MAINTENANCE_INTERVAL = 1 * 60 * 60 * 1000; // 1 hour. This seems a bit frequent to me... How heavyweight is places maintenance? ::: services/sync/modules/engines/bookmarks.js:416 (Diff revision 1) > + "places.database.lastMaintenance", 0) * 1000; > + if (elapsedSinceMaintenance >= PLACES_MAINTENANCE_INTERVAL) { > + this._log.error("Bookmark sync failed, ${elapsedSinceMaintenance}s " + > + "elapsed since last run; running Places maintenance", > + { elapsedSinceMaintenance }); > + await PlacesDBUtils.maintenanceOnIdle(); Is there any risk to actually doing this in the middle of a sync instead of after one? ... I don't exactly know what I mean by 'risk', admittedly, so the answer is probably "no", but it seems possible that it could cause some confusion... ::: services/sync/tests/unit/sync_ping_schema.json:192 (Diff revision 1) > "otherError": { > "required": ["name"], > "properties": { > "name": { "enum": ["othererror"] }, > - "error": { "type": "string" } > + "error": { "type": "string" }, > + "ranMaintenance": { "type": "boolean" } I'd *really* rather not add additional properties to errors, since the way the scala handles it is somewhat problematic, it just assumes there are two properties and whatever one isnt "name" holds the information about what error occured. Essentially, Android and iOS both messed up on the names (since we have so many, e.g. `code`, `error`, etc...), which forced the scala to do it this way in order to be compatible without additional craziness. It does mean that adding additional properties isn't practical, though, at least not without some thought.
Attachment #8967110 -
Flags: review?(tchiovoloni)
Assignee | ||
Comment 9•6 years ago
|
||
mozreview-review-reply |
Comment on attachment 8967110 [details] Bug 1410270 - Run Places maintenance for certain Sync bookmark errors. https://reviewboard.mozilla.org/r/235760/#review241620 > This seems a bit frequent to me... How heavyweight is places maintenance? `maintenanceOnIdle` is fairly lightweight, it runs at least once a day already...though the intent here is to avoid repeatedly running maintenance when it's unlikely to help, so bumping to 4 hours or so is reasonable. > Is there any risk to actually doing this in the middle of a sync instead of after one? > > ... I don't exactly know what I mean by 'risk', admittedly, so the answer is probably "no", but it seems possible that it could cause some confusion... I don't think so...do you mean in one of the storage methods? > I'd *really* rather not add additional properties to errors, since the way the scala handles it is somewhat problematic, it just assumes there are two properties and whatever one isnt "name" holds the information about what error occured. > > Essentially, Android and iOS both messed up on the names (since we have so many, e.g. `code`, `error`, etc...), which forced the scala to do it this way in order to be compatible without additional craziness. It does mean that adding additional properties isn't practical, though, at least not without some thought. Ouch. Thanks for the clarification, this is very good to know! Changed to use event telemetry, and two events is still much simpler than rigging the Sync ping to support this.
Comment hidden (mozreview-request) |
Comment 11•6 years ago
|
||
mozreview-review |
Comment on attachment 8967110 [details] Bug 1410270 - Run Places maintenance for certain Sync bookmark errors. https://reviewboard.mozilla.org/r/235760/#review241662 This looks great, thanks.
Attachment #8967110 -
Flags: review?(tchiovoloni) → review+
Comment hidden (mozreview-request) |
Comment 13•6 years ago
|
||
Pushed by kcambridge@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/57093565dccc Run Places maintenance for certain Sync bookmark errors. r=tcsc
Comment 14•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/57093565dccc
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
status-firefox61:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 61
You need to log in
before you can comment on or make changes to this bug.
Description
•