Closed Bug 1343103 Opened 7 years ago Closed 3 years ago

Store and upload tombstones for deleted bookmarks

Categories

(Firefox :: Sync, defect, P3)

defect

Tracking

()

RESOLVED WONTFIX

People

(Reporter: lina, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [Sync Q4 OKRs])

Attachments

(4 files)

In bug 1332290 and bug 1317223, Mark suggested storing tombstones for some time past a successful sync. This is fairly easy to do for bookmarks; we can use the existing `moz_bookmarks_deleted` table. We could also store synced tombstones from other devices in this table, not just our own. We'd need to agree on an eviction policy, though.

In case of repair, this would allow us to authoritatively answer, "this record was deleted," instead of "I don't have this record."
The challenges I see here are:

* as you mention, some kind of "eviction" policy - maybe the regular maintenance process could remove deleted items over (say) 12 months old?
* handling bookmark backup and restore - given we auto-restore from backup on corruption it might make sense to include these tombstones in this process?

It's difficult to get a sense of what volume we are talking about here - I guess most users would have a reasonably small number of tombstones, but it's hard to guess what outliers might do - how much churn do our heavy bookmark users create?

Another vague idea Kit and I discussed was that sync could itself try and track deleted GUIDs for all collections, maintained outside of places/passwordmgr/etc, although that idea sounds worse overall - it still has the same eviction/backup/restore problems with even less chance of maintaining the invariants we need correctly, but I thought I'd get the idea on record.

Richard/Mak, any thoughts?
Potential sharp edges:

- I delete a bookmark, that deletion syncs, then I manually restore it from a GUID-preserving backup. What's the channel for transmitting undeletions? If it's "the record reappeared", how do we distinguish that from an old client (Bug 1332290)?

The same kind of problem would occur if some other GUID-aware tool -- Xmarks, cloudSync, …? -- gets involved.

If we only use this tombstone list for responding to repair requests, then…

- If different devices have divergent tombstone lists -- which they might if one restores a bookmark from a backup -- then the behavior of Sync becomes even more order-dependent. That becomes an issue if a repair request ever includes GUIDs that are already present on the server. One client might respond to such a request by authoritatively saying "bookmark XYZ was deleted!", but another client might have restored it from a backup, and know for sure that the record is alive -- it might even be on the server!

Again, this is somewhat indistinguishable from Bug 1332290.

- The longer a tombstone list sticks around, the higher the chance of a GUID collision between a new bookmark and an old deleted bookmark. The odds are slim, but our population is large.

- What's the defined behavior if a user switches accounts, and there are GUIDs shared between accounts? Is a deletion scoped to a particular FxA? Gulp.


This is a really hard problem.
Thanks, Richard, this is excellent feedback. I agree it's a hard problem, and worth mulling over before we commit to a strategy. This is really bug 1332290 applied to bookmarks instead of logins.

(In reply to Richard Newman [:rnewman] from comment #2)
> Potential sharp edges:
> 
> - I delete a bookmark, that deletion syncs, then I manually restore it from
> a GUID-preserving backup. What's the channel for transmitting undeletions?

I think we'd want to include tombstones in backups, as Mark suggests in comment 1. A deletion restored from a backup would upload a tombstone; an undeletion would write the full record (and its parent) again.

> - The longer a tombstone list sticks around, the higher the chance of a GUID
> collision between a new bookmark and an old deleted bookmark. The odds are
> slim, but our population is large.

Currently, inserting a bookmark will remove its GUID from the tombstones table. This is how we handle reinsertions; I think collisions would be the same way.

> - What's the defined behavior if a user switches accounts, and there are
> GUIDs shared between accounts? Is a deletion scoped to a particular FxA?
> Gulp.

I think it would be the same as for existing bookmarks. When you sign out, the sync status and change counter are reset, but the entry remains. That does mean we'll upload (likely unnecessary) tombstones, but the alternative leaves us where we are today.
(In reply to Kit Cambridge [:kitcambridge] from comment #3)

> I think we'd want to include tombstones in backups, as Mark suggests in
> comment 1. A deletion restored from a backup would upload a tombstone; an
> undeletion would write the full record (and its parent) again.

To rephrase: you mean the channel for transmitting undeletions to other clients (who have their own tombstones for the same GUIDs) is the record reappearing consistently on the server?


> Currently, inserting a bookmark will remove its GUID from the tombstones
> table. This is how we handle reinsertions; I think collisions would be the
> same way.

But again, consider two devices.

If one creates a new record with GUID "AAA", and then connects to an account to which another device is syncing that has a local tombstone for "AAA", what should happen?

The grave-watching client only knows that the GUID is on a tombstone -- it doesn't know anything else about what was deleted. How does it distinguish between AAA as (a) a new record that happens to collide with the GUID, (b) a deliberately restored-from-backup copy of the original AAA, (c) an accidental resurrection of AAA by a client that never saw the deletion?

I think we're assuming that local tombstones and server tombstones will always be in sync, but I'm not sure that's a correct assumption.


> I think it would be the same as for existing bookmarks. When you sign out,
> the sync status and change counter are reset, but the entry remains. That
> does mean we'll upload (likely unnecessary) tombstones, but the alternative
> leaves us where we are today.

Does this imply that every client must locally store, forever, the tombstones it sees from the server on each account it connects to?

Does that stand a chance of accidentally deleting across accounts?

E.g.,

D1, A1: delete AAA.
D1: disconnect from A1.
D1: connect to A2.
D1 has a tombstoned local deletion of AAA, which exists in A2. Conflict!
Priority: -- → P3
See Also: → 1388884
Priority: P3 → P1
Whiteboard: [Sync Q4 OKRs]
Kit, please talk about this one with Thom upon his return.
Flags: needinfo?(kit)
Thom and I talked about this last week, and worked out a plan.
Assignee: nobody → tchiovoloni
Status: NEW → ASSIGNED
Flags: needinfo?(kit)
Comment on attachment 8936195 [details]
Bug 1343103 - (1/3) WIP DONOTLAND Implement tombstone support in places

This patch is still somewhat rough, and has issues.  It also needs more tests, as well as some discussion on the 'right' thing to do here. I'll try and write up more about it while on the plane tomorrow.

Not to mention, it probably should be at least two patches (if not more). The validation/repair stuff (which was somewhat critical for me being able to manually test and debug it).

(There's also probably some debugging or commented out code that got left behind -- I haven't done a pass to clean that out yet)
Attachment #8936195 - Flags: feedback?(markh)
Attachment #8936195 - Flags: feedback?(kit)
Comment on attachment 8936195 [details]
Bug 1343103 - (1/3) WIP DONOTLAND Implement tombstone support in places

https://reviewboard.mozilla.org/r/206964/#review214058

This was a fairly quick look and I skimmed much of the tests and places changes, but this is looking good - nice work! I think there's a number of details we need to address, particularly around evicting tombstones and whether eraseEverything should keep them or not.

::: services/sync/modules/bookmark_validator.js:788
(Diff revision 1)
> -        }
> +      }
> -      }
> +    }
>  
> -      let sameType = client.type === server.type;
> +    let sameType = client.type === server.type;
> -      if (!sameType) {
> +    if (!sameType) {
> -        if (server.type === "query" && client.type === "bookmark" && client.bmkUri.startsWith(QUERY_PROTOCOL)) {
> +      if (server.type === "query" && client.type === "bookmark" &

I think you still want '&&' here?

::: services/sync/modules/engines.js:1400
(Diff revision 1)
>  
> +  /**
> +   * Only called for engines which persist tombstones. Set whatever state is
> +   * needed to remove the remote zombie 'remoteItem' when uploading
> +   */
> +  async _deleteRemoteZombie(remoteItem) {

I was a bit confused by what this did before I saw it called. I think it's probably worth expanding this a little (eg, something like "... needed to remove a remote item that should be a tombstone (ie, a zombie)" or similar.

And seeing this doesn't actually do the delete, maybe `_markAsRemoteZombie` or similar is better?

::: services/sync/modules/engines.js:1408
(Diff revision 1)
> +
> +  /**
> +   * For engines which persist tombstones (_storesTombstones), should return
> +   * true if we have a local tombstone for the given guid.
> +   */
> +  async _isStoredTombstone(guid) {

This might be better named as `_isLocalTombtone`?

::: services/sync/modules/engines.js:1427
(Diff revision 1)
>        this._delete.ids.push(id);
>    },
>  
> +  // If our remote records carry a better timestamp than "modified",
> +  // we should return that here.
> +  _getRemoteTimestamp(item) {

this doesn't seem used? Although I'm guessing bookmarks was going to use dateRemoved for a tombstone?

::: services/sync/tests/unit/head_http_server.js:263
(Diff revision 1)
>    insertWBO: function insertWBO(wbo) {
>      this.timestamp = Math.max(this.timestamp, wbo.modified);
>      return this._wbos[wbo.id] = wbo;
>    },
>  
> +  cleartextPayload(id) {

while these look like sane helpers, if we add them, we should probably change the existing tests to use them too (ie, this should probably be in either a "part 1" or a different bug IMO)

::: services/sync/tests/unit/test_bookmark_tombstones.js:20
(Diff revision 1)
> +  initTestLogging("Trace");
> +  generateNewKeys(Service.collectionKeys);
> +
> +  Log.repository.getLogger("Sync.Engine.Bookmarks").level = Log.Level.Trace;
> +  Log.repository.getLogger("Sync.Engine.Clients").level = Log.Level.Trace;
> +  Log.repository.getLogger("Sqlite").level = Log.Level.Info; // less noisy

this shouldn't be necessary as initTestLogging already does it.

::: services/sync/tests/unit/test_bookmark_tracker.js:1153
(Diff revision 1)
>        feedURI: CommonUtils.makeURI("http://localhost:0"),
>      });
>      livemark.terminate();
>  
>      await startTracking();
> +    // Mark changes as synci

incomplete comment

::: toolkit/components/places/Bookmarks.jsm:862
(Diff revision 1)
>                `UPDATE moz_bookmarks SET lastModified = :time,
>                                          syncChangeCounter = syncChangeCounter + :syncChangeDelta
>                 WHERE id IN (SELECT id FROM moz_bookmarks WHERE guid = :folderGuid )
>                `, { folderGuid, time, syncChangeDelta });
>            }
> +          if (options.source == Bookmarks.SOURCES.SYNC) {

it's not clear to me that bookmark restores should actually keep old tombstones around (but also not clear that it shouldn't), especially once we make a server wipe in this case more reliable.

Also, I think this patch should probably be split into the places and non-places parts, for easier review by Mak.

::: toolkit/components/places/nsPlacesTables.h:121
(Diff revision 1)
>  )
>  
> -// This table stores tombstones for bookmarks with SYNC_STATUS_NORMAL. We
> -// upload tombstones during a sync, and delete them from this table on success.
> -// If Sync is disconnected, we'll delete all stored tombstones. If Sync is
> -// never set up, we'll never write new tombstones, since all bookmarks will stay
> +// This table stores tombstones for bookmarks deleted on other devices, and
> +// local bookmarks with SYNC_STATUS_NORMAL. Each tombstone has its own sync
> +// status: SYNC_STATUS_NEW if it hasn't been synced yet, and SYNC_STATUS_NORMAL
> +// for remote and synced local tombstones. We periodically evict old synced

I probably just missed it, but where do we evict old ones?
Attachment #8936195 - Flags: feedback?(markh) → feedback+
Comment on attachment 8936195 [details]
Bug 1343103 - (1/3) WIP DONOTLAND Implement tombstone support in places

https://reviewboard.mozilla.org/r/206964/#review214022

Nice work, Thom! Summarizing my understanding, to make sure it's correct:

* Remote tombstones for items that don't exist at all locally, either as tombstones or non-tombstones, are stored with sync status "NORMAL".
* Remote tombstones for already-existing local tombstones are also stored with sync status "NORMAL", and handled by `insertTombstonesIfMissing`.
* Remote non-tombstones for items with local tombstones are "zombies", so we always upload the local tombstone, and weakly reupload the parent if it exists locally.
* We never resurrect items. If the item is deleted on one side, we always take the deletion; the timestamp doesn't matter.
* Remote non-tombstones for local non-tombstones are handled as before.
* Tombstones now have a sync status, which is "NEW" for locally deleted items, and "NORMAL" after a sync.
* Resetting the client (after a node reassignment, for example) keeps the tombstones, and resets their sync statuses to "NEW".
* Restoring from a local backup, where we erase everything, also keeps tombstones, with their existing sync statuses. (However, https://searchfox.org/mozilla-central/rev/51cd1093c5c2113e7b041e8cc5c9bf2186abda13/services/sync/modules/engines/bookmarks.js#979 calls `resetClient` first, so we'll reset the sync statuses to "NEW" in practice).
* Receiving a `wipeEngine` command from another client drops all local tombstones.

The only part I'm not sure about is handling restores, but everything else looks solid.

Splitting the Places parts into their own patch would be great, and I'm also thinking we should have the bookmarks engine override `_reconcile` completely, so that it's easier to understand the semantics.

::: services/sync/modules/bookmark_repair.js:188
(Diff revision 1)
> -    const fixableProblems = ["clientMissing", "serverMissing", "serverDeleted"];
> +    const fixableProblems = [
> +      "clientMissing",
> +      "serverMissing",
> +      "clientDeleted",
> +      "serverDeleted",
> +      "clientMissingTombstones",

I suspect this will be answered further down in the patch, but is there a reason to distinguish `clientMissingTombstones` and `serverMissingTombstones` from `clientMissing` and `serverMissing`? Is it that we want to ensure we're uploading a tombstone, and not a real record? What happens if one client has a tombstone for an item in `moz_bookmarks_deleted`, another client has that item in `moz_bookmarks`, and the server saw neither?

::: services/sync/modules/bookmark_repair.js:701
(Diff revision 1)
> -        if (syncable) {
> +        if (!tombstone) {
>            log.debug(`repair request to upload item '${id}' which exists locally; uploading`);
>            toUpload.add(id);
>          } else {
>            // explicitly requested and not syncable, so tombstone.
>            log.debug(`repair request to upload item '${id}' but it isn't under a syncable root; writing a tombstone`);

Nit: Let's change this log message and comment, since a tombstone doesn't mean it's not syncable.

::: services/sync/modules/bookmark_repair.js:705
(Diff revision 1)
>            // explicitly requested and not syncable, so tombstone.
>            log.debug(`repair request to upload item '${id}' but it isn't under a syncable root; writing a tombstone`);
>            toDelete.add(id);
>          }
> -      // The item wasn't explicitly requested - only upload if it is syncable
> +      // The item wasn't explicitly requested - only upload if it's not a tombstone
>        // and doesn't exist on the server.

Does this still make sense, or should we always upload tombstones now, even if they're not explicitly requested but we know they're in `moz_bookmarks_deleted`?

::: services/sync/modules/bookmark_validator.js:680
(Diff revision 1)
>  
>      return cycles;
>    }
>  
>    // Perform client-side sanity checking that doesn't involve server data
> -  _validateClient(problemData, clientRecords) {
> +  _validateClient(problemData, clientRecords, clientGraveyard) {

This is a delightful name. :-)

::: services/sync/modules/bookmark_validator.js:743
(Diff revision 1)
> +  }
> +
> +  _recordMissing(problems, id, clientRecord, serverRecord, deletions) {
> +    if (!clientRecord && serverRecord) {
> +      if (deletions.client.has(id)) {
> +        problems.clientDeleted.push(id);

Can we stash the `parentid` in `clientDeleted` here, too, and add both for weak upload in `tryServerOnlyRepairs`?

::: services/sync/modules/engines.js:1428
(Diff revision 1)
>    },
>  
> +  // If our remote records carry a better timestamp than "modified",
> +  // we should return that here.
> +  _getRemoteTimestamp(item) {
> +    return item.modified;

Thinking some more about this, I wonder if it makes sense to keep using the local time for `dateRemoved`, and removing this helper. IIRC, we talked about including the local modified time in the tombstone record payload...but, if we always prefer the tombstone (local or remote), I don't see why we'd need the timestamp?

::: services/sync/modules/engines.js:1484
(Diff revision 1)
>        // can't check for duplicates because the incoming record has no data
>        // which can be used for duplicate detection.
>        if (!existsLocally) {
> -        this._log.trace("Ignoring incoming item because it was deleted and " +
> -                        "the item does not exist locally.");
> -        return false;
> +        this._log.trace((this._storesTombstones ? "Storing" : "Ignoring") +
> +                        " incoming tombstone because the item does not exist locally.");
> +        return this._storesTombstones;

At a certain point, it almost seems like we should have the bookmarks engine override `_reconcile` entirely, copy-pasting parts of the function from this file, instead of continuing to add hooks to the base engine. Or maybe we can reimplement tombstone handling only, and call the base method for the case where both local and remote are non-tombstones.

My concern is there's a lot of subtle tweaks that engines can make to how reconciliation works by overriding methods, and it takes some context-switching between the base and bookmarks engines to piece this together. On the other hand, `_reconcile` is already complex as it is, and duplicating the code also doesn't seem ideal.

I guess I don't have an actionable request here, except maybe to ask how you'd feel about overriding `_reconcile` in the bookmarks engine, instead of adding `_storesTombstones`, `_deleteRemoteZombie`, and `_isStoredTombstone`?

::: services/sync/modules/engines.js:1514
(Diff revision 1)
>  
> +    // Check if we have a local tombstone, and delete the remote record if so.
> +    if (!existsLocally && this._storesTombstones) {
> +      if (await this._isStoredTombstone(item.id)) {
> +        this._log.trace("Found local tombstone, Killing remote", item);
> +        await this._deleteRemoteZombie(item);

Making sure I understand:

* A zombie is an item that exists on the server, but for which we've stored a tombstone locally.
* We always prefer the local tombstone if we have one, because the server modified time for the zombie could still be newer even if the record hasn't actually changed, just reuploaded by a client that didn't have the tombstone...and we don't want to resurrect in that case.

Is that accurate? If so, mind expanding the comment to explain?

::: toolkit/components/places/Bookmarks.jsm:863
(Diff revision 1)
>                                          syncChangeCounter = syncChangeCounter + :syncChangeDelta
>                 WHERE id IN (SELECT id FROM moz_bookmarks WHERE guid = :folderGuid )
>                `, { folderGuid, time, syncChangeDelta });
>            }
> +          if (options.source == Bookmarks.SOURCES.SYNC) {
> +            await db.executeCached(`DELETE FROM moz_bookmarks_deleted`);

I think this needs some more work, specifically for the case where a restore resurrects a deleted bookmark. IIUC, that won't clear out `moz_bookmarks_deleted`, so the GUID could exist in both tables. This might be a good place for an `AFTER INSERT` trigger on `moz_bookmarks` that deletes the GUID from `moz_bookmarks_deleted`.

However, I'm also not sure if this is the right thing to do. It's possible to restore a tree that doesn't match the one that we have on the server at all, in which case we don't really need to keep those tombstones around. Also, an automatic restore due to Places corruption won't restore tombstones, and there's a case for keeping the same behavior for manual restores. (Automatic restore already does the wrong thing today, though; it won't reset `lastSync` to do a full reconcile with the server. But one step at a time).

If we only do this to handle the case where we fail to wipe the server after a restore, I think it's worth keeping the current behavior of dropping tombstones on wipe, and make wiping the server more durable in a follow-up.

::: toolkit/components/places/PlacesSyncUtils.jsm:694
(Diff revision 1)
>            }
>            let guid = BookmarkSyncUtils.recordIdToGuid(recordId);
>            let bookmarkItem = await PlacesUtils.bookmarks.fetch(guid);
>            if (!bookmarkItem) {
> -            BookmarkSyncLog.trace(`remove: Item ${guid} already removed`);
> +            BookmarkSyncLog.trace(`remove: Item ${guid} doesn't exist or is already removed`);
> +            tombstones.push(recordId);

Nit: s/recordId/guid, although it doesn't really matter.

::: toolkit/components/places/PlacesSyncUtils.jsm:704
(Diff revision 1)
>              folderGuids.push(bookmarkItem.guid);
>              continue;
>            }
>            let wasRemoved = await deleteSyncedAtom(bookmarkItem);
>            if (wasRemoved) {
> +            tombstones.push(recordId);

Is this necessary, now that you've changed `insertTombstone{s}`?. We still need to store tombstones for nonexistent items above, but I'm wondering if recording tombstones here and after `deletedSyncedFolder` is redudant.

::: toolkit/components/places/PlacesSyncUtils.jsm:1844
(Diff revision 1)
>    // Removing the last tagged item will also remove the tag. To preserve
>    // tag IDs, we temporarily tag a dummy URI, ensuring the tags exist.
>    let dummyURI = PlacesUtils.toURI("about:weave#BStore_tagURI");
>    let bookmarkURI = PlacesUtils.toURI(item.url.href);
> -  if (newTags && newTags.length > 0)
> -    PlacesUtils.tagging.tagURI(dummyURI, newTags, SOURCE_SYNC);
> +  if (newTags && newTags.length > 0) {
> +    // Don't pass the real source since that causes it to be created with

Nice catch!

::: toolkit/components/places/nsNavBookmarks.cpp:1902
(Diff revision 1)
>      size_t rowsPerChunk = std::min(maxRowsPerChunk, aTombstones.Length() - startIndex);
>  
>      // Build a query to insert all tombstones in a single statement, chunking to
>      // avoid the SQLite bound parameter limit.
>      nsAutoCString tombstonesToInsert;
> -    tombstonesToInsert.AppendLiteral("VALUES (?, ?)");
> +    tombstonesToInsert.AppendLiteral("VALUES (?, ?, ");

`AppendPrintf` might be cleaner here.

::: toolkit/components/places/nsPlacesTables.h:122
(Diff revision 1)
>  
> -// This table stores tombstones for bookmarks with SYNC_STATUS_NORMAL. We
> -// upload tombstones during a sync, and delete them from this table on success.
> -// If Sync is disconnected, we'll delete all stored tombstones. If Sync is
> -// never set up, we'll never write new tombstones, since all bookmarks will stay
> -// in SYNC_STATUS_NEW.
> +// This table stores tombstones for bookmarks deleted on other devices, and
> +// local bookmarks with SYNC_STATUS_NORMAL. Each tombstone has its own sync
> +// status: SYNC_STATUS_NEW if it hasn't been synced yet, and SYNC_STATUS_NORMAL
> +// for remote and synced local tombstones. We periodically evict old synced
> +// tombstones. If Sync is disconnected, we'll reset the status of all tombstones

Let's see what Mak says about eviction. We may be able to get away with storing them permanently.
Attachment #8936195 - Flags: feedback?(kit) → feedback+
QA Contact: rbillings
This code has some issues and inconsistencies, largely because it was developed over a long time period with usually decently sized breaks between (generally when there was something else to work on, that's what I did, since these patches are very confusing to me), and multiple times the intended semantics changed completely. Additionally, I still don't fully understand the mirror, and so it's hard for me to state with confidence that this behaves appropriately with respect to it. I suspect ideally more of tombstone mirror work would be done using triggers, but this reaches approaches the edge of my understanding of how to use SQL.

That said, I think the plan has some serious conceptual issues unrelated to the concrete issues with the patches I'm uploading -- I don't think there *is* a way to implement this that's fully well behaved unless we want to actually attempt repair of the remote tree.

These are the currently desired semantics as I understand them:

1. At the end of a sync, keep local tombstones in places (don't drop them).
2. If we see an incoming record, and we have a local tombstone
    a. If sync status is normal or new, undelete or reupload it based on date modified.
    b. If sync status is unknown (which for a tombstone means that syncId changed, I believe), we need to reupload our local tombstone.
3. If we see an incoming tombstone and have a local record, we delete the local record iff it either has status unknown, or is older than the tombstone.
4. At the end of the sync, any tombstones with status != normal should be uploaded to the server, and set all tombstones to status normal.

Considering step 2 for a moment (I suspect step 3 and 4 have some of the same issues, but they might be more likely to be well-behaved):

If we upload a tombstone for a record we just downloaded, this may leave the server in an inconsistent state, since it's certainly referenced by at least one other bookmark. We also almost certainly didn't reupload that bookmark in the case of a change (since the buffer doesn't attempt repair)...

On the other hand, if we *don't* upload a tombstone, and just skip applying the record, it will be obvious to the user that the devices have diverged, and we'll likely revive the record later when the user inevitably moves or deletes it (since at this point it won't have UNKNOWN status locally, and its modified date will be new).

The two devices having obviously different bookmarks sounds like confusing behavior that leads people not to trust Sync. At least in the current state yeah, we'll undelete your bookmarks and you'll see old ones again, but sync will still appear synced across the devices, and the states will be roughly consistent (modulo other bugs).

So, AFAICT we're stuck with either
1. corrupting the server more,
2. leaving ourselves with an obviously inconsistent state WRT the server (which IMO is likely worse than the undeleted bookmarks).
3. Attempting some form of client-side repair and reupload, which I think if we wanted to do this patch, is our only option. I think we could do it without requiring a validator run or concensus across multiple devices, but I don't think it's worth it to fix this bug.

Anyway, to my understanding this can introduce corruption in a tree even if all the clients are well behaved -- The client that's uploading a partial state is us when we upload the local tombstone.

So I think this plan is at least not good. I could be misunderstanding all the cases when we get status UNKNOWN and it's possible things will actually all work out with the syncId changes and such, but under my current understanding I don't see how.

Some concrete notes on the patches I'm about to upload themselves:

Part 1 and 2 are a bit tricky to break up, but probably more a lot of part 1 should be in part 2. Ideally part 1 would just be the places changes, and part 2 would be sync changes. This is easier said than done since things are rather broken if we don't delete tombstones at the end of a sync.

Part 3 is incomplete, and needs more tests, but its part of the way there to implementing what I described, although that seems problematic to me.
Thanks Thom,

(In reply to Thom Chiovoloni [:tcsc] from comment #12)

> These are the currently desired semantics as I understand them:
> 
> 1. At the end of a sync, keep local tombstones in places (don't drop them).
> 2. If we see an incoming record, and we have a local tombstone

IIUC, we already need to deal with that and test_locally_deleted_remotely_modified exercises it. At the moment we know that the local tombstone must have been created between the previous sync and this sync, but I'm not clear why tombstones fundamentally changes this equation.

>     a. If sync status is normal or new, undelete or reupload it based on
> date modified.
>     b. If sync status is unknown (which for a tombstone means that syncId
> changed, I believe), we need to reupload our local tombstone.

That sounds correct to me (although IIUC, we current unconditionally undelete the local item using the structure implied by the server)

> Considering step 2 for a moment (I suspect step 3 and 4 have some of the
> same issues, but they might be more likely to be well-behaved):
> 
> If we upload a tombstone for a record we just downloaded, this may leave the
> server in an inconsistent state, since it's certainly referenced by at least
> one other bookmark. We also almost certainly didn't reupload that bookmark
> in the case of a change (since the buffer doesn't attempt repair)...

Yeah, I agree that sounds tricky.

> On the other hand, if we *don't* upload a tombstone, and just skip applying
> the record, it will be obvious to the user that the devices have diverged,
> and we'll likely revive the record later when the user inevitably moves or
> deletes it (since at this point it won't have UNKNOWN status locally, and
> its modified date will be new).

Right - so IIUC that's why we unconditionally revive the local bookmark.

> So, AFAICT we're stuck with either
> 1. corrupting the server more,
> 2. leaving ourselves with an obviously inconsistent state WRT the server
> (which IMO is likely worse than the undeleted bookmarks).
> 3. Attempting some form of client-side repair and reupload, which I think if
> we wanted to do this patch, is our only option. I think we could do it
> without requiring a validator run or concensus across multiple devices, but
> I don't think it's worth it to fix this bug.

Isn't the other option:

4. Make the local state match the server, effectively undeleting the bookmark (although it's location may not match where it was deleted from), meaning the local state and server state now agree?
(Sorry about that, I pushed my patches changes to the wrong bug ;_;)
Comment on attachment 8969788 [details]
Bug 1343103 - (3/3) WIP DONOTLAND Implement tombstone syncing for mirror engine

https://reviewboard.mozilla.org/r/238622/#review244348

This looks mostly good! I think consolidating `deleteLocally` and `deleteRemotely`, and using a temp table to store the complete set of deletions and writing it back to `moz_bookmarks_deleted`, will help untangle some of the confusion around the statuses.

::: toolkit/components/places/SyncedBookmarksMirror.jsm:1133
(Diff revision 1)
>                    ) THEN :livemarkKind
>                    ELSE :folderKind END)
>                  ELSE :separatorKind END) AS kind,
> -             b.lastModified / 1000 AS localModified, b.syncChangeCounter
> +             b.lastModified / 1000 AS localModified,
> +             b.syncChangeCounter,
> +             b.syncStatus

Bug 1454864 moves this query into `LocalItemsSQLFragment` at the top. (I'm sorry I keep bitrotting you. :-|)

::: toolkit/components/places/SyncedBookmarksMirror.jsm:1157
(Diff revision 1)
>  
>      // Note tombstones for locally deleted items.
>      let tombstoneRows = await this.db.execute(`
> -      SELECT guid FROM moz_bookmarks_deleted`);
> +      SELECT guid, syncStatus
> +      FROM moz_bookmarks_deleted
> +      WHERE syncStatus != ${PlacesUtils.bookmarks.SYNC_STATUS.NORMAL}`);

It would be good to include all tombstones here, so that the merger sees the complete tree, and so that we see them in debug logs, too. This also matches the remote tree; I think we'll include all tombstones, regardless if they're `needsMerge`.

::: toolkit/components/places/SyncedBookmarksMirror.jsm:1327
(Diff revision 1)
>          DELETE FROM moz_items_annos
>          WHERE item_id IN (SELECT itemId FROM itemsRemoved
>                            WHERE NOT isUntagging)`);
>  
> -      // Remove any local tombstones for deleted items.
> +      dump("%% Insert or ignore into deleted from removed\n");
> +      // XXX Is there a way to do the next two statements at once?

I think these could both be removed, though there's a bit of work to make that happen...

::: toolkit/components/places/SyncedBookmarksMirror.jsm:1336
(Diff revision 1)
> +               STRFTIME('%s', 'now', 'localtime', 'utc') * 1000000 as dateRemoved,
> +               :syncStatus as syncStatus
> +        FROM itemsRemoved`,
> +        { syncStatus: PlacesUtils.bookmarks.SYNC_STATUS.NORMAL });
> +      dump("%% Update deleted from removed\n");
> +      // Mark any local tombstones for deleted items as synced.

Hmm, I think it's premature to mark them as `NORMAL` here. `PSU.b.pushChanges` would be a better place, after we've uploaded them all...

::: toolkit/components/places/SyncedBookmarksMirror.jsm:1360
(Diff revision 1)
>      }
>  
>      MirrorLog.debug("Flagging remotely deleted items as merged");
>      for (let chunk of PlacesSyncUtils.chunkArray(remoteDeletions,
>        SQLITE_MAX_VARIABLE_NUMBER)) {
> +      if (chunk.length) {

Nit: `chunkArray` shouldn't yield empty chunks.

::: toolkit/components/places/SyncedBookmarksMirror.jsm:1382
(Diff revision 1)
>      }
> +
> +    dump("%% insert or ignore into deleted from items where isDeleted\n");
> +    // Insert remote tombstones.
> +    // XXX: Not sure if there's a way to do this as part of the above?
> +    await this.db.execute(`

If we're including all tombstones in the local and remote trees, the merged tree should mention them all, and `subsumes` will assert this...so this could be removed, I think.

::: toolkit/components/places/SyncedBookmarksMirror.jsm:2944
(Diff revision 1)
>   * A node in a local or remote bookmark tree. Nodes are lightweight: they carry
>   * enough information for the merger to resolve trivial conflicts without
>   * querying the mirror or Places for the complete value state.
>   */
>  class BookmarkNode {
> -  constructor(guid, age, kind, needsMerge = false) {
> +  constructor(guid, age, kind, needsMerge = false, possiblyStale = false) {

Nit: Let's just store the `syncStatus` as a property on the node. It can be `NORMAL` for remote nodes.

::: toolkit/components/places/SyncedBookmarksMirror.jsm:3144
(Diff revision 1)
>    isDeleted(guid) {
>      return this.deletedGuids.has(guid);
>    }
>  
> +  // Note that it's an error to call this on the remote tree.
> +  localTombstoneSyncStatus(guid) {

As with `syncStatus` for nodes, I think it's OK to have `tombstoneSyncStatus` default to `NORMAL` for the remote tree. `NORMAL` just means "exists on server", and items in the remote tree definitely do, at least at merge time.

::: toolkit/components/places/SyncedBookmarksMirror.jsm:3451
(Diff revision 1)
>        let mergedSyncableRoot = await this.mergeNode(guid, localSyncableRoot,
>                                                      remoteSyncableRoot);
>        mergedRoot.mergedChildren.push(mergedSyncableRoot);
>      }
>  
>      // Any remaining deletions on one side should be deleted on the other side.

Since we're permanently storing tombstones, I think we can merge `deleteLocally` and `deleteRemotely` into a single `tombstones` map with GUID to sync status, since the distinction between the two gets fuzzier. WDYT?

`deleteLocally` *could* be `NORMAL`, and `deleteRemotely` could be `NEW`, but the same GUIDs will exist in both now, if we have `fetch{Local, Remote}Tree` return all tombstones. As with `mergeStates`, we could populate a temp table with the contents of this map, replacing `guidsWithLevelsToDelete`, and use it to update `moz_bookmarks_deleted`.

So...

* All new remote tombstones need to be inserted into `moz_bookmarks_deleted` with `syncStatus = NORMAL`. We don't need to reupload these, since we just downloaded them.
* All existing remote tombstones don't need to be reuploaded, and should already exist in `moz_bookmarks_deleted` with `NORMAL`. If they don't, the mirror is out of sync with Places...which is good to note in `fetchInconsistencies`, but we'll fix it up, anyway.
* All new local tombstones are `NEW`, and should be uploaded on the next sync.
* All existing local tombstones are `NORMAL` or `UNKNOWN`. `NORMAL` tombstones don't need to be reuploaded. `UNKNOWN` only needs to be reuploaded if it conflicts with a remote non-tombstone; if it's already a tombstone on the server, we can change it to `NORMAL` when we apply the merged tree. If it's `UNKNOWN` locally, and doesn't exist on the server at all...we should probably still reupload. This all means we might upload tombstones for records we just downloaded, but that's OK, because we'll also reupload the parent with the new structure in that case (https://searchfox.org/mozilla-central/source/toolkit/components/places/SyncedBookmarksMirror.jsm#3708-3719,4030-4043,4058,4062,4064-4068,4070,4117-4170).
* Everything in `localTree.deletedGuids` has `syncStatus = NEW` or `syncStatus = UNKNOWN`.
* Everything in `remoteTree.deletedGuids` has `syncStatus = NORMAL`.
* Conflicts between an `UNKNOWN` local deletion and a remote non-deletion add an `UNKNOWN` entry to the `deletions` map. (Not `NEW`, I don't think; if we're interrupted after merging, but during upload, we want to prefer the tombstone when we re-merge on the next sync).
* Conflicts between an `UNKNOWN` local non-deletion and a remote deletion add a `NORMAL` entry to `deletions`, since we don't need to reupload.
* After merging, we add remaining tombstones from both sides to `deletions`, as we do here, keeping their sync statuses.

We can then insert the map into a separate temp table, drop rows from `moz_bookmarks_deleted` that don't exist in the temp table, then `INSERT OR IGNORE` new entries, and update sync statuses where they don't match. That might be easiest to do using an `AFTER DELETE` trigger on the temp table. That way, we can do all our bookkeeping (https://searchfox.org/mozilla-central/rev/36dec78aecc40539ecc8d78e91612e38810f963c/toolkit/components/places/SyncedBookmarksMirror.jsm#1343-1349,1356-1360,1363-1366,1369-1371,1373-1376,1379-1384) in the trigger body, avoid unnecessary writes where the sync statuses already match, and avoid nesting subselects in the updates...and, since we'll need to read and delete all rows in the temp table, anyway, we might as well have SQLite do the work for us.

::: toolkit/components/places/SyncedBookmarksMirror.jsm:4109
(Diff revision 1)
>      }
>  
> +    let localSyncStatus = this.localTree.localTombstoneSyncStatus(remoteNode.guid);
>      if (remoteNode.needsMerge) {
>        if (!remoteNode.isFolder()) {
> +        if (localSyncStatus == PlacesUtils.bookmarks.SYNC_STATUS.UNKNOWN) {

This looks correct to me, and I wonder if we need to restrict it to non-folders. Could we move this block above the `needsMerge` check?

IOW, if we have an `UNKNOWN` tombstone, treat it exactly as we would a non-syncable item: delete remotely, and call `relocateLocalOrphansToMergedNode` to flatten all its remote children that we don't know about locally into the merged folder.

::: toolkit/components/places/SyncedBookmarksMirror.jsm:4111
(Diff revision 1)
> +    let localSyncStatus = this.localTree.localTombstoneSyncStatus(remoteNode.guid);
>      if (remoteNode.needsMerge) {
>        if (!remoteNode.isFolder()) {
> +        if (localSyncStatus == PlacesUtils.bookmarks.SYNC_STATUS.UNKNOWN) {
> +          MirrorLog.trace("Remote non-folder ${remoteNode} deleted locally " +
> +                          "with unknown status. taking local change",

Nit: taking local deletion.

::: toolkit/components/places/SyncedBookmarksMirror.jsm:4113
(Diff revision 1)
>        if (!remoteNode.isFolder()) {
> +        if (localSyncStatus == PlacesUtils.bookmarks.SYNC_STATUS.UNKNOWN) {
> +          MirrorLog.trace("Remote non-folder ${remoteNode} deleted locally " +
> +                          "with unknown status. taking local change",
> +                          { remoteNode });
> +          this.structureCounts.localItemDel++;

This is now called `localDeletes`.

::: toolkit/components/places/SyncedBookmarksMirror.jsm:4188
(Diff revision 1)
>        }
>        return BookmarkMerger.STRUCTURE.UNCHANGED;
>      }
>  
>      if (localNode.needsMerge) {
> +      if (localNode.possiblyStale) {

Ditto, I'm wondering if we can do this above the `needsMerge` check, and call `relocateLocalOrphansToMergedNode` for folders.

::: toolkit/components/places/SyncedBookmarksMirror.jsm:4192
(Diff revision 1)
>      if (localNode.needsMerge) {
> +      if (localNode.possiblyStale) {
> +        MirrorLog.trace("Local non-folder ${localNode} deleted remotely and " +
> +                        "changed locally, but local sync status is unknown, " +
> +                        "deleting locally.", { localNode });
> +        this.structureCounts.localItemDel++;

`localDeletes` here, too.
Keywords: feature
Assignee: tchiovoloni → nobody
Status: ASSIGNED → NEW
Priority: P1 → P3
Is this what needs to be worked on next to get TPS Prod passing, now that the mozrunner version bug was fixed?

This got complicated and I don't see us coming back to it in both this engine and the rust engine

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.