Closed Bug 1274496 Opened 4 years ago Closed 3 years ago

Sync doesn't ignore items with places/excludeFromBackup

Categories

(Firefox :: Sync, defect, P1)

defect

Tracking

()

RESOLVED FIXED
Firefox 51
Tracking Status
firefox51 --- fixed

People

(Reporter: markh, Assigned: Lina)

References

Details

(Whiteboard: [sync-tracker])

Attachments

(2 files, 1 obsolete file)

The first time the user opens "Show All Bookmarks", the places root and all "magic" items under that root (eg, "Downloads", "Unfiled" etc) all get uploaded to the server. This is because Places fires an OnItemAdded notification for these items *before* they have the annotation, so they are added to the tracker. OnItemChanged doesn't handle this annotation being changed and doesn't remove it.

Further, there's no code in the engine itself that checks this record. IIUC, if Sync later decides it wants to upload all bookmarks (rather than just tracked ones), they'll all end up there as well.

And finally, when we create the mobile root we add this annotation to the root. I don't think we should be doing that either (ie, that this record *should* end up on the server) - but I'm not sure about that.
Flags: firefox-backlog+
Interesting. Definitely sounds like it's worth it to make sync understand and handle that annotation better, this is certainly the source of a lot of weirdness inside the bookmark tree on my dev profile.

That said, IMO it sounds like it's a potential bug in places that the event is fired before the annotation is added. This stuff should be handled transactionally, which means either:

A) it's not, e.g. the annotation is added outside of a transaction, or no transaction is used at all, in which case we could end up with these items without annotations. This seems likely to me, as I have a number of these without annotations in my bookmarks tree (although, these could all be due to sync bugs).

or B) the OnItemAdded event is fired before the transaction completes. This seems less likely, since it would probably result in more broken stuff than we see. (IIUC in order not to be broken, all rollbacks would have to fire events in reverse order undoing the other events for it to be certain not to result in sync bugs... which sounds like something we probably aren't doing).

Unless I'm missing something. In either case, it sounds like its a bad situation that we don't want.
I don't have a good understanding of why Places makes this distinction -- see Bug 417228 for some confusing context.

ISTM that the mobile bookmarks folder at this point is just another default top-level root, and it should be included in backups.
(In reply to Thom Chiovoloni [:tcsc] from comment #1)
> That said, IMO it sounds like it's a potential bug in places that the event
> is fired before the annotation is added. This stuff should be handled
> transactionally, which means either:

Yep, agreed - although IIUC that probably will not solve the general problem for us - it should stop those items being added to the tracker, but I believe there are cases where we decide to upload the entire tree, bypassing the tracker. There's also a (small IIUC) risk that these bookmarks may be created before Sync is initialized but the annotation added after it initializes.

So ISTM that regardless of what places should be doing here, Sync just needs to be more robust and exclude this anno in the engine itself rather than (or as well as) the tracker.

(In reply to Richard Newman [:rnewman] from comment #2)
> ISTM that the mobile bookmarks folder at this point is just another default
> top-level root, and it should be included in backups.

Agreed - it is Sync that adds that anno - it should not. None of these other "roots" have that anno.
(In reply to Mark Hammond [:markh] from comment #3)
> Yep, agreed - although IIUC that probably will not solve the general problem
> for us - it should stop those items being added to the tracker, but I
> believe there are cases where we decide to upload the entire tree, bypassing
> the tracker. 

Right, in these cases the engine would need to be aware of the annotation... But if the annotation may not be added at that point... I'm not sure what we should do.  Specifically, unless I'm mistaken, nothing prevents us sending the bookmark to the server between when it's created and when the anno is added.

In those cases, is there anything we can do? (Requesting deletion sounds like the wrong thing to do, since we don't want it gone from any clients, just from the server).

> So ISTM that regardless of what places should be doing here, Sync just needs
> to be more robust and exclude this anno in the engine itself rather than (or
> as well as) the tracker.

Again, not sure I understand. If the annotation isn't added until later, how would the engine being aware of it make any difference?
Priority: -- → P1
(In reply to Thom Chiovoloni [:tcsc] from comment #4)
> Right, in these cases the engine would need to be aware of the annotation...
> But if the annotation may not be added at that point... I'm not sure what we
> should do.  Specifically, unless I'm mistaken, nothing prevents us sending
> the bookmark to the server between when it's created and when the anno is
> added.

Yeah, that's correct in general and we should probably try and dig into the code creating those special elements to see if there is anything we can do there. I suspect it will be tricky.

Note that I believe the annotation is added somewhat "immediately", just not within a transaction and after the notification of the creation, so if the engine itself enforces this the window would be very small and we'd need to be extremely unlucky for a sync to actually hit this in practice - but yeah, that's still a possibility we should look to mitigate (one in a million chances happen daily for our users :/)

IOW:
* The work described here seems worthwhile and would be a vast improvement over what we currently have.
* Working out how to make places do better in this regard would make it better still.
Something that just occurred to me: the current incarnation of the tracker ignores items that have the "places/excludeFromBackup" anno. However, if an item doesn't have the anno, but its *parent* does...then it'll sync the parent. Tracking the item still seems desirable—for example, if I change the title or URL of a mobile bookmark, I definitely want those changes synced—but it doesn't make sense to track the parent.

It's not really related; I just wanted to write this down somewhere for investigating tomorrow. But checking for this anno is a source of complexity in the queries for bug 1258127, so +1 to cleaning them up. :-)
TBH I still don't understand what this annotation is actually for. However, none of the "roots" have that anno - the mobile root does, but only because we explicitly add it - it seems like we shouldn't.

The other bookmarks that *do* have it seem to be some kind of "intermediate" folder, with the parents as the root and having *real* bookmarks as children, and they never have a title. Best I can tell, the bookmark backup code doesn't simply ignore them completely, but instead treats the children as if they are actually parented by the parent - otherwise they would be orphans. The UI *does* ignore them - ie, their children do appear in the UI as parented by their actual grand-parent.

(In reply to Kit Cambridge [:kitcambridge] from comment #6)
> Something that just occurred to me: the current incarnation of the tracker
> ignores items that have the "places/excludeFromBackup" anno. However, if an
> item doesn't have the anno, but its *parent* does...then it'll sync the
> parent. Tracking the item still seems desirable—for example, if I change the
> title or URL of a mobile bookmark, I definitely want those changes
> synced—but it doesn't make sense to track the parent.

Yes - but see above - I *think* it's something like "to find the 'logical' parent, walk up the tree until you find a parent without that anno." Or something :/

I suspect we will need to reverse engineer the bookmarks backup code to work out what's going on. Or maybe Mak can tell us :)
Flags: needinfo?(mak77)
(In reply to Mark Hammond [:markh] from comment #7)
> TBH I still don't understand what this annotation is actually for.

The annotation is set on items that are created by the UI, rather than by the user. For example the left pane folder of the Library is a real bookmarks folder but it is not user created. When we backup it doesn't make sense to back it up, since it may also create confusion to the creation code, and when we restore we don't remove it.
Regarding mobile, I suppose the reasoning was similar, since they are synced from elsewhere, we were not backing them up/restoring them.

> However,
> none of the "roots" have that anno - the mobile root does, but only because
> we explicitly add it - it seems like we shouldn't.

Probably could stop doing that if you want to also backup that root contents, we may need to also change the backup/restore code to consider it a valid root.

> The other bookmarks that *do* have it seem to be some kind of "intermediate"
> folder, with the parents as the root and having *real* bookmarks as
> children, and they never have a title. Best I can tell, the bookmark backup
> code doesn't simply ignore them completely, but instead treats the children
> as if they are actually parented by the parent - otherwise they would be
> orphans. The UI *does* ignore them - ie, their children do appear in the UI
> as parented by their actual grand-parent.

I'm not sure what you are saying here, the marked bookmarks are real folders containing queries or folders. The bookmarks backup code completely skips them, since any version is able to generate them. They are only queries or folder containing  queries. The contained bookmarks are results of querying but they are not direct children of those excluded nodes.
The only solution to stop caring about this annotation, would be to generate on-the-fly the contents of the Library left pane folder. It may also be a good idea, but I can't tell off-hand how hard it would be, could be matter of just creating "fake" nodes and calling the right treeview insert notifications to show them. Or even creating a fake nsINavHistoryResult. I'd be happy if we'd solve this by making those things non-real bookmarks. We never looked into that, since from a long time there was a plan to move the Library to an in-content view, and then it may not have a real left pane at all.

> (In reply to Kit Cambridge [:kitcambridge] from comment #6)
> > Something that just occurred to me: the current incarnation of the tracker
> > ignores items that have the "places/excludeFromBackup" anno. However, if an
> > item doesn't have the anno, but its *parent* does...then it'll sync the
> > parent.

In Firefox UI code any excluded folder only contains queries or folders that are also marked as excluded. And they DO NOT contain bookmarked uris.

The only special case is the mobile root.
Flags: needinfo?(mak77)
the tree structure of the left pane root is
places_root
--left_pane_root
---- history_query
---- downloads_query
---- tags_query
---- allBookmarks_folder
-------- toolbar_query
-------- menu_query
-------- unsorted_query
-------- mobile_query

There is nothing else there.

Now, I don't recall what the mobile root does, ideally there should be a real mobile root without the exclude annotation, and then a query in All Bookmarks with the exclude annotation.
(In reply to Marco Bonardo [::mak] from comment #8)
> I'm not sure what you are saying here, the marked bookmarks are real folders
> containing queries or folders. The bookmarks backup code completely skips
> them, since any version is able to generate them. They are only queries or
> folder containing  queries. The contained bookmarks are results of querying
> but they are not direct children of those excluded nodes.

Oh, right, I think that's what I was missing. I suspect sync is uploading these results and isn't correctly ignoring changes to nodes under a parent with this anno.

> The only solution to stop caring about this annotation, would be to generate
> on-the-fly the contents of the Library left pane folder.

That sounds like alot more (and risky) work. I suspect we can use it correctly once we get a handle on all of this - I'm going to need to re-read the above and re-look at my places database a few times to pretend I truly understand it, but that is a great help, thanks.
(In reply to Mark Hammond [:markh] from comment #10)
> Oh, right, I think that's what I was missing. I suspect sync is uploading
> these results and isn't correctly ignoring changes to nodes under a parent
> with this anno.

By "these results" what do you mean? Afaict, Sync should never open a place: query, do I don't think it can really upload the bookmarks it contains. I suppose you mean it's uploading the excluded folders/queries?

> > The only solution to stop caring about this annotation, would be to generate
> > on-the-fly the contents of the Library left pane folder.
> 
> That sounds like alot more (and risky) work. I suspect we can use it
> correctly once we get a handle on all of this - I'm going to need to re-read
> the above and re-look at my places database a few times to pretend I truly
> understand it, but that is a great help, thanks.

I don't think it's that risky, in the worst case the left pane may be in a bad shape, we'd notice very soon in Nightly. The current code is very complex and riskier than simulating a left pane folder.
I have no doubt about the fact it would be better to generate the left pane root in some "fake" way, my only doubt is how hard would be to do that.
(In reply to Mark Hammond [:markh] from comment #10)
> Oh, right, I think that's what I was missing. I suspect sync is uploading
> these results and isn't correctly ignoring changes to nodes under a parent
> with this anno.

Yeah. Sync tracks items and their parents when they're added or removed, but only checks for the anno on the item: https://dxr.mozilla.org/mozilla-central/rev/46fe2115d46a5bb40523b8466341d8f9a26e1bdf/services/sync/modules/engines/bookmarks.js#1412-1416,1424-1425,1428-1429,1434-1436,1439-1440

In the mobile case, I think that's correct, since we don't (?) want to ignore changes to mobile bookmarks. So I'm not sure why we're special-casing the mobile root, either. :-)
FTR:

(In reply to Thom Chiovoloni [:tcsc] from comment #1)
> That said, IMO it sounds like it's a potential bug in places that the event
> is fired before the annotation is added. This stuff should be handled
> transactionally, which means either:

It is (almost) *both* rather than either :/

These items are added at https://dxr.mozilla.org/mozilla-central/rev/829d3be6ba648b838ee1953fdfa1a477dace752f/browser/components/places/PlacesUIUtils.jsm#1271 - as you can see, there are 2 calls made - one to create the folder and the next to add the annotation.

> A) it's not, e.g. the annotation is added outside of a transaction, or no
> transaction is used at all, in which case we could end up with these items
> without annotations.

So nsINavBookmarksService.createFolder does create a transaction internally, mainly to protect the child indexes - but that doesn't help here.

Now it seems that we could possibly use a nested transaction here, but:

> or B) the OnItemAdded event is fired before the transaction completes.

IIUC, this would still happen before the "outer" transaction was complete - the notifications are sent directly by that c++ code - so a nested transaction would just end up with the notification being called at the same time, and as you say, probably causing other bad things to happen (IIUC, it would depend on what of the numerous db connections used by places was used to look for the data.)

That PlacesUIUtils code also does this in a "batch" - I always thought the main point for these batches was to prevent many observer notifications from firing for everything done in the batch and would be used for, eg, bookmark restore - with a single notification is fired at the end of the batch. But based on both reading the code and the actual behaviour, this isn't the case. Bookmarks.jsm, which has a better and async API for bookmarks also doesn't seem to have the capability to create a bookmark and annotations in a single transaction with the notification only coming at the end.

I'm still getting my head around some of this, so while this obviously sucks, it's not yet clear to me how much of a problem it actually is for Sync (ie, whether the work in not uploading these records that we need to do can side-step this problem)
Annotations is a separate feature that lives apart from bookmarks/pages, so it makes sense that you get a notification that an item is added and then an annotation is added for that item.

(In reply to Mark Hammond [:markh] from comment #13)
> That PlacesUIUtils code also does this in a "batch" - I always thought the
> main point for these batches was to prevent many observer notifications from
> firing for everything done in the batch and would be used for, eg, bookmark
> restore - with a single notification is fired at the end of the batch. But
> based on both reading the code and the actual behaviour, this isn't the
> case

Batches are only a way to tell consumers "you are about to get a bunch of notifications, do what you think is better". Results for example during a batch could ignore notifications, and rather requery when the batch finishes.
But we never stop sending notifications.

> Bookmarks.jsm, which has a better and async API for bookmarks also
> doesn't seem to have the capability to create a bookmark and annotations in
> a single transaction with the notification only coming at the end.

Yes, cause annotations are a separate thing one can use to add any kind of data to a bookmark or an uri.

that said, the only thing Sync should care about here is to not sync the left pane folder contents, that is direct children of PlacesUIUtils.leftPaneFolderId and children of PlacesUIUtils.allBookmarksFolderId.
Note these 2 are not carved in stone, indeed to move the APIs to guid and async, we'll sooner or later need to change how the whole left pane story works (and we could then make those non-real bookmarks as I suggested in comment 8).

For now I fell like you could just ignore notifications having those 2 ids as parent.
(In reply to Marco Bonardo [::mak] from comment #14)
> For now I fell like you could just ignore notifications having those 2 ids
> as parent.

That helps, thanks. I'm also seeing, eg, "Recent Tags" being added directly to the "menu_" root etc, which I'm guessing we want to ignore? They don't have the "exclude" anno but do have "Places/SmartBookmark". It's almost sounding like either those annos, or something similar to |{leftPaneItems} + {item.url.startsWith("places:")};|?
those are indeed Smart Bookmarks, that is pre-populated queries we add to each profile. Currently there should be 2, Recent Tags and Most Visited (in previous versions we also had Recently Bookmarked).
Ideally it's another thing you could avoid syncing, since every profile will go looking for them and create them if they are missing. The problem is that sync may create duplicates of these if it doesn't implement specific handling.
You could surely skip urls with a "place" scheme, and in some cases it may also be a good idea (folder shortcuts like place:folder=ID are unlikely to be portable across profiles, indeed we plan from a long time to move those to guids).
The only downside is that if a user has some specific query he uses, like for example a shortcut to a specific tag, a shortcut to a specific history view, or something like that, those won't be synced and he'll have to recreate them manually.
On the other side, these smart bookmarks are created only once in the profile life, very early at profile creation, just after default bookmarks and distribution.ini bookmarks are imported.
I suppose also distribution.ini and default bookmarks are something you are unlikely to want to sync (you would sync Ubuntu specific bookmarks to a Windows instance?)
Maybe we just need a way to indicate which bookmarks are created with the profile? Though, that way could be annotations and doesn't look like you have a good plan to handle those.
(In reply to Marco Bonardo [::mak] from comment #16)
> The only downside is that if a user has some specific query he uses, like
> for example a shortcut to a specific tag, a shortcut to a specific history
> view, or something like that, those won't be synced and he'll have to
> recreate them manually.

Right - and I think those other special ones are copy-and-move able, so I don't think that will fly. But we are getting closer.
 
> On the other side, these smart bookmarks are created only once in the
> profile life

Except for the corrupt-at-startup case, which is how I'm actually testing some of this.

I'm also not yet convinced annotations aren't workable (and can possibly be filtered post-tracker in Kit's patch). I'll keep playing and learning...
An observation: it doesn't make sense to not sync any record that:

* is a non-root in a tree that is synced
* the user cares about its position. 

The safest thing to do is sync everything in a tree. Badness would happen if we skipped place: queries inside folders; you'd need to exclude them in child lists, make reparenting smarter, etc. 

The bookmark tree we upload needs to be internally consistent and other clients need to be able to write to it with impunity. That's why we also upload separators and a bunch of other legacy bookmark stuff that only desktop cares about — we have to preserve data.
(In reply to Richard Newman [:rnewman] from comment #18)
> An observation: it doesn't make sense to not sync any record that:
> 
> * is a non-root in a tree that is synced
> * the user cares about its position. 
> 
> The safest thing to do is sync everything in a tree. Badness would happen if
> we skipped place: queries inside folders; you'd need to exclude them in
> child lists, make reparenting smarter, etc.

Yep, agreed - we need to sync "Places/SmartBookmark" items and just exclude the entire "left pane" tree (which is what Mak said). I'm still thinking about whether it is reliable and performant to try and detect this at notification time, or if we should just allow the tracker to record them but exclude them at Sync time - the issue is that at notification time we only have the parent ID and no other details, and having the tracker get annotations for parents seems less efficient than just filtering when getChangedIDs is called.
Attached patch t.patchSplinter Review
(In reply to Marco Bonardo [::mak] from comment #14)

> For now I fell like you could just ignore notifications having those 2 ids
> as parent.

That does work - I get good results with this patch. However, I don't think we should use it :) I'm starting to think we should perform this filtering at getChangedIDs() time, still against the anno, for 2 reasons:

* It will help keep improve the observer performance - we wouldn't need to pre-create those leftpane items, and could avoid looking for the "exclude" annotation during the notifications. We only expect a very small number of such items to appear in the tracker, and even then only very rarely.

* Do doing the filtering at getChangedIDs() time we are paying small additional cost, but we are doing so at the time we are about to hit many bookmarks anyway (ie, we are about to Sync some bookmarks). If that works out, it means Kit's work in bug 1258127 could be greatly simplified by not attempting to indirect through annotations on each change. IOW, we could probably land something along these lines using the existing tracker, prove it works well, and use the exact same approach in the DB-based tracker (ie, the DB would tell us about the excluded sub-trees, but we'd still ignore them while resetting their change counter)

The only real downside I could see would be if *only* excluded items are in the tracker - we'll increment the score, start to Sync, just to find there's actually nothing to do. I think that's fine as it is a one-off penalty - those items are almost certainly not going to reappear in the tracker again.

(note the patch also has a hack to exclude the root from being synced, which we should take in some form, but can probably roll that into the above.  I think we hit that case when the "Places/SmartBookmark" items are added)

Any thoughts on that?
Whiteboard: [sync-data-integrity]
Priority: P1 → P2
Whiteboard: [sync-data-integrity] → [data-integrity]
Assignee: nobody → kcambridge
Priority: P2 → P1
This is a rough draft at filtering out bookmarks at sync time. The tracker now records all changes, and `getChangedIDs()` filters out anything that's not under a change root (menu, toolbar, unfiled, and mobile). It's a backport of some of the work from bug 1258127, using the old observer-based tracker.

The query is hideously complicated because we need to reconstruct the ancestry for deleted bookmarks: we know their parent, but not the root they belonged to. We can't look this up in `onItemRemoved`, either, because the bookmark has been deleted from moz_bookmarks at that point. There are a couple of things we can do to work around this:

1. Just upload tombstone records for all deleted "bookmarks", including tags, left pane queries, and anything else that we normally exclude. Code-wise, this is probably the simplest option. This "shouldn't" (scare quotes) cause issues for the receiving clients; since we never uploaded the original record, they'll see a deletion for an item that they don't have, and ignore it. It does mean we'll write more tombstones to the server, though, especially when tagging and untagging. It also means we'll need to document this behavior, since it's rather surprising to upload phantom tombstones for items that we don't otherwise track.

2. Pass the root ID to the observer. Looking up the root is a fairly small WITH RECURSIVE query, but we'd need to do that for every `removeItem` call. (We'd also need to do this for `removeFolderChildren`, though we can optimize that to only query the folder's root instead of every descendant). Additionally, we can include the root ID in the record, which could help with reconciliation, and, for newer clients, filtering out incoming records that shouldn't be there. This is conceptually simpler than (1), cleans up the hideous query, and means we don't need to upload phantom tombstones. It does mean an additioal query every time we remove a bookmark, though.

2a. Only look the root up for synced bookmarks. Not really feasible with the current observer-based tracker, but might be a good compromise for the new tracker in bug 1258127.

3. Do nothing; keep the complicated query here and in bug 1258127.

I'm inclined to try (2), since it's "obviously" (understandably) correct, and see how much of a perf hit we'll suffer. If that doesn't pan out, we can consider something else.
This is also interesting: http://web.archive.org/web/20140723034731/https://testpilot.mozillalabs.com/testcases/a-week-life/results.html Walking the ancestors on each deletion might not be very expensive with a shallow tree.
(In reply to Kit Cambridge [:kitcambridge] from comment #22)
> 1. Just upload tombstone records for all deleted "bookmarks", including
> tags, left pane queries, and anything else that we normally exclude.
> Code-wise, this is probably the simplest option. This "shouldn't" (scare
> quotes) cause issues for the receiving clients; since we never uploaded the
> original record, they'll see a deletion for an item that they don't have,
> and ignore it. It does mean we'll write more tombstones to the server,
> though, especially when tagging and untagging. It also means we'll need to
> document this behavior, since it's rather surprising to upload phantom
> tombstones for items that we don't otherwise track.

Actually, it turns out we already do the wrong thing! `removeItem` removes the item's annos, so http://searchfox.org/mozilla-central/rev/064025c802c22cd5ad122746733cbd34ea47393c/services/sync/modules/engines/bookmarks.js#1010-1014 is never going to be hit for removed bookmarks. So I think we'll already upload a tombstone in this case.
(In reply to Kit Cambridge [:kitcambridge] from comment #22)

> 1. Just upload tombstone records for all deleted "bookmarks", including
> tags, left pane queries, and anything else that we normally exclude.
> Code-wise, this is probably the simplest option. This "shouldn't" (scare
> quotes) cause issues for the receiving clients; since we never uploaded the
> original record, they'll see a deletion for an item that they don't have,
> and ignore it.

I suspect that the real behavior for clients will be something blindly optimistic and trusting like:

  handleRecord(record) {
    if (record.deleted) {
       this._store.deleteRecordForGUID(record.id);
       return;
    }
    ...

so I would not advise this option.



> 2. Pass the root ID to the observer. Looking up the root is a fairly small
> WITH RECURSIVE query, but…

Have you considered storing the root of a bookmark as part of the database row? This requires recursive update when moving folders, and non-recursive update when moving anything else, but makes root lookup free. The schema change should be relatively straightforward.


> I'm inclined to try (2), since it's "obviously" (understandably) correct,
> and see how much of a perf hit we'll suffer. If that doesn't pan out, we can
> consider something else.

That seems like the best option to me.


As a sidenote: this bug (and IRC chatter) implies that in some circumstances we have already uploaded:

* Non-deleted records for bookmarks we shouldn't sync (e.g., the places root itself)
* Deleted records for bookmarks we shouldn't sync.

At some point we should intersect the GUIDs on the server with the non-synced GUIDs we're tracking locally, and HTTP DELETE those records from the server. They'll only cause trouble.
(In reply to Kit Cambridge [:kitcambridge] from comment #23)
> Walking the ancestors on each deletion
> might not be very expensive with a shallow tree.

Something that Nick and I are discovering as we build Datomish is that sqlite is actually really, really fast.

One of the lessons from exploring pure-JS and web-platform storage is that sqlite tramping around the disk is very often much, much faster than JS tramping around hash tables in memory.

We're fond of saying "always bet on the web", but I increasingly prefer "always bet on the folks who have spent years and years writing C and making their stuff fast".
Attachment #8786520 - Attachment is obsolete: true
Comment on attachment 8786918 [details]
Bug 1274496 - Filter excluded bookmarks at sync time based on their root.

https://reviewboard.mozilla.org/r/75784/#review73724

::: services/sync/modules/engines.js:1299
(Diff revision 1)
> +   * changeset is an object keyed by the record ID, with opaque values.
> +   * By default, the value is a timestamp, though the bookmarks engine
> +   * uses `{ modified, deleted }` tuples.
> +   */
> +
> +  // Returns the last modified time, in seconds, for an entry in the changeset.

I backported these from bug 1258127. The idea is to make `this._modified` more opaque to the base engine, so that engines can store additional data for changed items, instead of just timestamps.

It's more useful there because we don't actually delete entries from `this._modified`, just mark them as synced. But it makes for a smaller diff.

::: services/sync/modules/engines/bookmarks.js:1005
(Diff revision 1)
> +    let shouldSaveChange = false;
> +    let currentChange = this.changedIDs[id];
> +    if (currentChange) {
> +      if (typeof currentChange == "number") {
> +        // Allow raw timestamps for backward-compatibility with persisted
> +        // changed IDs. The new format uses tuples to track deleted items.

The old JSON format is `{ guid: timestamp }`, the new one is `{ guid: { deleted: boolean, modified: timestamp }`.

Not sure how much we care about back-compat. We can remove this, but then we'll lose any tracked IDs stored in the JSON file now.

::: services/sync/modules/engines/bookmarks.js:1139
(Diff revision 1)
> +    try {
> +      grandParentId = PlacesUtils.bookmarks.getFolderIdForItem(parentId);
> +    } catch (ex) {
> +      // `getFolderIdForItem` can throw if the item no longer exists, such as
> +      // when we've removed a subtree using `removeFolderChildren`.
> +      return;

It's possible for us to be smarter here, but we really need to know the item's root in order to do the right thing. The observer notification just doesn't give us enough context.

::: services/sync/modules/engines/bookmarks.js:1164
(Diff revision 1)
> +     * pane root. For good measure, we'll also upload the Places root, because
> +     * it's the parent of the left pane root.
> +     *
> +     * As a workaround, we can track the parent GUID and reconstruct the item's
> +     * ancestry at sync time. This is complicated, and the previous behavior was
> +     * already wrong, so we'll wait for bug 185127 to fix this generally.

So, yes...

1. We miss out on stuff we *do* want to track, like `removeFolderChildren` removing descendants...because `gFIFI` will fail.
2. We upload stuff we shouldn't, like the Places root and tombstones for removed left pane queries.

::: services/sync/tests/unit/test_bookmark_tracker.js:1263
(Diff revision 1)
>  
> +    // `eraseEverything` removes all items from the database before notifying
> +    // observers. Because of this, grandchild lookup in the tracker's
> +    // `onItemRemoved` observer will fail. That means we won't track
> +    // (bzBmk.guid, bugsGrandChildBmk.guid, bugsChildFolder.guid), even
> +    // though we should.

Another way we can easily miss changes today: remove a folder's contents with `removeFolderChildren` or `Bookmarks.remove`. We'll track its *immediate* children, but none of their descendants, because `getFolderIdForItem` will throw, and we won't add the changed ID to the tracker!

This patch doesn't fix that...but this is already possible today, so it doesn't make it worse. :-(
Comment on attachment 8786918 [details]
Bug 1274496 - Filter excluded bookmarks at sync time based on their root.

Mark, Richard, I expect you'll have a lot to say about this. :-) Please take a look at your convenience.
Attachment #8786918 - Flags: feedback?(markh)
Comment on attachment 8786918 [details]
Bug 1274496 - Filter excluded bookmarks at sync time based on their root.

Oops, forgot to add Richard.
Attachment #8786918 - Flags: feedback?(rnewman)
Blocks: 1258127
Status: NEW → ASSIGNED
Whiteboard: [data-integrity] → [sync-tracker]
Comment on attachment 8786918 [details]
Bug 1274496 - Filter excluded bookmarks at sync time based on their root.

https://reviewboard.mozilla.org/r/75784/#review73788

f+

::: services/sync/modules/engines.js:145
(Diff revision 2)
> +    this.changedIDs[id] = when;
> +    this.saveChangedIDs(this.onSavedChangedIDs);
> +  },
> +
> +  _now() {
> +    return Math.floor(Date.now() / 1000);

It occurs to me that flooring here doesn't really make any sense.

::: services/sync/modules/engines.js:177
(Diff revision 2)
>      if (!id) {
>        this._log.warn("Attempted to remove undefined ID to tracker");
>        return false;
>      }
> -    if (this.ignoreAll || (id in this._ignored))
> +    if (this._isIgnored(id))
>        return false;

Take this opportunity to add `{}`.

But consider: the changed IDs file might be from an older version of Firefox. I don't think it hurts to remove an ignored change ID -- it shouldn't be present anyway, and if it is we do want to remove it.

::: services/sync/modules/engines.js:1420
(Diff revision 2)
>  
>          // If the local item was modified, we carry its metadata forward so
>          // appropriate reconciling can be performed.
> -        if (dupeID in this._modified) {
> +        if (this._isModified(dupeID)) {
>            locallyModified = true;
> -          localAge = Date.now() / 1000 - this._modified[dupeID];
> +          localAge = Date.now() / 1000 - this._getModifiedTimestamp(dupeID);

Except that the modified time is rounded down, so this has a little bit of a bias.

::: services/sync/modules/engines.js:1423
(Diff revision 2)
> -        if (dupeID in this._modified) {
> +        if (this._isModified(dupeID)) {
>            locallyModified = true;
> -          localAge = Date.now() / 1000 - this._modified[dupeID];
> +          localAge = Date.now() / 1000 - this._getModifiedTimestamp(dupeID);
>            remoteIsNewer = remoteAge < localAge;
>  
> -          this._modified[item.id] = this._modified[dupeID];
> +          this._replaceModified(dupeID, item.id);

But we don't know if this record was previously uploaded and is now changed, or if it's modified because it's new, or if it's modified because we're re-syncing from scratch.

When you call `replaceModified`, you remove this ID from the set that we're going to upload. That will potentially leave the duped record on the server.

I think here you should consider uploading a deletion.

::: services/sync/modules/engines.js:1463
(Diff revision 2)
>          return true;
>        }
>  
>        this._log.trace("Ignoring incoming item because the local item's " +
>                        "deletion is newer.");
>        return false;

I think the bookmark engine will want to track the record's parent here?

That is: if the incoming record is a deletion, presumably the server's parent record doesn't include the un-deleted child, and so we need to upload one.

::: services/sync/modules/engines/bookmarks.js:1018
(Diff revision 2)
> +    // Filter out tags, organizer queries, and other descendants that we're
> +    // not tracking.
> +    let query = `
> +      WITH RECURSIVE
> +        modifiedGuids(syncID, guid) AS (
> +          VALUES ${modifiedGUIDs.join()}

SQL injection, inadvertent or not?

And, as noted elsewhere: if you special-case the roots -- perhaps by not including them in modifiedGUIDs, so there's no chance you'll remove them on line 1039 -- you don't need to handle pairs here.

::: services/sync/modules/engines/bookmarks.js:1023
(Diff revision 2)
> +          VALUES ${modifiedGUIDs.join()}
> +        ),
> +        syncedItems(guid) AS (
> +          SELECT r.guid
> +          FROM moz_bookmarks r
> +          WHERE r.id IN (${getChangeRootIds().join(", ")})

Shouldn't these be quoted?

::: services/sync/modules/engines/bookmarks.js:1032
(Diff revision 2)
> +          JOIN moz_bookmarks p ON p.id = b.parent
> +          JOIN syncedItems s ON p.guid = s.guid
> +        )
> +      SELECT m.syncID
> +      FROM modifiedGuids m
> +      WHERE m.guid NOT IN (SELECT guid FROM syncedItems)

Is this more efficient when phrased as a left excluding join? Something like (with my de-pairing suggestion):

```
SELECT m.guid
FROM modifiedGuids m LEFT JOIN syncedItems
ON m.guid = syncedItems.guid
WHERE syncedItems.guid IS NULL
```

::: services/sync/modules/engines/bookmarks.js:1058
(Diff revision 2)
> +      if (this.changedIDs[syncID].deleted === true) {
> +        // The `===` check also filters out old persisted timestamps,
> +        // which won't have a `deleted` property.
> +        continue;
> +      }
> +      guids.push(`(${JSON.stringify(syncID)}, ${JSON.stringify(placesGUID)})`);

* 99% of the time, syncID and placesGUID will be the same. The only time it won't be is for the roots. Can we special-case the roots in `getChangedIDs` to avoid having to handle pairs here?
* JSON stringifying isn't quite the same as SQL escaping, is it?
Attachment #8786918 - Flags: review+
Attachment #8786918 - Flags: review+
Attachment #8786918 - Flags: feedback?(rnewman)
Attachment #8786918 - Flags: feedback+
Comment on attachment 8786918 [details]
Bug 1274496 - Filter excluded bookmarks at sync time based on their root.

https://reviewboard.mozilla.org/r/75784/#review73838

I think is certainly the direction we want to head. In addition to Richards thorough comments, I'm a little concerned with the additional "maybe abstract, maybe not" functions being introduced - more below.

::: services/sync/modules/engines.js:135
(Diff revision 2)
>      if (index != -1)
>        this._ignored.splice(index, 1);
>    },
>  
> +  _isIgnored(id) {
> +    return this.ignoreAll || (id in this._ignored);

Is this change actually needed here? I understand you probably need this change for bug 1299338 (along with a corresponding change in the bookmarks engine so either  ignoreAll is no longer checked, or it's always false for that engine).

I don't object to the change, I'm just trying to understand it.

::: services/sync/modules/engines.js:887
(Diff revision 2)
> +   */
> +  getAllIDs() {
> +    // Mark all items to be uploaded, but treat them as changed from long ago
> +    let modified = {};
> +    for (let id in this._store.getAllIDs()) {
> +      modified[id] = 0;

It seems a little odd that just above we are documenting an "opaque change record" for getChangedIDs, but here we are sticking with "changed timestamp"

I think that a comment like "while change records are opaque, the default implementation here is just a timestamp - however, some engines may override this" or similar would make this clearer.

::: services/sync/modules/engines.js:957
(Diff revision 2)
>      if (this.lastSync) {
>        this._modified = this.getChangedIDs();
>      } else {
> -      // Mark all items to be uploaded, but treat them as changed from long ago
>        this._log.debug("First sync, uploading all items");
> -      this._modified = {};
> +      this._modified = this.getAllIDs();

This is almost the only place remaining in this file that references `this._modified` directly - apart from the new abstractions. At face value, this seems to be increasing complexity a little.

eg, `id in this._modified` has been replaced with the abstract `this._isModified(id)` - but that's not actually overridden by the bookmarks engine, and given the use of `self._modified` here, it's not apparent that it could be.

ISTM that we might be better off with either a new method called, say `this.prepareModified()` which in the default implementation is just `this._modified = this.getAllIDs()` (making the abstraction a little more complete), or sticking with the assumption of `this._modified` being the change records and deleting `._isModified` as that helps reinforce `this._modified` is the only model used. IOW, I don't think your abstractions actually allow an engine to override the use of `this._modified`, only to override what "values" (but not what "keys") it has.

However, another option to consider might be a "ChangesetManager" or similar object with the overridable methods being on that - the core engine stays fairly clean and clear (eg, `this._modified.has(id)`, `this._modified.getTimestamp(id)` etc (assuming `this._modified` remains the name of the instance used))

(This is more an observation than anything, and the correct answer isn't clear)

::: services/sync/modules/engines.js:1423
(Diff revision 2)
> -        if (dupeID in this._modified) {
> +        if (this._isModified(dupeID)) {
>            locallyModified = true;
> -          localAge = Date.now() / 1000 - this._modified[dupeID];
> +          localAge = Date.now() / 1000 - this._getModifiedTimestamp(dupeID);
>            remoteIsNewer = remoteAge < localAge;
>  
> -          this._modified[item.id] = this._modified[dupeID];
> +          this._replaceModified(dupeID, item.id);

> I think here you should consider uploading a deletion.

Bug 1276823 arranges for this to happen (which is waiting for rnewman's review ;)

::: services/sync/tests/unit/test_bookmark_tracker.js
(Diff revision 2)
>      _("Clean up.");
>      yield cleanup();
>    }
>  });
>  
> -add_task(function* test_onItemExcluded() {

I'd have thought we want a test very similar to this, but instead of the anno we just want bookmarks outside of our roots?
Oh hey, a user got bit by us not backing up mobile bookmarks!

https://support.mozilla.org/en-US/questions/1126174
(In reply to Richard Newman [:rnewman] from comment #34)
> Oh hey, a user got bit by us not backing up mobile bookmarks!
> 
> https://support.mozilla.org/en-US/questions/1126174

Actually, IIUC, this bug isn't where that will be fixed - that root will still have the anno, and it's the places backup code itself that looks for that anno. It may be bug 647605 where we fix this, but we really need to do 2 things - (a) stop adding that anno to the mobile root when we create it and (b) work out how to sanely remove that anno for an already created mobile root.
Comment on attachment 8786918 [details]
Bug 1274496 - Filter excluded bookmarks at sync time based on their root.

https://reviewboard.mozilla.org/r/75784/#review73788

> Take this opportunity to add `{}`.
> 
> But consider: the changed IDs file might be from an older version of Firefox. I don't think it hurts to remove an ignored change ID -- it shouldn't be present anyway, and if it is we do want to remove it.

I don't think we need to worry about older versions of Firefox for ignored items, unless I'm missing something. ISTM `this._ignored` is in-memory; only `this.changedIDs` is persisted to disk.

> I think the bookmark engine will want to track the record's parent here?
> 
> That is: if the incoming record is a deletion, presumably the server's parent record doesn't include the un-deleted child, and so we need to upload one.

Hmm, could you elaborate a bit, please? If the incoming record is a deletion, but we've changed the item locally (so our undeletion is newer), wouldn't we already have its parent in the tracker?

> Shouldn't these be quoted?

These are numbers, not strings, so no quotes needed.

> Is this more efficient when phrased as a left excluding join? Something like (with my de-pairing suggestion):
> 
> ```
> SELECT m.guid
> FROM modifiedGuids m LEFT JOIN syncedItems
> ON m.guid = syncedItems.guid
> WHERE syncedItems.guid IS NULL
> ```

Makes sense! The query plan still shows a table scan, but I think that's unavoidable here.

> * 99% of the time, syncID and placesGUID will be the same. The only time it won't be is for the roots. Can we special-case the roots in `getChangedIDs` to avoid having to handle pairs here?
> * JSON stringifying isn't quite the same as SQL escaping, is it?

* Sure, I'll special-case the roots. There's a lot of complexity in handling them, anyway; bug 1295521 should clean that up.
* Indeed, it's not the same as SQL escaping; Bookmarks.jsm uses it for quoting. Talking with Mak on IRC, the Storage API doesn't have a way to escape arrays (short of using positional binding params), so I settled on a `PlacesUtils.isValidGuid` check and a comment.
Comment on attachment 8786918 [details]
Bug 1274496 - Filter excluded bookmarks at sync time based on their root.

https://reviewboard.mozilla.org/r/75784/#review73838

> Is this change actually needed here? I understand you probably need this change for bug 1299338 (along with a corresponding change in the bookmarks engine so either  ignoreAll is no longer checked, or it's always false for that engine).
> 
> I don't object to the change, I'm just trying to understand it.

You're right, I don't think it makes sense to split out `_now` and `_isIgnored` quite yet. I'm going to revert this, and reinstate it in bug 1299338.

> It seems a little odd that just above we are documenting an "opaque change record" for getChangedIDs, but here we are sticking with "changed timestamp"
> 
> I think that a comment like "while change records are opaque, the default implementation here is just a timestamp - however, some engines may override this" or similar would make this clearer.

Another point in favor of a `Changeset` object. Added a comment.

> This is almost the only place remaining in this file that references `this._modified` directly - apart from the new abstractions. At face value, this seems to be increasing complexity a little.
> 
> eg, `id in this._modified` has been replaced with the abstract `this._isModified(id)` - but that's not actually overridden by the bookmarks engine, and given the use of `self._modified` here, it's not apparent that it could be.
> 
> ISTM that we might be better off with either a new method called, say `this.prepareModified()` which in the default implementation is just `this._modified = this.getAllIDs()` (making the abstraction a little more complete), or sticking with the assumption of `this._modified` being the change records and deleting `._isModified` as that helps reinforce `this._modified` is the only model used. IOW, I don't think your abstractions actually allow an engine to override the use of `this._modified`, only to override what "values" (but not what "keys") it has.
> 
> However, another option to consider might be a "ChangesetManager" or similar object with the overridable methods being on that - the core engine stays fairly clean and clear (eg, `this._modified.has(id)`, `this._modified.getTimestamp(id)` etc (assuming `this._modified` remains the name of the instance used))
> 
> (This is more an observation than anything, and the correct answer isn't clear)

That's a good point! I'm a little wary of adding new abstractions to this code, but I agree that something like a `Changeset` would be good here, and it does look rather clean. Also, you're right; engines can only override the values now, not the keys.
Thank you both for the thorough feedback! Going to push a new revision now with a new `Changeset` object and the changes to the query.
(In reply to Kit Cambridge [:kitcambridge] from comment #36)

> > That is: if the incoming record is a deletion, presumably the server's parent record doesn't include the un-deleted child, and so we need to upload one.
> 
> Hmm, could you elaborate a bit, please? If the incoming record is a
> deletion, but we've changed the item locally (so our undeletion is newer),
> wouldn't we already have its parent in the tracker?

Not necessarily. Perhaps you deleted the bookmark on your phone (uploading a deletion and its former parent with a smaller child list), and you renamed it locally. The rename will not track the parent, but you'll undelete it, so you'll need to be prepared to handle the incoming parent's mistake and reupload both records.
Writing a test for this would be really interesting. There are four scenarios:

* Parent with removed child, then deleted child
* Vice versa
* Only parent
* Only child

The end goal is that in all four cases the child we refuse to delete still exists and is still in the same parent, and in the first three cases we have queued up the parent for upload.
Blocks: 1299978
Comment on attachment 8786918 [details]
Bug 1274496 - Filter excluded bookmarks at sync time based on their root.

https://reviewboard.mozilla.org/r/75784/#review74934

Unless I'm misunderstanding it seems that .entries() call isn't going to do the right thing for bookmarks (and I might be missing something, as I'd have expected tests to pick that up?

::: services/sync/modules/engines.js:1572
(Diff revision 3)
>      if (!this._modified) {
>        return;
>      }
>  
>      // Mark failed WBOs as changed again so they are reuploaded next time.
> -    for (let [id, when] of Object.entries(this._modified)) {
> +    for (let [id, when] of this._modified.entries()) {

This seems wrong - IIUC, .entries() is explicitly returning the opaque change record, which isn't going to be the modified timestamp for bookmarks. I'd have expected you to, eg, iterate over just the IDs and call getModifiedTimestamp() on each item. IOW, this seems to break the "opaque change record" concept?

However, the fact addChangedID doesn't support the entire abstract "change record" also seems odd - moving just the modified time into this.changedIDs is obviously losing any other data being carried in the changeset - but I guess that doesn't matter here.

(I guess another way we could have approached this is to consider a changeset a "object with a minimum of a .lastModified entry" or similar, and done that for every engine, with only bookmarks adding additional entries)

::: services/sync/modules/engines.js:1673
(Diff revision 3)
> + * data for each entry.
> + */
> +class Changeset {
> +  // Creates a changeset with an initial set of tracked entries.
> +  constructor(changes = {}) {
> +    this.changes = changes;

I know rnewman has tried to talk you out of using sets before, but IMO they would be perfect here and probably more efficient

::: services/sync/modules/engines/bookmarks.js:998
(Diff revision 3)
> +    let modifiedGUIDs = this._getModifiedGUIDsAsRows();
> +    if (!modifiedGUIDs.length) {
> +      return this.changedIDs;
> +    }
> +
> +    // We can't use `PlacesUtils.promiseDBConnection` here, because we run Sync

I don't think this comment is true any more? (ie, we no longer call PlacesUtils.bookmarks.runInBatchMode in bookmarks.js)
Attachment #8786918 - Flags: review?(markh)
Comment on attachment 8786918 [details]
Bug 1274496 - Filter excluded bookmarks at sync time based on their root.

https://reviewboard.mozilla.org/r/75784/#review74934

It's doing the right thing, but correctly explaining it is tricky.

> This seems wrong - IIUC, .entries() is explicitly returning the opaque change record, which isn't going to be the modified timestamp for bookmarks. I'd have expected you to, eg, iterate over just the IDs and call getModifiedTimestamp() on each item. IOW, this seems to break the "opaque change record" concept?
> 
> However, the fact addChangedID doesn't support the entire abstract "change record" also seems odd - moving just the modified time into this.changedIDs is obviously losing any other data being carried in the changeset - but I guess that doesn't matter here.
> 
> (I guess another way we could have approached this is to consider a changeset a "object with a minimum of a .lastModified entry" or similar, and done that for every engine, with only bookmarks adding additional entries)

Correct. In this case, we want it to return the full record to add back to the tracker; not just the timestamp. I think part of the confusion here is that the engine creates the `Changeset`, but the tracker supplies the values. During the sync, the engine mutates the set, then writes it back to the tracker.

Another design to consider:

* The tracker grows `pullNewChanges`, `pullAllChanges`, and `pushChanges` methods. `pullNewChanges` and `pullAllChanges` return `Changeset` objects and clear the tracker; `pushChanges` takes a `Changeset` and adds it back to the tracker. (I'm using "pull" and "push" in the VCS sense; implying they can modify state instead of a "get". Bug 1258127 also calls them `pullNewChanges` and `pullAllChanges`).

* The engine's `getChangedIDs` and `getAllIDs` methods would call `pullNewChanges` and `pullAllChanges`.

* `_syncCleanup` would call `this._tracker.pushChanges(this._modified)`. The default implementation iterates over the entries and calls `addChangedID`, like we do here.

> I know rnewman has tried to talk you out of using sets before, but IMO they would be perfect here and probably more efficient

We'd need a `Map` here, since it's ID => change record. If you don't feel strongly about it, I'd rather keep it as a plain object; I find maps cumbersome (and slower) to use for simple string keys. But I'll change it if you think it's better. :-)

> I don't think this comment is true any more? (ie, we no longer call PlacesUtils.bookmarks.runInBatchMode in bookmarks.js)

Good catch! The comment is wrong, but not the code. If we use `promiseDBConnection`, `test_{nested_}batch_tracking` will need to be updated, since they call `verifyTrackedCount` from within the batch, and won't see the `createFolder` or `insertBookmark` calls until they're out of the batch. This works in bug 1258127 because we bump the change counter when we insert the bookmark, but not here because we track and filter.
Comment on attachment 8786918 [details]
Bug 1274496 - Filter excluded bookmarks at sync time based on their root.

https://reviewboard.mozilla.org/r/75784/#review73788

> But we don't know if this record was previously uploaded and is now changed, or if it's modified because it's new, or if it's modified because we're re-syncing from scratch.
> 
> When you call `replaceModified`, you remove this ID from the set that we're going to upload. That will potentially leave the duped record on the server.
> 
> I think here you should consider uploading a deletion.

I think Mark's patch for bug 1276823 will land first, and this will need a rebase. For now, it just moves the assignments into a `swap` method.

> Hmm, could you elaborate a bit, please? If the incoming record is a deletion, but we've changed the item locally (so our undeletion is newer), wouldn't we already have its parent in the tracker?

I filed bug 1299978 for this.
Comment on attachment 8786918 [details]
Bug 1274496 - Filter excluded bookmarks at sync time based on their root.

https://reviewboard.mozilla.org/r/75784/#review74934

> Correct. In this case, we want it to return the full record to add back to the tracker; not just the timestamp. I think part of the confusion here is that the engine creates the `Changeset`, but the tracker supplies the values. During the sync, the engine mutates the set, then writes it back to the tracker.
> 
> Another design to consider:
> 
> * The tracker grows `pullNewChanges`, `pullAllChanges`, and `pushChanges` methods. `pullNewChanges` and `pullAllChanges` return `Changeset` objects and clear the tracker; `pushChanges` takes a `Changeset` and adds it back to the tracker. (I'm using "pull" and "push" in the VCS sense; implying they can modify state instead of a "get". Bug 1258127 also calls them `pullNewChanges` and `pullAllChanges`).
> 
> * The engine's `getChangedIDs` and `getAllIDs` methods would call `pullNewChanges` and `pullAllChanges`.
> 
> * `_syncCleanup` would call `this._tracker.pushChanges(this._modified)`. The default implementation iterates over the entries and calls `addChangedID`, like we do here.

I went ahead and implemented that. The names might need some bikeshedding, but I'm happy with how it cleaned up the tracker test.
Attachment #8786918 - Flags: review?(rnewman)
Comment on attachment 8786918 [details]
Bug 1274496 - Filter excluded bookmarks at sync time based on their root.

https://reviewboard.mozilla.org/r/75784/#review75342

LGTM, thanks.
Attachment #8786918 - Flags: review?(markh) → review+
Sigh. I had to move `pullNewChanges` and `pullAllChanges` to the engine, because some of our engines override `Engine::getChangedIDs` to bypass the tracker. I think `getAllIDs` should really live in the tracker, but that's not backward-compatible. (My concern is that `getAllIDs` is a public API, and I seem to recall Mark cautioning me about third-party sync engines).

Not as clean, but should still let us swap in the new tracker with minimal fuss.
Comment on attachment 8786918 [details]
Bug 1274496 - Filter excluded bookmarks at sync time based on their root.

https://reviewboard.mozilla.org/r/75784/#review76444

::: services/sync/modules/engines/bookmarks.js:493
(Diff revision 5)
> +        // The `===` check also filters out old persisted timestamps,
> +        // which won't have a `deleted` property.
> +        continue;
> +      }
> +      let guid = BookmarkSpecialIds.syncIDToPlacesGUID(syncID);
> +      if (!PlacesUtils.isValidGuid(guid)) {

I worry that there most definitely are GUIDs in the database that don't match `PlacesUtils.isValidGuid` -- there's nothing to stop add-ons (or Sync, back in the day!) from setting the GUID of a row to something containing quotes.

I'd love to see telemetry for this, or even automated recovery.

::: services/sync/modules/engines/bookmarks.js:500
(Diff revision 5)
> +        // validate it just in case before quoting.
> +        throw new Error(`Invalid Places GUID ${guid} for sync ID ${syncID}`);
> +      }
> +      // Ideally, we'd use bound params instead of SQL string literals, but the
> +      // Storage API doesn't support escaping arrays yet.
> +      guids.push(`("${guid}")`);

It doesn't need to. Should you be inspired…

Return an array of GUIDs here.

Build the query as:

```
modifiedGUIDs(guid) AS (
  VALUES (?), (?), (?) …
)
```

by:

```
",(?)".repeat(guids.length).substring(1)
```

and then bind the params:

```
for (let i = 0; i < guids.length; ++i) {
  stmt.bindStringParameter(i, guids[i]);
}
```

This will work fine for up to 999 GUIDs. Above that you'll need to chunk.
Attachment #8786918 - Flags: review?(rnewman) → review+
Blocks: 1302286
Pushed by kcambridge@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/40663c31d2d6
Filter excluded bookmarks at sync time based on their root. r=markh,rnewman
https://hg.mozilla.org/mozilla-central/rev/40663c31d2d6
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 51
Depends on: 1310525
Duplicate of this bug: 1006652
Duplicate of this bug: 559154
You need to log in before you can comment on or make changes to this bug.