Closed Bug 1012597 Opened 7 years ago Closed 6 years ago

promiseItemGUID()/promiseItemID() may return wrong values if db changes underneath it.

Categories

(Toolkit :: Places, defect)

defect
Not set
normal
Points:
3

Tracking

()

RESOLVED FIXED
mozilla41
Iteration:
41.1 - May 25
Tracking Status
firefox41 --- fixed

People

(Reporter: markh, Assigned: mak)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 1 obsolete file)

Sync might change the GUID of an item directly using a storage statement.  Once this has been done, PlacesUtils.promiseItemGUID()/promiseItemID() will resolve with the old ID instead of the new.

See attached quick-and-dirty xpcshell test - it will fail with:

 0:03.54 TEST-UNEXPECTED-FAIL | toolkit/components/places/tests/unit/test_guidChange.js | "h8qhQzqWnMPe" == "abcdefghijkl" - See following stack:
...

It looks like an "item updated" observer in placesutils should solve it though, albeit with possibly some perf hit...

Nominating for backlog as there are probably sync-based scenarios where this could cause real problems.
Flags: firefox-backlog+
ugh, we should probably provide an util to set the guid and clear the cache through it.
Luckily these are not yet widely used.
(In reply to Mark Hammond [:markh] from comment #0)
> It looks like an "item updated" observer in placesutils should solve it
> though, albeit with possibly some perf hit...

if it's done through a direct SQL statement we won't observe anything. fwiw bookmarks api now has the possibility to set the guid when creating a bookmark, maybe Sync should just do that.
(In reply to Marco Bonardo [:mak] from comment #2)
> if it's done through a direct SQL statement we won't observe anything. fwiw
> bookmarks api now has the possibility to set the guid when creating a
> bookmark, maybe Sync should just do that.

IIUC, Sync is changing the GUID of an existing record based on an "incoming" record.
Yeah, this is the reconciling case -- remote wins.

A changeGUID util (with appropriate footgun docs, lest someone think using it is a good idea!) is our only option, really, unless we want to push record application logic down into Places.
(In reply to Richard Newman [:rnewman] from comment #4)

> A changeGUID util (with appropriate footgun docs, lest someone think using
> it is a good idea!) is our only option, really, unless we want to push
> record application logic down into Places.

So Sync could change to use this changeGUID function, but there still seems a risk that other code (eg, addons) could still hit the DB directly.  Unfortunately, the failure-case for this *isn't* the Sync or other code hitting the DB directly - it's other code in the system that uses the promise-based functions - IOW, code hitting the DB directly can cause other unrelated code to fail in subtle ways.

Do we have any insights into what this cache is actually buying us?
(In reply to Mark Hammond [:markh] from comment #5)
> So Sync could change to use this changeGUID function, but there still seems
> a risk that other code (eg, addons) could still hit the DB directly. 

This risk will exist forever, there's no way we can prevent an add-on from writing to profile files.

> Unfortunately, the failure-case for this *isn't* the Sync or other code
> hitting the DB directly - it's other code in the system that uses the
> promise-based functions - IOW, code hitting the DB directly can cause other
> unrelated code to fail in subtle ways.
> 
> Do we have any insights into what this cache is actually buying us?

we are migrating APIs from ids to GUIDs, that means until we are done a lot of existing methods will know ids but will need guids to call new methods, querying the database every time would bring the system to his knees.

The problem is not just promiseItemGUID fwiw, the problem is also in results, each node has a .bookmarkGUID property and by changing through a direct query they won't see the change as well.

We may need to put this change guid method into nsNavBookmarks an fire a notification and handle the notification in various places.

Btw, I'd like to reach an equilibrium where Sync doesn't run any direct query. While we can't control add-ons we can control our own code, at least.

If that means adding new APIs just for Sync we will do that, it's unfortunate we can't come up with a solution that doesn't request not changing guids, I was wondering if we couldn't just remove the local entry and create a new one with the remote guid in place of it? I'd like to understand better the reconciliation algorithm.
Richard, could you please keep our hand and drive us again through the reconciling problem and how it works today?
Flags: needinfo?(rnewman)
There's a little bit of relevant backstory in Bug 627487 Comment 21, but I can't find any of the other times we've discussed this over the past three years or so, so here goes...

Sync aims to do structural reconciling for bookmarks, and equivalence reconciling for everything else.

That is to say, if you have two devices, you bookmark mozilla.org on your bookmark toolbar on each, and then sync them, you shouldn't get two different copies of mozilla.org when the sync is done. The same is true for history: if you visit mozilla.org on two different devices at the same time, syncing those two devices shouldn't leave two places in the DB with the same URL.

When Sync downloads a record it tries to find an equivalent locally. It first tries the GUID; if it finds a match that way, we're done -- apply changes to that record.

If it doesn't find the GUID, it searches for characteristics. History uses the URL, of course. Bookmarks uses a map (_guidMap), lazily constructed by walking the whole DB. (Yes, I know: Bug 655722.)

That map is basically like this:

  {
    "Bookmarks Toolbar": {                          // Parent title.
      "bhttp://mozilla.org/:Mozilla": "someguid",   // A bookmark.
      "s1": "otherguid",                            // A separator.
    }
  }

We download a record, look at the parent name, compute the key, and see if we find a match.

If we find a match, what can we do?

We don't want to insert a duplicate record. We need to make sure that this device and your other syncing devices use the same GUIDs to refer to the same record. We don't have a mechanism for maintaining GUID equivalence -- what would we do when you modify one record or another? Complex. So we have no choice: one GUID has to be replaced by the other.

We assume that the server is always right (because a correct post-reconciling client uploaded the record!), and changing the server record just means N clients need to change, so we replace the local GUID.

Currently we do that by explicitly changing the GUID directly in SQL (see _setGUID in bookmarks.js). 

We change the GUID rather than deleting and re-adding for several reasons:

* Perf. A merge after a restore (prior to Bug 627487/Bug 818589) would require rewriting the GUIDs for the entire bookmarks tree. Deleting and re-adding one by one would be even more horrific than running N queries to change the GUID of each in turn.

* Data loss. I have no confidence that the resulting bookmarks tree would be identical-but-for-guid to the tree prior to GUID change, because we'd be relying on Sync's correctness to delete-then-add with all the right fields, and I have no confidence in Sync! This is amplified by Sync's single-step nature: Bug 814801.

* We would break queries and add-ons that use the numeric place ID rather than the GUID.

* Hysterical raisins: there are a few bits of Places that were mostly motivated by Sync, and I understand that the bookmark GUID is one of them. Sync's code tends to behave as if it owns the place, so to speak.


That's the long of it.

The short of it: I would be totally happy calling an asynchronous equivalent of the SQL query we run right now; I too am keen to have Sync run less raw SQL.

I don't think that switching to delete-then-add will yield happiness, for all of the reasons above. It's outwardly simple, but both Sync and Places are complex beasts.

Let me know if you need more rambling, and I would be happy to provide.
Flags: needinfo?(rnewman)
(In reply to Richard Newman [:rnewman] from comment #8)
> We change the GUID rather than deleting and re-adding for several reasons:
> 
> * Perf. A merge after a restore (prior to Bug 627487/Bug 818589) would
> require rewriting the GUIDs for the entire bookmarks tree. Deleting and
> re-adding one by one would be even more horrific than running N queries to
> change the GUID of each in turn.

So, sounds like we can fix this by restoring guids when the json is restored? We currently save guids but we don't restore them yet, iirc.

> * Data loss. I have no confidence that the resulting bookmarks tree would be
> identical-but-for-guid to the tree prior to GUID change, because we'd be
> relying on Sync's correctness to delete-then-add with all the right fields,
> and I have no confidence in Sync! This is amplified by Sync's single-step
> nature: Bug 814801.

I wonder if we may provide a cloneItem API...

> * We would break queries and add-ons that use the numeric place ID rather
> than the GUID.

I'm not really worried about this, since the current behavior is already breaking all Places APIs using GUIDs and all results. Regardless queries and add-ons MUST listen to bookmarks changes, or they are broken already.

> * Hysterical raisins: there are a few bits of Places that were mostly
> motivated by Sync, and I understand that the bookmark GUID is one of them.
> Sync's code tends to behave as if it owns the place, so to speak.

It was a good move, we are indeed moving towards using GUIDs more.

I think the only real issue is understanding how many concurrent guid changes we may be doing today to figure how much more expensive would be to replace bookmarks in place. That said, I'll discuss this with Mano and let's see if we can come up with an API that Sync can use more easily.
> So, sounds like we can fix this by restoring guids when the json is restored? We currently > save guids but we don't restore them yet, iirc.

Now that the new/insert take an optional guid argument, we can.
(In reply to Marco Bonardo [:mak] from comment #9)

> So, sounds like we can fix this by restoring guids when the json is
> restored? We currently save guids but we don't restore them yet, iirc.

Definitely worth doing.

To be clear: restoring GUIDs is great -- it makes life simpler and more efficient -- but it won't eliminate Sync's need to change GUIDs.

Valid dupes can be created through lots of avenues, ranging from the complex (Xmarks, add-ons) to the simple (bookmarking the same site on two machines).

Places must expose some mechanism to change an item's GUIDs. Right now that's SQL...


> I wonder if we may provide a cloneItem API...

You could for other reasons, but it wouldn't be ideal for Sync. Sync would no have other reason to clone an item than to implement a changeGUID method.


> I'm not really worried about this, since the current behavior is already
> breaking all Places APIs using GUIDs and all results. Regardless queries and
> add-ons MUST listen to bookmarks changes, or they are broken already.

I think either case presupposes that change notifications are sent and handled for both GUIDs and place IDs. I don't think we do either right now?


> I think the only real issue is understanding how many concurrent guid
> changes we may be doing today to figure how much more expensive would be to
> replace bookmarks in place.

Anywhere from 0 to 30,000 at a time. Steady-state: ~0.

(Also, not concurrent: each of Sync's changes is made serially. They're also made independently, because we don't want to hold a transaction open for anywhere from 150 milliseconds to fifteen minutes.)

The very best thing you could do for Sync-related perf is replace guidMap: that is, expose some primitive for passing in a bookmark-shaped thing and finding out if an equivalent record already exists.
(In reply to Richard Newman [:rnewman] from comment #11)
> (In reply to Marco Bonardo [:mak] from comment #9)
> Places must expose some mechanism to change an item's GUIDs. Right now
> that's SQL...

all insert/create APIs allow to define a guid nowadays. Imo it's matter of figuring what blocks us from using delete/add path rather than change guid path (sounds like it's number of concurrent changes of this type).

> > I wonder if we may provide a cloneItem API...
> 
> You could for other reasons, but it wouldn't be ideal for Sync. Sync would
> no have other reason to clone an item than to implement a changeGUID method.

it's easier to listen to remove/add items for consumers than to guid-changed it's even more backwards compatible since any consumer listens to remove/add

> > I'm not really worried about this, since the current behavior is already
> > breaking all Places APIs using GUIDs and all results. Regardless queries and
> > add-ons MUST listen to bookmarks changes, or they are broken already.
> 
> I think either case presupposes that change notifications are sent and
> handled for both GUIDs and place IDs. I don't think we do either right now?

there is no change notification for things that ideally never change for the lifetime of an object, like place/item id and guid.

> > I think the only real issue is understanding how many concurrent guid
> > changes we may be doing today to figure how much more expensive would be to
> > replace bookmarks in place.
> 
> Anywhere from 0 to 30,000 at a time. Steady-state: ~0.

Right, we should figure ways to reduce those 30 000 to 30. What causes us to change 30 000 guids? is it just restore? we can fix these pathological cases if pointed out properly

> The very best thing you could do for Sync-related perf is replace guidMap:
> that is, expose some primitive for passing in a bookmark-shaped thing and
> finding out if an equivalent record already exists.

we may also do this in addition, sounds like a searchItemBy...
but at the moment the priority is not to improve performance. We should fist solve the issue by which Sync changes guids with a raw query and that breaks all guid consumers.

let's start from restoring guids from json: bug 967204

If you have other use-cases where we need to change guid after a bookmark has been created (since doing on creation is already supported, even if I suspect Sync is not taking advantage of that) I'd love to have them listed here so we can fix those.
Depends on: 967204
(In reply to Marco Bonardo [:mak] from comment #12)

> it's easier to listen to remove/add items for consumers than to guid-changed
> it's even more backwards compatible since any consumer listens to remove/add

Then might I suggest having PlacesUtils.changeGUID, which can be safe, well-tested, preserve ordering correctly, etc.? It can be implemented in terms of add/remove, by all means (Sync won't be tracking while it's doing this work).

I'd like us to avoid putting more logic into Sync, when things like CloudSync will end up doing similar work.

> Right, we should figure ways to reduce those 30 000 to 30. What causes us to
> change 30 000 guids? is it just restore? we can fix these pathological cases
> if pointed out properly

I think it's an open set. We don't know how Xmarks handles GUIDs, for example. Bad things probably happen if a user has a bookmark-generating add-on installed on two machines, or if they round-trip bookmarks into Internet Explorer or Chrome and then back into a new Firefox profile a few months later.

There will be dupes if a user copied a profile a long time ago. Lots of potential reasons. Bookmark backups are probably the big one.

I definitely agree that we should be trying to reduce opportunities for this kind of work.

> (since doing on creation is already supported, even if I
> suspect Sync is not taking advantage of that)

Correct, it is not. I'll file a bug that someone can pick up to do that; it should definitely be fixed before _setGUID is reimplemented, or we'll end up doing create/delete/create every time we add a new record.
Depends on: 1019226
(In reply to Richard Newman [:rnewman] from comment #13)
> (In reply to Marco Bonardo [:mak] from comment #12)
> 
> > it's easier to listen to remove/add items for consumers than to guid-changed
> > it's even more backwards compatible since any consumer listens to remove/add
> 
> Then might I suggest having PlacesUtils.changeGUID, which can be safe,
> well-tested, preserve ordering correctly, etc.? It can be implemented in
> terms of add/remove, by all means (Sync won't be tracking while it's doing
> this work).

We can do something like that as soon as we are sure it won't be invoked 30 times a second. We didn't provide a changeGUID so far cause it's a very easily abused API. This API would be more like a replaceItem.

> I think it's an open set. We don't know how Xmarks handles GUIDs, for
> example.

We can reach them, I'm sure any method we provide to make things easier to Sync would be appreciated.

> Bad things probably happen if a user has a bookmark-generating
> add-on installed on two machines

yep, but for me a bad thing is dataloss, getting a rare dupe here and there is not a big deal. I don't think the issue here can cause dataloss?

> if they round-trip bookmarks into
> Internet Explorer or Chrome and then back into a new Firefox profile a few
> months later.

I don't think we should handle all of these user-initiated actions. if dupes happen the user can very easily cleanup after them and I don't think we are guilty for those, round-trip is prone to issues by definition. As long as we avoid dataloss I'd be fine.

> There will be dupes if a user copied a profile a long time ago. Lots of
> potential reasons.

Again, it's something so rare and advanced that we should just ignore it.

> Correct, it is not. I'll file a bug that someone can pick up to do that; it
> should definitely be fixed before _setGUID is reimplemented, or we'll end up
> doing create/delete/create every time we add a new record.

Yup, thank you, we already have 2 nice pieces of work here in the dependencies that would already greatly improve the situation.
Blocks: 1071511
so, what about in the meanwhile we make _setGUID call a PlacesUtils.invalidateGuidCache(id)?
It would not be a definitive solution, but should help with some bugs.
is setGuidForLocalId in cloudSync used at all? Remving it would be a huge win.
I don't have any good way to tell conclusively -- the cloudSync API is used by partner add-ons, and I don't have the partner add-on to check!

But I would guess that if it's in PlacesWrapper, and not used by the higher level cloudSync code (which is the case), that it's not used and is safe to remove.
(In reply to Marco Bonardo [::mak] from comment #15)
> so, what about in the meanwhile we make _setGUID call a
> PlacesUtils.invalidateGuidCache(id)?
> It would not be a definitive solution, but should help with some bugs.

That doesn't sound like a bad idea to me, but take that opinion for what it's worth :D
OK, I'll make a patch to add invalidateGuidCache and a test in Places in a dependency. I might ask you for a review, since the Places team is "me" I need help :)
Depends on: 1167522
Assignee: nobody → mak77
Attachment #8424705 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8609648 - Flags: review?(rnewman)
Attachment #8609649 - Flags: review?(rnewman)
Points: --- → 3
Flags: qe-verify-
Comment on attachment 8609648 [details] [diff] [review]
part 1: Places method to invalidate the cache

Review of attachment 8609648 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/places/PlacesUtils.jsm
@@ +2213,5 @@
> +  },
> +
> +  invalidateCacheForItemId(aItemId) {
> +    let guid = this.guidsForIds.get(aItemId);
> +    if (guid !== undefined) {

I assume that a null or "" GUID is invalid, so perhaps just `if (guid)` ?

But is there a reason not to just unconditionally delete?
Attachment #8609648 - Flags: review?(rnewman) → review+
Attachment #8609649 - Flags: review?(rnewman) → review+
Iteration: --- → 41.1 - May 25
Whiteboard: p=8
Duplicate of this bug: 1167522
I made it unconditionally delete, plus added some exceptions when updating the cache to ensure we don't insert invalid values.
Flags: in-testsuite+
https://hg.mozilla.org/mozilla-central/rev/309a1680ece8
https://hg.mozilla.org/mozilla-central/rev/6d7fa2ed4074
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
You need to log in before you can comment on or make changes to this bug.