Closed
Bug 1276969
Opened 9 years ago
Closed 9 years ago
Remove parentName checks from the bookmark validator
Categories
(Firefox :: Sync, defect, P1)
Firefox
Sync
Tracking
()
RESOLVED
FIXED
Firefox 52
People
(Reporter: tcsc, Assigned: markh)
References
Details
(Whiteboard: [data-integrity][sprint-3])
Attachments
(1 file)
When you rename a bookmark folder, it's children's parentName property should be changed on the server, but isn't (which isn't surprising, since AFAIK it's not stored anywhere on the client, and so it would be easy to forget).
Anyway, since the parentName is used during reconciliation, I'd be surprised if this couldn't cause some types of corruption.
This has to be special-cased out in TPS, since it causes `services/sync/tests/tps/test_sync.js` to fail bookmark validation (see the discussion in the comments of bug 1274394 for more).
Assignee | ||
Comment 1•9 years ago
|
||
I'm starting to wonder if some of the reconciliation/de-duping code is doing more harm than good - eg, bug 1276823 where (a) we deleted a perfectly valid separator and (b) left an invalid record on the server. Similarly here, I can imagine using an old parent name could cause problems.
Kit's tracker work may well end up with places knowing the sync-state of a bookmark. I'm starting to think that we should consider doing some of the de-duping work only when the sync-state is "unknown" (ie, an explicit and hopefully one-off reconciliation process) and not on every Sync.
![]() |
||
Updated•9 years ago
|
Whiteboard: [sync-data-integrity]
![]() |
||
Updated•9 years ago
|
Whiteboard: [sync-data-integrity] → [data-integrity]
Updated•9 years ago
|
Whiteboard: [data-integrity] → [data-integrity][sprint-3]
Updated•9 years ago
|
Rank: 1
Assignee | ||
Comment 2•9 years ago
|
||
This is tricky - we get a notification that the parent item changed, but don't know why. If we knew the title changed, it probably *is* reasonable to flag each child as also modified - but without that information, I don't see why we aren't forced to do it on any folder change - they should be relatively rare (ie, we don't get notified when a child is added to a folder, only when the folder itself changes)
Otherwise we know we are leaving the server in a bad state, which is just insane IMO. The only other real alternative is to not use parentname, but that's more difficult and I'm not sure we can sanely wait for that to happen. At least we don't need to recurse ;)
Richard, what problems do you see in marking all local children as changed on a folder change notification that could outweigh an obvious place where we leave the server in a bad state such that future de-dupes will do the wrong thing?
Flags: needinfo?(rnewman)
Comment 3•9 years ago
|
||
(In reply to Mark Hammond [:markh] from comment #1)
> I'm starting to think that we should consider doing some of the
> de-duping work only when the sync-state is "unknown" (ie, an explicit and
> hopefully one-off reconciliation process) and not on every Sync.
That seems like a good idea to me. It's already somewhat the case -- we never consult guidMap unless an incoming record's GUID isn't found, and my suggestion in Bug 747699 might scope that even further to when the parent GUID isn't found.
If you go a step further, buffering records in memory until their parent arrives, rather than adding straight to Unsorted Bookmarks, then maybe the problem largely goes away: after all, every record on the server should have a parent identified by GUID, right?
(In reply to Mark Hammond [:markh] from comment #2)
> Richard, what problems do you see in marking all local children as changed
> on a folder change notification that could outweigh an obvious place where
> we leave the server in a bad state such that future de-dupes will do the
> wrong thing?
* You make it more likely that there will be a conflict -- both sides marked as changed.
* You increase the amount of work done by all other clients. Moving or renaming a folder would force clients to download and apply all of its children again, potentially losing local changes. The more work done by a sync client, the more likely it is to screw something up.
* God forbid a client has seen the previous folder with a different GUID, and downloads one of the newly updated child records first! Different GUID, different title…
But I'm having trouble seeing where this bug would even apply. It applies in three cases that I can see:
1. When syncing down a crufty old server, and you now have a folder with the same name as the renamed folder. If we see children of the renamed folder before their real parent, we'll put them in the wrong folder. This is a problem if we are interrupted before we see all the folders and move them, and of course we prefer not to move bookmarks unnecessarily.
Note that this is really an issue with out-of-order blind application and the use of parent names in general, not really a problem of stale parent names.
2. When syncing a new device that shares structure but not GUIDs. (E.g., if a user managed to export a snapshot with different GUIDs and import it on another device, between a rename and updating the sync account, then sync.)
In this case we rely on content matching to make sure that record application can happen in any order without duplication.
This doesn't seem common to me.
3. When checking for the parent of an incoming record that we haven't seen before. That is:
* Create a folder "Foo".
* Create a bookmark in that folder.
* Sync.
* Rename the folder.
* Sync.
If a new client sees the records in one order, everything will be fine. If it sees them in the reverse order, reparenting will be required.
In the absence of a good example that's more likely: I agree that this would be nice to fix, but not if it involves cracking the nut with a sledgehammer.
This is the kind of bullshit choice that we have when change tracking happens miles away from the thing doing the change. Places knows whether the user changed the title -- why can't we?
Some observer notifications supply the changed record before and after the change, or a template of the changes. Can we make that happen here?
Flags: needinfo?(rnewman)
Assignee | ||
Comment 4•9 years ago
|
||
(In reply to Richard Newman [:rnewman] from comment #3)
> But I'm having trouble seeing where this bug would even apply. It applies in
> three cases that I can see:
I'll read this in more detail later, but FWIW, the existing TPS tests hit this. I'm not sure exactly which test it was, but those tests really are designed to test common user scenarios.
So I think you are saying it is common we'll have the the wrong parentname, but rare that we'll hit a use-case that makes it a problem? Given the parentname is carried around *just* to support that rare case, something still smells fishy here - if it's rare enough to not care about, we shouldn't bother carrying the parent name around in the first place. Conversely, if it's important enough to keep carrying the parentname around, we should make sure it is accurate.
Comment 5•9 years ago
|
||
The parent name is only used at all if the record is new.
If it's new, we're in one of two situations:
* The parent is already known by GUID.
* The parent is also new in the same or subsequent download.
If the parent is known, we will (or should) parent this record correctly regardless of the guidMap lookup -- the parent name doesn't really matter. (Perhaps we'll stuff it in Unsorted Bookmarks first, then move it; I don't remember. We shouldn't even do a content lookup if we have a folder with the parent's GUID, as noted earlier.)
If the parent is new, we'll put the bookmark somewhere. When we see the parent, we'll try to find a place for that, recursively, then move the original bookmark into the right place.
This is the philosophy of the desktop bookmarks engine: apply incoming records to storage, and trust that eventually we'll have seen everything we need to gradually move them into the right place (and won't be interrupted in so doing).
The only situation in which the parent name *really* matters is when we're trying to value-reconcile all the way up -- a full fresh no-shared-GUIDs merge -- and some structure already exists: that is, parentName's sole function is to avoid visible duplication when overlaying an alien tree.
The *only* failure mode we should see in this case is duplication at the folder level. The incoming records should maintain their tree based on remote GUID, not parent names.
My opinion is that a full fresh no-shared-GUIDs merge is fairly rare, and the named-based lookup unlikely to be seen succeeding in the wild: you need to have bookmarked the same thing on multiple computers.
All that said, the current bookmarks engine might not behave as described in practice, perhaps by simply doing too much work (using content-based lookups when not necessary), by misusing guidMap (e.g., to collide children of a folder, rather than by looking at the folder directly), or by failing to maintain remote GUIDs (needing to fall back to content-based reconciling because the remote parent has itself been duped and the historical record erased).
In that case, I think the direction to take is not to urgently try to give content-based matching more accurate information, but instead to reduce the scope of content-based matching and correct any errors.
I'm happy for you and/or Thom to walk me through what you saw with TPS if you want this process to take place in my head rather than yours :D
Comment 6•9 years ago
|
||
I didn't even get to hasDupe, which is almost certainly a horrorshow.
hasDupe is supposed to be set if there are two local records that would collide as duplicates, because bookmarks.js would otherwise make them collide on inbound processing.
There are three problems with that:
* If an error ever causes duplication, it will tend to breed.
* If a duplicate is ever removed, the other potential duplicate needs to be reuploaded.
* If there's a local duplicate and two incoming hasDupe records, *neither* will be duped. That's wrong.
iOS ignores hasDupe entirely, and just does the right thing: incoming records aren't duped into each other, and local collisions are duped.
Assignee | ||
Comment 7•9 years ago
|
||
(In reply to Richard Newman [:rnewman] from comment #5)
> The parent name is only used at all if the record is new.
Ah, that's something I haven't considered and does change my perspective.
<snip lots of useful info>
> I'm happy for you and/or Thom to walk me through what you saw with TPS if
> you want this process to take place in my head rather than yours :D
TPS doesn't actually have a "problem" other than after the test runs, the validator reports the parentname field is different between the client and the server. This will currently happen whenever the parent is renamed, and the discussion above makes me lean towards this not being an actual problem in practice (other than the more general approach of using parentname at all, as you say).
So I'm starting think this can be WONTFIX and we just have the validator ignore this field (or we just repurpose this bug to be the validator change), then we consider better handling (and/or removal of) parentname in bug 747699.
WDYT?
Comment 8•9 years ago
|
||
Triage meeting: downgrading this to a P3 because it's a relatively low-impact bug. We should also measure how frequently we de-dupe bookmarks.
Rank: 1 → 0
Priority: P2 → P3
Assignee | ||
Updated•9 years ago
|
Priority: P3 → P2
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → markh
Priority: P2 → P1
Assignee | ||
Comment 9•9 years ago
|
||
(In reply to Mark Hammond [:markh] from comment #7)
> (In reply to Richard Newman [:rnewman] from comment #5)
> > The parent name is only used at all if the record is new.
>
> Ah, that's something I haven't considered and does change my perspective.
As mentioned above, there doesn't seem enough value in trying to keep the parent name up-to-date given this is only used when the record is new and only for de-duping.
So I'm changing the scope of this bug to have the validator not report this as a problem.
Summary: Bookmark parentName isn't updated in server-side children when parent's name changes → Remove parentName checks from the bookmark validator
Comment hidden (mozreview-request) |
Reporter | ||
Comment 11•9 years ago
|
||
mozreview-review |
Comment on attachment 8793639 [details]
Bug 1276969 - Remove parentName checks from the bookmark validator.
https://reviewboard.mozilla.org/r/80360/#review79068
Looks good to me.
Attachment #8793639 -
Flags: review?(tchiovoloni) → review+
Comment 12•9 years ago
|
||
Pushed by mhammond@skippinet.com.au:
https://hg.mozilla.org/integration/autoland/rev/6a3b3e3c690e
Remove parentName checks from the bookmark validator. r=tcsc
![]() |
||
Comment 13•9 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox52:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 52
![]() |
||
Comment 14•9 years ago
|
||
Want to request uplift here?
Assignee | ||
Comment 15•9 years ago
|
||
The reporting of the validator to telemetry landed in 52, so I don't see a good reason to uplift this fix.
You need to log in
before you can comment on or make changes to this bug.
Description
•