Expand the set of IDs bookmark repair requestor requests.

RESOLVED FIXED in Firefox 56

Status

()

enhancement
P1
normal
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: tcsc, Assigned: tcsc)

Tracking

unspecified
Firefox 56
Points:
---

Firefox Tracking Flags

(firefox56 fixed)

Details

Attachments

(1 attachment)

In particular, we should request differences, structuralDifferences (since the data indicates these are rather common), as well as other cases where we could resolve the corruption if we get an ID. Specifically, deletedParents should request the parent, deletedChildren should request the child, and missingChildren should request the parent, in addition to the ids they currently request.

This is a simple patch, most of the complexity is in updating the hardcoded counts in the tests, patch incoming.
Comment on attachment 8879649 [details]
Bug 1374722 - Expand the set of ids requested during the initial repair process

https://reviewboard.mozilla.org/r/151008/#review156542

Another relatively easy thing we might be able to do is add additional metadata to the events so we could determine a kind of "category" for the missing items - for example, if the vast majority of remaining items turn out to be child records that no longer exists (ie, that simply updating the parent record to no longer reference them fixes the issue), we would get a better idea of how many repairs *would* be considered success if we actually did that parent update.

::: services/sync/modules/bookmark_repair.js:114
(Diff revision 1)
>  
>      // Note that we allow any of the validation problem fields to be missing so
>      // that tests have a slightly easier time, hence the `|| []` in each loop.
>  
>      // Missing children records when the parent exists but a child doesn't.
> -    for (let { child } of validationInfo.problems.missingChildren || []) {
> +    for (let { parent, child } of validationInfo.problems.missingChildren || []) {

I think we should add a comment saying "we can't be sure if the child is missing or our copy of the parent is wrong, so request both"

::: services/sync/modules/bookmark_repair.js:136
(Diff revision 1)
>        return ids; // might as well give up here - we aren't going to repair.
>      }
>  
> -    // Entries where we have the parent but know for certain that the child was
> -    // deleted.
> -    for (let { parent } of validationInfo.problems.deletedChildren || []) {
> +    // Entries where we have the parent but we have a record on the server that
> +    // claims the child was deleted.
> +    for (let { parent, child } of validationInfo.problems.deletedChildren || []) {

I'm not convinced this is correct - doesn't this mean we *did* see the child record, but it was marked as deleted?

::: services/sync/modules/bookmark_repair.js:146
(Diff revision 1)
>        return ids; // might as well give up here - we aren't going to repair.
>      }
>  
>      // Entries where the child references a parent that we don't have, but we
>      // know why: the parent was deleted.
> -    for (let { child } of validationInfo.problems.deletedParents || []) {
> +    for (let { parent, child } of validationInfo.problems.deletedParents || []) {

ditto here - doesn't this mean the server has a tombstone?

::: services/sync/modules/bookmark_repair.js:180
(Diff revision 1)
> +    if (ids.size > MAX_REQUESTED_IDS) {
> +      return ids; // might as well give up here - we aren't going to repair.
> +    }
> +
> +    // Cases where there's a structural difference between local and remote
> +    for (let {id} of validationInfo.problems.structuralDifferences || []) {

It's not clear to me why we trust another device more than our own (ie, couldn't we solve this simply by re-uploading our copy of the records?) - however, that assumes we have all children/parents - but if we don't, we should already have then in ids.

Similarly, if we did do this, it seems we'd want to change the responder so that the other device didn't upload a bad record - eg, if that other client happened to include a non-existing child record in the item (ie, that other client had an invalid ID), this might increase the measured corruption.

::: services/sync/modules/bookmark_repair.js:189
(Diff revision 1)
> +    if (ids.size > MAX_REQUESTED_IDS) {
> +      return ids; // might as well give up here - we aren't going to repair.
> +    }
> +
> +    // Cases where there's a non-structural difference between local and remote
> +    for (let {id} of validationInfo.problems.differences || []) {

ISTM that this wouldn't actually solve problems that can cause iOS to not sync? Also, same thought as above re "why trust them more than us?"
Attachment #8879649 - Flags: review?(markh)
Comment on attachment 8879649 [details]
Bug 1374722 - Expand the set of ids requested during the initial repair process

https://reviewboard.mozilla.org/r/151008/#review165864

::: services/sync/modules/bookmark_repair.js:136
(Diff revision 1)
>        return ids; // might as well give up here - we aren't going to repair.
>      }
>  
> -    // Entries where we have the parent but know for certain that the child was
> -    // deleted.
> -    for (let { parent } of validationInfo.problems.deletedChildren || []) {
> +    // Entries where we have the parent but we have a record on the server that
> +    // claims the child was deleted.
> +    for (let { parent, child } of validationInfo.problems.deletedChildren || []) {

Yes, and there are (at least) two ways that could have happened. either 

1. Someone revived the child (as from http://searchfox.org/mozilla-central/rev/3a3af33f513071ea829debdfbc628caebcdf6996/services/sync/modules/engines.js#1372), but didn't get the new child up (and did get the parent up, possibly later as part of an unrelated sync)
2. The child *is* deleted, but the parent was never updated.

Case 1 won't be repaired if we only request the parent.

::: services/sync/modules/bookmark_repair.js:146
(Diff revision 1)
>        return ids; // might as well give up here - we aren't going to repair.
>      }
>  
>      // Entries where the child references a parent that we don't have, but we
>      // know why: the parent was deleted.
> -    for (let { child } of validationInfo.problems.deletedParents || []) {
> +    for (let { parent, child } of validationInfo.problems.deletedParents || []) {

If we have conflicting information about a server tombstone, I don't think the right thing to do is to trust it. See my above comment about the deleted child, a slight variation of this is applies for a deleted parent.

::: services/sync/modules/bookmark_repair.js:180
(Diff revision 1)
> +    if (ids.size > MAX_REQUESTED_IDS) {
> +      return ids; // might as well give up here - we aren't going to repair.
> +    }
> +
> +    // Cases where there's a structural difference between local and remote
> +    for (let {id} of validationInfo.problems.structuralDifferences || []) {

The requestor won't be riding the trains, these issues (structuralDifferences/differences) can't cause iOS to fail to sync, and so there's no point to requesting repair for them. So I've removed them from the patch. 

Worth noting for posterity, on further thought I'm not terribly convinced by the "why trust them over us" argument\*, but there's no reason to keep this code given that the requestor is just a prototype for iOS.

\* One concrete reason is: because we know we're wrong, if multiple clients wrong then someone should replace the server data, (and we have no real metric by which to judge "rightness" here), so it might as well be the first responder to the repair request.
Comment on attachment 8879649 [details]
Bug 1374722 - Expand the set of ids requested during the initial repair process

https://reviewboard.mozilla.org/r/151008/#review167190

thanks!
Attachment #8879649 - Flags: review?(markh) → review+
Pushed by tchiovoloni@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/14ca55749ce4
Expand the set of ids requested during the initial repair process r=markh
https://hg.mozilla.org/mozilla-central/rev/14ca55749ce4
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 56
You need to log in before you can comment on or make changes to this bug.