Closed Bug 1438445 Opened 6 years ago Closed 6 years ago

bookmark validator confused about orphans

Categories

(Firefox :: Sync, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
Firefox 60
Tracking Status
firefox60 --- fixed

People

(Reporter: markh, Assigned: tcsc)

Details

Attachments

(1 file)

* In about:sync, load attachment 8949791 [details]

* Note 'zduTXMKKDSIB' is said to be an orphan - but it exists only on the server as a tombstone.

* When the validator hits https://searchfox.org/mozilla-central/rev/9011be0a172711bc243e50dfca16d42e877bf4ec/services/sync/modules/bookmark_validator.js#571, it has childObject={"id":"zduTXMKKDSIB","type":"item","deleted":true,"isDeleted":true}, so is looking up undefined in the map.

* Therefore decides it is an orphan
I'll take this since I did refactored the validator in bug 1343103 (tombstones), but there's no reason it couldn't land early (minus the tombstone bits).
Assignee: nobody → tchiovoloni
This is much bigger than it needs to be to fix the bug as stated, but it does fix it. It also takes the opportunity to fix some really gross parts that were there before.

Part of this was stolen from my tombstones patch, e.g. where I fixed
`eslint-disable-next-line complexity` on compareServerWithClient, but since this is a big refactor, might as well (to be honest, it was intended to be two separate patch, but I messed up a rebase...).

Anyway, there are some notable differences between the previous code:

1. It is more considered about handling deleted items (the bug in question was one case of this, but I think there were more).

2. parentChildMismatches are computed up front. Any time your parent's children and your parentid disagree with eachother it should report, whereas the last validator managed to confuse itself into thinking things were find so long as they were basically consistent.

3. Orphans are defined differently. Previously it was something like "have a missing or unknown parentid", wheras now it's "have a missing or unknown parentid OR not be reachable following children from the root"

4. There were a few bugs around comparing GUIDs and RecordIDs that may never have caused problems, but were wrong nonetheless. These are gone.

5. The code is now commented about what it's doing, more or less, and not in a single enormous function. It also makes much more sense now.

Etc. Ok, gotta run for now.
Comment on attachment 8951348 [details]
Bug 1438445 - Refactor bookmark validator to be simpler and more correct

https://reviewboard.mozilla.org/r/220596/#review226598


Code analysis found 1 defect in this patch:
 - 1 defect found by mozlint

You can run this analysis locally with:
 - `./mach lint path/to/file` (JS/Python)


If you see a problem in this automated review, please report it here: http://bit.ly/2y9N9Vx


::: services/sync/tests/unit/test_bookmark_repair.js:78
(Diff revision 1)
>    actual.sort((a, b) => String(a.name).localeCompare(b.name));
>    expected.sort((a, b) => String(a.name).localeCompare(b.name));
> +  try {
> -  deepEqual(actual, expected);
> +    deepEqual(actual, expected);
> +  } catch (e) {
> +    dump(JSON.stringify(validationResult.problems, null, 4)+"\n");

Error: Infix operators must be spaced. [eslint: space-infix-ops]
Comment on attachment 8951348 [details]
Bug 1438445 - Refactor bookmark validator to be simpler and more correct

https://reviewboard.mozilla.org/r/220596/#review226684

LGTM and is certainly much clearer, thanks!
Attachment #8951348 - Flags: review?(markh) → review+
Comment on attachment 8951348 [details]
Bug 1438445 - Refactor bookmark validator to be simpler and more correct

https://reviewboard.mozilla.org/r/220596/#review226716

Excellent refactoring, I really like how you broke up `inspectServerRecords` into `ServerRecordInspection`! Mostly nits, but a couple more substantive questions that I'll leave to you.

::: services/sync/modules/bookmark_validator.js:21
(Diff revision 1)
>                                 "resource://gre/modules/PlacesUtils.jsm");
>  
>  ChromeUtils.defineModuleGetter(this, "PlacesSyncUtils",
>                                 "resource://gre/modules/PlacesSyncUtils.jsm");
>  
> +ChromeUtils.defineModuleGetter(this, "CanonicalJSON",

Nit: This appears to be unused?

::: services/sync/modules/bookmark_validator.js:312
(Diff revision 1)
> +    this._orphans = new Map();
> +    this._multipleParents = new Map();
> +
> +    this.maybeYield = Async.jankYielder();
> +    // Only allowed to use this once.
> +    this._ran = false;

Nit: Should we check this somewhere, or does the `if (!serverRecords)` check in `_setRecords` take care of this? (Or pass `serverRecords` directly to the constructor, and give `_setRecords` a different namee?)

::: services/sync/modules/bookmark_validator.js:333
(Diff revision 1)
> +  // We don't set orphans in this.problemData. Instead, we walk the tree at the
> +  // end to find unreachable items.
> +  _noteOrphan(id, parentId = undefined) {
> +    // We take it if we don't have anything for it, or if it provides a parentid
> +    // and we don't have one.
> +    if (!this._orphans.has(id) || (parentId && !this._orphans.get(id))) {

This check stumped me for a bit...I'm reading this as "if we haven't already recorded it as an orphan, or we're given a parent ID and we previously recorded it with an unknown parent", is that right?

Since `_noteOrphan` has three places where we call it, with different inputs (`_findOrphans` will never evaluate `(parentId && ...)`, since it passes `undefined`, and `_linkParentIDs` already knows if it's missing the ID and the node, or just the node)...you might consider inlining this into those methods.

::: services/sync/modules/bookmark_validator.js:412
(Diff revision 1)
> +
> +      if (new Set(record.children).size !== record.children.length) {
> +        this.problemData.duplicateChildren.push(record.id);
> +      }
> +
> +      // After we're through with them, folder records store 4 (ugh) arrays that

:-)

::: services/sync/modules/bookmark_validator.js:429
(Diff revision 1)
> +      // - children: This is the 'cleaned' version of the child records that are
> +      //   safe to iterate over, etc.. If there are no reported problems, it should
> +      //   be identical to unfilteredChildren.
> +      //
> +      // The last two are left alone until later `this._linkChildren`, however.
> +      record.childGUIDs = record.children;

Hmm, isn't this the other way around: `record.children` is `childIDs`, and `childGUIDs` are those IDs converted to GUIDs?

::: services/sync/modules/bookmark_validator.js:591
(Diff revision 1)
> +      await this.maybeYield();
> +      this.problemData.orphans.push({id, parent});
> +    }
> +  }
> +
> +  async _finish() {

Nit: Unless you're using `async` for consistency, this could be a vanilla function.
Attachment #8951348 - Flags: review?(kit) → review+
Comment on attachment 8951348 [details]
Bug 1438445 - Refactor bookmark validator to be simpler and more correct

https://reviewboard.mozilla.org/r/220596/#review226788

::: services/sync/modules/bookmark_validator.js:21
(Diff revision 1)
>                                 "resource://gre/modules/PlacesUtils.jsm");
>  
>  ChromeUtils.defineModuleGetter(this, "PlacesSyncUtils",
>                                 "resource://gre/modules/PlacesSyncUtils.jsm");
>  
> +ChromeUtils.defineModuleGetter(this, "CanonicalJSON",

Ah right, I had been planning on using it keep objects in a Set, but didn't end up needing it.

::: services/sync/modules/bookmark_validator.js:312
(Diff revision 1)
> +    this._orphans = new Map();
> +    this._multipleParents = new Map();
> +
> +    this.maybeYield = Async.jankYielder();
> +    // Only allowed to use this once.
> +    this._ran = false;

Whoops, yeah that's a leftover from when I handled it differently. The check in `_setRecords` does handle it now.

Essentially, if there were a way to have an async constructor, a lot more of this would be done by the constructor...

::: services/sync/modules/bookmark_validator.js:333
(Diff revision 1)
> +  // We don't set orphans in this.problemData. Instead, we walk the tree at the
> +  // end to find unreachable items.
> +  _noteOrphan(id, parentId = undefined) {
> +    // We take it if we don't have anything for it, or if it provides a parentid
> +    // and we don't have one.
> +    if (!this._orphans.has(id) || (parentId && !this._orphans.get(id))) {

The basic idea is that we have more than one reason we think this item is an orphan:

1. It doesn't have a parentid. (may never happen. The old code considered it as an orphan, but honestly I feel like we should indicate these kinds of errors as a specific `corruptRecord` error or whatever, oh well, a bug for another time)
2. It has a parentid, but it points to a record that we don't have.
3. We know it's an orphan because it's not reachable from the root.

In case 2, we want to record both the record and it's parent in the orphan object that gets written into problemData.orphans. This is for the benefit of repair and about:sync. 

In case 3 the parentid isn't relevant unless case 2 also applies.

In the case that it's called twice with a `parentid`, the code as written takes the first one, but it shouldn't happen (short of adding new validations). So, we can simplify the check to `parentId || !this.orphans.has(id)`, in which case we'll take the most recent one.
Comment on attachment 8951348 [details]
Bug 1438445 - Refactor bookmark validator to be simpler and more correct

https://reviewboard.mozilla.org/r/220596/#review226716

> Hmm, isn't this the other way around: `record.children` is `childIDs`, and `childGUIDs` are those IDs converted to GUIDs?

Hm, I don't think so? But you have a better mental model of this than I do, admittedly (I think of it mostly in terms of which ids have the underscore padding and which don't).

My impression is: places GUIDs and recordIds are the same, except the recordIDs sometimes (e.g. for roots) have underscores.

For incoming records (before we normalize them and build the tree and such), the id field and parentid field both contain recordIDs, (e.g. no underscores). The `children` array contains GUIDs (e.g. with underscores).

After this function, the `childIDs` array has the ids without underscores, and the `childGUIDs` array has the original values. That is, all three of `id`, `parentid` and `childIDs` should be using the same types of values, while `childGUIDs` is the only place where we might find a placesGUID.

> Nit: Unless you're using `async` for consistency, this could be a vanilla function.

I meant to add more jankYielding, actually.

I've also gone through and added it to more of the client records. People using the validator on trees of 15k records probably need it.
Comment on attachment 8951348 [details]
Bug 1438445 - Refactor bookmark validator to be simpler and more correct

https://reviewboard.mozilla.org/r/220596/#review226716

> Hm, I don't think so? But you have a better mental model of this than I do, admittedly (I think of it mostly in terms of which ids have the underscore padding and which don't).
> 
> My impression is: places GUIDs and recordIds are the same, except the recordIDs sometimes (e.g. for roots) have underscores.
> 
> For incoming records (before we normalize them and build the tree and such), the id field and parentid field both contain recordIDs, (e.g. no underscores). The `children` array contains GUIDs (e.g. with underscores).
> 
> After this function, the `childIDs` array has the ids without underscores, and the `childGUIDs` array has the original values. That is, all three of `id`, `parentid` and `childIDs` should be using the same types of values, while `childGUIDs` is the only place where we might find a placesGUID.

This has been replaced by using childGUIDs everywhere, which are really just record IDs. Apparently it was only GUIDs since it was accidentally uploaded in the first place (we keep the name childGUIDs for backwards compatibility).
Pushed by tchiovoloni@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/383cf4909bfb
Refactor bookmark validator to be simpler and more correct r=kitcambridge,markh
Status: NEW → ASSIGNED
Priority: -- → P1
https://hg.mozilla.org/mozilla-central/rev/383cf4909bfb
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 60
You need to log in before you can comment on or make changes to this bug.