Bug 1530145 (Open) · Opened 6 years ago · Updated 2 years ago

Record telemetry for possible sync loops caused by the Rust merger

Categories

(Firefox :: Sync, enhancement, P3)

People

(Reporter: lina, Unassigned)

The Rust bookmarks merger now tries to fix divergent tree structure on the server: missing parents and children, multiple parents, parent-child disagreements, and invalid GUIDs.

We need to make sure this doesn't cause sync fights with Fennec (or even other Desktop clients), where the devices keep reuploading the same records even though they haven't changed locally.

One way to detect this is:

  1. Store all uploaded GUIDs, and the reason for uploading (locally new, merge conflict, or new structure), in a separate table. This table is basically a log of all uploads, keyed on (sync time, GUID).
  2. After each sync, check if any GUIDs have been uploaded for several consecutive syncs. We can use SQLite's new window functions to help; there's a small illustration below.
  3. If the number of consecutive syncs exceeds a set threshold, warn and record telemetry.

Eventually, we can consider enabling trace logging, pausing or stopping syncing, or even showing some UI to let users automatically file bugs and resume syncing. For now, let's just record that it happened, and the number of consecutive syncs.
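
To make the window-function idea from step 2 concrete, here's a tiny standalone illustration, with made-up sync times and GUIDs (window functions need SQLite 3.25 or later):

WITH uploads(at, guid) AS (
  VALUES (1000, 'bookmarkAAAA'), (1000, 'bookmarkBBBB'),
         (2000, 'bookmarkAAAA'),
         (3000, 'bookmarkAAAA')
)
SELECT guid, at, dense_rank() OVER (ORDER BY at) AS rank
FROM uploads;
/* Rows uploaded in the same sync share a rank: both uploads at 1000
   get rank 1. bookmarkAAAA is ranked 1, 2, 3: three consecutive
   syncs. */

The full sketch below builds on this trick.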

/* Log of every upload, keyed on (sync time, GUID). WITHOUT ROWID
   suits a table that's only ever accessed through its composite
   primary key. */
CREATE TABLE mirror.uploadedGuids(
  at INTEGER NOT NULL, /* Upload time, in milliseconds. */
  guid TEXT NOT NULL,
  reason INTEGER NOT NULL,
  PRIMARY KEY(at, guid)
) WITHOUT ROWID;

/* After merging, log everything we're about to upload. 'now' is
   stable within a single statement, so every row from this sync
   gets the same `at` (whole seconds, scaled to milliseconds). */
INSERT INTO mirror.uploadedGuids(at, guid, reason)
SELECT strftime('%s', 'now') * 1000, mergedGuid, uploadReason
FROM mergeStates
WHERE uploadReason <> :UPLOAD_NO_REASON
ON CONFLICT(at, guid) DO UPDATE SET
  reason = excluded.reason
WHERE excluded.reason = :UPLOAD_NEW_STRUCTURE;
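
As a standalone illustration of the conditional upsert, with made-up reason codes (1 for locally new, 3 for new structure): a conflicting row only overwrites the logged reason when the new reason is "new structure"; otherwise the insert is a no-op.

CREATE TEMP TABLE uploadLog(
  at INTEGER NOT NULL,
  guid TEXT NOT NULL,
  reason INTEGER NOT NULL,
  PRIMARY KEY(at, guid)
) WITHOUT ROWID;

INSERT INTO uploadLog VALUES (1000, 'bookmarkAAAA', 1);

/* Same (at, guid); the new reason is "new structure" (3), so it
   overwrites the old one. A conflicting row with any other reason
   would leave the logged reason untouched. */
INSERT INTO uploadLog VALUES (1000, 'bookmarkAAAA', 3)
ON CONFLICT(at, guid) DO UPDATE SET reason = excluded.reason
WHERE excluded.reason = 3;

SELECT reason FROM uploadLog; /* => 3 */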

WITH
ranks(rank, at, guid, reason) AS (
  /* First, assign the same rank to all items uploaded in the same
     sync. Dense ranking assigns consecutive rank values to
     successive syncs. */
  SELECT dense_rank() OVER (ORDER BY at), at, guid, reason
  FROM mirror.uploadedGuids
),
groupings(guid, at, grouping) AS (
  /* Next, assign groups for all reuploaded items that weren't
     changed or deleted locally. The grouping values are arbitrary,
     but identical for consecutive syncs, so grouping by (GUID,
     grouping) yields groups with rows for consecutive syncs where
     the GUID was reuploaded. */
  SELECT guid, at, rank - row_number() OVER (PARTITION BY guid ORDER BY rank)
  FROM ranks
  WHERE reason = :UPLOAD_NEW_STRUCTURE
)
SELECT guid, min(at) AS startedAt, max(at) AS stoppedAt,
       count(*) AS consecutiveReuploads
FROM groupings
GROUP BY guid, grouping
HAVING consecutiveReuploads >= 5;

/* Warning: Item ${guid} was reuploaded for ${consecutiveReuploads} syncs between ${startedAt} and ${stoppedAt} */
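
For a worked end-to-end example of the detection query, again with made-up data: bookmarkAAAA is reuploaded for new structure in five consecutive syncs, bookmarkBBBB in only the last two, so only the first crosses the threshold. Reason code 3 stands in for :UPLOAD_NEW_STRUCTURE.

WITH
uploadedGuids(at, guid, reason) AS (
  /* Five consecutive syncs at (fake) times 1-5. */
  VALUES (1, 'bookmarkAAAA', 3), (2, 'bookmarkAAAA', 3),
         (3, 'bookmarkAAAA', 3), (4, 'bookmarkAAAA', 3),
         (5, 'bookmarkAAAA', 3),
         (4, 'bookmarkBBBB', 3), (5, 'bookmarkBBBB', 3)
),
ranks(rank, at, guid, reason) AS (
  SELECT dense_rank() OVER (ORDER BY at), at, guid, reason
  FROM uploadedGuids
),
groupings(guid, at, grouping) AS (
  SELECT guid, at, rank - row_number() OVER (PARTITION BY guid ORDER BY rank)
  FROM ranks
  WHERE reason = 3
)
SELECT guid, min(at) AS startedAt, max(at) AS stoppedAt,
       count(*) AS consecutiveReuploads
FROM groupings
GROUP BY guid, grouping
HAVING consecutiveReuploads >= 5;
/* => bookmarkAAAA | 1 | 5 | 5. bookmarkBBBB's run of two stays
   under the threshold and is filtered out. */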

Downgrading this for now. We can keep an eye on the number and frequency of bookmark syncs in Firefox 70 compared to 69, and add client-side telemetry if we notice anything out of the ordinary.

Priority: P2 → P3
Severity: normal → S3