Closed Bug 1287687 Opened 3 years ago Closed 3 years ago

Syncs happen every 90 seconds when stale device records exist.

Categories

(Firefox :: Sync, defect, P1)

48 Branch
defect

Tracking

()

RESOLVED FIXED
Firefox 50
Tracking Status
firefox47 --- unaffected
firefox48 --- fixed
firefox49 --- fixed
firefox50 --- fixed

People

(Reporter: markh, Assigned: markh)

References

Details

(Keywords: regression)

Attachments

(2 files)

Bug 1264498 (landed in 48) made a change to "stale" device records were not shown to the user. This patch had the side effect of causing a Sync to happen every 90 seconds when such stale devices exist. The reason is that on every Sync we report that we re-applied those stale devices, causing the scheduler to constantly think we had new incoming items and increase the Sync rate accordingly.

The patch here doesn't actually delete the records locally, but instead adds a "stale" flag to them. Functions that expose these device names for the UI now filter based on this flag. The end result is that subsequent Syncs no longer re-apply these records, so Sync settles down to normal.
Comment on attachment 8772290 [details]
Bug 1287687 - prevent the clients engine from scheduling syncs every 90 seconds.

Edouard and Thom, it would be great if you could both have a good look at this and try and find any holes in it, as I want to uplift this all the way to beta. The general idea is that we just hide stale, duplicate devices from all functions that expose device names outside of Sync.
Mark, are you going to ask for an uplift of this fix? Thanks
Flags: needinfo?(markh)
Comment on attachment 8772290 [details]
Bug 1287687 - prevent the clients engine from scheduling syncs every 90 seconds.

https://reviewboard.mozilla.org/r/65140/#review62240

I don't feel stellar about the fact that we don't know why this bug only became a visible problem in 50 as of around the 5th.  If it were triggered by some change in FxA, you'd expect it to be in 49 (and maybe 48?) too, so it seems likely that something else landed to cause us to have so many stale devices... which could indicate a bug elsewhere.

That said, this is still a valuable fix, and is a straightforward way of approaching the problem. r=me if you're confident that we don't want to ignore stale records coming from the server.

::: services/sync/modules/engines/clients.js:630
(Diff revision 1)
>        record.cleartext = this._remoteClients[id];
> +      if (record.cleartext.stale) {
> +        // It's almost certainly a logic error for us to upload a record we
> +        // consider stale, so make log noise, but still remove the flag.
> +        this._log.warn(`Preparing to upload record ${id} that we consider stale`);
> +        delete record.cleartext.stale;

Hm, I'm not sure I understand why we just remove the flag here. Wouldn't we want to ignore this record entirely?

Also, it seems like we might want to ues `this._log.error` so we dump an error log instead, but I could be wrong.
Attachment #8772290 - Flags: review?(tchiovoloni) → review+
Comment on attachment 8772290 [details]
Bug 1287687 - prevent the clients engine from scheduling syncs every 90 seconds.

https://reviewboard.mozilla.org/r/65140/#review62278

Same as Thom, I'm not sure I understand what's the logic behind the change in *clients.js:630* but I trust you.
Otherwise LGTM!
Attachment #8772290 - Flags: review?(edouard.oger) → review+
https://reviewboard.mozilla.org/r/65140/#review62240

> Hm, I'm not sure I understand why we just remove the flag here. Wouldn't we want to ignore this record entirely?
> 
> Also, it seems like we might want to ues `this._log.error` so we dump an error log instead, but I could be wrong.

> Hm, I'm not sure I understand why we just remove the flag here. Wouldn't we want to ignore this record entirely?

Yes - I think this patch will cause the record to be ignored completely, so we should never get here. I was considering throwing, but I felt that slightly more risky for uplift (ie, I probably would have thrown if targetting only 50)

> Also, it seems like we might want to ues this._log.error so we dump an error log instead, but I could be wrong.

SGTM, I'll do that.
Pushed by mhammond@skippinet.com.au:
https://hg.mozilla.org/integration/fx-team/rev/abd1161f6a90
prevent the clients engine from scheduling syncs every 90 seconds. r=tcsc,eoger
Comment on attachment 8772290 [details]
Bug 1287687 - prevent the clients engine from scheduling syncs every 90 seconds.

Approval Request Comment
[Feature/regressing bug #]: Bug 1264498
[User impact if declined]: Sync will sometimes run every 90 seconds causing impact for the user and our operations.
[Describe test coverage new/current, TreeHerder]: Landed with new tests
[Risks and why]: Fairly low risk limited to Sync.
[String/UUID change made/needed]: None
Flags: needinfo?(markh)
Attachment #8772290 - Flags: approval-mozilla-beta?
Attachment #8772290 - Flags: approval-mozilla-aurora?
Comment on attachment 8772290 [details]
Bug 1287687 - prevent the clients engine from scheduling syncs every 90 seconds.

This is a new regression and we care about sync. We'll have 48 beta 10 before RC to fix potential regressions. Let's take it in 48 beta 10 and aurora.
Attachment #8772290 - Flags: approval-mozilla-beta?
Attachment #8772290 - Flags: approval-mozilla-beta+
Attachment #8772290 - Flags: approval-mozilla-aurora?
Attachment #8772290 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/mozilla-central/rev/abd1161f6a90
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 50
does not apply cleanly to aurora

grafting 354913:abd1161f6a90 "Bug 1287687 - prevent the clients engine from scheduling syncs every 90 seconds. r=tcsc,eoger"
merging services/sync/modules/engines/clients.js
merging services/sync/tests/unit/test_clients_engine.js
warning: conflicts while merging services/sync/modules/engines/clients.js! (edit, then use 'hg resolve --mark')
abort: unresolved conflicts, can't continue
(use 'hg resolve' and 'hg graft --continue')
Flags: needinfo?(markh)
Version: unspecified → 48 Branch
This version of the patch applies cleanly on Aurora and Beta.

Try runs:
Aurora: https://treeherder.mozilla.org/#/jobs?repo=try&revision=aa41d240a1d0
Beta: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3b85edb67cf8
Flags: needinfo?(markh) → needinfo?(cbook)
uplift done, thanks mark!
Flags: needinfo?(cbook)
You need to log in before you can comment on or make changes to this bug.