Closed
Bug 1250531
Opened 9 years ago
Closed 9 years ago
[Sync] Duplicate devices when logout/login or disconnect/connect to sync account
Categories
(Firefox :: Sync, defect, P2)
Tracking
()
RESOLVED
FIXED
Firefox 48
People
(Reporter: aflorinescu, Assigned: lina)
References
Details
(Whiteboard: [sync-data-integrity])
Attachments
(2 files, 1 obsolete file)
STR:
1. Sign in to FF using two profiles and the same account.
2. On FF with profile 1 open few tabs, sync.
3. On FF with profile 2 sync and open Sync tabs sidebar.
4. In FF profile 1 change device name and re-sync FF profile 1 and profile 2.
5. In FF profile 1 disconnect and reconnect the sync account. (the initial tabs from step2 are still open and unchanged on FF profile 1)
6. Do a sync on FF profile 2 and refresh the Sync Tab Sidebar.
AR:
On the Sync Tab sidebar there are now two devices with the same name.
Also, in this scenario(take in account step.5) the new duplicate device will show no tabs, although there are tabs open while the initial device will show the correct tabs.
Reproducible on:
Windows10, Windows7, Linux, Mac
Note:
On mobile devices sync, the steps are reduced, just disconnecting and reconnecting the sync account will duplicate the device.(see 1209160 and 670474)
Updated•9 years ago
|
Flags: firefox-backlog+
Comment 1•9 years ago
|
||
This is a side-effect of how Sync manages devices. The new FxA device work will improve this - we probably need a meta-bug to track them.
Comment 2•9 years ago
|
||
I suspect this may be because when the user logs out, it first kills the FxA creds, which may prevent the client from cleaning itself up on the Sync server. The next step here is to investigate whether there's a straightforward solution to make this better in the near term or we should just fix it more broadly with the device management work.
Priority: -- → P2
Assignee | ||
Comment 3•9 years ago
|
||
Assigning to myself for investigation.
Assignee: nobody → kcambridge
Status: NEW → ASSIGNED
Assignee | ||
Comment 4•9 years ago
|
||
The plot thickens.
Here's the error I see on the first profile when I sign out... https://pastebin.mozilla.org/8864146 It looks like the delete succeeds, so the FxA error might be a red herring. (I set "services.sync.log.appender.dump" to "Trace" manually in services/sync/services-sync.js; otherwise, Sync resets it when I sign out).
And here's what I see on the second profile when I sync, after signing out of the first: https://pastebin.mozilla.org/8864148 ("ZhsXhMex3LNo" is the new client).
Comment 5•9 years ago
|
||
So I think we worked out that the problem is that the now-deleted client remains in memory on the other devices, and a mitigation of that would be to perform a full sync of the clients engine each sync (whereas now we do a full sync only on startup). I expect that to be reasonable for the clients engine.
I *think* the same issue will probably exist for the tabs engine too - we will still have an in-memory copy of the tabs from the now-deleted device. We probably don't want to fully sync this engine each sync, but we can probably mitigate it by ignoring any im-memory tabs from a device that the clients engine doesn't know about.
A bit of an aside here - there are 3 cases I know of that will cause the deletion of the remote device to fail:
1) The one Chris mentions - if we sign out after the token becomes invalid, so the delete fails. This is probably rare as tokens have a long life.
2) If you disconnect a device before the first Sync you also will not have a valid token.
3) Disconnecting when offline.
(1) and (2) are the same basic problem - although in (1) we've already done a login dance, just need a new token. For (2) we have never done the login dance. For (3) we are screwed, but that's probably ok
Assignee | ||
Comment 6•9 years ago
|
||
Review commit: https://reviewboard.mozilla.org/r/41187/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/41187/
Attachment #8732446 -
Flags: review?(markh)
Assignee | ||
Comment 7•9 years ago
|
||
Review commit: https://reviewboard.mozilla.org/r/41189/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/41189/
Attachment #8732447 -
Flags: review?(markh)
Assignee | ||
Comment 8•9 years ago
|
||
This is an awfully hacky fix until we can integrate the device manager. I haven't fixed up the Sync xpcshell tests yet, but is the general approach sensible?
Updated•9 years ago
|
Attachment #8732447 -
Flags: review?(markh) → review+
Comment 9•9 years ago
|
||
Comment on attachment 8732447 [details]
MozReview Request: Bug 1250531 - Only show existing remote clients in the Synced Tabs UI. r?markh
https://reviewboard.mozilla.org/r/41189/#review37805
This LGTM, but please see if you can add a simple test for this new branch (ie, ensure the list is filtered when the new function returns false.
::: services/sync/modules/SyncedTabs.jsm:124
(Diff revision 1)
> let seenURLs = new Set();
> let parentIndex = 0;
> let ntabs = 0;
>
> for (let [guid, client] in Iterator(engine.getAllClients())) {
> + if (!Weave.Service.clientsEngine.remoteClientExists(client.id)) {
let's add a small comment here indicating why we have this check.
Comment 10•9 years ago
|
||
Comment on attachment 8732446 [details]
MozReview Request: Bug 1250531 - Only show existing remote clients in the Synced Tabs UI. r=markh
https://reviewboard.mozilla.org/r/41187/#review37803
It looks like a simpler solution that doesn't require changing the semantics of the clients collection would be to have the clients engine override _processIncoming, and have it set this.lastSync = 0 before calling the base implementation?
::: services/sync/modules/engines/clients.js:155
(Diff revision 1)
> if (this._store._remoteClients[id])
> return this._store._remoteClients[id].type == DEVICE_TYPE_MOBILE;
> return false;
> },
>
> + _wipeCachedClients() {
I don't think we need this - we should startup with no remote clients and nothing flagged as changed (and the only item the clients engine will flag as changed is this client)
Attachment #8732446 -
Flags: review?(markh)
Assignee | ||
Comment 11•9 years ago
|
||
So...there's some additional complexity to consider.
* I added `_wipeCachedClients` because, even though we're pulling down the complete list from the server, deleted clients will still remain in `_remoteClients`. It's similar to the problem with bookmarks that we talked about at today's standup, where we need to compare the local database with the full collection from the server to detect corruption. In this case, we're downloading the full collection, but we still need to compare it against ClientStore to detect deletes.
Unfortunately, `_wipeCachedClients` doesn't go far enough. If we have a command that we want to send to a deleted client, we'll re-upload the record with commands for a client that will never see them.
* For the multi-device case: since we're syncing the entire collection, `count.applied` will always be > 0 for the clients collection. That causes the scheduler to think we always have incoming items (https://dxr.mozilla.org/mozilla-central/rev/6202ade0e6d688ffb67932398e56cfc6fa04ceb3/services/sync/modules/policies.js#230-232), and set the sync interval to immediateInterval.
I can work around this by either changing the condition to `numItems && data != this.service.clientsEngine.name`, or have `_reconcile` apply the record and return false. Icky.
Am I overcomplicating this? :-)
Flags: needinfo?(markh)
Comment 12•9 years ago
|
||
(In reply to Kit Cambridge [:kitcambridge] from comment #11)
> So...there's some additional complexity to consider.
>
> * I added `_wipeCachedClients` because, even though we're pulling down the
> complete list from the server, deleted clients will still remain in
> `_remoteClients`. It's similar to the problem with bookmarks that we talked
> about at today's standup, where we need to compare the local database with
> the full collection from the server to detect corruption. In this case,
> we're downloading the full collection, but we still need to compare it
> against ClientStore to detect deletes.
It looks to me like we don't actually *have* a local database in this case :/ ISTM that if we attempt to send a command to a remote client, but we restart before we've successfully synced, we lose the command (we will persist the fact the client ID has changed and needs to be updated, but IIUC we've lost the commands themselves as the locally-modified client record hasn't been persisted).
So after pulling down the entire list of remote clients, I think it's safe to discard any missing clients completely (eg, from both _remoteClients can from the list of changed clients in the tracker). But it does seem that we will want to reconcile locally-changed clients (ie, if a client is dirty, we need to ensure that after we've pulled down the new list, we re-add any commands the old version had queued and ensure it remains in the modified list, as we still need to upload that record). It *looks* like a change to the update() method would help us do the right thing (ie, instead of blindly doing the |this._remoteClients[record.id] = record.cleartext;| we could merge queued commands we have back into the record and ensure it is marked as modified and thus should be synced back out.
>
> Unfortunately, `_wipeCachedClients` doesn't go far enough. If we have a
> command that we want to send to a deleted client, we'll re-upload the record
> with commands for a client that will never see them.
Yeah, we should never do that, but I think the above should avoid it - ie, we just discard local records that don't exist from the server (including the fact we think they have changed).
>
> * For the multi-device case: since we're syncing the entire collection,
> `count.applied` will always be > 0 for the clients collection. That causes
> the scheduler to think we always have incoming items
> (https://dxr.mozilla.org/mozilla-central/rev/
> 6202ade0e6d688ffb67932398e56cfc6fa04ceb3/services/sync/modules/policies.
> js#230-232), and set the sync interval to immediateInterval.
>
> I can work around this by either changing the condition to `numItems && data
> != this.service.clientsEngine.name`, or have `_reconcile` apply the record
> and return false. Icky.
Or maybe have _reconcile be smart and actually compare the local and remote versions of the record - so even though we are always fetching remote records, we only apply ones that changed - which in most cases should be none.
> Am I overcomplicating this? :-)
No - I think I'm over simplifying it ;) I'm sure there's devil in the detail that I missed.
Flags: needinfo?(markh)
Comment 13•9 years ago
|
||
It also sounds like there's an existing latent bug here? If an incoming record has commands for a device, and we also have commands for the same device, ISTM that our commands will be lost as we've overridden our client record with the incoming one? Eg, consider 2 desktops trying to send a tab to a 3rd device that hasn't synced for some time...
Assignee | ||
Comment 14•9 years ago
|
||
Review commit: https://reviewboard.mozilla.org/r/43607/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/43607/
Attachment #8732446 -
Attachment description: MozReview Request: Bug 1250531 - Unconditionally sync the clients collection. r?markh → MozReview Request: Bug 1250531 - Only show existing remote clients in the Synced Tabs UI. r?markh
Attachment #8736956 -
Flags: review?(markh)
Attachment #8732446 -
Flags: review?(markh)
Assignee | ||
Comment 15•9 years ago
|
||
Comment on attachment 8732446 [details]
MozReview Request: Bug 1250531 - Only show existing remote clients in the Synced Tabs UI. r=markh
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/41187/diff/1-2/
Assignee | ||
Updated•9 years ago
|
Attachment #8732447 -
Attachment is obsolete: true
Assignee | ||
Comment 16•9 years ago
|
||
Getting closer! https://dxr.mozilla.org/mozilla-central/rev/494289c72ba3997183e7b5beaca3e0447ecaf96d/services/sync/tests/unit/test_syncscheduler.js#784-837 still fails when run with the other tests, though not when I comment out the ones before it.
I wonder if it has anything to do with `"bar"` not being a valid record; i.e., if I need to write a "real" cleartext like `JSON.stringify({ id: ..., name: ...}` into the store. Though I am wiping the store at the end of each test, and it passes when run independently...so something else is afoot.
Assignee | ||
Comment 17•9 years ago
|
||
Comment on attachment 8736956 [details]
MozReview Request: Bug 1250531 - Unconditionally sync the clients collection. r=markh
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/43607/diff/1-2/
Assignee | ||
Comment 18•9 years ago
|
||
Comment on attachment 8732446 [details]
MozReview Request: Bug 1250531 - Only show existing remote clients in the Synced Tabs UI. r=markh
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/41187/diff/2-3/
Comment 19•9 years ago
|
||
Comment on attachment 8736956 [details]
MozReview Request: Bug 1250531 - Unconditionally sync the clients collection. r=markh
https://reviewboard.mozilla.org/r/43607/#review40211
This is looking great - just a few trivial tweaks to go!
::: services/sync/modules/engines/clients.js:167
(Diff revision 1)
> + try {
> + SyncEngine.prototype._processIncoming.call(this);
> + // Since clients are synced unconditionally, any records in the local store
> + // that don't exist on the server must be for disconnected clients. Remove
> + // them, so that we don't upload records with commands for clients that will
> + // never see them.
This comment could also mention that we simply don't want stale devices as their list of tabs is confusing.
::: services/sync/modules/engines/clients.js:176
(Diff revision 1)
> + this._removeRemoteClient(staleID);
> + }
> + } finally {
> + this._incomingClients = null;
> + }
> + }
missing a trailing comma here.
::: services/sync/modules/engines/clients.js:450
(Diff revision 1)
> + },
> +
> + _removeRemoteClient(id) {
> + if (this._store._remoteClients[id]) {
> + delete this._store._remoteClients[id];
> + this._tracker.removeChangedID(id);
I'd be inclined to hoist this out of the if - ie, check if the id is in changedIDs even if it's *not* in _remoteClients.
::: services/sync/modules/engines/clients.js:473
(Diff revision 1)
> this.engine.localCommands = record.commands;
> - else
> + else {
> + let currentRecord = this._remoteClients[record.id];
> + if (currentRecord && currentRecord.commands) {
> + // Merge commands.
> + for (let action of currentRecord.commands) {
ISTM that if we hit this case the ID should *also* already be in changedIDs (ie, it means we have new commands for the client so should upload them later.
I wonder if it is worth adding a check for this, and if it fails log an error message (ie, a kind of non-fatal assertion)?
IOW, I think it would imply a logic problem if the ID is not in changedIDs and we should know about it somehow.
Also, if it's not too much work, I think we should ensure test_clients tests this case (if I'm reading it correctly I think it doesn't)
Attachment #8736956 -
Flags: review?(markh)
Comment 20•9 years ago
|
||
Comment on attachment 8732446 [details]
MozReview Request: Bug 1250531 - Only show existing remote clients in the Synced Tabs UI. r=markh
https://reviewboard.mozilla.org/r/41187/#review40221
Awesome, this looks great with the comma moved to the first patch!
::: services/sync/modules/engines/clients.js:180
(Diff revision 3)
> this._removeRemoteClient(staleID);
> }
> } finally {
> this._incomingClients = null;
> }
> - }
> + },
this should be in the other patch :)
Attachment #8732446 -
Flags: review?(markh) → review+
Assignee | ||
Comment 21•9 years ago
|
||
Comment on attachment 8736956 [details]
MozReview Request: Bug 1250531 - Unconditionally sync the clients collection. r=markh
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/43607/diff/2-3/
Attachment #8732446 -
Attachment description: MozReview Request: Bug 1250531 - Only show existing remote clients in the Synced Tabs UI. r?markh → MozReview Request: Bug 1250531 - Only show existing remote clients in the Synced Tabs UI. r=markh
Attachment #8736956 -
Flags: review?(markh)
Assignee | ||
Comment 22•9 years ago
|
||
Comment on attachment 8732446 [details]
MozReview Request: Bug 1250531 - Only show existing remote clients in the Synced Tabs UI. r=markh
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/41187/diff/3-4/
Assignee | ||
Comment 23•9 years ago
|
||
https://reviewboard.mozilla.org/r/43607/#review40211
> ISTM that if we hit this case the ID should *also* already be in changedIDs (ie, it means we have new commands for the client so should upload them later.
>
> I wonder if it is worth adding a check for this, and if it fails log an error message (ie, a kind of non-fatal assertion)?
>
> IOW, I think it would imply a logic problem if the ID is not in changedIDs and we should know about it somehow.
>
> Also, if it's not too much work, I think we should ensure test_clients tests this case (if I'm reading it correctly I think it doesn't)
Sure. Looks like it's too late to check the tracker here, since we clear it out in `_syncStartup`, but we can check the engine's `_modified` dict instead. Seems reasonable?
Comment 24•9 years ago
|
||
Comment on attachment 8736956 [details]
MozReview Request: Bug 1250531 - Unconditionally sync the clients collection. r=markh
https://reviewboard.mozilla.org/r/43607/#review40261
Excellent, thanks - LGTM with the log tweak
::: services/sync/modules/engines/clients.js:477
(Diff revisions 2 - 3)
> for (let action of currentRecord.commands) {
> - if (!hasDupeCommand(record.cleartext.commands, action)) {
> - record.cleartext.commands.push(action);
> + if (hasDupeCommand(record.cleartext.commands, action)) {
> + // We already know about this command.
> + continue;
> + }
> + if (!this.engine._modified[record.id]) {
I don't think this needs to be in the loop - ie, just as part of the "if" above where we know we've got *some* commands - that alone should be enough for it to be modified (and IIUC, the code as written will log one message per command and ignore every one of them, which isn't really necessary). I also think there's no need to take any action (ie, your continue) for this case - just log it and continue as normal.
Attachment #8736956 -
Flags: review?(markh) → review+
Assignee | ||
Comment 25•9 years ago
|
||
Comment on attachment 8736956 [details]
MozReview Request: Bug 1250531 - Unconditionally sync the clients collection. r=markh
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/43607/diff/3-4/
Attachment #8736956 -
Attachment description: MozReview Request: Bug 1250531 - Unconditionally sync the clients collection. r?markh → MozReview Request: Bug 1250531 - Unconditionally sync the clients collection. r=markh
Assignee | ||
Comment 26•9 years ago
|
||
Comment on attachment 8732446 [details]
MozReview Request: Bug 1250531 - Only show existing remote clients in the Synced Tabs UI. r=markh
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/41187/diff/4-5/
Assignee | ||
Comment 27•9 years ago
|
||
https://reviewboard.mozilla.org/r/43607/#review40261
> I don't think this needs to be in the loop - ie, just as part of the "if" above where we know we've got *some* commands - that alone should be enough for it to be modified (and IIUC, the code as written will log one message per command and ignore every one of them, which isn't really necessary). I also think there's no need to take any action (ie, your continue) for this case - just log it and continue as normal.
Removed the logging entirely per our discussion on IRC, since it's possible for other clients in a mesh to pull down commands for clients that they haven't changed. For example: mobile sends a URL to desktop, then laptop (third device) syncs.
Assignee | ||
Comment 28•9 years ago
|
||
Comment 29•9 years ago
|
||
Comment 30•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/c7b00e87d644
https://hg.mozilla.org/mozilla-central/rev/1d4afa71e193
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox48:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 48
Comment 32•9 years ago
|
||
I've tested on Windows 7 64bit and Mac OS X 10.9.5 using latest Nightly 48.0a1 (buildID: 20160413030239) and I have the following results:
- Desktop to Desktop - the problem isn't reproducible anymore: there is no duplicate devices.
- Mobile to Desktop (meaning that you disconnect and reconnect on Mobile and then, on Desktop, you see the dupe) - the problem is still reproducible. There are the following bugs for mobile side related to this issue: bug 670474, bug 1224278, bug 1224278.
Comment 34•9 years ago
|
||
I've tested on latest Aurora 47.0a2 (buildID: 20160418004027) and the issue on "Desktop to Desktop" side is still reproducible here.
I consider it would be a good idea to uplift this to 47. What do you think?
Flags: needinfo?(rkothari)
Flags: needinfo?(kcambridge)
Comment 35•9 years ago
|
||
(In reply to Camelia Badau, QA [:cbadau] from comment #34)
> I've tested on latest Aurora 47.0a2 (buildID: 20160418004027) and the issue
> on "Desktop to Desktop" side is still reproducible here.
>
> I consider it would be a good idea to uplift this to 47. What do you think?
I don't really consider that necessary; this isn't a trivial patch and users don't disconnect and reconnect their accounts with regularity. Taking it to beta with only a few weeks to bake on -central (which is effectively what uplifting to 47 would do) seems a greater risk than the reward.
Assignee | ||
Comment 36•9 years ago
|
||
I agree with Mark. This is a change to how we sync clients, and introduced a regression early on (bug 1262021). I'd feel a lot more comfortable if this rode the trains.
Flags: needinfo?(kcambridge)
(In reply to Camelia Badau, QA [:cbadau] from comment #34)
> I've tested on latest Aurora 47.0a2 (buildID: 20160418004027) and the issue
> on "Desktop to Desktop" side is still reproducible here.
>
> I consider it would be a good idea to uplift this to 47. What do you think?
Based on the replies from Mark and Kit, we will let this one ride the trains. Thanks for posing the question though.
Flags: needinfo?(rkothari)
Comment 38•9 years ago
|
||
Thank you for your answers!
Updated•9 years ago
|
Whiteboard: [sync-data-integrity]
You need to log in
before you can comment on or make changes to this bug.
Description
•