sync may resurrect old deleted items

NEW
Unassigned

Status

()

defect
P3
normal
3 years ago
9 months ago

People

(Reporter: designakt, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

When I reactivated an old computer, it synced accounts that I had deleted in the meantime back into my list of saved logins.
I noticed that on google.com where I usually have 2 accounts (private and mozilla) but now got back 3 other old accounts that I deleted long ago. This makes it harder to select the right login as the list is longer and it appears easier to click the wrong one.
(I also noticed that these old accounts got marked as "last used" at the time of the sync - without me using them to login.)

Expected behavior: the most recent state is synced and those deleted accounts get deleted on the old machine as well.
How old — which version of Firefox, if you know it?
Component: General → Firefox Sync: Backend
Flags: needinfo?(mjaritz)
It was a Nightly from about halve a year to a year ago.
(Now it is already updated to the newest and I don't know if we have an update history.)
Flags: needinfo?(mjaritz)
Markus, do you remember if the reconnected device had those old logins? Do you remember which device you deleted them on?
Flags: needinfo?(mjaritz)
Here's our hypothesis from the meeting:

* At some point, on a device *other* than the very old one, these logins were deleted and a "tombstone" was uploaded to the server to mark the record as deleted. All devices that synced since that date will have had the password removed (but the old device had not yet synced)
* After that deletion (but still before that old device was again synced), the user saw a "node reassignment" (ie, the server used to store the data changed, which happens from time to time.)
* This reassignment loses all server data - other devices re-upload all the data they have to repopulate the server.

note however that deletions are not persisted locally - eg, if another device sees a deleted password, it simply deletes the password locally - it doesn't persist the fact that the item was deleted. In other words, these deleted "tombstones" never survive a node reassignment, as no device has a record of older deleted items.

* The old device then reconnects to sync. It now asks the server if it has that particular login, and instead of answering "yes I do and it's deleted" the server says "nope, never heard of it".
* The old device then re-uploads the login.

So, sadly, this is a problem that could happen to all collections, and is very difficult to reliably fix. Some server side changes to make the data more robust will help in some cases, but not all (eg, password resets). Having the clients remember the deleted state for all items is a large amount of work.

So I'm really not sure what we can do about this :(
Might be time to resurrect Sync Options.

Machine wakes up. Goes to talk to Sync server. Finds that (a) it's on a different server, (b) it's been 6 months since it synced.

Prompts the user: "It's been a while since you synced with your Firefox Account. Do you want to replace your local X Y Z with the contents of the server?"

Or, perhaps, some kind of heuristic:

* If a local item was (a) created before our last sync, and (b) was previously synced
* And we have to do a full sync thanks to a node reassignment, and a majority of records exist remotely
* Assume that the remainder might have been remotely deleted.

At that point we can either drop them on the floor, or prompt the user.
(In reply to Kit Cambridge [:kitcambridge] from comment #3)
> Markus, do you remember if the reconnected device had those old logins? Do
> you remember which device you deleted them on?

I am pretty sure the old device had those old logins as I remember doing some cleanup on the new device, right after syncing everything over.
Flags: needinfo?(mjaritz)
+rfeeley - see comment 4 and comment 5
Ryan, WDYT?
Flags: needinfo?(rfeeley)
Posted image merge.png
Tough one. I've balked at this before as a foot gun. Seeing it plainly described here clears things up a bit.

Would we simply create a dialog like the merge warning? (when users try to create an account with a different username, see attached).

Or are you thinking this choice could also be forced upon users signing in from a browser for the first time?
Flags: needinfo?(rfeeley) → needinfo?(rnewman)
The particular problem described in this bug occurs only when two devices share some historical data.

That data can be passwords (this bug), but it can also occur with, say, history. History expires after sixty days! So if you:

- Visit weddingrings.com on device A. Sync.
- Turn off device A.
- Turn on device B. Sync.
- See your shameful wedding ring history, and delete it. Device B uploads a {deleted: true} record… with a TTL=60d.

Scenario 1 (applies to history):
- 61 days pass, and you turn on device A.
- You expect weddingrings.com to not be in your browsing history, but it is -- the tombstone evaporated from the server in the past two months.

Scenario 2 (the only way this passwords bug can occur):
- You get assigned to a new Sync server. You have no insight into this process.
- Turn on device A.
- Your shameful weddingrings.com history is synced *back to device B*, because device A does a full re-sync. Oops.


###
Sidenote about Sync Options
A new device has only one related worry: that local crud escapes to your other devices. Say you decide to try out Windows 10, so you browse around CNet and Reddit and such. You decide that you're going to keep this machine, so you sign in to your main Firefox Account… oops! Everything got merged.

Sync Options existed to address the latter situation. As an expert user with that new Windows 10 machine, you just pick "Wipe this device and replace with server contents" during setup, and you get what you want.
###


The scenario in this bug is more difficult and a bit different, because the user is *already* signed in, and the resurrecting sync occurs automatically as soon as they launch Firefox.


It's tempting to say "easy! longer TTLs!" as a solution for this problem. Well, yeah, it'd help, if we had durable server storage. But we'd still lose tombstones in a bunch of situations (e.g., format version bumps). Maybe that's uncommon enough.



UX-wise we can decide that perhaps we will _not_ automatically sync in this situation, and offer to do something else.

This is tricky: the user might not know if they were in sync before (and are thus safe to wipe), and they might not understand the question, leading them to do something like sign out and sign back in again (which will demonstrate the bug!). We don't want a user with a skewed clock to see scary questions, and we certainly don't want to quietly throw away that user's new passwords.

So here's my proposal in two parts (which should probably end up in two different bugs).

1. Bump TTLs for deleted records to a much longer duration. That _should_ ensure that this bug only occurs when a syncID changes, at the cost of storing some small records for a bit longer.

2.

If:
  - The syncID for a collection has changed, and this device isn't the one that changed it. That means we'll definitely have lost tombstones.
  - There are local records not present on the server. That means there might have been tombstones.
  - The previous last sync timestamp for that collection is in the distant past, but later than the last modified timestamp of those records (assuming accurate clock). That is: last time we synced successfully, we should have uploaded those records, so if they're missing, it's almost certainly tombstones.

Then:
  - Ask the user: "It looks like you haven't synced this device in a while. Do you want to make the {passwords} on this device the same as the ones on {Richard's iPhone}?"
  - If they choose "Match my other device", wipe the local device and continue the sync.
  - If they choose "Merge", do what we do now.


This approach:

- Requires a little new code, but nothing too absurd.
- Should not hit false positives very often.
- Tries to explain the situation in a way the user will understand.
Flags: needinfo?(rnewman)
See Also: → 1332556
Mark to see how easy it is to get telemetry on this to measure how worthwhile it will be.
Flags: needinfo?(markh)
Let's make this about sync in general rather than just the passwords engine.
Summary: reactivating an old computer synced old deleted accounts back into my saved logins → sync may resurrect old deleted items
Duplicate of this bug: 1217017
(In reply to Mark Hammond [:markh] from comment #11)
> Mark to see how easy it is to get telemetry on this to measure how
> worthwhile it will be.

I'm not sure why I suggested we get telemetry for this - we *know* it happens and we *know* actually occurring scenarios that cause it.

In general, I agree with comment 10 - not because I *like* it, but because I see no other option :) Some questions:

(In reply to Richard Newman [:rnewman] from comment #10)
> So here's my proposal in two parts (which should probably end up in two
> different bugs).
> 
> 1. Bump TTLs for deleted records to a much longer duration. That _should_
> ensure that this bug only occurs when a syncID changes, at the cost of
> storing some small records for a bit longer.

Agreed - that's a client-only change too, right?

> 2.
> 
> If:
>   - The syncID for a collection has changed, and this device isn't the one
> that changed it. That means we'll definitely have lost tombstones.

IIUC, the sync ID changes on node-reassignment, so to clarify, I believe we still want this behaviour when we notice the sync ID has changed externally, right?

>   - There are local records not present on the server. That means there
> might have been tombstones.

That seems tricky - isn't that a common case when there is a new item yet to sync? bookmarks can probably differentiate these 2 cases (ie, truly new item vs item that should be on the server but isn't), but in general the other collections can't.

Another not-particularly-useful signal would be "server has no records" - that's a great indicator tombstones are lost, but that only works for the first device to detect this scenario.

More generally though, isn't the "syncid changed" roughly the same signal as this?

>   - The previous last sync timestamp for that collection is in the distant
> past, but later than the last modified timestamp of those records (assuming
> accurate clock). That is: last time we synced successfully, we should have
> uploaded those records, so if they're missing, it's almost certainly
> tombstones.

Yep. Also, I think this *should* let us gracefully handle a possible future where some engines store tombstones locally - eg, if we can convince Mak that it's OK to store GUIDs for deleted bookmarks, or more radically, have sync create an additional database used purely for tomestones across all collections (ie, just a set of known deleted GUIDs regardless of collection, possibly mapped to a timestamp), then we might be able to loosen this a little for those engines, as a "first sync after a reset" will then be able to re-upload tombstones.

> Then:
>   - Ask the user: "It looks like you haven't synced this device in a while.
> Do you want to make the {passwords} on this device the same as the ones on
> {Richard's iPhone}?"
>   - If they choose "Match my other device", wipe the local device and
> continue the sync.
>   - If they choose "Merge", do what we do now.

I agree the UX here is much better than the old UI, and agree the conditions when it would be displayed are more appropriate than showing it on each connect like old sync did.

needinfo rnewman to clarify my questions above.

(In reply to Ryan Feeley [:rfeeley] from comment #9)
> Would we simply create a dialog like the merge warning? (when users try to
> create an account with a different username, see attached).
> 
> Or are you thinking this choice could also be forced upon users signing in
> from a browser for the first time?

Ryan, I think the tail of comment 10 clarifies this - only in exceptional cases would the "merge" dialog be shown, and when it does, it would have the text and options Richard outlines. Does this seem acceptable to you? If so, is that exact wording OK?
Flags: needinfo?(rnewman)
Flags: needinfo?(rfeeley)
Flags: needinfo?(markh)
(In reply to Mark Hammond [:markh] from comment #14)

> > 1. Bump TTLs for deleted records to a much longer duration. That _should_
> > ensure that this bug only occurs when a syncID changes, at the cost of
> > storing some small records for a bit longer.
> 
> Agreed - that's a client-only change too, right?

Yes.


> IIUC, the sync ID changes on node-reassignment, so to clarify, I believe we
> still want this behaviour when we notice the sync ID has changed externally,
> right?

Yes. Perhaps it's best expressed in the negative: if we were the ones to change the sync ID, should we be concerned?
 

> >   - There are local records not present on the server. That means there
> > might have been tombstones.
> 
> That seems tricky - isn't that a common case when there is a new item yet to
> sync? bookmarks can probably differentiate these 2 cases (ie, truly new item
> vs item that should be on the server but isn't), but in general the other
> collections can't.

These conditions stack: that is, if the sync ID changed since we were last there, *and* there are records locally that aren't remote, *and* timestamps indicate we might be about to resurrect something, *then* prompt.

Yes, there will be situations in which the second check (record presence) passes because there are new local items, rather than missing remote items -- and the timestamp check is a heuristic to try to filter those out.
Flags: needinfo?(rnewman)
Would this be client UI or on the web? (client UI is more contrained by space and format)

Richard's proposal:
  - Ask the user: "It looks like you haven't synced this device in a while. Do you want to make the {passwords} on this device the same as the ones on {Richard's iPhone}?"
  - If they choose "Match my other device", wipe the local device and continue the sync.
  - If they choose "Merge", do what we do now.

My proposal:
  - Ask the user: "This Firefox hasn’t synced with your account in a long time. What do you want to do with the bookmarks, passwords and other data in this browser?"
  - PRIMARY BUTTON: "Merge into account" do what we do now.
  - SECONDARY BUTTON: "Replace with latest", wipe the local device and continue the sync.

Does this capture the same intention?
Flags: needinfo?(rfeeley)
Client UI: this device is already syncing, so we don't have an accounts page open. We could open something to explain, but there would have to be some client UI to get them there.

Two nits on your wording:

- It's not necessarily the case that all of the data -- bookmarks AND passwords AND … -- is affected. If we know, can we use more precise wording?

- "Latest" is perhaps confusing, but I get what you're trying to achieve. Perhaps "Merge into account" and "Keep only account data" or something along those lines?
Flags: needinfo?(rfeeley)
Depends on: 1341923
See Also: → 1343103
Duplicate of this bug: 1354726
Duplicate of this bug: 1381372
How about:
"This Firefox hasn’t synced with your account in a long time. What do you want to do with the Sync data in this Firefox?"
- PRIMARY BUTTON: "Merge Local Data" (do what we do now)
- SECONDARY BUTTON: "Discard Local Data" (wipe the local device and continue the sync)
Flags: needinfo?(rfeeley) → needinfo?(rnewman)
How about:
"This Firefox hasn’t synced to your account in a long time. What would you like to do with its bookmarks, passwords and other data?"
- PRIMARY BUTTON: "Combine" (do what we do now)
- SECONDARY BUTTON: "Discard" (wipe the local device and continue the sync)
(In reply to Ryan Feeley [:rfeeley] from comment #21)
> How about:
> "This Firefox hasn’t synced to your account in a long time. What would you
> like to do with its bookmarks, passwords and other data?"
> - PRIMARY BUTTON: "Combine" (do what we do now)
> - SECONDARY BUTTON: "Discard" (wipe the local device and continue the sync)

Two concerns:

* The meaning of "discard" here hinges on the tiny word "its". If you don't read carefully, you might think that we're asking what you want to do with your Firefox Account data, and pick the wrong choice.

* If we offer two buttons, we don't have an excellent way to hint at a default. It's much more UI work, but I'd be inclined to do something a bit SuperDuper!-ey:

- Offer the choices
- Make the choices radio buttons
- Delay clickthrough, so you can't just hit Enter right away
- Include the thorough explanatory text as part of each choice.

So that might be:

---------------------------------------------------
This Firefox hasn't synced to your account in a long
time. What would you like to do with its bookmarks,
passwords, and other data?

(*) Combine this Firefox's data with your Firefox Account. This
    Firefox, and any device on which you're signed in to the 
    same account, will all have both sets of data.

( ) Discard this Firefox's data and replace it with the contents
    of your Firefox Account. Any bookmarks, passwords, and history
    on this device will be lost.

( ) Sign out and don't sync anything.

                                       [ Continue ]
--------------------------------------------------- 


We also need to figure out how to show the list of things that you're syncing. After all, we don't want to scare people by telling them they'll lose their passwords if they're only syncing history.
Flags: needinfo?(rnewman)
> We also need to figure out how to show the list of things
> that you're syncing. After all, we don't want to scare people
> by telling them they'll lose their passwords if they're only 
> syncing history.

Could we make that generic by saying "What would you like to do with the Firefox data on this device you may be syncing like bookmarks, passwords and history?"
(In reply to Ryan Feeley [:rfeeley] from comment #23)

> Could we make that generic by saying "What would you like to do with the
> Firefox data on this device you may be syncing like bookmarks, passwords and
> history?"

We could, but I'm afraid that most users won't really read that 'may', and will assume it says 'are', and that will scare them.

We will know at this point what things the user is configured to sync, so we can:

- Give an example list if we know the user is syncing all those things.
- See if l10n are OK with having a comma-separated list of nouns.
- Show a list.
- Show a picker. (This starts getting complicated.)
Is this UI in the clients like our merge warning dialog or on the web?
Flags: needinfo?(rnewman)
(In reply to Ryan Feeley [:rfeeley] from comment #25)
> Is this UI in the clients like our merge warning dialog or on the web?

I'd expect it in the client -- we could do it via the web, but we'd have to open a tab or frame to do so, and there would be some trickier coordination between sync code and content if that were the case.
Flags: needinfo?(rnewman)
Priority: -- → P3
Priority: P3 → P2
Whiteboard: [Sync Q4 OKRs]
Flags: needinfo?(markh)
Whiteboard: [Sync Q4 OKRs]
While we are considering some changes that might make some collections less susceptible to the problem described here, the root issue isn't going to go away in the current Sync. I think we can do roughly what comment 10 says with some tweaks to the algorithm Richard describes, but I think the work isn't trivial. We should revisit this once we have a better understanding of what sync.next will give us and when.
Flags: needinfo?(markh)
Priority: P2 → P3
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox
You need to log in before you can comment on or make changes to this bug.