Open Bug 1893650 Opened 25 days ago Updated 14 days ago

Glean SDK Profile Backup and Restore

Categories

(Data Platform and Tools :: Glean: SDK, task)

Tracking

(Not tracked)

People

(Reporter: chutten, Assigned: mconley)

References

(Blocks 1 open bug)

Details

(Whiteboard: [fidefe-device-migration])

Firefox is building a profile backup+restore mechanism. Data Science requires specific behaviour from the client_id on restore:

  • The profile's client_id shouldn't change, even if a backup is restored over top of it
  • The backed-up client_id should be reported alongside the current client_id in a ping reporting on the restoration

This means we need some way to back up the client_id and perform logic on restoration to report it.
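A minimal sketch of those two requirements, in Python purely for illustration. `RestorationReport` and `restore_client_id` are hypothetical names, not part of the Glean SDK, and the shape of the real ping is still to be designed.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class RestorationReport:
    """Hypothetical payload for a ping reporting on a restoration."""
    current_client_id: str    # the profile's own client_id, which stays in use
    backed_up_client_id: str  # the client_id that was stored in the backup


def restore_client_id(current_client_id: str, backed_up_client_id: str):
    """Return the client_id to keep using, plus a report carrying both ids."""
    report = RestorationReport(current_client_id, backed_up_client_id)
    # The backed-up id is only reported; it never replaces the current one.
    return current_client_id, report


current = str(uuid.uuid4())
backed_up = str(uuid.uuid4())
kept, report = restore_client_id(current, backed_up)
```

After the call, `kept` is still the profile's original client_id, and `report` holds both values for the restoration ping.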

...but what else should we back up and restore? seq? first_run_date? Are there user-lifetime metrics we want to bring with us? Or, like the client_id, should we just report them so we know what they were and what they now are?

This bug is about designing and implementing backup and restore logic for the Glean SDK that:

  • Satisfies Data Science's client_id behaviour requirements
  • Ensures that the SDK operates nicely after restoration
  • Instruments the SDK's restoration sufficiently that we can validate things went well within the SDK

And, most importantly, this bug is about working closely with the Firefox devs working on bug 1885955, so that the APIs and behaviour are what they need and expect.

Whiteboard: [fidefe-device-migration]

The profile's client_id shouldn't change, even if a backup is restored over top of it

I want to expand on this a bit to avoid confusion - the restoration mechanism for backups does not overwrite anything, or restore over top of existing data.

Let's say we have a computer with a Firefox profile A. Let's also say that we have a Firefox with a profile B. These profiles might be on the same or different devices. In some cases A and B might actually be the same device and profile, but just at different points in time!

Let's say that A has created a backup for itself. We'll call that the A-Backup.

In order to recover from the backup, one must use a running instance of Firefox, and tell it to recover from the backup archive. Let's say that B is being used to do that - to initiate recovery of the A-Backup.

What happens is that B creates a new empty profile, we'll call it C, and then copies the contents of the A-Backup into C.

What Data Science wants is that C always inherits the client ID of the profile that initiated recovery, so in this case, B. C then becomes the default user profile on the device that it's running on. It is known that B and C will then share client IDs, but the expectation is that it's unlikely that the user will go back to actively using B.

We expect the common case is that B is actually just A, but in the future after the backup was created - OR, that B is a very recently created user profile on a new machine that is being used to recover from the A-Backup.
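The A / B / C flow above can be sketched roughly as follows. This is only an illustration in Python: `Profile`, `create_backup` and `recover` are made-up names, and the real backup archive holds far more than a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    name: str
    client_id: str
    contents: dict = field(default_factory=dict)


def create_backup(profile: Profile) -> dict:
    """Snapshot a profile, including the client_id it had at backup time."""
    return {"client_id": profile.client_id, "contents": dict(profile.contents)}


def recover(initiator: Profile, backup: dict, new_name: str) -> Profile:
    """Create a fresh profile from a backup without touching the initiator.

    The new profile gets the backup's contents but the initiator's client_id.
    """
    return Profile(
        name=new_name,
        client_id=initiator.client_id,
        contents=dict(backup["contents"]),
    )


a = Profile("A", "client-id-a", {"bookmarks": ["https://example.com"]})
b = Profile("B", "client-id-b")

a_backup = create_backup(a)
c = recover(b, a_backup, "C")
```

Here C ends up with A's contents and B's client ID, B itself is left unmodified, and the A-Backup still records A's client ID so it can be reported.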

Hopefully all of this pseudo-algebra made things clearer both for you and me and for future historians, instead of complicating things. :)

I spoke with nflorez, our team's data scientist, and she writes that things like profile age should match the client ID that they are associated with. So what I expect is that most of the metadata about B should be copied over to C. But no backlogged pings.
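As a rough sketch of that expectation (Python, illustrative only): the dictionary layout and the `telemetry_state_for_new_profile` helper are assumptions, and which metadata fields actually get carried over is still an open question. The sketch copies all of them for simplicity.

```python
def telemetry_state_for_new_profile(initiator_state: dict) -> dict:
    """Build the new profile's telemetry state from the initiating profile.

    Metadata is copied so that it stays consistent with the inherited
    client ID. Backlogged pings are deliberately left behind.
    """
    return {
        "metadata": dict(initiator_state["metadata"]),
        "pending_pings": [],
    }


b_state = {
    "metadata": {"client_id": "client-id-b", "first_run_date": "2024-04-01"},
    "pending_pings": ["baseline#41", "baseline#42"],
}
c_state = telemetry_state_for_new_profile(b_state)
```

C starts with B's metadata and an empty ping backlog, while B's own state is left as it was.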

You want me to assign this to you for the proposal writing part, Mike?

Flags: needinfo?(mconley)

Yeah, I'll take this for now while I get this document off the ground.

Assignee: nobody → mconley
Flags: needinfo?(mconley)

I have a draft here: https://docs.google.com/document/d/1sXNWmgImAu3XfNAq7w-VP_MbDPZ2IyhzPsdZEWy82Gw/edit

but I think I've taken it about as far as I can without some additional guidance or feedback.

Flags: needinfo?(chutten)
Blocks: 1892744
No longer blocks: 1885955

As I said in the doc:

Seems like a good overview and explanation of what’s required. May need some more technical wrangling about who’s passing what file paths and who’s doing what I/O. For that, I’ll summon Travis.

Travis: does the proposal contain sufficient detail to design+impl the necessary APIs in the Glean SDK? I figure it won't need to concern itself with File I/O and can leave the file management to FOG. Instead it could probably return and consume some structured or opaque data as you'd like.

Flags: needinfo?(chutten) → needinfo?(tlong)

Travis: does the proposal contain sufficient detail to design+impl the necessary APIs in the Glean SDK? I figure it won't need to concern itself with File I/O and can leave the file management to FOG. Instead it could probably return and consume some structured or opaque data as you'd like.

I believe so, the requirements seem pretty clear to me. Not needing to handle storage concerns is great, and having a small-ish external API surface to get() and put() backup data seems quite bearable to me. You have my blessings as SDK tech-lead on this approach.
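To make the get()/put() idea concrete, here is a minimal sketch in Python. It is not the Glean SDK's API: `TelemetryStore`, `get_backup_data` and `put_backup_data` are hypothetical names, and JSON is just one possible encoding for the opaque data. The point is the division of labour: the SDK side only produces and consumes bytes, and the embedder (FOG, in this discussion) owns the file I/O.

```python
import json
import tempfile
from pathlib import Path


class TelemetryStore:
    """Stand-in for SDK-internal state (hypothetical, for illustration)."""

    def __init__(self, client_id: str):
        self.client_id = client_id
        self.restored_from_client_id = None

    def get_backup_data(self) -> bytes:
        """Return an opaque blob for the embedder to store in the backup."""
        return json.dumps({"client_id": self.client_id}).encode("utf-8")

    def put_backup_data(self, blob: bytes) -> None:
        """Consume a blob from a backup without changing our own client_id."""
        data = json.loads(blob.decode("utf-8"))
        # Remember the backed-up id so it can be reported on restoration.
        self.restored_from_client_id = data["client_id"]


source = TelemetryStore("client-id-a")
target = TelemetryStore("client-id-b")

# The embedder decides where the blob lives; the store never sees a path.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "telemetry-backup.bin"
    path.write_bytes(source.get_backup_data())
    target.put_backup_data(path.read_bytes())
```

After the round trip, `target` still has its own client ID and additionally knows the ID that was in the backup.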

Flags: needinfo?(tlong)

Okay, sounds like we've got high-level sign-off on this proposal? What's generally the next step for making this kind of change in Glean?

Flags: needinfo?(chutten)

The next step is to get this work assigned and implemented. We are always open to outside contributions, but this one seems to be a little bit of a deep cut to ask of an outside contributor, so let me bring this up in the next Glean SDK meeting (and/or in our team channel) and I'll get back to this bug with an ETA on when we can get this done.

Flags: needinfo?(chutten)