1893650 - Glean SDK Profile Backup and Restore

Chris H-C :chutten

Reporter

Description

•

1 year ago

Firefox is building a profile backup+restore mechanism. Data Science requires specific behaviour from the client_id on restore:

The profile's client_id shouldn't change, even if a backup is restored over top of it
The backed-up client_id should be reported alongside the current client_id in a ping reporting on the restoration

This means we need some way to back up the client_id and perform logic on restoration to report it.

...but what else should we back up and restore? seq? first_run_date? Are there user-lifetime metrics we want to bring with us? Or, like the client_id, just report them so we know what they were and what they now are?

This bug is about designing and implementing backup and restore logic for the Glean SDK that:

Satisfies Data Science's client_id behaviour requirements
Ensures that the SDK operates nicely after restoration
Instruments the SDK's restoration sufficiently that we can validate things went well within the SDK

And, most importantly, this bug is about working closely with the Firefox devs working on bug 1885955, so that the APIs and behaviour are what they need and expect.

Mike Conley (:mconley) (:⚙️)

Updated

•

1 year ago

Whiteboard: [fidefe-device-migration]

Jira Integration Bot

Updated

•

1 year ago

See Also: → https://mozilla-hub.atlassian.net/browse/FIDEFE-4946

Mike Conley (:mconley) (:⚙️)

Comment 1

•

1 year ago

> The profile's client_id shouldn't change, even if a backup is restored over top of it

I want to expand on this a bit to avoid confusion - the restoration mechanism for backups does not overwrite anything, or restore over top of existing data.

Let's say we have a computer with a Firefox profile A. Let's also say that we have a Firefox with a profile B. These profiles might be on the same or different devices. In some cases A and B might actually be the same device and profile, just at different points in time!

Let's say that A has created a backup for itself. We'll call that the A-Backup.

In order to recover from the backup, one must use a running instance of Firefox, and tell it to recover from the backup archive. Let's say that B is being used to do that - to initiate recovery of the A-Backup.

What happens is that B creates a new empty profile, we'll call it C, and then copies the contents of the A-Backup into C.

What data science wants is that C always inherits the client ID of the profile that initiated recovery, so in this case, B. C then becomes the default user profile on the device that it's running on. It is known that B and C will then share client IDs, but the expectation is that it's unlikely that the user will go back to actively using B.

We expect the common case is that B is actually just A at some point after the backup was created - OR that B is a very recently created user profile on a new machine that is being used to recover from the A-Backup.
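The A/B/C flow described above can be sketched in a few lines. This is only an illustration of the described behaviour, not Firefox or Glean code: `Profile`, `create_backup`, `recover`, and the `backed_up_client_id` key are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
import uuid


@dataclass
class Profile:
    """Hypothetical stand-in for a Firefox profile; not a real Firefox type."""
    name: str
    client_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    contents: dict = field(default_factory=dict)


def create_backup(profile: Profile) -> dict:
    """Snapshot a profile's contents along with the client_id it had."""
    return {"client_id": profile.client_id, "contents": dict(profile.contents)}


def recover(initiator: Profile, backup: dict, new_name: str = "C") -> tuple[Profile, dict]:
    """Create a new profile from `backup`, with recovery initiated by `initiator`.

    Nothing in `initiator` is overwritten. The new profile takes the
    backup's contents but the initiator's client_id.
    """
    restored = Profile(
        name=new_name,
        client_id=initiator.client_id,      # inherited from B, not from the backup
        contents=dict(backup["contents"]),  # copied from the A-Backup
    )
    # Hypothetical payload for a ping reporting on the restoration:
    # the current client_id alongside the backed-up one.
    report = {
        "client_id": restored.client_id,
        "backed_up_client_id": backup["client_id"],
    }
    return restored, report


a = Profile("A", contents={"bookmarks": ["https://example.com"]})
b = Profile("B")
a_backup = create_backup(a)
c, restoration_report = recover(b, a_backup)
```

After this runs, B and C share a client ID, C holds A's data, and the report carries both IDs, which matches the two requirements quoted from the description.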

Hopefully all of this pseudo-algebra made things clearer both for you and me and for future historians, instead of complicating things. :)

Mike Conley (:mconley) (:⚙️)

Comment 2

•

1 year ago

I spoke with nflorez, our team's data scientist, and she writes that things like profile age should match the client ID they are associated with. So what I expect is that most of the metadata about B should be copied over to C. But no backlogged pings.
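As a rough sketch of that expectation: copy B's metadata into C and leave any queued pings behind. The dictionary layout and the `pending_pings` key are assumptions made up for this example; they do not reflect how the Glean SDK actually stores its data.

```python
import copy


def metadata_for_restored_profile(initiator_metadata: dict) -> dict:
    """Return a copy of B's metadata for C, without any backlogged pings."""
    carried = copy.deepcopy(initiator_metadata)
    carried.pop("pending_pings", None)  # hypothetical key; pings are not carried over
    return carried


b_metadata = {
    "client_id": "b-client-id",
    "first_run_date": "2024-04-01",  # stands in for "profile age"
    "pending_pings": [{"name": "baseline"}],
}
c_metadata = metadata_for_restored_profile(b_metadata)
```

B's own metadata is left untouched; only the copy handed to C drops the queued pings.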

Chris H-C :chutten

Reporter

Comment 3

•

1 year ago

You want me to assign this to you for the proposal writing part, Mike?

Flags: needinfo?(mconley)

Mike Conley (:mconley) (:⚙️)

Comment 4

•

1 year ago

Yeah, I'll take this for now while I get this document off the ground.

Assignee: nobody → mconley

Flags: needinfo?(mconley)

Mike Conley (:mconley) (:⚙️)

Comment 5

•

1 year ago

I have a draft here: https://docs.google.com/document/d/1sXNWmgImAu3XfNAq7w-VP_MbDPZ2IyhzPsdZEWy82Gw/edit

but I think I've taken it about as far as I can without some additional guidance or feedback.

Flags: needinfo?(chutten)

Mike Conley (:mconley) (:⚙️)

Updated

•

1 year ago

Blocks: 1892744
No longer blocks: 1885955

Chris H-C :chutten

Reporter

Comment 6

•

1 year ago

As I said in the doc:

Seems like a good overview and explanation of what’s required. May need some more technical wrangling about who’s passing what file paths and who’s doing what I/O. For that, I’ll summon Travis.

Travis: does the proposal contain sufficient detail to design+impl the necessary APIs in the Glean SDK? I figure it won't need to concern itself with File I/O and can leave the file management to FOG. Instead it could probably return and consume some structured or opaque data as you'd like.

Flags: needinfo?(chutten) → needinfo?(tlong)

Travis Long [:travis_]

Assignee

Comment 7

•

1 year ago

> Travis: does the proposal contain sufficient detail to design+impl the necessary APIs in the Glean SDK? I figure it won't need to concern itself with File I/O and can leave the file management to FOG. Instead it could probably return and consume some structured or opaque data as you'd like.

I believe so, the requirements seem pretty clear to me. Not needing to handle storage concerns is great, and having a small-ish external API surface to get() and put() backup data seems quite bearable to me. You have my blessings as SDK tech-lead on this approach.
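For illustration, a small get/put surface that exchanges opaque data and leaves all file handling to the caller might look something like the sketch below. Everything here is hypothetical (`TelemetryState`, `get_backup_data`, `put_backup_data`, the JSON layout, the `version` field); it is not the Glean SDK's API, only one way the described shape could work.

```python
import json


class TelemetryState:
    """Hypothetical in-memory SDK state; not the Glean SDK's real types."""

    def __init__(self, client_id: str, first_run_date: str):
        self.client_id = client_id
        self.first_run_date = first_run_date
        self.backed_up_client_id = None  # set once backup data has been consumed

    def get_backup_data(self) -> bytes:
        """Return an opaque blob; the caller decides where and how to store it."""
        payload = {
            "version": 1,
            "client_id": self.client_id,
            "first_run_date": self.first_run_date,
        }
        return json.dumps(payload).encode("utf-8")

    def put_backup_data(self, blob: bytes) -> None:
        """Consume a blob from get_backup_data() without changing client_id."""
        data = json.loads(blob.decode("utf-8"))
        if data.get("version") != 1:
            raise ValueError("unsupported backup data version")
        # The current client_id stays as-is; the backed-up one is only
        # remembered so it can be reported on restoration.
        self.backed_up_client_id = data["client_id"]


source = TelemetryState("a-client-id", "2023-01-15")
blob = source.get_backup_data()  # the embedding application would write this to disk

target = TelemetryState("b-client-id", "2024-04-01")
target.put_backup_data(blob)
```

Keeping the blob opaque means the embedding layer only moves bytes around, and the format can change behind the version check without touching the callers.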

Flags: needinfo?(tlong)

Mike Conley (:mconley) (:⚙️)

Comment 8

•

1 year ago

Okay, sounds like we've got high-level sign-off on this proposal? What's generally the next step for making this kind of change in Glean?

Flags: needinfo?(chutten)

Travis Long [:travis_]

Assignee

Comment 9

•

1 year ago

The next step is to get this work assigned and implemented. We are always open to outside contributions, but this one seems to be a little bit of a deep cut to ask of an outside contributor, so let me bring this up in the next Glean SDK meeting (and/or in our team channel) and I'll get back to this bug with an ETA on when we can get this done.

Chris H-C :chutten

Reporter

Updated

•

1 year ago

Flags: needinfo?(chutten)

Travis Long [:travis_]

Assignee

Updated

•

1 year ago

Assignee: mconley → tlong

Travis Long [:travis_]

Assignee

Updated

•

1 year ago

Priority: -- → P3

Travis Long [:travis_]

Assignee

Updated

•

1 year ago

Priority: P3 → P2

Travis Long [:travis_]

Assignee

Comment 10

•

1 year ago

With the requirements here being supported by just the group_id, I think we can close this as invalid. Should it turn out we do need to do something in support of this, we can reopen this as needed.

Status: NEW → RESOLVED

Closed: 1 year ago

Resolution: --- → INVALID

Bugzilla

Categories

(Data Platform and Tools :: Glean: SDK, task, P2)

Tracking

(Not tracked)

People

(Reporter: chutten, Assigned: travis_)

References

(Blocks 1 open bug)

Details

(Whiteboard: [fidefe-device-migration])

Security

(public)
