Use FxA account deletions as deletion requests in shredder
Categories
(Data Platform and Tools :: General, task, P1)
Tracking
(Not tracked)
People
(Reporter: mreid, Assigned: relud)
References
(Blocks 1 open bug)
Details
Similar to bug 1604844, this will let us relay the signal from Firefox Account deletions through to sync telemetry.
This may not actually be necessary, but if it is, we can track the work here.
Comment 1•4 years ago
It occurs to me today that the solution I proposed for relaying FxA deletion events to Amplitude in bug 1651260 could equally well work here.
The solution outlined there is to use the account deletion event logged by the FxA auth server, which is already available in BigQuery, and to hash the uid before sending to Amplitude. But sync telemetry uses the same hashed fxa uid, so this approach should work for cascading Firefox Account deletions to sync telemetry deletion.
:relud, is this something that's easy to do at the same time as the bug 1651260 work?
Comment 2•4 years ago (Assignee)
Doing it that way should be trivial. I just need to know the BigQuery table, the filtering conditions, and the hashing method.
Comment 3•4 years ago (Assignee)
> I just need to know the bq table, filtering conditions, and hashing method.
Ah, upon re-reading I see that this is about shredder for telemetry (not Amplitude) getting deletion requests from the same source that is used when forwarding FxA deletions to Amplitude. Yes, I can do that.
Comment 4•4 years ago
\o/ thanks, :relud!
Comment 5•4 years ago (Assignee)
:_6a68 what field should I use to map account deleted events to sync pings?
I see that payload.uid exists in sync_v4 pings, and TO_HEX(SHA256(jsonPayload.fields.user_id)) is used in fxa_users_daily_v1, but one is 32 characters while the other is 64 characters, and I see no overlap in values over the last week when I join on the first or last 32 characters.
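The overlap check described above can be sketched outside BigQuery as follows. This is only an illustration: the function name and the sample values are hypothetical, and the real comparison was run against sync_v4 and fxa_users_daily_v1.

```python
from typing import Iterable, Tuple


def count_truncated_matches(
    short_ids: Iterable[str], long_hashes: Iterable[str]
) -> Tuple[int, int]:
    """Count 64-char hashes whose first or last 32 chars appear among the 32-char ids.

    Hypothetical helper; it mirrors joining on the first or last 32 characters.
    """
    short = set(short_ids)
    long_list = list(long_hashes)
    prefix_matches = sum(1 for h in long_list if h[:32] in short)
    suffix_matches = sum(1 for h in long_list if h[-32:] in short)
    return prefix_matches, suffix_matches


# Made-up values standing in for payload.uid and the 64-char hash column.
sync_uids = ["0" * 32, "1" * 32]
daily_hashes = ["a" * 64, "b" * 64]
prefix_hits, suffix_hits = count_truncated_matches(sync_uids, daily_hashes)
```

If both counts are zero over a reasonable time window, the two columns were most likely produced by different hash functions or keys rather than by truncating the same value.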
Comment 6•4 years ago
The TO_HEX(SHA256(jsonPayload.fields.user_id)) is an alternate hash that we used before the relevant HMAC key was available.
What you need here is the firefox_accounts_derived.fxa_amplitude_user_ids_v1 table. Sync pings contain the first 32 chars of the hashed UID using the same HMAC key as used for sending data to Amplitude. You need to look up the full 64-char hash from fxa_amplitude_user_ids_v1 based on the 32-char prefix you get from sync pings.
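As a rough illustration of the prefix lookup described here, the sketch below builds an index from 32-char prefixes to full 64-char hashes. The function names and sample hashes are hypothetical, and an in-memory dict stands in for a join against fxa_amplitude_user_ids_v1.

```python
from typing import Dict, Iterable, Optional


def build_prefix_index(full_hashes: Iterable[str]) -> Dict[str, str]:
    """Map the first 32 chars of each 64-char hash to the full hash."""
    return {full_hash[:32]: full_hash for full_hash in full_hashes}


def lookup_full_hash(sync_uid: str, index: Dict[str, str]) -> Optional[str]:
    """Return the full hash for a 32-char sync uid, or None if it is unknown."""
    return index.get(sync_uid)


# Made-up 64-char values standing in for rows of the lookup table.
index = build_prefix_index(["ab" * 32, "cd" * 32])
full = lookup_full_hash("ab" * 16, index)
```

In SQL the same idea would be a join on the first 32 characters of the full hash; the dict version just makes the direction of the lookup explicit.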
Comment 7•4 years ago (Assignee)
Perfect, thank you.
Comment 8•4 years ago
Reading this over again, it looks like you're going the other direction. Deletion events in the auth logs have the unhashed ID, so you need to compute the HMAC hash of that to get the 64-char Amplitude ID, and then take just the first 32 characters. That should be the ID matching what we have in sync pings, so there is no need for fxa_amplitude_user_ids_v1 in this case.
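A minimal sketch of that derivation, under some assumptions the thread does not confirm: that the HMAC uses SHA-256 (suggested by the 64-character hex output), that the uid is hashed as a UTF-8 string, and that the result is lowercase hex. The function name is hypothetical and the key shown is a placeholder, not the real HMAC key.

```python
import hashlib
import hmac


def sync_uid_from_fxa_uid(fxa_uid: str, hmac_key: bytes) -> str:
    """Derive the 32-char id expected in sync pings from an unhashed FxA uid.

    Assumes HMAC-SHA256 over the UTF-8 encoded uid, hex encoded.
    """
    full_hash = hmac.new(hmac_key, fxa_uid.encode("utf-8"), hashlib.sha256).hexdigest()
    # The full 64-char value would be the Amplitude id; sync pings carry the first half.
    return full_hash[:32]


# Placeholder inputs for illustration only.
example_id = sync_uid_from_fxa_uid("example-unhashed-uid", b"placeholder-key")
```

The resulting value can then be matched against payload.uid in sync pings when building deletion requests for shredder.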