Closed Bug 1604846 Opened 4 years ago Closed 4 years ago

Use FxA account deletions as deletion requests in shredder

Categories

(Data Platform and Tools :: General, task, P1)

task
Points:
1

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mreid, Assigned: relud)

References

(Blocks 1 open bug)

Details

Similar to bug 1604844, this will let us relay the signal from Firefox Account deletions through to sync telemetry.

This may not actually be necessary, but if it is, we can track the work here.

Assignee: nobody → dthorn
Points: --- → 3
Priority: -- → P2
Assignee: dthorn → jhirsch

It occurred to me today that the solution I proposed for relaying FxA deletion events to Amplitude in bug 1651260 could work equally well here.

The solution outlined there is to use the account deletion event logged by the FxA auth server, which is already available in BigQuery, and to hash the uid before sending it to Amplitude. Sync telemetry uses the same hashed FxA uid, so the same approach should work for cascading Firefox Account deletions to sync telemetry deletions.

:relud, is this something that's easy to do at the same time as the bug 1651260 work?

Flags: needinfo?(dthorn)

Doing it that way should be trivial; I just need to know the bq table, filtering conditions, and hashing method.

Flags: needinfo?(dthorn)

> I just need to know the bq table, filtering conditions, and hashing method.

Ah, upon re-reading I see that this is about shredder for telemetry (not Amplitude) getting deletion requests from the same source that is used when forwarding FxA deletions to Amplitude. Yes, I can do that.

Assignee: jhirsch → dthorn
Points: 3 → 1
Priority: P2 → P1

:_6a68 what field should I use to map account deleted events to sync pings?

I see that payload.uid exists in sync_v4 pings and that TO_HEX(SHA256(jsonPayload.fields.user_id)) is used in fxa_users_daily_v1, but the former is 32 characters while the latter is 64 characters, and I see no overlap in values over the last week when I join on either the first or the last 32 characters.

Flags: needinfo?(jhirsch)

The TO_HEX(SHA256(jsonPayload.fields.user_id)) is an alternate hash that we used before the relevant HMAC key was available.

What you need here is the firefox_accounts_derived.fxa_amplitude_user_ids_v1 table. Sync pings contain the first 32 chars of the hashed UID using the same HMAC key as used for sending data to Amplitude. You need to look up the full 64-char hash from fxa_amplitude_user_ids_v1 based on the 32-char prefix you get from sync pings.
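The prefix lookup described above could be sketched roughly as follows. This is only an illustration in Python rather than a BigQuery query: the function names are hypothetical, and the input is assumed to be a plain list of 64-char hex strings standing in for the hashes stored in fxa_amplitude_user_ids_v1.

```python
from collections import defaultdict


def build_prefix_index(full_hashes):
    """Map each 32-char prefix to the 64-char hashes that start with it."""
    index = defaultdict(list)
    for full_hash in full_hashes:
        index[full_hash[:32]].append(full_hash)
    return index


def lookup_full_hash(index, sync_uid):
    """Return the one full hash for a 32-char sync uid, or None if absent or ambiguous."""
    # .get() avoids creating empty entries in the defaultdict for unknown prefixes.
    matches = index.get(sync_uid, [])
    return matches[0] if len(matches) == 1 else None
```

Returning None when a prefix matches more than one full hash is a design choice for this sketch; a real job would need to decide how to handle that case.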

Flags: needinfo?(jhirsch)

perfect, thank you.

Reading this over again, it looks like you're going the other direction. Deletion events in the auth logs have the unhashed ID, so you need to compute an HMAC of that ID to get the 64-char Amplitude ID and then take just the first 32 characters. That should be the ID matching what we have in sync pings, so there is no need for fxa_amplitude_user_ids_v1 in this case.
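A minimal sketch of that derivation, in Python for illustration. Several details are assumptions not confirmed in this bug: that the HMAC uses SHA-256 (consistent with a 64-char hex digest), that the uid is hashed as its UTF-8 string bytes, and that the result is lowercase hex. The function name and the key value are hypothetical; the real key is the one used for sending data to Amplitude.

```python
import hashlib
import hmac


def sync_uid_from_fxa_uid(fxa_uid: str, hmac_key: bytes) -> str:
    """Derive the 32-char id expected to match payload.uid in sync pings."""
    # Assumption: HMAC-SHA256 over the UTF-8 bytes of the unhashed uid.
    full_hash = hmac.new(hmac_key, fxa_uid.encode("utf-8"), hashlib.sha256).hexdigest()
    # full_hash is the 64-char hex id; sync pings carry only the first 32 chars.
    return full_hash[:32]


# Placeholder inputs for illustration only.
example = sync_uid_from_fxa_uid("0123456789abcdef0123456789abcdef", b"not-the-real-key")
print(len(example))
```

Note that a plain SHA-256 of the uid, as in TO_HEX(SHA256(...)), yields a different value from the keyed hash, which would explain why joining on a 32-char slice of it found no overlap.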

Summary: Add a relying party to relay FxA account deletions to the Firefox Telemetry platform → Use FxA account deletions as deletion requests in shredder
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED