Closed Bug 1604844 Opened 6 years ago Closed 5 years ago

Add sync ids to deletion request ping

Categories

(Firefox :: Firefox Accounts, task, P2)

task
Points:
3

Tracking

()

RESOLVED FIXED
Firefox 77
Tracking Status
firefox77 --- fixed

People

(Reporter: mreid, Assigned: rfkelly)

References

Details

Attachments

(2 files, 1 obsolete file)

Sync pings contain a separate identifier which we should report in the deletion-request ping, similar to Bug 1602064 for impression id.

Blocks: 1598720
Assignee: nobody → dthorn
Points: --- → 3
Priority: -- → P2

:_6a68 is there someone who works on the appropriate part of the sync client code that we could assign this to?

Flags: needinfo?(jhirsch)

Hey relud! Redirecting you to :rfkelly

Flags: needinfo?(jhirsch) → needinfo?(rfkelly)

Thanks! +:markh, who will be able say more nuanced things about the technical details.

Since this work is crossing over into a new team, :mreid, could you please remind me of the required prioritization/timeline for getting this work done? I recall it being being driven by an important deadline which it will be useful to have called out in this bug.

Summary of the landscape here:

The "sync ping" includes an identifier based on a HMAC of the user's FxA uid. This id is obtained from the server when the browser connects to sync, and it is currently stored only in memory. We currently avoid sending this id in any pings that include the telemetry client_id, since it ties to the user's actual identity via their Firefox Account.

To do a good job of including this in the deletion request, I think we'll need to do two things:

  1. Similar to https://bugzilla.mozilla.org/show_bug.cgi?id=1602064#c5, we need get the ok from Trust to send this alongside the telemetry client_id in the special case of the "delete all my telemetry data" ping.
  2. We need to deal with the potential race between this value becoming available (which IIUC only happens on the first sync after browser start) and the user opting out of telemetry (which might happen before that). Doing the right thing here probably means persisting that value to disk rather than keeping it solely in memory.
Flags: needinfo?(rfkelly) → needinfo?(mreid)

Similar to https://bugzilla.mozilla.org/show_bug.cgi?id=1602064#c5, we need get the ok from Trust to send this alongside
the telemetry client_id in the special case of the "delete all my telemetry data" ping.

Alicia, similar to Bug 1602064 Comment 5, are we OK to include both the telemetry client_id and the hashed_fxa_uid in the same ping in this special case, in order to delete all telemetry data collected about the user?

The risk here is that, anyone who sees the deletion ping containing both these ids, would be able to associate a stable user identifier with telemetry pings having that client_id (and potentially to tie those telemetry pings back to a real user identity, if they're able to defeat the hashing-based "anonymization" on hashed_fxa_uid).

Flags: needinfo?(agray)

(In reply to Ryan Kelly [:rfkelly] from comment #4)

Similar to https://bugzilla.mozilla.org/show_bug.cgi?id=1602064#c5, we need get the ok from Trust to send this alongside
the telemetry client_id in the special case of the "delete all my telemetry data" ping.

Alicia, similar to Bug 1602064 Comment 5, are we OK to include both the telemetry client_id and the hashed_fxa_uid in the same ping in this special case, in order to delete all telemetry data collected about the user?

The risk here is that, anyone who sees the deletion ping containing both these ids, would be able to associate a stable user identifier with telemetry pings having that client_id (and potentially to tie those telemetry pings back to a real user identity, if they're able to defeat the hashing-based "anonymization" on hashed_fxa_uid).

Hi Ryan,

Three questions for you:

  1. Would this ping be combined at all with the impression_id ping described in https://bugzilla.mozilla.org/show_bug.cgi?id=1602064#c5 in order to facilitate the deletion work?
  2. Regarding access to the deletion ping, is this group small and contained?
  3. The hash key for the fxa_uid - would this be destroyed or kept? I'm not sure we can solve for anyone intent on trying to undo the hash, but it would be that much more difficult if we don't keep the key unless there is a reason to keep it around.
Flags: needinfo?(agray) → needinfo?(rfkelly)

Would this ping be combined at all with the impression_id ping described in https://bugzilla.mozilla.org/show_bug.cgi?id=1602064#c5
in order to facilitate the deletion work?

I believe so yes - the idea is to send all the relevant identifiers in a single "deletion ping", in order to be confident that we can get them all sent reliably.

Regarding access to the deletion ping, is this group small and contained?

I'll need to defer that to :mreid or delegate.

The hash key for the fxa_uid - would this be destroyed or kept?

It's kept, because it's a single key that is used for all users; deleting or changing it would make all sync telemetry pings appear as if they're coming from a brand new user.

Flags: needinfo?(rfkelly)
Assignee: dthorn → jhirsch
Component: General → Firefox Accounts
Product: Data Platform and Tools → Firefox

Since this work is crossing over into a new team, :mreid, could you please remind me of the required prioritization/timeline for getting
this work done?

I think we connected the right people via slack to discuss prioritization here, clearing the ni?

Flags: needinfo?(mreid)

Regarding access to the deletion ping, is this group small and contained?

I'll need to defer that to :mreid or delegate.

But I'm going to add a fresh ni? for this question :-)

Flags: needinfo?(mreid)

Answering both questions here, since I already had a half-composed answer typed out yesterday :)

(In reply to Ryan Kelly [:rfkelly] from comment #3)

Since this work is crossing over into a new team, :mreid, could you please remind me of the required prioritization/timeline for getting this work done? I recall it being being driven by an important deadline which it will be useful to have called out in this bug.

We'd like to get this done ASAP, since deletion request pings are already incoming and deletes are being processed on the pipeline side.

Regarding access to the deletion ping, is this group small and contained?

I'll need to defer that to :mreid or delegate.

Not as of now, the deletion ping table has the same access characteristics as all the Telemetry data.

Flags: needinfo?(mreid)

Not as of now, the deletion ping table has the same access characteristics as all the Telemetry data.

Concretely, does this mean that there's a small window (between writing the deletion ping to disk and actually processing the deletions that it triggers) wherein someone with access to the telemetry data could correlate between sync pings and main telemetry pings using the data from the deletion pings?

Flags: needinfo?(mreid)

I'm going to unassign myself, now that this bug is in motion. I'm guessing either :rfkelly or :markh will wind up working on it.

(In reply to Ryan Kelly [:rfkelly] from comment #10)

Not as of now, the deletion ping table has the same access characteristics as all the Telemetry data.

Concretely, does this mean that there's a small window (between writing the deletion ping to disk and actually processing the deletions that it triggers) wherein someone with access to the telemetry data could correlate between sync pings and main telemetry pings using the data from the deletion pings?

Yes, that will be the state if we go with the approach outlined in this bug. If we need stronger restrictions or guarantees, please let me know.

Flags: needinfo?(mreid)

Alicia, does the above give you enough information around the questions from Comment 5?

Flags: needinfo?(agray)

Hi Ryan,

Yes, thank you. I don't want to hold up the work because we need the deletion ping to work so let's move forward. But we should evaluate and make sure we are comfortable with:

  1. the window between the writing to disk and actual deletion that this is as fast as possible and;
  2. the access to the deletion ping table. I'm not entirely sure who/how many people have access to all telemetry data so it's hard for me to say if that's too many or not.
Flags: needinfo?(agray)

Thanks! I've nominated this for inclusion in the next sync team sprint.

We need to deal with the potential race between this value becoming available (which IIUC only happens on the first sync after browser start)
and the user opting out of telemetry (which might happen before that). Doing the right thing here probably means persisting that value to disk
rather than keeping it solely in memory.

I'm not entirely sure what to do about this part though. Mark, do you think it would be tractable to persist the hashed_fxa_uid as part of the FxA signin state, so we can more reliably include it in the telemetry deletion ping?

Flags: needinfo?(markh)

(In reply to Ryan Kelly [:rfkelly] from comment #16)

I'm not entirely sure what to do about this part though. Mark, do you think it would be tractable to persist the hashed_fxa_uid as part of the FxA signin state, so we can more reliably include it in the telemetry deletion ping?

Yeah, I don't see a fundamental issue with that - it might be slightly ugly, but it wouldn't be lonely in that regard.

Flags: needinfo?(markh)

I intend to pick this up this week, so assigning myself. I have some questions about things that need to happen downstream of receiving such a deletion request, but they're probably better for separate bugs.

Assignee: nobody → rfkelly

do you think it would be tractable to persist the hashed_fxa_uid as part of the FxA signin state,
so we can more reliably include it in the telemetry deletion ping?

Having poked at this a bit, I think it will be better to store this in a pref alongside our existing services.sync.username, something like services.sync.hashedUID. That keeps all the data here owned by sync rather than FxA, and will mean we don't interfere with concurrent work happening to refactor the FxA client code in Firefox.

We'll need to be careful to ensure that this pref has the same lifecycle as services.sync.username, e.g. getting cleared on signout.

(Just leaving some implementation notes to myself for future reference as I poke around here).

The sync ping actually includes two identifiers, noted here;

  • the uid field identifies the current user
  • the deviceId field identifies the current device (similar to main telemetry's client_id).

I think we should send both of these in the deletion-request ping, as fields named "sync_uid" and "sync_device_id".

It's not clear to be if we'd want to actually delete all sync pings associated with that sync_uid or just those associated with the specific sync_device_id, but this seems like a policy question that can be resolved independently of sending the data (and we don't gain any privacy by only sending one or the other, since if you know a sync_device_id then you can go find corresponding pings to get the sync_uid).

(In reply to Ryan Kelly [:rfkelly] from comment #19)

Having poked at this a bit, I think it will be better to store this in a pref alongside our existing services.sync.username, something like services.sync.hashedUID. That keeps all the data here owned by sync rather than FxA, and will mean we don't interfere with concurrent work happening to refactor the FxA client code in Firefox.

I believe we still send "sync" telemetry even when sync isn't configured though - although we always send a "null" uid and null device IDs, there is a bug on file so that we do not send null ids because they mess with our ability to create certain dashboards. I don't think we would want this bug's implementation to block us from ever fixing that.

So there's a case to be made that we should both get back to that bug so we don't send null uids and persist it in the fxa branch.

We'll need to be careful to ensure that this pref has the same lifecycle as services.sync.username, e.g. getting cleared on signout.

FTR, anything under services.sync. is cleared when sync is disconnected - and it's possible to disconnect sync without disconnecting the account. Anything under identity.fxaccounts.account. is cleared when the account is signed out.

there is a bug on file so that we do not send null ids because they mess with our ability to create certain dashboards.
I don't think we would want this bug's implementation to block us from ever fixing that.

Is that Bug 1584356? If so, I still stand by my Bug 1584356 Comment 12: "we should [...] do nothing, and instead just wait for ecosystem telemetry."

But you're right, it does seem important to avoid making that harder with whatever we do in this current bug.

See Also: → 1584356

I think we should send both of these in the deletion-request ping, as fields named "sync_uid" and "sync_device_id".

I have a very hacky version of this working, and found a slight snag with my naive approach.

The device ids in the sync ping are 64 characters long. Following the approach in Bug 1602064 I have attempted to add "sync_uid" and "sync_device_id" scalar strings to the deletion-request ping. But telemetry appears to limit scalar strings to a 50 character max length, so I get this logged in the browser console:

deletion.request.sync_device_id - Truncating scalar value to 50 characters

:chutten, ni? you because you suggested the current approach in Bug 1602064 Comment 8 - is it possible to send strings longer 50 chars using this infrastructure, perhaps in some other scalar type? I wasn't able to find anything from looking around in the docs.

One workaround could be to send just, say, the first 32 characters of the "sync_device_id" rather than the whole thing. That prefix should be plenty long enough to ensure it matches uniquely, but it would complicate processing of the deletion-request ping on the server; we'd have to do the equivalent of DELETE FROM pings WHERE deviceID LIKE "$sync_device_id_prefix%" rather than DELETE FROM pings WHERE deviceID="$sync_device_id". But maybe that's OK in practice.

Flags: needinfo?(chutten)

Oh. Hrm. Huh. I was not expecting ids to be longer than 50 utf8 bytes given that a UUID is 36 (38 with {}) and four of those are -.

Some Options

  1. Raise the global String Scalar length limit beyond 50
    • The Glean SDK has a limit of 100 utf8 bytes, maybe we should go with that.
    • Lightweight solution. Aligns metric limits across products.
  2. Raise the String Scalar length limit specifically for ids that need it
    • How would we identify ids? Put it in Scalars.yaml somehow?
    • A more heavyweight solution, but a more narrow one
  3. Send only the first N bytes of the id and ask the pipeline to work on prefixes
    • How certain are we that the first N bytes are unique?
    • ni?relud for how awful this'd look in the Data Engineering side.
  4. Change the ids to be shorter than 50 utf8 bytes
    • Maybe we re-encode them, maybe we truncate them... either way it requires sync work.
    • ni?rfkelly How much work would this be? (I'm guessing "Quite a bit")

I'm guessing we'll go with Option 1, but let's gather some info first.

Flags: needinfo?(rfkelly)
Flags: needinfo?(dthorn)
Flags: needinfo?(chutten)

(In reply to Chris H-C :chutten from comment #24)

  1. Send only the first N bytes of the id and ask the pipeline to work on prefixes
    • How certain are we that the first N bytes are unique?
    • ni?relud for how awful this'd look in the Data Engineering side.

This would actually be trivial to handle, aside from any risk of deleting more data than intended if there's a collision.

Flags: needinfo?(dthorn)

Oh. Hrm. Huh. I was not expecting ids to be longer than 50 utf8 bytes given that a UUID is 36 (38 with {}) and four of those are -.

Aye. I expect these are 64 chars long entirely by accident (because it uses sha256 to mix a few things together, and that outputs 64 chars by default)

  1. Send only the first N bytes of the id and ask the pipeline to work on prefixes
  • How certain are we that the first N bytes are unique?

As certain as I am about anything. We do this "take the first half of a sha256 hash and assume it's unique" in other places in the FxA ecosystem without worrying about collisions, including in the code that generates the corresponding sync_uid field. (Edit: also, stackoverflow says it's fine.)

Either options (1) or (3) sound fine to me, and I'll lean on the preferences of the data team to decide between them.

  1. Change the ids to be shorter than 50 utf8 bytes
  • Maybe we re-encode them, maybe we truncate them... either way it requires sync work.
  • ni?rfkelly How much work would this be? (I'm guessing "Quite a bit")

We could truncate them. I think the challenge would be dealing with existing historical data (both stored on the server, and submitted by older clients) with would probably have to happen in the data pipeline. So it'd be a bit of work, but mostly not work for me :-)

Flags: needinfo?(rfkelly)

I'm pushing my WIP patch. It works in local testing, but:

  • It needs tests!
  • It needs data-review!
  • It assumes we've chosen the "truncated sync_device_id to 32 chars" option from Comment 24.
  • It assumes some things about how the FxA code works that aren't actually true (but I think they should be made true in a separate bug).

I'll dive into those details presently, but would appreciate an "is this wildly off base?" check if you get a chance :markh.

Attached file data_review_sync_ping_ids.txt (obsolete) —

Here's the data-review details; :chutten, could you please take a look or delegate as appropriate?

In particular, per Comment 20 please observe that we're sending both a user-level and a client-level identifier here. I'm not fully abreast of the shredder requirements here and it may be that the user-level identifier is not required and thus we shouldn't send it.

Ref Comment 14 and its ancestors for some discussion with Trust around the sensitive nature of these identifiers (the user-level id and the client-level id have the same risks around beig correlated back to a user account).

Attachment #9142942 - Flags: data-review?(chutten)

heh - forgot about the lack of tests, so I should retract my r+ ;) I think it looks fine.

(In reply to Ryan Kelly [:rfkelly] from comment #29)

In particular, per Comment 20 please observe that we're sending both a user-level and a client-level identifier here. I'm not fully abreast of the shredder requirements here and it may be that the user-level identifier is not required and thus we shouldn't send it.

I asked about this at the recent Shredder meeting. Deletion requests sent from a client are about data collected by that client. It would be unexpected to cascade that to every client that the user uses across the ecosystem of sync-connected products.

Comment on attachment 9142942 [details] data_review_sync_ping_ids.txt Clearing data review until we're down to just the one id.
Attachment #9142942 - Flags: data-review?(chutten)

Thanks for clarifying :chutten, I've updated the review request to include only the device-specific id.

Attachment #9142942 - Attachment is obsolete: true
Attachment #9143522 - Flags: data-review?(chutten)
Comment on attachment 9143522 [details] data_review_sync_ping_ids.txt DATA COLLECTION REVIEW RESPONSE: Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate? Yes. This collection is Telemetry so is documented in its definitions file [Scalars.yaml](https://hg.mozilla.org/mozilla-central/file/tip/toolkit/components/telemetry/Scalars.yaml) and the [Probe Dictionary](https://telemetry.mozilla.org/probe-dictionary/). Is there a control mechanism that allows the user to turn the data collection on and off? Yes. This collection is Telemetry so can be controlled through Firefox's Preferences. If the request is for permanent data collection, is there someone who will monitor the data over time? Yes, :rfkelly is responsible. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under? Category 4, Highly Sensitive. Is the data collection request for default-on or default-off? Default on for all channels. Does the instrumentation include the addition of any new identifiers? No. The identifiers aren't new. Is the data collection covered by the existing Firefox privacy notice? Yes. Does there need to be a check-in in the future to determine whether to renew the data? No. This collection is permanent. --- Result: datareview+ given Alicia Gray's (Trust's) comments in https://bugzilla.mozilla.org/show_bug.cgi?id=1604844
Attachment #9143522 - Flags: data-review?(chutten) → data-review+
Pushed by rkelly@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/ddc24de527d1 add identifiers from the sync ping to the deletion-request ping. r=markh,chutten
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 77

:mreid, I assume there's other pipeline work that needs to be done to consume the sync_device_id field and delete ping records on the server, is that tracked in other bugs that should be cross-linked here? (Or, can I be of any help sketching out the details of such bugs?)

Flags: needinfo?(mreid)

Redirect to :relud :)

Daniel, is there a bug to cover the corresponding pipeline work once this id starts flowing in?

Flags: needinfo?(mreid) → needinfo?(dthorn)
Blocks: 1635221

there is now: bug 1635221

Flags: needinfo?(dthorn)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: