Bug 1584356 Comment 6 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

I don't think we have a good shared understanding of the intended privacy/security properties of the `hashed_fxa_uid` identifier, or to be honest, of how well it actually meets them. Here's what I can reconstruct from memory:

1. We want to be able to submit sync-related telemetry pings, and correlate pings that came from the same user on different devices.
2. We do not want to be able to easily correlate those telemetry pings back to a real user account.
3. So, we have tokenserver [hmac the fxa uid with a server-side secret key](https://github.com/mozilla-services/tokenserver/blob/e048a0656c5a7ee596bfbeac21ca47569d05a569/tokenserver/views.py#L121) and give that value to the client when connecting to sync.
4. Sync clients include this `hashed_fxa_uid` in their sync telemetry ping, allowing us to achieve (1) while making (2) harder.
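
As a rough illustration of steps 3 and 4, here is a minimal sketch of a keyed hash over the uid. It assumes HMAC-SHA256 with hex output; the function name, the key and the uid below are hypothetical, and the linked tokenserver code may differ in details such as truncation or key derivation.

```python
import hashlib
import hmac


def hash_fxa_uid(uid: str, secret_key: bytes) -> str:
    """Return a hex-encoded keyed hash (HMAC-SHA256) of an FxA uid."""
    return hmac.new(secret_key, uid.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical values, for illustration only.
SECRET_KEY = b"example-server-side-secret"
uid = "0123456789abcdef0123456789abcdef"

# The server hands the same value to every device of the same user,
# so pings from different devices can be correlated (goal 1)...
hashed_on_desktop = hash_fxa_uid(uid, SECRET_KEY)
hashed_on_mobile = hash_fxa_uid(uid, SECRET_KEY)

# ...while a party without the key cannot recompute it from the uid (goal 2).
hashed_with_wrong_key = hash_fxa_uid(uid, b"some-other-key")
```

The identifier is only as private as the key: anything that holds the key can recompute the mapping for any uid it knows.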

The important considerations around handling of `hashed_fxa_uid` are thus:

* Who is able to reverse a `hashed_fxa_uid` back to the underlying `uid` (and thus link any sync telemetry ping back to a real user)?
* Who is able to find out a particular user's `hashed_fxa_uid` (and thus identify all the sync telemetry pings from that user)?

Whatever we do here, we should try not to change the answer to those two questions, or be very deliberate about it if we do.

We want to replace this ad-hoc sync-specific telemetry mechanism with general-purpose [ecosystem telemetry](https://docs.google.com/document/d/1rR3uJG8sVtow4plYu6M5Jp5e5zOSrLzdkUx8BdO8kOM/) in the short- to mid-term, and I'm cc'ing :m_and_m who is leading that effort. I'll also note that a similar hmacing-with-a-secret-key approach was proposed for ecosystem telemetry and was rejected, because in practice it won't stop Mozilla operations folks from having both capabilities listed above.

But, it's the scheme we have for now in sync.
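
To make the point about operations access concrete, here is a small sketch of why holding the secret key grants both capabilities at once. It again assumes plain HMAC-SHA256; `build_reverse_lookup` and all the values are hypothetical.

```python
import hashlib
import hmac


def hash_fxa_uid(uid: str, secret_key: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the uid (an assumed construction)."""
    return hmac.new(secret_key, uid.encode("utf-8"), hashlib.sha256).hexdigest()


def build_reverse_lookup(known_uids, secret_key):
    """Map every hashed uid back to its uid, given the key and a list of uids."""
    return {hash_fxa_uid(uid, secret_key): uid for uid in known_uids}


# Hypothetical data: someone with the key and access to the accounts database.
SECRET_KEY = b"example-server-side-secret"
known_uids = ["aaaa1111", "bbbb2222", "cccc3333"]
lookup = build_reverse_lookup(known_uids, SECRET_KEY)

# Capability 2: find a particular user's hashed uid.
target_hash = hash_fxa_uid("bbbb2222", SECRET_KEY)

# Capability 1: reverse a hashed uid seen in a telemetry ping.
recovered_uid = lookup[target_hash]
```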

---

We're working to make it possible for users to sign in to Firefox and use services other than Sync, such as Monitor and FPN. Part of this "decoupling" work is to decouple signed-in-to-Firefox telemetry from sync-specific telemetry. Given that ecosystem telemetry isn't ready yet, we want to keep using the `hashed_fxa_uid` mechanism in order to avoid losing visibility into browser-side FxA metrics.

So, we'd like a way to obtain the `hashed_fxa_uid` without accessing the Sync service, since accessing the Sync service just to fetch a metrics identifier might itself generate misleading metrics about usage of the Sync service!

---

Given our ambitions to replace `hashed_fxa_uid` with ecosystem telemetry, I think it'd be easy to spend too much effort on designing an elegant solution here.  I had a couple of less-than-elegant ideas.

#### 1) Return it as part of the user's profile data

The simplest option I can think of here is to have fxa-profile-server return a new `hashed_uid` field as part of the user's [profile data bundle](https://github.com/mozilla/fxa/blob/master/packages/fxa-profile-server/docs/API.md#get-v1profile). Our clients are calling that endpoint anyway, and desktop has robust support for storing and querying the data it returns. This would involve copying the [hmacing logic from tokenserver](https://github.com/mozilla-services/tokenserver/blob/e048a0656c5a7ee596bfbeac21ca47569d05a569/tokenserver/views.py#L133) into fxa-profile-server, along with the corresponding secret key.

Pros:
* Small change, if a bit weird.
* Client can avoid making an extra request, since it's fetching profile data already.

Cons:
* Anyone with a "profile"-scoped access token can find out the `hashed_fxa_uid` for that user.
  * In theory such tokens are only held by internal Mozilla RPs, but still, it's a change to the boundaries of the system.
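
A minimal sketch of what option 1 implies for access control, assuming the new field is returned to any token that carries the "profile" scope. The handler name, the response shape and the simplified scope check are hypothetical, and the real fxa-profile-server is not written in Python.

```python
import hashlib
import hmac


def hash_fxa_uid(uid: str, secret_key: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the uid (an assumed construction)."""
    return hmac.new(secret_key, uid.encode("utf-8"), hashlib.sha256).hexdigest()


def build_profile_response(uid, email, token_scopes, secret_key):
    """Hypothetical profile handler that adds a `hashed_uid` field."""
    if "profile" not in token_scopes:
        raise PermissionError("token lacks the 'profile' scope")
    return {
        "uid": uid,
        "email": email,
        # New field: visible to every holder of a "profile"-scoped token.
        "hashed_uid": hash_fxa_uid(uid, secret_key),
    }


# Hypothetical values, for illustration only.
SECRET_KEY = b"example-server-side-secret"
response = build_profile_response(
    "aaaa1111", "user@example.com", {"profile"}, SECRET_KEY
)
```

Note that the profile server now needs its own copy of the secret key, which is part of what makes this a change to the boundaries of the system.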

#### 2) Return it as part of the `fxaccounts:login` webchannel message

The [login data](https://github.com/mozilla/fxa/blob/master/packages/fxa-content-server/docs/relier-communication-protocols/fx-webchannel.md#loginData) returned during webchannel login already contains the raw `uid`.  We could update fxa-auth-server to calculate `hashed_fxa_uid` and return it in the response for `/account/create`, `/account/login` and `/session/reauth` requests. The web content on accounts.firefox.com could then forward it on to the browser as part of the webchannel message.

Pros:
* Keeps current access restrictions intact
  * (the data in `fxaccounts:login` is already sufficient to find out `hashed_fxa_uid`, by asking the tokenserver for it)
* Client can avoid making an extra request, since it receives and stores this data already

Cons:
* More complicated change, and still pretty weird.
* Only available to webchannel clients
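
A rough sketch of the forwarding step in option 2, assuming the auth-server response gains a `hashed_fxa_uid` field that the web content copies into the webchannel message. The function name, the input shape and the field name in the message are hypothetical; the real login data carries more fields than shown here.

```python
def build_login_message(login_result: dict) -> dict:
    """Build a simplified `fxaccounts:login` message from a login result."""
    data = {
        "uid": login_result["uid"],
        "email": login_result["email"],
    }
    # Forward the hashed uid only if the server supplied one, so older
    # servers that do not return it still produce a valid message.
    if "hashed_fxa_uid" in login_result:
        data["hashed_fxa_uid"] = login_result["hashed_fxa_uid"]
    return {"command": "fxaccounts:login", "data": data}


# Hypothetical values, for illustration only.
message = build_login_message(
    {"uid": "aaaa1111", "email": "user@example.com", "hashed_fxa_uid": "abc123"}
)
legacy_message = build_login_message(
    {"uid": "aaaa1111", "email": "user@example.com"}
)
```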

Does either of these alternatives seem like clearly the right way to go?

I don't think we have a good written-down understanding of the intended privacy/security properties of the `hashed_fxa_uid` identifier, or to be honest, of how well it actually meets them. Here's what I can reconstruct from memory:

1. We want to be able to submit sync-related telemetry pings, and correlate pings that came from the same user on different devices.
2. We do not want to be able to easily correlate those telemetry pings back to a real user account.
3. So, we have tokenserver [hmac the fxa uid with a server-side secret key](https://github.com/mozilla-services/tokenserver/blob/e048a0656c5a7ee596bfbeac21ca47569d05a569/tokenserver/views.py#L121) and give that value to the client when connecting to sync.
4. Sync clients include this `hashed_fxa_uid` in their sync telemetry ping, allowing us to achieve (1) while making (2) harder.

The important considerations around handling of `hashed_fxa_uid` are thus:

* Who is able to reverse a `hashed_fxa_uid` back to the underlying `uid` (and thus link any sync telemetry ping back to a real user)?
* Who is able to find out a particular user's `hashed_fxa_uid` (and thus identify all the sync telemetry pings from that user)?

Whatever we do here, we should try not to change the answer to those two questions, or be very deliberate about it if we do.

We want to replace this ad-hoc sync-specific telemetry mechanism with general-purpose [ecosystem telemetry](https://docs.google.com/document/d/1rR3uJG8sVtow4plYu6M5Jp5e5zOSrLzdkUx8BdO8kOM/) in the short- to mid-term, and I'm cc'ing :m_and_m who is leading that effort. I'll also note that a similar hmacing-with-a-secret-key approach was proposed for ecosystem telemetry and was rejected, because in practice it won't stop Mozilla operations folks from having both capabilities listed above.

But, it's the scheme we have for now in sync.

---

We're working to make it possible for users to sign in to Firefox and use services other than Sync, such as Monitor and FPN. Part of this "decoupling" work is to decouple signed-in-to-Firefox telemetry from sync-specific telemetry. Given that ecosystem telemetry isn't ready yet, we want to keep using the `hashed_fxa_uid` mechanism in order to avoid losing visibility into browser-side FxA metrics.

So, we'd like a way to obtain the `hashed_fxa_uid` without accessing the Sync service, since accessing the Sync service just to fetch a metrics identifier might itself generate misleading metrics about usage of the Sync service!

---

Given our ambitions to replace `hashed_fxa_uid` with ecosystem telemetry, I think it'd be easy to spend too much effort on designing an elegant solution here.  I had a couple of less-than-elegant ideas.

#### 1) Return it as part of the user's profile data

The simplest option I can think of here is to have fxa-profile-server return a new `hashed_uid` field as part of the user's [profile data bundle](https://github.com/mozilla/fxa/blob/master/packages/fxa-profile-server/docs/API.md#get-v1profile). Our clients are calling that endpoint anyway, and desktop has robust support for storing and querying the data it returns. This would involve copying the [hmacing logic from tokenserver](https://github.com/mozilla-services/tokenserver/blob/e048a0656c5a7ee596bfbeac21ca47569d05a569/tokenserver/views.py#L133) into fxa-profile-server, along with the corresponding secret key.

Pros:
* Small change, if a bit weird.
* Client can avoid making an extra request, since it's fetching profile data already.

Cons:
* Anyone with a "profile"-scoped access token can find out the `hashed_fxa_uid` for that user.
  * In theory such tokens are only held by internal Mozilla RPs, but still, it's a change to the boundaries of the system.

#### 2) Return it as part of the `fxaccounts:login` webchannel message

The [login data](https://github.com/mozilla/fxa/blob/master/packages/fxa-content-server/docs/relier-communication-protocols/fx-webchannel.md#loginData) returned during webchannel login already contains the raw `uid`.  We could update fxa-auth-server to calculate `hashed_fxa_uid` and return it in the response for `/account/create`, `/account/login` and `/session/reauth` requests. The web content on accounts.firefox.com could then forward it on to the browser as part of the webchannel message.

Pros:
* Keeps current access restrictions intact
  * (the data in `fxaccounts:login` is already sufficient to find out `hashed_fxa_uid`, by asking the tokenserver for it)
* Client can avoid making an extra request, since it receives and stores this data already

Cons:
* More complicated change, and still pretty weird.
* Only available to webchannel clients

Does either of these alternatives seem like clearly the right way to go?
