Closed Bug 1288445 Opened 3 years ago Closed 3 years ago

Sync telemetry ping should hash FxA UID

Categories

(Firefox :: Sync, defect, P1)

defect

Tracking

()

RESOLVED FIXED
Firefox 51
Tracking Status
firefox50 --- fixed
firefox51 --- fixed

People

(Reporter: tcsc, Assigned: tcsc)

References

Details

(Whiteboard: [fxsync])

Attachments

(2 files)

It turns out that sql.tmo only has these hashed, and so we only need them that way. Hashing them on the client before sending should be trivial and sounds like a win for privacy.

It doesn't sound worth it to bump the sync ping schema version for this, but I could be wrong.

We should keep the value of all "0".repeat(32) for unknown/invalid UID since that still sounds like useful information that has no bearing on privacy whatsoever.
:dcoates, do you know which hash function we'll need to use on the UID?
Assignee: nobody → tchiovoloni
Flags: needinfo?(dcoates)
(In reply to Thom Chiovoloni [:tcsc] from comment #0)
> It doesn't sound worth it to bump the sync ping schema version for this, but
> I could be wrong.

Yeah, this shouldn't be a big problem while we're still on Nightly and developing things. Just update the schema without bumping the version number for now.
I think we'll want to do this on the server. Its actually a SHA256 HMAC of the fxa uid and a secret set in the server config.

These 2 bits of code are where it happens:

https://github.com/mozilla-services/tokenserver/blob/master/tokenserver/views.py#L113-L125
https://github.com/mozilla-services/tokenserver/blob/master/tokenserver/util.py#L57-L66

Or the ping could send up the fxa_uid from the token, but that seems like more work.
Flags: needinfo?(dcoates)
So I think we've come to the conclusion that: browserid_identity already has the full token from the tokenServer, which is a JSON blob. Currently we use only api_endpoint, but also in that payload is a field called |id|, which it a base64 encoded version of yet another JSON blob (but apparently with trailing garbage we can probably work around). Inside *that* blob is a field fxa_uid. This fxa_uid is the uid logged by the server. We should change the client code to send this as the uid rather than the "real" uid.

We should try and get this done ASAP. Sadly tests will probably be the most difficult part (ie, we'll probably need to ensure that all tests which validate telemetry now have a fake token that roughly corresponds to the above) - probably not particularly difficult, just tedious.
> (but apparently with trailing garbage we can probably work around).

If by "garbage" you mean "HMAC signature so the storage node can check validity of the token", sure :-)

I have reservations about shipping client code that parses the "id" field, it's supposed to be an opaque token.  If you want a value out of it, I'd prefer that the server explicitly pass you that value in plain text as another field in the response.
(In reply to Ryan Kelly [:rfkelly] from comment #6)
> That is to say, have the server return fxa_uid as another field in this
> response:

That sounds perfect (although it might make sense to use a different name, as it's not the "fxa uid" as the rest of the code thinks of it (ie, maybe hashed_uid would be better?)
I opened https://github.com/mozilla-services/tokenserver/issues/92

I propose that once that gets traction, we change the client code to look for the new entry in the token and if it exists, use that as the UID. If it doesn't exist, we don't record any UID. We could land the client change before the server change hits prod.
Priority: -- → P1
Attachment #8773548 - Flags: review?(dcoates)
Depends on: 1290315
Whiteboard: [fxsync]
Comment on attachment 8779398 [details]
Bug 1288445 - Use new hashed_fxa_uid field instead of the real FxA UID in sync telemetry ping.

https://reviewboard.mozilla.org/r/70384/#review67846

Looks great!

::: toolkit/components/telemetry/docs/sync-ping.rst:89
(Diff revision 1)
>  These values should be monotonic.  If we can't get a monotonic timestamp, -1 will be reported on the payload, and the values will be omitted from the engines. Additionally, the value will be omitted from an engine if it would be 0 (either due to timer inaccuracy or finishing instantaneously).
>  
>  uid
>  ~~~
>  
> -This property containing the FxA account identifier, which is provided by the FxA auth server APIs: `<https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md>`_. It may be an empty string in the case that we are unable to authenticate with FxA, and have never authenticated in the past.  If present, it should be a 32 character hexidecimal string.
> +This property containing a hash of the FxA account identifier. It may be an empty string in the case that we are unable to authenticate with FxA, and have never authenticated in the past.  If present, it should be a 32 character hexidecimal string.

It seems these docs are wrong about the empty string, and that it is "0" * 32 - if that's correct we might as well update these docs at the same time.
Attachment #8779398 - Flags: review?(markh) → review+
Comment on attachment 8779398 [details]
Bug 1288445 - Use new hashed_fxa_uid field instead of the real FxA UID in sync telemetry ping.

This is a very small change to the data we're sending (replacing a UID with a hash of the same UID sent down from the server).
Attachment #8779398 - Flags: review?(benjamin)
Comment on attachment 8779398 [details]
Bug 1288445 - Use new hashed_fxa_uid field instead of the real FxA UID in sync telemetry ping.

https://reviewboard.mozilla.org/r/70384/#review68284

::: toolkit/components/telemetry/docs/sync-ping.rst:89
(Diff revision 2)
>  These values should be monotonic.  If we can't get a monotonic timestamp, -1 will be reported on the payload, and the values will be omitted from the engines. Additionally, the value will be omitted from an engine if it would be 0 (either due to timer inaccuracy or finishing instantaneously).
>  
>  uid
>  ~~~
>  
> -This property containing the FxA account identifier, which is provided by the FxA auth server APIs: `<https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md>`_. It may be an empty string in the case that we are unable to authenticate with FxA, and have never authenticated in the past.  If present, it should be a 32 character hexidecimal string.
> +This property containing a hash of the FxA account identifier, which is a 32 character hexidecimal string.  In the case that we are unable to authenticate with FxA and have never authenticated in the past, it will be a placeholder string consisting of 32 repeated ``0`` characters.

This is a little weird to me: I'd normally expect a value like this to just be `null`. But that's just an implementation suggestion, not a requirement.
Attachment #8779398 - Flags: review?(benjamin) → review+
Comment on attachment 8779398 [details]
Bug 1288445 - Use new hashed_fxa_uid field instead of the real FxA UID in sync telemetry ping.

https://reviewboard.mozilla.org/r/70384/#review68316

::: toolkit/components/telemetry/docs/sync-ping.rst:89
(Diff revision 2)
>  These values should be monotonic.  If we can't get a monotonic timestamp, -1 will be reported on the payload, and the values will be omitted from the engines. Additionally, the value will be omitted from an engine if it would be 0 (either due to timer inaccuracy or finishing instantaneously).
>  
>  uid
>  ~~~
>  
> -This property containing the FxA account identifier, which is provided by the FxA auth server APIs: `<https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md>`_. It may be an empty string in the case that we are unable to authenticate with FxA, and have never authenticated in the past.  If present, it should be a 32 character hexidecimal string.
> +This property containing a hash of the FxA account identifier, which is a 32 character hexidecimal string.  In the case that we are unable to authenticate with FxA and have never authenticated in the past, it will be a placeholder string consisting of 32 repeated ``0`` characters.

Unless we were to bump the version, I don't see the benefit of doing this at this point -- This is already the current behavior, it was just misdocumented. So all code will likely have to handle "0" * 32 anyway, and if we added null, it would need to handle that as well.
(Whoops, may have entered that in the wrong spot on mozreview. That was a reply to :bsmedberg's comment directly above).

Anyway...
Keywords: checkin-needed
Pushed by kwierso@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/e4437c4da937
Use new hashed_fxa_uid field instead of the real FxA UID in sync telemetry ping. r=bsmedberg,markh
Keywords: checkin-needed
Ack, apologies. Should be fixed: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6ad8b983b588. 

It looks like one of the tests is failing due to having the code for the patch from bug 1295138, but I can rebase if you need me to.
Flags: needinfo?(tchiovoloni)
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/1bdad03af959
Use new hashed_fxa_uid field instead of the real FxA UID in sync telemetry ping. r=bsmedberg,markh
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/1bdad03af959
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 51
Comment on attachment 8779398 [details]
Bug 1288445 - Use new hashed_fxa_uid field instead of the real FxA UID in sync telemetry ping.

Approval Request Comment
[Feature/regressing bug #]: Sync telemetry ping
[User impact if declined]: Inaccurate sync metrics
[Describe test coverage new/current, TreeHerder]: Tests pass
[Risks and why]: Low
[String/UUID change made/needed]: None

We now hash the userid that we report to telemetry with the sync ping. This is both for user privacy and consistency with server metrics. The Sync ping landed in 50, so we should get this there too.
Attachment #8779398 - Flags: approval-mozilla-aurora?
Blocks: 1298758
No longer blocks: 1298758
Comment on attachment 8779398 [details]
Bug 1288445 - Use new hashed_fxa_uid field instead of the real FxA UID in sync telemetry ping.

Makes sense, a test was added for this, Aurora50+
Attachment #8779398 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Depends on: 1295188
No longer depends on: 1295188
You need to log in before you can comment on or make changes to this bug.