Closed Bug 1339405 Opened 8 years ago Closed 8 years ago

Add optional locale and uid fields to FxA flow metrics export

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: pb, Assigned: whd)

Details

We want to add optional locale and (hashed) uid fields to the FxA flow metrics. Optional because:

1. The change to add them to our flow event logs will not be deployed until FxA train 81; and
2. Even post-deployment, we will emit many events that don't have these fields.

The absence of either field is non-terminal and should leave an empty field in that row of the CSV.

We also want the uid hash to be consistent with the retention export, so that we can track metrics across the two data sets in Redshift.

I had a stab at implementing this in a puppet-config PR, because I heard the data-pipeline team is very busy at the moment: https://github.com/mozilla-services/puppet-config/pull/2488

Feel free to ignore that if it's unhelpful or wrong.
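To illustrate the behaviour requested above, here is a minimal Python sketch, not the code from the puppet-config PR. It assumes the uid is hashed with HMAC-SHA256 and hex-encoded, and that both exports share one key; `HMAC_KEY`, `hash_uid`, `flow_row`, `to_csv`, and the five-column layout are all hypothetical names and choices made for the example, not the real export format.

```python
import csv
import hashlib
import hmac
import io

# Hypothetical key for illustration only. The point is that the flow
# export and the retention export must use the SAME key, otherwise the
# hashed uids cannot be joined across the two data sets.
HMAC_KEY = b"example-shared-key"


def hash_uid(uid, key=HMAC_KEY):
    """Return a hex HMAC-SHA256 digest of uid, or '' when uid is absent."""
    if not uid:
        return ""
    return hmac.new(key, uid.encode("utf-8"), hashlib.sha256).hexdigest()


def flow_row(event, key=HMAC_KEY):
    """Build one CSV row; a missing optional field becomes an empty string."""
    return [
        str(event["time"]),
        event["type"],
        event["flow_id"],
        event.get("locale") or "",        # optional: blank when absent
        hash_uid(event.get("uid"), key),  # optional: blank when absent
    ]


def to_csv(events, key=HMAC_KEY):
    """Serialise events to CSV text, one row per event."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for event in events:
        writer.writerow(flow_row(event, key))
    return buf.getvalue()
```

An event without `locale` or `uid` still produces a row, with the last two fields left empty, and the same uid always hashes to the same value as long as the key is unchanged.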
Assignee: nobody → whd
Points: --- → 1
Status: NEW → ASSIGNED
Priority: -- → P1
Exports from 2017-02-23 on should have the extra fields (or blanks until they start showing up). The change occurred in the middle of the file, but hopefully that's not an issue as it's only adding columns. Possibly everything will need to be re-exported for bug #1341966 anyway.
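For anyone consuming a file where the column count changes partway through, a minimal Python sketch of one way to cope is below. It assumes the new columns were appended at the end of each row, so older rows are simply shorter; `pad_rows` and its `width` parameter are hypothetical names for this example.

```python
import csv
import io


def pad_rows(csv_text, width):
    """Yield each row right-padded with empty strings to `width` columns.

    Assumes new columns were only ever appended at the end, so a short
    row is an old-format row rather than a corrupt one.
    """
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) > width:
            raise ValueError("row has %d columns, expected at most %d"
                             % (len(row), width))
        yield row + [""] * (width - len(row))
```

With this, old and new rows come out at the same width, and the added fields read as blanks for rows written before the change.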
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Hey :whd, apologies for not checking this sooner but there appears to be a problem with the uids from this change. I was expecting them to match up with the uids in the retention export but they seem to be different. If I compare the first few `account.created` events from yesterday's exports as an example:

> $ grep account.created flow-2017-02-27.csv | head -3
> 1488153603,account.created,5550b11840e290500e292508dbd4d81e7de54df83d1c3d63058c9d86aa680888,65877,Firefox,51,Windows 8,,,,sync,,,,,,,
> 1488153604,account.created,3b81221c56656db793ee8de04180580a56f5c1b7b0393207574fe4fd70449720,86764,Firefox,51,Windows 8.1,,,,sync,,,,,,,
> 1488153606,account.created,374fd4af5643379ac82a70ed15741d76713b39dfc827177131d501e5c3df7390,34161,Firefox,51,Windows 10,,,,sync,,,,,,,
> $ grep account.created events-2017-02-27.csv | head -3
> 1488153603,Firefox,51,Windows 8,af0c51f90b15a5d99629b6d9723fad6f854fa0a2a4f6ef247d96fc7c879f68b7,account.created,sync,
> 1488153604,Firefox,51,Windows 8.1,4c4896e7a70b7ccf5f4885d06a172587d6cc3361eb52996ec1f63cd000697725,account.created,sync,
> 1488153606,Firefox,51,Windows 10,9709db0e40bdfc38dee0909a7ae4aebda7a4c0b74286e181846ba1910873bcf4,account.created,sync,

The timestamps and user-agent details all match, but the uids are different. My intention in the puppet-config PR was to hash both uids using the same key. Did I do something wrong, or is there something else at play here that makes them hash differently?
Flags: needinfo?(whd)
I think the problem is that the columns are unlabeled and not in the same order. I'm pretty sure the third column of the flow export is a flow_id and is not hmac'd, whereas the fifth column of the retention export (the 64-character hex value in the sample rows above) is an hmac'd uid. The hmac'd uid in the flow logs should be the last (18th?) column per https://github.com/mozilla-services/puppet-config/pull/2488/. FWIW, I don't see the new uid in the flow logs yet for today, so if it's supposed to be there by now there's a separate issue.
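Since the columns are unlabeled, a quick check with explicit column positions can avoid this kind of mix-up. The Python sketch below is only an illustration: `column_values` and `shared_uids` are hypothetical helpers, and the default positions are assumptions. They are 0-based index 17 for the flow export (the "18th?" column guessed above) and 0-based index 4 for the retention export, which is where the hash sits in the sample rows quoted earlier.

```python
import csv
import io


def column_values(csv_text, index):
    """Collect the non-empty values found at a 0-based column index."""
    values = set()
    for row in csv.reader(io.StringIO(csv_text)):
        try:
            value = row[index]
        except IndexError:
            continue  # row too short to have this column
        if value:
            values.add(value)
    return values


def shared_uids(flow_csv, retention_csv, flow_uid_col=17, retention_uid_col=4):
    """Return hashed uids that appear in both exports.

    The default column positions are assumptions taken from this bug,
    not a documented schema; pass the real positions if they differ.
    """
    flow = column_values(flow_csv, flow_uid_col)
    retention = column_values(retention_csv, retention_uid_col)
    return flow & retention
```

Run against the sample rows above, this returns an empty set whether the flow column is index 2 (the flow_id) or index 17 (still blank in those rows), which matches the explanation that the values being compared were not the same field.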
Flags: needinfo?(whd)
Urgh, good point, sorry for not reading/thinking properly. Digging into it elsewhere now.
Product: Cloud Services → Cloud Services Graveyard