Coordinate deployment of Account Ecosystem Telemetry pipeline support
Categories
(Data Platform and Tools :: General, task, P1)
Tracking
(Not tracked)
People
(Reporter: klukas, Assigned: klukas)
References
Details
We will need to coordinate with Data Ops to deploy AccountEcosystemDecryptor Dataflow jobs and to update routing in the edge service to route AET pings to the decryptor.
Comment 1•5 years ago (Assignee)
New options now exist in the Decoder to allow deployment of a pipeline family that supports AET decryption.
Comment 2•5 years ago (Assignee)
:whd is working on the deployment configuration for stage today.
Comment 3•5 years ago (Assignee)
AET Decryptor is deployed in stage and is correctly decrypting payloads. The relevant tables for watching payloads flow through stage are:
moz-fx-data-shar-nonprod-efed.telemetry_live.account_ecosystem_v4
moz-fx-data-shar-nonprod-efed.firefox_accounts_live.account_ecosystem_v1
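A minimal sketch of how one might build a query for watching recent rows arrive in those two stage tables. The `submission_timestamp` column, the 15-minute window, and the helper name `recent_rows_query` are assumptions for illustration, not something confirmed in this bug; the code only builds the SQL text and does not contact BigQuery.

```python
import re

# Stage tables named in the comment above.
STAGE_TABLES = [
    "moz-fx-data-shar-nonprod-efed.telemetry_live.account_ecosystem_v4",
    "moz-fx-data-shar-nonprod-efed.firefox_accounts_live.account_ecosystem_v1",
]

# project.dataset.table, where the project id may contain hyphens.
_TABLE_RE = re.compile(r"^[A-Za-z0-9_-]+\.[A-Za-z0-9_]+\.[A-Za-z0-9_]+$")


def recent_rows_query(table: str, minutes: int = 15, limit: int = 10) -> str:
    """Return BigQuery SQL selecting the most recent rows of a live table.

    Assumes the table has a `submission_timestamp` column (hypothetical here).
    """
    if not _TABLE_RE.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    return (
        "SELECT *\n"
        f"FROM `{table}`\n"  # backticks are needed because of the hyphens
        "WHERE submission_timestamp > "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {int(minutes)} MINUTE)\n"
        "ORDER BY submission_timestamp DESC\n"
        f"LIMIT {int(limit)}"
    )


for table in STAGE_TABLES:
    print(recent_rows_query(table))
    print()
```

The generated text can be pasted into any BigQuery client that has access to the nonprod project.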
Example curl invocation for sending a payload through stage for FxA:
curl -k -X POST "https://stage.ingestion.nonprod.dataops.mozgcp.net/submit/firefox-accounts/account-ecosystem/1/$(uuidgen)" -d '{"ecosystem_client_id":"foo","ecosystem_device_id":"bar","ecosystem_anon_id":"eyJraWQiOiJMbFU0a2VPbWhUdXE5ZkNObnBJbGRZR1Q5dlQ5ZElEd251X1NCdFRnZUVRIiwiYWxnIjoiRUNESC1FUytBMjU2S1ciLCJlbmMiOiJBMjU2R0NNIiwiZXBrIjp7Imt0eSI6IkVDIiwieCI6InpyVHRDYTZDdnhNN0NNMXNXdHlubVVkUW9MSzVpdU01YjZSbHJwWUhxZGsiLCJ5IjoiWEhoWFVJQ21RS0dNbnEwRXVFLXBqRFZ2UGRtTUNHTkRoODNZamEtSVRNcyIsImNydiI6IlAtMjU2In19.aO4mqhu_1C5A4ac99-DMjsbqloeeb9YBQ1kH0ZRYD46pHe_eP35Iyw.LgmHq54T_sSgm5Th.sqjkEhqe5hecP7GsVwqSF4f-9tA2M1_qO3KfOkU_GkE.3bh32H0xW3uvwwyrbEOJ8g"}'
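The `ecosystem_anon_id` value in that payload has the five dot-separated segments of a JWE compact serialization (protected header, encrypted key, IV, ciphertext, tag). As a rough sketch, the protected header can be inspected without any key material by base64url-decoding the first segment. The helper names below are hypothetical, and the demo uses a synthetic token rather than the one from the curl command.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode base64url text that has had its '=' padding stripped."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url text."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def jwe_protected_header(token: str) -> dict:
    """Return the protected header of a JWE in compact serialization."""
    parts = token.split(".")
    if len(parts) != 5:
        raise ValueError(f"expected 5 segments, got {len(parts)}")
    return json.loads(b64url_decode(parts[0]))


# Synthetic token for illustration only; the last four segments are
# placeholders, not real key, IV, ciphertext, or tag values.
header = {"alg": "ECDH-ES+A256KW", "enc": "A256GCM"}
token = ".".join(
    [b64url_encode(json.dumps(header).encode("utf-8")), "key", "iv", "ct", "tag"]
)
print(jwe_protected_header(token))
```

Applied to the value in the curl example, the header's `alg` is `ECDH-ES+A256KW` and its `enc` is `A256GCM`. Decrypting the ciphertext itself requires the pipeline's private key and is not shown here.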
Example curl invocation for sending a payload through stage for desktop telemetry:
curl -k -X POST "https://stage.ingestion.nonprod.dataops.mozgcp.net/submit/telemetry/$(uuidgen)/account-ecosystem/Firefox/77.0a1/default/20200415104457?v=4" -d '{"payload":{"ecosystemClientId":"foo","ecosystemDeviceId":"bar","ecosystemAnonId":"eyJraWQiOiJMbFU0a2VPbWhUdXE5ZkNObnBJbGRZR1Q5dlQ5ZElEd251X1NCdFRnZUVRIiwiYWxnIjoiRUNESC1FUytBMjU2S1ciLCJlbmMiOiJBMjU2R0NNIiwiZXBrIjp7Imt0eSI6IkVDIiwieCI6InpyVHRDYTZDdnhNN0NNMXNXdHlubVVkUW9MSzVpdU01YjZSbHJwWUhxZGsiLCJ5IjoiWEhoWFVJQ21RS0dNbnEwRXVFLXBqRFZ2UGRtTUNHTkRoODNZamEtSVRNcyIsImNydiI6IlAtMjU2In19.aO4mqhu_1C5A4ac99-DMjsbqloeeb9YBQ1kH0ZRYD46pHe_eP35Iyw.LgmHq54T_sSgm5Th.sqjkEhqe5hecP7GsVwqSF4f-9tA2M1_qO3KfOkU_GkE.3bh32H0xW3uvwwyrbEOJ8g"}}'
Note that this desktop example fails schema validation and goes to error output because it lacks various required fields. But the entry in errors contains the correctly decrypted "ecosystemUserId" field.
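The two curl invocations use different URL layouts: the FxA ping goes to a structured endpoint, while the desktop ping goes to the telemetry endpoint with the document ID first and a `v=4` query parameter. The sketch below rebuilds both URLs from their parts, inferring the layout only from the examples above. The function and parameter names are hypothetical, and the code constructs requests without sending them.

```python
import json
import urllib.request
import uuid

STAGE_HOST = "https://stage.ingestion.nonprod.dataops.mozgcp.net"


def structured_url(namespace, doctype, version, doc_id=None):
    """URL layout used by the FxA example: namespace/doctype/version/doc_id."""
    doc_id = doc_id or str(uuid.uuid4())
    return f"{STAGE_HOST}/submit/{namespace}/{doctype}/{version}/{doc_id}"


def telemetry_url(doctype, app_name, app_version, channel, build_id, doc_id=None):
    """URL layout used by the desktop telemetry example."""
    doc_id = doc_id or str(uuid.uuid4())
    return (
        f"{STAGE_HOST}/submit/telemetry/{doc_id}/{doctype}/"
        f"{app_name}/{app_version}/{channel}/{build_id}?v=4"
    )


def build_request(url, payload):
    """Build (but do not send) a POST request with a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


fxa = build_request(
    structured_url("firefox-accounts", "account-ecosystem", 1),
    {"ecosystem_client_id": "foo", "ecosystem_device_id": "bar"},
)
desktop = build_request(
    telemetry_url("account-ecosystem", "Firefox", "77.0a1", "default", "20200415104457"),
    {"payload": {"ecosystemClientId": "foo", "ecosystemDeviceId": "bar"}},
)
print(fxa.full_url)
print(desktop.full_url)
```

Passing a request to `urllib.request.urlopen` would submit it. The payloads here omit the encrypted anon ID for brevity, so they are not equivalent to the curl bodies.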
Comment 4•5 years ago (Assignee)
AET team is ready to move forward with deploying support in prod, so I will coordinate with :whd about pushing AET Decoder support in the prod data pipeline.
Comment 5•5 years ago (Assignee)
:whd is planning to push AET support to the prod data pipeline today.
Comment 6•5 years ago
(In reply to Jeff Klukas [:klukas] (UTC-4) from comment #5)
> :whd is planning to push AET support to the prod data pipeline today.
https://github.com/mozilla-services/cloudops-infra/pull/2229 has been deployed to production. The edge routing table update is still propagating but will be done by tomorrow (US time).
Comment 7•5 years ago (Assignee)
I just attempted to send AET pings to the prod telemetry and structured jobs.
It looks like as soon as the structured AET decoder job attempted to process pings, it started to throw errors indicating it failed to load the cities15000.txt file from GCS, so it seems likely there's a permissions issue with the service account. The telemetry job doesn't show those errors, but hasn't yet processed any messages, which indicates a problem either with routing or with the curl invocation I made.
Comment 8•5 years ago
> it started to throw errors indicating it failed to load the cities15000.txt file from GCS
This was caused by an error in the set of terraform apply commands I used to address https://github.com/mozilla-services/cloudops-infra/pull/2229#issuecomment-668769669. The configuration in that PR is correct; I just forgot to target the resource for adding this viewer access. I've done that and it seems to be working now.
> which indicates a problem either with routing or with the curl invocation I made.
I'm still investigating this, but it appears at least one message has been routed to telemetry-aet, though I don't see it in the aet decoder subscription.
Comment 9•5 years ago
> I'm still investigating this
This was a permissions issue with the edge addressed by https://github.com/mozilla-services/cloudops-infra/pull/2389. It looks like the edge queued the messages correctly and the publish request I saw go through was actually an error. At this point I believe everything is in order.
Comment 10•5 years ago (Assignee)
I now see documents that have gotten past the DecryptAetIdentifiers transform in both structured and telemetry, so I can confirm this looks to be working in prod now.
Comment 11•5 years ago
I expect I probably don't have access to the "live" tables in prod to see my own pings coming in, but I just tried out the patch from Bug 1635659 with prod FxA and prod telemetry stacks, and it claims to have successfully submitted an ecosystem ping. If it's possible to confirm whether we got a successful ping for ecosystemClientId beginning 6ddfcef9 that would be great! (Or alternately, if we got any errors that claim to come from an ISP in Australia, that was probably me).
Comment 12•5 years ago (Assignee)
You do indeed have access to see the live prod tables and there are indeed several Aussie Broadband pings in there: https://sql.telemetry.mozilla.org/queries/73682/source
And here's a base for investigating AET errors:
https://sql.telemetry.mozilla.org/queries/73683/source
I am assuming that we can tackle locking down access to these tables before AET hits release, but for the current moment it's good to have open access for debugging.
Comment 13•5 years ago
> I am assuming that we can tackle locking down access to these tables before AET hits release, but for the current moment it's good to have open access for debugging.
Great, yes, this sounds good to me.