Closed
Bug 1291340
Opened 9 years ago
Closed 8 years ago
Port Sync server metrics to production
Categories
(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dcoates, Assigned: mreid)
References
Details
(Whiteboard: [sync-metrics])
Attachments
(2 files, 1 obsolete file)
Sync metrics[1] are currently running in the dev environment (cloudservices-aws-dev) using my one-off method[2].
This bug tracks the work required to transition these metrics to production.
[1] https://sql.telemetry.mozilla.org/dashboard/sync
[2] https://github.com/dannycoates/smt
Updated•9 years ago
Priority: -- → P2
Assignee
Updated•9 years ago
Assignee: nobody → mreid
Priority: P2 → P1
Assignee
Updated•9 years ago
Points: --- → 3
Assignee
Comment 1•9 years ago
Per a couple of discussions this week, we need to either run the conversion + export code in the prod IAM, or store a PII-scrubbed version of the data that we can access from the dev IAM (specifically from analysis.telemetry.mozilla.org- and airflow-launched instances).
I would prefer to be able to run the import / rollup code via airflow, but if it's significantly easier to run it in prod, let's do that.
Wesley, what do you think?
Flags: needinfo?(whd)
Assignee
Updated•9 years ago
Priority: P1 → P2
Comment 2•9 years ago
As discussed in IRC/Vidyo, if the analysis doesn't need access to the PII fields I can refactor https://github.com/mozilla-services/puppet-config/pull/2031/files to facilitate this.
Flags: needinfo?(whd)
Assignee
Comment 3•8 years ago
Danny, can you confirm that we don't need access to any PII fields?
Flags: needinfo?(dcoates)
Reporter
Comment 4•8 years ago
Here are the fields we need:
uid CHAR(32) NOT NULL encode lzo, -- a sha256-hashed Firefox Account (FxA) user id
s_uid CHAR(32) encode lzo, -- a surrogate user id (generated by token server)
dev CHAR(32) NOT NULL encode lzo, -- a sha256-hashed FxA device id
s_dev CHAR(32) encode lzo, -- a surrogate device id (generated by token server)
ts TIMESTAMP NOT NULL encode lzo, -- timestamp of the request
method VARCHAR(32) encode lzo, -- request method (GET, POST, etc)
code SMALLINT encode lzo, -- http status code of the response
bucket VARCHAR(255) encode bytedict, -- sync bucket name (bookmarks, history, etc)
t INTEGER encode bytedict, -- request time in milliseconds
ua_browser VARCHAR(255) encode lzo, -- request User Agent browser
ua_version INTEGER encode lzo, -- request User Agent browser version
ua_os VARCHAR(255) encode lzo, -- request User Agent OS
host VARCHAR(255) encode lzo -- server hostname that handled the request
None of those are immediately PII, though with enough access (to our internal production systems) one could probably decipher what the actual FxA uid and dev are.
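As an illustration of what a scrubbed export restricted to this list could look like, here is a minimal Python sketch. The column names are copied from the list above; the function name `scrub_record` and the sample input fields `email` and `remote_addr` are hypothetical and not taken from the actual conversion code.

```python
# Column names copied from the field list in this comment.
# Everything else (function name, sample record) is hypothetical.
ALLOWED_FIELDS = (
    "uid", "s_uid", "dev", "s_dev", "ts", "method", "code",
    "bucket", "t", "ua_browser", "ua_version", "ua_os", "host",
)


def scrub_record(record: dict) -> dict:
    """Return a copy of `record` containing only the allow-listed fields.

    Fields that are absent from the input are set to None so that every
    output row has the same shape.
    """
    return {name: record.get(name) for name in ALLOWED_FIELDS}


if __name__ == "__main__":
    raw = {
        "uid": "0" * 32,
        "dev": "1" * 32,
        "method": "GET",
        "code": 200,
        "bucket": "bookmarks",
        "email": "user@example.com",   # hypothetical field, dropped
        "remote_addr": "203.0.113.7",  # hypothetical field, dropped
    }
    print(scrub_record(raw))
```

An allow-list (rather than a deny-list of known PII fields) means any new field added upstream is dropped by default until it is explicitly reviewed.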
Flags: needinfo?(dcoates)
Comment 6•8 years ago
(In reply to Mark Reid [:mreid] from comment #5)
> @whd, does this list look ok?
To me, yes.
Flags: needinfo?(whd)
Comment 7•8 years ago
Taking this to make the data available to ATMO, whereupon :mreid will take over.
Assignee: mreid → whd
Priority: P2 → P1
Comment 8•8 years ago
https://github.com/mozilla-services/puppet-config/pull/2263
Data should be available in s3://net-mozaws-prod-us-west-2-pipeline-analysis/sync-metrics/data, which should be accessible from ATMO etc. Back to :mreid.
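For anyone pointing an S3 client at that location, most APIs want the bucket and the key prefix separately. A small sketch of splitting the URI with the Python standard library follows; the helper name `split_s3_uri` is hypothetical, and nothing here checks that the path exists or that credentials allow access.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an s3:// URI into (bucket, key_prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("not an s3:// URI: %r" % uri)
    # The bucket is the network location; the prefix is the path
    # without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    bucket, prefix = split_s3_uri(
        "s3://net-mozaws-prod-us-west-2-pipeline-analysis/sync-metrics/data"
    )
    print(bucket)
    print(prefix)
```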
Assignee: whd → mreid
Assignee
Updated•8 years ago
Priority: P1 → P2
Comment 9•8 years ago
Just checking in to see what's left to get this in production. A lot of work was put into this up until now and I feel like we're really close to the finish line.
Until it lands in prod, all of Danny's beautiful dashboards are reporting back inaccurate numbers because data is struggling to refresh in the dev environment. https://sql.telemetry.mozilla.org/dashboard/sync
Thanks in advance for updating me on this status.
Assignee
Comment 10•8 years ago
I'm back on this as my #1 item this week. I should have something preliminary in the next few days.
Priority: P2 → P1
Comment 11•8 years ago
(In reply to Mark Reid [:mreid] from comment #10)
> I'm back on this as my #1 item this week. I should have something
> preliminary in the next few days.
Great news! Thanks for the update!
Assignee
Comment 12•8 years ago
r? @Danny for adherence to the previous conversion and rollup logic.
r? @Roberto for partitioning, Dataset mangling, and general Spark stuff (feel free to redirect to others).
Flags: needinfo?(dcoates)
Attachment #8810833 -
Flags: review?(rvitillo)
Updated•8 years ago
Attachment #8810833 -
Flags: review?(rvitillo) → review-
Assignee
Comment 13•8 years ago
Attachment #8810833 -
Attachment is obsolete: true
Attachment #8824450 -
Flags: review?(rvitillo)
Comment 14•8 years ago
Updated•8 years ago
Attachment #8824450 -
Flags: review?(rvitillo) → review+
Assignee
Updated•8 years ago
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(dcoates)
Resolution: --- → FIXED
Updated•6 years ago
Product: Cloud Services → Cloud Services Graveyard