Closed
Bug 1353110
Opened 7 years ago
Closed 7 years ago
Land pings with telemetry experiment annotations into new source
Categories
(Data Platform and Tools :: General, enhancement, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bugzilla, Assigned: whd)
References
Details
(Whiteboard: [SvcOps])
Now that Bug 1348748 has landed, we should start receiving pings annotated with the new unified experiment annotation object. Pings with a non-empty object should be landed into a new source that includes dimensions for submissionDate, docType, experimentId, and experimentBranch. If a ping is tagged with multiple experiments, it should be written into multiple experiment dimensions.
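The fan-out described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline's implementation: the ping layout (an `environment.experiments` object mapping each experiment id to an object with a `branch` key), the top-level `submissionDate` and `docType` keys, the `"UNKNOWN"` fallback, and the function name `experiment_dimensions` are all assumptions made for the example.

```python
from typing import Iterator, Tuple


def experiment_dimensions(ping: dict) -> Iterator[Tuple[str, str, str, str]]:
    """Yield one (submissionDate, docType, experimentId, experimentBranch)
    tuple per experiment the ping is tagged with.

    Assumed (hypothetical) ping layout:
      ping["environment"]["experiments"] == {"<id>": {"branch": "<branch>"}}
    """
    experiments = ping.get("environment", {}).get("experiments") or {}
    # Sort so the output order is deterministic.
    for experiment_id, info in sorted(experiments.items()):
        yield (
            ping["submissionDate"],
            ping["docType"],
            experiment_id,
            info.get("branch", "UNKNOWN"),  # fallback value is an assumption
        )


# A ping tagged with two experiments is written under two dimension tuples;
# a ping with an empty annotation object produces none and is skipped.
example_ping = {
    "submissionDate": "20170529",
    "docType": "main",
    "environment": {
        "experiments": {
            "exp-a": {"branch": "control"},
            "exp-b": {"branch": "treatment"},
        }
    },
}
rows = list(experiment_dimensions(example_ping))
```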
Updated•7 years ago
Points: --- → 1
Priority: -- → P2
Updated•7 years ago
Component: Metrics: Pipeline → Pipeline Ingestion
Product: Cloud Services → Data Platform and Tools
Hi Trink, do you have a timeline for when you'll be able to get to this?
Flags: needinfo?(mtrinkala)
Comment 2•7 years ago
I haven't been in the loop on this. This is for a new direct-to-Parquet output that can write one message to multiple files (all with the same schema), correct? So basically you want what the generic Parquet output does, with an additional loop over the experiments hash. Is there a 'new source' schema, or is it just the existing schemas (main, crash, etc.) with the S3 dimensions specified above? This can be scheduled for the next sprint, putting it in production at the very end of May.
Flags: needinfo?(mtrinkala) → needinfo?(ssuh)
Updated•7 years ago
Points: 1 → 3
Priority: P2 → P1
Comment 3•7 years ago
Sunah spec'ed out the requirements for a Heka protobuf stream in IRC
https://github.com/mozilla-services/lua_sandbox_extensions/pull/137

Assigning to whd to update the production configurations and deploy (name it as desired):

# /pipeline/modules/pipeline/templates/hindsight/output/telemetry_s3.cfg.erb (add this to the existing cfg)
experiment_dimension_file = "schema.telemetry.per_experiment.json"

# /pipeline/modules/pipeline/files/schema/schema.telemetry.per_experiment.json (create a new dimension specification)
{
  "version": 1,
  "dimensions": [
    { "field_name": "submissionDate", "allowed_values": "*" },
    { "field_name": "docType", "allowed_values": "*" },
    { "field_name": "experimentId", "allowed_values": "*", "is_variable": true },
    { "field_name": "experimentBranch", "allowed_values": "*", "is_variable": true }
  ]
}
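One way to read the dimension specification above is sketched below in Python. It rests on an assumption not confirmed in this bug: that `"is_variable": true` marks dimensions whose values come from each entry of the experiments hash, while the other dimensions come from the message itself. The function name `partition_paths`, the `/` separator, and the `"UNKNOWN"` fallback are hypothetical and do not describe the lua_sandbox_extensions output.

```python
import json

# The dimension specification from this comment, embedded for the example.
SPEC_JSON = """
{
  "version": 1,
  "dimensions": [
    {"field_name": "submissionDate", "allowed_values": "*"},
    {"field_name": "docType", "allowed_values": "*"},
    {"field_name": "experimentId", "allowed_values": "*", "is_variable": true},
    {"field_name": "experimentBranch", "allowed_values": "*", "is_variable": true}
  ]
}
"""


def partition_paths(spec: dict, fields: dict, experiments: dict) -> list:
    """Return one partition path per experiment.

    Assumption: dimensions flagged "is_variable" are filled from the
    experiment entry; all other dimensions are filled from the message fields.
    """
    paths = []
    for experiment_id, branch in sorted(experiments.items()):
        variable = {"experimentId": experiment_id, "experimentBranch": branch}
        parts = []
        for dim in spec["dimensions"]:
            name = dim["field_name"]
            source = variable if dim.get("is_variable") else fields
            parts.append(str(source.get(name, "UNKNOWN")))
        paths.append("/".join(parts))
    return paths


spec = json.loads(SPEC_JSON)
paths = partition_paths(
    spec,
    {"submissionDate": "20170529", "docType": "main"},
    {"exp-a": "control", "exp-b": "treatment"},
)
```

Keeping the fixed and variable dimensions in one ordered list means the same specification drives both the path layout and the per-experiment loop.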
Assignee: mtrinkala → whd
Points: 3 → 2
Flags: needinfo?(ssuh)
Comment 4•7 years ago (Assignee)
P2 until next sprint per our deploy cadence.
Priority: P1 → P2
Whiteboard: [SvcOps]
whd: What's the date we can expect this in prod, then?
Flags: needinfo?(whd)
Comment 6•7 years ago (Assignee)
2017-05-29. If this is an urgent request, :trink can probably publish the packages sooner and we can do an out-of-band update.
Flags: needinfo?(whd)
Comment 7•7 years ago (Assignee)
While prepping this deploy I noticed we already have a separate heka output for experiments per bug #1255543. It looks like that data source "telemetry-experiments" was never used, and as a result we should close that bug, remove the existing configuration, and replace it with the work here. :mreid does that sound correct? :sunahsuh mentioned telemetry-cohorts as her placeholder for the new data source name, which I will use unless anyone has a strong opinion about it.
Flags: needinfo?(mreid)
Comment 8•7 years ago
(In reply to Wesley Dawson [:whd] from comment #7)
> While prepping this deploy I noticed we already have a separate heka output
> for experiments per bug #1255543. It looks like that data source
> "telemetry-experiments" was never used, and as a result we should close that
> bug, remove the existing configuration, and replace it with the work here.
> :mreid does that sound correct?
Yes, that sounds right to me. As far as I know, nobody ever actively used the "telemetry-experiments" data source. We should remove its configuration to avoid confusion. I believe there may be plans to expose the Telemetry Experiments annotation in a compatible way with the output in this bug, so if/when that happens, this bug will fully supersede bug 1255543.
> :sunahsuh mentioned telemetry-cohorts as her placeholder for the new data
> source name, which I will use unless anyone has a strong opinion about it.
Sounds good to me.
Flags: needinfo?(mreid)
Comment 9•7 years ago
> :sunahsuh mentioned telemetry-cohorts as her placeholder for the new data
> source name, which I will use unless anyone has a strong opinion about it.
Ha, this design simply alters the dimensions for the experiment output (same prefix). However, we can run it as two separate plugins and have it work (albeit less efficiently). In the future, the design intentions need to be better communicated and written down, and ideally the implementation would be created by the team requesting it and reviewed by ops/myself.
Comment 10•7 years ago (Assignee)
We ended up pushing a patched version per https://github.com/mozilla-services/lua_sandbox_extensions/pull/141 to allow us to run only one output writing to two output prefixes. I've deployed https://github.com/mozilla-services/puppet-config/pull/2589 which removes the old experiments output and replaces it with telemetry-cohorts. I've updated the metadata bucket with this new source and verified it works from ATMO.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Updated•2 years ago
Component: Pipeline Ingestion → General