Open
Bug 1336977
Opened 7 years ago
Updated 2 years ago
Make environment fields scalars
Categories
(Toolkit :: Telemetry, task, P4)
Tracking
()
NEW
People
(Reporter: rvitillo, Unassigned)
References
Details
(Whiteboard: [measurement:client])
User Story
Currently, several ETL jobs have to be changed whenever a new field is added to the environment section. Even though there is a (sadly incomplete) JSON schema describing the environment section which could be used to automate this process, the fact that fields are nested in a non-uniform way means that one can't easily write generic code, e.g. for alerting or aggregation, that works for any kind of scalar measurement. Similar considerations apply to simpleMeasurements. Since the environment section ultimately contains mostly scalar values, it would be convenient to store those attributes within the scalar section of the ping. That would make it trivial to create generic tooling capable of adapting to schema changes, just like we do with histograms.
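To illustrate the kind of generic code the user story is asking for, here is a minimal sketch that flattens a nested environment-like object into uniformly named scalars. The function name `flatten_scalars`, the dotted naming scheme, and the sample data are assumptions made for illustration only; they are not the actual Telemetry schema or an existing ETL job.

```python
# Hypothetical sketch: turn a nested, non-uniform section into flat scalars.
# The dotted names and sample values are illustrative, not the real schema.
SCALAR_TYPES = (bool, int, float, str)


def flatten_scalars(section, prefix=""):
    """Return {dotted.path: value} for every scalar leaf in a nested dict."""
    flat = {}
    for key, value in section.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted path.
            flat.update(flatten_scalars(value, path))
        elif isinstance(value, SCALAR_TYPES):
            flat[path] = value
        # Lists, None and other non-scalar values are skipped here; how to
        # represent them is an open question in this bug.
    return flat


sample_environment = {
    "settings": {"locale": "en-US", "telemetryEnabled": True},
    "system": {"memoryMB": 8192, "os": {"name": "Linux"}, "cpu": {"count": 4}},
    "gfx": {"adapters": [{"vendorID": "0x8086"}]},  # non-scalar, skipped
}

flat_environment = flatten_scalars(sample_environment)
```

With data in this shape, a generic job can iterate over `flat_environment.items()` without knowing how any particular field was nested, so a newly added field would not require a code change.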
No description provided.
Reporter
Updated•7 years ago
User Story: (updated)
Reporter
Updated•7 years ago
User Story: (updated)
Comment 1•7 years ago
There's some prior art in bug 1278920
Comment 2•7 years ago
(Commenting on User Story)
> Currently several ETL jobs have to be changed when a new field to the
> environment section is added. Even though there is a (sadly incomplete) JSON
> schema describing the environment section which could be used to automate
> this process, the fact that fields are nested in a non uniform way means
> that one can't easily write generic code, e.g. for alerting or aggregation,
> that works for any kind of scalar measurement. Similar considerations apply
> to simpleMeasurements.

Another aspect here is discoverability. Once we document the data in a structured format, we could integrate it into tooling like the "data explorer".

> Since ultimately the environment section contains mostly scalar values, it
> would be convenient to store those attributes within the scalar section of
> the ping. That would make it trivial to create generic tooling capable of
> adapting to schema changes, just like we do with histograms.

This is a medium to long-term goal we have, mostly blocked on finding the time to prioritize it.
I want to do this at some point for all the "scalar" data in the main ping, as part of the "main ping cleanup".
To keep things simpler, I'll make this bug about the "environment" data specifically.

Questions that need to be solved first:
(1) How to deal with environment parts that are not scalars? (addons etc.)
(2) How to deal with the existing environment data format and its consumers?

Let's assume, for example, that we track environment data in a separate file, Environment.yaml.
Then we could solve (1) by allowing for special "object" values or so? Or we could see if we can flatten all of them into keyed scalars.
User prefs are probably best tracked in a separate file (see bug 1330856).

For (2), would we try to keep the existing format, building environment scalars into a nested JSON object? (requires some awkward tree walking in the jobs)
Or would we serialize environment data into the flat scalar format we use for payload/processes/*/scalars? (would require a lot of job updates)
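The trade-off in question (2) can be sketched as follows. This is a hypothetical illustration, assuming environment scalars are named with dotted paths and that no name is a prefix of another; `nest_scalars` and the sample names are invented for the example and do not describe the existing ping format or any existing job.

```python
# Hypothetical sketch: rebuild the nested environment format from flat,
# dotted scalar names so existing consumers could keep reading it.
def nest_scalars(flat):
    """Build a nested dict from {dotted.path: value} entries.

    Assumes no scalar name is a prefix of another name (e.g. both
    "system.os" and "system.os.name"), which would be a conflict.
    """
    nested = {}
    for path, value in flat.items():
        *parents, leaf = path.split(".")
        node = nested
        for part in parents:
            # Create intermediate objects on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


flat_scalars = {
    "system.os.name": "Linux",
    "system.cpu.count": 4,
    "settings.locale": "en-US",
}

# Option A: keep the existing nested format; jobs keep walking the tree.
nested_environment = nest_scalars(flat_scalars)
os_name_nested = nested_environment["system"]["os"]["name"]

# Option B: serialize the flat format; jobs do a single lookup by name.
os_name_flat = flat_scalars["system.os.name"]
```

Option A leaves existing consumers untouched at the cost of the tree walking mentioned above, while option B gives generic tooling a single lookup per measurement but means every job that reads the nested format has to be updated.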
Summary: Move environment fields and simpleMeasurements to the scalar section → Make environment fields scalars
Updated•7 years ago
Priority: -- → P4
Whiteboard: [measurement:client]
Updated•5 years ago
Type: defect → task
Updated•2 years ago
Severity: normal → S3