Bug 1353105
Opened 8 years ago
Closed 7 years ago
Automatically Add All Scalars to main_summary
Categories
(Data Platform and Tools :: General, enhancement, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: frank, Assigned: frank)
Heavy-user analysis needs all of the engagement scalars. We might as well add *all* scalars automatically, which reduces data engineering workload down the line.
Comment 1•8 years ago (Assignee)
Scalars are recorded per process and per namespace. The scalar columns currently in main_summary use only the bare scalar name, e.g. "max_concurrent_tab_count". I propose naming them as in longitudinal, where the name is: <scalar_prefix>_<process>_<namespace>_<name>, e.g. "scalars_parent_browser_engagement_max_concurrent_tab_count". The downside is that the previous names would no longer be available, so we'd probably have to version main_summary so that people can still access them in the historical data (I'm assuming a backfill is out of the question). Thoughts, Mark?
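A minimal sketch of the proposed naming scheme, assuming the namespace and name arrive as one dotted scalar name (e.g. "browser.engagement.max_concurrent_tab_count") and that dots are replaced with underscores. The function name `scalar_column_name` and the default prefix are illustrative, not taken from the main_summary code.

```python
import re


def scalar_column_name(process, scalar, prefix="scalars"):
    """Build a column name of the form <prefix>_<process>_<namespace>_<name>.

    `scalar` is assumed to be the dotted scalar name, i.e. namespace and
    name joined by dots. This helper is hypothetical.
    """
    joined = "_".join([prefix, process, scalar])
    # Replace every character that is not a letter, digit, or underscore
    # (such as the dots in the scalar name) with an underscore.
    return re.sub(r"[^0-9a-zA-Z_]", "_", joined).lower()


print(scalar_column_name("parent", "browser.engagement.max_concurrent_tab_count"))
# prints scalars_parent_browser_engagement_max_concurrent_tab_count
```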
Flags: needinfo?(mreid)
Comment 2•8 years ago
I think this is a good idea. We'll likely want to keep adding all the new scalars as they arrive, so making the process as easy as possible makes sense.
When we version to v4, we could rewrite the v3 main_summary data into v4 using the new column names (rather than backfilling from the raw data, which is quite time-consuming). Users of main_summary would then not have to query both tables.
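The rewrite from v3 to v4 amounts to a column rename, which could be sketched roughly as below. The mapping contains only the one old/new pair mentioned in this bug (assuming the existing column holds the parent-process value); `V3_TO_V4` and `rename_row` are hypothetical names, and a real job would apply the same mapping to the stored dataset rather than to Python dicts.

```python
# Hypothetical mapping from v3 column names to the proposed v4 names.
# Only this pair appears in the discussion; a real mapping would cover
# every scalar column.
V3_TO_V4 = {
    "max_concurrent_tab_count":
        "scalars_parent_browser_engagement_max_concurrent_tab_count",
}


def rename_row(row, mapping=V3_TO_V4):
    """Return a copy of a v3 row with scalar columns renamed.

    Columns that are not in the mapping pass through unchanged.
    """
    return {mapping.get(col, col): value for col, value in row.items()}


v3_row = {"client_id": "abc", "max_concurrent_tab_count": 7}
print(rename_row(v3_row))
```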
Flags: needinfo?(mreid)
Updated•7 years ago
Component: Metrics: Pipeline → Datasets: Main Summary
Product: Cloud Services → Data Platform and Tools
Comment 3•7 years ago (Assignee)
Will be backfilled with bug 1362161
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Updated•2 years ago
Component: Datasets: Main Summary → General