Closed Bug 1288444 Opened 8 years ago Closed 4 years ago

Add categorical histograms to the longitudinal dataset

Categories

(Data Platform and Tools Graveyard :: Datasets: Longitudinal, defect, P3)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: gfritzsche, Unassigned)

From IRC:

> categorical histograms could be stored in structs which is what we are already doing in main_summary:
> https://github.com/mozilla/telemetry-batch-view/blob/master/src/main/scala/com/mozilla/telemetry/views/MainSummaryView.scala#L387
> Probably using maps is the better choice as it allows to use cross join unnest
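
To illustrate the map suggestion above, here is a minimal Python sketch of what unnesting a map column amounts to: each label-to-count entry becomes its own row, which is roughly what CROSS JOIN UNNEST produces in Presto-style SQL. The row layout, client ids, and the `unnest_map` helper are hypothetical and are not taken from the longitudinal dataset.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical rows: (client_id, {label: count}); not taken from real pings.
Row = Tuple[str, Dict[str, int]]


def unnest_map(rows: List[Row]) -> Iterator[Tuple[str, str, int]]:
    """Yield one (client_id, label, count) tuple per map entry."""
    for client_id, counts in rows:
        for label, count in counts.items():
            yield (client_id, label, count)


rows = [
    ("client-a", {"CommonLabel": 1, "Label2": 1}),
    ("client-b", {"CommonLabel": 3}),
]
flat = list(unnest_map(rows))
```

With a struct, each label would instead be a fixed field that has to be named explicitly in every query, so a generic per-label aggregation is harder to write.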
Points: --- → 3
Priority: -- → P2
Priority: P2 → P3
Summary: Make categorical histograms convenient in longitudinal dataset & re:dash → Add categorical histograms to the longitudinal dataset
Assignee: nobody → rharter
Priority: P3 → P2
Priority: P2 → P1
I need some clarification on this. The histograms.json [0] file only lists two example categorical histograms. Are there histograms missing from this set or are we expecting more in the future? Is there an easy way to enumerate these besides filtering past pings?

[0] https://hg.mozilla.org/releases/mozilla-release/raw-file/tip/toolkit/components/telemetry/Histograms.json
Flags: needinfo?(gfritzsche)
(In reply to Ryan Harter [:harter] from comment #1)
> I need some clarification on this. The histograms.json [0] file only lists
> two example categorical histograms. Are there histograms missing from this
> set or are we expecting more in the future? Is there an easy way to
> enumerate these besides filtering past pings?

We are expecting more in the future. AFAIU the plan is that once the tooling is in place, data collection reviewers can point engineers to using them.
Flags: needinfo?(gfritzsche)
(In reply to Ryan Harter [:harter] from comment #1)
> Is there an easy way to enumerate these besides filtering past pings?

Not right now, although I'm thinking about how to fix this.
I tried to find some example pings from the test histograms, but they don't
seem to be reporting [0].

Do we have examples or a spec for these histogram payloads? The docs
mention these will be like enumerated histograms with labels, but the format
is still unclear to me. Will the payload be an array, as in an enumerated
histogram, or a map? Thanks!

[0] https://gist.github.com/harterrt/ed44bdd4e07b75be652154f5d706eef6
Flags: needinfo?(gfritzsche)
(In reply to Ryan Harter [:harter] from comment #4)
> Will the payload be an array, as in an enumerated
> histogram, or a map? Thanks!

Yes, in the payload they are just like enumerated histograms.
The labels are only used in the client API; internally a categorical histogram is stored just like an enumerated histogram.

E.g. for TELEMETRY_TEST_CATEGORICAL, adding 1 to the labels "CommonLabel" & "Label2", <ping>/payload/histograms/TELEMETRY_TEST_CATEGORICAL looks like this:
> {"range":[1,3],"bucket_count":4,"histogram_type":1,"values":{"0":1,"1":1,"2":0},"sum":1}
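
As a rough sketch of how a consumer could turn such a payload back into per-label counts, the bucket index can be mapped to the label at the same position in the histogram's label list. The `label_counts` helper is hypothetical, and the label list below is an assumption: only "CommonLabel" and "Label2" are named in this bug, and the third entry, "Label3", is assumed here for illustration.

```python
import json
from typing import Dict, List


def label_counts(payload_json: str, labels: List[str]) -> Dict[str, int]:
    """Map bucket indices in the payload's "values" to label names by position."""
    payload = json.loads(payload_json)
    values = payload.get("values", {})
    # Bucket keys are strings ("0", "1", ...); missing buckets count as 0.
    return {label: int(values.get(str(i), 0)) for i, label in enumerate(labels)}


# Payload copied from the example in this comment.
PAYLOAD = (
    '{"range":[1,3],"bucket_count":4,"histogram_type":1,'
    '"values":{"0":1,"1":1,"2":0},"sum":1}'
)
# Assumed label order; the third label is not given in this bug.
LABELS = ["CommonLabel", "Label2", "Label3"]

counts = label_counts(PAYLOAD, LABELS)
```

Because the payload carries no labels, the label list has to come from the histogram definition, and a change in label order would silently change the meaning of the buckets.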
Flags: needinfo?(gfritzsche)
Note that we are changing the bucket logic in bug 1312806 to reduce user error.
Bug 1324478 was filed to deal with this from telemetry-batch-view.
Priority: P1 → P2
Priority: P2 → P3
Moving to P3. I do not plan on addressing this in Q2.
Moving to P2, due to Bug 1376596
Component: Metrics: Pipeline → Datasets: Longitudinal
Priority: P3 → P2
Product: Cloud Services → Data Platform and Tools
I will not be able to take care of this in the next month. Moving to P3.
Priority: P2 → P3
Assignee: rharter → nobody

Longitudinal has been decommissioned per Bug 1572033.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX
Product: Data Platform and Tools → Data Platform and Tools Graveyard