Closed Bug 1322802 Opened 8 years ago Closed 4 years ago

Allow for explicit versioning in Histograms.json

Categories

(Toolkit :: Telemetry, defect, P3)

RESOLVED WONTFIX
Tracking Status
firefox53 --- affected

People

(Reporter: gfritzsche, Unassigned)

Details

(Whiteboard: [measurement:client])

Attachments

(1 file)

Currently we can't change bucket counts for histograms; instead, we have to rename the probe.
This leads to naming schemes like HISTOGRAM, HISTOGRAM_2, ...

We should make this information explicit by adding an optional "version" field (default 1) to the histogram entries.
With that we can increment the version field when changing bucket parameters and internally add "_2", "_3", ... to the histogram name where version>1.

This allows us to track this histogram over version changes.
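
For illustration, here is a minimal sketch of how such entries might look and be read. Only the "version" field name and its default of 1 come from the proposal above; the entry contents, the name EXAMPLE_PROBE and the helper `histogram_version` are hypothetical.

```python
import json

# Hypothetical Histograms.json fragment; all field values are illustrative only.
HISTOGRAMS_JSON = """
{
  "GC_MS": {
    "kind": "exponential",
    "high": 10000,
    "n_buckets": 50,
    "description": "Illustrative entry without a version field"
  },
  "EXAMPLE_PROBE": {
    "kind": "exponential",
    "high": 10000,
    "n_buckets": 100,
    "version": 2,
    "description": "Illustrative entry whose bucket parameters changed once"
  }
}
"""


def histogram_version(definition):
    """Return the entry's version, defaulting to 1 when the field is absent."""
    version = definition.get("version", 1)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(version, bool) or not isinstance(version, int) or version < 1:
        raise ValueError("version must be an integer >= 1, got %r" % (version,))
    return version


definitions = json.loads(HISTOGRAMS_JSON)
versions = {name: histogram_version(d) for name, d in definitions.items()}
```

Entries that omit the field keep behaving as version 1, so existing definitions would not need to change.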
Frank, Roberto:
I assume that this doesn't affect the aggregator or pipeline with the following:
* allow an optional "version" field in Histograms.json (defaults to 1)
* make histogram_tools.py Histogram.name() return `name + '_' + str(version)`
  (e.g. for name:"GC_MS" and version:2, Histogram.name() returns "GC_MS_2")
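
A minimal sketch of the second point, assuming the suffix is only added where version>1 as in the bug description (the formula as written would also turn version 1 into "GC_MS_1"). `VersionedHistogram` is a hypothetical stand-in, not the actual Histogram class in histogram_tools.py.

```python
class VersionedHistogram:
    """Hypothetical stand-in for the Histogram class in histogram_tools.py."""

    def __init__(self, name, definition):
        self._name = name
        # Assumption: "version" is optional and defaults to 1.
        self._version = definition.get("version", 1)

    def version(self):
        return self._version

    def name(self):
        # Version 1 keeps the plain name so existing probes are unaffected.
        if self._version == 1:
            return self._name
        return self._name + "_" + str(self._version)


plain = VersionedHistogram("GC_MS", {})
bumped = VersionedHistogram("GC_MS", {"version": 2})
```

Here `plain.name()` stays "GC_MS" while `bumped.name()` becomes "GC_MS_2", which matches the names produced today by renaming a probe by hand.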

Can you confirm?
Flags: needinfo?(rvitillo)
Flags: needinfo?(fbertsch)
How will this interact with the display and aggregations on t.m.o/longitudinal/etc? I'm concerned that people will look for the plain variant (and in fact that's what we want, if this is an implementation detail).
(In reply to Georg Fritzsche [:gfritzsche] from comment #1)
> Frank, Roberto:
> I assume that this doesn't affect the aggregator or pipeline with the
> following:
> * allow an optional "version" field in Histograms.json (defaults to 1)
> * make histogram_tools.py Histogram.name() return `name + '_' + str(version)`
>   (e.g. for name:"GC_MS" and version:2, Histogram.name() returns "GC_MS_2")
> 
> Can you confirm?

Some changes are required to python_moztelemetry, python_mozaggregator & telemetry-batch-view to make this work.

> How will this interact with the display and aggregations on t.m.o/longitudinal/etc? I'm concerned that people will look for the plain variant (and in fact that's what we want, if this is an implementation detail).

The longitudinal dataset, just as t.m.o., will end up having multiple versions of a histogram. 

The dashboard of t.m.o. could be adapted to seamlessly transition from one version to the other for build-id roll-ups but we can't do the same with submission-date ones since we can't easily merge histograms with different definitions. 

We could enforce that the un-versioned histogram name refers to the most recent version. That could break analyses that assume a certain histogram definition though.
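
As a rough sketch of that last option, an un-versioned name could be resolved to its most recent version as below. The set of names and the helper `latest_versioned_name` are made up for illustration, and matching on the "_N" suffix is only an assumption: a consumer with access to an explicit version field would not need to parse names at all.

```python
import re

# Hypothetical set of histogram names as they might appear in a dataset.
KNOWN_NAMES = {"GC_MS", "GC_MS_2", "GC_MS_3", "OTHER_PROBE"}


def latest_versioned_name(base_name, known_names):
    """Map an un-versioned name to the name of its most recent version."""
    pattern = re.compile(re.escape(base_name) + r"_(\d+)$")
    best_version, best_name = 0, None
    for name in known_names:
        if name == base_name:
            version = 1  # the plain name is treated as version 1
        else:
            match = pattern.match(name)
            if match is None:
                continue  # unrelated histogram
            version = int(match.group(1))
        if version > best_version:
            best_version, best_name = version, name
    return best_name  # None when the histogram is unknown
```

With these names, "GC_MS" resolves to "GC_MS_3". An analysis written against the version 1 bucket layout would then silently read version 3 data, which is the breakage mentioned above.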
Flags: needinfo?(rvitillo)
(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #3)
> (In reply to Georg Fritzsche [:gfritzsche] from comment #1)
> > Frank, Roberto:
> > I assume that this doesn't affect the aggregator or pipeline with the
> > following:
> > * allow an optional "version" field in Histograms.json (defaults to 1)
> > * make histogram_tools.py Histogram.name() return `name + '_' + str(version)`
> >   (e.g. for name:"GC_MS" and version:2, Histogram.name() returns "GC_MS_2")
> > 
> > Can you confirm?
> 
> Some changes are required to python_moztelemetry, python_mozaggregator &
> telemetry-batch-view to make this work.
> 
> > How will this interact with the display and aggregations on t.m.o/longitudinal/etc? I'm concerned that people will look for the plain variant (and in fact that's what we want, if this is an implementation detail).
> 
> The longitudinal dataset, just as t.m.o., will end up having multiple
> versions of a histogram. 
> 
> The dashboard of t.m.o. could be adapted to seamlessly transition from one
> version to the other for build-id roll-ups but we can't do the same with
> submission-date ones since we can't easily merge histograms with different
> definitions. 

I do want to point out that this is superior to what we have now, since currently the different new versions have different names, and thus can't be merged in either build-id or submission-date aggregation. With versioning, as you said, we can merge all versions for build-id aggregation (assuming 1:1 build-id:histogram version).
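
A minimal sketch of that build-id merge, assuming each build-id maps to exactly one histogram version. The record shape and the function `build_id_series` are invented for this example and do not reflect the aggregator's actual data model.

```python
def build_id_series(records):
    """Merge all versions of one histogram into a single series keyed by build-id.

    `records` is an iterable of (build_id, version, counts) tuples, a made-up
    shape for this sketch. Counts from the same build-id are summed bucket by
    bucket, which is only valid because they share one bucket layout.
    """
    series = {}
    for build_id, version, counts in records:
        if build_id not in series:
            series[build_id] = (version, list(counts))
            continue
        seen_version, totals = series[build_id]
        if seen_version != version:
            # The 1:1 build-id:version assumption does not hold for this data.
            raise ValueError("build-id %s reports versions %d and %d"
                             % (build_id, seen_version, version))
        series[build_id] = (version, [a + b for a, b in zip(totals, counts)])
    return dict(sorted(series.items()))


# Illustrative data: the bucket count changes between the two builds.
sample = [
    ("20161201", 1, [1, 2, 3]),
    ("20161201", 1, [0, 1, 1]),
    ("20161215", 2, [4, 0, 0, 1]),
]
merged = build_id_series(sample)
```

Each build-id keeps its own bucket layout, so no counts are ever added across versions. A submission-date roll-up has no such guarantee, because pings from builds with different versions can arrive on the same day.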

> 
> We could enforce that the un-versioned histogram name refers to the most
> recent version. That could break analyses that assume a certain histogram
> definition though.

Additionally, we could just add the version selection to the TMO frontend (and also, then, to the service). If we do it this way, it would just replicate the current views (since different versions would all have different views - no merging).
Flags: needinfo?(fbertsch)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #2)
> How will this interact with the display and aggregations on
> t.m.o/longitudinal/etc? I'm concerned that people will look for the plain
> variant (and in fact that's what we want, if this is an implementation
> detail).

Is that a blocker right now?
It seems like starting this way doesn't make the situation worse than what we have right now (while adding semantic information).

(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #3)
> (In reply to Georg Fritzsche [:gfritzsche] from comment #1)
> > Frank, Roberto:
> > I assume that this doesn't affect the aggregator or pipeline with the
> > following:
> > * allow an optional "version" field in Histograms.json (defaults to 1)
> > * make histogram_tools.py Histogram.name() return `name + '_' + str(version)`
> >   (e.g. for name:"GC_MS" and version:2, Histogram.name() returns "GC_MS_2")
> > 
> > Can you confirm?
> 
> Some changes are required to python_moztelemetry, python_mozaggregator &
> telemetry-batch-view to make this work.

I see. I thought that if I started by making histogram_tools.py just return a versioned name(), it would be opaque and equivalent to the current approach of renaming histograms.
I'm scheduling a conversation for next week to figure out what this would entail (and if we can do something in the short term).
This adds versioning in Histograms.json that is opaque for clients of histogram_tools.py.
Priority: P3 → P2
Depends on: 1324474
Depends on: 1324475
Depends on: 1324476
Depends on: 1324477
No longer depends on: 1324477
Priority: P2 → P3

With bug 1324475 WONTFIX'd and our goal of using Glean to solve user problems like these, I guess we should WONTFIX this too.

One last note: Alessio, do you want this codified anywhere as a user research item for Glean enhancements?

Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(alessio.placitelli)
Resolution: --- → WONTFIX

(In reply to Chris H-C :chutten from comment #7)
> With bug 1324475 WONTFIX'd and our goal of using Glean to solve user problems like these, I guess we should WONTFIX this too.
> 
> One last note: Alessio, do you want this codified anywhere as a user research item for Glean enhancements?

Can you please file a bug about the problem(s) related to this bug that you think still apply to Glean (rather than about a solution)?

Flags: needinfo?(alessio.placitelli) → needinfo?(chutten)
See Also: → 1630966

Done

Flags: needinfo?(chutten)