Closed
Bug 1312806
Opened 8 years ago
Closed 8 years ago
Categorical histograms should default to 50 buckets
Categories
(Toolkit :: Telemetry, defect, P1)
Toolkit
Telemetry
Tracking
()
RESOLVED
FIXED
mozilla53
Tracking | Status | |
---|---|---|
firefox53 | --- | fixed |
People
(Reporter: sfink, Assigned: gfritzsche)
References
Details
(Whiteboard: [measurement:client])
Attachments
(1 file, 6 obsolete files)
6.27 KB,
patch
|
Dexter
:
review+
|
Details | Diff | Splinter Review |
In reading through https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Adding_a_new_Telemetry_probe I see that most of the time I want an enumerated histogram. But they're a pain, because they show numbers instead of string names.
From the description, I would much rather be using categorical histograms, except I never will because they disallow new values, and I never have anything where I'm confident enough that the set of possible values will stay completely fixed.
What if categorical histograms were allowed to have some number of placeholder labels? Then you'd allow placeholder labels to change their names to non-placeholder labels, but you'd still disallow non-placeholder names from changing. (I was originally thinking you could just allow some number of empty string categories, and allow empty string -> nonempty string transitions only, but I suppose you probably want them to be unique.)
Maybe make them numeric, and translate "1" -> Telemetry::LABEL_UNUSED_1? To be clear, I'm not suggesting that you have an infinite number of these allowed. I was thinking it'd be the same as setting n_buckets higher for enumerated histograms. It's fine if the user has to list them all out.
Assignee | ||
Comment 1•8 years ago
|
||
Roberto or Frank, is adding more buckets actually a problem right now for categorical histograms specifically?
If yes, do you see a problem pipeline-side if we just set the bucket count for all categorical histograms to a high value (strawman: 100), to allow future extension?
Client-side & storage-wise that would be fine as we store and serialize histograms sparsely.
Flags: needinfo?(rvitillo)
Flags: needinfo?(fbertsch)
Priority: -- → P3
Whiteboard: [measurement:client]
Comment 2•8 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #1)
> Roberto or Frank, is adding more buckets actually a problem right now for
> categorical histograms specifically?
Yes it is, we consider histogram definitions immutable.
> If yes, do you see a problem pipeline-side if we just set the bucket count
> for all categorical histograms to a high value (strawman: 100), to allow
> future extension?
I am not sure that doing this for all categorical histograms is the right thing to do but we certainly can provide the option to do so when a new categorical histogram is defined.
Flags: needinfo?(rvitillo)
Assignee | ||
Comment 3•8 years ago
|
||
(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #2)
> I am not sure that doing this for all categorical histograms is the right
> thing to do but we certainly can provide the option to do so when a new
> categorical histogram is defined.
I would rather not require specifying buckets manually, so always reserving the maximum number of buckets would be a simple approach.
Unless this has pipeline impact.
The other easy option i can think of is to just require renaming the probe each time.
Reporter | ||
Comment 4•8 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #3)
> The other easy option i can think of is to just require renaming the probe
> each time.
I have enumerated histograms that I track across versions, even when I add new buckets to it (we're careful to keep existing values the same). Example: GC_MINOR_REASON.
Hmm... though looking at the dashboard, maybe the consistency isn't helping that much anyway. The "Histogram Dashboard" view is very handy, but only shows one Firefox version at a time. The "Evolution Dashboard" shows multiple versions, but I don't know what it's showing. Perhaps it's the mean numeric enumeration value divided by the number of submissions? At any rate, it appears that the evolution view is meaningless for enumerations. Is that accurate?
But it would be *nice* if I could track these across versions. For example, it could show a stacked bar chart of all the different enum values (or category names, if this bug is resolved somehow) in the evolution dashboard, so that you could see change in relative percentages over time. And it would be nice if, for example, we "split" one category into two subcategories (perhaps abandoning the original category), and we could see both before and after versions in one view.
Though I guess for some measures I wouldn't care about relative percentages, but instead would want a set of bars for each enum value, where the bars in the set correspond to different versions.
But those are for different bugs.
Comment 5•8 years ago
|
||
telemetry.mozilla.org is designed to be a generic view for generic histograms. If you have specialized needs, then specialized dashboards would be more appropriate.
The thing is, without good data regimens like yours, :sfink, we just wouldn't have the data to power any of these different, useful views. So keep up the good work, even if t.m.o isn't designed in a way that shows it off :S
Assignee | ||
Comment 7•8 years ago
|
||
With (In reply to Georg Fritzsche [:gfritzsche] from comment #3)
> (In reply to Roberto Agostino Vitillo (:rvitillo) from comment #2)
> > I am not sure that doing this for all categorical histograms is the right
> > thing to do but we certainly can provide the option to do so when a new
> > categorical histogram is defined.
>
> I would rather not require specifying buckets manually, so always reserving
> the maximum number of buckets would be a simple approach.
> Unless this has pipeline impact.
Roberto or Frank, can you answer on that part?
If there is no pipeline impact, just always reserving e.g. for a maximum of 50 buckets would be cleanest.
Flags: needinfo?(rvitillo)
Comment 8•8 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #7)
> With (In reply to Georg Fritzsche [:gfritzsche] from comment #3)
> > (In reply to Roberto Agostino Vitillo (:rvitillo) from comment #2)
> > > I am not sure that doing this for all categorical histograms is the right
> > > thing to do but we certainly can provide the option to do so when a new
> > > categorical histogram is defined.
> >
> > I would rather not require specifying buckets manually, so always reserving
> > the maximum number of buckets would be a simple approach.
> > Unless this has pipeline impact.
>
> Roberto or Frank, can you answer on that part?
> If there is no pipeline impact, just always reserving e.g. for a maximum of
> 50 buckets would be cleanest.
OK
Flags: needinfo?(rvitillo)
Assignee | ||
Updated•8 years ago
|
Summary: Categorical histograms are not expandable → Categorical histograms should always have 50 buckets
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → gfritzsche
Priority: P3 → P1
Assignee | ||
Comment 9•8 years ago
|
||
We don't have any categorical histograms submitting yet, so lets quickly fix this now.
Assignee | ||
Comment 10•8 years ago
|
||
Attachment #8819885 -
Flags: review?(alessio.placitelli)
Comment 11•8 years ago
|
||
Comment on attachment 8819885 [details] [diff] [review]
Categorical histograms should always have 50 buckets
Review of attachment 8819885 [details] [diff] [review]:
-----------------------------------------------------------------
This looks good. However, should we note somewhere the reason why we're always having 50 buckets for the Categoricals? So we don't forget about it when backtracking stuff 3 months from now.
I'm not sure though where exactly this can be written. Maybe MDN or near CATEGORICAL_BUCKET_COUNT?
Attachment #8819885 -
Flags: review?(alessio.placitelli) → review+
Assignee | ||
Comment 12•8 years ago
|
||
(In reply to Alessio Placitelli [:Dexter] from comment #11)
> This looks good. However, should we note somewhere the reason why we're
> always having 50 buckets for the Categoricals? So we don't forget about it
> when backtracking stuff 3 months from now.
>
> I'm not sure though where exactly this can be written. Maybe MDN or near
> CATEGORICAL_BUCKET_COUNT?
Good point; we should do both.
Assignee | ||
Comment 13•8 years ago
|
||
Assignee | ||
Updated•8 years ago
|
Attachment #8819885 -
Attachment is obsolete: true
Assignee | ||
Updated•8 years ago
|
Attachment #8819890 -
Flags: review+
Assignee | ||
Comment 14•8 years ago
|
||
As i just looked over a bug that could use 48 buckets already, i'm changing the logic to:
* default categorical histograms to 50 buckets (+1 overflow)
* allow overriding this with n_values
Assignee | ||
Comment 15•8 years ago
|
||
Updated to the allow overriding the minimum of 50 buckets.
Attachment #8821506 -
Flags: review?(alessio.placitelli)
Assignee | ||
Updated•8 years ago
|
Attachment #8819890 -
Attachment is obsolete: true
Comment 16•8 years ago
|
||
Comment on attachment 8821506 [details] [diff] [review]
Categorical histograms should default to 50 buckets
Review of attachment 8821506 [details] [diff] [review]:
-----------------------------------------------------------------
This looks good with the comment below addressed. Don't forget to update the MDN page with the new behaviour!
::: toolkit/components/telemetry/tests/unit/test_nsITelemetry.js
@@ +236,5 @@
> +
> + // This histogram overrides the default of 50 values to 70.
> + let h3 = Telemetry.getHistogramById("TELEMETRY_TEST_CATEGORICAL_NVALUES");
> + for (let v of ["CommonLabel", "Label7", "Label8"]) {
> + dump("* adding: " + v + "\n");
Is this a leftover from debugging?
Attachment #8821506 -
Flags: review?(alessio.placitelli) → review+
Assignee | ||
Comment 17•8 years ago
|
||
Assignee | ||
Updated•8 years ago
|
Attachment #8821506 -
Attachment is obsolete: true
Assignee | ||
Updated•8 years ago
|
Attachment #8821513 -
Flags: review+
Assignee | ||
Updated•8 years ago
|
Summary: Categorical histograms should always have 50 buckets → Categorical histograms should default to 50 buckets
Assignee | ||
Updated•8 years ago
|
Keywords: checkin-needed
Comment 19•8 years ago
|
||
Drive-by rebasing so that this can land on Aurora while :gfritzsche is on PTO.
Updated•8 years ago
|
Keywords: checkin-needed
Updated•8 years ago
|
Keywords: checkin-needed
Updated•8 years ago
|
Attachment #8822023 -
Attachment is obsolete: true
Comment 20•8 years ago
|
||
Rebased.
Attachment #8821513 -
Attachment is obsolete: true
Attachment #8822523 -
Flags: review+
Comment 21•8 years ago
|
||
Attachment #8822523 -
Attachment is obsolete: true
Attachment #8822524 -
Flags: review+
Updated•8 years ago
|
Attachment #8822524 -
Attachment is patch: true
Comment 22•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/cd3ac71945c8393db214df43ee2f14be7de15e36
Bug 1312806 - Categorical histograms should default to 50 buckets. r=dexter
Comment 23•8 years ago
|
||
Pushed by alessio.placitelli@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/cd3ac71945c8
Categorical histograms should default to 50 buckets. r=dexter
Comment 24•8 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox53:
--- → fixed
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Assignee | ||
Updated•8 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•