Closed Bug 1172113 Opened 9 years ago Closed 1 month ago

Treat count histograms as bucketed values

Categories

(Webtools Graveyard :: Telemetry Server, defect)

defect
Not set
normal

Tracking

(firefox41 affected)

RESOLVED INCOMPLETE
Tracking Status
firefox41 --- affected

People

(Reporter: azhang, Unassigned, NeedInfo)

Details

Attachments

(2 files, 3 obsolete files)

Currently, count histograms are bucketed as Boolean histograms, which have 3 fixed buckets. For this reason, count histograms are not supported in Telemetry-dashboard, as they show up incorrectly [1].

Count histograms should actually show the full distribution, like linear and exponential histograms. The number of buckets and the range should also be specifiable in Histograms.json [2]

[1]: See the histogram for http://telemetry.mozilla.org/#filter=nightly%2F40%2FSERVICE_WORKER_WAS_SPAWNED%2Fsaved_session%2FFirefox&aggregates=multiselect-all!Submissions&evoOver=Builds&locked=true&sanitize=true&renderhistogram=Graph
[2] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/Histograms.json
This patch allows the number of buckets and the range to be specified in Histograms.json. As none of the 51 count-type measures specify their n_buckets or high values, we default to reasonable values.

Additionally, this patch moves the mozilla-central-specific imports into the functions where they are needed. This is because histogram_tools.py is also used by Telemetry-server [1] in order to process histograms on the backend.

Currently, we simply download the file at version 72940b27aeaa, which may become outdated in the future if new histogram kinds are added. This patch allows the script to use the latest version once more.

[1]: https://github.com/mozilla/telemetry-server/blob/master/bin/get_histogram_tools.sh
Comment on attachment 8616174 [details] [diff] [review]
count-histogram-fixes.patch

Please rebase this off the latest mozilla-central, bug 1168409 included some related fixes.
Attachment #8616174 - Flags: review?(mreid) → review?(vdjeric)
Attached patch count-histogram-fixes.patch (obsolete) — Splinter Review
Rebased off latest mozilla-central.
Attachment #8616174 - Attachment is obsolete: true
Attachment #8616174 - Flags: review?(vdjeric)
Attachment #8616693 - Flags: review?(vdjeric)
Comment on attachment 8616693 [details] [diff] [review]
count-histogram-fixes.patch

Review of attachment 8616693 [details] [diff] [review]:
-----------------------------------------------------------------

Postponing review until we get some stats on what the ranges for existing count histograms look like

File a bug for adding a new, required count histogram field: expected_max
We can infer num_buckets from expected_max
Attachment #8616693 - Flags: review?(vdjeric)
Attached patch count-histogram-fixes.patch (obsolete) — Splinter Review
rvitillo did an analysis [1] to figure out what the data for the count histograms looks like at the moment. From this we determine that it would be better to use exponential buckets with a high default bound, sort of like how we're doing simple measures.

This is done as a change to histogram_tools rather than as a special case in the analysis job (which is how simple measures is done) because:

1. Adding a special case in the analysis job would require a dependency on histogram_tools.py.
2. Adding it in histogram_tools makes it easy for histogram authors to override the ranges for special use cases.
3. analysis.py line 29-70: "############################## Ugly hacks for simple measures" ... "############################## End of ugly hacks for simple measures"

The default count histogram buckets with this patch are as follows:

    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 21, 23, 25, 27, 29, 31, 34, 37, 40, 43, 46, 50, 54, 58, 63, 68, 74, 80, 86, 93, 101, 109, 118, 128, 138, 149, 161, 174, 188, 203, 219, 237, 256, 277, 299, 323, 349, 377, 408, 441, 477, 516, 558, 603, 652, 705, 762, 824, 891, 963, 1041, 1125, 1216, 1315, 1422, 1537, 1662, 1797, 1943, 2101, 2271, 2455, 2654, 2869, 3102, 3354, 3626, 3920, 4238, 4582, 4954, 5356, 5791, 6261, 6769, 7318, 7912, 8554, 9249, 10000]

This patch additionally adds support for 0 as a lower bound in exponential buckets. Using 0 as a lower bound is now exactly the same as using 1 in terms of output (previously, this would just raise an error), but tells readers that the value can and is expected to be 0 sometimes.

The new expected_max property of histograms still needs to be documented.

[1]: http://nbviewer.ipython.org/gist/vitillo/2fec41015a79fc776096
Attachment #8616693 - Attachment is obsolete: true
Attachment #8621636 - Flags: review?(vdjeric)
Comment on attachment 8621636 [details] [diff] [review]
count-histogram-fixes.patch

Review of attachment 8621636 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/histogram_tools.py
@@ +200,4 @@
>          table = {
>              'boolean': always_allowed_keys,
>              'flag': always_allowed_keys,
> +            'count': always_allowed_keys + ['expected_max'],

I thought we're not doing expected_max anymore for count histograms?
Attachment #8621636 - Flags: review?(vdjeric)
New version of attachment 8621636 [details] [diff] [review] that uses the same, fixed buckets for all count histograms.
Attachment #8621636 - Attachment is obsolete: true
Attachment #8622504 - Flags: review?(vdjeric)
Comment on attachment 8622504 [details] [diff] [review]
count-histogram-fixes.patch

Review of attachment 8622504 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/histogram_tools.py
@@ +250,5 @@
>          return (1, 2, 3)
>  
>      @staticmethod
> +    def count_bucket_parameters(definition):
> +        return (1, 10000, 100)

100 buckets seems a bit high, how does it look in the dash?
Attachment #8622504 - Flags: review?(vdjeric) → review+
Attached image dash-screenshot
Sample screenshot of 100-bucket histograms in the Telemetry Dashboard.
Flags: needinfo?(azhang)
Mentor: vladan.bugzilla
Product: Webtools → Webtools Graveyard
Status: NEW → RESOLVED
Closed: 1 month ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: