Closed Bug 763359 Opened 12 years ago Closed 12 years ago

Telemetry does not always put samples into the correct bucket

Categories

(Toolkit :: Telemetry, defect)

defect
Not set
major

Tracking

()

RESOLVED FIXED
Tracking Status
firefox13 - ---
firefox14 - ---
firefox15 - ---
firefox16 - ---
fennec - ---

People

(Reporter: briansmith, Assigned: froydnj)

References

Details

Please see my investigation in bug 763351.

I have found that if you have a linear histogram with range (1, 5) and five buckets, the value "3" will always get put into bucket 2, instead of bucket "3" like you would expect. That is, the values "2" and "3" will get put into the same bucket.

In the case of bug 763351, we were using the buckets to act like enums; value "3" means something completely different than value "2" (in fact, in that probe, the values have almost completely opposite meanings).

In general, if there is a telemetry histogram with range (1, X) with X buckets, we expect each distinct value to be reported in its own bucket.

I have seen other telemetry probes using the same kind of pattern so there are likely other probes being affected by this.

Based on the investigation of bug 763351, it seems like this has been an issue since at least October, if not since the beginning of Telemetry data collection.

I am nominating this to be tracked on all channels because it needs to be fixed ASAP in order for us to meaningfully compare some significant telemetry data across releases.
Please see HISTOGRAM_ENUMERATED_VALUES in TelemetryHistograms.h, introduced in bug 732053.  It would have been a good idea to file bugs on all the enum-values LINEAR_HISTOGRAMS that couldn't be converted when HISTOGRAM_ENUMERATED_VALUES was introduced, but I didn't do that. :(
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → INVALID
(In reply to Nathan Froyd (:froydnj) from comment #1)
> Please see HISTOGRAM_ENUMERATED_VALUES in TelemetryHistograms.h, introduced
> in bug 732053.  It would have been a good idea to file bugs on all the
> enum-values LINEAR_HISTOGRAMS that couldn't be converted when
> HISTOGRAM_ENUMERATED_VALUES was introduced, but I didn't do that. :(

Are you planning on filing these bugs now? Wouldn't this be a good meta bug for tracking?

Is there any reason to believe this bug isn't as concerning as bsmith suggests?
(In reply to Alex Keybl [:akeybl] from comment #2)
> Are you planning on filing these bugs now? Wouldn't this be a good meta bug
> for tracking?

I filed bugs for the Javascript case (
> 
> Is there any reason to believe this bug isn't as concerning as bsmith
> suggests?

I filed bug 764184 and bug 764190 in addition to bug 763351. For the other probes, it is OK for there to be bucket collisions.

Because multiple teams got tripped up by this, I highly recommend that we change the API so that one **cannot** create a linear histogram from (1-n) with n buckets, by adding a MOZ_STATIC_ASSERT for that condition. For the case where this is a false positive, the person who defines the telemetry can just change the bucket count to (n+1). For the case where this check would prevent a problem, it will avoid collecting bad data.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
Depends on: 764585
(In reply to Brian Smith (:bsmith) from comment #3)
> Because multiple teams got tripped up by this, I highly recommend that we
> change the API so that one **cannot** create a linear histogram from (1-n)
> with n buckets, by adding a MOZ_STATIC_ASSERT for that condition. For the
> case where this is a false positive, the person who defines the telemetry
> can just change the bucket count to (n+1). For the case where this check
> would prevent a problem, it will avoid collecting bad data.

This seems reasonable; I filed bug 764585 for this.  I suppose fixing that bug requires modifying all the other histograms anyway...
likely we may just fix the number of buckets here, the thing we care most is that this measure stays low (under 5), so the actual value has less importance. We will just get more precice infos once the fix gets out in the wild.
oops, comment 5 was intended for bug 764190, sorry.
Seems like this is part of work Nathan will be doing so assigning to him. Feel free to move to another assignee (that isn't nobody) if it makes more sense. Just trying to make sure our tracking 14+ bugs have owners at this point in the cycle.
Assignee: nobody → nfroyd
It seems to me that a metabug shouldn't necessarily be tracked for 14; the individual blocking bugs should be, and then those bugs can be tracked with a release based on the importance of information gained from them.

Going to shuffle the dependencies to make them clearer.
No longer blocks: 763351, 764190, 764260
Depends on: 763351, 764190, 764260
tracking-fennec: ? → -
Depends on: 781200
Fixed all the wrongly-specified histograms and put stricter checks in place.  Closing.
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.