Closed Bug 1403994 Opened 7 years ago Closed 6 years ago

Service does not correctly aggregate categorical histograms with different label counts

Categories

(Data Platform and Tools :: General, enhancement, P1)

Points: 3

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: frank, Assigned: frank)

We allow adding new labels to categorical histograms, but the service does not account for these new labels when querying. The issue is this function[0], which sums the arrays position by position. The problem is that a stored histogram isn't just the buckets; it is really this:

old_histogram = [bucket_1, bucket_2, ..., bucket_n, sum, ping_count]

So when we add a new label, we get:

new_histogram = [bucket_1, ..., bucket_n, bucket_n+1, sum, ping_count]

So when we aggregate histograms, we get:
[bucket_1 + bucket_1, bucket_2 + bucket_2, ..., bucket_n + bucket_n, sum + bucket_n+1, ping_count + sum, 0 + ping_count]

In each pair, the first value comes from old_histogram and the second from new_histogram. Because sums are so large, it looks like bucket_n+1, which is the spill bucket, is HUGE.

To fix this, the histogram_aggregator needs to add the label buckets and the last two entries (sum and ping_count) separately.
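A minimal Python sketch of the idea, for illustration only. It is not the code at [0]; the function names `naive_aggregate` and `aggregate_histograms` are hypothetical, and it assumes the layout described above, where the last two entries are always sum and ping_count:

```python
from typing import List


def naive_aggregate(a: List[int], b: List[int]) -> List[int]:
    """Position-by-position sum, padding the shorter array with zeros.

    Reproduces the misalignment described above (illustrative only).
    """
    width = max(len(a), len(b))
    a = a + [0] * (width - len(a))
    b = b + [0] * (width - len(b))
    return [x + y for x, y in zip(a, b)]


def aggregate_histograms(a: List[int], b: List[int]) -> List[int]:
    """Sum two arrays laid out as [bucket_1, ..., bucket_k, sum, ping_count].

    Label buckets are added bucket by bucket (missing buckets count as 0),
    and the trailing sum and ping_count are added separately.
    """
    if len(a) < 2 or len(b) < 2:
        raise ValueError("expected at least [sum, ping_count]")
    buckets_a, tail_a = a[:-2], a[-2:]
    buckets_b, tail_b = b[:-2], b[-2:]
    # Pad the shorter bucket list so old histograms line up with new ones.
    width = max(len(buckets_a), len(buckets_b))
    buckets_a = buckets_a + [0] * (width - len(buckets_a))
    buckets_b = buckets_b + [0] * (width - len(buckets_b))
    buckets = [x + y for x, y in zip(buckets_a, buckets_b)]
    tail = [x + y for x, y in zip(tail_a, tail_b)]
    return buckets + tail


# Made-up values: old has 2 buckets, new has 3 buckets.
old_histogram = [5, 3, 100, 8]      # [bucket_1, bucket_2, sum, ping_count]
new_histogram = [1, 2, 4, 70, 7]    # [bucket_1, bucket_2, bucket_3, sum, ping_count]

print(naive_aggregate(old_histogram, new_histogram))       # [6, 5, 104, 78, 7]
print(aggregate_histograms(old_histogram, new_histogram))  # [6, 5, 4, 170, 15]
```

In the naive result the old sum (100) lands in the new bucket's position, which is why that bucket looks inflated.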

[0] https://github.com/mozilla/python_mozaggregator/blob/master/mozaggregator/sql.py#L9
Points: --- → 2
Priority: -- → P3
This fix has been deployed. I want to verify that [0] changes before I mark this as resolved.

[0] https://mzl.la/2qNJKZD
Points: 2 → 3
Priority: P3 → P1
Assignee: nobody → fbertsch
I have good news and bad news.

Good news: Moving forward, we can add new categories and things will work _just_fine_. In fact, we can extend this to adding new buckets to any probe, so long as the old buckets stay the same. Historical submission_date-based aggregates will also show correctly.

Bad news: I cannot fix historical build_id aggregates - the data in the db is permanently broken for those. This should only affect PREFERENCES_OPENED_VIA before 20170628.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Component: Telemetry Aggregation Service → General