Bug 1403994
Opened 7 years ago
Closed 6 years ago
Service does not correctly aggregate categorical histograms with different label counts
Categories
(Data Platform and Tools :: General, enhancement, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: frank, Assigned: frank)
We allow adding new labels to categorical histograms, but the service does not account for these new labels when querying. The issue is this function [0], which sums the buckets.

The problem is that a stored histogram is not just a list of buckets. It also carries two trailing values:

    old_histogram = [bucket_1, bucket_2, ..., bucket_n, sum, ping_count]

So when we add a new label, we get:

    new_histogram = [bucket_1, ..., bucket_n, bucket_n+1, sum, ping_count]

And when we aggregate the two histograms, we get:

    [bucket_1 + bucket_1, bucket_2 + bucket_2, ..., bucket_n + bucket_n, sum + bucket_n+1, ping_count + sum, 0 + ping_count]

In each pair, the first value comes from old_histogram and the second from new_histogram. Because sums are so large, bucket_n+1 (the spill bucket) looks HUGE.

To fix this, the histogram_aggregator needs to add the label buckets and the last two values (sum and ping_count) separately.

[0] https://github.com/mozilla/python_mozaggregator/blob/master/mozaggregator/sql.py#L9
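The misalignment and the proposed fix can be illustrated with a small Python sketch. This is not the service's actual code (the linked function lives in sql.py and is not reproduced here); the function names `naive_aggregate` and `aggregate_histograms`, and the sample numbers, are hypothetical. It only assumes the layout described above: label buckets followed by sum and ping_count.

```python
from typing import List, Sequence

# Assumption from the description: every stored histogram ends with
# two extra values, [sum, ping_count].
TRAILER_LEN = 2


def naive_aggregate(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Position-by-position sum, padding the shorter list at the end.

    Hypothetical reproduction of the misaligned behavior described
    in this bug, not the real service code.
    """
    n = max(len(a), len(b))
    padded_a = list(a) + [0] * (n - len(a))
    padded_b = list(b) + [0] * (n - len(b))
    return [x + y for x, y in zip(padded_a, padded_b)]


def aggregate_histograms(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Sum label buckets and the [sum, ping_count] trailer separately."""
    if len(a) < TRAILER_LEN or len(b) < TRAILER_LEN:
        raise ValueError("histogram must end with [sum, ping_count]")

    buckets_a, trailer_a = list(a[:-TRAILER_LEN]), list(a[-TRAILER_LEN:])
    buckets_b, trailer_b = list(b[:-TRAILER_LEN]), list(b[-TRAILER_LEN:])

    # Pad only the bucket portion, so new labels line up with zeros
    # rather than with the other histogram's sum.
    n = max(len(buckets_a), len(buckets_b))
    buckets_a += [0] * (n - len(buckets_a))
    buckets_b += [0] * (n - len(buckets_b))

    merged = [x + y for x, y in zip(buckets_a, buckets_b)]
    trailer = [x + y for x, y in zip(trailer_a, trailer_b)]
    return merged + trailer


# Made-up sample data: three labels in the old histogram, four in the new.
old_histogram = [1, 2, 3, 600, 10]
new_histogram = [4, 5, 6, 7, 900, 20]

broken = naive_aggregate(old_histogram, new_histogram)
fixed = aggregate_histograms(old_histogram, new_histogram)
print(broken)
print(fixed)
```

With the sample data, the naive version adds the old sum (600) to the new fourth bucket (7), which is the inflated bucket described above. The split version keeps sum and ping_count aligned regardless of how many labels each side has.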
Updated•7 years ago
Points: --- → 2
Priority: -- → P3
Comment 2•6 years ago
This fix has been deployed. I want to verify that [0] changes before I mark this as resolved.

[0] https://mzl.la/2qNJKZD
Points: 2 → 3
Priority: P3 → P1
Updated•6 years ago
Assignee: nobody → fbertsch
Comment 3•6 years ago
I have good news and bad news.

Good news: Moving forward, we can add new categories and things will work _just_fine_. In fact, we can extend this to adding new buckets to any probe, so long as the old buckets stay the same. Historical submission_date-based aggregates will also show correctly.

Bad news: I cannot fix historical build_id aggregates. The data in the db is permanently broken for those. This should only affect PREFERENCES_OPENED_VIA before 20170628.
Updated•6 years ago
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Updated•2 years ago
Component: Telemetry Aggregation Service → General