update.status telemetry ping is not reporting correctly

VERIFIED FIXED in Firefox 12

Status

()

Toolkit
Application Update
VERIFIED FIXED
5 years ago
5 years ago

People

(Reporter: bbondy, Assigned: bbondy)

Tracking

12 Branch
mozilla13
x86_64
Windows 7
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox11 unaffected, firefox12 fixed, firefox13 fixed)

Details

(Whiteboard: [qa-])

Attachments

(3 attachments, 3 obsolete attachments)

(Assignee)

Description

5 years ago
Update.status telemetry pings are not being reported correctly currently.

It looks as though there are a lot of memory errors (old error code 1).
But really it is just because all of the errors happening are in bucket #1. 

The current code is: 

> HISTOGRAM(UPDATE_STATUS, 1, 16006, 27, LINEAR, 
> "Updater: the status of the latest update performed")


This means that there are 27 buckets from 1 to 16006 and that there are 592 status codes per bucket. I.e. the bucketizing of the accumulated telemetry values should be distributed linearly across the buckets.

Instead we should have 1 bucket per error code so they show up correctly.  Since there are a lot of unused space in error codes though I'll fix that in a different bug and land that first.
(Assignee)

Updated

5 years ago
Depends on: 711054
(Assignee)

Updated

5 years ago
Blocks: 715942
(Assignee)

Comment 1

5 years ago
Created attachment 601863 [details] [diff] [review]
Patch v1.
Attachment #601863 - Flags: review?(ehsan)
(Assignee)

Comment 2

5 years ago
By the way I know the error codes only go up to 30, but I put 31 because any larger values automatically go into the largest bucket.  So that way we won't accidentally attribute fake manually set error codes as error 30.  Ditto regarding anyone using an older build with the older status codes.
Comment on attachment 601863 [details] [diff] [review]
Patch v1.

Review of attachment 601863 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetryHistograms.h
@@ +318,5 @@
>  
>  /**
>   * Updater telemetry.
>   */
> +HISTOGRAM(UPDATE_STATUS, 1, 31, 31, LINEAR, "Updater: the status of the latest update performed")

Please rename the histogram so that we don't wind up confusing old bucketed data with new.  (The server side will probably be much happier, too.)
(Assignee)

Comment 4

5 years ago
Created attachment 601926 [details] [diff] [review]
Patch v2.

My saving grace was that error code 1 and 0 were the only ones being reported before and error code 1 should no longer be reported. But I agree that a new status code is best.
Attachment #601863 - Attachment is obsolete: true
Attachment #601926 - Flags: review?(ehsan)
Attachment #601863 - Flags: review?(ehsan)
(Assignee)

Comment 5

5 years ago
> But I agree that a new status code is best.

err I meant new name.
Thanks.  Do you think it's worthwhile to make the # of buckets bigger (~40-50) just to ensure we don't have to change it straightaway after we add a new error code or two?
(Assignee)

Comment 7

5 years ago
Created attachment 601993 [details] [diff] [review]
Patch v3.

> Do you think it's worthwhile to make the # of buckets bigger (~40-50) 

Ya might as well
Attachment #601926 - Attachment is obsolete: true
Attachment #601993 - Flags: review?(ehsan)
Attachment #601926 - Flags: review?(ehsan)
Comment on attachment 601993 [details] [diff] [review]
Patch v3.

Review of attachment 601993 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetryHistograms.h
@@ +318,5 @@
>  
>  /**
>   * Updater telemetry.
>   */
> +HISTOGRAM(UPDATER_STATUS_CODES, 1, 50, 50, LINEAR, "Updater: the status of the latest update performed")

Last comment, promise. :)

If you want evenly spaced buckets-of-1, you need to request 51 buckets, as there's an implicit 0 bucket tacked on to the bottom.  See, for instance, the A11Y_CONSUMERS histograms.

Otherwise you'll end up with a bucket somewhere (depending on the math, usually at the high end) that holds two values.  Doesn't make a difference now, but it would at some unspecified future point.
(Assignee)

Comment 9

5 years ago
Created attachment 601996 [details] [diff] [review]
Patch v4.

I thought the implicitly added 0 bucket would be implicitly counted as well.  But thanks for catching this.  Adjusted.
Attachment #601993 - Attachment is obsolete: true
Attachment #601996 - Flags: review?(ehsan)
Attachment #601993 - Flags: review?(ehsan)
Attachment #601996 - Flags: review?(ehsan) → review+
(Assignee)

Comment 10

5 years ago
http://hg.mozilla.org/integration/mozilla-inbound/rev/980483ace374
Target Milestone: --- → mozilla13
https://hg.mozilla.org/mozilla-central/rev/980483ace374
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
(Assignee)

Comment 12

5 years ago
Re-opening since this is not working correctly.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 13

5 years ago
Created attachment 604408 [details] [diff] [review]
Patch v1. Update JS telemetry ping name.

ugh... forgot to update the telemetry name in js after the name change in Comment 3 :(.
Attachment #604408 - Flags: review?(ehsan)
Attachment #604408 - Flags: review?(ehsan) → review+
(Assignee)

Comment 14

5 years ago
http://hg.mozilla.org/integration/mozilla-inbound/rev/e44816f041f5
(Assignee)

Comment 15

5 years ago
Created attachment 604468 [details] [diff] [review]
Aurora patch

[Approval Request Comment]
Regression caused by (bug #): 708123
User impact if declined: We will not have visibility into errors people are getting with updater.
Testing completed (on m-c, etc.): The first patch already landed on m-c for a couple days but silently failed due to using the wrong name when reporting the telemetry which is fixed with the second patch.  Both patches are aggregated in this Aurora patch.
Risk to taking this patch (and alternatives if risky): Extremely low
String changes made by this patch: None
Attachment #604468 - Flags: review+
Attachment #604468 - Flags: approval-mozilla-aurora?
Comment on attachment 604468 [details] [diff] [review]
Aurora patch

[Triage Comment]
Improving telemetry data with low risk fixes = a+.
Attachment #604468 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
https://hg.mozilla.org/mozilla-central/rev/e44816f041f5
Status: REOPENED → RESOLVED
Last Resolved: 5 years ago5 years ago
Resolution: --- → FIXED
(Assignee)

Comment 18

5 years ago
Uplifted to mozilla12 aurora
http://hg.mozilla.org/releases/mozilla-aurora/rev/87bde6b152a7
(Assignee)

Updated

5 years ago
status-firefox11: --- → unaffected
status-firefox12: --- → fixed
status-firefox13: --- → fixed
Is this something QA can help verify?
Whiteboard: [qa?]
(Assignee)

Comment 20

5 years ago
I verified that this is working on the telemetry dashboard.
Status: RESOLVED → VERIFIED
Thanks Brian
Whiteboard: [qa?] → [qa-]
You need to log in before you can comment on or make changes to this bug.