Remove uninteresting Sync probes

RESOLVED DUPLICATE of bug 1236383

Status

Firefox
Sync
P2
normal
2 years ago
2 years ago

People

(Reporter: kitcambridge, Assigned: kitcambridge)

Tracking

unspecified
Points:
---
Bug Flags:
firefox-backlog +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

* FXA_UNVERIFIED_ACCOUNT_ERRORS, FXA_SERVER_ERRORS, and WEAVE_HMAC_ERRORS had no submissions for 43-45.

* WEAVE_ENGINE_APPLY_NEW_FAILURES only recorded failures for add-ons in 43-45. That's interesting, but not particularly useful. Maybe we should add a counter keyed on the add-on name as a follow-up?

* WEAVE_ENGINE_SYNC_ERRORS isn't very interesting. History had the highest rate of submissions with multiple errors, followed by bookmarks. Most submissions showed one failure for history, add-ons, bookmarks, and tabs, but nothing really stands out. The percentages are close for each engine type.
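
To illustrate the follow-up idea of a counter keyed on the add-on name, here is a minimal Python sketch. It is not the Telemetry API: the KeyedCounter class, the probe name WEAVE_ADDON_APPLY_FAILURES, and the add-on names are all hypothetical, and the sketch only shows how per-key counts would separate failures that a single unkeyed count lumps together.

```python
from collections import Counter


class KeyedCounter:
    """Hypothetical stand-in for a keyed count probe (not the real Telemetry API)."""

    def __init__(self, name):
        self.name = name
        self._counts = Counter()

    def add(self, key, amount=1):
        # Accumulate a separate count for each key (here, an add-on name).
        self._counts[key] += amount

    def snapshot(self):
        # Return a plain dict copy of the per-key counts.
        return dict(self._counts)


# Hypothetical probe name and add-on names, for illustration only.
apply_failures = KeyedCounter("WEAVE_ADDON_APPLY_FAILURES")
for addon_name in ["Example Blocker", "Example Theme", "Example Blocker"]:
    apply_failures.add(addon_name)

print(apply_failures.snapshot())
```

With counts broken out this way, a dashboard could show which add-ons fail to apply most often, rather than only that some add-on failed.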
Created attachment 8698090 [details]
MozReview Request: Bug 1232356 - Remove uninteresting Sync probes; bump versions for the others. r?bsmedberg

Bug 1232356 - Remove uninteresting Sync probes; bump versions for the others. r?bsmedberg
Attachment #8698090 - Flags: review?(benjamin)

Comment 2

2 years ago
I'm going to mark data-feedback+ on this, but I am not a reviewer for this in general. From a more general quality perspective, I encourage teams to keep permanent telemetry on known error cases and have dashboards to monitor and alert on those on a quick or even realtime basis. So it feels funny to be removing some probes just because the current error rate is small.

Comment 3

2 years ago
Comment on attachment 8698090 [details]
MozReview Request: Bug 1232356 - Remove uninteresting Sync probes; bump versions for the others. r?bsmedberg

https://reviewboard.mozilla.org/r/27791/#review24991

::: toolkit/components/telemetry/Histograms.json:9836
(Diff revision 1)
> -    "expires_in_version": "46",
> +    "expires_in_version": "50",

A bunch of these are bumping version without much explanation. I can think of two reasons to keep this data:

A. We've found that the data is correct and helpful, and we have production monitoring of this data. If this is the case, we should make it expires_in_version: never.
B. We still don't have confidence in the data, or haven't produced a dashboard to monitor it effectively, but we still think it's going to be valuable, and we just want an extension to finish testing/reporting.

Which of these seems more true, or is there something else going on?
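
One way to audit this is to sort the probe definitions by their expires_in_version field. The Python sketch below is an illustration only: the probe names and sample values are hypothetical, and it assumes a probe counts as expired once the current major version reaches its expires_in_version.

```python
def major_version(value):
    """Return the major version as an int, or None for "never"."""
    if value == "never":
        return None
    return int(value.split(".")[0])


def classify(histograms, current_version):
    """Group probe names into permanent, active, and expired."""
    result = {"permanent": [], "active": [], "expired": []}
    for name, definition in sorted(histograms.items()):
        expiry = major_version(definition["expires_in_version"])
        if expiry is None:
            result["permanent"].append(name)
        elif expiry <= current_version:
            # Assumption: a probe is expired once the current version reaches expiry.
            result["expired"].append(name)
        else:
            result["active"].append(name)
    return result


# Hypothetical definitions in the shape of Histograms.json entries.
sample = {
    "EXAMPLE_PROBE_A": {"expires_in_version": "46", "kind": "count"},
    "EXAMPLE_PROBE_B": {"expires_in_version": "50", "kind": "boolean"},
    "EXAMPLE_PROBE_C": {"expires_in_version": "never", "kind": "count"},
}

print(classify(sample, current_version=46))
```

Probes that land in "active" only because their version was bumped are the ones that need a decision between case A and case B.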

::: toolkit/components/telemetry/Histograms.json:9839
(Diff revision 1)
>      "description": "If the user is signed in to a Firefox Account on this device"

While you're here, can you document *when* this histogram is recorded? The current doc is pretty unclear about whether this happens once at startup, or can happen other times during the run.
Attachment #8698090 - Flags: review?(benjamin)

Updated

2 years ago
Flags: needinfo?(kcambridge)
Sorry, I should've left some context. Most of these probes were added because we weren't sure what was causing Sync authentication errors, with the intent to revisit once we had some initial data.

To that end, some of these don't seem actionable (WEAVE_HMAC_ERRORS, for example), and some are better monitored by the server (FXA_UNVERIFIED_ACCOUNT_ERRORS, FXA_SERVER_ERRORS, TOKENSERVER_AUTH_ERRORS).

There may be value in keeping WEAVE_ENGINE_APPLY_NEW_FAILURES and WEAVE_ENGINE_SYNC_ERRORS, but, if I'm reading the dashboards correctly, I don't think they tell us much beyond "most samples have one bad record."

For the others, I bumped the version because I think we need more information to decide whether the probes are valuable.
Flags: needinfo?(kcambridge)
Flags: firefox-backlog+
Priority: -- → P2
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1236383