Closed
Bug 1393731
Opened 7 years ago
Closed 7 years ago
Investigate high ratio of health ping clients (comparing to DAU)
Categories
(Toolkit :: Telemetry, enhancement, P2)
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
firefox57 | --- | fix-optional |
People
(Reporter: katejimmelon, Unassigned)
Attachments (1 file): 170.14 KB, image/png
No description provided.
Reporter
Comment 1•7 years ago
1) Find the correlation between health ping reason and failures.
2) Find the correlation between OS and failures.
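A rough sketch of how both correlations could be tabulated, assuming each health ping has been flattened into a record with `os`, `reason`, and a `sendFailure` map of failure type to count. The record layout, the sample values, and the helper name `failures_by` are illustrative assumptions, not the actual analysis code.

```python
from collections import Counter

# Hypothetical, simplified health ping records (made-up values).
SAMPLE_PINGS = [
    {"os": "Windows_NT", "reason": "shutdown", "sendFailure": {"eChannelOpen": 2}},
    {"os": "Windows_NT", "reason": "immediate", "sendFailure": {"timeout": 1}},
    {"os": "Darwin", "reason": "shutdown", "sendFailure": {"eChannelOpen": 1}},
    {"os": "Linux", "reason": "delayed", "sendFailure": {"abort": 1, "timeout": 2}},
]


def failures_by(pings, field):
    """Sum failure counts per (field value, failure type) pair."""
    totals = Counter()
    for ping in pings:
        for failure_type, count in ping.get("sendFailure", {}).items():
            totals[(ping[field], failure_type)] += count
    return totals


by_reason = failures_by(SAMPLE_PINGS, "reason")  # step 1: reason vs. failures
by_os = failures_by(SAMPLE_PINGS, "os")          # step 2: OS vs. failures

for (reason, failure_type), count in sorted(by_reason.items()):
    print(reason, failure_type, count)
```

Raw counts like these only show where failures occur; they say nothing about how large each group is, so they should not be read as rates.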
Reporter
Updated•7 years ago
Comment 2•7 years ago
Chris, besides the steps in comment 1, what other explorations/analysis steps do you think are valuable here?
Flags: needinfo?(chutten)
Comment 3•7 years ago
- Geography might be an interesting factor. See whether users in some countries are prone to certain errors.
- Build would be interesting. Some builds might be buggier.
- There might be something in userPrefs that makes users more prone to these sorts of errors.
- The number of pending pings might contribute to the count.

If our current hypothesis is correct (that transient network errors are the bulk of things), I would expect the distribution to be fairly level. There will be a lot of US users experiencing problems... but only because we have a lot of users in the US. Ditto Windows, release, and anything else that's common.

What would be cool is if our hypothesis were incorrect (or at least incomplete) and there's something causal hidden in the data. Uncovering it, if it exists, might be difficult.
Flags: needinfo?(chutten)
Reporter
Comment 4•7 years ago
These are the latest results (Beta): https://gist.github.com/katejim/c7ca9befa55992435741910f0deb3e4b
Basically, most of the pings come from Windows and are connected to eChannelOpen failures on shutdown.
Comment 5•7 years ago
We probably need to normalize this to the different OS user populations before drawing conclusions here.
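One way to do that normalization, as a minimal sketch: divide the number of clients reporting failures on each OS by the number of active clients on that OS. The function name `failure_rate_per_os` and all numbers below are made up for illustration and are not real telemetry.

```python
def failure_rate_per_os(failing_clients, active_clients):
    """Return failing clients as a fraction of active clients for each OS."""
    return {
        os_name: failing_clients.get(os_name, 0) / total
        for os_name, total in active_clients.items()
        if total > 0  # skip populations we cannot divide by
    }


# Made-up counts: Windows dominates the raw numbers simply because
# it has the largest population.
failing = {"Windows_NT": 900, "Darwin": 60, "Linux": 40}
active = {"Windows_NT": 9000, "Darwin": 600, "Linux": 200}

rates = failure_rate_per_os(failing, active)
for os_name, rate in sorted(rates.items()):
    print(f"{os_name}: {rate:.1%}")
```

In this invented example Windows has by far the most failing clients, yet its rate matches Darwin's and is lower than Linux's, which is the kind of effect the normalization is meant to expose.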
Comment 6•7 years ago
Looking locally at my about:telemetry page, I see a common pattern:
- one health ping with reason "immediate"
- followed after <1 sec by a "shutdown" health ping

We should be able to confirm whether this is common with a longitudinal, per-client analysis. If so, that would (again) suggest that we are trying to send pings after the Firefox network stack has already shut down.
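A per-client check for this pattern could look roughly like the sketch below. It assumes each ping record carries a `client_id`, a `reason`, and a `timestamp` in seconds; those field names, the helper `count_immediate_then_shutdown`, and the sample data are hypothetical.

```python
from collections import defaultdict


def count_immediate_then_shutdown(pings, max_gap_seconds=1.0):
    """Return (clients showing the pattern, total clients).

    A client shows the pattern if an "immediate" health ping is directly
    followed by a "shutdown" health ping less than max_gap_seconds later.
    """
    by_client = defaultdict(list)
    for ping in pings:
        by_client[ping["client_id"]].append(ping)

    matching = set()
    for client_id, client_pings in by_client.items():
        client_pings.sort(key=lambda p: p["timestamp"])
        # Compare each ping with the one that directly follows it.
        for earlier, later in zip(client_pings, client_pings[1:]):
            gap = later["timestamp"] - earlier["timestamp"]
            if (
                earlier["reason"] == "immediate"
                and later["reason"] == "shutdown"
                and gap < max_gap_seconds
            ):
                matching.add(client_id)
                break
    return len(matching), len(by_client)


# Made-up sample: only client "a" shows the pattern.
sample = [
    {"client_id": "a", "reason": "immediate", "timestamp": 100.0},
    {"client_id": "a", "reason": "shutdown", "timestamp": 100.4},
    {"client_id": "b", "reason": "immediate", "timestamp": 50.0},
    {"client_id": "b", "reason": "shutdown", "timestamp": 80.0},
    {"client_id": "c", "reason": "delayed", "timestamp": 10.0},
]
matched, total = count_immediate_then_shutdown(sample)
print(f"{matched} of {total} clients show the pattern")
```

If a large share of clients match, that would be consistent with pings being sent after the network stack has shut down, though it would not prove it.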
Assignee: katejimmelon → nobody
Comment 7•7 years ago
The minimum next step here would be to confirm what comment 6 suggests. If that is true and most/many pings follow this pattern, we can probably close this bug.
Updated•7 years ago
status-firefox57: --- → fix-optional
Updated•7 years ago
Priority: P1 → P2
Comment 8•7 years ago
Well, it's not causal, but after I pushed bug 1397293, this happened. I'd say cutting the volume of "health" pings roughly in half is consistent with the theory presented in comment 6.
Updated•7 years ago
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED