Closed Bug 1393731 Opened 7 years ago Closed 7 years ago

Investigate high ratio of health ping clients (comparing to DAU)

Status:

RESOLVED FIXED

Tracking Flags:

Tracking

Status

firefox57

---

fix-optional

People

(Reporter: katejimmelon, Unassigned)

Attachments

(1 file)

Screenshot-2017-10-3 https pipeline-cep prod mozaws net.png (image/png, 170.14 KB), attached 7 years ago by Chris H-C :chutten

Kate Ustiuzhanina

Reporter

Description

•

7 years ago

      No description provided.

Kate Ustiuzhanina

Reporter

Comment 1

•

7 years ago

1) Find the correlation between health ping reason and failures.
2) Find the correlation between OS and failures.
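
A first step toward both items could be a simple cross-tabulation of failure counts, sketched below. The record layout (a dict with "reason", "os", and a "failures" mapping), the function name, and the sample values are assumptions made for illustration, not the actual health ping schema or real data. A tally like this only shows where failures concentrate; it is not a correlation statistic by itself.

```python
from collections import Counter


def failures_by_dimension(pings, dimension):
    """Sum failure counts per (dimension value, failure type).

    Each ping is a dict holding the grouping field (e.g. "reason" or
    "os") and a "failures" dict mapping failure type to a count.
    This layout is assumed for the sketch.
    """
    table = Counter()
    for ping in pings:
        for failure_type, count in ping["failures"].items():
            table[(ping[dimension], failure_type)] += count
    return table


# Hypothetical records, for illustration only.
pings = [
    {"reason": "shutdown", "os": "Windows", "failures": {"eChannelOpen": 3}},
    {"reason": "immediate", "os": "Windows", "failures": {"eChannelOpen": 1}},
    {"reason": "shutdown", "os": "Linux", "failures": {"timeout": 2}},
]

by_reason = failures_by_dimension(pings, "reason")
by_os = failures_by_dimension(pings, "os")

for key, count in by_reason.most_common():
    print(key, count)
```

The same helper covers both steps by switching the `dimension` argument between "reason" and "os".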

Assignee: nobody → kustiuzhanina

Depends on: 1391242, 388310

Priority: -- → P1

Kate Ustiuzhanina

Reporter

Updated

•

7 years ago

Depends on: 1388310
No longer depends on: 388310

Kate Ustiuzhanina

Reporter

Updated

•

7 years ago

Blocks: 1372228

Georg Fritzsche [:gfritzsche]

Comment 2

•

7 years ago

Chris, besides the steps in comment 1, what other explorations/analysis steps do you think are valuable here?

Flags: needinfo?(chutten)

Chris H-C :chutten

Comment 3

•

7 years ago

Geography might be an interesting factor. See if some countries' users are prone to certain errors.
Build would be interesting. Some builds might be buggier.
There might be something in userPrefs that makes users more prone to these sorts of errors.
Number of pending pings might contribute to count.

If our current hypothesis is correct (that transient network errors are the bulk of things) I would expect the distribution to be fairly level. There will be a lot of US users experiencing problems... but only because we have a lot of users in the US. Ditto Windows, release, and anything that's common.

What would be cool is if our hypothesis were incorrect (or at least incomplete) and that there's something causal hidden in the data. Uncovering it, if it exists, might be difficult.

Flags: needinfo?(chutten)

Kate Ustiuzhanina

Reporter

Comment 4

•

7 years ago

These are the latest results (Beta): https://gist.github.com/katejim/c7ca9befa55992435741910f0deb3e4b . Basically, most of the pings are from Windows and are associated with eChannelOpen failures on shutdown.

Georg Fritzsche [:gfritzsche]

Comment 5

•

7 years ago

We probably need to normalize this to the different OS user populations before drawing conclusions here.
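
A minimal sketch of that normalization, assuming we have per-OS counts of clients that sent health pings and per-OS counts of active clients: divide one by the other so that a large population does not dominate by sheer size. The function name and every number below are hypothetical placeholders, not values from the telemetry data.

```python
def health_ping_rate_by_os(health_clients, active_clients):
    """Return the share of active clients per OS that sent a health ping.

    Both arguments map an OS name to a client count. OSes with no
    active clients are skipped to avoid dividing by zero.
    """
    rates = {}
    for os_name, active in active_clients.items():
        if active <= 0:
            continue
        rates[os_name] = health_clients.get(os_name, 0) / active
    return rates


# Hypothetical counts, for illustration only.
health_clients = {"Windows": 900, "Darwin": 60, "Linux": 40}
active_clients = {"Windows": 9000, "Darwin": 500, "Linux": 500}

rates = health_ping_rate_by_os(health_clients, active_clients)
for os_name, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{os_name}: {rate:.1%}")
```

In this made-up example Windows has by far the most affected clients in absolute terms, yet its per-client rate is not the highest, which is the kind of effect normalization is meant to expose.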

Georg Fritzsche [:gfritzsche]

Comment 6

•

7 years ago

Looking locally at my about:telemetry page, I see a common pattern:
- one health ping with reason "immediate"
- followed less than 1 second later by a "shutdown" health ping

We should be able to confirm whether this is common with a longitudinal, per-client analysis.
If so, that would (again) suggest that we are trying to send pings after the Firefox network stack has already shut down.
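
The per-client check could look roughly like the sketch below: group pings by client, order them by time, and count how many belong to an "immediate" ping followed by a "shutdown" ping within the gap. The tuple layout, the function name, and the sample pings are assumptions for illustration; a real analysis would read these fields from the actual ping data.

```python
from collections import defaultdict


def immediate_then_shutdown_share(pings, max_gap_seconds=1.0):
    """Return the fraction of health pings that are part of an
    "immediate" ping followed by a "shutdown" ping from the same
    client less than max_gap_seconds later.

    Each ping is a (client_id, timestamp_seconds, reason) tuple.
    """
    by_client = defaultdict(list)
    for client_id, timestamp, reason in pings:
        by_client[client_id].append((timestamp, reason))

    total = 0
    in_pattern = 0
    for client_pings in by_client.values():
        client_pings.sort()  # chronological order per client
        total += len(client_pings)
        i = 0
        while i < len(client_pings) - 1:
            (t0, r0), (t1, r1) = client_pings[i], client_pings[i + 1]
            if r0 == "immediate" and r1 == "shutdown" and t1 - t0 < max_gap_seconds:
                in_pattern += 2  # both pings of the pair count
                i += 2
            else:
                i += 1
    return in_pattern / total if total else 0.0


# Hypothetical pings, for illustration only.
sample = [
    ("a", 100.0, "immediate"),
    ("a", 100.4, "shutdown"),   # matches the pattern
    ("b", 50.0, "immediate"),   # no shutdown ping follows
    ("c", 10.0, "immediate"),
    ("c", 75.0, "shutdown"),    # too far apart
]
share = immediate_then_shutdown_share(sample)
print(f"{share:.0%} of pings are in an immediate+shutdown pair")
```

A high share across the population would support the theory above; a low one would suggest looking elsewhere.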

Assignee: katejimmelon → nobody

Georg Fritzsche [:gfritzsche]

Updated

•

7 years ago

Comment 7

•

7 years ago

The minimum next step here would be to confirm what comment 6 suggests.
If that is true and most/many pings follow this pattern, we can probably close this bug.

Georg Fritzsche [:gfritzsche]

Updated

•

7 years ago

status-firefox57: --- → fix-optional

Chris H-C :chutten

Updated

•

7 years ago

Priority: P1 → P2

Chris H-C :chutten

Comment 8

•

7 years ago

Attached image: Screenshot-2017-10-3 https pipeline-cep prod mozaws net.png

Well, it's not causal, but after I pushed bug 1397293, this happened.

I'd say cutting the volume of "health" pings roughly in half is consistent with the theory presented in comment 6.

Chris H-C :chutten

Updated

•

7 years ago

Status: NEW → RESOLVED

Closed: 7 years ago

Resolution: --- → FIXED

Categories

(Toolkit :: Telemetry, enhancement, P2)
