Closed
Bug 1502378
Opened 6 years ago
Closed 6 years ago
pingsender may be fooled into losing some main/shutdown pings on non-windows clients
Categories
(Toolkit :: Telemetry, defect, P1)
RESOLVED
FIXED
Release | Tracking | Status |
---|---|---|
firefox65 | --- | affected |
People
(Reporter: chutten, Assigned: chutten)
References
(Blocks 1 open bug)
Details
In looking at proportions of "main" pings with reason: "shutdown" sent from Firefox 55+, our Linux population sends a much noisier and lower proportion[1] than the broader population[2], by about 10-15%.

We only looked into this when Jan-Erik noticed that the curl implementation of PingSender::Post seems to treat all response codes < 400 as success[3]. The Windows implementation takes the stricter approach and treats only a 200 code as success.

This matters because a success results in the deletion of the pending ping. If pingsender thinks it's a success in a case when ingest didn't receive the ping, this means data loss.

The Telemetry HTTP Edge Server spec will only hand out 200 for success and a code >= 400 on failure[4]... but that presumes pingsender was able to reach the edge server at all.

I don't know if it is possible to positively confirm with available data whether the curl implementation's broader definition of success is problematic. Thus, I propose we change the curl implementation to match the Windows implementation (only 200 is success) and then measure builds with and without the fix for changes in volume.

[1]:
[2]:
[3]: https://searchfox.org/mozilla-central/rev/72b1e834f384a2ffec6eb4ce405fbd4b5e881109/toolkit/components/telemetry/pingsender/pingsender_unix_common.cpp#187
[4]: https://docs.telemetry.mozilla.org/concepts/pipeline/http_edge_spec.html#postput-response-codes
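To make the difference between the two success rules concrete, here is a minimal sketch. It is not pingsender's actual code: the function names IsSuccessLenient, IsSuccessStrict and CodesAtRisk are hypothetical, and the sample status codes are chosen only for illustration. It shows which response codes the "< 400" rule would accept (and so delete the pending ping) while the "only 200" rule would keep the ping for a retry.

```cpp
#include <vector>

// Hypothetical helper names for illustration; none of these are taken
// from the pingsender source.

// Lenient rule, as the report describes the curl path:
// any HTTP response code below 400 counts as success.
bool IsSuccessLenient(long httpCode) { return httpCode < 400; }

// Strict rule, as the report describes the Windows path:
// only HTTP 200 counts as success.
bool IsSuccessStrict(long httpCode) { return httpCode == 200; }

// Returns the codes that the lenient rule accepts but the strict rule
// rejects. For these codes the pending ping would be deleted under the
// lenient rule even though the edge server spec never uses them to
// signal success.
std::vector<long> CodesAtRisk(const std::vector<long>& codes) {
  std::vector<long> atRisk;
  for (long code : codes) {
    if (IsSuccessLenient(code) && !IsSuccessStrict(code)) {
      atRisk.push_back(code);
    }
  }
  return atRisk;
}
```

For example, given the codes 200, 204, 302, 404 and 500, CodesAtRisk returns 204 and 302. A redirect such as 302 is the kind of answer that could plausibly come from something other than the edge server (an assumption, not something the available data confirms), which is the case the proposal is meant to guard against.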
Updated • 6 years ago
Assignee: nobody → chutten
Status: NEW → ASSIGNED
Points: --- → 2
Priority: -- → P1
Comment 1 • 6 years ago
Some quick analysis shows that the Linux population is missing subsessions at a lower rate than the broader population, which is inconsistent with the theory that the curl implementation is inherently more problematic. (It might still be more problematically-written, but there may be aspects of the environment or population that lessen the effect.) Improvements to how the pingsender implementation deals with the HTTP edge server can come via the broader effort in bug 1290256.