Closed Bug 1187340 Opened 9 years ago Closed 9 years ago

Add probe to track pending telemetry ping load failures

Categories

(Toolkit :: Telemetry, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla42
Iteration:
42.3 - Aug 10
Tracking Status
firefox40 --- wontfix
firefox41 --- fixed
firefox42 --- fixed

People

(Reporter: gfritzsche, Assigned: Dexter)

References

Details

(Whiteboard: [unifiedTelemetry] [uplift4])

Attachments

(1 file, 2 obsolete files)

We need to track how many pings we fail to read from disk (and if we discard them).
Blocks: 1185123
Assignee: nobody → alessio.placitelli
Attached patch bug1187340.patch (obsolete) — Splinter Review
Attachment #8640441 - Flags: review?(gfritzsche)
Whiteboard: [unifiedTelemetry] [uplift4]
Comment on attachment 8640441 [details] [diff] [review] bug1187340.patch Review of attachment 8640441 [details] [diff] [review]: ----------------------------------------------------------------- We specifically want to track pending ping load failures here. Additionally it would be useful to discern between types of failures: * failure to load from disk * text decoder failure (can that step fail?) * JSON parse error We should have test-coverage where possible. ::: toolkit/components/telemetry/Histograms.json @@ +4654,5 @@ > "high": "300000", > "n_buckets": 20, > "description": "Time (ms) it takes for evicting over-quota pings" > }, > + "TELEMETRY_PING_LOAD_FAILURES_COUNT": { TELEMETRY_PENDING_LOAD_FAILURES_COUNT? ::: toolkit/components/telemetry/TelemetryStorage.jsm @@ +1328,5 @@ > + > + let array = yield OS.File.read(aFilePath, options).catch(e => { > + // If we failed to load the ping from file, update the histogram and propagate > + // the rejection. > + Telemetry.getHistogramById("TELEMETRY_PING_LOAD_FAILURES_COUNT").add(); This tracks all ping load failures (including archived etc.). We want to know specifically how many pending pings failed to load.
Attachment #8640441 - Flags: review?(gfritzsche)
Iteration: --- → 42.3 - Aug 10
Attached patch bug1187340.patch (obsolete) — Splinter Review
Attachment #8640441 - Attachment is obsolete: true
Attachment #8641110 - Flags: review?(gfritzsche)
Comment on attachment 8641110 [details] [diff] [review] bug1187340.patch Review of attachment 8641110 [details] [diff] [review]: ----------------------------------------------------------------- ::: toolkit/components/telemetry/TelemetryStorage.jsm @@ +1358,2 @@ > * @return {Promise<Object>} A promise resolved with the ping content or rejected if the > * ping contains invalid data. Lets update the documentation to at least mention the errors used on rejection. @@ +1365,5 @@ > } > + > + let array = yield OS.File.read(aFilePath, options).catch(e => { > + throw new PingReadError(e.message); > + }); Lets use try-catch here, that's the Task-style to deal with it. @@ +1377,5 @@ > + if (typeof(ping.payload) == "string") { > + ping.payload = JSON.parse(ping.payload); > + } > + } catch (e) { > + throw new PingParseError(e.message); I think we should at least remove unparsable pings (and maybe on some/all load failures too) - please file a follow-up for that. ::: toolkit/components/telemetry/tests/unit/test_TelemetrySendOldPings.js @@ +307,5 @@ > + > + h = Telemetry.getHistogramById("TELEMETRY_PENDING_LOAD_FAILURE_READ").snapshot(); > + Assert.equal(h.sum, 1, "Telemetry must report a pending ping load failure"); > + h = Telemetry.getHistogramById("TELEMETRY_PENDING_LOAD_FAILURE_PARSE").snapshot(); > + Assert.equal(h.sum, 1, "Telemetry must report a pending ping parse failure"); Also add a test-case where neither histograms are incremented.
Attachment #8641110 - Flags: review?(gfritzsche) → review+
Attached patch bug1187340.patchSplinter Review
Attachment #8641110 - Attachment is obsolete: true
Attachment #8641164 - Flags: review+
(In reply to Georg Fritzsche [:gfritzsche] from comment #4) > @@ +1377,5 @@ > > + if (typeof(ping.payload) == "string") { > > + ping.payload = JSON.parse(ping.payload); > > + } > > + } catch (e) { > > + throw new PingParseError(e.message); > > I think we should at least remove unparsable pings (and maybe on some/all > load failures too) - please file a follow-up for that. Filed bug 1189425. Try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=600fffc71993
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
Blocks: 1191757
Depends on: 1191825
Comment on attachment 8641164 [details] [diff] [review] bug1187340.patch Approval Request Comment [Feature/regressing bug #]: Telemetry Unification [User impact if declined]: This adds diagnostics for when we fail to load persisted pings - we need this for validating the incoming data and detecting issues. It is part of a mostly fixup & diagnostic uplift batch for 41: http://bit.ly/1LYhA16 [Describe test coverage new/current, TreeHerder]: We have automated test-coverage, it baked on Nightly, we are monitoring it. [Risks and why]: Low-risk, we haven't seen anything crop up on Nightly here. [String/UUID change made/needed]: None.
Attachment #8641164 - Flags: approval-mozilla-aurora?
Comment on attachment 8641164 [details] [diff] [review] bug1187340.patch [Triage Comment] Seems safe to uplift to Beta41, fix has been in Nightly for a week.
Attachment #8641164 - Flags: approval-mozilla-aurora? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: