Closed
Bug 1172688
Opened 9 years ago
Closed 8 years ago
Add telemetry for when gethash calls timeout
Categories
(Toolkit :: Safe Browsing, defect)
Toolkit
Safe Browsing
Tracking
()
RESOLVED
FIXED
mozilla48
Tracking | Status | |
---|---|---|
firefox48 | --- | fixed |
People
(Reporter: francois, Assigned: dlee)
References
Details
(Whiteboard: tpe-seceng)
Attachments
(1 file, 2 obsolete files)
MozReview Request: Bug 1172688 - Add telemetry for when gethash calls timeout. r=francois, bsmedberg
58 bytes,
text/x-review-board-request
|
benjamin
:
review+
francois
:
review+
|
Details |
It would be useful to know how often these completions timeout when trying to reach Google's servers.
Reporter | ||
Updated•9 years ago
|
Updated•8 years ago
|
Component: DOM: Security → Safe Browsing
Product: Core → Toolkit
Reporter | ||
Comment 1•8 years ago
|
||
This is probably going to be similar to bug 1150921.
Whiteboard: tpe-seceng
Assignee | ||
Updated•8 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 3•8 years ago
|
||
Hi François, Do we measure this by how many success gethash request with a timeout or the time between each timeout ?
Flags: needinfo?(francois)
Reporter | ||
Comment 4•8 years ago
|
||
I don't think we have a need for the actual time it took. We just want to make sure that requests don't time out very often. So I would suggest: # timeouts / (# timeouts + # completed requests) which means we also need to add telemetry for the requests that succeeded.
Flags: needinfo?(francois)
Assignee | ||
Comment 5•8 years ago
|
||
Attachment #8732696 -
Flags: review?(francois)
Reporter | ||
Comment 6•8 years ago
|
||
Comment on attachment 8732696 [details] [diff] [review] Add telemetry for gethash Review of attachment 8732696 [details] [diff] [review]: ----------------------------------------------------------------- ::: toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js @@ +553,5 @@ > // Notify the RequestBackoff once a response is received. > this._completer.finishRequest(this.gethashUrl, httpStatus); > > if (success) { > + Services.telemetry.getHistogramById("URLCLASSIFIER_COMPLETE_TIMEOUT").add(0); What happens when the request doesn't timeout but still fails? For example, if it returns a 503 instead of a 200. I would say that as far as the telemetry goes, we should consider that a non-timeout event and mark it as such. We're only interested in seeing how often the gethash endpoint fails to answer at all.
Attachment #8732696 -
Flags: review?(francois)
Assignee | ||
Comment 7•8 years ago
|
||
Modification of this patch: - Record all gethash requests in telemetry instead of just record success requests
Attachment #8732696 -
Attachment is obsolete: true
Attachment #8733220 -
Flags: review?(francois)
Reporter | ||
Comment 8•8 years ago
|
||
Comment on attachment 8733220 [details] [diff] [review] Add telemetry for gethash v2 Review of attachment 8733220 [details] [diff] [review]: ----------------------------------------------------------------- That looks fine to me. I think Benjamin needs to review all new telemetry probes.
Attachment #8733220 -
Flags: review?(francois) → review?(benjamin)
Comment 9•8 years ago
|
||
Comment on attachment 8733220 [details] [diff] [review] Add telemetry for gethash v2 If this is an exploratory measurement, it should expire (typically in 6 months). If this is an operational alerting metric, I think we should be collecting from the release population, and you should talk a bit about how you plan on monitoring this with either daily or realtime alerting.
Attachment #8733220 -
Flags: review?(benjamin) → feedback-
Reporter | ||
Comment 10•8 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #9) > If this is an exploratory measurement, it should expire (typically in 6 > months). > > If this is an operational alerting metric, I think we should be collecting > from the release population, and you should talk a bit about how you plan on > monitoring this with either daily or realtime alerting. The question we're trying to answer is: Is the 10s timeout reasonable and (especially) does this happen often in practice? Is that what you'd consider an exploratory measurement? If so, can we run it for 1 year instead of just 6 months?
Flags: needinfo?(benjamin)
Comment 11•8 years ago
|
||
Please run it for 6 months first and if you still need more data you can extend it. You could instead add a metric that measures the total time a gethash call takes, which would give you a distribution and not just a boolean.
Flags: needinfo?(benjamin)
Reporter | ||
Comment 12•8 years ago
|
||
Dimi, I think we should go with the 6 months that Benjamin suggested. https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1237198&attachment=8735634 suggests that all we need to do is add "expires_in_version" with whatever release will come 6 months after Fx 48.
Assignee | ||
Comment 13•8 years ago
|
||
Review commit: https://reviewboard.mozilla.org/r/44105/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/44105/
Attachment #8737773 -
Flags: review?(francois)
Attachment #8737773 -
Flags: review?(benjamin)
Assignee | ||
Updated•8 years ago
|
Attachment #8733220 -
Attachment is obsolete: true
Reporter | ||
Comment 14•8 years ago
|
||
Comment on attachment 8737773 [details] MozReview Request: Bug 1172688 - Add telemetry for when gethash calls timeout. r=francois, bsmedberg https://reviewboard.mozilla.org/r/44105/#review40815
Attachment #8737773 -
Flags: review?(francois) → review+
Assignee | ||
Comment 15•8 years ago
|
||
(In reply to Dimi Lee[:dimi][:dlee] from comment #13) > Created attachment 8737773 [details] > MozReview Request: Bug 1172688 - Add telemetry for when gethash calls > timeout. r?francois, bsmedberg > > Review commit: https://reviewboard.mozilla.org/r/44105/diff/#index_header > See other reviews: https://reviewboard.mozilla.org/r/44105/ Forget to say, based on comment 4, i still use boolean for now instead of metric suggested in comment 11
Reporter | ||
Comment 16•8 years ago
|
||
I think the boolean is fine for this initial exploration.
Updated•8 years ago
|
Attachment #8737773 -
Flags: review?(benjamin) → review+
Comment 17•8 years ago
|
||
Comment on attachment 8737773 [details] MozReview Request: Bug 1172688 - Add telemetry for when gethash calls timeout. r=francois, bsmedberg https://reviewboard.mozilla.org/r/44105/#review41015 data-review=me with that change or something equivalent ::: toolkit/components/telemetry/Histograms.json:3364 (Diff revision 1) > + "URLCLASSIFIER_COMPLETE_TIMEOUT": { > + "alert_emails": ["gcp@mozilla.com", "francois@mozilla.com"], > + "expires_in_version": "52", > + "kind": "boolean", > + "bug_numbers": [1172688], > + "description": "True if gethash lookup timeout" Suggest more detail: "this metric is recorded every time a gethash lookup is performed. `true` is recorded if the lookup times out."
Assignee | ||
Comment 18•8 years ago
|
||
Comment on attachment 8737773 [details] MozReview Request: Bug 1172688 - Add telemetry for when gethash calls timeout. r=francois, bsmedberg Review request updated; see interdiff: https://reviewboard.mozilla.org/r/44105/diff/1-2/
Attachment #8737773 -
Attachment description: MozReview Request: Bug 1172688 - Add telemetry for when gethash calls timeout. r?francois, bsmedberg → MozReview Request: Bug 1172688 - Add telemetry for when gethash calls timeout. r=francois, bsmedberg
Assignee | ||
Updated•8 years ago
|
Keywords: checkin-needed
Comment 19•8 years ago
|
||
https://hg.mozilla.org/integration/fx-team/rev/51688b3611d4
Keywords: checkin-needed
Comment 20•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/51688b3611d4
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox48:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
You need to log in
before you can comment on or make changes to this bug.
Description
•