Bug 705796 - telemetry: gather stats on how often disk cache is corrupt.
Product: Core
Classification: Components
Component: Networking: Cache
Version: unspecified
Platform: All All
Importance: -- normal
Target Milestone: mozilla14
Assigned To: Jason Duell [:jduell] (needinfo me)
QA Contact: Patrick McManus [:mcmanus]
Reported: 2011-11-28 11:08 PST by Jason Duell [:jduell] (needinfo me)
Modified: 2012-04-09 10:10 PDT (History)

v1 gather stats on how often disk cache is corrupt. (3.06 KB, patch)
2011-11-28 11:08 PST, Jason Duell [:jduell] (needinfo me)
no flags
v2: unbitrotted (3.30 KB, patch)
2012-03-29 16:59 PDT, Jason Duell [:jduell] (needinfo me)
michal.novotny: review+

Description Jason Duell [:jduell] (needinfo me) 2011-11-28 11:08:26 PST
Created attachment 577314
v1  gather stats on how often disk cache is corrupt.

Like it says.  One nuance: don't gather a statistic (either false or true) if the cache is new (new profile/user): we want to be able to divide true/false to get the % of restarts where we see a corrupt cache.

Note that until this lands and we have data, SHUTDOWN_OK is probably a fairly good approximation.
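
The recording rule described above can be sketched as follows. This is a minimal illustration, not the attached patch: the enum `CacheOpenResult` and the functions `CorruptSampleFor` and `CorruptPercent` are hypothetical names, and the percentage is computed here as true / (true + false), which is one reading of "divide true/false".

```cpp
#include <optional>

// Hypothetical outcome of opening the disk cache at startup.
// These names are assumptions for illustration; they are not from the patch.
enum class CacheOpenResult { NewlyCreated, OpenedOk, Corrupt };

// Returns the boolean sample to report, or std::nullopt when nothing
// should be recorded (a brand-new cache, e.g. a new profile/user).
std::optional<bool> CorruptSampleFor(CacheOpenResult result) {
  switch (result) {
    case CacheOpenResult::NewlyCreated:
      return std::nullopt;  // new cache: record neither true nor false
    case CacheOpenResult::OpenedOk:
      return false;         // existing cache opened cleanly
    case CacheOpenResult::Corrupt:
      return true;          // existing cache was found corrupt
  }
  return std::nullopt;
}

// Percentage of recorded opens that found a corrupt cache.
// Returns 0 when no samples were recorded.
double CorruptPercent(unsigned trueCount, unsigned falseCount) {
  unsigned total = trueCount + falseCount;
  if (total == 0) {
    return 0.0;
  }
  return 100.0 * trueCount / total;
}
```

Because new caches contribute no sample at all, the denominator only counts restarts that had an existing cache to open, which is what makes the percentage meaningful.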
Comment 1 Bjarne (:bjarne) 2011-12-20 07:11:41 PST
Comment on attachment 577314
v1  gather stats on how often disk cache is corrupt.

Review of attachment 577314:

Weird..  didn't I review something like this some time ago? (Maybe I forgot to submit..?)

Anyway - I'd propose to combine this with the existing timer in OpenDiskCache() so that we get separate timing info for opening corrupted and working caches. The actual working/corrupted-ratio can be estimated by looking at the number of submissions (which will not be absolutely accurate, but with large enough numbers should give a nice indication).

If you don't want to combine them, the code is fine and r+. However, I'm clearing the review-request so that you can re-request if you decide to combine.
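
A rough sketch of the combined approach proposed in this comment, under stated assumptions: the `OpenTimings` struct and its members are hypothetical and stand in for two separate timing histograms, one per cache state. The corrupt/working ratio is then estimated from the number of samples in each, as the comment suggests.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for two timing histograms, one per cache state.
// Not the real telemetry API; names are assumptions for illustration.
struct OpenTimings {
  std::vector<double> workingMs;  // open times for working caches
  std::vector<double> corruptMs;  // open times for corrupted caches

  // File the elapsed open time under the matching cache state.
  void Record(bool corrupt, double elapsedMs) {
    (corrupt ? corruptMs : workingMs).push_back(elapsedMs);
  }

  // Estimated fraction of opens that hit a corrupt cache, derived only
  // from how many samples each histogram received. Returns 0 if empty.
  double CorruptFraction() const {
    std::size_t total = workingMs.size() + corruptMs.size();
    if (total == 0) {
      return 0.0;
    }
    return static_cast<double>(corruptMs.size()) / total;
  }
};
```

With this layout a separate boolean metric is not strictly required, since the sample counts carry the same information, though only as accurately as the submissions themselves.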
Comment 2 Jason Duell [:jduell] (needinfo me) 2012-03-29 16:59:50 PDT
Created attachment 610758
v2: unbitrotted


I just found and unbitrotted this patch.  Do you think I should implement Bjarne's idea (if we did, I think we'd want to still keep this boolean metric so we know what % of the time the cache is corrupt), or just take this as it is?
Comment 3 Michal Novotny (:michal) 2012-03-30 05:39:48 PDT
Comment on attachment 610758
v2: unbitrotted

I don't think we really need to separate timings by working and corrupted caches. Time spent opening a corrupted cache should always be lower than for a working cache, so it won't affect the telemetry data in a bad way. I.e. high values of NETWORK_DISK_CACHE_OPEN should always represent opening a working cache. BTW, the ratio of corrupted to working cache opens should be pretty low, and if it isn't, we have a bigger problem than inaccurate NETWORK_DISK_CACHE_OPEN telemetry.
Comment 4 Jason Duell [:jduell] (needinfo me) 2012-04-06 13:15:23 PDT
Comment 5 Matt Brubeck (:mbrubeck) 2012-04-09 10:10:18 PDT
