Closed Bug 914824 Opened 12 years ago Closed 12 years ago

HTTP cache v2: add telemetry so we can compare with current cache metrics

Categories

(Core :: Networking: Cache, defect)

defect
Not set
minor

Tracking

()

RESOLVED FIXED
mozilla27

People

(Reporter: mayhemer, Assigned: mayhemer)

References

(Blocks 1 open bug)

Details

(Whiteboard: [cache2])

Attachments

(1 file, 1 obsolete file)

I'm thinking of carrying the following probes: NETWORK_DISK_CACHE_DELETEDIR NETWORK_DISK_CACHE_SHUTDOWN CACHE_DEVICE_SEARCH_2 *) Give them NETWORK_CACHE_V2_ prefix so we can distinguish. At *) we don't distinguish between entry-found and entry-not-found. So it's hard to compare with the CACHE_MISS_TIME telemetry I plan.
According CACHE_DEVICE_SEARCH_2, I've rather introduced new probes at bug 913813: NETWORK_CACHE_V1_MISS_TIME_MS and NETWORK_CACHE_V1_HIT_TIME_MS.
Blocks: 916750
Other probes: distinguish the cache related telemetry we collect at nsLoadGroup::TelemetryReportChannel by the cache version used.
Attached patch v1 (obsolete) — Splinter Review
Few existing probes are supplemented with "_V2" versions that are accumulated instead when using the new cache. This way we can compare current and new cache when A/B testing is on. The _V2 probes will eventually be removed.
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Attachment #809417 - Flags: review?(michal.novotny)
Attachment #809417 - Flags: feedback?(taras.mozilla)
Comment on attachment 809417 [details] [diff] [review] v1 Adding v2 seems pretty uncontroversial. i dont see any explicit hit rate telemetry. Do you think extrapolating from CACHE, [NO_CACHE] entries is sufficient? I think we should have a histogram & _V2 equiv for 'time to fetch first entry from cache' this would cover cache opening, etc. What happens during shutdown?
Attachment #809417 - Flags: feedback?(taras.mozilla) → feedback+
Eg, is the new cache doing anything other than 'delete on exit' stuff?
(In reply to Taras Glek (:taras) from comment #4) > Comment on attachment 809417 [details] [diff] [review] > v1 > > Adding v2 seems pretty uncontroversial. i dont see any explicit hit rate > telemetry. Do you think extrapolating from CACHE, [NO_CACHE] entries is > sufficient? I absolutely don't understand this comment :) > > I think we should have a histogram & _V2 equiv for 'time to fetch first > entry from cache' this would cover cache opening, etc. > > What happens during shutdown? On a slow storage delayed writes can be queued and we wait for them all to finish. On an extremely slow storage it can be a really long time to shutdown. There is a bug on this: bug 913822. Plan is to just abandon the unwritten data, leave files open and let the system close is for us (like exit(0) would do). (In reply to Taras Glek (:taras) from comment #5) > Eg, is the new cache doing anything other than 'delete on exit' stuff? As described. Not sure what the current cache is doing with deleting. Michal?
Flags: needinfo?(michal.novotny)
(In reply to Honza Bambas (:mayhemer) from comment #6) > (In reply to Taras Glek (:taras) from comment #4) > > Comment on attachment 809417 [details] [diff] [review] > > v1 > > > > Adding v2 seems pretty uncontroversial. i dont see any explicit hit rate > > telemetry. Do you think extrapolating from CACHE, [NO_CACHE] entries is > > sufficient? > > I absolutely don't understand this comment :) > we do not have hit rate telemetry
(In reply to Taras Glek (:taras) from comment #7) > we do not have hit rate telemetry We have, but I don't see a point to introduce a _V2 for cache hit counters. Why do you think it would be useful? The cache reval/evict logic is the same for both the old and the new cache.
(In reply to Honza Bambas (:mayhemer) from comment #8) > (In reply to Taras Glek (:taras) from comment #7) > > we do not have hit rate telemetry > > We have, but I don't see a point to introduce a _V2 for cache hit counters. > Why do you think it would be useful? The cache reval/evict logic is the > same for both the old and the new cache. For peace of mind, it's not the same because we use the disk space differently(eg no block files).
(In reply to Honza Bambas (:mayhemer) from comment #6) > (In reply to Taras Glek (:taras) from comment #5) > > Eg, is the new cache doing anything other than 'delete on exit' stuff? > > As described. Not sure what the current cache is doing with deleting. > Michal? There might be deleting of the cache (corrupted cache or user cleared the whole cache) in progress during shutdown. In this case we stop deleting of the cache during shutdown and we continue when FF starts again. If user sets an option to clear the cache when FF closes, then we delete the cache on shutdown and we wait until it (and any other deletion in progress) is done.
Flags: needinfo?(michal.novotny)
(In reply to Michal Novotny (:michal) from comment #10) > If user sets an option to clear the cache when FF closes, then we delete the > cache on shutdown and we wait until it (and any other deletion in progress) > is done. Thanks. Makes sense. BTW, seems like we are missing this feature in the new cache. I was somehow persuaded that we were storing everything only in memory when "don't remember history" was chosen. But we delete stuff from disk on shutdown... hmm... kinda performance nasty...
(In reply to Taras Glek (:taras) from comment #9) > For peace of mind, it's not the same because we use the disk space > differently(eg no block files). Hmm.. OK, I think I understand. The eviction logic is going to be different from what we have now. So some compare would be good. Will do that.
Comment on attachment 809417 [details] [diff] [review] v1 Looks good. Just one thing I don't like is that whereas NETWORK_DISK_CACHE_SHUTDOWN measures shutdown time of the cache service, NETWORK_DISK_CACHE_SHUTDOWN_V2 measures shutdown of just CacheFileIOManager and not e.g. CacheStorageService.
Attachment #809417 - Flags: review?(michal.novotny) → review+
(In reply to Michal Novotny (:michal) from comment #13) > Comment on attachment 809417 [details] [diff] [review] > v1 > > Looks good. Just one thing I don't like is that whereas > NETWORK_DISK_CACHE_SHUTDOWN measures shutdown time of the cache service, > NETWORK_DISK_CACHE_SHUTDOWN_V2 measures shutdown of just CacheFileIOManager > and not e.g. CacheStorageService. CacheStorageService shutdown time is zero. It just drops the hash tables and is done. No need to measure it. And also CacheStorageService shutdown happens in different time then CacheFileIOManager shutdown. And in CacheFileIOManager all the shutdown disk activity happens, so that is IMO the right place where some regression can happen. I have to update the patch, Taras want's me to add a V2 probe for cache hit/miss numbers...
Attached patch v1.1Splinter Review
- in nsHttpChannel added HTTP_CACHE_DISPOSITION_2_V2 track - removed mCacheEntryDeviceTelemetryID member, no longer used - otherwise the patch is unchanged
Attachment #809417 - Attachment is obsolete: true
Attachment #810008 - Flags: review?(michal.novotny)
Attachment #810008 - Flags: review?(michal.novotny) → review+
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: