HTTP cache v2: add telemetry so we can compare with current cache metrics

RESOLVED FIXED in mozilla27

Status

()

Core
Networking: Cache
--
minor
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: mayhemer, Assigned: mayhemer)

Tracking

(Blocks: 1 bug)

unspecified
mozilla27
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [cache2])

Attachments

(1 attachment, 1 obsolete attachment)

23.08 KB, patch
michal
: review+
mayhemer
: checkin+
Details | Diff | Splinter Review
(Assignee)

Description

5 years ago
I'm thinking of carrying the following probes:
NETWORK_DISK_CACHE_DELETEDIR
NETWORK_DISK_CACHE_SHUTDOWN
CACHE_DEVICE_SEARCH_2 *)

Give them NETWORK_CACHE_V2_ prefix so we can distinguish.

At *) we don't distinguish between entry-found and entry-not-found.  So it's hard to compare with the CACHE_MISS_TIME telemetry I plan.
(Assignee)

Comment 1

5 years ago
According CACHE_DEVICE_SEARCH_2, I've rather introduced new probes at bug 913813: NETWORK_CACHE_V1_MISS_TIME_MS and NETWORK_CACHE_V1_HIT_TIME_MS.
(Assignee)

Updated

5 years ago
Blocks: 916750
(Assignee)

Comment 2

5 years ago
Other probes: distinguish the cache related telemetry we collect at nsLoadGroup::TelemetryReportChannel by the cache version used.
(Assignee)

Comment 3

5 years ago
Created attachment 809417 [details] [diff] [review]
v1

Few existing probes are supplemented with "_V2" versions that are accumulated instead when using the new cache.  This way we can compare current and new cache when A/B testing is on.

The _V2 probes will eventually be removed.
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Attachment #809417 - Flags: review?(michal.novotny)
Attachment #809417 - Flags: feedback?(taras.mozilla)

Comment 4

5 years ago
Comment on attachment 809417 [details] [diff] [review]
v1

Adding v2 seems pretty uncontroversial. i dont see any explicit hit rate telemetry. Do you think extrapolating from CACHE, [NO_CACHE] entries is sufficient?

I think we should have a histogram & _V2 equiv for 'time to fetch first entry from cache' this would cover cache opening, etc.

What happens during shutdown?
Attachment #809417 - Flags: feedback?(taras.mozilla) → feedback+

Comment 5

5 years ago
Eg, is the new cache doing anything other than 'delete on exit' stuff?
(Assignee)

Comment 6

5 years ago
(In reply to Taras Glek (:taras) from comment #4)
> Comment on attachment 809417 [details] [diff] [review]
> v1
> 
> Adding v2 seems pretty uncontroversial. i dont see any explicit hit rate
> telemetry. Do you think extrapolating from CACHE, [NO_CACHE] entries is
> sufficient?

I absolutely don't understand this comment :)

> 
> I think we should have a histogram & _V2 equiv for 'time to fetch first
> entry from cache' this would cover cache opening, etc.
> 
> What happens during shutdown?

On a slow storage delayed writes can be queued and we wait for them all to finish.  On an extremely slow storage it can be a really long time to shutdown.  There is a bug on this: bug 913822.  Plan is to just abandon the unwritten data, leave files open and let the system close is for us (like exit(0) would do).



(In reply to Taras Glek (:taras) from comment #5)
> Eg, is the new cache doing anything other than 'delete on exit' stuff?

As described.  Not sure what the current cache is doing with deleting.  Michal?
Flags: needinfo?(michal.novotny)

Comment 7

5 years ago
(In reply to Honza Bambas (:mayhemer) from comment #6)
> (In reply to Taras Glek (:taras) from comment #4)
> > Comment on attachment 809417 [details] [diff] [review]
> > v1
> > 
> > Adding v2 seems pretty uncontroversial. i dont see any explicit hit rate
> > telemetry. Do you think extrapolating from CACHE, [NO_CACHE] entries is
> > sufficient?
> 
> I absolutely don't understand this comment :)
> 
we do not have hit rate telemetry
(Assignee)

Comment 8

5 years ago
(In reply to Taras Glek (:taras) from comment #7)
> we do not have hit rate telemetry

We have, but I don't see a point to introduce a _V2 for cache hit counters.  Why do you think it would be useful?  The cache reval/evict logic is the same for both the old and the new cache.

Comment 9

5 years ago
(In reply to Honza Bambas (:mayhemer) from comment #8)
> (In reply to Taras Glek (:taras) from comment #7)
> > we do not have hit rate telemetry
> 
> We have, but I don't see a point to introduce a _V2 for cache hit counters. 
> Why do you think it would be useful?  The cache reval/evict logic is the
> same for both the old and the new cache.

For peace of mind, it's not the same because we use the disk space differently(eg no block files).
(In reply to Honza Bambas (:mayhemer) from comment #6)
> (In reply to Taras Glek (:taras) from comment #5)
> > Eg, is the new cache doing anything other than 'delete on exit' stuff?
> 
> As described.  Not sure what the current cache is doing with deleting. 
> Michal?

There might be deleting of the cache (corrupted cache or user cleared the whole cache) in progress during shutdown. In this case we stop deleting of the cache during shutdown and we continue when FF starts again.

If user sets an option to clear the cache when FF closes, then we delete the cache on shutdown and we wait until it (and any other deletion in progress) is done.
Flags: needinfo?(michal.novotny)
(Assignee)

Comment 11

5 years ago
(In reply to Michal Novotny (:michal) from comment #10)
> If user sets an option to clear the cache when FF closes, then we delete the
> cache on shutdown and we wait until it (and any other deletion in progress)
> is done.

Thanks.  Makes sense.

BTW, seems like we are missing this feature in the new cache.  I was somehow persuaded that we were storing everything only in memory when "don't remember history" was chosen.  But we delete stuff from disk on shutdown... hmm...  kinda performance nasty...
(Assignee)

Comment 12

5 years ago
(In reply to Taras Glek (:taras) from comment #9)
> For peace of mind, it's not the same because we use the disk space
> differently(eg no block files).

Hmm.. OK, I think I understand.  The eviction logic is going to be different from what we have now.  So some compare would be good.  Will do that.
Comment on attachment 809417 [details] [diff] [review]
v1

Looks good. Just one thing I don't like is that whereas NETWORK_DISK_CACHE_SHUTDOWN measures shutdown time of the cache service, NETWORK_DISK_CACHE_SHUTDOWN_V2 measures shutdown of just CacheFileIOManager and not e.g. CacheStorageService.
Attachment #809417 - Flags: review?(michal.novotny) → review+
(Assignee)

Comment 14

5 years ago
(In reply to Michal Novotny (:michal) from comment #13)
> Comment on attachment 809417 [details] [diff] [review]
> v1
> 
> Looks good. Just one thing I don't like is that whereas
> NETWORK_DISK_CACHE_SHUTDOWN measures shutdown time of the cache service,
> NETWORK_DISK_CACHE_SHUTDOWN_V2 measures shutdown of just CacheFileIOManager
> and not e.g. CacheStorageService.

CacheStorageService shutdown time is zero.  It just drops the hash tables and is done.  No need to measure it.  And also CacheStorageService shutdown happens in different time then CacheFileIOManager shutdown.  And in CacheFileIOManager all the shutdown disk activity happens, so that is IMO the right place where some regression can happen.

I have to update the patch, Taras want's me to add a V2 probe for cache hit/miss numbers...
(Assignee)

Comment 15

5 years ago
Created attachment 810008 [details] [diff] [review]
v1.1

- in nsHttpChannel added HTTP_CACHE_DISPOSITION_2_V2 track
- removed mCacheEntryDeviceTelemetryID member, no longer used
- otherwise the patch is unchanged
Attachment #809417 - Attachment is obsolete: true
Attachment #810008 - Flags: review?(michal.novotny)
Attachment #810008 - Flags: review?(michal.novotny) → review+
https://hg.mozilla.org/mozilla-central/rev/0d74127c0b31
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
You need to log in before you can comment on or make changes to this bug.