The default bug view has changed. See this FAQ.

Limit disk cache size until/unless "clear recent data" can be done async

RESOLVED FIXED in mozilla17

Status

()

Core
Networking: Cache
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: jduell, Assigned: nwgh)

Tracking

unspecified
mozilla17
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

Unless we can fix bug 648232 before the next code fork, I propose we limit the disk cache for now to something smaller than 1 GB, say 400 MB.

The issue is that we're blocking while we evict all entries instead of using nsDeleteDir.

We don't have great data on how long this is blocking the browser.  We've got no direct telemetry on it (let's add some telemetry to EvictEntries() as part of this bug?).  The closest thing we have are the stats for NETWORK_DISK_CACHE_DELETEDIR (which measures how long nsDeleteDir takes to delete all the files in the cache: I'm not sure how that compares to eviction, but I'm guessing eviction can only take longer as it has bookkeeping overhead too), and we foolishly capped those at 10 sec max (I'm changing those in bug 7092862).
  
   https://metrics.mozilla.com/pentaho/content/pentaho-cdf-dd/Render?solution=metrics2&path=telemetry&file=telemetryHistogram.wcdf
   
Shows that deleting the disk cache takes >10s around 20% of the time. Users who've selected "clear my last hour of browsing history" shouldn't have it take that long (and it's probably much longer in some cases).
(Reporter)

Comment 1

5 years ago
Note that this post from a Chrome dev also suggests that we may want to limit cache size because at some point, lookup speed deteriorates:

  https://plus.google.com/103382935642834907366/posts/XRekvZgdnBb

"beyond 320MB, the cache hit rate improvements were negligible, but the long tail of disk lookups started increasing, so performance actually degraded, and it was actually better to simply load the resource from the network instead."

We should get some measurements and consider pushing out A|B tests to see if a smaller cache size might actually be better.
OS: Linux → All
Hardware: x86_64 → All
Assignee: nobody → hurley
Created attachment 649369 [details] [diff] [review]
patch

Here's a first run at taking the approach that was discussed on dev-tech-network. Try looks good; a few android tests failed unexpectedly, but those seem to be infrastructure issues (I'm re-running the failed tests, just to make sure).
Attachment #649369 - Flags: feedback?(jduell.mcbugs)
(Reporter)

Comment 3

5 years ago
Comment on attachment 649369 [details] [diff] [review]
patch

Review of attachment 649369 [details] [diff] [review]:
-----------------------------------------------------------------

+r with nits fixed (and tryserver OK).

::: netwerk/cache/nsCacheService.cpp
@@ +1594,5 @@
>      return NS_OK;
>  }
>  
> +// Runnable sent from cache thread to main thread
> +class nsSetUseOldMaxSmartSizePrefEvent: public nsRunnable

Since this event only ever turns off the old size, it might read better as "nsDisableOldMaxSmartSizePrefEvent".  Just a thought.

::: netwerk/cache/nsDiskCacheDevice.cpp
@@ +1004,5 @@
>              exists = false;
>          } else {
>              // don't gather telemetry for "corrupt cache" for new profile
>              // where cache doesn't exist (most likely case if we're here).
> +            nsCacheService::MarkStartingFresh();

Hmm, I found a bug here while looking at this: bug 780750.

Once that's fixed we can remove two out of the three MarkStarting calls here, and just use the last one (under if !exists).

I probably should have based that patch on top on this one.  But just remove the other two calls already and things will work well enough--the return rv case here is rare (and the disk cache doesn't get rebuilt, so you actually don't want to MarkFresh anyway), and the ERROR_CORRUPTED call just winds up resulting in calling MarkFresh twice.
Attachment #649369 - Flags: feedback?(jduell.mcbugs) → feedback+
Here's the try run for this patch combined with the one from bug 780750 - https://tbpl.mozilla.org/?tree=Try&rev=988be862d179
Blocks: 780922
Created attachment 649832 [details] [diff] [review]
patch (version for checkin)

Here's the patch with Jason's nits fixed, and rebased on top of his patch in bug 780750, carrying forward r+

Try looks good https://tbpl.mozilla.org/?tree=Try&rev=988be862d179
Attachment #649369 - Attachment is obsolete: true
Attachment #649832 - Flags: review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/3247ce86f58f
Status: NEW → ASSIGNED
https://hg.mozilla.org/mozilla-central/rev/3247ce86f58f
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla17
Comment on attachment 649832 [details] [diff] [review]
patch (version for checkin)

Christian, Josh mentioned I should get sr from you for this, given the scope of its change. It's already landed, but can pretty easily be backed out if you find some problem with it.
Attachment #649832 - Flags: superreview?(cbiesinger)

Updated

5 years ago
Depends on: 783755
Comment on attachment 649832 [details] [diff] [review]
patch (version for checkin)

+    bool            UseOldMaxSmartSize()        { return mUseOldMaxSmartSize; }
+    void            SetUseNewMaxSmartSize()     { mUseOldMaxSmartSize = false; }

Two thoughts on this:
- UseOldMaxSmartSize reads ambiguously, maybe rename to ShouldUseOldMaxSmartSize?
- SetUseNewMaxSmartSize - maybe give it a bool argument to allow setting it either way
Attachment #649832 - Flags: superreview?(cbiesinger) → superreview+
I'm not entirely clear on why we want to use the bigger size when we are creating a new cache, but not otherwise? Or am I misreading the patch? Maybe you can point me to the mentioned dev-tech-network discussion
The changes from comment 9 sound good, I'll take care of them in the followup to this (bug 783755).

As for your question in comment 10, I think you're misreading the patch (or I missed something glaringly obvious). Any time we start with a fresh cache (whether it's because we crashed, are being started for the very first time, the user cleared history, or something else) we will start using the 350MB cache. We're doing the resize then so we don't go thrashing people's disks when they upgrade to Firefox 17 by deleting 650MB of data (a whole CD ROM worth!). Our hope is that this will, over the course of a few releases, get us to the point where most people are using the 350MB size (due to our abysmal crash rate), and then we can just bite the bullet for the last small group of people and make the change permanent. The dev-tech-network discussion is at https://groups.google.com/d/topic/mozilla.dev.tech.network/jMziw6VBD9s/discussion

Comment 12

5 years ago
Just out of curiosity,
What happened to the hard limit of not caching a file larger than 1/8 of the cache size? 50MB was the value set for the pref(browser.cache.disk.max_entry_size), which requires the cache size to be more than 400mb. 350MB means the pref must be lowered, if I'm guessing right.
(Reporter)

Updated

5 years ago
Duplicate of this bug: 800186

Comment 14

5 years ago
I've limited my cache size to 350MB but when I run 'du -sh .' in the cache dir on OS X
/Users/willf/Library/Caches/Firefox/Profiles/dp2ght2h.Firefox7/Cache
$ du -sh .
449M	.

I find this surprising.  Note, I'm running Firefox 16.0.1.

Also, the about:cache states:

Number of entries: 	88384
Maximum storage size: 	358400 KiB
Storage in use: 	358399 KiB
Cache Directory: 	/Users/willf/Library/Caches/Firefox/Profiles/dp2ght2h.Firefox7/Cache
We round the data size up to the nearest kB, but FS blocks are usually 4kB or more, hence the difference.
You need to log in before you can comment on or make changes to this bug.