Last Comment Bug 681546 - Avoid large cache evictions as disk fills by smoothing "smart" max cache size
: Avoid large cache evictions as disk fills by smoothing "smart" max cache size
Status: RESOLVED FIXED
[inbound]
:
Product: Core
Classification: Components
Component: Networking: Cache (show other bugs)
: unspecified
: x86_64 Linux
: -- normal (vote)
: mozilla10
Assigned To: Michal Novotny (:michal)
:
: Patrick McManus [:mcmanus]
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-08-23 19:28 PDT by Jason Duell [:jduell] (needinfo me)
Modified: 2011-10-28 16:29 PDT (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Graph showing old/new smart cache sizes (Y) vs free disk space (X) (35.79 KB, image/png)
2011-08-23 19:28 PDT, Jason Duell [:jduell] (needinfo me)
no flags Details
same graph, but log-based X/Y axes (shows small sizes better) (32.13 KB, image/png)
2011-08-23 19:29 PDT, Jason Duell [:jduell] (needinfo me)
no flags Details
v1: implement smoother smart cache size (5.04 KB, patch)
2011-08-23 19:39 PDT, Jason Duell [:jduell] (needinfo me)
no flags Details | Diff | Splinter Review
version of the patch I used to generate the graphs (5.78 KB, patch)
2011-08-23 19:41 PDT, Jason Duell [:jduell] (needinfo me)
no flags Details | Diff | Splinter Review
patch v2 - adds size of current disk cache to the free space (14.54 KB, patch)
2011-09-15 16:54 PDT, Michal Novotny (:michal)
bjarne: review+
Details | Diff | Splinter Review
unbitrotted patch that was checked in (14.48 KB, patch)
2011-10-19 05:41 PDT, Michal Novotny (:michal)
no flags Details | Diff | Splinter Review

Description Jason Duell [:jduell] (needinfo me) 2011-08-23 19:28:12 PDT
Created attachment 555301 [details]
Graph showing old/new smart cache sizes (Y) vs free disk space (X)

Our current smart cache algorithm has huge jumps in the max cache size--we go from using 1 GB down to 800 MB, and from 800 to 625 MB (and then to 500 MB) as free disk space shrinks.  This means users who hit these cutoffs will need to evict 125 or 175 or 200 MB from the cache at startup, which is likely to take a long time (and may hang the cache in the meantime?).   This is fairly rare, but still nice to avoid.

I've written a patch which avoids these rude large evictions by more smoothly increasing/decreasing the smart cache size.  In order to avoid doing evictions at every single startup (which seems bad for startup benchmarking if nothing else: but that's an important case), it only evicts in 10 MB chunks.  For large disk drives (>7GB free), that will only happen when free disk space has changed by more than 100 MB (or 200 MB for >25GB free) since the last startup (and the cache was full).

Attached are some graphs that show the smart size graphed against free space, using both the old algorithm and my new one.
Comment 1 Jason Duell [:jduell] (needinfo me) 2011-08-23 19:29:56 PDT
Created attachment 555302 [details]
same graph, but log-based X/Y axes (shows small sizes better)
Comment 2 Jason Duell [:jduell] (needinfo me) 2011-08-23 19:39:26 PDT
Created attachment 555304 [details] [diff] [review]
v1: implement smoother smart cache size

Michal: while writing this I noticed that we base the smart size only on remaining amount of disk space on the device.  We don't include the actual size of the cache in this calculation, which is a little silly, in that it means we'll allow the cache to be as big as "40%" of the remaining space (for very small amounts of free space), but then as we actually take up space with the cache, "40% of free space" gets smaller and we actually have to shrink the cache until it's about 29% of free space (see comment in code). 

Do we know the size of the cache this early on in the code? If we do, we could include it in the calculation and get a more stable "free" space value, independent of the cache's size.
Comment 3 Jason Duell [:jduell] (needinfo me) 2011-08-23 19:41:35 PDT
Created attachment 555305 [details] [diff] [review]
version of the patch I used to generate the graphs

This is the same patch, with some code jammed in to generate the raw data I used to generate the graphs, in case that's useful.

There's obviously no easy way to test this code automatedly.  So I figure testing the range of the old/new functions and plotting the results is as good as we need to do here.
Comment 4 Michal Novotny (:michal) 2011-08-31 06:14:45 PDT
(In reply to Jason Duell (:jduell) from comment #2)
> Do we know the size of the cache this early on in the code? If we do, we
> could include it in the calculation and get a more stable "free" space
> value, independent of the cache's size.

We don't know the size at this point. The total size is stored in cache map header which is read when disk device is initialized.
Comment 5 Michal Novotny (:michal) 2011-08-31 07:01:04 PDT
Comment on attachment 555304 [details] [diff] [review]
v1: implement smoother smart cache size

> +  // - Would it be better to do the calculation including the cache size? Yes,
> +  //   but also slower. So no :)

It wouldn't be slower since the total size is stored in the header and isn't calculated. We just need to postpone the call of GetSmartCacheSize() after the disk device initialization. Otherwise the patch looks good.
Comment 6 Michal Novotny (:michal) 2011-09-15 16:54:21 PDT
Created attachment 560490 [details] [diff] [review]
patch v2 - adds size of current disk cache to the free space

Successfully tested on tryserver:

https://tbpl.mozilla.org/?tree=Try&usebuildbot=1&rev=7eda0d31ad59
Comment 7 Bjarne (:bjarne) 2011-09-16 04:07:41 PDT
Comment on attachment 560490 [details] [diff] [review]
patch v2 - adds size of current disk cache to the free space

Review of attachment 560490 [details] [diff] [review]:
-----------------------------------------------------------------

Could you comment upon how/why current code use the 'firstRun' field/param, the corresponding pref, and how this patch affects this?

::: netwerk/cache/nsCacheService.cpp
@@ -223,5 @@
>          if (NS_FAILED(rv)) 
>              smartSizeEnabled = PR_FALSE;
>          if (smartSizeEnabled) {
>              nsCacheService::SetDiskCacheCapacity(mSmartSize);
> -            // also set on observer, in case mDiskDevice not init'd yet.

Why was this removed?
Comment 8 Michal Novotny (:michal) 2011-09-28 07:02:07 PDT
(In reply to Bjarne (:bjarne) from comment #7)
> Could you comment upon how/why current code use the 'firstRun' field/param,
> the corresponding pref, and how this patch affects this?

This patch doesn't change it in any way. It just removes the param from nsSetSmartSizeEvent since it was unused.


> >          if (smartSizeEnabled) {
> >              nsCacheService::SetDiskCacheCapacity(mSmartSize);
> > -            // also set on observer, in case mDiskDevice not init'd yet.
> 
> Why was this removed?

Because now it is ensured that this code is called after initialization of mDiskDevice.
Comment 9 Bjarne (:bjarne) 2011-09-28 11:56:38 PDT
(In reply to Michal Novotny (:michal) from comment #8)
> (In reply to Bjarne (:bjarne) from comment #7)
> > Could you comment upon how/why current code use the 'firstRun' field/param,
> > the corresponding pref, and how this patch affects this?
> 
> This patch doesn't change it in any way. It just removes the param from
> nsSetSmartSizeEvent since it was unused.

Is it used anywhere now?

> > Why was this removed?
> 
> Because now it is ensured that this code is called after initialization of
> mDiskDevice.

Ok - fair enough.
Comment 10 Michal Novotny (:michal) 2011-09-28 15:09:41 PDT
(In reply to Bjarne (:bjarne) from comment #9)
> > This patch doesn't change it in any way. It just removes the param from
> > nsSetSmartSizeEvent since it was unused.
> 
> Is it used anywhere now?

Yes, it is used in nsCacheProfilePrefObserver::PermittedToSmartSize() and in nsCacheProfilePrefObserver::ReadPrefs() - see variable firstSmartSizeRun. It is used to check if this is the first smart size calculation ever. And if it is, we don't allow to override the capacity that was previously set by the user.
Comment 11 Bjarne (:bjarne) 2011-10-18 11:39:00 PDT
Comment on attachment 560490 [details] [diff] [review]
patch v2 - adds size of current disk cache to the free space

Review of attachment 560490 [details] [diff] [review]:
-----------------------------------------------------------------

Ok
Comment 13 Michal Novotny (:michal) 2011-10-19 05:41:21 PDT
Created attachment 568016 [details] [diff] [review]
unbitrotted patch that was checked in
Comment 14 Marco Bonardo [::mak] 2011-10-20 02:56:38 PDT
https://hg.mozilla.org/mozilla-central/rev/604f0f9eabda
Comment 15 Dave Townsend [:mossop] 2011-10-28 16:29:43 PDT
Could this be the cause of assertions in bug 697657? They have nsCacheService in most of the stacks

Note You need to log in before you can comment on or make changes to this bug.