Closed Bug 193911 Opened 22 years ago Closed 14 years ago

Increase default disk cache size

Categories

(Core :: Networking: Cache, enhancement)

Tracking

RESOLVED FIXED
mozilla2.0b4

People

(Reporter: schapel, Assigned: schapel)

References

(Blocks 1 open bug)

Details

(Keywords: perf)

Attachments

(2 files, 1 obsolete file)

In the past several years, the average web page has become larger as the average
user's bandwidth has increased. The default size of the disk cache should be
made larger to keep the cache hit rate high. I suggest doubling the default size
on all platforms. This should not be a problem because disk sizes have been
soaring since 1998 when GMR drives were introduced.
The disk cache size defaults to 50MB, which is quite large.  Do you have actual
statistics on what the cache hit rate is for typical usage at that cache size?
No, I have no actual data collected. It's just a matter of principle that the
disk cache will need to be doubled in size every five years or so to keep the
hit rate up, assuming that average screen resolution
http://upsdell.com/BrowserNews/stat_trends.htm#res and depth 
http://upsdell.com/BrowserNews/stat_trends.htm#col increases, and average
connection speed http://upsdell.com/BrowserNews/res_load.htm#d02 increases, in
turn allowing web sites to get larger. And it's realistic to do so as quickly as
disk capacity is increasing.
Right.  Do you really think that the default disk cache was 50MB five years ago?
 ;) It was not.  In particular, see bug 77218, of which this is a duplicate if
there is no useful data on cache performance....
OK. I'll wait until 2005 to confirm this. ;-)

*** This bug has been marked as a duplicate of 77218 ***
Status: UNCONFIRMED → RESOLVED
Closed: 22 years ago
Resolution: --- → DUPLICATE
Blocks: 216490
I think it's time to reopen this, now that work is being done to fully implement offline browsing. As one data point, IE uses over 1 GB of disk cache by default on modern computers. Bumping up the default disk cache size to just 100 MB would help offline browsing work much better.
Status: RESOLVED → UNCONFIRMED
Resolution: DUPLICATE → ---
*** Bug 359033 has been marked as a duplicate of this bug. ***
Status: UNCONFIRMED → NEW
Ever confirmed: true
(In reply to comment #6)
> I think it's time to reopen this, now that work is being done to fully
> implement offline browsing. As one data point, IE uses over 1 GB of disk cache
> by default on modern computers. Bumping up the default disk cache size to just
> 100 MB would help offline browsing work much better.

How do you arrive at 100M? 

I can't see doing this without usage data. I doubt a high percentage of people do, or will do, offline browsing. Also, the 1GB default for IE is hardly a rationale for FF going higher. IE's default might be just plain showmanship (or stupidity) on MS' part. AFAICT with IE it's excessive, as it does little or nothing for performance and is therefore wasting disk space.
How do you plan to handle disk quotas?  I have several accounts with 1GB or less disk quota.  IE's behavior there doesn't matter much, since it doesn't really run on any of the operating systems involved (unless you count IE5/Solaris, of course)...
I arrived at 100 MB because it is double the current 50 MB. Without any usage data at all, you can know that every so often the disk cache size should be doubled to keep the same hit rate. I suppose the only question is how often it should be doubled. How can we determine that?

Do you have data about how many people do or will do offline browsing? Without that data, I don't think anyone can claim that it won't be used much. It will be used, and the larger the cache, the more effective it will be. Right now, I cannot visit sites I browsed just a few days ago if I try to browse offline. Even without Firefox users using offline browsing, some are asking for the default to be raised. Is that enough usage data? If not, what other usage data do we need?

I would think that disk quotas would be handled exactly as they are now. Surely there were cases where trying to create a 50 MB file failed when the disk cache was first bumped up to that value. In the case of someone that doesn't have enough disk space, they can manually lower the cache size.
> Right now, I cannot visit sites I browsed just a few days ago if I try to
> browse offline.

I suspect that might also be related to the collision rate for the hash function used in the cache (which causes automatic eviction of the older page by the newer one if the cache is full).  Worth instrumenting to make sure, of course.  I would really like to know at what cache size (given typical resource size) the hash collisions end up being the limiting factor.  I know they already are for sites that have lots of similar URIs.

To make things clear, I'm not against a bigger disk cache; I just think we need a clear story for use cases where a 1GB disk cache is simply not an option.  100MB would be a lot less of a problem.  Expecting tens of thousands of users (say at MIT, which has a 1GB quota) to all manually change their cache size is unreasonable, so we need a way to do this easily at rollout. 
I'll clear my disk cache and see how long it takes to fill up. If it's full after a few days, that means it's because the older items got evicted, not because of hash collisions. I'll also try browsing offline before my cache fills up to see if hash collisions are causing cache misses.

A 1 GB disk cache sounds too big to me, for the reasons you describe. The rate of increase of default disk cache size should fall well behind the rate of increase of disk capacities to help ensure that we don't take many users' remaining hard disk space.
My 50 MB disk cache is now full. I actually half filled it earlier and lost it all when the Shockwave Player crashed. After just watching a handful of videos on YouTube, the cache was already half full. Going back and visiting the sites I had been to seemed to work fairly well, except for some missing content that I suppose was evicted from the cache. It looks like a 50 MB cache won't hold a day's browsing for many users with a broadband connection (I have a middle-of-the-road 3 Mbps connection).
Oh, hmmm.  YouTube.  Yeah, if we're tossing movies into the cache it'll fill up fast.  Pretty much no matter what size it is, really.

At the same time, there really isn't that much reason to cache them for extended periods of time.  I wonder whether we can either cache them in a separate cache or preferentially expire them...
how should cache determine whether something is a "movie" or not?

We have a maximum file size for cache entries; it's the smaller of 64 MB and half the cache size. Perhaps that's too large...
http://lxr.mozilla.org/seamonkey/source/netwerk/cache/src/nsDiskCacheDevice.cpp#692
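For reference, a minimal sketch of that size check (names and structure here are illustrative assumptions, not the actual Necko code behind the link above):

#include <algorithm>
#include <cstdint>

// Sketch only: an entry is written to the disk cache only if it is no larger
// than the smaller of 64 MB and half the total cache capacity.
static bool EntryFitsInDiskCache(uint64_t entrySizeBytes, uint64_t cacheCapacityBytes)
{
    const uint64_t kHardCapBytes = 64ull * 1024 * 1024;                        // 64 MB ceiling
    const uint64_t maxEntrySize  = std::min(kHardCapBytes, cacheCapacityBytes / 2);
    return entrySizeBytes <= maxEntrySize;
}

With a 50 MB cache, that works out to a 25 MB per-entry cap, so a single long video can still consume half the cache.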
> how should cache determine whether something is a "movie" or not?

By MIME type, I would assume... Or via an explicit metadata flag set by higher layers.

We do need to cache the stuff, I think, because we need to pass the file to the plug-in in some cases, no?

I guess my main point is that 50MB or 100MB would fit a lot of gmail-like apps, but only very few movies....

I do think that going to a 100MB cache is fine, but I don't think it would solve the problems Steve is concerned about on its own.
Looking through the cache entries, it looks like large (> 1 MB) YouTube files have an expiration time of just a few hours. For example, Data size: 1512748 bytes, Fetch count: 1, Last modified: 2007-07-04 23:08:46, Expires: 2007-07-05 01:06:46 (you can find these files by searching for URLs with "get_video"). If the cache is just big enough so that one session of watching YouTube doesn't fill or nearly fill the cache with these files, they will automatically be preferentially evicted. Just making the default cache size bigger (say 100 MB), as well as recommending that users who watch lots of videos make the disk cache bigger still (say 200 MB), would go a long way to mitigate the problem of YouTube videos wiping out everything else in the cache. If there are popular video sites that have much longer expiration times for their large files, they can be evangelized to set their expirations shorter.
With a "Last modified" that's that recent, the "Expires" we compute will be pretty short, yeah.  Or does YouTube actually send "Expires" headers?
On two YouTube videos I just checked, neither gets sent with an Expires header, so the expiration must be computed.
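For context, the usual heuristic when a response carries neither Expires nor Cache-Control: max-age is to grant a freshness lifetime equal to a fraction, commonly 10%, of the time since Last-Modified. A minimal sketch of that idea (assumed here to be roughly what Gecko computes; not copied from its source):

#include <ctime>

// Heuristic "Expires" when the server sends no explicit freshness information.
// The 10% factor is the commonly used heuristic value; treat it as an
// assumption rather than the exact Necko implementation.
static time_t HeuristicExpirationTime(time_t fetchTime, time_t lastModified)
{
    const time_t age = fetchTime - lastModified;  // time since the resource last changed
    return fetchTime + age / 10;                  // fresh for 10% of that span
}

That is consistent with the observations in this bug: a video modified less than a day before fetching gets only a couple of hours of freshness, while one uploaded several months ago computes out to roughly two weeks.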
We should _definitely_ not evangelize video sites to set a short expiration time. The video is unlikely to ever change, so why is a short expiration good?
Maybe there's something about the cache eviction policy I'm not understanding, but it seems to me that it wouldn't make sense for a video to remain in the disk cache for weeks. I will rarely go back and watch a video again days later, but I will go back to websites I frequent every day or every week. If I wanted to set the disk cache to hold the non-changing content that the browser will request again for a period of one week, I would need to set the disk cache to several hundred megabytes, because most of it would be used to store these videos I rarely go back to see again. On the other hand, if the videos are evicted from the cache after about one day, I can live with a 100 MB cache. It seems to me that a short expiration is desirable for generally one-time content such as videos and a long expiration is desirable for content that I will load often, such as large, static graphics from my frequently visited sites.

YouTube does not set an expiration for its videos, so the expiration on Gecko browsers depends on how long ago the video was uploaded. To test, I watched a video that was uploaded several months ago, and the expiration was calculated to be in about two weeks. That does make sense to me, because I will be much more likely to go back and watch that video again, as opposed to the videos uploaded just a day ago, which will expire in just a few hours.

In the case of my usage, it looks like a 50 MB disk cache is too small, as watching just a few videos (for example, just starting to watch two 10-minute videos for long enough for the videos to download completely) will wipe out everything in the disk cache. With a 100 MB disk cache, my cache is full after less than two days of browsing, but there are several videos in the cache that have already expired, so I assume that all the static content of sites I frequent is still in the cache, and the videos will be evicted before the static data that I want to remain in the cache. I presume that other users have similar usage, and their disk caches are being wiped out regularly as well. Is there more usage data I should collect? I can try a 50 MB disk cache without viewing any videos and see how long it takes to fill up.
Despite viewing no videos, I got similar results. Within about six hours of clearing the disk cache, the cache held 50 MB, and by the next day my 100 MB cache was full. I suspect that Google Maps hybrid view, Flash games, and a 10 MB executable I downloaded contributed the most to the cache size.

When I was starting programming, I had a disk quota of 100 KB. By the time I got to university, I had a quota of 10 MB. And now Boris says that nowadays a 1 GB disk quota is common. Disk quotas on shared systems seem to be increasing by about two orders of magnitude per decade. Doubling the size of the disk cache every three years would increase it in size by only one order of magnitude per decade. It seems fairly safe to double the size of the disk cache every few years, and it seems safe to predict that the size will have to double every few years to keep a comparable hit rate. Specifically, a 50 MB disk cache in 2007 seems like it's not big enough to hold static content from sites that users visit every day, at least for users that have broadband connections and use any of the many web applications that use lots of that bandwidth.
OK, I buy that.  Let's see about doing this for 1.9...

As a back-of-the-envelope calculation, 50MB is about 70 screens full of Google Maps hybrid view (the jpegs + pngs for that view are about 700KB over here).

Alternately, with a modest broadband connection (1Mbps), it's about 7 minutes of continuous downloading...  Which is not all that much, really.
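As a quick check of those figures (the ~700KB per screen and 1Mbps link are taken from this comment; the rest is straightforward arithmetic):

#include <cstdio>

int main()
{
    // Back-of-the-envelope check of the numbers above.
    const double cacheMB  = 50.0;   // default disk cache size
    const double screenMB = 0.7;    // ~700 KB of JPEGs + PNGs per hybrid-view screen
    const double linkMbps = 1.0;    // modest broadband connection

    std::printf("screens of map tiles in the cache: %.0f\n", cacheMB / screenMB);               // ~71
    std::printf("minutes of downloading to fill it: %.1f\n", cacheMB * 8.0 / linkMbps / 60.0);  // ~6.7
    return 0;
}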

I still think we need to think about an eviction strategy that would preferentially evict big files.  But that's for a separate bug.
Flags: blocking1.9?
Assignee: gordon → nobody
QA Contact: tever → networking.cache
Some final information for posterity:

I have two Gecko browsers (Firefox 3 and SeaMonkey 2 trunk builds) with large disk caches. They both seem to have plateaued in their disk cache utilization several days after clearing their caches, so that making the disk cache larger will have little or no effect. The disk cache stats for the two browsers are:
Number of entries: 8167   Storage in use:  80393 KiB
Number of entries: 7944   Storage in use: 136983 KiB

In my case, making the cache larger than 100 MB will give little benefit, although a cache of, say, 200 MB could benefit users who download large files on a regular basis. For now, probably the most important change for further increasing the disk cache hit rate for most users (beyond increasing the default disk cache size) would be fixing bug 175600, which limits the disk cache to 8192 entries. Implementing a heuristic to selectively evict cache entries such as video files that will probably not be viewed again might be worth the trouble, but it seems easy enough to simply manually increase the size of the disk cache, and most users have plenty of disk storage to do so.
Simply increasing the size will do nothing on its own, imo, given the number of YouTube-like things people do all the time that are basically one-off page views.
Flags: blocking1.9? → blocking1.9-
Whiteboard: [wanted-1.9]
Flags: blocking1.9- → blocking1.9?
Flags: blocking1.9?
*Objecting*

Actually, I came across this bug report while trying to look for a fellow entry to _decrease_ the default cache size. I regard it as a kind of intrusion when such an innocent browser application as Firefox takes over 50 MB of my home directory (which is limited to 100 MB by quota). Just think of companies where hundreds of users have their home folders on the main server, sharing the same hard disk. Maybe we can settle on disabling the cache by default whenever the connection uses a proxy; in that case local caching is largely redundant, since the proxy has the faster cache anyway.
In response to comment #26, we're never going to decide on a cache size that everyone is happy with. There will always be some users who want it lower and some who want it higher. The best we can do is choose a default that as many users as possible are happy with, such that of those who are not happy with it, about half want it lower and about half want it higher. To ensure that remains the case, we will have to increase the size of the disk cache every so often, because over time web pages contain more data and disk quotas get larger. Those who want it smaller or larger may change it manually. Additionally, note that not all proxies are caching proxies.

In response to comment #25, I thought I already demonstrated that increasing the size of the disk cache makes a significant difference. Merely doubling the size of the cache means the cache holds days worth of sites rather than hours. That makes all the difference in the world for sites that I go back to every day or two, especially if attempting to use offline browsing.
It may be interesting to see what the 'competition' does: IE6, IE7, Opera, Safari, etc...
As far as I know, MS generally uses a percentage of the disk the cache is on (so my IE6 cache is 1136MB, which is twenty times more than Firefox's).
Flags: wanted1.9+
Whiteboard: [wanted-1.9]
In practice most people only visit a couple of websites frequently. This means that an adequately sized cache will help to speed up the loading of these websites considerably.

The current default size of 50 MB is too small. I observe that my own website - which has plenty of cacheable objects - doesn't use the cache when I visit the site the next day. If the cache is too small, the overhead of the cache will reduce the speed rather than increase it.

Practically all users will have enough disk space to handle a cache of - let's say - 500 MB. In the case of (corporate) users with disk quotas, it should not be a big issue for their IT department to roll out Firefox with a smaller cache. In the case of a caching proxy it will even be beneficial to roll out Firefox with a disabled cache.

In practice most users don't know what a browser is, so it is not realistic to expect them to change the default value of the cache.
What would help would be to put cached items in buckets based on size, there would be many small items that change infrequently where the cache would be just
patch to double default disk cache size
Assignee: nobody → steve.chapel
Status: NEW → ASSIGNED
Attachment #426432 - Flags: review?(cbiesinger)
Attachment #426432 - Flags: review?(cbiesinger) → review+
Attachment #426432 - Flags: superreview?(darin.moz)
Darin isn't doing reviews anymore.
Attachment #426432 - Flags: superreview?(darin.moz) → superreview?(bzbarsky)
Attachment #426432 - Flags: superreview?(bzbarsky) → superreview+
Keywords: checkin-needed
I believe company/school sysadmins will tune the cache size when users have a
quota defined.

IE now sets 1GB as the default cache size, and both Google Chrome and Safari
don't let the user set the cache size (in order to use as much disk space as
needed to do good caching).

Web sites' content has grown over the last few years, but it seems Firefox is still
using 50MB as the default disk cache size (since 2004!). Raising it to 100 MB is better but still too small.
A big cache could be good even for users with high-speed DSL connections. Today Firefox's default cache size is so small that it's not possible to cache even a few YouTube videos.

So I think a big cache with good in-memory indexing of objects, a fast search
algorithm to find whether an object is in the cache (some btree-style structure), and a smart
cache replacement algorithm could be a good performance enhancement for
Firefox.


Of course, bug https://bugzilla.mozilla.org/show_bug.cgi?id=175600 should be fixed before moving to big cache size.


Your thoughts?
http://hg.mozilla.org/mozilla-central/rev/bb9e847a02c8
Status: ASSIGNED → RESOLVED
Closed: 22 years ago14 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.3a2
Flags: in-testsuite-
So, this bug came with a bunch of wins, but also some regressions:
Regression: Tp4 (Private Bytes) increase 4.74% on WINNT 6.1 Firefox http://tinyurl.com/yhgo6ab
Regression: Tp4 (%CPU) increase 34.83% on WINNT 6.1 Firefox http://tinyurl.com/yjqz786
Regression: Tp4 (Memset) increase 6.53% on WINNT 6.1 Firefox http://tinyurl.com/yfv3u6v

Unless these were expected (and it doesn't look like it), this should be backed out and the regression tracked down.
In fact I've just backed this out:
http://hg.mozilla.org/mozilla-central/rev/639c6e42ee38
http://hg.mozilla.org/mozilla-central/rev/db63ed9f0bb6
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Additionally, this regression, which the regression script originally attributed to a different bug, went away with the backout:
Regression: Tp4 (Private Bytes) increase 7.99% on Fedora 12 x64	- Constantine Firefox http://tinyurl.com/yzx58un
My thoughts on the disk cache are that it should be only large enough to produce a performance gain, not a loss.

Case in point: with the original versions of IE, the 10% rule would run into situations where IE would spend more time trying to find out whether it had the page in cache than it would have spent downloading it all over again. That is the potential problem I can see with just making a blanket increase.

I think that if we wanted to really solve the problem completely, the final solution would be to come up with some metric by which each item is benchmarked to see whether it should be pulled from the cache or downloaded. Call it brilliant cache.
Matthew, if you look at the numbers from this checkin, it sped up pageload times, but increased memory usage and CPU usage during pageload.

Is that a "performance gain" or a "performance loss"?
In this case it's a gain. My point was not that it should not be done, but rather that a potential final solution should use a dynamic metric, because everyone's system and internet connection is different, and the caching system may need to take that into account.

E.g. I could be on a really slow P3 system with an IDE drive but a fast internet connection, which could favor a smaller cache because the data can be fetched from the net faster; or I could have a fast i7 with an SSD and a dial-up connection, in which case getting it from disk would be faster.

That's all.
> with a IDE drive, system, but have a fast internet connection.

IDE drive speeds on the slow end are about 100Mb/s.  IDE seek latencies are a lot lower than typical network connection latencies.

You would have to have a _very_ contrived scenario to make getting data off the net faster than getting it off the local drive, unless you have a lot of ongoing local disk traffic anyway (e.g. are swapping).
One thing to consider, which I have not seen mentioned so far, is the fact that large caches can be counterproductive.

I set my own cache size on Firefox and IE to 10MiB deliberately to minimise cache hits between sessions whilst ensuring I have sufficient cache for logos etc. when browsing a site.

The reason I do this is that with larger caches there is an increased likelihood of an out-of-date page being retrieved from the cache instead of the current version from the site, and I end up having to do CTRL+F5 to bypass the cache and retrieve a fresh copy of the page anyway (which is more effort and more bandwidth usage).

Whilst I know to do this, most users do not know that the page they are looking at may not be the current version and even if they did, they probably do not know how to bypass the cache.
Hard drives (even SSDs) are still very slow, and indeed it could be time-consuming to walk through a big disk cache to check whether an object is stored in it or not.

What about doing a cache in RAM? A background job running at low priority could load the disk cache's content into the memory cache at startup and perform regular syncs between the memory cache and the disk cache... A lot of computers today have more than 1GB of RAM that is usually under-used when people only surf with them.

I know bugzilla isn't a good place to ask for features but I believe caching could be a true performance advantage so I wanted to propose this idea...
Malbrouck: any sane modern OS already uses RAM to cache disk, handles syncing between the two, etc.

Bugzilla is a fine place to ask for features; just file bugs with severity set to "enhancement".
I can understand the small memory increase. There are more entries in the disk cache, so there are more entries in the data structure in memory that keeps track of the disk cache items. But why the large increase in CPU usage? The in-memory data structure seems to be a hash table, which should take O(1) time to look up entries. Perhaps one of the common operations performed on the data structure takes O(n) time.
Hmm.  4-5% memory increase is something on the order of 3-4MB in this case.  Is that really an expected size increase for the in-memory part of disk cache here?
Please increase the size of the Firefox disk cache to at least 1GB, so the load on servers and network links can be reduced. If we now set caching headers on static objects (CSS, JS, images), only IE users benefit from it (because they have a large cache), while Opera and Firefox users need to re-download everything on every visit, because the tiny 50MB cache fills up within at best a few hours of browsing. A side effect of this would also be faster websites for users, as less data needs to be transferred.

Every day this simple configuration change gets delayed, bandwidth is wasted and users need to wait longer for websites to display.
The resounding answer from the Mozilla Caching Summit is that we should increase the default cache size, so we should land this.  And then maybe come up with a smarter algorithm that looks into how much free space the user has.

But for this patch to make much difference I think we need to also land bug 290032 and bug 175600.  Otherwise the extra space may not be used (or well used).
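A hypothetical sketch of the kind of free-space-aware sizing being suggested (the fraction and bounds below are made-up placeholders, not what any of the referenced bugs actually implement):

#include <algorithm>
#include <cstdint>

// Hypothetical "smart" default: scale the disk cache with the free space on
// the profile's volume, clamped to sane bounds. Placeholder numbers only.
static uint64_t SmartCacheSizeBytes(uint64_t freeDiskBytes)
{
    const uint64_t kMinBytes = 50ull * 1024 * 1024;     // never below the old 50 MB default
    const uint64_t kMaxBytes = 1024ull * 1024 * 1024;   // cap at 1 GB
    const uint64_t proposed  = freeDiskBytes / 20;      // ~5% of free space
    return std::min(kMaxBytes, std::max(kMinBytes, proposed));
}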
Depends on: 175600, 290032
Blocks: http_cache
Someone will need to look into why the patch caused such a large hit on CPU and memory usage. I don't think I'll have time, and I'm not even sure how to start looking into it if I did have time.
Blocks: 559942
Has it ever been envisaged to replace Firefox's internal cache with external open-source cache software (e.g. Squid)?

This would immediately solve the default size problem and associated bugs (8192 max objects and hash collisions). Squid is very stable, and Firefox could be bundled with a Squid installation using a standard configuration like:
- listening on the loopback interface 127.0.0.1:3128
- big cache size (1 GB at least)
On top of that, having an external cache would also avoid increasing Firefox's CPU and memory consumption.

This external cache should still let the user configure his own proxy in FF's prefs. Redirection of HTTP requests through Squid should happen, with or without a proxy configured, the same way it does today with the internal cache.

I believe two quality open-source projects like FF and Squid could do an excellent job together.

Your thoughts?
I think a great solution would be for Firefox to use Tokyo Cabinet for object storage. It has its own index (you can choose among several of them) and it's the fastest thing there is. For the unique key, you simply hash the URL of the object to be cached/retrieved. What do you think?
Dup of bug 569709?

changeset:   43560:cc8369aa7bd3
user:        Ehsan Akhgari
date:        Thu Jun 10 22:46:51 2010 -0400
summary:     Bug 569709 - Figure out the max number of entries we should store in the disk cache, and bump the default size of the disk cache to 250MB; r=jduell
Yup, it's a dupe now I guess...
Status: REOPENED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → DUPLICATE
OK then, the part of bug 569709 that increased the size was backed out due to the same regression we had before...
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
(In reply to comment #35)
> So, this bug came with a bunch of wins, but also some regressions:
> Regression: Tp4 (Private Bytes) increase 4.74% on WINNT 6.1 Firefox
> http://tinyurl.com/yhgo6ab
> Regression: Tp4 (%CPU) increase 34.83% on WINNT 6.1 Firefox
> http://tinyurl.com/yjqz786
> Regression: Tp4 (Memset) increase 6.53% on WINNT 6.1 Firefox
> http://tinyurl.com/yfv3u6v
> 
> Unless these were expected (and it doesn't look like it), this should be backed
> out and the regression tracked down.

I ran Talos + TryServer on a tree with a 50MB cache and 100MB cache. The tree that had the 100MB cache was also with Michal's Asynchronous Reads patch (see Bug 513008). The results were similar to those observed by Shawn earlier. Namely, there was a CPU increase. Here are the results: 

__50 MB cache__
time: 00:17:19
tp4: 602.98
tp4_modlistbytes: 71.6
cpu: 37.02
tp4_pbytes: 91.3
tp4_memset: 91.2
tp4_shutdown: 1272.0

___100MB cache___
time: 00:13:21
tp4: 356.88
tp4_modlistbytes: 63.4
cpu: 47.36
tp4_pbytes: 92.9
tp4_memset: 94.9
tp4_shutdown: 3562.0

We think the reason for this is that the tp4 page set is ~93MB, and if we increase the disk cache to 100MB, we have to do comparatively little network traffic, since the page set pretty much fits in the cache. Because we are almost only operating on the cache, it seems reasonable that we see the bump in CPU.

The more exciting piece of news that came with these results is that doubling the cache size significantly improves overall time measurements. 

Given this, jst, sicking and I think it is safe to land the size increase to 100MB on mozilla-central and keep an eye on it. We will likely want to have the tp4 test stick to using a 50MB cache even if we land this change, for a number of reasons (to force us to still hit the net, to not raise false alarms from the CPU regression, and because we ultimately hope to have a dynamically sized cache based on available disk space), but that can be put to a separate bug.

Any thoughts and feedback are welcome.
Yup, I think we need to get this landed, the regression is due to an interaction with talos and the cache size and not a real regression.

Right now the cache size is such that about half the talos page set fits in the cache, which means that as talos goes through its pages it writes them all to the cache and evicts old versions, and that way cycles everything through the cache for each time through the pages. If we increase the size of the cache enough that the whole page set fits in the cache then the first time through the pages we'll end up reading from the network (localhost), but each consecutive time through the pages we'll read from the cache and not from the network, so this means there's less consistency in performance characteristics between the first run through and all consecutive runs.

I think we should do one of two things here, either lock the size of the cache that we use when running talos to the current size (50 megs) or disable the cache completely when running talos.

If we lock the cache size to the current size when running talos then we get the benefit of not needing to rebase any talos numbers and we also continue to exercise reading, writing, and evictions in the cache as part of talos (which may or may not be a good thing), but it does mean that the performance characteristics when running talos are unaffected by any changes we're making to the cache size, and I predict there will be more of those than this bug alone in the near future.

If we on the other hand disable the cache completely while running talos then we'd get the benefit of always reading from the network each and every time through and we wouldn't waste time writing to the cache only to evict before we get to read it etc.

My vote for now is to lock the size to 50 megs for talos: simply find the profile it uses while running and set the cache size in the prefs in that profile. Eventually we will need to lock down the size of the cache for talos anyway, since the long-term plan is to make the disk cache size dynamically adapt to the amount of available space on the computer etc., and AFAICT that would be a disaster for talos in production.
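For example, something along these lines in the Talos profile (assuming it picks up a user.js/prefs.js; browser.cache.disk.capacity is specified in KiB, so 50 MB = 51200):

// Pin the Talos profile's disk cache to the current default so that
// cache-size changes on trunk don't shift the test numbers.
user_pref("browser.cache.disk.capacity", 51200);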

Thoughts?
It's a little weird to be testing a configuration different from the one we ship to users...

In the short term, locking down the talos cache size sounds like the expedient thing to do.  In the long term, doing talos runs in 3 configurations: locked-down partial-size cache, cache-as-shipped-to-users, and no-cache might be interesting (but needs lots more machines, of course).
I'd actually look at things from the opposite point of view -- talos is not a great test for analyzing performance of our network or cache, because it doesn't really control those all that well.  However, if all of tp4 were to fit inside the cache (even if it has to get read in there the first time), it should significantly reduce noise, and should make it more sensitive to everything not network related.   It would also have the benefit of speeding up Tp4 runs by a good amount.  I'd actually suggest that we bump the cache up to 250MB as well instead of 100, if we can actually fill it without collisions (as we're going to be able to do).
I initially had the same thought. However there are two issues:

1. Performance of things like layout and parsing is heavily dependent on in what chunk sizes we receive data. What we likely want to test with the talos tests is the chunk size of loading things from network. I'm not sure if the chunk size when loading from cache is similar.

2. jst pointed out that if we fit the whole cache in memory, we'll end up with the first run being very slow since we load everything from network, and all remaining runs being faster since we load from cache. Thus introducing irregularities in the numbers. It's possible that the math that we're using to calculate averages will remove most of this (iirc we drop the fastest and slowest numbers before calculating averages), but we'll definitely reduce the effectiveness of our average-run calculations.
> I'm not sure if the chunk size when loading from cache is similar.

It's not.
> 2. jst pointed out that if we fit the whole cache in memory, we'll end up with
> the first run being very slow since we load everything from network, and all
> remaining runs being faster since we load from cache. 

But pre-populating the cache (since we know the pageset) would fix this right?
(In reply to comment #63)
> > 2. jst pointed out that if we fit the whole cache in memory, we'll end up with
> > the first run being very slow since we load everything from network, and all
> > remaining runs being faster since we load from cache. 
> 
> But pre-populating the cache (since we know the pageset) would fix this right?

Yes. (alternatively we fix the math such that we ignore the first run in addition to the slowest subsequent run).
Yeah, fixing the math to drop the first run (or even better -- do a 1-cycle run through, and then restart the browser entirely for the actual timing cycles) is probably much easier than having a premade cache.

For the chunk size issue, is that something that's easy to tweak via a pref or similar?  Also, do we have data that shows what the effects of different chunk sizes are on layout and parsing?
(In reply to comment #65)
> For the chunk size issue, is that something that's easy to tweak via a pref or
> similar?  Also, do we have data that shows what the effects of different chunk
> sizes are on layout and parsing?

A pref is an interesting idea. I suspect that would be doable.

As far as chunk size effects data, I strongly suspect that is a moving number. Things like lazy frame construction, new HTML parsing algorithms, retained layers, etc all affect how much work we duplicate when chunk size is smaller.
Moving number in what sense?  Like, for a fixed codebase, do we know what effect different numbers have?  I'm mainly asking why we think the value that it is right now is somehow correct, and why we think that reading from the cache will give less correct (as opposed to just different) results.

I'm pretty strongly in favor of upping the cache size right away for b4 (even to 250MB -- 100MB isn't really that big of a bump), and let's sort out the details afterwards.
(In reply to comment #67)
> Moving number in what sense?  Like, for a fixed codebase, do we know what
> effect dufferent numbers have?  I'm mainly asking why we think the value that
> it is right now is somehow correct, and why we think that reading from the
> cache will give less correct (as opposed to just different) results.

I meant moving for a moving codebase.

I don't have data no. Though given the extremely big speedup of 40% (!!) gained from fitting all of talos in the cache, it is a good guess that at least some of that is due to now loading with bigger chunks.

> I'm pretty strongly in favor of upping the cache size right away for b4 (even
> to 250MB -- 100MB isn't really that big of a bump), and let's sort out the
> details afterwards.

Agreed. Lets land this for b4, IMHO with talos preffed back to 50MB, and continue to discuss how we want to configure talos going forward.
> Agreed. Lets land this for b4, IMHO with talos preffed back to 50MB, and
> continue to discuss how we want to configure talos going forward.

Should I try running Talos on an even larger cache increase--say, to 250MB, as Vlad suggests--before we land it? Or should we just land on M-C and keep an eye on it?
Well if we land pref'd to 50MB for talos, there should be no change on m-c.  However, I'm actually against doing that; I think we should land with the full value for everything, including m-c talos.
I'd say just land it on central and keep an eye on it. Though it'll be hard to tell the results whatever we do with talos. If we push talos back to 50MB then obviously talos won't tell us about any problems. If we don't push talos back to 50MB then the numbers will change so much that it'll hide any problems.
Yup, I agree with just bumping the size to 250 right now per the above discussions. Long term we can see what else we can do for talos here, but let's not slow down and worry about that now.
Attachment #426432 - Flags: approval2.0+
Attachment #465444 - Flags: superreview+
Attachment #465444 - Flags: review+
Attachment #465444 - Flags: approval2.0+
Attachment #426432 - Flags: approval2.0+
With commit message this time.
Attachment #465444 - Attachment is obsolete: true
Attachment #465445 - Flags: superreview+
Attachment #465445 - Flags: review+
Attachment #465445 - Flags: approval2.0+
Yay! 

Next I think we should implement a "smart" cache size ASAP

    https://bugzilla.mozilla.org/show_bug.cgi?id=559942

so we can make our cache as big as our competitors (but also sensibly-sized on small disks).  That should be easy to do, as long as our cache scales decently. We should try to get a competitively-sized cache into FF4.

[Byron: experimenting to see how the cache performs at much larger sizes should be at the top of your list (check for cache collisions;  some tp4-like perf test with a much bigger working set;  make sure the cache actually fills up to capacity, etc.)]

If we can also solve the large files problem (bug 81640) that'd be awesome, but I don't know if there's a quick fix for that, unless we can simply not store files above a threshold size in the cache.

Followup in those bugs.
I'm on it. And yes, let's push the discussion to the relevant bugs.
Pushed. Commit can be found at http://hg.mozilla.org/mozilla-central/rev/3bfef869720e
Status: REOPENED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → FIXED
Target Milestone: mozilla1.9.3a2 → mozilla2.0b4
A cache is only useful for data that is static and not changing from hour to hour or day to day, for example website logos and CSS files.

But most of the cache is filled with dynamic content like news, blogs and so on, which is of no use to have in the cache.

A cache size of 50 MB is perfect if we really have static content which is not changing.

So a clever strategy is to keep only things in the cache that appear repeatedly. After the Google logo image has been read and verified 3 times without changes, it is clear that it should stay in the cache.

A stupid strategy is to increase the cache size to collect huge dynamic content.
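A rough sketch of that "keep what repeats" idea, purely to make the proposal concrete (the names and the threshold are hypothetical, not existing Necko APIs):

#include <cstdint>

// Hypothetical eviction preference: when space is needed, evict entries that
// have been fetched fewer than three times (likely one-off content) before
// entries that have proven repeat value; fall back to LRU within each group.
struct CacheEntryInfo {
    uint32_t fetchCount;     // number of times this entry has been hit
    uint32_t lastFetchTime;  // seconds since epoch of the last hit
};

static bool EvictBefore(const CacheEntryInfo& a, const CacheEntryInfo& b)
{
    const bool aOneOff = a.fetchCount < 3;    // "verified 3 times" threshold from the comment above
    const bool bOneOff = b.fetchCount < 3;
    if (aOneOff != bOneOff)
        return aOneOff;                       // one-off entries are evicted first
    return a.lastFetchTime < b.lastFetchTime; // otherwise least recently used goes first
}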
We might use such a "clever strategy" in bug 512849, and at that time we can reevaluate the size of the cache.
> a cache is only useful for data that is static

That's true if all you use it for is a cache.  We also use it for view-source, save-as, history navigation, etc.  Furthermore, even content that changes "from hour to hour" can be usefully cached for reload purposes.

Now perhaps we should separate the "cache" functionality from the "get at content you've recently downloaded" functionality.  But really, they're remarkably similar...
OK, then I'll continue this thread in bug 512849.
See Also: → 512849