User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007 Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007 The disk cache seems to only use the sizes of the files stored in the cache in calculating how much space it takes up. It ignores the size of the indices (_CACHE_MAP_ , _CACHE_00[1-3]_) when working out how much space the cache is taking up. This has the effect that a cache limited to a certain size actually takes up more space than the setting browser.cache.disk.capacity is supposed to be limiting it to. (If this behaviour is by design, maybe some UI work would help stop the preferences-->advanced-->cache page implying that the user is being given the opportunity to set the size of the cache on disk.) If the disk cache is on a small partition, it can happen that as well as exceeding the soft limit of browser.cache.disk.capacity, it can fill the partition to capacity, despite browser.cache.disk.capacity being less than the size of the drive. This can cause the partition to run out of space, resulting in pages loaded from the cache being empty or truncated. Reproducible: Always Steps to Reproduce: 1. Set up an 8 MB partition (or ramdisk) 2. Set the cache to 6 MB, and the cache location to the partition, using the cache property panel. 3. Use mozilla for a while, until the cache fille Actual Results: Partition fills past capacity; pages display corrupted, disappear when back button pressed. Shift-reload necessary to make corrupted page display. Expected Results: Partition fills to 6 MB; as indices grow larger, fewer files are stored in cache. Workaround: disable disk cache (browser.cache.disk.enable=false)
note that _CACHE_00[1-3]_ files contain also all cached objects <= 16 KB in size
This is an automated message, with ID "auto-resolve01". This bug has had no comments for a long time. Statistically, we have found that bug reports that have not been confirmed by a second user after three months are highly unlikely to be the source of a fix to the code. While your input is very important to us, our resources are limited and so we are asking for your help in focussing our efforts. If you can still reproduce this problem in the latest version of the product (see below for how to obtain a copy) or, for feature requests, if it's not present in the latest version and you still believe we should implement it, please visit the URL of this bug (given at the top of this mail) and add a comment to that effect, giving more reproduction information if you have it. If it is not a problem any longer, you need take no action. If this bug is not changed in any way in the next two weeks, it will be automatically resolved. Thank you for your help in this matter. The latest beta releases can be obtained from: Firefox: http://www.mozilla.org/projects/firefox/ Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html Seamonkey: http://www.mozilla.org/projects/seamonkey/
Setting disk cache size to 2000 KB and browsing until it's full in Firefox 1.5b2, about cache reports as follows: Disk cache device Number of entries: 186 Maximum storage size: 2000 KiB Storage in use: 1999 KiB Cache Directory: J:\Documents and Settings\lewis\Local Settings\Application Data\Mozilla\Firefox\Profiles\7c0c5mvh.default\Cache DIR, on the other hand, reports 45 File(s) 2,703,816 bytes Which is 2640 KiB . If we exclude the _CACHE files from the calculation, the space taken is under 2000 KiB. Thus, I believe all the original concerns are still valid: 'use up to n MB for cache' still implies that this limits the size of the cache, when it actually doesn't; if browser.cache.disk.capacity were used to notify the browser of the maximum free space available for the the cache to grow to into, the cache would still overflow it. If the user's space usage is limited by capacity or quota, the cache will still experience write errors and corrupt the displayed pages.
(In reply to comment #3) > If the user's space usage is limited by > capacity or quota, the cache will still experience write errors and corrupt the > displayed pages. Write errors ought to not corrupt displayed pages. please file a bug about that.
Two issues here: * size of _CACHE_MAP is not accounted for (which is between 9k and 132k) * size of the metadata for each cache entry is not accounted for (see also bug 277473) The first one is easy (patch underway...) The second one is less easy, allthough: If we make SetMetaDataElement of nsCacheEntry to also call Inc/DecTotalSize of is mCacheDevice then also the MetaData size can be tracked in mTotalSize of both Memory and Disk caches. Is is possible however that MetaData is set in the entry before it is connected/stored in a device? If so, then the SetCacheDevice call should check for MetaData allready set, and then update mTotalSize if needed.
Created attachment 199409 [details] [diff] [review] Correctly account for overhead of map/blockfiles in totalsize To correctly account for the overhead of the map and block files (total size of _CACHE_MAP, and the 4K header for each _CACHE_001 file). Changes: * renamed mDataSize to mTotalSize to better reflect it usage * inline 3 getters for totalsize, entrycount and capacity of DiskCacheDevice * replaced the 3 with kBlockFiles for style Testscript: * Started with empty cache. * Visitied www.ibm.com and www.mozilla.org * Restarted seamonkey * Visitied www.ibm.com and www.mozilla.org and some others (netscape.com, tweakers.net, ...) * about:cache * Close seamonkey Testresult: Number of entries: 284 Bytes according to Windows: 1.214.242 bytes Size on Disk: 1.253.376 bytes Storage in use: 1225 KiB (= 1.254.400 bytes) Problably because of accounting of datasize and metasize in chunks of 1024 bytes, the actual result is very close to the 'about:cache' result At least, this will make the accounting of total size in the diskcache more precise.
Actually, for the diskcache, it may be even better to track the total file size of all the files in the cache as the 'mTotalSize'. Because, within the blockfiles there will be other overhead, as soon as the blockspace is fragmented, so that the blockfile also contains empty blocks. If we really want to maximise the number of bytes stored on disk, the best way is thus to account for the actual file sizes (this will, BTW, still not account for filesystem overhead... ;-)). If we want to do this, I propose a separate bug to do so.
I'm happy to ignore blockfile fragmentation. I believe that the cache should primarily account for the data it stores on behalf of its consumer. If its consumer says that it wants to store 50 megabytes of data, then the cache should allow the consumer to store 50 megabytes of data. If there is additional overhead, then that is an implementation detail that should not be visible to the consumer of the API. In other words, I think we should only count data + meta-data (including cache key, session name, etc.) in the size totals. I think we should also make the cache more fault tolerant to deal with the page load errors described in comment #0.
At the moment, the GUI does not let the user say that they want to store 50 megabytes of data, but that they want the cache to take up 50 megabytes. If this is to change, please don't forget to fix the GUI: at the moment it reads 'use up to'[textbox]' MB of disk space for the cache', which is stating that it limits the space the cache takes up to [textbox] megabytes. It should instead read something like 'store up to'[textbox]' MB of data in the cache'.
Lewis: Sure, but that is a different problem. One could argue that the text in the preferences UI is vague enough to cover either implementation.
I vote for making the hard limit the limit for diskusage (at least as measured by total file size). For anything stored as files in cache, this is allready the case. For material stored in BlockFiles it is just a question of measuring total size of the BlockFiles and the _CACHE_MAP itself. As seen from some of the comments above people want to be able really control the maximum storage usage of the cache. Caching is always about providing a limited space and using that as efficient as possible (compare the L1/L2 caches of a CPU, or the 8MB cache of the disk controller, etc).
Fair enough, but what happens in the extreme edge case where the user sets the size less than 4k? Do we still create _CACHE_MAP_?
Alfred: ping? any response to comment #13? also, are we sure that our block files do not become so fragmented that the fragmentation becomes a great limiting factor in our ability to store data in the cache?
As for the extreme case: any value below 50K or so should be considered as a useless value for diskcaching. IE for example only the unit is MB, so the minimum value is then automatically 1MB. (0 is no cache). As for the fragmentation, with my patch for the bitmap handling of the blockfiles, this is much less. Note that any entry uses at most 3 bits (1, 2, or 3 bits), and free'd entries are always filled up from the begin of the map. In worst case, the whole bitmap may have only 1bit holes left, which will be used soon enough as at least 1/3trd of the new entries will take up the 1bit holes, and there will always be 2 or 3 bit entries that will get freed. Note, the main reason (IMHO) that users want to set a limit is because of available diskspace, not precisely to want she/he wants to store in the cache (as you don't have any control actual storage of objects). One wants generally given a provided size (like the 256MB CPU cache) make the most use of it...
-> default owner
What should we do with this bug?
Find an actual reviewer for the patch?
Michal or bjarne are the most obvious people to look into this.
Comment on attachment 199409 [details] [diff] [review] Correctly account for overhead of map/blockfiles in totalsize There is one fundamental problem with this patch. There is no information about the free space in the block files. This means: 1) When the cache becomes full we start evicting some entries, but evicting entries stored in block files won't free any space. 2) We preallocate block files in 4MB chunks (although this code is temporarily disabled right now), so even an empty cache occupies 12MB. If user sets the cache capacity below this limit, then nothing will be stored in the cache. 3) Even if we would count the free space in the block files, there would need to be some advanced logic when checking the cache fullness. E.g. let's say that the cache is full according to total size but there is a free space in some block file, then we don't need to evict any entry but we need to ensure that the entry will fit into the certain block file...