Closed Bug 1342722 Opened 3 years ago Closed 3 years ago
Firefox cache used all available inodes on the system
Component: Untriaged → Networking: Cache
OS: Unspecified → Linux
Product: Firefox → Core
Hardware: Unspecified → x86_64
Michal, should we have a file number limit?
(In reply to benland100 from comment #0) > Even though the total cache size was small (well under the 350MB default) > the total number of files in the cache2 directory was enormous due to their > small size - in the tens of millions. Needless to say the system ran out of > inodes and even deleting all the files has taken a significant amount of > time. Where did you see tens of millions entries? In cache2/entries or in cache2/doomed? > One would think that Firefox should have some mechanism to safeguard against > using up all the inodes on the system due to caching. Of course lowering the > total cache size is a work around: lowering the cache to 1MB will > effectively cap the total files around 250,000 assuming a 4096 filesystem > block size. However, who really needs over a million cache entries? I don't understand your computation. 250000 small files (smaller than a block size) will consume 1GB, not 1MB. Anyway, we round entry size up to kilobytes, i.e. even if the file contains only 1 byte, we count its size as 1KB. So with standard cache size 350MB the maximum number of files is 358400. If you really have tens of millions files, then I can see two possible causes: 1) We shouldn't generate empty entry files, but if we do then it could lead to the described problem. Check you cache2/entries directory for empty files. 2) Eviction doesn't work for some reason. In this case about:cache would show "Storage in use" bigger than "Maximum storage size".
Flags: needinfo?(michal.novotny) → needinfo?(benland100)
The files I saw were in cache2/entries. You're right I did get my orders of magnitude wrong in the calculation in the original message, so it's unclear to me how this happened as it should have exceeded the cache size. None of the files that I checked were zero in size, but checking comprehensively will take some time (one cannot simply ls a directory with tens of millions of files in it). I've cleared out the files in cache2/entries on the machine that actually ran out of inodes so we could use it again, but I have 3 others with millions of files so I'll check those for zero size files. Unfortunately about:cache seems to hang forever calculating stats for the disk cache - not entirely unexpected.
(In reply to benland100 from comment #3) > Unfortunately about:cache seems to hang forever calculating stats for the > disk cache - not entirely unexpected. We don't iterate through the files to get size of the cache. We use index in memory which keeps track of all files. So it shouldn't take much time to show the summary of the cache stats. Even if we're building the index right now, about:cache should show size of the entries we've parsed so far without waiting for end of the building process. OTOH, in Preferences->Advanced->Network we show "calculating" in case index is not ready.
The disk cache on one of the machines with millions of files in cache2/entries did finally display results: Number of entries: 7486143 Maximum storage size: 358400 KiB Storage in use: 4038660 KiB In the first 150k files none of them are size zero. Should have a final result in about an hour. Interestingly though this seems to indicate that the "Storage in use" cannot be the disk space used since it is less than 4kB * number of entries.
Yes, "Storage in use" is wrong here. The size overflows here http://searchfox.org/mozilla-central/rev/4039fb4c5833706f6880763de216974e00ba096c/netwerk/cache2/CacheStorageService.cpp#448 But in CacheIndex internally we have a correct value. So overlimit eviction should keep the cache size within the defined limit. A problem can arise when the storage is too slow, because writing entries has a bigger priority than eviction, so we could end up not being able to evict the entries as quickly as the new ones are created. You could verify the above by keeping firefox idle (ideally put it into offline mode to make sure we don't do any network request that could be cached) and watching "Number of entries". It should decrease over time.
As it happens I left firefox running with no tabs open over the last hour and the number of entries did decrease by about 1000. This seems quite slow though - no wonder it was able to build up many millions of files with a page updating frequently. Perhaps unimportant now but I can also confirm that none of the files in cache2/entries were zero size. They had an average data size of 703 bytes.
Yes, this is slow. The question is whether it is slow because there are so many files, or whether we've written so many files because the storage is slow in general. Also, It would be good to know if only deleting files is slow. Could you please do 2 tests? 1) Turn on logging of cache activity where I could see what takes most of the time when evicting entries. Easiest would be to turn this in section Logging in about:networking for a while when Firefox is idle and is slowly deleting entries, but this requires Firefox 52, i.e. at least current beta. You could also follow instruction at https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging to start logging in 51 release but this will generate huge amount of data on startup. MOZ_LOG for this bug should be "timestamp,sync,cache2:5". 2) It would be good to know how fast are other IO operations. Quit Firefox and delete cache/index file. Then start Firefox and remember the exact time. Let it run with no tabs open and watch when the index is created. The problem is that once we build the index we'll rewrite it periodically due to eviction. You can use following command to find the first creation time: while ! stat index 2>/dev/null ; do sleep 1; done The whole stat output will be interesting. Difference between access time and modify time tells how slow is writing to the disk. Exact number of entries can be calculated from size. And of course most interesting is the difference between access time and Firefox startup time.
(In reply to Michal Novotny (:michal) from comment #6) > Yes, "Storage in use" is wrong here. The size overflows here > http://searchfox.org/mozilla-central/rev/ > 4039fb4c5833706f6880763de216974e00ba096c/netwerk/cache2/CacheStorageService. > cpp#448 Filed bug 1343191 for it.
Assignee: nobody → michal.novotny
Fix in this patch just ensures that the cache doesn't grow out of control. We still don't know what's exactly slow on eviction in this specific case.
Attachment #8846149 - Flags: review?(honzab.moz)
Comment on attachment 8846149 [details] [diff] [review] fix Review of attachment 8846149 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/cache2/CacheFileIOManager.h @@ +473,5 @@ > nsTArray<CacheFileHandle *> mSpecialHandles; > nsTArray<RefPtr<CacheFile> > mScheduledMetadataWrites; > nsCOMPtr<nsITimer> mMetadataWritesTimer; > bool mOverLimitEvicting; > + bool mCacheSizeOnHardLimit; would use a comment..
Attachment #8846149 - Flags: review?(honzab.moz) → review+
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/mozilla-inbound/rev/474f08673e24 Firefox cache used all available inodes on the system, r=honzab
You need to log in before you can comment on or make changes to this bug.