Closed
Bug 593198
Opened 14 years ago
Closed 10 years ago
Use CACHE__00X files for larger sizes to lower total # of cache files
Categories
(Core :: Networking: Cache, defect, P5)
Core
Networking: Cache
Tracking
()
RESOLVED
WONTFIX
People
(Reporter: taras.mozilla, Assigned: michal)
References
(Blocks 2 open bugs)
Details
Attachments
(2 files)
61.76 KB,
image/png
|
Details | |
18.83 KB,
patch
|
Details | Diff | Splinter Review |
Our cache buckets caches into one of 3 lump files or cache entries get own files.
Currently 1k,4k,16k entries get lumped together in CACHE__001, CACHE__002, CACHE__003 respectively. I claim that this is ridiculously conservative.
In a profile with 2 months of browsing history this yields ~2200 files.
It is a bad idea idea to scatter small cache entries as individual files, especially if they all live in the same directory.
Increasing bucketing limit to 32K yields a 50% reduction in file entries, 64K gets 80% reduction 96K gets is 90% less distinct files in the cache directory.
Once bug 592520 is fixed lumping small cache entries together will significantly cut down on disk seeks while looking up cache entries.
There are other cache tweaks that can be done (like breaking up cache entries into sub-directories and lumping entries intelligently rather than arbitrarily based on size). However simply adding more bucket sizes is is a reasonably low risk thing to do for ff4, so I think we should go for 64k or maybe even 96k buckets in ff4.
Reporter | ||
Comment 1•14 years ago
|
||
Assignee: nobody → tglek
Updated•14 years ago
|
Blocks: http_cache
Comment 3•14 years ago
|
||
(In reply to comment #0)
> I claim that this is ridiculously conservative.
+1
Comment 4•14 years ago
|
||
This is just a check point. The patch causes assertions to fail.
Comment 5•14 years ago
|
||
Blocking? This should be fairly quick to fix, and may have good I/O perf payoff.
cc-ing mobile guys in case they know if their filesystems do poorly with large #'s of files in a single directory.
blocking2.0: --- → ?
tracking-fennec: --- → ?
Summary: Tweak cache packing for modern web → Use CACHE__00X files for larger sizes to lower total # of cache files
Comment 6•14 years ago
|
||
our disk cache is currently disabled
Comment 7•14 years ago
|
||
Taras,
Byron says it'll be a week or two before he could get to this, and he's fine with you grabbing it if you can do it sooner.
Updated•14 years ago
|
blocking2.0: ? → betaN+
Comment 8•14 years ago
|
||
Over to Michal who should be able to look at this reasonably soon.
Assignee: tglek → michal.novotny
OS: Linux → All
Hardware: x86 → All
Assignee | ||
Comment 9•14 years ago
|
||
I've been testing Byron's patch for a while and I didn't see any assertion. Anyway there is another problem. The cache block files can be large up to 8MB, 32MB, 128MB and the new would be large up to 512MB. Simply adding a new cache block file isn't a good idea since we don't shrink these files and there would be theoretically 680MB of space that couldn't be freed. E.g. let's say that the cache is half full (340MB) and user decides to limit the cache size to 100MB. Although Firefox would evict 240MB of cache entries the space on the disk wouldn't be freed. This would IMO lead to bad user experience.
Of course we have the same problem even without the new cache block file, but it is limited to "only" 168MB.
Comment 10•14 years ago
|
||
Could we simply evict the whole cache if the user (or our smart sizing code) shrinks the cache below the combined size of the chunk files? That seems like a rare enough case that it wouldn't matter that we evict the whole cache...
Reporter | ||
Comment 11•14 years ago
|
||
We do shrink these files. See nsDiskCacheDevice::Trim(). Infact looks like we try to shrink them often too, because for some reason after surfing for a few hours my fallocated 4mb cache files gets fragmented. Only way I see that happening is from Trim() being called(or bugs in ext4, which seems less likely).
Reporter | ||
Comment 12•14 years ago
|
||
Err, I meant nsDiskCacheBlockFile::Trim
Assignee | ||
Comment 13•14 years ago
|
||
nsDiskCacheBlockFile::Trim() truncates file at the end of the last allocated block. And because cache block files get fragmented quickly it is nearly useless.
Assignee | ||
Comment 14•14 years ago
|
||
(In reply to comment #10)
> Could we simply evict the whole cache if the user (or our smart sizing code)
> shrinks the cache below the combined size of the chunk files? That seems like a
> rare enough case that it wouldn't matter that we evict the whole cache...
Why do you think that it would be rare case? Anyway this wouldn't solve the problem that the user's limit isn't respected. The current code doesn't count with size of the block files. Total cache size and actual disk usage can be quite different. Total cache size is a sum of sizes of all data and metadata rounded up to nearest kb. It is incorrect especially in case of small cache occupancy.
Just for example:
- Empty cache reports 0KB usage but in fact occupies at least 12MB of disk space.
- When the cache block file _CACHE_001_ is fully used with entries of size <= 256 bytes then reported usage is 32MB but only 8MB of disk space is used.
Comment 15•14 years ago
|
||
> Why do you think that it would be rare case?
We expect it will be rare for users to use a manual cache size. It will probably be more common to hit the case where the smart_size algorithm picks a smaller size at startup due to a user's disk space getting smaller. Which of course hits the same issue of whether we actually shrink disk usage.
I certainly agree that it would be good to 1) guarantee that we don't use more disk space than the user (or smart-size algorithm) has asked for, 2) shrink to that size w/o blowing away the entire cache, and 3) not fragment the cache with Trim() calls.
The only place I see a Trim call being made is in nsDiskCacheDevice::EvictEntries(): perhaps we should remove it from there, and instead call it only at when the cache capacity is reduced (at startup, or if the user changes the size prefs) and we determine that the current sizes are too large. That would fix all this, right?
We would need to come up with a ballpark estimate for what % of the max cache size is OK to be consumed by the CACHE files, and without further checks we'd probably still potentially consume more space than we ought to (ie. the total size of CACHE and individual files could be bigger than capacity, unless we change the check to use CACHE file size instead of the sum of the entries *in* the CACHE files).
Comment 16•14 years ago
|
||
Ah, right, Trim() only truncates at the last entry in the file, so without logic to compress the cache entries, we have no guarantee of what size we get.
Comment 17•14 years ago
|
||
Comment on attachment 473618 [details] [diff] [review]
Add a 64K bucket file to disk layout
> The cache block files can be large up to 8MB, 32MB, 128MB and the new
> would be large up to 512MB
What about Byron's comment in the patch (nsDiskCacheMap.h):
> 0000 0000 1111 1111 1111 1111 1111 1111 : block# 0-16777216 (2^24)
>+* // A block file may be up to 64MB
Doesn't sound right to me--2^24 blocks is 16M *blocks*, which is a file size a lot bigger than 64 MB
Assignee | ||
Comment 18•14 years ago
|
||
(In reply to comment #17)
> > 0000 0000 1111 1111 1111 1111 1111 1111 : block# 0-16777216 (2^24)
> >+* // A block file may be up to 64MB
>
> Doesn't sound right to me--2^24 blocks is 16M *blocks*, which is a file size a
> lot bigger than 64 MB
That comment is wrong and I've already removed it in my patch. Btw there should be 0-16777215 and this is only addressing limit. The filesize is limited by the bitmap size that is currently 4096 bytes which means 32768 blocks.
Comment 19•14 years ago
|
||
not blocking fennec 2.0 - we don't use the disk cache.
tracking-fennec: ? → 2.0-
Reporter | ||
Comment 20•14 years ago
|
||
So I just discovered that xperf can measure overheads of opening the directory. It's absolutely massive on NTFS. Takes 1-2 seconds to open the Cache/ directory in a trace from a user. is there some directory listing code in our cache that gets run on startup? I didn't see any.
Seems that the cause is that directory index is broken up into multiple 4K chunks and those get spread out all over disk. So why don't we do an incremental fix before switching to packing these in a large index file.
lets change the storage algorithm so a file like 3E3E27FEd01 would become
Cache/3/E/3E27FEd01 with Cache/0-Cache/F directories preallocated on cache creation to avoid fragmentation. We can lazy allocate the rest of the dirs.
Comment 21•14 years ago
|
||
By "open the Cache directory" do you mean get a list of all the files in the dir? Or just open a filehandle to the dir?
I don't see any obvious code (from a brief glance) that lists all the files, but it sounds like it's happening somewhere.
I like your proposal: added to bug 597224.
Assignee | ||
Comment 22•14 years ago
|
||
This shouldn't be blocking. Bug 597224 is an intermediate solution and it isn't blocking.
blocking2.0: betaN+ → ?
Comment 23•14 years ago
|
||
Yeah, I don't think this blocks either, though I think we should try to get it done soon and perhaps backport it to the branch once we've shipped FF4.
blocking2.0: ? → -
Updated•14 years ago
|
Whiteboard: [fennec-4.1?]
Updated•14 years ago
|
tracking-fennec: - → 7+
Whiteboard: [fennec-4.1?]
Updated•13 years ago
|
tracking-fennec: 7+ → +
Comment 25•13 years ago
|
||
(In reply to comment #20)
> lets change the storage algorithm so a file like 3E3E27FEd01 would become
> Cache/3/E/3E27FEd01 with Cache/0-Cache/F directories preallocated on cache
> creation to avoid fragmentation. We can lazy allocate the rest of the dirs.
Nice idea. But the current implementation creates Cache/3/E3/E27FEd01 (SM 2.2). This is highly overdrawn.
I would suggest to "lazy allocate" only _one_ level of directories like Cache/3E/3E27FEd01. This would get much closer to the initial suggestion and save one directory level.
Comment 26•12 years ago
|
||
Josh, do we care about this bug
Reporter | ||
Comment 27•12 years ago
|
||
(In reply to Brad Lassey [:blassey] from comment #26)
> Josh, do we care about this bug
Brad I'm not sure this makes sense on mobile. These large files perform best when you can have 'holes' in them for evicted data, but that's wasteful for you.
Assignee | ||
Comment 29•10 years ago
|
||
This is old cache specific bug, so won't fix.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•