Closed Bug 651011 Opened 10 years ago Closed 10 years ago

Delete invalid cache slowly to prevent massive noticeable IO

Categories

(Core :: Networking: Cache, defect)

Type: defect
Priority: Not set
Severity: normal

RESOLVED DUPLICATE of bug 670911

People

(Reporter: Dolske, Assigned: jduell.mcbugs)

References

(Blocks 1 open bug)

Details

I've been running Firefox 4.0 on a new MBP. New profile as of a couple weeks ago, but Sync'd with all my previous history and stuff.

I noticed yesterday that the OS X Activity Monitor showed disk IO shoot through the roof, and it stayed that way for over a minute (!), giving me time to figure out what was causing it...

It was Firefox, and I captured the following with |dtruss -p <firefox PID>|...

...
lstat64("/Users/dolske/Library/Caches/Firefox/Profiles/79ypbhnd.default/Cache.Trash/Trash/Cache/F/F8/738BCm01\0", 0x1276039D0, 0x64)		 = 0 0
unlink("/Users/dolske/Library/Caches/Firefox/Profiles/79ypbhnd.default/Cache.Trash/Trash/Cache/F/F8/738BCm01\0", 0x1276039D0, 0x0)		 = 0 0
stat64("/Users/dolske/Library/Caches/Firefox/Profiles/79ypbhnd.default/Cache.Trash/Trash/Cache/F/F8/B2F6Cd01\0", 0x12F2D0698, 0x1)		 = 0 0
lstat64("/Users/dolske/Library/Caches/Firefox/Profiles/79ypbhnd.default/Cache.Trash/Trash/Cache/F/F8/B2F6Cd01\0", 0x1276039D0, 0x64)		 = 0 0
unlink("/Users/dolske/Library/Caches/Firefox/Profiles/79ypbhnd.default/Cache.Trash/Trash/Cache/F/F8/B2F6Cd01\0", 0x1276039D0, 0x0)		 = 0 0
stat64("/Users/dolske/Library/Caches/Firefox/Profiles/79ypbhnd.default/Cache.Trash/Trash/Cache/F/F8/BBEA5d01\0", 0x12F2D0698, 0x1)		 = 0 0
lstat64("/Users/dolske/Library/Caches/Firefox/Profiles/79ypbhnd.default/Cache.Trash/Trash/Cache/F/F8/BBEA5d01\0", 0x1276039D0, 0x64)		 = 0 0
unlink("/Users/dolske/Library/Caches/Firefox/Profiles/79ypbhnd.default/Cache.Trash/Trash/Cache/F/F8/BBEA5d01\0", 0x1276039D0, 0x0)		 = 0 0
getdirentries64(0x16, 0x103CDE400, 0x1000)		 = 0 0
stat64("/Users/dolske/Library/Caches/Firefox/Profiles/79ypbhnd.default/Cache.Trash/Trash/Cache/F/F8/D0C02d01\0", 0x12F2D0698, 0x1)		 = 0 0
lstat64("/Users/dolske/Library/Caches/Firefox/Profiles/79ypbhnd.default/Cache.Trash/Trash/Cache/F/F8/D0C02d01\0", 0x1276039D0, 0x64)		 = 0 0
unlink("/Users/dolske/Library/Caches/Firefox/Profiles/79ypbhnd.default/Cache.Trash/Trash/Cache/F/F8/D0C02d01\0", 0x1276039D0, 0x0)		 = 0 0
close_nocancel(0x16)		 = 0 0
rmdir(0x1031068B8, 0x103100000, 0x1F8100)		 = 0 0
...

(this spew was going on and on and on, this is a representative sample)


Looks like the cache, probably fully populated, decided to clear itself? [I don't have any of the cache clearing settings enabled, and the browser had been mostly quiet immediately prior, though I think I might have just entered Private Browsing mode?] No addons except for FlashBlock.

Currently, the prefs dialog says Firefox is using 460MB of disk for cache.

This is the first time I've seen it do this.
Just hit this again, but right after a crash. I think that might be "normal" (we purge the cache after an unclean shutdown, iirc?), but it perhaps needs rethinking now that our cache is enormous. Not sure if comment 0 was caused the same way.
Dup of bug #630420 ?
Not a dupe, as I recall the browser was up and running quickly but clearing the cache in the background.
Duplicate of this bug: 663200
bug 663200 has a very nice log showing that invalid cache deletion is a bad idea.

in case of invalid cache we should just try to reuse existing files if possible and delete in a throttled fashion.

When reusing CACHE_[123] files, we should just reuse them. When reusing other files, just use them + truncate. Incrementally renaming files from Cache.Trash/x/y/SOMETHING to Cache/a/b/SOMETHINGELSE + truncating should be very friendly to the filesystem.
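
A rough sketch of the rename + truncate idea for a single file, in Python purely for illustration (this is not Firefox code). The function name `recycle_trash_file` and both path arguments are hypothetical; the caller is assumed to choose the destination path for the new entry.

```python
import os

def recycle_trash_file(trash_path, new_cache_path):
    """Move one stale file into the live cache and empty it for reuse.

    Hypothetical helper: renames instead of unlink + create, then
    truncates so none of the old (untrusted) contents survive.
    """
    # Make sure the destination bucket directory (e.g. Cache/a/b) exists.
    os.makedirs(os.path.dirname(new_cache_path), exist_ok=True)
    # Rename within the same filesystem; no file data is copied.
    os.rename(trash_path, new_cache_path)
    # Drop the stale contents but keep the file itself.
    os.truncate(new_cache_path, 0)
    return new_cache_path
```

Whether this is actually cheaper than unlinking and creating a fresh file depends on the filesystem; the sketch only shows the mechanics.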
Seems like, ideally, changes to cache should be atomic such that if the browser croaks mid-update the cache is left in a consistent state.
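
One common way to get that property for a single file is to write to a temporary file and rename it over the target. The sketch below is in Python for illustration only and says nothing about how the Firefox cache is implemented; `atomic_write` is a hypothetical name, and it assumes the temporary file lives in the same directory (and so the same filesystem) as the target.

```python
import os
import tempfile

def atomic_write(path, data):
    """Replace `path` with `data` so readers see the old or new file, never a mix."""
    directory = os.path.dirname(path) or "."
    # Create the temp file next to the target so the final rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Push the bytes to disk before the rename makes them visible.
            os.fsync(tmp.fileno())
        # os.replace overwrites an existing target in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Leave no half-written temp file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This covers one file at a time; keeping a multi-file cache index consistent would need more than this (e.g. some form of journaling).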

Other random thought: instead of dumping the whole cache, can we just selectively discard the more recent changes (by count or time)?
I don't restart my browser often, but when I do, it happens pretty often. I just had it happen again after restarting Firefox 5.0 beta 5 (in safe mode).
This is one of the more noticeable bugs in the cache and will probably get worse as more users have larger caches (users are still filling up their 1 GB cache size introduced in FF 4).  And while it's intermittent now (I assume these are happening from the browser crashing or being killed--if Private Browsing also triggers it then that's much worse), if/when we need to bump up the cache version number, we'll wind up causing a huge I/O thrash party for our entire user base as they upgrade, which would suck.

So I'd suggest we treat this bug as high-priority.

The most reasonable short-term fix here might be to 1) "mv cache_dir old_cache_dir" followed by 2) slowly delete old_cache_dir, so the I/O is slowly spread out.  If the browser closes before we're done we can still detect the old dir at startup and continue/finish the job.
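
A minimal sketch of those two steps plus the resume-at-startup check, written in Python for illustration rather than as actual cache code. All function names, `batch_size`, and `pause_seconds` are hypothetical, and the throttle here is just a sleep between small batches of unlinks.

```python
import os
import time

def move_cache_aside(cache_dir, old_cache_dir):
    """Step 1: rename the invalid cache and start a fresh, empty one."""
    os.rename(cache_dir, old_cache_dir)
    os.makedirs(cache_dir)

def delete_slowly(old_cache_dir, batch_size=10, pause_seconds=0.1):
    """Step 2: unlink files in small batches, pausing between batches.

    Returns the number of files removed.
    """
    removed = 0
    # Walk bottom-up so each directory is already empty when we rmdir it.
    for dirpath, _dirnames, filenames in os.walk(old_cache_dir, topdown=False):
        for name in filenames:
            os.unlink(os.path.join(dirpath, name))
            removed += 1
            if removed % batch_size == 0:
                time.sleep(pause_seconds)  # spread the I/O out over time
        os.rmdir(dirpath)
    return removed

def resume_pending_delete(old_cache_dir, **throttle):
    """At startup: finish a deletion that an earlier session did not complete."""
    if os.path.isdir(old_cache_dir):
        return delete_slowly(old_cache_dir, **throttle)
    return 0
```

In a real browser the delete would run off the main thread and would probably key the throttle on idle time rather than a fixed sleep; the sketch only shows that an interrupted delete can be picked up again because the old directory is still there.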

> ideally, changes to cache should be atomic such that if the browser croaks mid-update the cache is left in a consistent state.

Now that all cache writes are done async, that should actually be possible w/o unacceptable performance.  But that's not a fix that'll happen in the short term (and it doesn't solve the cache version upgrade issue).

> in case of invalid cache we should just try to reuse existing files if 
> possible and delete in a throttled fashion.

I agree with the throttled delete idea--see above.  I'm not sure what you mean by "reuse if possible".  We don't journal, so right now we have no way to detect whether an entry is reusable or garbage (at least that's my understanding: perhaps Bjarne/Michal can say more--perhaps there are some cases where we know we are safe to reuse?).
Blocks: http_cache
OS: Mac OS X → All
Hardware: x86 → All
Summary: Disk cache causing loooong period of heavy disk IO → Delete invalid cache slowly to prevent massive noticeable IO
I've created bug 663580 to specifically look into the connection here with private browsing, which really shouldn't need to cause cache invalidation (it may be something else--the contents of the private session getting evicted)--and may have a different fix.
Blocks: 648605
Duplicate of this bug: 660195
Assignee: nobody → jduell.mcbugs
The interesting discussion (about how renaming the directory is as much of a problem as deleting all the files, at least on NTFS) is in bug 670911
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 670911