In the 12-13 Windows zipped build, if you add this pref: user_pref("browser.cache.enable", true); and open choffman's browser buster, the app will crash after about 7 or 8 pages. I have brought about a crash in the browser 6 times in a row, without exception, doing this. My machine is a P2 450 with 128M RAM. Without the pref enabled, the app ran without issue, and I closed it at 32 pages loaded. MOZILLA caused an invalid page fault in module NKCACHE.DLL at 015f:60524726. Registers: EAX=02cfc690 CS=015f EIP=60524726 EFLGS=00010246 EBX=00000001 SS=0167 ESP=0063fb64 EBP=00000004 ECX=00000000 DS=0167 ESI=00000000 FS=4577 EDX=00000001 ES=0167 EDI=007af910 GS=1367 Bytes at CS:EIP: 8b 0e 50 56 ff 51 0c 5e c2 0c 00 55 8b ec 56 8b Stack dump: 00000000 01dfb244 605279a5 01dfa1c0 01dffa80 00000000 007af910 01dfb200 604f209d 01dfb244 01dffa80 00000000 604f1dd0 01dfb200 01df9bf0 60b7191e
Summary: Browser buster crashes after about 7 or 8 pages, if memory cache is enabled. → Memory cache: browser buster crashes after 7 or 8 pages
I can reproduce this on NT, though it's not 100% consistent due to the non-deterministic choice of HTML page by the browser buster. The problem is in a code path that is exercised infrequently, i.e. whenever 256 new entries are added to the cache. I'm seeing infinite recursion: nsReplacementPolicy::AddAllRecordsInCache(nsINetDataCache * 0x02661900) line 166 nsReplacementPolicy::RankRecords() line 384 nsReplacementPolicy::CompactRankedEntriesArray() line 425 nsReplacementPolicy::AssociateCacheEntryWithRecord(nsINetDataCacheRecord * 0x03770310, nsINetDataCache * 0x02661900, nsCachedNetData * * 0x00000000) line 491 nsReplacementPolicy::AddAllRecordsInCache(nsINetDataCache * 0x02661900) line 181 nsReplacementPolicy::RankRecords() line 384 nsReplacementPolicy::CompactRankedEntriesArray() line 425 nsReplacementPolicy::AssociateCacheEntryWithRecord(nsINetDataCacheRecord * 0x03770310, nsINetDataCache * 0x02661900, nsCachedNetData * * 0x00000000) line 491 nsReplacementPolicy::AddAllRecordsInCache(nsINetDataCache * 0x02661900) line 181 nsReplacementPolicy::RankRecords() line 384 nsReplacementPolicy::CompactRankedEntriesArray() line 425 nsReplacementPolicy::AssociateCacheEntryWithRecord(nsINetDataCacheRecord * 0x03770310, nsINetDataCache * 0x02661900, nsCachedNetData * * 0x0012adac) line 491
The fix for this was pretty easy, but lurking behind this bug were several other bugs that failed with the same test case. There are two code paths in the cache manager that are exercised infrequently, when the cache manager bumps up against programmed resource limits, and neither of these code paths were very well-tested, I'm afraid. With all the fixes, I can run the browser buster for a long time with the cache enabled. (As I'm writing this, it's been running for about 45 minutes and it's still going.) The test is being run with the cache manager resource limits set artificially low so as to increase the frequency of use of these code paths, i.e. cache is limited to 50KB instead of 1024KB and the maximum number of tracked cache entries is limited to 12 instead of 800. I guess I need a review from Warren for these changes, since he's the new cache owner. I don't yet have checkin permission to the CVS repository, so someone will have to check this in for me. I would argue that this should go in for M12 because the cache code DLL is not even loaded unless you set the pref to enable caching but, without these patches, we'll get lots of spurious cache bug reports by those that do enable the pref. There are five -- count 'em -- five independent bugs, although only the first of them is responsible for the initial crash. (The other bugs gave me headaches after I cured the first bug.) 1) When the cache manager's array of in-memory cache entries is full and we need to add an additional entry, the array is compacted in an attempt to squeeze out previously recycled cache entries and reuse them. Just prior to this, all of the cache database records are loaded into the cache manager to ensure that it knows about records from previous browser sessions which, in turn, extends the array of cache entries that we're trying to compact, causing an infinite recursion. It turns out that there isn't even a *need* to load all the cache database records from prior sessions until we're actually trying to choose a database record to evict from the cache among all records. So, I moved the call to nsReplacementPolicy::AddAllRecordsInCache() into nsReplacementPolicy::Evict() and nsReplacementPolicy::DeleteOneRecord(). This eliminates the cause of the infinite recursion. (Then I tested these changes by lowering the cache manager thresholds so the code would be called frequently.) 2) When old cache entries, which are stored in an arena, were recycled, nsCRT:zero() was used to reset all the nsCachedNetData instance variables. Unfortunately, that also zeros out the vtable pointer, resulting in an immediate crash the first time it's used. Doh! (This occurs reliably when the 800th URL is added to the cache, 800 being the setting of an internal parameter that limits the number of memory cache URLs that the cache manager will track.) The solution was to add a placement new operator for the purpose of clearing out an existing cache entry. 3) The number of buckets in a hash table was computed incorrectly in nsReplacementPolicy::Init. This is really only a performance bug. 4) When the maximum number of tracked memory cache entries (800) is reached, the cache manager is supposed to recycle one, but the accounting logic was messed up such that it would not always do so, which would result in a crash. 5) Fixed a bum assertion that had a reversed predicate
final m12 candidates are spinnning now. moving to m13. if we fall off track and need to respin m12 for some yet unknown reason we can consider this if you get a fix in hand.
> if we fall off track and need to respin m12 for some > yet unknown reason we can consider this if you get > a fix in hand. ??? The fix is attached to the bug report.
fur, we have delayed the next m12 spins until tomorrow morning. if you can get these on the SeaMonkey_M12_BRANCH tonight they will get picked up in the build. at the least you are cleared to go ahead and checkin on the trunk if you are ready to go. thanks
Status: NEW → RESOLVED
Last Resolved: 19 years ago
Resolution: --- → FIXED
Checked in patch
*** Bug 22123 has been marked as a duplicate of this bug. ***
verified using choffman's browser buster - build 1999122208
You need to log in before you can comment on or make changes to this bug.