Closed Bug 229984 Opened 16 years ago Closed 16 years ago

HTTP download of over 64MB causes disk cache to break

Categories

(Core :: Networking: Cache, defect, P1, critical)

defect

Tracking

()

RESOLVED FIXED
mozilla1.7beta

People

(Reporter: wd, Assigned: darin.moz)

References

Details

(Keywords: fixed1.4.2)

Attachments

(3 files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6b) Gecko/20040102 Firebird/0.7+
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6b) Gecko/20040102 Firebird/0.7+

If you download a file that is larger than your max disk cache setting (default
50MB), it can cause the disk cache to be put into an unusable state.   A
cancelled download is not removed from the disk cache.   And if the download is
cancelled after it has reached the max disk cache size, the disk cache will be
permanently disabled.   (Until manually cleared in prefs)

Reproducible: Always

Steps to Reproduce:
1.Start downloading a large file (over 50MB)
2.Cancel the download after it passes 50MB, but before it completes
3.Restart Firebird
4.View a couple of web pages
5.View about:cache

Actual Results:  
Something like this:   (Num of entries is always 0)

Disk cache device

Number of entries: 	0
Maximum storage size: 	50000 k
Storage in use: 	1441792 k

Expected Results:  
The cancelled download should not remain in the disk cache.
The storage in use should be below the max size, and the number of entries
should be more than zero.     (the disk cache should *work*)

I cannot reproduce this with Mozilla builds.   Only Firebird.
*** Bug 229746 has been marked as a duplicate of this bug. ***
My cache goes haywire if I cancel a download after it is full, CVS HEAD/Linux

Platform/OS -> All
OS: Windows 2000 → All
Hardware: PC → All
Just reproduced with Mozilla.   As it turns out, this can happen with either
Firebird or Mozilla, but in either case it might take a couple of attempts.
Component: Downloading → Networking: Cache
Product: Firebird → Browser
Version: unspecified → Trunk
Assignee: bugs → darin
QA Contact: cacheqa
Depends on: 230125
Confirmed on Mac OS X 10.2.8 w/ build 2004010505 (together with bug 230125).

- I first emptied the disk cache.
- Then I downloaded a 733 MB file with a 50MB cache, from
<http://www.knopper.net/knoppix-mirrors/index-en.html> (I used http, and I made
use to use 'save link as', because some of these servers are serving *.iso files
as text/plain).
- When the downlosed size exceeded 64MB, I saw bug 230125 (random values in the
disk cache counters. The file which was used, was correctly (Finder reported
correct file size).
- After I cancelled the file, the diskfile was removed (so the cache folder was
empty, aside form the default _CACHE_* files), but the counters of the disk
cache in <about:cache> seem to be "stuck" in the last 'random' value.
- When I continued downloading, I noticed that the disk cache counters never
changed anymore, I seemed to be using only the memory cache. This was confirmed
by the fact that the cache folder stayed empty.

Note if you try to reproduce this : make sure that you use the HTTP-protocol.
FTP won't store the data in the cache-folder.
note: clearing the disk cache restored the counters without any problem
Re-summarizing.   It's not necessarily that the download is larger than the
specified max size for the disk cache, but rather larger than 64MB regardless. 
I set my cache to 100MB for testing, cleared it, and downloaded a 70MB file.

Once a download passes the 64MB mark, I get a stream of the following asserts:

###!!! ASSERTION: data size out of range: 'sizeK < USHRT_MAX', file 
http://lxr.mozilla.org/seamonkey/source/netwerk/cache/src/nsDiskCacheDevice.cpp#688
###!!! ASSERTION: data size out of range: 'newSizeK < USHRT_MAX', file 
http://lxr.mozilla.org/seamonkey/source/netwerk/cache/src/nsDiskCacheDevice.cpp#689
###!!! ASSERTION: disk cache size negative?: 'mHeader.mDataSize >= 0', file 
http://lxr.mozilla.org/seamonkey/source/netwerk/cache/src/nsDiskCacheMap.h#469
###!!! ASSERTION: disk cache size negative?: 'mHeader.mDataSize >= 0', file 
http://lxr.mozilla.org/seamonkey/source/netwerk/cache/src/nsDiskCacheMap.h#460


Once Mozilla gets into this state where the disk cache doesn't work, I get loads
of the following asserts upon loading any page:

WARNING: WriteDataCacheBlocks() failed., file 
http://lxr.mozilla.org/seamonkey/source/netwerk/cache/src/nsDiskCacheStreams.cpp#500
###!!! ASSERTION: Flush() failed: 'NS_SUCCEEDED(rv)', file 
http://lxr.mozilla.org/seamonkey/source/netwerk/cache/src/nsDiskCacheStreams.cpp#447
###!!! ASSERTION: deleting dirty buffer: 'mBufDirty == PR_FALSE', file 
http://lxr.mozilla.org/seamonkey/source/netwerk/cache/src/nsDiskCacheStreams.cpp#734

(file references changed to lxr links)
Summary: Cancelled download not removed from disk cache - causes it to be disabled with 0 entries if very large → HTTP download of over 64MB causes disk cache to break
WD, I wonder if you can take a look and see if this is basically bug 193454. 
That is, when you get to this state, can the cache be "fixed" by deleting just
the _CACHE_MAP file?  If so, I'll make my bug a dupe of this one.

Also, if that is the case, I wonder if Mozilla can somehow do a very quick check
of the validity of _CACHE_MAP, and throw it out if it looks corrupted.  That
won't fix the root cause of this problem, but it should workaround it until a
proper fix is found.
Yes, deleting _CACHE_MAP does return the cache to a working state.   I believe
deleting that file is essentially the same as clearing the cache via the prefs.
 (Mozilla will clear out the cache directory and start with a fresh one if that
file is missing)
*** Bug 193454 has been marked as a duplicate of this bug. ***
Possibly related: bug 218391.
tenthumbs:
bug 218391 is a blocker of this bug's blocker.   (assuming my assumptions are
correct)
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.7alpha
hey alfred... you interested in investigating this one?
Yes, I will have a quick look at it.
I have digging in this code.

It seems there are several:
* With objects larger than 64M, now a random value is set in Data Size (size%64M
sortoff)
* Diskcache size is counted using KiB's (new notation!), but fails because of
above, and the too small integer. (USHRT_MAX?)
* The mDataSize field in the cache_map is too small, with some bit shuffling it
could be extended to 9MB.

TO be continued...
Ok, two options:
* Either stopgap the issue with 64M boundary, and the corresponding overflow and
issues, by: setting either -1 or 0 in the size field, and retrieve realsize from
filesystem.
* Re-arrange the bits in the MetaLocation and DataLocation fields, 
so that we get 22 bits for the dataSize in Kib's, which maximizes at 4GB, which
is the actual real filesize limit (as long as we are 32bits).

Some details: metadata is currently always stored as blocks, not as separate
files. Here we can save some code.
All flags are get and set using getters and setters, so the change is minimal.

Both options would need to change the diskCache version number, even in the
stopgap option, the change can potentially conflict with old cache data
This patch allows for 4GB max filesize, even if it is much larger than total
cache size!

No more asserts in files larger than 64K.

Accomplished by reshuffling some of the bits in the bitflags in the
DiskCacheMap. The 'MetaData' (actually the 'DiskCacheEntry' storage) is only
stored in diskCacheBlocks (this was allready so), so some of the corresponding
code and bits could be cleaned.

* Most important: update of minor of kCurrentVersion

Furthermore some cleanup related to this:
* nsDiskCacheDevice: cleaned the SetDataFileSize function from ASSERTs
* nsDiskCacheDevice: check need for Evication before calling function
* nsDiskCacheEntry: removed HeaderVersion and MetaLocation fields: where never
used or checked, changed mKeySize and mMetaSize to PRUint16: total saving 12
bytes per entry (both memory and disksavings)
* nsDiskCacheEntry::CheckConsistency: removed, not used at all
* nsDiskCacheMap::ReadDiskCacheEntry: removed read from file, was never used.
* nsDiskCacheMap::WriteDiskCacheEntry: removed write to file, was never used.
* nsDiskCacheMap::DeleteStorage: function split into DeleteDataStorage and
DeleteMetaStorage, while mostly similar, because MetaStorage now only delete
blocks: this saves some 'if/elses'. Also removed the boolean kData/kMetaData
which could be misused.
* nsDiskCacheMap::GetLocalFileForDiskCacheRecord and
* nsDiskCacheMap::GetFileForDiskCacheRecord: removed the meta flag, as it is
not used

Explanation of the new bitmasks:
* DataLocationInitialized and MetaLocationInitialized: removed, because when
the flag fields is non-zero, it is initialised. (Data: either DataGeneration is
set, 
or DataFileSize is greater than zero) (Meta: MetaFile is always 1, 2 or 3 (as
metadata is never stored as a file).
* MetaFileSelector and DataFileSelector both moved to first (last?) 2 bits 
* ExtraBlocks and StartBlocks (for both Meta and Data): unchanged
* DataFileGeneration: unchanged
* DataFileSize: resized to 22 bits: allowing for 4.000.000 KiB's = 4 GiB.
* Renamed most of the fieldnames, so be really that I've touched all the
accessors.

P.s. this patch is attached in a (debug) build, in which also the 'knoppix'
test was performed, which behaved nicely: cache grew along the download (upto
about 130 MB, on which I cancelled the download), and then shrink back to the
original size (correctly removing the 130 MB).
This is the result from the large file download.
As you can see the total size is now correctly calculated.

However, I see two other things:
* The diskCache is being evicted (almost wiped clean, because of the download)
* The download saves the file BOTH in the cache directory, and in the specified
location: Even if 'save link target as' is used.

This last point is largely discussed in bug 55307.
The final solution would be to stick with the current scenario:
start downloading it, into the cache, until the user has specified the download
location. Currently Mozilla then starts to write to both locations. At least
for items growing larger than the cache limit (or near to it), so be
pro-actively removed from the cache (as it is being saved elsewhere). So
concluding from the other bug, the key problem now is to tell HTTP to stop
writing to the cache, as soon as the file is being saved elsewhere and the size
reaches the cache limits.

Current effect during the was that near the end of the download I had twice a
almost 700MB knoppix image on my simple laptop stored somewhere. Also Windows
went crazy in writing the data to the disk. Actual download speed was about
1.4MB/s! It could have been faster if not for the double writes...
Looks good!    When downloading a file over 64MB, I no longer get assertions
when that limit is passed.   The cache remains functional after the download too.
Cache eviction is mentioned in bug 81640.

I can certainly have disk files > 4GB on Linux so I'm not sure a 4GB limit will
do more than just hide the problem. After all, 64MB worked for a while. I think
the cache should use at least 64 bits for sizes.
(In reply to comment #18)
> I can certainly have disk files > 4GB on Linux so I'm not sure a 4GB limit will
> do more than just hide the problem. After all, 64MB worked for a while. I think
> the cache should use at least 64 bits for sizes.

Maybe you can, but Mozilla as a whole not. All file handling is 32bits, which is
why the limit is 4GB. Nowhere in Mozilla files bigger than 4GB can be handled.
So, at least the 4GB 'problem' is now not cache specific, but mozilla-wide.
So, please create a separate bug to make mozilla supporting 64-bit filesizes...
(In reply to comment #19)
> So, at least the 4GB 'problem' is now not cache specific, but mozilla-wide.
> So, please create a separate bug to make mozilla supporting 64-bit filesizes...

see Bug 184452
fwiw: i'm able to easily reproduce this bug using today's CVS trunk on linux. 
to simply testing, i put an ISO on a local HTTP server.  once i hit 64 MiBs of
downloaded data, i start seeing the same assertions listed in comment #6.
Priority: -- → P1
OK, there are several things going on here to cause the problem.

(1) nsDiskCacheDevice::OnDataSizeChange doesn't return a failure code when it
doom's the cache entry as a result of the cache entry growing too large.

(2) when the disk cache record goes into the mode of storing zero for larger
than 64Mb files, the disk cache streams get confused b/c they were not coded to
handle a data size of zero.

My suggestion here is to fix (1) and moreover place a limit on the size of disk
cache files such that they never grow beyond 64Mb.  I don't think there's any
need to allow files greater than 64Mb in the cache.  In fact, I think it would
probably be good to use a soft-limit of half the cache size such that we don't
blow out the entire disk cache when downloading something large like an ISO.

I'll attach a patch that does this.
Attached patch v1 patchSplinter Review
I have asked Gordon to review this patch.
Flags: blocking1.7b?
Target Milestone: mozilla1.7alpha → mozilla1.7beta
Gordon wrote:
> Hi Darin,
>
> I read your comment in the bug, and it seems reasonable.  I took a quick look at 
> the patch, but I'll try to take a more thorough look at it tonight.
>
> -Gordon

Thanks Gordon!
So, either we stick with a 64MB limit on files in the diskcache, and make sure
that it behaves, or we extend the limit to 4GB (which is the current OS limit on
filesize), and make the diskcache smarter on large file handling (in case of ISO
downloads).

With Darin's patch: what will happen if a download of something larger than 64MB
gets aborted by the diskCache? Will the actual 'save link target as...' still
continue?
> 4GB (which is the current OS limit on filesize)

I don't know what OS you use, but the ones I use all support files much larger.
(at least I'm pretty sure NTFS allows that. I know ext3 does).
(In reply to comment #26)
> With Darin's patch: what will happen if a download of something larger than 64MB
> gets aborted by the diskCache? Will the actual 'save link target as...' still
> continue?

Yes, it will continue just fine.  There is explicit support for this "feature"
in the cache.  See nsInputStreamTee if you are curious how it works.  The point
is that the cache cannot be a required element of the browser.  It is always
optional, and if it cannot be used, that should not result in a failure of an
operation.

In fact, if you notice in my patch I'm actually fixing a bug in the logic that
tries to delete the cache entry that got too big.  It was just failing to notify
the upper layers that this happened so that they could close things down.

Notice also that the memory cache has a much more aggresive limit on the max
size of a cache entry (for good reasons).
Comment on attachment 141556 [details] [diff] [review]
v1 patch

i think this is the patch we want.  gordon said he is interested in
investigating the problem with >64 MiB files, but that doesn't prevent this
patch from going in now to solve the bug.

assuming r=gordon based on our private email exchange and requesting sr=bz
Attachment #141556 - Flags: superreview?(bzbarsky)
(In reply to comment #27)
> > 4GB (which is the current OS limit on filesize)
> 
> I don't know what OS you use, but the ones I use all support files much larger.
> (at least I'm pretty sure NTFS allows that. I know ext3 does).

s/OS/gecko/ ;-)
I'll not be able to look at this for a bit (about a week, probably).

Note that ideally once we start a download we want to stop caching the data
(since there's no reason to store it on disk in both spots).  We have bugs on
that...
(In reply to comment #31)
> I'll not be able to look at this for a bit (about a week, probably).

no problem.


> Note that ideally once we start a download we want to stop caching the data
> (since there's no reason to store it on disk in both spots).  We have bugs on
> that...

yes, but i don't agree at all.  are you saying that if i save a HTML page that i
am viewing that we should remove it from the cache?  what if the content you are
talking about can be viewed in the browser?  i think a size limit is a
reasonable way to deal with having two copies.  for small to average size files,
i think we can agree that two copies is no big deal (it's only a problem for
large files), so it seems to be just a matter of determining where to draw that
line.  anyways, we should probably be having this discussion in the other bug ;)
Comment on attachment 141556 [details] [diff] [review]
v1 patch

sr=bzbarsky
Attachment #141556 - Flags: superreview?(bzbarsky) → superreview+
*** Bug 235625 has been marked as a duplicate of this bug. ***
fixed-on-trunk
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
fixed on 1.4 branch for 1.4.2 with a=mkaply,blizzard
Keywords: fixed1.4.2
Flags: blocking1.7b?
*** Bug 238028 has been marked as a duplicate of this bug. ***
I just noticed my Windows Firefox 1.5 disk cache was in a state similar to the reported problem here. But this bug was fixed nearly two years ago...surely the fix is in Firefox 1.5 then? I don't know enough about Firefox development to tell.

Can anybody see the same problem in their about:cache? Or, can anybody help me understand how my disk cache became broken?

I happened to go to my about:cache for an unrelated reason, and I noticed this broken state:

Disk cache device
Number of entries: 0
Maximum storage size: 200000 KiB
Storage in use: 458742 KiB
Cache Directory: C:\Documents and Settings\user\Local Settings\Application Data\Mozilla\Firefox\Profiles\655vw66k.default\Cache

I manually cleared the cache and it seems to be working now. I have certainly downloaded files larger than 64 MB before.

I wonder how this is related to bug 229984. Or if I have hit another bug entirely.

I'm running:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) Gecko/20051111 Firefox/1.5

On Windows XP SP2. It was a clean install (not an upgrade from an earlier Firefox version).

I'm using these extensions:
* DomInspector
* Talkback
* Google Toolbar for Firefox
* del.icio.us
You need to log in before you can comment on or make changes to this bug.