194296 - minimize memory cache size and growth

Cathleen

Reporter

Description

•

22 years ago

Darin Fisher

Comment 1

•

22 years ago

this bug can be broken down a bit: 1. allow cache sessions to have a maximum cache entry size (hard limit). e.g., imagelib should be able to say i don't want to store decoded images larger than N bytes and have the cache take care of all the dooming, etc. 2. reduce overhead resulting from cache data structures (bug 192353 and probably others). 3. anything else?

Depends on: 192353

chris hofmann

Updated

•

22 years ago

Blocks: 215636

benc

Updated

•

22 years ago

QA Contact: tever → cacheqa

chris hofmann

Comment 2

•

21 years ago

might have done all that we can here. will address any problems that still exist after some space trace analysis in a different bug

Whiteboard: post-m3?

Alfred Kayser

Comment 3

•

21 years ago

Just a note of some thinking on this. MetaData of cache entries is reasonable optimized internally. However, get/setMetaDateElement uses a key/value system, whereby both are strings. However, only a fairly restricted set of keys is used: nsFtpConnection uses "servertype", nsHttpChannel uses "response-head", "request-method", and "auth" nsHTMLDocument uses "charset" nsImapURL uses "ContentModified" mimemoz2.cpp uses "contentType" tabbrowser.xml uses "Icon" (only to set a flag, but not to use it) To maybe instead a key, a enum of one byte is sufficient! This saves for each entry the saving of these key names in the diskcache... The Metadata is now stored as "response-head"'/0'"...."'/0' for each set key. Using a keyByte, assuming about 3 MetaDataElements per cacheItem, and about 3000 items in the diskCache, saves about 72 K of the diskmap data. A special case are the 'request-*' headers saved as MetaDataElement in the cache by nsHttpChannel. Question again, do they really need to be saved/cached? Other than cookies (which are handled differently anyway) it doesn't seem to be really used anywere. Even further, question is whether the full 'response-head' really needs to be saved on disk. These could save about 50-100 bytes per entry (assuming 3000 entries) this again a 150K-300K savings... (Beside codesize savings...) ---------------------------------------------------- key: http://www.doom9.org/images/mp4ui-optimize.gif fetch count: 2 last fetched: 01/07/04 13:47:14 last modified: 01/07/04 11:54:48 expires: 01/16/04 15:37:30 Data size: 3492 file on disk: none Security: This document does not have any security info associated with it. Client: HTTP request-method: GET request-Accept-Encoding: gzip,deflate response-head: HTTP/1.1 200 OK Via: HTTP/1.1 ips-alm-a.alm.nl.ibm.com (IBM-PROXY-WTE) Vary: Accept-Encoding Date: Wed, 07 Jan 2004 11:52:10 GMT Server: Apache/1.3.27 (Unix) PHP/4.3.1 mod_gzip/1.3.26.1a Last-Modified: Tue, 07 Oct 2003 22:44:51 GMT Etag: "cca6a-da4-3f8341e3" Accept-Ranges: bytes Content-Length: 3492 Content-Type: image/gif ---------------------------------------------------- Also 'request-method' could changed from a key/value to a one byte flag in the cacheEntry itself. This saves bytes in the disk cache, as well as saving code like this (note that http is (beside images) the most occurring cache entry): // Get the method that was used to generate the cached response 1232 rv = mCacheEntry->GetMetaDataElement("request-method", getter_Copies(buf)); 1233 if (NS_FAILED(rv)) return rv; 1234 1235 nsHttpAtom method = nsHttp::ResolveAtom(buf); 1236 if (method == nsHttp::Head) { 1237 // The cached response does not contain an entity. We can only reuse 1238 // the response if the current request is also HEAD. 1239 if (mRequestHead.Method() != nsHttp::Head) 1240 return NS_OK; 1241 } 1242 buf.Adopt(0); 1243 Ok, looking further down: // Parse the cached HTTP response headers 1259 NS_ASSERTION(!mCachedResponseHead, "memory leak detected"); 1260 mCachedResponseHead = new nsHttpResponseHead(); 1261 if (!mCachedResponseHead) 1262 return NS_ERROR_OUT_OF_MEMORY; 1263 rv = mCachedResponseHead->Parse((char *) buf.get()); 1264 if (NS_FAILED(rv)) return rv; 1265 buf.Adopt(0); 1266 1267 // If we were only granted read access, then assume the entry is valid. 1268 if (mCacheAccess == nsICache::ACCESS_READ) { 1269 mCachedContentIsValid = PR_TRUE; 1270 return NS_OK; 1271 } >> If mCaccheAcces == ACCESS_READ, the early return makes the parsing of the cached header redundant. May be first this check, then parse? 1272 1273 // If the cached content-length is set and it does not match the data size 1274 // of the cached content, then the cached response is partial... 1275 // either we need to issue a byte range request or we need to refetch the 1276 // entire document. 1277 PRUint32 contentLength = (PRUint32) mCachedResponseHead->ContentLength(); 1278 if (contentLength != PRUint32(-1)) { 1279 PRUint32 size; 128 If we set the 'content-length' at creation time as a PRUint32 item in the cache entry (e.g. FullContentLength), we don't need to save and parse the response header just to get the length... Similary: why not store these: 1319 if (mCachedResponseHead->NoStore() || 1320 (mCachedResponseHead->NoCache() && mConnectionInfo->UsingSSL())) and 1330 else if (mCachedResponseHead->MustValidate()) { as bitflags in the cacheentry? Also, the existance of 'response-*' or 'response-cookie' set 'ResponseWouldVary' only based on the stored header. So instead of the header just store a 'vary' flag for this. One very specific case are other cases of 'response-...' fields, (also) stored as MetaData, which are compared between current header and cached header, when also the 'Vary' field is set in the header. At least the MetaData fields are redundant, because there are also in the parsed header (or other way round). Question is: Is the 'Vary' thing really used? Can we not just regard anything with a 'Vary' as to be validated anyway? And probably best not to cache anyway? So, concluding: * MetaDataElement fields can be further optimized (using a byte as key) * Storing the 'response-header' can be reduced in storing a couple of specific fields (fullContentLength, noStore, noCache, mustValidate: 4 bytes and 3 bits) instead of 50-100 bytes string. This also saves some processing (reparsing cached headers and such) The move of about 5 bytes of HTTP specific data from MetaData to the CacheEntry itself is worth the trouble, most (if not all) of the (disk) cache items are HttpChannel items.

Alfred Kayser

Comment 4

•

21 years ago

Oops, further looking through the code, I can see that a cached 'response-header' truly useful. Even so, probably still worth of moving out from MetaData to CacheEntry itself, as 99% of disk cached elements are html.

Christian :Biesinger (don't email me, ping me on IRC)

Comment 5

•

21 years ago

> as 99% of disk cached elements are html. that seems a bit high, considering that most html pages contain images and other media elements.

Darin Fisher

Comment 6

•

21 years ago

christian: i think alfred meant "http" instead of "html"

Alfred Kayser

Comment 7

•

21 years ago

Yes, http requests and in the disk cache, which are most heavy users of CacheMetaDataElements. The memory cache is nowadays only (internal) images. I am working on a hack to test out the impact of using 'byte' keys. In this, I can reduce the the CacheMetaDataElements coding a lot. Instead of a linked lists build and maintained, only a 'disk-save' ready buffer is used containing key/values as: keyByte, valueString, zeroBytes. Get/Set and Flatten/Unflatten directly operate from this buffer. This results in only one malloc'ed piece per cache entry (which is resized if more than one entry is set). Profile is currently either 0, 1 or 3 elements per cache. Walking this keyvalue should be as fast or faster than current code. It reduces the need for Flatten/unflatten... Furthermore the current 'VisitMetaDataElements' stuff is only used by the 'about:cache', and adds several layers of calls, and a callback IDL, class and function just to dump the metadata fields to some html. Changing that to a function call from 'about' to the cache with the buffer and the formatstring to use, instead of a callback will reduce complexity of the IDL for this bit, somewhat. So, first I will do some measurements if this is leading somewhere (in terms of codesize, cachespeed, memorysize).

Alfred Kayser

Comment 8

•

21 years ago

Created bug 230675 to simplify nsICacheVisitor (20K codesize savings!) as a spin off of this one, as it is of interrest for all.

Doug Turner (:dougt)

Comment 9

•

21 years ago

moving minimo bugs to the new bugzilla product.

Component: Networking: Cache → General

Product: Browser → Minimo

Version: Trunk → unspecified

chris hofmann

Updated

•

21 years ago

Blocks: 236580

chris hofmann

Updated

•

21 years ago

No longer blocks: 215636

Doug Turner (:dougt)

Updated

•

20 years ago

Component: General → Networking: Cache

Product: Minimo → Core

Summary: [minimo] minimize memory cache size and growth → minimize memory cache size and growth

Version: unspecified → 1.0 Branch

Doug Turner (:dougt)

Updated

•

20 years ago

Version: 1.0 Branch → Trunk

Alfred Kayser

Comment 10

•

20 years ago

So, this bug is really about reducing the size of the memory cache which is only about 'caching' images... Currently, it seems that: a. images are not flushed from cache b. cache size limits are not checked c. images are cached in the fully decoded 24-bit+8-bit format (in some case both the frame format and OS dependent format) So the problem is not really in the cache code, but in the way that images are maintained in memory...

Daniel de Wildt

Updated

•

19 years ago

Blocks: 213391

Pascal Chevrel:pascalc

Updated

•

19 years ago

Status: NEW → ASSIGNED

Alfred Kayser

Comment 11

•

19 years ago

The discussion about the extravant memory usage of the 'caching' of images is raised again by the bfcache discussion. It seems that when using many tabs, with many images per page, and a lot of 'viewers' (bfcache) the memory usage of decoded images is a lot. This is not due to the memory cache itself, but to the way that decoded images (nsImage and friends) are kept around in the document object for that page. Two things that can be done to improve this: 1. Enable to prevent keep decoded images around for 'bfcached' pages. 2. Allow to disable the use of the memory cache for images. Ad 1. This would possibly slowdown bfcache, as images need to be reloaded (hopefully from the disk cache), decoded, and converted to the OS pixmap. But this would also prevent the very extravant memory usage caused by keeping all those images around, just for the 'fast backwards' feature. Created bug 327280 to investigate this further... Ad 2. The memory cache for images keeps dereferenced decoded images as long as there is space. The actual cache limits is only used for determining how many dereferenced images are kept. That's why so many people complain that the memory cache doesn't seem to keep within limits, as all referenced images are allways kept in the 'cache'. To allow to prevent keeping dereferenced decoded images, one could disable the memory cache completely, but better would be to either: a. Implement a preference to not administrate decoded images in the memory cache, and therefor not to keep dereferenced decoded images around (effectively disabling memory cache only for images, but enabling it for other things to be cached) b. Split the memory cache into an 'image cache' and 'memory cache for other items (http cache)', allowing to selectively enable/disable either of them. Propose to do option 2b: split the memory cache into (HTTP) memory cache and to image cache...

Darin Fisher

Updated

•

19 years ago

Assignee: darin → nobody

Status: ASSIGNED → NEW

Cees T.

Comment 12

•

13 years ago

Firefox 10 has built-in BarTab, so is this bug still valid?

Patrick McManus [:mcmanus]

Updated

•

9 years ago

Status: NEW → RESOLVED

Closed: 9 years ago

Resolution: --- → WORKSFORME

Bugzilla

minimize memory cache size and growth

Categories

(Core :: Networking: Cache, defect)

Tracking

()

People

(Reporter: cathleennscp, Unassigned)

References

Details

(Keywords: memory-footprint, Whiteboard: post-m3?)

Crash Data

Security

(public)

User Story

Description

Comment 1

Updated

Updated

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Updated

Updated

Updated

Updated

Comment 10

Updated

Updated

Comment 11

Updated

Comment 12

Updated