Closed
Bug 194296
Opened 22 years ago
Closed 9 years ago
minimize memory cache size and growth
Categories
(Core :: Networking: Cache, defect)
Tracking
RESOLVED
WORKSFORME
People
(Reporter: cathleennscp, Unassigned)
References
Details
(Keywords: memory-footprint, Whiteboard: post-m3?)
Comment 1•22 years ago
this bug can be broken down a bit:
1. allow cache sessions to have a maximum cache entry size (hard limit). e.g.,
imagelib should be able to say i don't want to store decoded images larger than
N bytes and have the cache take care of all the dooming, etc.
2. reduce overhead resulting from cache data structures (bug 192353 and probably
others).
3. anything else?
Depends on: 192353
Comment 2•21 years ago
might have done all that we can here. will address any problems that still
exist after some space trace analysis in a different bug
Whiteboard: post-m3?
Comment 3•21 years ago
Just a note of some thinking on this.
The MetaData of cache entries is reasonably optimized internally.
However, get/setMetaDataElement uses a key/value system whereby
both are strings, while only a fairly restricted set of keys is used:
nsFtpConnection uses "servertype",
nsHttpChannel uses "response-head", "request-method", and "auth"
nsHTMLDocument uses "charset"
nsImapURL uses "ContentModified"
mimemoz2.cpp uses "contentType"
tabbrowser.xml uses "Icon" (only to set a flag, but not to use it)
Maybe instead of a string key, a one-byte enum would be sufficient!
That would avoid storing these key names in the diskcache for every entry...
The MetaData is currently stored as "response-head"'\0'"...."'\0' for each set key.
Using a keyByte, assuming about 3 MetaDataElements per cacheItem and about 3000
items in the diskCache, saves about 72 K of the diskmap data.
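The restricted key set above could be captured in a one-byte enum. A minimal sketch; the enum name and values here are illustrative only, not actual Necko code:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical one-byte keys replacing the string keys listed above.
enum class MetaKey : uint8_t {
    ServerType      = 1,  // "servertype"
    ResponseHead    = 2,  // "response-head"
    RequestMethod   = 3,  // "request-method"
    Auth            = 4,  // "auth"
    Charset         = 5,  // "charset"
    ContentModified = 6,  // "ContentModified"
    ContentType     = 7,  // "contentType"
    Icon            = 8   // "Icon"
};

// Rough per-element saving: the string key plus its NUL terminator
// shrink down to a single key byte.
size_t SavedBytes(const char *stringKey) {
    return std::strlen(stringKey) + 1 - sizeof(MetaKey);
}
```

At roughly 8 bytes saved per element on average, 3 elements per entry times 3000 entries gives about the 72 K figure quoted above.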
A special case are the 'request-*' headers saved as MetaDataElements in the cache
by nsHttpChannel. Question again: do they really need to be saved/cached?
Other than cookies (which are handled differently anyway), they don't seem to be
really used anywhere.
Going further, the question is whether the full 'response-head' really needs
to be saved on disk. Dropping it could save about 50-100 bytes per entry; assuming
3000 entries, that is again a 150K-300K saving...
(Besides codesize savings...)
----------------------------------------------------
key: http://www.doom9.org/images/mp4ui-optimize.gif
fetch count: 2
last fetched: 01/07/04 13:47:14
last modified: 01/07/04 11:54:48
expires: 01/16/04 15:37:30
Data size: 3492
file on disk: none
Security: This document does not have any security info associated with it.
Client:
HTTP request-method: GET
request-Accept-Encoding: gzip,deflate
response-head: HTTP/1.1 200 OK
Via: HTTP/1.1 ips-alm-a.alm.nl.ibm.com (IBM-PROXY-WTE)
Vary: Accept-Encoding
Date: Wed, 07 Jan 2004 11:52:10 GMT
Server: Apache/1.3.27 (Unix) PHP/4.3.1 mod_gzip/1.3.26.1a
Last-Modified: Tue, 07 Oct 2003 22:44:51 GMT
Etag: "cca6a-da4-3f8341e3"
Accept-Ranges: bytes
Content-Length: 3492
Content-Type: image/gif
----------------------------------------------------
Also, 'request-method' could be changed from a key/value pair to a one-byte flag in the
cacheEntry itself. This saves bytes in the disk cache, as well as saving
code like this (note that http is, besides images, the most frequently occurring cache entry type):
// Get the method that was used to generate the cached response
1232 rv = mCacheEntry->GetMetaDataElement("request-method", getter_Copies(buf));
1233 if (NS_FAILED(rv)) return rv;
1234
1235 nsHttpAtom method = nsHttp::ResolveAtom(buf);
1236 if (method == nsHttp::Head) {
1237 // The cached response does not contain an entity. We can only reuse
1238 // the response if the current request is also HEAD.
1239 if (mRequestHead.Method() != nsHttp::Head)
1240 return NS_OK;
1241 }
1242 buf.Adopt(0);
1243
Ok, looking further down:
// Parse the cached HTTP response headers
1259 NS_ASSERTION(!mCachedResponseHead, "memory leak detected");
1260 mCachedResponseHead = new nsHttpResponseHead();
1261 if (!mCachedResponseHead)
1262 return NS_ERROR_OUT_OF_MEMORY;
1263 rv = mCachedResponseHead->Parse((char *) buf.get());
1264 if (NS_FAILED(rv)) return rv;
1265 buf.Adopt(0);
1266
1267 // If we were only granted read access, then assume the entry is valid.
1268 if (mCacheAccess == nsICache::ACCESS_READ) {
1269 mCachedContentIsValid = PR_TRUE;
1270 return NS_OK;
1271 }
>> If mCacheAccess == ACCESS_READ, the early return makes the parsing of the
cached header redundant. Maybe do this check first, then parse?
1272
1273 // If the cached content-length is set and it does not match the data size
1274 // of the cached content, then the cached response is partial...
1275 // either we need to issue a byte range request or we need to refetch the
1276 // entire document.
1277 PRUint32 contentLength = (PRUint32) mCachedResponseHead->ContentLength();
1278 if (contentLength != PRUint32(-1)) {
1279 PRUint32 size;
If we set the 'content-length' at creation time as a PRUint32 item in the cache
entry (e.g. FullContentLength), we don't need to save and parse the response
header just to get the length...
Similarly: why not store these:
1319 if (mCachedResponseHead->NoStore() ||
1320 (mCachedResponseHead->NoCache() && mConnectionInfo->UsingSSL()))
and
1330 else if (mCachedResponseHead->MustValidate()) {
as bitflags in the cacheentry?
Also, the existence of 'response-*' or 'response-cookie' entries sets 'ResponseWouldVary'
based only on the stored header. So instead of the header, just store a 'vary'
flag for this.
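The fields singled out above (fullContentLength, noStore, noCache, mustValidate, and a vary flag) could be packed into a few bytes on the cache entry. A hypothetical sketch; the struct and flag names are made up for illustration and are not the actual nsCacheEntry API:

```cpp
#include <cstdint>

// Compact per-entry HTTP fields that would replace reparsing the full
// cached response header for these checks: 4 bytes plus a flag byte.
struct HttpCacheBits {
    uint32_t fullContentLength = UINT32_MAX;  // UINT32_MAX means "not set"
    uint8_t  flags = 0;

    enum : uint8_t {
        NoStore      = 1 << 0,  // response had Cache-Control: no-store
        NoCache      = 1 << 1,  // response had Cache-Control: no-cache
        MustValidate = 1 << 2,  // response requires revalidation
        WouldVary    = 1 << 3   // response carried a Vary header
    };

    bool Test(uint8_t f) const { return (flags & f) != 0; }
};
```

With something like this, the checks at lines 1319 and 1330 become simple bit tests instead of header reparses.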
One very specific case are the other 'response-...' fields, (also) stored
as MetaData, which are compared between the current header and the cached header when
the 'Vary' field is also set in the header. At least the MetaData fields are
redundant, because they are also in the parsed header (or the other way round).
The question is: is the 'Vary' thing really used? Can we not just regard anything
with a 'Vary' as needing validation anyway? And probably best not to cache it at all?
So, concluding:
* MetaDataElement fields can be further optimized (using a byte as key)
* Storing the 'response-header' can be reduced to storing
a couple of specific fields (fullContentLength, noStore, noCache,
mustValidate: 4 bytes and 3 bits) instead of a 50-100 byte string.
This also saves some processing (reparsing cached headers and such).
The move of about 5 bytes of HTTP-specific data from MetaData to the CacheEntry
itself is worth the trouble, as most (if not all) of the (disk) cache items are
HttpChannel items.
Comment 4•21 years ago
Oops, looking further through the code, I can see that
a cached 'response-header' is truly useful.
Even so, it is probably still worth moving it out of MetaData into the CacheEntry
itself, as 99% of disk cached elements are html.
Comment 5•21 years ago
> as 99% of disk cached elements are html.
that seems a bit high, considering that most html pages contain images and other
media elements.
Comment 6•21 years ago
christian: i think alfred meant "http" instead of "html"
Comment 7•21 years ago
Yes, http requests in the disk cache are the heaviest users of
CacheMetaDataElements. The memory cache nowadays holds only (internal) images.
I am working on a hack to test out the impact of using 'byte' keys.
With this, I can reduce the CacheMetaDataElements code a lot.
Instead of building and maintaining a linked list, only a 'disk-save'-ready
buffer is used, containing key/values as: keyByte, valueString, zeroByte.
Get/Set and Flatten/Unflatten operate directly on this buffer.
This results in only one malloc'ed piece per cache entry (which is resized if
more than one element is set). Profiling currently shows either 0, 1 or 3 elements per
cache entry. Walking this key/value buffer should be as fast as or faster than the current code.
It reduces the need for Flatten/Unflatten...
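The buffer walk described here might look something like this. A sketch under the stated assumptions only; the function name and the exact record layout (a zero key byte terminating the buffer) are hypothetical, not the landed code:

```cpp
#include <cstdint>
#include <cstring>

// Look up one metadata value in a flat buffer of repeated
// (keyByte, NUL-terminated value) records, terminated by a 0 key byte.
// The same buffer can be flushed to disk as-is, which is why
// Flatten/Unflatten become near no-ops.
const char *GetMetaElement(const uint8_t *buf, uint8_t key) {
    while (*buf != 0) {                        // 0 key byte ends the buffer
        uint8_t k = *buf++;
        const char *value = reinterpret_cast<const char *>(buf);
        buf += std::strlen(value) + 1;         // skip value and its NUL
        if (k == key)
            return value;
    }
    return nullptr;                            // key not present
}
```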
Furthermore, the current 'VisitMetaDataElements' machinery is only
used by 'about:cache', and adds several layers of calls, plus a callback
IDL, class and function, just to dump the metadata fields to some html.
Changing that to a function call from 'about' into the cache, passing the
buffer and the format string to use instead of a callback, will reduce the complexity
of the IDL for this bit somewhat.
So, first I will do some measurements to see whether this is leading somewhere
(in terms of codesize, cachespeed, memorysize).
Comment 8•21 years ago
Created bug 230675 to simplify nsICacheVisitor (20K codesize savings!)
as a spin-off of this one, as it is of interest for all.
Comment 9•21 years ago
moving minimo bugs to the new bugzilla product.
Component: Networking: Cache → General
Product: Browser → Minimo
Version: Trunk → unspecified
Updated•19 years ago
Component: General → Networking: Cache
Product: Minimo → Core
Summary: [minimo] minimize memory cache size and growth → minimize memory cache size and growth
Version: unspecified → 1.0 Branch
Updated•19 years ago
Version: 1.0 Branch → Trunk
Comment 10•19 years ago
So, this bug is really about reducing the size of the memory cache, which is only
about 'caching' images...
Currently, it seems that:
a. images are not flushed from the cache
b. cache size limits are not checked
c. images are cached in the fully decoded 24-bit+8-bit format (in some cases both
the frame format and the OS-dependent format)
So the problem is not really in the cache code, but in the way that images are
maintained in memory...
Updated•19 years ago
Status: NEW → ASSIGNED
Comment 11•19 years ago
The discussion about the extravagant memory usage of the 'caching' of images has been raised again by the bfcache discussion.
It seems that when using many tabs, with many images per page, and a lot of 'viewers' (bfcache), the memory usage of decoded images is substantial.
This is not due to the memory cache itself, but to the way that decoded images (nsImage and friends) are kept around in the document object for that page.
Two things can be done to improve this:
1. Make it possible to not keep decoded images around for 'bfcached' pages.
2. Allow disabling the use of the memory cache for images.
Ad 1. This would possibly slow down bfcache, as images would need to be reloaded (hopefully from the disk cache), decoded, and converted to the OS pixmap.
But it would also prevent the very extravagant memory usage caused by keeping all those images around just for the 'fast backwards' feature.
Created bug 327280 to investigate this further...
Ad 2. The memory cache for images keeps dereferenced decoded images around as long as there is space. The actual cache limit is only used for determining how many dereferenced images are kept. That's why so many people complain that the memory cache doesn't seem to stay within its limits: all referenced images are always kept in the 'cache'.
To avoid keeping dereferenced decoded images, one could disable the memory cache completely, but it would be better to either:
a. Implement a preference to not track decoded images in the memory cache, and therefore not keep dereferenced decoded images around (effectively disabling the memory cache only for images, while keeping it enabled for other things to be cached), or
b. Split the memory cache into an 'image cache' and a 'memory cache for other items (http cache)', allowing either of them to be selectively enabled/disabled.
I propose to do option 2b: split the memory cache into an (HTTP) memory cache and an image cache...
Updated•18 years ago
Assignee: darin → nobody
Status: ASSIGNED → NEW
Comment 12•13 years ago
Firefox 10 has built-in BarTab, so is this bug still valid?
Updated•9 years ago
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME