648429 - HTTP cache: compress all compressible files

Reporter

Description

•

14 years ago

Right now I believe we store all cache items in whatever compression level they were given to us (so gzipped HTML is stored compressed, but non-compressed items are stored uncompressed). This is silly, especially for the memory cache and for mobile, where we need to use space as wisely as possible. I'm not sure of the best compression algorithm to use. Given the high savings of network vs disk IO (usually :), a fairly expensive algorithm might be worth it to get maximum space saving. But almost any compression would be a win. Performance numbers would be useful here.

Harsh86

Comment 1

•

14 years ago

I don't know if this useful but Google recently open sourced a de|compression library called Snappy. "It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression. For instance, compared to the fastest mode of zlib, Snappy is an order of magnitude faster for most inputs, but the resulting compressed files are anywhere from 20% to 100% bigger." It was originally licensed under APL 2.0 but was recently relicensed under a BSD license. http://code.google.com/p/snappy/

Jason Duell

Reporter

Updated

•

14 years ago

No longer blocks: 644431

Wayne Mery (:wsmwk)

Updated

•

14 years ago

Blocks: 572459

Jason Duell

Reporter

Updated

•

14 years ago

Blocks: 645848

Geoff Brown [:gbrown]

Assignee

Updated

•

14 years ago

Assignee: nobody → gbrown

Bjarne (:bjarne)

Comment 3

•

14 years ago

Just some quick input: Does anyone have any idea of the percentage of uncompressed vs compressed data? Is it different on mobile and desktop? Could we improve this ratio by tweaking our "Accept-Encoding" header to insist on compression?

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 4

•

14 years ago

We can't insist on compression because there are lots of servers that don't implement any compression, lots of servers that implement compression but ignore the Accept-Encoding, and probably lots of servers that would not correctly interpret "identity;q=0".

Bjarne (:bjarne)

Comment 5

•

14 years ago

It may still be interesting to get some input to the questions from comment #3: Does anyone have any idea of the percentage of uncompressed vs compressed data? Is it different on mobile and desktop? Ref comment #4: Any hard facts or numbers on this? IMO it may be worth experimenting with "Accept-Encoding" in any case, i.e. see if it affects the percentage of compressed vs uncompressed data. (It may very well not, but then at least we know.) At the end of the day we want to understand how much disk-space we can eliminate, right? If we know that e.g. on average 20% of data can be compressed to 1/10, we've only eliminated 18% of the total size. Otoh, if we know that e.g. 60% of data can be compressed to 1/10 we have eliminated 54% of the total size (Amdahls' law applied on size). This may guide the amount of time and effort we want to spend on this.

Michal Novotny [:michal]

Comment 6

•

14 years ago

Right now I'm trying to dump such statistics from my large cache. I should have it soon...

Geoff Brown [:gbrown]

Assignee

Comment 7

•

14 years ago

I do not have statistics, but from a casual look at my cache, I'm seeing: - most files in my cache are images (jpeg, png, gif); they probably won't compress - most "big" servers (google/microsoft/cnn/facebook) gzip their html/js/text I expect compression would apply to less than 10% of the files in my cache.

Geoff Brown [:gbrown]

Assignee

Comment 8

•

14 years ago

I checked the "top 50" most popular web sites and found no site that responded to "Accept-Encoding: gzip, deflate" with uncompressed text. Visiting less popular sites, I found 3 sites that responded with uncompressed text. "Accept-Encoding: gzip, deflate;q=0.9,identity=0.1", and variations, had no effect on these sites: they still responded with uncompressed text. Are there specific sites or servers that we have a concern about? Specific suggestions for experiments with Accept-Encoding?

Michal Novotny [:michal]

Comment 9

•

14 years ago

I've gathered some statistics from my ~600Mb cache. The oocalc document can be downloaded from http://michal.etc.cz/bug648429/stats.ods (~7MB). I've used LZO to compress metadata (URL + metadata) and data. Compression ratio of LZO and SNAPPY should be very similar (http://news.ycombinator.com/item?id=2362651). Some numbers: - number of entries 58592 - number of compressed entries 9560 gzip 61 deflate - content type counts 18272 "image/jpeg" 13162 "image/gif" 7378 "image/png" 7301 "text/html" 3489 "application/x-javascript" 2055 "text/javascript" 1723 "text/css" 1188 "text/plain" 1185 "" 592 "application/json" 578 "application/javascript" 418 "application/x-shockwave-flash" 400 "text/xml" 348 "application/ocsp-response" 245 "image/x-icon" 112 "application/octet-stream" ... - average LZO compression ratio (excluding gzipped/deflated entries) 25.1% "" 29.8% "text/xml" 32.4% "application/json" 34.0% "text/html" 35.6% "text/css" 44.5% "application/javascript" 44.8% "application/x-javascript" 45.7% "image/x-icon" 48.2% "text/javascript" 83.1% "application/octet-stream" 94.3% "image/jpeg" 94.4% "image/gif" 94.7% "application/ocsp-response" 95.0% "text/plain" 98.8% "image/png" 99.6% "application/x-shockwave-flash"

Bjarne (:bjarne)

Comment 10

•

14 years ago

(In reply to comment #8) > I checked the "top 50" most popular web sites and found no site that responded > to "Accept-Encoding: gzip, deflate" with uncompressed text. Nice work! This is IMO a strong argument for postponing this activity. There is also bug #650143 which mentions compressed / uncompressed ratio as one of the stats to obtain, and that will give us some real-life data to work with. > Are there specific sites or servers that we have a concern about? I don't really know but I'd guess we care about the top 50 list which you already checked, and maybe also about behaviour of the most widely used servers. See http://www.http-stats.com/ > Specific > suggestions for experiments with Accept-Encoding? The alternate "Accept-Encoding" you mention in comment #8 has a syntax-error (missing ";q=" for identity). You might try the right one (unless your comment is a typo, of-course). You may also try to disallow identity-encoding (identity;q=0.0) to see how servers deal with that. Please also make a note of which servers (site and http-server) which respond erroneously to this, if any.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 11

•

14 years ago

Michal, Geoff, that is useful info. Michal, it would be interesting to see two other data points: (a) the "number of compressed entries" broken down by content type. The percentage of compressed entries in your cache is low, but I suspect that is because a lot of the uncompressed entries are not compressible. Less important: (b) Of the compressed entries, if they were re-compressed with deflate at maximum setting, how many of them would become smaller? (I believe Apache and IIS do not use maximum compression by default.) (c) Of the gzipped entries, how many of them contain the metadata about what compression level was used?

Geoff Brown [:gbrown]

Assignee

Comment 12

•

14 years ago

(In reply to comment #10) > The alternate "Accept-Encoding" you mention in comment #8 has a syntax-error > (missing ";q=" for identity). You might try the right one (unless your comment > is a typo, of-course). You may also try to disallow identity-encoding > (identity;q=0.0) to see how servers deal with that. Please also make a note of > which servers (site and http-server) which respond erroneously to this, if any. Oops - that's a typo in my comment. I actually used: Accept-Encoding: gzip, deflate;q=0.9,identity;q=0.1 I also tried: Accept-Encoding: gzip, deflate;q=1.0,identity;q=0.0 Some sites - eg. intrinsyc.com - seem to ignore this completely and return uncompressed data. This appears to be an older server, reporting: Server: Microsoft-IIS/5.0 X-Powered-By: ASP.NET X-AspNet-Version: 2.0.50727 A google search for "IIS 5 compression" suggests that HTTP compression was problematic in IIS 5, and also not enabled by default.

Jason Duell

Reporter

Comment 13

•

14 years ago

re: the stats from comment 9 and comment 11. I agree we need to see at a minimum a number that indicates "what percent of compressible content is not being compressed?". Also getting "how much smaller could we compress already-compressed/gzipped content" would be nice. Even if the percentage of uncompressed content is not high, this might still be worth doing, given that it shouldn't (?) be very much work.

Patrick McManus [:mcmanus]

Comment 14

•

14 years ago

(In reply to comment #13) > re: the stats from comment 9 and comment 11. I agree we need to see at a > minimum a number that indicates "what percent of compressible content is not > being compressed?". Also getting "how much smaller could we compress > already-compressed/gzipped content" would be nice. > Maybe we could publish[*] a utility that computes the stats of interest, so we could get a wider data set on this bug? Presumably based off michal's work in comment 9. I know I would run it and would feel more comfortable if even a wider handful of runs agreed - or pump it into telemetry if we want to wait for it. I'm curious for example if my cache compresseses to a similar degree that my wife's cache does. She is an educator and has different browsing patterns. I find the data in comment 9 a little surprising - but data rules at the end of the day. Especially if its reproduced! Dumb question - are caches platform independent? Can I compile and run such a utility on linux and copy over profiles from other platforms as data for it? [*] I don't mean publish in any formal sense - just code we can compile and run against the current trunk cache format.

Michal Novotny [:michal]

Comment 15

•

14 years ago

Some more stats: gzip/deflate by types 3375 "text/html" 2058 "application/x-javascript" 1383 "text/javascript" 746 "text/css" 560 "application/json" 504 "text/plain" 319 "text/xml" 251 "image/jpeg" 202 "application/javascript" 94 "image/gif" 69 "image/png" 23 "text/x-cross-domain-policy" 23 "image/x-icon" 10 "application/x-shockwave-flash" 1 "text/rdf" 1 "image/x-ms-bmp" 1 "httpd/unix-directory" 1 "font/ttf" percentage of compressible content that is compressed 42.5% "" 79.8% "text/xml" 94.6% "application/json" 46.2% "text/html" 43.3% "text/css" 34.9% "application/javascript" 59.0% "application/x-javascript" 9.4% "image/x-icon" 67.3% "text/javascript" gzip method (XFL value) 6416 00 2670 02 (best) 464 04 (fast) 10 78 (unknown/invalid) gzip levels (compared to actual gzip result) 1111 worse than 1 1855 1 54 1-2 276 2 41 2-3 165 3 365 3-4 615 4 119 4-5 1411 5 783 5-6 1403 6 479 6-7 89 7 94 7-8 78 8 22 8-9 26 9 477 better than 9 difference between recompressed (with gzip -9) and original content (in total) 55805084 / 58366830 bytes 64989952 / 67526400 bytes (actual disk usage)

Michal Novotny [:michal]

Comment 16

•

14 years ago

(In reply to comment #11) > (c) Of the gzipped entries, how many of them contain the metadata about what > compression level was used? If you mean XFL value see comment #15. No other information about chosen level is IMO available...

Michal Novotny [:michal]

Comment 17

•

14 years ago

(In reply to comment #14) > Dumb question - are caches platform independent? Can I compile and run such a > utility on linux and copy over profiles from other platforms as data for it? Yes, the cache is platform independent. > [*] I don't mean publish in any formal sense - just code we can compile and run > against the current trunk cache format. Later, I'll put the code somewhere. But you'll be surprised how ugly it is :) Actually it's a simple C++ code that extracts all entries from the cache block files. The whole cache is then parsed with a set of bash scripts.

Jason Duell

Reporter

Comment 18

•

14 years ago

cc-ing Dougt and mfinkle in case they can help with infrastructure: percentage of compressible content that is compressed 79.8% "text/xml" 46.2% "text/html" 43.3% "text/css" 34.9% "application/javascript" 59.0% "application/x-javascript" Excellent numbers to have! If these are even remotely representative, there's definitely a gain to be had here. (but comment 8 suggests that compressed content is much more common than Michal's numbers). Assuming that, and that the code is relatively easy to implement, I think we may want to prioritize this work, at least for the memory cache. (The memory cache is currently the only cache in use by fennec, and so this is one of the most likely things we could do to improve mobile cache perf in the FF 6 timeframe). Implementation-wise, it would be nice if we could serve the data to OnDataAvailable, then compress it (as opposed to compressing it, then decompressing it for OnDataAvailable). Now we need to figure out which compression algorithm to use. I'm no expert here, and I expect there's no definitive "right" answer to the CPU usage versus better compression tradeoff. Suggestions? Here's an opening suggestion: we run various compression algorithms over Michal's data set, and measure both 1) how much they shrink the uncompressed files and 2) how much CPU they took (use a high-performance timer and/or compress them a bunch of times if needed to get a useful timer reading). *Ideally* we'd get those numbers on a representative mobile device, but even desktop would be much better than nothing. (Michal, perhaps you can collaborate with mfinkle or dougt to see if there's a good way to get CPU/space metrics on mobile).

Jason Duell

Reporter

Comment 19

•

14 years ago

Geoff, Do you have intentions to actually implement a fix here, or can this bug be assigned to one of our "usual suspects" for HTTP cache fixes?

Geoff Brown [:gbrown]

Assignee

Comment 20

•

14 years ago

I had planned on doing the implementation, assuming that we think it's worthwhile (I am still sceptical: 80% compression * 50% of compressible content * 30% of content is compressible => maybe a 15% reduction in space used...at the price of more complexity and more CPU). I haven't worked on this code before and it looks a little intimidating, so I am open to alternatives...collaborate with a Usual Suspect, or hand it off entirely if we're in a hurry, etc. Suggestions?

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 21

•

14 years ago

(In reply to comment #18) > Now we need to figure out which compression algorithm to use. I'm no expert > here, and I expect there's no definitive "right" answer to the CPU usage > versus better compression tradeoff. Suggestions? I suggest doing deflate -9 first, then a lower deflate level if it's too much as far as CPU or memory usage goes, and then only test out other algorithms if zlib's deflate is too slow. Note that for TLS compression, we will be have to compress requests with deflate or not at all; there's no other choice of algorithm that is actually deployed on the web. So, it would be good to get the code size savings and complexity savings of using deflate for everything, if possible. Restricting the compression to the following content-types should help ensure that we don't get too crazy with worst-case time due to large inputs, as these types of documents are usually relatively small, while also being common and highly compressible: text/xml, application/json, text/html, text/css, application/javascript, application/x-javascript, text/javascript. Note that if "" is common, then we need to do the compression based on the calculated (sniffed) content-type, not on just the Content-Type header.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 22

•

14 years ago

(In reply to comment #20) > I had planned on doing the implementation, assuming that we think it's > worthwhile (I am still sceptical: 80% compression * 50% of compressible > content * 30% of content is compressible => maybe a 15% reduction in > space used...at the price of more complexity and more CPU). Right. How high we prioritize this should be based on how much it improves mobile performance. Otherwise, I think it should be a low priority.

Bjarne (:bjarne)

Comment 23

•

14 years ago

(In reply to comment #12) > Some sites - eg. intrinsyc.com - seem to ignore this completely and return > uncompressed data. This appears to be an older server Turning it around: Did you find servers which responds correctly to the modified Accept-Encoding header? Besides the major point of clarifying the potential for saving disk-usage by compressing cache-entries, it is IMO also useful to understand how we might influence servers to send compressed content. Given the amount of Firefox-users these days, leaning on (or assisting) servers-ops and/or server-vendors *may* be a way to reach the goal of this bug.

Geoff Brown [:gbrown]

Assignee

Comment 24

•

14 years ago

(In reply to comment #23) > (In reply to comment #12) > > Some sites - eg. intrinsyc.com - seem to ignore this completely and return > > uncompressed data. This appears to be an older server > > Turning it around: Did you find servers which responds correctly to the > modified Accept-Encoding header? No - I did not find a way to influence (uncompressed->compressed) any server with Accept-Encoding. But perhaps I didn't look very hard: I looked at less than 100 sites in total and found only a handful that provided uncompressed data in response to the normal Accept-Encoding. I will need to automate the testing to get more conclusive results.

Geoff Brown [:gbrown]

Assignee

Comment 25

•

14 years ago

I have results from automated testing for Accept-Encoding variations. I started with the Alexa list of the 1,000,000 most popular global websites and selected every 1000th site, to give 1000 sites with some distribution of popularity and geography. For each site, I used wget --mirror to find up to 100 URLs referenced by the main page; obvious image files (gif/jpg/png) were eliminated. This yielded approximately 32000 URLs. Visiting each URL with standard Firefox 4 request headers, key response headers and content length were noted. The most popular Server: headers retrieved were: Apache (222 variants): 66.6% IIS (4 variants): 14.9% Nginx (36 variants): 9.7% The most popular Content-Type: headers retrieved were: text/html and other HTML variants: 85.8% text/javascript and variants: 3.9% text/css and variants: 3.3% text/xml and variants: 3.1% With standard “Accept-Encoding: gzip, deflate”, the most popular Content-Encoding: headers were: <none>: 60.0% gzip: 38.1% deflate: 0.1% Next, URLs that did not report Content-Encoding of gzip/deflate/compress were visited with alternate Accept-Encoding headers. When sending “Accept-Encoding: gzip, deflate;q=1, identity;q=0, *;q=0”, a very small number of these URLs changed to report compressed Content-Encoding: less than 0.2% of the 32000 original URLs / about 0.3% of the ~18000 URLs re-tested. Additionally, there was a decrease in size of the retrieved content for a few URLs not reporting Content-Encoding (but some of these reflect a change in content over time, rather than use of compression). 2 sites reported HTTP status 406. In total, the Accept-Encoding change affected less than 0.6% of the URLs.

Bjarne (:bjarne)

Comment 26

•

14 years ago

Again: Nice work! :) This seems to contradict comment #8, but suggests potential beyond what we see in comment #18. Any good input to why? (I have one, but I'd like to see if there are other input first.) (I must also say I'm also pretty disappointed by the state of server-driven negotiations out there... maybe this could be a goal for evangelism.)

Geoff Brown [:gbrown]

Assignee

Comment 27

•

14 years ago

I am not sure of the cause of the discrepancy between comment #8 and comment #25. I would very much like to hear what you think Bjarne. :) Here's my speculation: - There is a much greater range of site popularity in the 32000 URL study, compared to the top 50. I expect (hope?!) that Google and Facebook are doing everything they can to optimize their traffic; GeoffsUnicornAndPonyDrawings.com might not be so careful, and might be more likely to use an older server. - In the top 50 study, I captured site traffic in Wireshark and examined the headers manually. If I saw text coming back gzip'd, I moved on to the next site. Taking a closer look now I see cases (eg amazon.com) where most text content is compressed -- but not all: The mostly-compressed Amazon main page loads a cloudfront.net URL that comes back uncompressed.

Bjarne (:bjarne)

Comment 28

•

14 years ago

Not my intention to keep you waiting - it was just rather late yesterday when I wrote comment #26 and I wanted to think things through. :) My idea is simply that some percentage of uncompressed items may not be cacheable, hence we would not see them in the cache. I see now that this is more an explanation for why Michals results in comment #18 differs from the results you report in comment #25. Your second speculation above explains IMO comment #8 and also emphasizes an important point: We must consider both main-pages and sub-resources used by them, which is what you present in comment #25. I see two other factors we should also consider: 1) non-cacheable resources should be filtered from the results 2) *size* of entries is more important than number of entries Resolving these may align your results with Michals results. (Otoh, Michals cache probably represent a completely different profile, which has an impact also, but I'd nevertheless expect more comparable results.) Btw, "GeoffsUnicornAndPonyDrawings.com"?? A hidden talent...? :-D

Geoff Brown [:gbrown]

Assignee

Comment 29

•

14 years ago

Attached patch wip - this works for me but isn't quite ready for review (obsolete) — Details — Splinter Review

Geoff Brown [:gbrown]

Assignee

Comment 30

•

14 years ago

I can selectively compress (ie, apply only to uncompressed HTML, plain text, etc) pages now, using zlib (deflate -9, as suggested in comment #21). It works for me, but I have only tested a little. One simple test I tried was to clear the cache, then visit a sequence of websites, first with a base nightly build, then with the cache compression patch. Some results (values from du --total run on Cache directory): Base Cache compression Change www.google.com 1032 896 -136 www.facebook.com 1180 1044 -136 www.youtube.com 1572 1440 -132 www.yahoo.com 1984 1844 -140 www.cnn.com 3268 2884 -384 www.wikipedia.org 3376 2992 -384 www.intrinsyc.com 4252 3740 -512 So it looks like we are saving space. Now I am looking into page load times, particularly on mobile.

Bjarne (:bjarne)

Comment 31

•

14 years ago

It will be interesting to see the timing-results! I have some scripts measuring times for cache-hits and -misses. Contact me directly if you want them for a starting-point (I'm still trying to figure out how to run them on mobiles though). I took the liberty of adding percentages to you table: Base Cache compression Change Change% www.google.com 1032 896 -136 -13.1% www.facebook.com 1180 1044 -136 -11.5% www.youtube.com 1572 1440 -132 - 8.4% www.yahoo.com 1984 1844 -140 - 7.1% www.cnn.com 3268 2884 -384 -11.8% www.wikipedia.org 3376 2992 -384 -11.4% www.intrinsyc.com 4252 3740 -512 -12.0% In other words we see savings between 7 % and 13% (avg 10.8%) for complete loads of selected pages. I assume the actual content loaded for each url was the same, i.e. loaded from a local copy and not from the real server?

Michal Novotny [:michal]

Comment 32

•

14 years ago

Comment on attachment 531800 [details] [diff] [review] wip - this works for me but isn't quite ready for review > + // If the content is compressible and the server has not compressed it, > + // mark the cache entry for compression. > + if ((mResponseHead->PeekHeader(nsHttp::Content_Encoding) == nsnull) && ( > + mResponseHead->ContentType().EqualsLiteral(TEXT_HTML) || > + mResponseHead->ContentType().EqualsLiteral(TEXT_PLAIN) || > + mResponseHead->ContentType().EqualsLiteral(TEXT_CSS) || > + mResponseHead->ContentType().EqualsLiteral(TEXT_JAVASCRIPT) || > + mResponseHead->ContentType().EqualsLiteral(TEXT_ECMASCRIPT) || > + mResponseHead->ContentType().EqualsLiteral(APPLICATION_JAVASCRIPT) || > + mResponseHead->ContentType().EqualsLiteral(APPLICATION_ECMASCRIPT) || > + mResponseHead->ContentType().EqualsLiteral(APPLICATION_XJAVASCRIPT) || > + mResponseHead->ContentType().EqualsLiteral(APPLICATION_XHTML_XML))) { > + rv = mCacheEntry->SetMetaDataElement("compress", "yes"); > + if (NS_FAILED(rv)) return rv; > + } Note that you shouldn't compress the entry if policy == nsICache::STORE_ON_DISK_AS_FILE.

Geoff Brown [:gbrown]

Assignee

Comment 33

•

14 years ago

I ran zippity test scripts on a Galaxy S to collect on-device page load times for several websites (real servers - not local copies). These tests operate on a list of websites. Each site in the list is visited sequentially, then each site is visited sequentially a 2nd time, and finally a 3rd time. I ran the test against a nightly build with and without the cache compression changes, and repeated the test 10 times. There is a fresh install (cache cleared) after each test run, so there is a cache miss for every 3rd site visit (10 misses, 20 hits for each site in the list). Comparing compression-enabled to compression-disabled on a per-site basis, average pageload times are comparable, but there is a slight bias. On some sites (eg. google.com), compression slightly decreases average page load time and variation in page load time; on other sites (eg intrinsyc.com), compression slightly increases average page load time and variation in page load time. Suggestions for additional testing / analysis welcome!

Geoff Brown [:gbrown]

Assignee

Comment 34

•

14 years ago

Attached image Graphs of pageload times for google.com and intrinsyc.com with/without cache compression (obsolete) — Details

Geoff Brown [:gbrown]

Assignee

Comment 35

•

14 years ago

Attached file Raw data used for graphs (obsolete) — Details

Bjarne (:bjarne)

Comment 36

•

14 years ago

Is the cache large enough to cache everything from a full zippity-test or will early entries be evicted by later? I assume this timing includes layout, rendering, JS and the whole sha-bang? What about the img-cache - is it on? Is it flushed between runs? (In reply to comment #33) > sites (eg. google.com), compression slightly decreases average page load > time and variation in page load time; on other sites (eg intrinsyc.com), > compression slightly increases average page load time and variation Referring to comment #30, the uncompressed size of intrinsyc.com is ~4 times larger than google.com. This *may* indicate that the compression itself has an impact only when there is a certain amount of data to compress. Note, however that this is based on only two datapoints which obviously is too few to conclude anything - I'd suggest to select and analyze a few more sites trying to falsify the above hypothesis. Oh - please also report the saved space for each site you analyze (referring to comment #31, this is 13.1% for google.com and 12.0% for intrinsyc.com). Have you measured the raw compression- and decompression-speed on the device using the code in your patch (just to get an idea what we're up against)?

Geoff Brown [:gbrown]

Assignee

Comment 37

•

14 years ago

(In reply to comment #36) > Is the cache large enough to cache everything from a full zippity-test or > will early entries be evicted by later? The full zippity-test causes evictions with the standard cache. The results I posted in comments 33-35 are a modified zippity with only 3 sites visited per run -- no evictions. > I assume this timing includes > layout, rendering, JS and the whole sha-bang? Yes. > What about the img-cache - is > it on? Is it flushed between runs? I don't know. How can I tell? > Referring to comment #30, the uncompressed size of intrinsyc.com is ~4 times > larger than google.com. The values in comment #30 indicate the total cache size after visiting a sequence of sites: without compression, the cache grew by 1032 blocks for google; it grew by 876 blocks for intrinsyc. > Have you measured the raw compression- and decompression-speed on the device > using the code in your patch (just to get an idea what we're up against)? I have not - it's a good idea. Stay tuned.

Geoff Brown [:gbrown]

Assignee

Comment 38

•

14 years ago

I found a problem in the original implementation, which partially accounts for the increased Tp noted in comment 33: nsHttpChannel::CheckCache() compares the cache entry size to the cached Content-Length (if not HTTP compressed) and considers a mismatch as a partial cache entry. Also, I've noted some xpcshell-test failures. Working through these...

Geoff Brown [:gbrown]

Assignee

Comment 39

•

14 years ago

Attached patch wip - corrects CheckCache problem; also does not compress STORE_ON_DISK_AS_FILE entries (obsolete) — Details — Splinter Review

Attachment #531800 - Attachment is obsolete: true

Geoff Brown [:gbrown]

Assignee

Comment 40

•

14 years ago

Attached image Graphs of pageload times for google.com and intrinsyc.com with/without cache compression (updated) — Details

Attachment #532759 - Attachment is obsolete: true

Attachment #532762 - Attachment is obsolete: true

Bjarne (:bjarne)

Comment 41

•

14 years ago

Results for intrinsyc.com indicates that compressing data is quite costly - did you get around to measure raw times for compression? It would be interesting also to get a clear idea of how much data we are compressing in each case. I'm a little confused by comment #37 vs comment #30 in this respect: What exactly does the table in comment #30 show? My interpretation is that the Base-column shows how much total-cache-size grows when loading each url without compression, and the "Cache-compression" column show how much total-cache-size grows when loading each url WITH compression. Is this wrong?

Geoff Brown [:gbrown]

Assignee

Comment 42

•

14 years ago

(In reply to comment #41) > Results for intrinsyc.com indicates that compressing data is quite costly - > did you get around to measure raw times for compression? Compressing data is certainly adding to page load time. With deflate(9), I see an average increase in page load time for intrinsyc.com (my best example of compressible data) of 5% to 10%. I tried measuring raw compression speed by timing just the calls to zlib compression functions, and logging the sum of those times for each file, along with the size of the file. Results varied quite a bit, but on average, on the Galaxy S, using deflate(9), I got about 2 megabytes / second. I repeated the test with deflate(1), and got about 4 megabytes / second. I repeated my zippity tests with deflate(1) and found that page load times with compression were nearly identical to those without compression -- see new graph attachment. > It would be interesting also to get a clear idea of how much data we are > compressing in each case. I'm a little confused by comment #37 vs comment > #30 in this respect: What exactly does the table in comment #30 show? My > interpretation is that the Base-column shows how much total-cache-size grows > when loading each url without compression, and the "Cache-compression" > column show how much total-cache-size grows when loading each url WITH > compression. Is this wrong? Each value in the Base-column of the table in comment #30 shows the total-cache-size (as measured by du on the cache directory) after retrieving the specified site. Subtracting subsequent values in the Base (or Cache compression) column gives the amount of cache growth caused by visiting the second site of the pair. Note that the first value (google.com) in each column may not be very meaningful because it includes the overhead for creating the cache map files, etc. I am just cautious about drawing too many conclusions from this table; my suggestion is only that it shows that compression is translating into a reduction in disk space. Using about:cache, I see intrinsyc.com returns compressible content such as prototype.js, with an uncompressed size of 129738 bytes; deflate(9) compresses this to 30034 bytes (23% of the original size); deflate(1) compresses this to 37491 bytes (29%). The conclusion I have reached is that deflate(9) is a bit too intense for mobile devices, but deflate(1) is fine and still produces a reasonable compression ratio.

Geoff Brown [:gbrown]

Assignee

Comment 43

•

14 years ago

Attached image Graphs of zippity page load times for deflate(5) and deflate(1) — Details

Bjarne (:bjarne)

Comment 44

•

14 years ago

Starts looking good! :) The basics are certainly in place now. In order to provide a strong foundation for a decision in this area some more things would be useful: 1) these graphs for more than one url, preferably some with different profiles in terms of compressed/uncompressed content, as well as different sizes 2) for each graph, indicate how much space the compression saved (both absolute and in %)

Geoff Brown [:gbrown]

Assignee

Comment 45

•

14 years ago

BTW, I repeated tests on a Nexus 1 and obtained very similar results: impaired page load time with deflate(9); no change to page load time with deflate(1).

Geoff Brown [:gbrown]

Assignee

Comment 46

•

14 years ago

(In reply to comment #44) > 1) these graphs for more than one url, preferably some with different > profiles in terms of compressed/uncompressed content, as well as different > sizes > > 2) for each graph, indicate how much space the compression saved (both > absolute and in %) Running the tests and generating the graphs is a bit time-consuming: I can do a few more, but let's not get carried away! Also, I do not have a good strategy for finding different profiles or automatically measuring the compressed content and sizes. Remember that compression operates on individual cache entries, while the page load time applies to a group of such entities (all the elements required to display the page). I can load a page and then manually examine about;cache, or log out a sequence of per-entry measurements... Alternately, I can set up my own very simple pages of text, served from a non-compressing server, but that is not a common real-world scenario. I have looked at data for several other common sites, and of these, intrinsyc.com was the worst case.

Geoff Brown [:gbrown]

Assignee

Comment 47

•

14 years ago

Attached patch patch (obsolete) — Details — Splinter Review

Attachment #534104 - Attachment is obsolete: true

Attachment #535701 - Flags: review?(jduell.mcbugs)

Geoff Brown [:gbrown]

Assignee

Comment 48

•

14 years ago

Attached patch patch (obsolete) — Details — Splinter Review

The previous patch was failing some xpcshell tests; all xpcshell tests pass now. The issue was that the cache entry's content-length differed from the amount of data returned by the entry's stream, for compressed data (particularly when compression caused an increase in data size -- a condition that sometimes cannot be avoided). This patch moves the inflation of compressed cache entries into the entry's output stream wrapper, so that the cache entry input and output streams consistently take and provide data in the format provided by the server.

Attachment #535701 - Attachment is obsolete: true

Attachment #535701 - Flags: review?(jduell.mcbugs)

Attachment #536238 - Flags: review?(jduell.mcbugs)

Geoff Brown [:gbrown]

Assignee

Comment 49

•

14 years ago

Attached image Graphs of pageload times for 3 simple text files with/without cache compression (deflate 1) — Details

As suggested in comment #44, here are some more results from zippity tests. In this case, I tested 3 sizes of plain text pages. Cache entry size (KiB) Average pageload time (ms) no comp comp % change no comp comp % change 5k.txt 5 3 40.0 623.43 624.43 -0.16 50k.txt 50 24 52.0 615.40 612.40 0.49 500k.txt 500 218 56.4 2591.97 2597.97 -0.23

Geoff Brown [:gbrown]

Assignee

Comment 50

•

14 years ago

Attached image Graphs of pageload times for 3 popular sites with/without cache compression (deflate 1) — Details

...and another graph for www.yahoo.co.jp, mail.ru, and news.baidu.com Cache entry size (KiB) Average pageload time (ms) no comp comp % change no comp comp % change yahoo.co.jp 122 88 27.9 3990.47 3966.67 3.10 mail.ru 399 391 2.0 8585.93 8788.60 -2.36 baidu.com 20 20 0.0 4669.03 4223.57 9.54

Jason Duell

Reporter

Updated

•

14 years ago

Attachment #536238 - Flags: review?(jduell.mcbugs) → review?(michal.novotny)

Jason Duell

Reporter

Comment 51

•

14 years ago

from comment 21 > I suggest doing deflate -9 first, then a lower deflate level if it's too much > as far as CPU or memory usage goes, and then only test out other algorithms if > zlib's deflate is too slow. Note that for TLS compression, we will be have to > compress requests with deflate or not at all; there's no other choice of > algorithm that is actually deployed on the web. Brian: What about TLS requires us to use zlib deflate, as opposed to snappy, or something else? I don't see why TLS matters if we're just compressing the channel data in the cache (i.e. not on the wire).

Jason Duell

Reporter

Comment 52

•

14 years ago

re: comment 50: The % change (ms) fields seem a bit off? I expected cases that are faster to have a negative value. Also it seems the yahoo.co.jp case is incorrect for time % (6 ms out of almost 4000 isn't 3%). > Compressing data is certainly adding to page load time. With deflate(9), I see > an average increase in page load time for intrinsyc.com (my best example of > compressible data) of 5% to 10%. So theoretically only *decompressing* has to increase page load time, right? Compressing happens on a separate thread as async I/O, and is (I assume) not in the path between loading bytes from the network and delivering them to the channel listener's OnDataAvailable. That said, I doubt that we deprioritize the cache I/O thread below the parsing/rendering threads, and so the CPU needed to compress might show up as a perf hit for page load just by siphoning away CPU slices from the rest of the browser (particularly on a single-CPU device like a mobile phone). Decompressing CPU cost is similar no matter what level of compression, right? I'm guessing your tests are measuring "hot" disk cache hits, i.e. you've recently stored the cache entry on disk, so it's probably in the operating system's file cache. It would be interesting to see how things fare on "cold" entries that require actual disk I/O, i.e. if you flush the OS cache in between storing the cache entries and doing cache hits. If you're on Linux you can do this via "sudo echo 3 > /proc/sys/vm/drop_cache": see (http://www.linuxquestions.org/questions/linux-kernel-70/how-to-disable-filesystem-cache-627012/). But given how many factors are at play, and how hard it is to really approximate a working cache, the best benchmarking approach for this bug is probably to land it on m-c (with a pref that allows it to be disabled, ideally), and collect some telemetry data to measure how it affects nighly users. patch drive-by comment: Given that nsCompressInputStreamWrapper is actually doing Decompress instead of Compress, let's name it nsDecompressInputStreamWrapper.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 53

•

14 years ago

(In reply to comment #51) > Brian: What about TLS requires us to use zlib deflate, as opposed to > snappy, or something else? I don't see why TLS matters if we're just > compressing the channel data in the cache (i.e. not on the wire). My point is that deflate/inflate is "free" and well-tested because we use it in other parts of the product already. We could definitely use snappy or something else, if it is demonstrably better than the results we would get with deflate.

Geoff Brown [:gbrown]

Assignee

Comment 54

•

14 years ago

(In reply to comment #52) > re: comment 50: The % change (ms) fields seem a bit off? I expected cases > that are faster to have a negative value. Also it seems the yahoo.co.jp > case is incorrect for time % (6 ms out of almost 4000 isn't 3%). I defined "% change" = ("no comp" - "comp") / "no comp" * 100% -- it seems intuitive to me. :) There is a typo in the yahoo.co.jp table entry: The original data shows a compressed time of 3866.67 ms. (3990.47-3866.67)/3990.47 = 3.10%.

Geoff Brown [:gbrown]

Assignee

Comment 55

•

14 years ago

(In reply to comment #52) > So theoretically only *decompressing* has to increase page load time, right? > Compressing happens on a separate thread as async I/O, and is (I assume) not > in the path between loading bytes from the network and delivering them to > the channel listener's OnDataAvailable. That said, I doubt that we > deprioritize the cache I/O thread below the parsing/rendering threads, and > so the CPU needed to compress might show up as a perf hit for page load just > by siphoning away CPU slices from the rest of the browser (particularly on a > single-CPU device like a mobile phone). Decompressing CPU cost is similar > no matter what level of compression, right? It's hard to say definitively, but that's exactly what I think is happening. On some runs with significant compression (deflate 9), there seems to be no effect on page load time; on others - seemingly at random - there is an increase. As you say, compression is not in the path to OnDataAvailable, so it seems likely that the increase in page load comes about when compression (or perhaps decompression - but that seems less likely) steals CPU from the rest of the browser. > let's name it nsDecompressInputStreamWrapper. Sure.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 56

•

14 years ago

At build time, we can create separate compression contexts for each type of file: JS, HTML, CSS that are loaded with common substrings used in each type of file. Then, when we compress/decompress entries, we can use these pre-built contexts. I am not sure how much of an improvement this would buy us, however.

(dormant account)

Comment 57

•

14 years ago

(In reply to comment #55) > (In reply to comment #52) > > So theoretically only *decompressing* has to increase page load time, right? > > Compressing happens on a separate thread as async I/O, and is (I assume) not > > in the path between loading bytes from the network and delivering them to > > the channel listener's OnDataAvailable. That said, I doubt that we > > deprioritize the cache I/O thread below the parsing/rendering threads, and > > so the CPU needed to compress might show up as a perf hit for page load just > > by siphoning away CPU slices from the rest of the browser (particularly on a > > single-CPU device like a mobile phone). Decompressing CPU cost is similar > > no matter what level of compression, right? > > It's hard to say definitively, but that's exactly what I think is happening. > On some runs with significant compression (deflate 9), there seems to be no > effect on page load time; on others - seemingly at random - there is an > increase. As you say, compression is not in the path to OnDataAvailable, so > it seems likely that the increase in page load comes about when compression > (or perhaps decompression - but that seems less likely) steals CPU from the > rest of the browser. > Seems like this would be settled by timing decompression (making sure to subtract any io time)

Michal Novotny [:michal]

Comment 60

•

14 years ago

Comment on attachment 536238 [details] [diff] [review] patch > + nsXPIDLCString val; > + nsresult rv = GetMetaDataElement("uncompressed-len", getter_Copies(val)); const char *val; val = mCacheEntry->GetMetaDataElement("uncompressed-len"); > - * nsCacheInputStream - a wrapper for nsIInputstream keeps the cache entry > + * nsCacheInputStream - a wrapper for nsIInputStream keeps the cache entry I don't see any change... > + if (mReadBuffer == NULL) { > + mReadBuffer = (unsigned char*)nsMemory::Alloc(count); > + } else { > + mReadBuffer = (unsigned char*)nsMemory::Realloc(mReadBuffer, count); > + } Realloc(NULL, count) should do the same as Alloc(count), so the if-statement isn't needed. Why is "count" chosen for the buffer size? The compressed data should be smaller, so it makes sense to allocate smaller buffer. On the other hand, there should be some minimum amount that is allocated. E.g. reading 1 byte of uncompressed data will definitely need more than 1 byte of compressed data. > + // Once allocated, this buffer is referenced by the zlib stream and > + // cannot be grown. We use 2x(initial write request) to approximate > + // an appropriate size for the stream buffer. > + mWriteBuffer = (unsigned char*)nsMemory::Alloc(count*2); Why count*2? The same as above, there should be some minimum size. Maybe this size should be fixed? And see below the problem with count=0. > + while (mZstream.avail_in > 0 && zerr == Z_OK) { > + zerr = deflate(&mZstream, Z_NO_FLUSH); > + > + while (mZstream.avail_out == 0 && rv == NS_OK) { > + rv = WriteAvailable(); > + zerr = deflate(&mZstream, Z_NO_FLUSH); > + } > + } This double loop is not completely clear to me. E.g. if WriteAvailable() fails then there are another 2 calls to deflate(). And if somebody calls the first write() with count=0 (which of course doesn't make much sense), then any other non-zero write will stuck in the inner loop. > + nsresult nsCacheEntryDescriptor:: > + nsCompressOutputStreamWrapper::WriteAvailable() Please rename the method to something like WriteBuffer or FlushBuffer. > - while (dataSize) { > - PRUint32 count = PR_MIN(dataSize, sizeof(chunk)); > - if (NS_FAILED(stream->Read(chunk, count, &n)) || n == 0) > - break; > - dataSize -= n; > + while(NS_SUCCEEDED(stream->Read(chunk, sizeof(chunk), &n)) &&. > + n > 0) { I don't like the fact that nsICacheEntryInfo::dataSize returns compressed size. Why not to return the uncompressed size? Also please note that the compressed entries don't support appending data.

Attachment #536238 - Flags: review?(michal.novotny) → review-

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 61

•

14 years ago

(In reply to comment #60) > I don't like the fact that nsICacheEntryInfo::dataSize returns compressed > size. Why not to return the uncompressed size? It would be better to have two new methods, uncompressedDataSize and compressedDataSize, removing dataSize. Then there would be no confusion about what the return value is supposed to represent.

Michal Novotny [:michal]

Comment 62

•

14 years ago

(In reply to comment #61) > It would be better to have two new methods, uncompressedDataSize and > compressedDataSize, removing dataSize. Then there would be no confusion > about what the return value is supposed to represent. Maybe, but I don't think it is really necessary. There is no access to the raw compressed data. So if all we can read from the entry is N bytes, then dataSize should IMO return N bytes.

Geoff Brown [:gbrown]

Assignee

Comment 63

•

14 years ago

(In reply to comment #60) Thanks much Michal! > > - * nsCacheInputStream - a wrapper for nsIInputstream keeps the cache entry > > + * nsCacheInputStream - a wrapper for nsIInputStream keeps the cache entry > > I don't see any change... Corrects lower-case "s" to upper-case, in "Stream". > > + if (mReadBuffer == NULL) { > > + mReadBuffer = (unsigned char*)nsMemory::Alloc(count); > > + } else { > > + mReadBuffer = (unsigned char*)nsMemory::Realloc(mReadBuffer, count); > > + } > > Why is "count" chosen for the buffer size? The compressed data should be > smaller, so it makes sense to allocate smaller buffer. On the other hand, > there should be some minimum amount that is allocated. E.g. reading 1 byte > of uncompressed data will definitely need more than 1 byte of compressed > data. A simple fixed size buffer would be sufficient, but I expect we can save a few cycles by using a buffer size that minimizes the number of times through the compress (or decompress) loop. "count" is not perfect, but is a rough approximation of what is required (perhaps off by 2x or 3x). I agree that a minimum is required, and have added a 1K minimum for both the read and write buffers. > > + while (mZstream.avail_in > 0 && zerr == Z_OK) { > > + zerr = deflate(&mZstream, Z_NO_FLUSH); > > + > > + while (mZstream.avail_out == 0 && rv == NS_OK) { > > + rv = WriteAvailable(); > > + zerr = deflate(&mZstream, Z_NO_FLUSH); > > + } > > + } > > This double loop is not completely clear to me. I re-wrote it to more closely resemble the example provided with zlib; it's simpler and I think the error handling is sensible now. > I don't like the fact that nsICacheEntryInfo::dataSize returns compressed > size. Why not to return the uncompressed size? nsICacheEntryInfo::dataSize returns nsCacheEntry::DataSize() which I interpret as the amount of data stored in the nsCacheEntry. From the perspective of about:cache, providing the compressed size gives a consistent view of the entry's resource usage. DataSize() is also used by LRU-SP cache eviction, which should be based on resource usage (compressed size / size on disk). Clients like nsHttpChannel usually use Content-Length where the "uncompressed size" is required. If the actual uncompressed data size is required somewhere, there is also the new meta-data element "uncompressed-len" available. There is potential for future confusion here, especially since compression will only apply to some pages, but using uncompressedDataSize/compressedDataSize might be an over-reaction. I have not changed this, but am open to more discussion... > Also please note that the compressed entries don't support appending data. I'm not sure what you mean here. Can you explain?

Geoff Brown [:gbrown]

Assignee

Comment 64

•

14 years ago

Attached patch patch updated based on review (obsolete) — Details — Splinter Review

Updated based on review (comment 60). Also changed wrapper name and allowed for disabling compression via pref, as suggested earlier (jduell, comment 52).

Attachment #536238 - Attachment is obsolete: true

Attachment #539423 - Flags: review?(michal.novotny)

Geoff Brown [:gbrown]

Assignee

Comment 65

•

14 years ago

(In reply to comment #56) > At build time, we can create separate compression contexts for each type of > file: JS, HTML, CSS that are loaded with common substrings used in each type > of file. Then, when we compress/decompress entries, we can use these > pre-built contexts. I am not sure how much of an improvement this would buy > us, however. This is an interesting idea but I fear that testing the relative efficiency of various dictionaries could take on a life of its own! Perhaps we should spin this off as a separate bug?

Michal Novotny [:michal]

Comment 66

•

14 years ago

Comment on attachment 539423 [details] [diff] [review] patch updated based on review > + if (mReadBufferLen < count) { > + // Once allocated, this buffer is referenced by the zlib stream and > + // cannot be grown. We use 2x(initial write request) to approximate > + // a stream buffer size proportional to request buffers. wrong comment > + mReadBufferLen = PR_MIN(count, kMinDecompressReadBufLen); There should be MAX, not MIN. And please use NS_MAX instead of PR_MAX. > + mReadBuffer = (unsigned char*)nsMemory::Realloc(mReadBuffer, > + mReadBufferLen); This leaks the old buffer when realloc fails. Btw. it isn't fatal error when realloc fails and we have an old buffer. I.e. do something like this: newBufLen = NS_MAX(count, kMinDecompressReadBufLen); newBuf = (unsigned char*)nsMemory::Realloc(mReadBuffer, newBufLen); if (newBuf) { mReadBufferLen = newBufLen; mReadBuffer = newBuf; } else if (!mReadBuffer) { return NS_ERROR_OUT_OF_MEMORY; } > + if (mWriteBuffer == NULL) { > + // Once allocated, this buffer is referenced by the zlib stream and > + // cannot be grown. We use 2x(initial write request) to approximate > + // a stream buffer size proportional to request buffers. > + mWriteBufferLen = PR_MIN(count*2, kMinCompressWriteBufLen); > + mWriteBuffer = (unsigned char*)nsMemory::Alloc(mWriteBufferLen); > + if (!mWriteBuffer) { NS_MAX again. And please don't mix (mWriteBuffer == NULL) and (!mWriteBuffer). Use one style in the whole file. I prefer the latter. > + mUncompressedCount += *result; > + return rv; > +} Change it to return NS_OK. End of the function is reached only in case of success. > nsICacheEntryInfo::dataSize returns nsCacheEntry::DataSize() which I > interpret as the amount of data stored in the nsCacheEntry. From the > perspective of about:cache, providing the compressed size gives a consistent > view of the entry's resource usage. DataSize() is also used by LRU-SP cache > eviction, which should be based on resource usage (compressed size / size on > disk). I don't agree. Of course, eviction algorithm should use nsCacheEntry::DataSize() which returns compressed size. But value returned by nsICacheEntryInfo::dataSize should exactly match the amount that can be read from the cache entry.

Attachment #539423 - Flags: review?(michal.novotny) → review-

Michal Novotny [:michal]

Comment 67

•

14 years ago

Attached file test (obsolete) — Details

(In reply to comment #63) > > Also please note that the compressed entries don't support appending data. > > I'm not sure what you mean here. Can you explain? See attached test. It fails when the data is compressed. We do this in case of partial entries.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 68

•

14 years ago

(In reply to comment #66) > I don't agree. Of course, eviction algorithm should use > nsCacheEntry::DataSize() which returns compressed size. But value returned > by nsICacheEntryInfo::dataSize should exactly match the amount that can be > read from the cache entry. I find this naming highly confusing. It would be much clearer if the names indicated specifically whether the size is the compressed or the uncompressed size, especially in this case where there is an inconsistency.

Geoff Brown [:gbrown]

Assignee

Comment 69

•

14 years ago

Attached patch updated patch addressing comments 66-68 (obsolete) — Details — Splinter Review

Thanks for the append test case -- that was very useful! With this version of the patch, a compressed entry may contain multiple compression streams, one after another. When reading a compressed entry, end-of-compressed-stream is handled by resetting the compression stream; if additional data is still present, it will be inflated and appended to the request buffer. While looking at the append test case, I came around to agreeing with Michal's concerns about .dataSize: It is natural to expect that .dataSize will be in agreement with the data returned by the stream (ie, uncompressed size). Allowing nsCacheEntry::DataSize() to continue to return the actual / stored (compressed) size of the entry simplifies compatibility. To allow clients of nsICacheEntryDescriptor to access the compressed size, I have added nsICacheEntryDescriptor::storedDataSize; this is required, for instance, when trying to append to a compressed entry, as illustrated by the test case. I believe all other review comments have also been addressed.

Attachment #539423 - Attachment is obsolete: true

Attachment #543304 - Flags: review?(michal.novotny)

Geoff Brown [:gbrown]

Assignee

Comment 70

•

14 years ago

Attached file Updated test for appending to a compressed entry (uses storageDataSize) (obsolete) — Details

Should we add this test to the netwerk/test/unit xpcshell manifest, or is it too much of a special case?

Attachment #540711 - Attachment is obsolete: true

Michal Novotny [:michal]

Comment 71

•

14 years ago

Comment on attachment 543304 [details] [diff] [review] updated patch addressing comments 66-68 Looks good.

Attachment #543304 - Flags: review?(michal.novotny) → review+

Michal Novotny [:michal]

Comment 72

•

14 years ago

(In reply to comment #70) > Created attachment 543308 [details] > Updated test for appending to a compressed entry (uses storageDataSize) > > Should we add this test to the netwerk/test/unit xpcshell manifest, or is it > too much of a special case? Yes, I think we should have a test for this case too.

Geoff Brown [:gbrown]

Assignee

Comment 73

•

14 years ago

Attached patch patch to add test_compressappend.js to xpcshell network unit tests (obsolete) — Details — Splinter Review

Attachment #546886 - Flags: review?(michal.novotny)

Michal Novotny [:michal]

Comment 74

•

14 years ago

Comment on attachment 546886 [details] [diff] [review] patch to add test_compressappend.js to xpcshell network unit tests Please add some short description of what this test does at the beginning of the file.

Attachment #546886 - Flags: review?(michal.novotny) → review+

Geoff Brown [:gbrown]

Assignee

Comment 75

•

14 years ago

Attached patch patch to add test_compressappend.js to xpcshell network unit tests (added comment) r=michal.novotny (obsolete) — Details — Splinter Review

Just added comment to test file (comment 74). r=michal.novotny

Attachment #543308 - Attachment is obsolete: true

Attachment #546886 - Attachment is obsolete: true

Attachment #547416 - Flags: review+

Geoff Brown [:gbrown]

Assignee

Updated

•

14 years ago

Keywords: checkin-needed

Brad Lassey [:blassey] (use needinfo?)

Comment 76

•

14 years ago

pushed to mozilla-inbound: http://hg.mozilla.org/integration/mozilla-inbound/rev/79a36ed03e14 http://hg.mozilla.org/integration/mozilla-inbound/rev/d17ac9b0254e

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 77

•

14 years ago

Were we really expecting this to get checked in? I thought we were going to do more performance testing first.

Geoff Brown [:gbrown]

Assignee

Comment 78

•

14 years ago

Consider jduell's comment 52: > But given how many factors are at play, and how hard it is to really > approximate a working cache, the best benchmarking approach for this bug > is probably to land it on m-c (with a pref that allows it to be disabled, > ideally), and collect some telemetry data to measure how it affects > nighly users.

Jason Duell

Reporter

Comment 79

•

14 years ago

Taras: Do we have any story yet on implementing A-B testing on nightly (in this case using different compression levels)? So we've got a way to turn this off (browser.cache.compression_level=0), which is good. So we can leave this on the tree at least :) We don't have any telemetry data, which is not so good. And we don't have any automated (or even standard non-automated) way to measure cache perf. I've outlined a plan for how to determine some sensible levels for compression in bug 671971 Geoff: I see you've picked compression level 1 for mobile and 5 for firefox. Are these based on any data? Oh--and glancing over the patch, I don't see any obvious way that we're using the compressed size of objects when determining how much stuff we're caching. I may be missing something. But we want to make sure we measure the the cache total size using the compressed size, else we're not actually getting any extra storage :)

Geoff Brown [:gbrown]

Assignee

Comment 80

•

14 years ago

(In reply to comment #79) > Geoff: I see you've picked compression level 1 for mobile and 5 for firefox. > Are these based on any data? For mobile, see comments 40 through 45: Testing on a Galaxy S and a Nexus 1 showed that compression level 9 sometimes degraded page load time; compression level 1 did not degrade page load time. I believe compression level 5 also did not degrade page load time. However, I wanted to err on the side of page-load-time-caution, so chose 1. For desktop Firefox, I do not have data. Compression level 5 was chosen somewhat arbitrarily. I expect that if phones can handle compression, we won't have trouble on desktop. > Oh--and glancing over the patch, I don't see any obvious way that we're > using the compressed size of objects when determining how much stuff we're > caching. I may be missing something. But we want to make sure we measure > the the cache total size using the compressed size, else we're not actually > getting any extra storage :) I believe the cache device total size is based on nsCacheEntry::Size(), which should return the compressed size -- the amount of storage used. I'll check on that.... Apologies if I jumped the gun on the checkin request. I thought the discussion had played out and we wanted to get this in to see if wider (ie, nightly) use turned up any problems.

Geoff Brown [:gbrown]

Assignee

Comment 81

•

14 years ago

(In reply to comment #80) > I believe the cache device total size is based on nsCacheEntry::Size(), > which should return the compressed size -- the amount of storage used. I'll > check on that.... I verified this by logging out the TotalSize from both the memory cache and the disk cache after storing a compressible file -- TotalSize reflects the compressed size.

Marco Bonardo [:mak] (Away July 4-13)

Comment 82

•

14 years ago

This had been pushed to inbound, but I've backed out it to figure out if it's the reason for the permaorange in osx opt xpcshell TEST-UNEXPECTED-FAIL | /Users/cltbld/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_bug650955.js | test failed (with xpcshell return code: 0) TEST-UNEXPECTED-FAIL | /Users/cltbld/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_bug650955.js | 012345 == Initial value - See following stack: TEST-UNEXPECTED-FAIL | /Users/cltbld/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/head_channels.js | Error in closure function: 2147500036 - See following stack: it started just after this push, see for example http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla-Inbound/1311284229.1311285619.5488.gz I'll report back if it is confirmed.

Keywords: checkin-needed

Marco Bonardo [:mak] (Away July 4-13)

Comment 83

•

14 years ago

it took less than expected, after the backout the orange disappeared, so this was the culprit, somehow.

Marco Bonardo [:mak] (Away July 4-13)

Comment 84

•

14 years ago

notice that test already has a random orange filed, so may even be this patch just touches something related to the existing orange making it worse on osx opt.

(no longer active)

Comment 85

•

14 years ago

This has been backed out, and the backout has fixed the test failures.

Geoff Brown [:gbrown]

Assignee

Comment 86

•

14 years ago

test_bug650955.js verifies the effect of browser.cache.disk.max_entry_size by writing plain text of length just greater than the max_entry_size to the cache. With compression enabled, I expect that the stored data size is not exceeding the max_entry_size when the test expects it to. We might work around the failure by changing the test to use a non-compressible Content-Type, but I want to debug this first to verify my understanding of the problem.

Geoff Brown [:gbrown]

Assignee

Comment 87

•

14 years ago

Actually, max_entry_size is compared against the Content-Length first and then later against the nsCacheEntry::size -- if either Content-Length or size exceeds the max_entry_size, the entry is doomed. The comparison against Content-Length allows the entry to be doomed before the entry is fully downloaded. With cache compression enabled, the behavior of the test does not change (for me, testing on Linux): Content-Length causes the entry to be doomed if the uncompressed size exceeds max_entry_size. Therefore, I no longer advocate changing the test. I do not understand the cause of the failure reported in Comment 82. I have started a conversation with bjarne about max_entry_size behavior in the context of cache compression, just to check that the behavior is intended and desired -- he is thinking about it.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 88

•

14 years ago

Is there any way we could get this test working so that we can check this in in the next few days before the Aurora 10 merge?

No longer blocks: 645848

Geoff Brown [:gbrown]

Assignee

Comment 89

•

14 years ago

That conversation (comment 87) never progressed -- no news on that front. I will re-test and see where we are at now.

Bjarne (:bjarne)

Comment 90

•

14 years ago

Hmm... I apparently was a key-element in said conversation... lets take up that thread when you know current status. :} (I would like to run the cache-microbenchmarks with this patch when it's ready, btw. Have you tried that?)

Geoff Brown [:gbrown]

Assignee

Comment 91

•

14 years ago

Attached patch patch updated for bitrot; r=michal.novotny (obsolete) — Details — Splinter Review

Attachment #543304 - Attachment is obsolete: true

Attachment #571840 - Flags: review+

Geoff Brown [:gbrown]

Assignee

Comment 92

•

14 years ago

Attached patch main patch -- updated for bitrot; r=michal.novotny (obsolete) — Details — Splinter Review

Attachment #571840 - Attachment is obsolete: true

Attachment #571878 - Flags: review+

Geoff Brown [:gbrown]

Assignee

Comment 93

•

14 years ago

Attached patch patch to add test_compressappend.js to xpcshell network unit tests (updated for bitrot); r=michal.novotny — Details — Splinter Review

Attachment #547416 - Attachment is obsolete: true

Attachment #571879 - Flags: review+

Geoff Brown [:gbrown]

Assignee

Comment 94

•

14 years ago

All tests pass on my computer, but try runs sometimes fail with a crash during some xpcshell tests: https://tbpl.mozilla.org/?tree=Try&rev=d0ce918d4d23 I am testing a fix now.

Geoff Brown [:gbrown]

Assignee

Comment 95

•

14 years ago

Attached patch Incremental change to guard against crash when cache service not yet initialized (obsolete) — Details — Splinter Review

With this change, all tests pass on try server: https://tbpl.mozilla.org/?tree=Try&rev=63955ac2e352

Attachment #572004 - Flags: review?(michal.novotny)

u408661

Comment 96

•

14 years ago

(In reply to Bjarne (:bjarne) from comment #90) > (I would like to run the cache-microbenchmarks with this patch when it's > ready, btw. Have you tried that?) I'd also like to see if this has any effect (advantageous or detrimental) on mobile, since the disk cache is now turned on for mobile in m-c.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 97

•

14 years ago

Comment on attachment 572004 [details] [diff] [review] Incremental change to guard against crash when cache service not yet initialized I suggest that Michal do the code review, and then Nick and Bjarne each do their performance tests and r+/r- when the results come back. Then we can check it in when we have all three reviews. Do you guys think we will have answers before the merge on Tuesday?

Attachment #572004 - Flags: review?(hurley)

Attachment #572004 - Flags: review?(bjarne)

Geoff Brown [:gbrown]

Assignee

Comment 98

•

14 years ago

(In reply to Bjarne (:bjarne) from comment #90) > Hmm... I apparently was a key-element in said conversation... lets take up > that thread when you know current status. :} Bjarne and I discussed in email and agree that test_bug650955.js is fine as it is and we need no special changes for max_entry_size with compression. > (I would like to run the cache-microbenchmarks with this patch when it's > ready, btw. Have you tried that?) I have not run the microbenchmarks with compression. If Nick and Bjarne can do some performance tests, that would be great!

u408661

Comment 99

•

14 years ago

After talking more with bsmith, we think it might be better to land this AFTER the next aurora merge, so we have a clear understanding of disk cache on mobile without compression before starting to compress anything. This should give Bjarne and I a bit more time to run some perf tests, too, instead of being under pressure for having it done by Monday :)

Bjarne (:bjarne)

Comment 100

•

14 years ago

Attached image Unpatched, misses (obsolete) — Details

Bjarne (:bjarne)

Comment 101

•

14 years ago

Attached image Patched, misses (obsolete) — Details

Result from the microbenchmark showing time seen from JS when cache do NOT contain the resource loaded ("miss time", or essentially time to store the entry into cache.). This plot is with the patch for this bug applied, the previous attachment is the same plot without patch applied.

Bjarne (:bjarne)

Comment 102

•

14 years ago

Attached image Unpatched, misses — Details

Sorry - this is the real plot

Attachment #572322 - Attachment is obsolete: true

Bjarne (:bjarne)

Comment 103

•

14 years ago

Attached image Patched, misses — Details

Proper plot for patched code... Result from the microbenchmark showing time seen from JS when cache do NOT contain the resource loaded ("miss time", or essentially time to store the entry into cache.). This plot is with the patch for this bug applied, the previous attachment is the same plot without patch applied.

Attachment #572325 - Attachment is obsolete: true

Bjarne (:bjarne)

Comment 104

•

14 years ago

Attached image Un (obsolete) — Details

Bjarne (:bjarne)

Comment 105

•

14 years ago

Attached image Unpatched, hits (obsolete) — Details

Attachment #572329 - Attachment is obsolete: true

Bjarne (:bjarne)

Comment 106

•

14 years ago

Attached image Patched, hits — Details

Result from the microbenchmark showing time seen from JS when cache DO contain the resource loaded ("hit time", or essentially time to read the entry from cache.). This plot is with the patch for this bug applied, the previous attachment is the same plot without patch applied.

Bjarne (:bjarne)

Comment 107

•

14 years ago

Attached image Unpatched, hits — Details

Replaces unpatched hits plot (bad ordering of columns)

Attachment #572330 - Attachment is obsolete: true

Bjarne (:bjarne)

Comment 108

•

14 years ago

Comment on attachment 572004 [details] [diff] [review] Incremental change to guard against crash when cache service not yet initialized Review of attachment 572004 [details] [diff] [review]: ----------------------------------------------------------------- (Apologies for the messed-up plots - hopefully they are all replaced by proper ones now...) Miss-results results indicate a performance-hit, but keep in mind that these are loads from the internal httpd.js and when loading from a slower network, the relative loss will be smaller. At this point of time and with the current results I'd say this is ok. Hit-results also indicate a generally small performance-hit, but I r- this because of the result for mem-cache full of small entries (sixth column-group). IMO this should be understood properly before we proceed.

Attachment #572004 - Flags: review?(bjarne) → review-

Michal Novotny [:michal]

Comment 109

•

14 years ago

Comment on attachment 572004 [details] [diff] [review] Incremental change to guard against crash when cache service not yet initialized nsCacheService::CacheCompressionLevel() is called from nsCacheEntryDescriptor and you can get nsCacheEntryDescriptor only from properly initialized cache service. So what is this good for?

Attachment #572004 - Flags: review?(michal.novotny) → review-

Geoff Brown [:gbrown]

Assignee

Comment 110

•

14 years ago

(In reply to Michal Novotny (:michal) from comment #109) > nsCacheService::CacheCompressionLevel() is called from > nsCacheEntryDescriptor and you can get nsCacheEntryDescriptor only from > properly initialized cache service. So what is this good for? I based the fix on the crash stacks reported from the try runs, such as https://tbpl.mozilla.org/php/getParsedLog.php?id=7200485&tree=Try. In these, CacheCompressionLevel is called from nsCompressOutputStreamWrapper::InitZstream (in turn called from nsCompressOutputStreamWrapper::Write -- so it looks like this is the first write to the stream). But I see your point: I cannot think of how the wrapper can be obtained without a nsCacheEntryDescriptor, which should in turn guarantee availability of nsCacheService. I'll look into it...

Michal Novotny [:michal]

Comment 111

•

14 years ago

(In reply to Geoff Brown [:gbrown] from comment #110) > I based the fix on the crash stacks reported from the try runs, such as > https://tbpl.mozilla.org/php/getParsedLog.php?id=7200485&tree=Try. In these, > CacheCompressionLevel is called from > nsCompressOutputStreamWrapper::InitZstream (in turn called from > nsCompressOutputStreamWrapper::Write -- so it looks like this is the first > write to the stream). But I see your point: I cannot think of how the > wrapper can be obtained without a nsCacheEntryDescriptor, which should in > turn guarantee availability of nsCacheService. I'll look into it... It seems that the descriptor is used after the shutdown (not before initialization). This is weird because we clear all entries and wait for all pending events on the cache IO thread during cache shutdown.

u408661

Comment 112

•

14 years ago

Comment on attachment 572004 [details] [diff] [review] Incremental change to guard against crash when cache service not yet initialized Review of attachment 572004 [details] [diff] [review]: ----------------------------------------------------------------- From the perspective of running the specially-crafted tp4m on mobile, this looks fine, and results in a perf. improvement, so I'm going to r+ this from my perspective. We still should see Bjarne's and Michal's comments addressed, though. If the patch changes significantly to address those, I'd like to re-do my tests just in case.

Attachment #572004 - Flags: review?(hurley) → review+

Geoff Brown [:gbrown]

Assignee

Comment 113

•

14 years ago

(In reply to Michal Novotny (:michal) from comment #111) > It seems that the descriptor is used after the shutdown (not before > initialization). This is weird because we clear all entries and wait for all > pending events on the cache IO thread during cache shutdown. I ran experiments on the try server to better understand this issue. The crash happens after nsCacheService::Shutdown is initiated. Shutdown() deletes mObserver (the pref observer) before waiting on the cache IO thread. A guard against mObserver == null is sufficient to avoid the crash (gService is still valid). Alternately, I could check for mInitialized, since it is cleared in Shutdown before deleting mObserver. Or, we could delay the deletion of mObserver until after waiting on the cache IO thread...but I don't think that offers a clear advantage. Overall, I prefer keeping the original check: if (!gService || !gService->mObserver) or perhaps: if (!gService->mObserver)

Geoff Brown [:gbrown]

Assignee

Comment 114

•

14 years ago

(In reply to Bjarne (:bjarne) from comment #108) > Miss-results results indicate a performance-hit, but keep in mind that these > are loads from the internal httpd.js and when loading from a slower network, > the relative loss will be smaller. At this point of time and with the > current results I'd say this is ok. > > Hit-results also indicate a generally small performance-hit, but I r- this > because of the result for mem-cache full of small entries (sixth > column-group). IMO this should be understood properly before we proceed. I have run the microbenchmarks on a MacBook Pro now and I see similar results. I agree that the small performance hits are not a cause for concern: We expect some time will be lost to compression, and expect/hope to make that up with a better hit ratio (presumably why Nick's results indicate an overall improvement). I have not determined the cause of the big performance loss for mem-cache full of small entries, but am investigating. I note that the content of the timing?nn files is very regular and uses a small alphabet, resulting in an extreme compression ratio: a 128K entry is compressed to 217 bytes! Perhaps the microbenchmark could be modified to use data that isn't so easily compressed? (I don't think it will make much of a difference, but more realistic/representative data might improve our confidence in the relevance of the results.)

Bjarne (:bjarne)

Comment 115

•

14 years ago

(In reply to Geoff Brown [:gbrown] from comment #114) > I note that the content of the timing?nn files is very regular and uses a > small alphabet, resulting in an extreme compression ratio: a 128K entry is > compressed to 217 bytes! Perhaps the microbenchmark could be modified to use > data that isn't so easily compressed? (I don't think it will make much of a > difference, but more realistic/representative data might improve our > confidence in the relevance of the results.) Yeah - that pattern is just an "x" repeated... not very suitable for compression... :) I'd be ok to change it - probably applying some randomness will do the trick. Would you like to give it a shot? Just change the function repeatToGovenSize() in "cache_timing_utils.js"

Geoff Brown [:gbrown]

Assignee

Comment 116

•

14 years ago

Comment on attachment 572004 [details] [diff] [review] Incremental change to guard against crash when cache service not yet initialized Michal - please reconsider in light of comment 113.

Attachment #572004 - Flags: review- → review?(michal.novotny)

Geoff Brown [:gbrown]

Assignee

Comment 117

•

14 years ago

(In reply to Bjarne (:bjarne) from comment #108) > Hit-results also indicate a generally small performance-hit, but I r- this > because of the result for mem-cache full of small entries (sixth > column-group). IMO this should be understood properly before we proceed. I have investigated this issue now. In summary: the microbenchmark "hit" results are sometimes measuring memory cache misses because of LRU-SP cache evictions. With mem-cache full of small entries, insertion of a compressed 128K entry almost always results in a cache eviction and subsequent miss. Details: The microbenchmark fills the memory cache by creating numerous "dummy" entries in the cache. For the problematic 128-byte case, about 73000 128 byte "dummy" entries are created. Next, 100 "timing" entries are loaded and their load times recorded as "misses". As these timing entries are created in the already-full cache, evictions must occur. In an LRU cache, like the disk cache, only dummy entries will be evicted. But in the LRU-SP memory cache, it is possible for the timing entries to be evicted. In the 3rd stage of the microbenchmark, the same 100 timing entries are loaded and their load times recorded as "hits". If only dummy entries were previously evicted, this will accurately reflect hit times. If timing entries were evicted, the microbenchmark will report cache miss times as hits. Factors affecting which entries will be evicted include: - the sizes of the dummy entries and the sizes of the timing entries; larger entries have a better chance of eviction - the entry's "in use" status -- in use entries are not evicted - timing: in particular, the duration between setting an entry's "last fetched" time and the start of the eviction procedure - (fetch count is not an issue, since all entries have the same count -- 1) In my tests *without* the compression patches, timing entries, < 8K, were rarely evicted. Timing entries >= 8K were sometimes evicted, but still typically less than 30 of the 100 timing entries were evicted. Larger timing entries avoided eviction most typically because the entry was marked in-use. There was substantial variation between test runs in the number of evictions of timing entries, apparently due to minor timing differences. In my tests *with* the compression patches, timing entries < 128K were rarely evicted. As noted previously, the compression ratios for the timing cache entries are extreme: the 32K timing entries are compressed to just 73 bytes. It is not surprising then that these entries are not evicted. The (noted, problematic) 128K timing entries are compressed to 217 bytes, making them eligible for LRU-SP eviction in preference to the 128 byte dummy entries. Unlike the larger entries observed in the uncompressed case, these 217 byte entries are almost never in use and therefore almost always evicted. As a result, the overall result for "hits" on the compressed mem-cache with 128 byte dummies and 128K timing entries is actually a measurement of cache misses. I tried introducing randomness into the timing entry data to produce a more realistic compression ratio. With a compression ratio of about 60%, the 128K entries were not evicted...but the 32K entries started to be evicted! My conclusions: - the memory cache is behaving correctly - cache compression is behaving correctly - these results do not indicate a performance problem with cache compression - the microbenchmark has a weakness when measuring hits on the memory cache; I don't currently see a way to address this effectively

Mikko Rantalainen

Comment 118

•

14 years ago

(In reply to Geoff Brown [:gbrown] from comment #117) > My conclusions: > [...] > - the microbenchmark has a weakness when measuring hits on the memory > cache; I don't currently see a way to address this effectively Perhaps the microbenchmark should somehow force tested entities to be "in-use" if the intent is to test performance of only the memory cache? It sounds to me that the current microbenchmark implementation requires an LRU memory cache to work correctly.

Bjarne (:bjarne)

Comment 119

•

14 years ago

(In reply to Geoff Brown [:gbrown] from comment #117) > - the microbenchmark has a weakness when measuring hits on the memory > cache; I don't currently see a way to address this effectively Nice work, Geoff! You are absolutely right: The benchmark does not expect timing-entries to be evicted. If that happens, results are unpredictable. (In reply to Mikko Rantalainen from comment #118) > Perhaps the microbenchmark should somehow force tested entities to be > "in-use" This is a nice idea! I'll try to implement it... (unless someone has another idea for how to "pin" selected entries in the cache..?)

Chris Lord [:cwiiis]

Comment 120

•

14 years ago

(In reply to Bjarne (:bjarne) from comment #119) > (In reply to Mikko Rantalainen from comment #118) > > Perhaps the microbenchmark should somehow force tested entities to be > > "in-use" > > This is a nice idea! I'll try to implement it... (unless someone has > another idea for how to "pin" selected entries in the cache..?) Just a note, being able to "pin" entries would be a feature we'd really like on mobile to be able to pin all the entries related to the current page. Mentioning in case this makes any difference to how this gets resolved.

Michal Novotny [:michal]

Comment 121

•

14 years ago

Comment on attachment 572004 [details] [diff] [review] Incremental change to guard against crash when cache service not yet initialized I prefer to move NS_RELEASE(mObserver) in nsCacheService::ShutDown() after the call to SyncWithCacheIOThread().

Attachment #572004 - Flags: review?(michal.novotny)

Geoff Brown [:gbrown]

Assignee

Comment 122

•

14 years ago

Attached patch main patch -- updated for bitrot; r=michal.novotny — Details — Splinter Review

Attachment #571878 - Attachment is obsolete: true

Attachment #576375 - Flags: review+

Geoff Brown [:gbrown]

Assignee

Comment 123

•

14 years ago

Attached patch incremental change to guard against crash when cache service is shutting down — Details — Splinter Review

Updated based on comment #121. This also passes try: https://tbpl.mozilla.org/?tree=Try&rev=139f9d16ff68

Attachment #572004 - Attachment is obsolete: true

Attachment #576511 - Flags: review?(michal.novotny)

Michal Novotny [:michal]

Updated

•

14 years ago

Attachment #576511 - Flags: review?(michal.novotny) → review+

Jason Duell

Reporter

Comment 124

•

14 years ago

Is this ready to land, or there more work to do?

Geoff Brown [:gbrown]

Assignee

Comment 125

•

14 years ago

The only unresolved issue is that we do not have a positive result from the cache microbenchmark. Bjarne has recently updated the microbenchmark to overcome the issue identified in comment 117, but it's not clear if that works yet. I do not object to landing, particularly in light of Nick Hurley's results...or we can wait for new microbenchmark results.

Bjarne (:bjarne)

Comment 126

•

14 years ago

(In reply to Geoff Brown [:gbrown] from comment #125) > particularly in light of Nick Hurley's > results... Are these results and code to produce them available somewhere? (I assume we're talking about comment #112) I can probably use something like this also - see bug #602611, comment #29. I have run the microbenchmarks with the current patch on a dual-core Linux system and results looks fine. It would of-course be nice to have results from other platforms also, but from my perspective I see no reason not to land this.

u408661

Comment 127

•

14 years ago

(In reply to Bjarne (:bjarne) from comment #126) > Are these results and code to produce them available somewhere? (I assume > we're talking about comment #112) I can probably use something like this > also - see bug #602611, comment #29. I would have to dig up the results, but reproducing the test (and presumably the results, unless something went wrong when I ran it!) is easy. Just apply the "makefile target for 'make talos-remote'" patch from bug 688604 to a tree that builds a working xul fennec (if you want to run on mobile). I have a slightly modified version of the patch I will attach here that makes things a bit faster for running it a lot (uses a local tarball of the talos source tree that also has the modifications to the manifest for cache locality). If you want to do it on a local machine, just use the instructions for standalone talos (on the wiki), and make sure your talos.config that you use uses a slightly modified Tp configuration (set tpcycles to 1 in the command line and point it at a manifest that goes through the same set of pages a few times so you get actual cache locality instead of a bunch of crap). I will attach a talos.config and a tp4 manifest that has those properties, as well.

u408661

Comment 128

•

14 years ago

Attached patch modified 'make talos-remote' patch — Details — Splinter Review

So far as I know, this only works on a xul fennec, so you have to do some archaeology to get a checkout you can make a working build with.

u408661

Comment 129

•

14 years ago

Attached file modified talos.config — Details

This is good for running tp4 on localhost with a modified manifest for cache locality (some paths in the config will still need fixing up based on your local config)

u408661

Comment 130

•

14 years ago

Attached file modified tp4 manifest — Details

The modified manifest to use with the modified talos.config, provides actual cache locality in the test! It loads the same set of URLs, but loads them in groups of 5, 3 times per group before moving on to the next group.

u408661

Comment 131

•

14 years ago

The tarball with the modified talos config and manifest for mobile is kind of big, so instead of trying to attach it and possibly choking bugzilla, it's at http://people.mozilla.org/~hurley/talos_clean_modified.tar

Geoff Brown [:gbrown]

Assignee

Comment 132

•

14 years ago

So this is ready for check-in now! There are 3 patches for check-in: https://bugzilla.mozilla.org/attachment.cgi?id=576375 https://bugzilla.mozilla.org/attachment.cgi?id=571879 https://bugzilla.mozilla.org/attachment.cgi?id=576511

Keywords: checkin-needed

Whiteboard: checkin instructions: comment 132

Dão Gottwald [:dao]

Comment 133

•

14 years ago

http://hg.mozilla.org/integration/mozilla-inbound/rev/dfc239e7dcf4 http://hg.mozilla.org/integration/mozilla-inbound/rev/ec4f2192e533 http://hg.mozilla.org/integration/mozilla-inbound/rev/0881bdd95e58

Keywords: checkin-needed

Whiteboard: checkin instructions: comment 132

Target Milestone: --- → mozilla11

:Ms2ger (he/him; ⌚ UTC+1/+2)

Comment 134

•

14 years ago

https://hg.mozilla.org/mozilla-central/rev/dfc239e7dcf4 https://hg.mozilla.org/mozilla-central/rev/ec4f2192e533 https://hg.mozilla.org/mozilla-central/rev/0881bdd95e58

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Mardeg

Comment 135

•

14 years ago

Is SVG included as a compressible type here, or is there a reason gzip compression of them in the cache wouldn't work?

Jo Hermans

Comment 136

•

14 years ago

(In reply to Mardeg from comment #135) > Is SVG included as a compressible type here, or is there a reason gzip > compression of them in the cache wouldn't work? image/svg+xml isn't included in the list at <http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpChannel.cpp#3197>

Geoff Brown [:gbrown]

Assignee

Comment 137

•

14 years ago

Compression is used if: - the Content-Encoding is not specified and - the storage policy is not STORE_ON_DISK_AS_FILE and - the Content-Type is one of: TEXT_HTML TEXT_PLAIN TEXT_CSS TEXT_JAVASCRIPT TEXT_ECMASCRIPT TEXT_XML APPLICATION_JAVASCRIPT APPLICATION_ECMASCRIPT APPLICATION_XJAVASCRIPT APPLICATION_XHTML_XML

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 138

•

14 years ago

(In reply to Mardeg from comment #135) > Is SVG included as a compressible type here, or is there a reason gzip > compression of them in the cache wouldn't work? There is no reason. Any XML-based type should compress well; that is why we included TEXT_XML. Please file a follow-up bug or bugs for any additional compressible mime-types.

Mardeg

Comment 139

•

14 years ago

(In reply to Brian Smith (:bsmith) from comment #138) > There is no reason. Any XML-based type should compress well; that is why we > included TEXT_XML. Please file a follow-up bug or bugs for any additional > compressible mime-types. Ok, filed bug 711849 for SVG

Blocks: 711849

Scoobidiver (away)

Updated

•

14 years ago

Depends on: 712277

Michal Novotny [:michal]

Updated

•

14 years ago

Depends on: 713480

dagyju-199

Updated

•

13 years ago

Blocks: 715191

Josh Aas

Updated

•

13 years ago

Depends on: 715198

Benjamin Smedberg

Updated

•

13 years ago

Depends on: 712032

Michal Novotny [:michal]

Updated

•

13 years ago

Depends on: 723100

Geoff Brown [:gbrown]

Assignee

Comment 140

•

13 years ago

Some web metrics pertinent to topics discussed in this bug: https://developers.google.com/speed/articles/web-metrics

YF (Yang)

Updated

•

10 years ago

Blocks: 940747

wip - this works for me but isn't quite ready for review 14 years ago Geoff Brown [:gbrown] 9.60 KB, patch		Details \| Diff \| Splinter Review
Graphs of pageload times for google.com and intrinsyc.com with/without cache compression 14 years ago Geoff Brown [:gbrown] 46.00 KB, image/png		Details
Raw data used for graphs 14 years ago Geoff Brown [:gbrown] 1.21 KB, text/plain		Details
wip - corrects CheckCache problem; also does not compress STORE_ON_DISK_AS_FILE entries 14 years ago Geoff Brown [:gbrown] 13.04 KB, patch		Details \| Diff \| Splinter Review
Graphs of pageload times for google.com and intrinsyc.com with/without cache compression (updated) 14 years ago Geoff Brown [:gbrown] 57.15 KB, image/png		Details
Graphs of zippity page load times for deflate(5) and deflate(1) 14 years ago Geoff Brown [:gbrown] 52.02 KB, image/png		Details
patch 14 years ago Geoff Brown [:gbrown] 23.34 KB, patch		Details \| Diff \| Splinter Review
patch 14 years ago Geoff Brown [:gbrown] 30.13 KB, patch	michal : review-	Details \| Diff \| Splinter Review
Graphs of pageload times for 3 simple text files with/without cache compression (deflate 1) 14 years ago Geoff Brown [:gbrown] 94.85 KB, image/png		Details
Graphs of pageload times for 3 popular sites with/without cache compression (deflate 1) 14 years ago Geoff Brown [:gbrown] 92.17 KB, image/png		Details
patch updated based on review 14 years ago Geoff Brown [:gbrown] 31.53 KB, patch	michal : review-	Details \| Diff \| Splinter Review
test 14 years ago Michal Novotny [:michal] 2.46 KB, application/javascript		Details
updated patch addressing comments 66-68 14 years ago Geoff Brown [:gbrown] 34.48 KB, patch	michal : review+	Details \| Diff \| Splinter Review
Updated test for appending to a compressed entry (uses storageDataSize) 14 years ago Geoff Brown [:gbrown] 2.49 KB, application/javascript		Details
patch to add test_compressappend.js to xpcshell network unit tests 14 years ago Geoff Brown [:gbrown] 3.61 KB, patch	michal : review+	Details \| Diff \| Splinter Review
patch to add test_compressappend.js to xpcshell network unit tests (added comment) r=michal.novotny 14 years ago Geoff Brown [:gbrown] 3.75 KB, patch	gbrown : review+	Details \| Diff \| Splinter Review
patch updated for bitrot; r=michal.novotny 14 years ago Geoff Brown [:gbrown] 34.49 KB, patch	gbrown : review+	Details \| Diff \| Splinter Review
main patch -- updated for bitrot; r=michal.novotny 14 years ago Geoff Brown [:gbrown] 34.63 KB, patch	gbrown : review+	Details \| Diff \| Splinter Review
patch to add test_compressappend.js to xpcshell network unit tests (updated for bitrot); r=michal.novotny 14 years ago Geoff Brown [:gbrown] 3.78 KB, patch	gbrown : review+	Details \| Diff \| Splinter Review
Incremental change to guard against crash when cache service not yet initialized 14 years ago Geoff Brown [:gbrown] 916 bytes, patch	u408661 : review+ bjarne : review-	Details \| Diff \| Splinter Review
Unpatched, misses 14 years ago Bjarne (:bjarne) 25.45 KB, image/png		Details
Patched, misses 14 years ago Bjarne (:bjarne) 25.45 KB, image/png		Details
Unpatched, misses 14 years ago Bjarne (:bjarne) 25.56 KB, image/png		Details
Patched, misses 14 years ago Bjarne (:bjarne) 25.53 KB, image/png		Details
Un 14 years ago Bjarne (:bjarne) 21.72 KB, image/png		Details
Unpatched, hits 14 years ago Bjarne (:bjarne) 21.72 KB, image/jpeg		Details
Patched, hits 14 years ago Bjarne (:bjarne) 22.76 KB, image/png		Details
Unpatched, hits 14 years ago Bjarne (:bjarne) 22.38 KB, image/png		Details
main patch -- updated for bitrot; r=michal.novotny 14 years ago Geoff Brown [:gbrown] 34.65 KB, patch	gbrown : review+	Details \| Diff \| Splinter Review
incremental change to guard against crash when cache service is shutting down 14 years ago Geoff Brown [:gbrown] 1.32 KB, patch	michal : review+	Details \| Diff \| Splinter Review
modified 'make talos-remote' patch 14 years ago u408661 1.42 KB, patch		Details \| Diff \| Splinter Review
modified talos.config 14 years ago u408661 4.03 KB, text/plain		Details
modified tp4 manifest 14 years ago u408661 22.98 KB, text/plain		Details