Closed Bug 628998 Opened 14 years ago Closed 9 years ago

Firefox refuses to obey cache-control headers and refuses to cache static content deliberately

Categories

(Core :: Networking: Cache, defect)

defect
Not set
major

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: py.adriano, Unassigned)

Details

Attachments

(2 files)

User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.224 Safari/534.10
Build Identifier: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101206 Ubuntu/10.04 (lucid) Firefox/3.6.13

I'm a web developer, and I'm experiencing a cache issue with Firefox on every platform (Windows, Mac and Linux), while the other browsers I tested (IE6 on WinXP, Safari on Mac, and Chrome on Mac and Linux) are caching perfectly.

I verified the issue after explicitly setting the Cache-Control header to max-age=BIG_NUMBER_HERE and verified that the Expires header was pointing to a date very far in the future. Then I accessed the webpage and, using Firebug, verified that the header was indeed right, and I got a 200 for the given content. If I navigate the website (clicking links, not doing any sort of refresh or reload of the page), I can verify once again that the same request is made and a 200 response is retrieved, instead of the content just being fetched from cache. I checked the request headers, and they don't mention max-age=0.

I checked my about:config, and here is what I've got: http://i56.tinypic.com/ephtmf.png

Here is a sample of HTTP headers facing this sort of problem: http://i51.tinypic.com/25yu0ev.png and http://i55.tinypic.com/ndsyvm.png

I got both samples while navigating the website for the second time, using links inside the page. The same methodology applied to the other browsers mentioned, and they managed to cache perfectly.

If I go to my about:cache, this is what I see: http://i53.tinypic.com/2czeutw.png which is an exact match of the second content's header above. Even so, Firefox doesn't make use of this cache entry, because I can see the request through Firebug, and I can see the request arriving at the webserver. The same doesn't occur with the other web browsers mentioned.

Reproducible: Always

Steps to Reproduce:
1. Serve a webpage with the headers I demonstrated
2. Access the page
3. Navigate to that page again, and verify that it isn't using the content from cache

Actual Results:
Firefox isn't respecting the cache-control headers, and it ends up making unnecessary requests to the server, degrading the user experience by slowly loading content that is already in cache. Instead, Firefox sends a non-conditional GET request and gets a 200 every time.

Expected Results:
Firefox should grab the content from its cache and pop it on the screen without making any request to the server. In the worst acceptable case, it should send a conditional GET request, accept a 304 as the response, and then use the cache.

I tested Firefox with and without the BetterCache extension, and I could verify that BetterCache partially fixed the caching issue, by forcing Firefox to cache a lot of content that wasn't cached before. Even with this extension, Firefox isn't caching as much content as the other browsers cache without any extension. I even tested this behavior in a Firefox without Firebug, to make sure it isn't Firebug's fault, and I saw the same number of requests at my webserver.
Adriano, is there a public server that shows the problem that you could point me to? If not, can you please create an HTTP log following the instructions at https://developer.mozilla.org/en/HTTP_Logging while doing as few page loads as possible while still reproducing the problem? Then please attach that log to this bug using the "Add an attachment" link and mention which URI you expected to be cached that wasn't.
Oh, one more thing. The screenshot at http://i51.tinypic.com/25yu0ev.png shows us making a conditional request; the server is responding with a 200, not a 304... But again, an HTTP log would likely give us all the information (including the request sequence, and all the relevant headers) to figure out what's going on here.
I thought that a conditional request must send a Last-Modified header with the Last-Modified value the browser has from its cached content. Otherwise, how is the webserver supposed to know if the content was modified or not and return a 304? Perhaps I'm just missing something here. What is it in that request header that makes it conditional?
> I thought that a conditional request must send a Last-Modified header
There are two kinds of conditional requests in HTTP for our purposes (there's a third one, but it can't result in a 304). If all a client has to work with is a date, then it can send an If-Modified-Since header. The server should then send a 304 if the resource has not changed since that date. But if a client has an ETag (which if you look at those headers it does) for the resource then it can send an If-None-Match header with that ETag in it. The server should then send a 304 if the ETag it has matches one of the ETags in the If-None-Match header. If the ETags don't match, then the resources are different and the server must respond with a 200 and send the new data.
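The two kinds of conditional request described in that comment can be sketched from the server's side roughly as follows. This is only an illustration, not code from any real server: the function name `conditional_status` and the lowercase-keyed `request_headers` dict are assumptions made for the example, and weak ETags (`W/"..."`) and other edge cases are deliberately ignored.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(request_headers, etag, last_modified):
    """Return 304 when the client's validators still match, otherwise 200.

    request_headers: dict of header name -> value (names assumed lowercase).
    etag: the server's current ETag for the resource, e.g. '"v1"'.
    last_modified: timezone-aware datetime of the last change.
    """
    if_none_match = request_headers.get("if-none-match")
    if if_none_match is not None:
        # If-None-Match may list several ETags, or "*" meaning "any version".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304
        return 200

    if_modified_since = request_headers.get("if-modified-since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return 200  # Unparseable date: treat the request as unconditional.
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        if last_modified <= since:
            return 304

    # No validator was sent, or the resource changed: send the full body.
    return 200
```

A request that carries neither header always gets a 200 here, which is the "non-conditional GET" case from the original report.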
Thanks for clarifying this! I didn't know about these other headers. I just set up a server where you'll be able to reproduce this issue: http://adriano.burble.net You don't really need to log in to figure this out, but you can create an account if you want to. Please let me know if I can help you further with this testing.
OK, so how do I reproduce the bug? I tried loading the page, clicking the "log in" link, clicking the "about" link, etc, and things look correct here (some things are cached; some are not due to Vary headers the site sends, that sort of thing). What file are you expecting to be cached that isn't and what should I be loading, in what order, to reproduce the problem?
What headers are screwing with the cache? Chrome is caching everything but the HTML. Same behavior with the other browsers I tested. Everything that comes with a Cache-Control header of max-age=big_number should be cached.
> What headers are screwing with the cache?
The only one I saw in my testing was Vary (for the stylesheets). In particular, the cookies for the site kept changing as I loaded different pages, and the stylesheet responses have "Vary: cookie" in them. But again, I'm not sure which urls you think should be cached but are not, so it's hard to answer your question... Would be easier if you would answer mine.
As for "max-age=big_number", the stylesheets served by that site come with a 1-day max-age (and those Vary headers). None of the other responses had a max-age at all, though some of the HTML had that string in Set-Cookie headers, like so:
Set-Cookie: sessionid=1abdffa27116d9b30dfb9a70cca53538; expires=Wed, 09-Feb-2011 20:52:10 GMT; Max-Age=1209600; Path=/
It is supposed to cache all css, js and images. Don't you think that even with the 1-day max-age it should cache for at least one day?
Yes, except the server is sending a Vary header with that CSS, with "Cookie" listed as one of the values. And then it's setting different cookies on different pages. So since the Cookie header we send is different, we can't use the cache entry, because the server told us that the CSS it returns depends on the cookies (by putting "Cookie" in the Vary header). _Does_ your server return different CSS depending on the cookies? If not, why is it setting that Vary header?
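The effect of "Vary: Cookie" described in that comment can be pictured as the cache key growing to include the listed request headers. The sketch below is a simplification for illustration only: `cache_key` is a hypothetical helper, header names are assumed to be lowercase, and a real HTTP cache stores and compares entries in a more involved way.

```python
def cache_key(url, request_headers, vary):
    """Build a lookup key from the URL plus the request headers named in Vary.

    Returns None for "Vary: *", which means no stored response can be
    reused without asking the server.
    """
    names = sorted(name.strip().lower() for name in vary.split(",") if name.strip())
    if "*" in names:
        return None
    # Each header named in Vary contributes its request value to the key,
    # so a changed Cookie behaves like a different resource for caching.
    selected = tuple((name, request_headers.get(name, "")) for name in names)
    return (url, selected)
```

With `vary="Cookie"`, two requests for the same stylesheet that carry different `sessionid` cookies produce different keys, so the second one misses the cache; with an empty `vary` both map to the same key.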
No, it doesn't. Try this new address instead: http://adriano.burble.net:8080/ I think that perhaps apache is messing up with the headers. While reproducing the issue, I wasn't running through apache. What I did was to set a max-age for all css, js and image files to make sure they're all cached. So, once you get to that address, if you see any of those not being cached, then this is the issue I'm talking about. Then, if you test with chrome, you'll verify that they're all cached as expected.
I managed to remove the Vary headers from static files, and now Firefox is caching more aggressively, but still not caching everything. I just updated http://adriano.burble.net:8080 to help you verify this. Anyway, shouldn't the Cache-Control have a higher priority over the Vary header? I may be missing something here again, but if the server is saying that this content should be cached for some time, can't Firefox presume that this includes even the varying content? I know it isn't sane to send both, but perhaps it is more logical to have the Cache-Control take precedence over the Vary header. Besides, you may try to navigate further in the website using the user boris and passwd boris.
> Anyways, shouldn't the Cache-Control have a higher priority over the Vary
No. Please see the HTTP spec.
> if the server is saying that this content should be caches for some time,
> can't firefox presume that this includes even the varying content?
No. The time thing says that for that amount of time if you make the same request the response will be the same. The Vary says what makes a request not the same. If a header listed in Vary changed, that's equivalent to requesting a different URI altogether from the point of view of caching.
Maybe I'm not making myself clear. What I want from you is a list of step-by-step instructions (load uri X, click link Y, click link Z, whatever) and for each one a list of which files you think should be cached but aren't. Then I can investigate what the issue is. Telling me "oh, some stuff is not cached that should be, but I won't tell you which" is not particularly useful....
Ok, gotcha. I was raising these questions because I saw this working on other browsers, and I was trying to understand if this could be the issue. Perhaps I didn't make myself clear enough about what should be cached or not, and I'm sorry about this. If you look at my answer back on 2011-01-26 13:05:40 PST, I said that everything that comes with a Cache-Control set with a max-age should be cached. Later, on 2011-01-26 15:16:08 PST, I added that all css, js and images should be cached (these all come with the Cache-Control header I mentioned). To reproduce, open Firebug or a sniffer of your preference, then go to http://adriano.burble.net:8080 and look at all the files that Firefox is requesting from the server. Of those files, it should be caching all that fit the conditions I mentioned before. To verify whether Firefox is using the cache instead of making another request, navigate to the same url without using reload, refresh or anything else that would usually send a Cache-Control: max-age=0 to the server, and watch in your sniffer whether Firefox is making the same requests or not. It shouldn't fetch the css, js and img files that fit those conditions. When I follow the same steps with the browsers mentioned before, I can verify that they're not making new requests (not even conditional requests) for the files that fit the conditions.
Even after removing the Vary header completely, I keep seeing some files not being cached, like: jquery-ui-1.8.2.custom.min.js
Response Headers
Date: Thu, 27 Jan 2011 17:54:41 GMT
Server: WSGIServer/0.1 Python/2.6.5
Content-Language: en-us
Expires: Sat, 26 Feb 2011 17:54:41 GMT
Last-Modified: Sun, 28 Nov 2010 17:54:41 GMT
Content-Type: text/javascript
Request Headers
Host: localhost
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; pt-BR; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13
Accept: */*
Accept-Language: pt-br,pt;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: http://localhost/library/your_libraries/?base_context=library
Cookie: sessionid=6b1a93abea77a78234c02813f1f974ca; csrftoken=3408c2fac25e8e413b089e30f104d006; djdt=hide
Please let me know if you want me to provide more information, and thanks for helping with this issue.
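For reference, the freshness lifetime that response headers like the ones in that comment declare can be worked out with a short sketch. It follows the HTTP/1.1 rule that Cache-Control: max-age takes precedence over Expires; the helper name `freshness_lifetime` and the lowercase-keyed dict are assumptions made for this example, and heuristic freshness based on Last-Modified is not covered.

```python
import re
from email.utils import parsedate_to_datetime


def freshness_lifetime(response_headers):
    """Return the declared freshness lifetime in seconds, or None if absent.

    response_headers: dict of header name -> value (names assumed lowercase).
    """
    cache_control = response_headers.get("cache-control", "")
    # max-age wins over Expires when both are present.
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    if match:
        return int(match.group(1))

    expires = response_headers.get("expires")
    date = response_headers.get("date")
    if expires and date:
        delta = parsedate_to_datetime(expires) - parsedate_to_datetime(date)
        return max(0, int(delta.total_seconds()))
    return None


# The jquery-ui response quoted above: Expires is 30 days after Date.
sample = {
    "date": "Thu, 27 Jan 2011 17:54:41 GMT",
    "expires": "Sat, 26 Feb 2011 17:54:41 GMT",
    "last-modified": "Sun, 28 Nov 2010 17:54:41 GMT",
}
print(freshness_lifetime(sample))  # 2592000 seconds, i.e. 30 days
```

Note that the quoted response carries no Cache-Control header at all, so its lifetime comes from Expires minus Date.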
Thanks, that's the sort of information I was looking for. If I load that url, then focus the url bar and hit "enter", I see a single HTTP request for jquery-ui-1.8.2.custom.min.js during the whole process. On the second pageload, I see it loaded from cache.
Ok, but do you see it used from cache if it is called by the html that is loaded when you access http://adriano.burble.net:8080 ? Moreover, can you confirm that you see all files that fit that description being cached? I just reproduced this here once again, and I saw about half of the files being used from cache, and half being requested every time I navigate to the page mentioned.
> Ok, but do you see it using from cache if it is called by the html that is
> loaded when you access http://adriano.burble.net:8080 ?
Yes, that's exactly when I see it loading from cache.
> Moreover, can you confirm that you see all files that fits in that description
> being cached?
Yes. I see 2 HTTP requests for the toplevel page itself, which is served with no useful cache headers and a Date of "right now", and 2 HTTP requests for favicon.ico. I see only one HTTP request for every single other resource.
Note that I'm testing this with the current Firefox 4 beta, though. Are you testing with Firefox 3.6?
No, I'm not using Firefox 4 beta. I'm using the one I mention in this issue description. It is the first comment, in the User Agent section. Do you have a firefox 3.6 for testing?
Of course. ;) I did just try that; I see the same behavior as in comment 17.
That's weird. I'm trying the same page using firefox in different machines (a Mac, a Linux and a Windows), and also got two other people to test it and we consistently got the same behavior. Could it be some sort of configuration at your side that is different than ours? Is there anything else that firefox considers while deciding on whether to use data from cache or not?
Adriano, did you try this with a new profile? It may be that the cache in the profile you are using has stale data-entries (yes, there are a number of issues with this...) from the old versions of your test.
Adriano, I created a brand-new profile to test with.....
> Is there anything else that firefox considers while deciding on whether to use
> data from cache or not?
Other than the headers and the state of the cache, no? But the headers can be affected by add-ons, etc, and the state of the cache might depend on previous browsing behavior, of course.
I asked another friend (who doesn't work at the same place I do, and hasn't accessed this site previously) and also created a new profile. With both, we managed to get everything cached as expected. What could be causing this then? The media directory changes name every time the server restarts, so why is the old cache (pointing to a different location) affecting the way new content is fetched/cached? Is there something I can do with my headers to overcome this issue?
> What could be causing this then? I don't know. What's different between the two profiles?
(In reply to comment #23)
> I asked another friend (who doesn't work at the same place I do, and haven't
> accessed this site previously) and also created a new profile. With both, we
> managed to get everything cached as expected.
Meaning that it works now..? :)
> What could be causing this then? The media directory changes name everytime the
> server restarts, so why is old cache (pointing to a different location) is
> affecting the way new content is fetched/cached?
Does the URL used by the browser change also (because this is what the browser uses to generate the cache-key)?
> Is there something I can do with my headers to overcome this issue?
Not quite sure which issue you mean... However, it seems to me like you have managed to tune various headers (with substantial help from bz) to do what you want, no? If somebody has stale/bad entries in their cache, they will experience suboptimal performance for some time, until the cache is cleared or those entries are evicted. I think this is inevitable and not a big issue.
Hm... That's weird, because I tried the following:
1 - My Firefox on a Mac, with my cache, profile, etc., and it didn't cache. Then I cleared the cache, and it still wasn't caching anything.
2 - My Firefox on Linux, same test as before; I tried to clear my cache several times and it didn't work.
3 - A clean Firefox on a Win XP machine. I didn't clear the cache, but I had never accessed the site from there before. It didn't work either.
On the Windows machine there is no extension installed. On the Mac and Linux, there are Firebug, YSlow and Live HTTP Headers.
The url always changes. The media location in the filesystem is always the same; it is only the url that keeps changing. E.g. /site_media/static_1101251233/ changes to something like /site_media/static_1101261435
I agree that I can mess up Firefox, and that doesn't mean that Firefox has an issue. But what keeps me thinking is why other browsers were doing fine with the cache while Firefox wasn't. Besides, what would be your recommendation to avoid the stale cache? Should I always set a max-stale header, or something else?
Uhh... are you saying that this doesn't work anyway? Were any of the tests from comment #26 using the same profile which apparently worked in comment #23?
No, I was talking about what I tried before and that wasn't successful at all.
Could you help me understand how my Firefox profile could get into a state that would stop it caching correctly? Creating a new profile seemed to resolve this issue for me, but I'd like to know how this happened. It won't be acceptable for my users to keep having to create new profiles, so I'd like to understand if this was a one-off issue or if this is something that might recur.
It might be a good start to answer bz's question in comment #24. If clearing the caches doesn't work for you, something else must be different.
In the old profile I had:
* Aptana Debugger
* ColorZilla
* Email This!
* Firebug
* FireLogger
* Flash Video Downloader
* Header Spy
* HttpFox
* Image Zoom
* Live HTTP Headers
* Measure it
* Modify Headers
* SenSEO
* SQLite Manager
* Web Developer
* Xmarks
* YSlow
In the new profile, I didn't have any extension installed. It was brand new. Two other colleagues experienced the same behavior, but they probably didn't have all those extensions. Is that what you wanted to know, or am I missing something that could help you solve this issue?
I've been aggressively working on my site's cache and came across this issue. Here is the timeline of events, status and intended behavior...

1.) HTTP 200: With a completely empty browser cache, load the page. (Good)
2.) No HTTP Request: Make a second request, or numerous requests, before the 15 seconds are up; the browser should ONLY load from the cache unless it's force-reloaded or the cache is intentionally cleared by the user. (Good)
3.) HTTP 304: The header('Cache-Control: max-age=15'); has expired after 15 seconds, so the browser should check if there is a newer version of the file on the server; it makes an HTTP request and receives an HTTP 304 response since the file has obviously not been modified. (Good)
4.) No HTTP Request: The Cache-Control header rule goes back into effect for 15 seconds; until this time passes, no number of requests will be made via HTTP, and only the copy in the local browser cache will be loaded. (Good)
5.) HTTP 304: The header('Cache-Control: max-age=15'); has expired after 15 seconds for a second time. We've now established the desired pattern in the *CURRENT* time frame. (Good)
6.) Set your computer's clock ahead by one day.
7.) HTTP 304: 24 hours is long past when the header('Cache-Control: max-age=15'); has expired, so this is the correct/desired action. (Good)
8.) HTTP 304: The header('Cache-Control: max-age=15'); should kick back into effect but is incorrectly ignored; Firefox incorrectly makes an HTTP request. The server for its part correctly responds with HTTP 304, though it should not have received the request to begin with. (Bad)
9.) HTTP 304: less than or greater than Cache-Control header. (Bad)
10.) HTTP 304: less than or greater than Cache-Control header. (Bad)
11.) HTTP 304: less than or greater than Cache-Control header. (Bad)
etc

------------------------------

This is how I tested this out. Since I modify scripts and style sheets frequently, my goal was to ensure that visitors would receive a fresh copy (in the live environment the cache-control is set to expire once every few minutes). That check would usually result in an HTTP 304, but most of the time (for example, when visiting five or more pages during a single session) the browser should ONLY load the file from the browser's cache.

I also tested this in IE8 and Opera 11, both of which correctly did NOT make the requests on EVERY normal load after setting the clock forward (and did seek a fresh copy when the cache-control expired after 15 seconds, as in my example). I tried to test Chrome and Safari, however they don't have go buttons. I suppose I could have made an anchor in the HTML.

I actually was testing this on a JavaScript file. A casual check shows that I am not modifying cache rules via .htaccess files at any level for the extensions I tested (.js and .php). I also tested this on my computer locally (localhost) and live on my site. I tested this both in Firefox 3.6.15 and Firefox 4 Beta 12.

I am *not* sure if I leave the clock alone and simply let a day pass if the issue will still occur.

I tested a different copy of Firefox, closed the browser out completely, set the clock ahead a day, loaded up Firefox and saw the same result. I had seen people suggesting that the browser might be holding on to previous headers for some reason.

Let me know if there is anything else I can do to help out. I'll be happy to set up a temporary test case on my live server if need be. Oh, CRITICAL to testing was Fiddler2, as I was checking the headers that way. Hope this helps!
> 6.) Set your computer's clock ahead by one day.
Once you do that, your computer and the server no longer agree on the time, and "has time X passed?" calculations suddenly depend on whose time you're measuring from. It looks like RFC 2616 has an algorithm for trying to handle this situation and we just don't implement it. That's a bug, but it's not THIS bug unless Adriano is also purposefully screwing up his clock. I filed bug 640445 to cover this issue.
> I am *not* sure if I leave the clock alone and simply let a day pass if the
> issue will still occur.
It shouldn't.
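For readers who want to see what an age calculation of this kind looks like, here is a minimal sketch of the one given in RFC 2616 section 13.2.3, with the freshness test from section 13.2.4. The function and parameter names are chosen for this example, all times are plain seconds (for instance Unix timestamps), and nothing here is taken from Firefox's code or claims to show what Firefox actually computes.

```python
def current_age(date_value, age_value, request_time, response_time, now):
    """Estimate a cached response's age in seconds (RFC 2616, 13.2.3).

    date_value:    the server's Date header, as seconds (server clock)
    age_value:     the Age header, or 0 if absent
    request_time:  local time when the request was sent
    response_time: local time when the response was received
    now:           current local time
    """
    # Age implied by comparing the local receive time with the server's Date.
    apparent_age = max(0, response_time - date_value)
    corrected_received_age = max(apparent_age, age_value)
    # Time the response spent in transit counts toward its age.
    response_delay = response_time - request_time
    corrected_initial_age = corrected_received_age + response_delay
    # Time the entry has been sitting in the local cache.
    resident_time = now - response_time
    return corrected_initial_age + resident_time


def is_fresh(freshness_lifetime, age):
    """A response is fresh while its lifetime exceeds its current age."""
    return freshness_lifetime > age
```

With agreeing clocks and max-age=15, a response dated 1000, requested at 1000 and received at 1001 has an age of 11 at local time 1010 and is still fresh, and an age of 21 at 1020, at which point it needs revalidation.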
No, I wasn't tweaking my clock. That wasn't necessary to reproduce the issue. John, do you think you can reproduce that without screwing the clock?
Here is the temporary file I was using yesterday for testing (this will be deleted once you guys have had enough time to test so I'll just leave it up for a few days or so)... (with www) jabcreations.com/scripts/test.js I loaded it this morning (maybe 12 hours later) and the first request was to my cache, then a HTTP 304 though it wasn't a constant 304 per request...it was pulling from my cache locally as I would hope.
Requires PHP-enabled www server.
I have this (or a very similar) problem on Firefox 5, on Linux. I did a bunch of testing, and it seems to boil down to the `Vary' header being mishandled by the HTTP cache. Testcase attached in the comment above (sorry, no experience with bugzilla :P)

The testcase makes five HTTP requests to the same URI, differing only in the `X-Nnn' header. The www server returns different content depending on the X-Nnn header. This is indicated by the `Vary' header. It's best to observe the HTTP communication via Firebug or similar.

Actual result:
response: X-Nnn: text/foo
response: X-Nnn: text/bar
response: X-Nnn: text/bar
response: X-Nnn: text/bar
response: X-Nnn: text/bar

The first two requests are performed OK, but subsequent ones are served from cache with invalid content -- disobeying the Vary header.

Expected result:
first open (or Ctrl+F5) of the test page:
response: X-Nnn: text/foo
response: X-Nnn: text/bar
response: X-Nnn: text/frob
response: X-Nnn: text/knob
response: X-Nnn: text/q-p

subsequent open of the test page -- same output, but served from HTTP cache:
response: X-Nnn: text/foo
response: X-Nnn: text/bar
response: X-Nnn: text/frob
response: X-Nnn: text/knob
response: X-Nnn: text/q-p
dexen, that seems unrelated to this bug. Could you please file a separate bug on that issue?
Could you try this with a recent nightly? I can not reproduce this with my local build, nor with a downloaded FF5. Hence, I guess I'm doing something wrong... (Your script loads the uri "/foobar" which I changed to make the script load itself - is this correct?)
Status: UNCONFIRMED → RESOLVED
Closed: 9 years ago
Resolution: --- → INCOMPLETE