Closed Bug 327790 Opened 18 years ago Closed 16 years ago

buggy behavior with multiple cache-control headers (was: no way to stop a page from being cached with bfcache)

Categories

(Core :: DOM: Navigation, defect)

1.8 Branch
x86
Windows 2000
defect
Not set
major

RESOLVED INVALID

People

(Reporter: WOLF_THUNDER, Unassigned)

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.1) Gecko/20060111 Firefox/1.5.0.1
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.1) Gecko/20060111 Firefox/1.5.0.1

I realize this is similar to reports filed with Bug # 314422 and 272857, but those have yet to be fully addressed and have been "written off"...

I have php headers writing the following:
header("Expires: Mon, 26 Jul 1997 05:00:00 GMT"); // Date in the past
header("Last-Modified: " . gmdate( "D, j M Y H:i:s", time() ) . " GMT" ); 
header("cache-control: no-store, no-cache, must-revalidate"); 
header("cache-control: no-store");
header("cache-control: no-cache");
header("Pragma: no-cache"); 

I also have META headers as such:
<META HTTP-EQUIV="Expires" CONTENT="Fri, Jun 12 1981 08:20:00 GMT"> 
<META HTTP-EQUIV="Pragma" CONTENT="no-cache"> 
<META HTTP-EQUIV="cache-control" CONTENT="no-cache"> 
<META HTTP-EQUIV="cache-control" CONTENT="no-store"> 

AND I am using onunload to fire an event, AND have onpageshow="if (event.persisted) document.getElementById('resubmit_it').submit()"

The page continually shows a cached version if I hit the back button, every time.  IE always reloads with just the very first 2 php headers only (Expires and Last Modified).

I have tried every suggestion, including session_start(), modifying the ETag to be "time()", etc., and FF 1.5.0.1 always shows a cached page on a back button hit. It is ignoring event.persisted, ignoring the fact that an onunload handler was called, and ignoring the headers.

This here (http://developer.mozilla.org/en/docs/Using_Firefox_1.5_caching) is complete rubbish, as none of it works...

As the other person posted about medical records, if a site owner does not want a page cached, it should not be this hard to prevent caching, and it should at least work when the steps that supposedly prevent caching are followed.


Reproducible: Always

Steps to Reproduce:
1. build a page with php headers and meta tags indicating no-cache, no-store, a date in the past, etc.
2. use the onunload attribute
3. use the onpageshow attribute with the event.persisted check
Actual Results:  
dynamic php page (with timestamp) always comes out of cache on a browser back button click....  no matter what...  timestamp does not change...

Expected Results:  
a new page would be loaded with a new, and current time stamp.

theme... brushed

my browser cache option in about:config is set to 3 (automatically), yet it is acting like never....
http://developer.mozilla.org/en/docs/Using_Firefox_1.5_caching is a bit confusing in that it doesn't differentiate between things that prevent both http caching and bfcaching (such as cache-control: no-store) and things that only prevent bfcaching (such as having an onunload handler).  A simple way to determine whether your page was bfcached is to include this in the HTML:

<script>document.write(Math.random())</script>

That said, you're doing at least one thing that should prevent all caching (using cache-control: no-store) and I don't know why that wouldn't work for your site.

You're doing things a little strangely (sending multiple cache-control headers and using all-lowercase for the header name), and you might try seeing if doing those things less strangely helps (despite the HTTP spec saying they should be equivalent).

Can you set up a test site and post the URL here?  I'd rather not have to teach myself PHP just to test whether I can reproduce the bug you're describing.
Assignee: nobody → darin
Component: Security → Networking: Cache
Product: Firefox → Core
QA Contact: firefox → networking.cache
Version: unspecified → 1.8 Branch
jesse: bfcache is not part of necko
Assignee: darin → adamlock
Component: Networking: Cache → Embedding: Docshell
QA Contact: networking.cache → adamlock
I believe that Bug 215405 is just the opposite problem. The browser should cache and it's not doing it.
-> bryner
Assignee: adamlock → bryner
(In reply to comment #1)
I agree the headers are a bit strange.  I keep trying new things to find something that worked, but nothing did and I posted the latest attempt here.  I tried with upper case and lower case.

I am using a php timestamp (versus the javascript math random) that is created on the fly when the page is called, and it shows a new time with IE every time with the php headers, but not with FF.  The javascript may or may not work, but if it does, it just means it is re-executing the cached script when the page is returned to, but not for sure calling the page off the server again, as it should.  So the php echo of the current time is a sure fire way to show if the page is called from the server each time, as that is the only way the time will change on the page.

If whoever is ultimately in charge of this needs me to set up a test page with php headers, I will be happy to.  Please advise.
The JavaScript code in comment 1 would tell you whether it's being cached in bfcache or the http cache, which would in turn help determine what's going wrong.
(In reply to comment #0)
> header("cache-control: no-store, no-cache, must-revalidate"); 
> header("cache-control: no-store");
> header("cache-control: no-cache");

> <META HTTP-EQUIV="cache-control" CONTENT="no-cache"> 
> <META HTTP-EQUIV="cache-control" CONTENT="no-store"> 

WOLF_THUNDER(bug opener):

(Q1) Do you know what the HTTP 1.1 spec says about multiple Cache-Control: headers?
    Is the last one effective? Or the first one? Or are they merged?
    (Sorry, but I don't know.)

(Q2) A clarifying question.
    Are the header("cache-control: xxx") calls correctly issued before all echo
    calls?
    (I believe yes, since if this is not done, these headers are displayed as
    a text string in the output HTML.)

(Q3)What HTTP headers("Cache-Control:") are REALLY sent from server to client?

(Q4) Firefox correctly uses <meta http-equiv> as a complement when the HTTP
    header is not sent by the server, as the RFC defines.
    But IE possibly supersedes the HTTP header that was really sent with <meta http-equiv>.
    Do you know which "Cache-Control:" IE uses when multiple Cache-Control:
    headers and multiple <meta http-equiv="Cache-Control"> exist?
  
Since the problem relates to HTTP and caching, an HTTP protocol log is required to know the real HTTP flow.
See http://www.mozilla.org/projects/netlib/http/http-debugging.html, get an HTTP protocol log, and attach the log to this bug (mime-type=text/plain).
  (To see the HTTP header flow only, LiveHTTPHeaders is also usable.)
  (see http://livehttpheaders.mozdev.org/index.html )
about:cache (both memory cache and disk cache) for the related URI (the URI of the page) is also required.
Also attach the HTML generated by your PHP script through the "Create a New Attachment" link. ("What <meta>s are REALLY in the HTML" is very important.)

Get an HTTP protocol log for at least the next four cases.
 (1) Shift+Reload (request with no-cache)
 (2) Reload       (request with If-Modified-Since usually)
 (3) Normal access by link click or bookmark (From cache, if not expired)
 (4) Back button
And see about:cache for the related URI after each step.
Including a script to know "whether it's being cached in bfcache or the http cache" in the HTML, as mentioned by Jesse Ruderman in Comment #6, is recommended, in addition to your "php echo of the current time".
(The echoed time shows whether an HTTP GET is issued or not, and the script shows whether the normal load process is invoked or not. A "loaded, but no HTTP GET issued = read from cache" case exists. In this case, the script is executed.)
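
The two markers described above can be combined on one probe page. The following is only a sketch: the function names `render_probe_page` and `classify_back_navigation` are hypothetical, and the three outcomes simply restate the reasoning in this comment (server time changes only on a real HTTP GET; the script value changes whenever the normal load process runs).

```php
<?php
// Hypothetical helpers, not code from this bug report.

/** Render a probe page carrying a server-side and a client-side marker. */
function render_probe_page(int $serverTime): string
{
    // Server-side marker: changes only when the server is actually asked again.
    $stamp = gmdate('D, d M Y H:i:s', $serverTime) . ' GMT';

    // Client-side marker: changes whenever the page goes through a normal load,
    // even if the markup itself came out of the HTTP cache.
    return "<html><body>\n"
        . "<p>Server time: " . htmlspecialchars($stamp) . "</p>\n"
        . "<p>Script value: <script>document.write(Math.random())</script></p>\n"
        . "</body></html>\n";
}

/** Interpret which markers changed after pressing the Back button. */
function classify_back_navigation(bool $serverTimeChanged, bool $scriptValueChanged): string
{
    if ($serverTimeChanged) {
        return 'refetched from server';
    }
    if ($scriptValueChanged) {
        return 'reloaded from HTTP cache';
    }
    return 'restored from bfcache';
}
```

A test page would call `echo render_probe_page(time());` after sending its headers; comparing the two values before and after pressing Back then gives the inputs for `classify_back_navigation()`.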
(In reply to comment #7)
> (Q1) Do you know what the HTTP 1.1 spec says about multiple Cache-Control: headers?
>     Is the last one effective? Or the first one? Or are they merged?

I was not sure either, I just reordered them in my php and now it is working on FF, but it worked before the reorder on IE. (and still does on IE)  I still have the meta tags as well.

Here is the new header order:
// HTTP/1.1 
header("Cache-Control: no-store, no-cache, must-revalidate"); 
// Date in the past
header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
// HTTP/1.0 
header("Pragma: no-cache"); 
// always modified 
header("Last-Modified: " . gmdate( "D, j M Y H:i:s", time() ) . " GMT" ); 


> (Q2) A clarifying question.
>     Are the header("cache-control: xxx") calls correctly issued before all echo
>     calls?

Yes, or the page fails.  php dies.

> (Q3)What HTTP headers("Cache-Control:") are REALLY sent from server to client?
Good question, but whatever is on the php page is supposed to override apache for that page.

> (Q4) Firefox correctly uses <meta http-equiv> as a complement when the HTTP
>     header is not sent by the server, as the RFC defines.
>     But IE possibly supersedes the HTTP header that was really sent with <meta http-equiv>.
>     Do you know which "Cache-Control:" IE uses when multiple Cache-Control:
>     headers and multiple <meta http-equiv="Cache-Control"> exist?

I do not.
This is a very real, repeatable, serious bug. Unless the headers are sent in exactly the order listed above (cache-control, expires, pragma), Mozilla caches the page. Actually it caches certain things about the page and forgets others. Please work on this!!!
I have also run into this bug and it's causing us a significant problem, as we depend on the back button but cannot rely on caching because some content is dynamically loaded at page load.

I tried using two test pages (see below) with the headers in the order described above, and with FF 2.0.0.9 was unable to prevent caching. IMHO this is a serious  issue, and I'm very concerned because we are supposed to deploy to our client in a matter of weeks and this will significantly affect the end user's experience.

Here are my two test pages:

Test Page 1:
<html xmlns="http://www.w3.org/1999/xhtml">
   <head>
      <meta http-equiv="Cache-control" content="no-store, no-cache, must-revalidate" />
      <meta http-equiv="Expires" content="Mon, 01 Jan 1990 00:00:00 GMT" />
      <meta http-equiv="Pragma" content="no-cache" />
      <meta http-equiv="Last-Modified" content="Tue, 13 Nov 2007 01:03:33 GMT" />
      <title>Untitled Document 3</title>
   </head>

   <body>
      <img src="http://www.google.ca/images/firefox/title.gif" />
      <a href="test2.html">LINK</a>
</body>
</html>


Test Page 2:
<html xmlns="http://www.w3.org/1999/xhtml">
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
      <title>Untitled Document</title>
      <meta http-equiv="expires" content="Mon, 01 Jan 1990 00:00:00 GMT" />
      <meta http-equiv="Cache-control" content="no-cache" />
      <meta http-equiv="Cache-control" content="no-store" />
      <meta http-equiv="Cache-control" content="must-revalidate" />
      <meta http-equiv="pragma" content="no-cache" />
   </head>

   <body>
      <a href="javascript:history.back();">GO BACK!!!</a>
   </body>
</html>

Clicking the link from page 1 to page 2 and then hitting the back button causes no interaction with the server.

If I have done something incorrectly, I would appreciate a correction.

Thanks very much,

Mark

Cache-control has to be sent as an http header, not a <meta http-equiv>.  See bug 202896.

If you just want to turn off bfcache and allow normal caching of the markup, a better solution is to add an onunload handler.  And even better is to hook into the pageshow and pagehide events.  See http://developer.mozilla.org/en/docs/Using_Firefox_1.5_caching.
Ugh, too bad about FF ignoring the no-cache, no-store in meta tags.

Thanks very much for the information, much appreciated.
Reassigning my bugs, since I'm not actually working on them.
Assignee: bryner → nobody
QA Contact: adamlock → docshell
Summary: no way to stop a page from being cached with bfcache → buggy behavior with multiple cache-control headers (was: no way to stop a page from being cached with bfcache)
So isn't this just a duplicate of bug 202896?
Depends on: 202896
Oh, that's just the META thing.

Can someone please attach an HTTP log showing what the server is actually sending in this case?

Better yet, can someone put up a testcase that shows the problem?
The most recent dup has a PHP testcase that shows the problem.
That doesn't help me any, since I don't have access to a web server that would run php.
I've uploaded that sample code to http://www.kelahn.com/cachetest.php

I can't confirm that it works at the moment, though.  Can someone else confirm it works?
Hi Will, can you please cut and paste the code of the cachetest.php file? Thanks!
<?php
    session_start();
    header('Pragma: no-cache');
    header("Expires: Sat, 26 Jul 1997 05:00:00 GMT");                  // Date in the past
    header('Last-Modified: '.gmdate('D, d M Y H:i:s') . ' GMT');
    header('Cache-Control: no-store, no-cache, must-revalidate');     // HTTP/1.1
    header('Cache-Control: pre-check=0, post-check=0, max-age=0');    // HTTP/1.1
    if (!isset($_SESSION['test'])) {
        echo "<a href='cachetest.php'>First Run</a>";
        $_SESSION['test'] = TRUE;
    } else {
        echo "Second Run";
    }
?>


This was copied from bug 468046, which is a duplicate of this bug that Jesse Ruderman was talking about in comment #17.  The only change I made was to fix a comment that ran to another line.  (The 'Date in the past' line.)

https://bugzilla.mozilla.org/show_bug.cgi?id=468046
I created this new test case:
----------------------------------
<?php
    session_start();
    header('Pragma: no-cache');
    header("Expires: Sat, 26 Jul 1997 05:00:00 GMT");
    header('Last-Modified: '.gmdate('D, d M Y H:i:s') . ' GMT');
    header('Cache-Control: no-store, no-cache, must-revalidate');
    header('Cache-Control: pre-check=0, post-check=0, max-age=0');
    if (!isset($_SESSION['test'])) $_SESSION['test'] = 0;
    echo "<a href='index.php'>" . $_SESSION['test']++ . "</a>";
----------------------------------

Instead of reloading, click on the link a few times, then click the Firefox back button: the page always gets cached.

but if you use this code
---------------------------------
<?php
    session_start();
    header("Cache-Control: no-store, no-cache, must-revalidate"); 
    header("Expires: Sat, 26 Jul 1997 05:00:00 GMT");
    header("Pragma: no-cache"); 
    header('Last-Modified: '.gmdate('D, d M Y H:i:s') . ' GMT');

    if (!isset($_SESSION['test'])) $_SESSION['test'] = 0;
    echo "<a href='index.php'>" . $_SESSION['test']++ . "</a>";
----------------------------------
it seems to work fine
Again, I'd love URIs here, not code that I have no useful way to run.  I'm happy to debug what we're doing, given a pair of URIs that run the PHP code from comment 22, say.
(In reply to comment #19)
> I've uploaded that sample code to http://www.kelahn.com/cachetest.php

The following are the HTTP headers sent by your page (at step [1] in the attached log, when cache and cookies are cleared), obtained with the LiveHTTPHeaders extension.
William Crawford, how could Firefox treat this as a no-cache or no-store response?

> HTTP/1.x 200 OK
> Date: Sun, 07 Dec 2008 07:41:23 GMT
> Server: Apache
> Set-Cookie: PHPSESSID=6b1c7b4f53aedb9a1bea00e4b0371320; path=/
> Expires: Sat, 26 Jul 1997 05:00:00 GMT
> Cache-Control: pre-check=0, post-check=0, max-age=0
> Pragma: no-cache
> Last-Modified: Sun, 07 Dec 2008 07:41:23 GMT
> Content-Type: text/html
> Connection: close

The PHP manual says the following.
> http://jp.php.net/manual/en/function.header.php
> header — Send a raw HTTP header
>  Description
>  void header ( string $string [, bool $replace [, int $http_response_code ]] )
>(snip)
> Parameters
>(snip)
> replace
>  The optional replace parameter indicates whether the header should replace
>  a previous similar header, or add a second header of the same type.
>  By default it will replace, but if you pass in FALSE as the second argument
>  you can force multiple headers of the same type.

To William Crawford:
I don't know whether there is a PHP option (php.ini setting) that alters the default of header()'s "replace" parameter, but I think the above explains your case.

By the way, I was also convinced that multiple headers are generated by PHP by default. I wasn't aware of "default==replace" until you presented both a good case and a bad case.
Correction. Sorry for the spam. I wanted to say:
> I wasn't aware of "default==replace"
> until Fabrizio Balliano presented both a good case and a bad case in Comment #22.
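
To make the "default==replace" point concrete, here is a minimal sketch that models the behavior quoted from the PHP manual. `add_header()` and `header_name()` are hypothetical stand-ins written for illustration, not PHP internals; the header values are the ones from the test case in this bug.

```php
<?php
// Sketch only: models header()'s documented replace semantics on a plain array.

/** Lowercased header name, since HTTP header names are case-insensitive. */
function header_name(string $line): string
{
    return strtolower(trim(explode(':', $line, 2)[0]));
}

/** Add a header line the way header($line, $replace) is documented to behave. */
function add_header(array $headers, string $line, bool $replace = true): array
{
    if ($replace) {
        $name = header_name($line);
        // Drop every earlier header with the same name.
        $headers = array_values(array_filter(
            $headers,
            function (string $existing) use ($name): bool {
                return header_name($existing) !== $name;
            }
        ));
    }
    $headers[] = $line;
    return $headers;
}

// Default behavior: the second Cache-Control line replaces the first,
// so no-store/no-cache never reach the client.
$replaced = add_header([], 'Cache-Control: no-store, no-cache, must-revalidate');
$replaced = add_header($replaced, 'Cache-Control: pre-check=0, post-check=0, max-age=0');

// With replace = false both lines are kept.
$kept = add_header([], 'Cache-Control: no-store, no-cache, must-revalidate', false);
$kept = add_header($kept, 'Cache-Control: pre-check=0, post-check=0, max-age=0', false);
```

In a real page the equivalent would be passing `false` as the second argument of `header()`, or sending all directives in a single `header('Cache-Control: ...')` call, which is what the working snippet in comment #22 does.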
I uploaded the 2 samples above:
http://crealabs.it/bfcache-working.php (the second snippet)
http://crealabs.it/bfcache-notworking.php (the first snippet, with only 1 cache-control header instruction, I removed the second one to avoid the "php replace" behavior)

Actually both of them seem to work now (working on Sunday morning doesn't help).
(In reply to comment #26)
> I uploaded the 2 samples above:

To Fabrizio Balliano:
Thanks for your quick action. 
Can you create cases with 3 or more Cache-Control: headers, using replace==FALSE?
> Case-3: Multiple Cache-Control: headers - A (no-cache etc. on the first header)
> Case-4: Multiple Cache-Control: headers - B (no-cache etc. on the middle header)
> Case-5: Multiple Cache-Control: headers - C (no-cache etc. on the last header)
(In order to check whether the "buggy behavior with multiple cache-control headers" of Firefox in the bug summary really exists or not.)
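
One possible shape for those three cases, as a sketch. The helper names and the filler directives (`pre-check=0`, `post-check=0`) are assumptions chosen for illustration; only the `false` second argument to `header()` comes from the PHP manual text quoted earlier in this bug.

```php
<?php
// Hypothetical test-page helpers for Case-3 / Case-4 / Case-5.

/** Cache-Control header lines for a case: 'first', 'middle' or 'last'. */
function cache_control_case(string $position): array
{
    $strict  = 'Cache-Control: no-store, no-cache, must-revalidate';
    $fillerA = 'Cache-Control: pre-check=0';   // filler value, an assumption
    $fillerB = 'Cache-Control: post-check=0';  // filler value, an assumption

    switch ($position) {
        case 'first':
            return [$strict, $fillerA, $fillerB];
        case 'middle':
            return [$fillerA, $strict, $fillerB];
        case 'last':
            return [$fillerA, $fillerB, $strict];
        default:
            throw new InvalidArgumentException("unknown case: $position");
    }
}

/** Send each line as its own header; false keeps earlier Cache-Control lines. */
function send_cache_control_case(string $position): void
{
    foreach (cache_control_case($position) as $line) {
        header($line, false);
    }
}
```

A page for each case would call `send_cache_control_case('first')` (or `'middle'`, `'last'`) before producing any output, and the HTTP log would then confirm that three separate Cache-Control headers were really sent.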
FYI.
Regarding the presence of "Pragma: no-cache" in the response header sent by your server/page:

HTTP 1.1 says the following.
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.32
> 14.32 Pragma
>(snip)
> HTTP/1.1 caches SHOULD treat "Pragma: no-cache" as if the client had sent
> "Cache-Control: no-cache". No new Pragma directives will be defined in HTTP.
>     Note: because the meaning of "Pragma: no-cache as a response
>     header field is not actually specified, it does not provide a
>     reliable replacement for "Cache-Control: no-cache" in a response

It looks like Firefox doesn't treat "Pragma: no-cache" in a response header as a "replacement for Cache-Control: no-cache", and that is not an RFC violation.
However, IE possibly treats "Pragma: no-cache" in a response header as a "replacement for Cache-Control: no-cache" in a response header.
I have copied the code samples from comment #22 into http://william.is-a-geek.com/~william/cachetest2.php and http://william.is-a-geek.com/~william/cachetest3.php respectively.

I am unable to make the first one show the bad behavior, though.  I've tried it with my hosting company (the URLs above) and on my local machine with the same results.  I post this here so anyone else can try them and report if it breaks for them.
OK.  It looks to me like this bug is invalid, due to people not understanding PHP's header() behavior and therefore not sending the headers they meant to send.

"Pragma: no-cache" is equivalent to "Cache-control: no-cache" as far as we're concerned.  We disable bfcache for no-store responses or SSL no-cache responses.

If you look in the HTTP 1.1 specification (RFC 2616), here's what it has to say about caching and history mechanisms (section 13.12):

   History mechanisms and caches are different. In particular history
   mechanisms SHOULD NOT try to show a semantically transparent view of
   the current state of a resource. Rather, a history mechanism is meant
   to show exactly what the user saw at the time when the resource was
   retrieved.

And further in section 14.9.2 (when talking about no-store):

   History buffers MAY store such responses as part of their normal operation.

Now in practice, it turns out that following the above SHOULD NOT and MAY causes some issues, so we give sites a way to opt out of history (or more precisely opt out of form state restoration and bfcache) by not doing the MAY and doing the "SHOULD NOT" in the case of no-store and SSL no-cache.

But the testcases in this bug are all non-SSL no-cache, and we're just following the SHOULD recommendation of the RFC to show exactly what the user last saw.
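
The rule stated above can be written down as a small predicate. This is only a model of the rule as described in this comment, not Gecko's actual code; the function names are hypothetical, and merging the directives of several Cache-Control headers into one list is an assumption.

```php
<?php
// Sketch of the described rule: bfcache is skipped for no-store responses
// and for no-cache responses served over SSL.

/** Lowercased directive names from one Cache-Control header value. */
function cache_directives(string $value): array
{
    $names = [];
    foreach (explode(',', $value) as $part) {
        // Keep only the directive name, e.g. "max-age" from "max-age=0".
        $name = strtolower(trim(explode('=', $part, 2)[0]));
        if ($name !== '') {
            $names[] = $name;
        }
    }
    return $names;
}

/** True when the response would opt out of bfcache under the rule above. */
function skips_bfcache(array $cacheControlValues, bool $pragmaNoCache, bool $isSsl): bool
{
    $all = [];
    foreach ($cacheControlValues as $value) {
        $all = array_merge($all, cache_directives($value));
    }
    if (in_array('no-store', $all, true)) {
        return true;
    }
    // "Pragma: no-cache" is treated like "Cache-Control: no-cache".
    $noCache = $pragmaNoCache || in_array('no-cache', $all, true);
    return $isSsl && $noCache;
}
```

Applied to the response logged earlier in this bug (`Cache-Control: pre-check=0, post-check=0, max-age=0` plus `Pragma: no-cache`, served without SSL), `skips_bfcache()` returns false, which matches the page being restored on Back.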
Status: UNCONFIRMED → RESOLVED
Closed: 16 years ago
Resolution: --- → INVALID