cache expiration problems with blog sites? (14 years of Heuristic Expiration instead of considering as "expired", if "Expires: -1" is returned)

RESOLVED FIXED in mozilla1.9alpha1

Status

Core
Networking: Cache
RESOLVED FIXED
11 years ago
10 years ago

People

(Reporter: chris hofmann, Assigned: Darin Fisher)

Tracking

({fixed1.8.1.1})

Trunk
mozilla1.9alpha1
x86
Windows XP
fixed1.8.1.1
Points:
---
Bug Flags:
blocking1.8.1 -
blocking1.8.1.1 +

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [Fx 2.0.0.1] uiHitList)

Attachments

(1 attachment)

(Reporter)

Description

11 years ago
reported to webmaster...

I like http://crawfordslist.blogspot.com/  It has not been seen on my Firefox since Sunday, Sept 10.  I can get the daily blog with no problem on Safari.  Is there a problem with your browser?

Rousculp 

-----------------------

my wife sees this a lot on her blog as well...  she will publish an update, then come bug me because the new update can't be viewed in firefox.  if we clear the cache firefox goes out and grabs the latest content and everything is fine.  I've seen this problem off and on in the code base since pre-necko days.  I'm wondering what the best way to investigate is?
Is this the phenomenon of Bug 277813 (and Bug 328605)?

Read Bug 271652, which is listed in Bug 328605.
(Read the other bugs listed there for more examples.)
Check the status of the related files in the cache (the Expires: value shown in about:cache).
And get the HTTP header data:
 See Bug 221036 Comment #7 for getting the data via NSPR logging.
 See Bug 221036 Comment #6 for getting the data via LiveHTTPHeaders.

Comment 2

11 years ago
Some additional information. I downloaded and installed "Live HTTP Headers" from http://livehttpheaders.mozdev.org/ and waited until I experienced this problem (which I have seen before).

I went to Ann Althouse's Blog at http://www.althouse.blogspot.com/ and noticed that it was a stale copy. The request headers were as follows.
REQUEST: Get / HTTP/1.1
Host: www.althouse.blogspot.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7
Accept: text/xml, application/xml, application/xhtml+xml, text/html;q=0.9, text/plain;q=0.8, image/png, */*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7
Keep-Alive: 300
Connection: keep-alive

Response headers:
RESPONSE: HTTP/1.1 200 OK
Server: Apache
Vary: Accept-Encoding
test: %{HOSTNAME}e
Last-Modified: Tue, 26 Sep 2006 03:32:52 GMT
ETag: W/"17d502a-2a4df-4518ae02"
Accept-Ranges: none
Content-Type: text/html
Content-Encoding: gzip
Content-Length: 47108
Date: Tue, 26 Sep 2006 03:26:46 GMT
Cache-Control: private, xgzip-ok=""
Pragma: no-cache
Expires: -1

On the "General" tab of the Page Info (with Live Http Headers installed), it shows Expires: Tuesday, May 05, 2020 11:04:54 PM. That appears to me to be the problem. I have no idea why this page is getting stored in the cache with an Expires date 14 years in the future when the returned headers are as specified.

(In reply to comment #2)
> Expires: -1
Who generated this header? Apache? Weblog application? Or your script?
Depends on: 328605
(In reply to comment #2)
> Last-Modified: Tue, 26 Sep 2006 03:32:52 GMT
> Date: Tue, 26 Sep 2006 03:26:46 GMT
Another question.
Why is a future timestamp returned as Last-Modified:?
Is the "Date:" timestamp the start of script execution and the "Last-Modified:" timestamp the end of script execution? (That would mean about 6 minutes to execute the script...)
(Addition to comment #4)
Is a proxy server used?
I think "Content-Encoding: gzip" for HTML is rare from an ordinary server but common from a proxy server. A clock mismatch between the origin server and the proxy server could therefore produce HTTP headers like these.

Comment 6

11 years ago
(In reply to comment #3)
> (In reply to comment #2)
> > Expires: -1
> Who generated this header? Apache? Weblog application? Or your script?
> 

Professor Althouse tried various ways of getting her site to work well with Firefox and that was one of them. Per the RFC: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.21

"HTTP/1.1 clients and caches MUST treat other invalid date formats, especially including the value "0", as in the past (i.e., "already expired")."

A -1 is clearly not a valid date format, so it should be treated as already expired. It is not. Sometimes it results in "Not specified" but much of the time it results in a date sometime in the year 2020.

Also, Cache-Control: no-cache and Pragma: no-cache were tried by her as well, and neither of them worked. She also tried a valid in-the-past date. Same deal.

Comment 7

11 years ago
"Why future time-stamp is returned as Last-Modified: ?
Time-stamp of "Date:" is start of script execution, and time-stamp of
Last-modified: is end of script execution? (6 minutes to execute script...)"

And 

""Content-Encoding: gzip" for html is rare when usual server, I think, but is
popular when proxy server. So clock mis-match between original server and proxy
server can produce such HTTP headers."

Regarding these questions: the answer is that this is the way Blogspot (which is owned by Google) does things.

It would be nice if Blogspot would not be doing unusual things, but it also would be nice if Firefox was not doing even more unusual things in response (including, by my read, not quite following the RFC on what to do when there is an invalidly formatted Expires header).

Comment 8

11 years ago
(In reply to comment #6)
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.21
> 
> "HTTP/1.1 clients and caches MUST treat other invalid date formats, especially
> including the value "0", as in the past (i.e., "already expired")."

Per this from the RFC, I believe that lines 545-552 are in error in nsHttpResponseHead.cpp:

535 nsresult
536 nsHttpResponseHead::GetExpiresValue(PRUint32 *result)
537 {
538     const char *val = PeekHeader(nsHttp::Expires);
539     if (!val)
540         return NS_ERROR_NOT_AVAILABLE;
541 
542     PRTime time;
543     PRStatus st = PR_ParseTimeString(val, PR_TRUE, &time);
544     if (st != PR_SUCCESS) {
545         // parsing failed... maybe this is an "Expires: 0"
546         nsCAutoString buf(val);
547         buf.StripWhitespace();
548         if (buf.Length() == 1 && buf[0] == '0') {
549             *result = 0;
550             return NS_OK;
551         }
552         return NS_ERROR_NOT_AVAILABLE;
553     }
554 
555     if (LL_CMP(time, <, LL_Zero()))
556         *result = 0;
557     else
558         *result = PRTimeToSeconds(time); 
559     return NS_OK;
560 }

I believe the correct code should be:

nsresult
nsHttpResponseHead::GetExpiresValue(PRUint32 *result)
{
    const char *val = PeekHeader(nsHttp::Expires);
    if (!val)
        return NS_ERROR_NOT_AVAILABLE;

    PRTime time;
    PRStatus st = PR_ParseTimeString(val, PR_TRUE, &time);
    if (st != PR_SUCCESS) {
        // Parsing failed but the header exists: treat it as already expired.
        *result = 0;
        return NS_OK;
    }

    if (LL_CMP(time, <, LL_Zero()))
        *result = 0;
    else
        *result = PRTimeToSeconds(time);
    return NS_OK;
}

I apologize in advance for not knowing how to go about creating a formal patch to submit as a proposed solution. If someone wants to email me and teach me how, I would be glad to learn.

Gerry
(In reply to comment #6)
> Also, Cache-Control: no-cache and Pragma: no-cache were tried by her as well,
> and neither of them worked. She also tried a valid in-the-past date.

Server returns both "Cache-Control: private" and "Pragma: no-cache".
And, I couldn't find "Cache-Control: no-cache" in your HTTP header log.  
> RESPONSE: HTTP/1.1 200 OK
> Cache-Control: private, xgzip-ok=""
> Pragma: no-cache
Gerry Daly, do you know of a specific description of this situation in the HTTP protocol definition?

HTTP 1.1 says that "Pragma: no-cache" should be treated as if "Cache-Control: no-cache" were specified, but I think that applies only when there is no Cache-Control: header, because "Pragma" is defined by HTTP 1.1 for backward compatibility only.
Even if "Pragma: no-cache" is always to be treated as "Cache-Control: no-cache", I don't know what should be done when both "Cache-Control: private" and "Cache-Control: no-cache" are returned.
And the server says "I'm HTTP 1.1"...
Summary: cache expiration problems with blog sites? → cache expiration problems with blog sites? (14 years of Heuristic Expiration instead of considering as "expired", if "Expires: -1" is returned)
(Assignee)

Comment 10

11 years ago
The patch in comment #8 looks good to me.
(Assignee)

Comment 11

11 years ago
Created attachment 240408
v1 patch

Patch based on comment #8.  Thanks!
Assignee: nobody → darin
Status: NEW → ASSIGNED
Attachment #240408 - Flags: review?(cbiesinger)
(Assignee)

Updated

11 years ago
Target Milestone: --- → mozilla1.9alpha
No longer depends on: 328605

Comment 12

11 years ago
(In reply to comment #9)
> Server returns both "Cache-Control: private" and "Pragma: no-cache".
> And, I couldn't find "Cache-Control: no-cache" in your HTTP header log.  

I am sorry for any ambiguity. Let me try to clarify.

First, she has been battling this issue for a few months, along with the help of some of her readers (like me). We told her some things to try, including those. Some of them she tried in the past, and they did not work. Some of them she tried in the example I presented. I understand that the headers do not show all of the things I said she has tried, because that was just one example.

Second, she does not have direct control over the response headers. She has been attempting to get around this problem using META HTTP-EQUIV tags. The patch I suggested above will work with the fact that the RFC says that any invalid date in the Expires header should be considered to be already expired, but further testing by me indicates that a META HTTP-EQUIV="Expires" CONTENT="0" still does not do the trick; the Expires shows up in the Page Info (and in about:cache) as being in the year 2020 whenever Blogspot has returned a Date header that is earlier than the Last-Modified header (thanks, WADA, for pointing me in the right direction).

In other words, the proposed patch I came up with fixes a bug, just not the one that was reported here. :-/
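
The year-2020 value can be reproduced with plain arithmetic. What follows is a minimal sketch, not the actual Necko code: it assumes the heuristic freshness lifetime is taken as one tenth of (Date - Last-Modified), computed in unsigned 32-bit seconds with no check that Last-Modified is earlier than Date. The function names, the guard in the second function, and the small timestamps are hypothetical; only the 366-second gap (03:32:52 minus 03:26:46) comes from the headers in comment #2.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: heuristic lifetime as 10% of (Date - Last-Modified),
// in unsigned 32-bit seconds, with no ordering check (assumed old behaviour).
constexpr uint32_t HeuristicLifetimeUnchecked(uint32_t date, uint32_t lastModified)
{
    // If lastModified > date, the subtraction wraps around modulo 2^32.
    return static_cast<uint32_t>(date - lastModified) / 10;
}

// Hypothetical guarded variant: a Last-Modified that is later than Date
// yields no heuristic freshness at all.
constexpr uint32_t HeuristicLifetimeChecked(uint32_t date, uint32_t lastModified)
{
    return lastModified > date
        ? 0u
        : static_cast<uint32_t>(date - lastModified) / 10;
}

// Illustrative timestamps: Last-Modified is 366 seconds later than Date.
constexpr uint32_t kDate = 1000;
constexpr uint32_t kLastModified = kDate + 366;

// (2^32 - 366) / 10 = 429496693 seconds.
static_assert(HeuristicLifetimeUnchecked(kDate, kLastModified) == 429496693u,
              "wrap-around produces a multi-year lifetime");
static_assert(HeuristicLifetimeChecked(kDate, kLastModified) == 0u,
              "guarded variant grants no heuristic freshness");
```

Under those assumptions the unchecked result is 429496693 seconds, roughly 13.6 years, which added to late September 2006 lands in May 2020 and is consistent with the date shown in Page Info.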
(In reply to comment #12)
> the Expires shows up in the Page Info
> (and in about:cache) as being in the year 2020 whenever Blogspot has returned a
> Date header that is earlier than the Last-Modified header (thanks, WADA, for
> pointing me in the right direction).

That was fixed in bug 323708, right?

Comment 14

11 years ago
(In reply to comment #13)
> 
> That was fixed in bug 323708, right?
> 

That does look to my eyes like it would do the trick. Excellent!
Comment on attachment 240408
v1 patch

this will lead to additional requests if servers use a nonstandard date format... I guess that's ok
Attachment #240408 - Flags: review?(cbiesinger) → review+
(Reporter)

Comment 16

11 years ago
if we think this is really low risk it might have a pretty positive impact on folks that do blogging and a lot of other places where folks are seeing stale content and getting frustrated.
Flags: blocking1.8.1?

Comment 17

11 years ago
It's too late to get this into FF2, but if we can get the patch into the trunk we'd love to consider it for 2.0.0.1.
Flags: blocking1.8.1?
Flags: blocking1.8.1.1?
Flags: blocking1.8.1-
Whiteboard: [Fx 2.0.0.1]
(Assignee)

Comment 18

11 years ago
fixed-on-trunk
Status: ASSIGNED → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED
Flags: blocking1.8.1.1? → blocking1.8.1.1+
As this is blocking1.8.1.1+, please either request approval1.8.1.1 on the current patch or, if needed, attach a branch version of the patch and request approval1.8.1.1 on it.

Updated

11 years ago
Whiteboard: [Fx 2.0.0.1] → [Fx 2.0.0.1][checkin needed (1.8 branch)]
Whiteboard: [Fx 2.0.0.1][checkin needed (1.8 branch)] → [Fx 2.0.0.1]
Comment on attachment 240408
v1 patch

From a Bonsai inspection, it looks like the same patch would work fine on the branch.
Attachment #240408 - Flags: approval1.8.1.1?
Comment on attachment 240408
v1 patch

Darin confirmed we don't need a separate patch for the branch.

approved for 1.8, a=dveditz
Attachment #240408 - Flags: approval1.8.1.1? → approval1.8.1.1+
Whiteboard: [Fx 2.0.0.1] → [Fx 2.0.0.1] [checkin needed (1.8 branch)]
Fixed on 1.8 branch

Checking in nsHttpResponseHead.cpp;
/cvsroot/mozilla/netwerk/protocol/http/src/nsHttpResponseHead.cpp,v  <--  nsHttpResponseHead.cpp
new revision: 1.42.2.3; previous revision: 1.42.2.2
Keywords: fixed1.8.1.1
Whiteboard: [Fx 2.0.0.1] [checkin needed (1.8 branch)] → [Fx 2.0.0.1]
Whiteboard: [Fx 2.0.0.1] → [Fx 2.0.0.1] uiHitList