Last Comment Bug 327765 - Mozilla's cache does not invalidate on non-GET requests
: Mozilla's cache does not invalidate on non-GET requests
Status: VERIFIED FIXED
: verified1.9.1
Product: Core
Classification: Components
Component: Networking: Cache (show other bugs)
: Trunk
: All All
: P1 major with 14 votes (vote)
: mozilla1.9.1b2
Assigned To: Bjarne (:bjarne)
:
Mentors:
http://www.mnot.net/javascript/xmlhtt...
Depends on:
Blocks: 200948 443098
  Show dependency treegraph
 
Reported: 2006-02-18 15:02 PST by Mark Nottingham
Modified: 2010-02-25 03:45 PST (History)
19 users (show)
benjamin: blocking1.9.1+
bugzillamozillaorg_serge_20140323: in‑testsuite?
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
HTTP Header log when Shift+Reload of your test page (36.39 KB, text/plain)
2006-02-24 16:02 PST, WADA
no flags Details
HTTP header log/about:cache list (21.33 KB, text/plain)
2006-03-17 16:25 PST, WADA
no flags Details
Test case (337 bytes, text/plain)
2008-03-07 23:57 PST, Hubert Roksor
no flags Details
Patch to nsHttpChannel.cpp / ncHttpChannel.h (4.42 KB, patch)
2008-09-04 01:44 PDT, Bjarne (:bjarne)
no flags Details | Diff | Splinter Review
Better patch to nsHttpChannel.cpp and nsHttpChannel.h (56.58 KB, patch)
2008-09-09 02:43 PDT, Bjarne (:bjarne)
no flags Details | Diff | Splinter Review
Improved patch (56.84 KB, patch)
2008-09-12 00:56 PDT, Bjarne (:bjarne)
no flags Details | Diff | Splinter Review
Above patch with -w to ignore whitespaces when computing the diff (12.85 KB, patch)
2008-09-12 00:58 PDT, Bjarne (:bjarne)
no flags Details | Diff | Splinter Review
Yet another patch, now without annoying whitespaces (8.99 KB, patch)
2008-09-18 03:12 PDT, Bjarne (:bjarne)
cbiesinger: review-
Details | Diff | Splinter Review
Unit-test (4.52 KB, text/plain)
2008-09-29 03:16 PDT, Bjarne (:bjarne)
cbiesinger: review-
Details
Patch honoring comments from reviewer (6.69 KB, patch)
2008-10-01 01:56 PDT, Bjarne (:bjarne)
no flags Details | Diff | Splinter Review
Previous patch was missing changes to nsHttpChannel.h (8.37 KB, patch)
2008-10-01 02:54 PDT, Bjarne (:bjarne)
no flags Details | Diff | Splinter Review
... and hopefully final version, taking into account in-flight changes to nsHttpChannel.cpp as well (9.01 KB, patch)
2008-10-01 03:01 PDT, Bjarne (:bjarne)
cbiesinger: review+
Details | Diff | Splinter Review
Improved unit-test, honoring comments from reviewer (5.15 KB, text/javascript)
2008-10-01 08:13 PDT, Bjarne (:bjarne)
cbiesinger: review+
Details
...added space after the if [Checkin: Comment 62] (9.01 KB, patch)
2008-10-01 08:18 PDT, Bjarne (:bjarne)
cbiesinger: review+
Details | Diff | Splinter Review
Further improved unit-test, honoring latest comments from reviewer (5.41 KB, text/javascript)
2008-10-02 01:34 PDT, Bjarne (:bjarne)
no flags Details
Improved testcase honoring latest comments from reviewer (5.75 KB, application/x-javascript)
2008-10-10 04:30 PDT, Bjarne (:bjarne)
cbiesinger: review+
Details

Description Mark Nottingham 2006-02-18 15:02:35 PST
User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/417.9 (KHTML, like Gecko) Safari/417.8
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.0.1) Gecko/20060111 Firefox/1.5.0.1

RFC2616 section 13.10 states;
   Some HTTP methods MUST cause a cache to invalidate an entity. This is
   either the entity referred to by the Request-URI, or by the Location
   or Content-Location headers (if present). These methods are:

      - PUT
      - DELETE
      - POST

   In order to prevent denial of service attacks, an invalidation based
   on the URI in a Location or Content-Location header MUST only be
   performed if the host part is the same as in the Request-URI.

   A cache that passes through requests for methods it does not
   understand SHOULD invalidate any entities referred to by the
   Request-URI.

Firefox's built-in browser cache doesn't appear to be conformant to these requirements.

Reproducible: Always

Steps to Reproduce:
Go to http://www.mnot.net/javascript/xmlhttprequest/cache.html; the code on that page will test any browser pointed at it.

Or, make a GET request to a resource, POST (or PUT, or DELETE, or use an unknown method) to it, and then GET it again; the second GET should not be taken from cache.

Actual Results:  
POST, PUT, DELETE and unknown methods do not invalidate the appropriate cache entries, as specified by RFC2616 (i.e., those referred to by the Request-URI, Content-Location or Location headers).

Expected Results:  
The cache entries should be invalidated.
Comment 1 WADA 2006-02-24 16:02:36 PST
Created attachment 213126 [details]
HTTP Header log when Shift+Reload of your test page

This is HTTP Header log by Seamonkey-2006012909-trunk(Win2K)/LiveHTTPHeaders when Shift+Reload for your test page.
  Cache option : "When the page is out of date"
                 browser.cache.check_doc_frequency = 3
( See http://livehttpheaders.mozdev.org/index.html for LiveHTTPHeaders )

There is a funny flow.
Even though Seamonkey issues HTTP GET with Cache-Control: no-cache, 304 Not Modified was returned by your server.
Is it correct behaviour?
(I don't know client's correct action when this case.)

--------------------------------------------------------
http://www.mnot.net/javascript/xmlhttprequest/check_validation
GET /javascript/xmlhttprequest/check_validation HTTP/1.1
Host: www.mnot.net
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9a1) Gecko/20060129 SeaMonkey/1.5a
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.7,ja;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: Shift_JIS,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
x-reqnum: 2
Pragma: no-cache
Cache-Control: no-cache

HTTP/1.x 304 Not Modified
Date: Fri, 24 Feb 2006 23:40:00 GMT
Server: Apache/1.3.29
Keep-Alive: timeout=5, max=96
Connection: Keep-Alive
Content-Type: text/plain
--------------------------------------------------------

GET HTTP protocol log, if required.
The method at http://www.mozilla.org/projects/netlib/http/http-debugging.html
is better for problem analysis, although LiveHTTPHeader log is easier to view HTTP flow.
Comment 2 Mark Nottingham 2006-03-16 18:18:12 PST
(In reply to comment #1)
> HTTP Header log when Shift+Reload of your test page
> Even though Seamonkey issues HTTP GET with Cache-Control: no-cache, 304 Not
> Modified was returned by your server.

As the instructions say, don't shift-reload ;) 

Cache-Control request headers are targetted at intervening caches, not the server. So, CC: no-cache won't have any affect on the server's behaviour (or at least it shouldn't).

There *is* something weird happening here, because the server is sending a 304 Not Modified even though the client didn't ask for it (e.g., with a If-Not-Modified request header). That's because I've hard-wired that particular resource to always return a 304, for purposes of testing validation. Because you're shift-reloading, the browser isn't following the expected flow for testing.

At any rate, that isn't part of the test that's relevant to this bug; see the testInvalidation function in the source.

Cheers,
Comment 3 WADA 2006-03-16 22:50:58 PST
(In reply to comment #2)
> As the instructions say, don't shift-reload ;) 

> That's because I've hard-wired that particular resource to always return a 304, for purposes of testing validation. 
> Because you're shift-reloading, the browser isn't following the expected flow for testing.

> At any rate, that isn't part of the test that's relevant to this bug; see the
> testInvalidation function in the source.

How can we can know your server side script's logic?
How can we know whether your server side script's or server's behavaiour is correct or not, intentional or simply programing error, if such inappropriate behaviour is involved?
How can we know whether test result we get is same test result as yours?

If your test case is targeted to specific case only, clearly describe test procedure to reproduce your problem, e.g.
 (0) set browser.cache.check_doc_frequency = N
 (1) Clear cache
 (2) Delete session cookie
 (4) Start LiveHTTPHeaders
 (3) Click link to save data in cache
 (4) Check HTTP headers, about:cache and so on. 
 (5) Reload or click link again
 (6) Check HTTP headers, about:cache and so on. 
And attach data of your test results(HTTP Header log, about:cache entries, and so on), then clearly describe what is invalid result, where the invalid result exists, how we can check whether the result is correct or not.

Please note that, when browser.cache.check_doc_frequency=1(Every time I view the page), I couldn't see any funny HTTP flow or funny behaviour of Seamonkey when normal Load by link click.
However, I saw response of no Cache-Control:max-age nor no Expires: for HTTP GET, and I could see entries of long expiration date(Heuristic Expiration) in cache. 
This may affet test result when browser.cache.check_doc_frequency=3 (When the page is out of date).
If browser.cache.check_doc_frequency=3, HTTP GET with If-Modified-Since will not be requested until 3/23 05:34.
Can this have relation to your test result?

> Disk cache device
> Key: http://www.mnot.net/javascript/xmlhttprequest/check_request_cache_control
>     Data size: 4 bytes
>   Fetch count: 6
> Last modified: 2006-03-17 15:01:22 (Note: This is NOT Last-Modified: header) 
>       Expires: 2006-03-22 05:33:26 (<== Heuristic Expirarion, probably) 
> Key: http://www.mnot.net/javascript/xmlhttprequest/check_fresh
>     Data size: 4 bytes
>   Fetch count: 12
> Last modified: 2006-03-17 15:01:25 (Note: This is NOT Last-Modified: header)
>       Expires: 2006-03-22 05:33:28 (<== Heuristic Expirarion, probably)
Comment 4 Mark Nottingham 2006-03-17 09:31:08 PST
You're focusing on aspects of that page that are outside the scope of this bug.

To repeat;

RFC2616 -- HTTP/1.1 -- says that caches should be invalidated under certain circumstances in section 13.10. 

Mozilla is not conforming to these requirements, and is therefore not conformant to HTTP/1.1.

In particular,

   Some HTTP methods MUST cause a cache to invalidate an entity. This is
   either the entity referred to by the Request-URI, or by the Location
   or Content-Location headers (if present). These methods are:

      - PUT
      - DELETE
      - POST

   In order to prevent denial of service attacks, an invalidation based
   on the URI in a Location or Content-Location header MUST only be
   performed if the host part is the same as in the Request-URI.

   A cache that passes through requests for methods it does not
   understand SHOULD invalidate any entities referred to by the
   Request-URI.

The requirement for requests to the same URI can be tested by creating a URI that accepts both POST and GET; e.g.,

#!/usr/local/bin/python
from os import environ
print "Content-Type: text/html" 
print "Cache-Control: max-age=60"
print
print "<h1>That was a %s</h1>" % environ.get("REQUEST_METHOD")
print "<p><a href='invalidate.cgi'>GET</a></p>"
print "<p><form action='invalidate.cgi' method='POST'><input type='submit'/></fo
   
To test, navigate to the URI (sending a GET), and then press the "submit query" button, which will send a POST. At this point, the cache will contain these entries;

Key: id=441aed73&uri=http://www.mnot.net/test/invalidate.cgi
Data size: 146 bytes
Fetch count: 1
Last modified: 2006-03-17 09:19:19
Expires: 2006-03-17 09:20:14

Key: http://www.mnot.net/test/invalidate.cgi
Data size: 145 bytes
Fetch count: 1
Last modified: 2006-03-17 09:19:07
Expires: 2006-03-17 09:20:02

Subsequent GETs to the same URI within the freshness window (here, 60 seconds) will be satisfied from cache (i.e., there will be no request back to the origin server), in clear violation of RFC2616.

I'm not sure what the cache entry starting id=... is for; based on the RFC, Mozilla MAY cache that and return it for subsequent GETs to the same URI, but it doesn't. 

Similar tests can be constructed for the requirements around the Location and Content-Location headers. Mozilla will fail them as well.
Comment 5 WADA 2006-03-17 16:25:45 PST
Created attachment 215456 [details]
HTTP header log/about:cache list
Comment 6 WADA 2006-03-17 17:05:01 PST
(In reply to comment #4)
Your test case said next when attached HTTP log is obtained.
> Response Cache-Control max-age POST freshness: fail (POST not cached)
  (this was first "fail" message from your javascript in my test)

Does aboves correspond to Cache-Control:no-cache of second POST in consecutive two POSTs to check_cache_method.s at "Date: Fri, 17 Mar 2006 23:59:36 GMT", followed by PUT/PUT/DELETE/DELETE/FOO in my HTTP log?

One question.
I could see response header of Cache-Control:max-age=60 for GET & 2 POSTs to check_cache_invalidate.s, but I couldn't find Last-Modified: header.
Is this intentional?

> Key: id=441aed73&uri=http://www.mnot.net/test/invalidate.cgi

I couldn't see any entry of Key start with id=...
I think you'd better to start test of your test case after cache clear.

Anyway, clearly describe which flow at which step is RFC violation, and what header/what parameter value is RFC violation, please.
Comment 7 WADA 2006-03-17 17:11:59 PST
Correction. Sorry for spam.
"two POSTs to check_cache_invalidate.s", instead of "check_cache_method.s"
Comment 8 Mark Nottingham 2006-03-25 12:42:10 PST
(In reply to comment #6)
> (In reply to comment #4)
> Your test case said next when attached HTTP log is obtained.
> > Response Cache-Control max-age POST freshness: fail (POST not cached)
>   (this was first "fail" message from your javascript in my test)
> 
> Does aboves correspond to Cache-Control:no-cache of second POST in consecutive
> two POSTs to check_cache_method.s at "Date: Fri, 17 Mar 2006 23:59:36 GMT",
> followed by PUT/PUT/DELETE/DELETE/FOO in my HTTP log?

No, that's a separate test.

> I could see response header of Cache-Control:max-age=60 for GET & 2 POSTs to
> check_cache_invalidate.s, but I couldn't find Last-Modified: header.
> Is this intentional?

Last-Modified is irrelevant to cache freshness in this scenario.
 
> > Key: id=441aed73&uri=http://www.mnot.net/test/invalidate.cgi
> 
> I couldn't see any entry of Key start with id=...
> I think you'd better to start test of your test case after cache clear.

I did.

> Anyway, clearly describe which flow at which step is RFC violation, and what
> header/what parameter value is RFC violation, please.

I have clearly explained it, repeatedly. It's not a header, it's a behavioural issue.
Comment 9 Mark Nottingham 2006-04-09 22:15:56 PDT
BTW, a demonstration of the kind of problem this causes;

Imagine you go to a Weblog, say http://blog.example.org/, and read an entry, http://blog.example.org/2006/04/09/entry. Your'e interested in adding to the discussion of the entry by submitting a comment, so you fill out a comment form which POSTs to http://blog.example.org/comments. 

After the server processes the comment, it 3xx redirects you back to the entry, so you can see your comment. According to HTTP, the redirect from a POST should invalidate any cached copy of the redirection target; in this case, the blog entry page (http://blog.example.org/2006/04/09/entry), so that the user will see a fresh copy.

Mozilla will show the cached copy, causing the user to think their comment wasn't received. They'll proceed to resubmit it, causing it to show up twice (or more).

I see this double-posting very often, and it only happens when Mozilla is the UA, because other browsers behave correctly.
Comment 10 Gervase Markham [:gerv] 2006-05-17 05:23:02 PDT
The reporter of this bug mentioned the problem in his talk today at XTech 2006.

Darin: do you have any comment?

Gerv
Comment 11 Darin Fisher 2007-06-11 00:31:40 PDT
-> reassign to default owner
Comment 13 Brendan Eich [:brendan] 2007-10-09 18:14:06 PDT
Duplicate bug?

/be
Comment 14 Hubert Roksor 2007-10-23 11:19:08 PDT
I have sifted through the 535 bugs that a quicksearch on "cache" returned (https://bugzilla.mozilla.org/buglist.cgi?quicksearch=cache). Few of them mentionned Firefox reusing stale content and none of them pertained to this bug or mentionned invalidation by non-GET requests.
Comment 15 Ivan Enderlin 2007-12-07 15:21:39 PST
Hey,
it's a very embarrassing bug. HTTP is the main protocol of Firefox, please, keep attention to its implementation :).
Comment 16 Ivan Enderlin 2007-12-07 15:45:11 PST
Oops, sorry, I didn't mean it was embarassing for Mozilla. The word I was looking for was "inconvevient", because that bug is very inconvenient for web developpers. HTTP is the core of the web and it's so sad it's not yet correctly implemented on all browsers. 

I hope you'll manage to fix it up soon, thanks.
Comment 17 Hubert Roksor 2008-01-30 01:58:59 PST
It's nearly been two years since that bug was first reported, is there any news on its status? If not, is there anything one can do to help ease up the process? (short of learning C and patching Firefox, that is)

In case a new developer finds this bug but doesn't want to read the whole page, please let me sum it up:
 - if Firefox makes a non-GET, non-HEAD request, invalidate cache entries matching to the Request-URI, (there should be just one if I'm not mistaken)
 - if the response carries a Content-Location or Location header and they share the same host as the Request-URI, invalidate those as well

The real-world consequences are that if web developers do not disable Firefox cache on any page that could possibly have any dynamic information (basically any page that allows any kind of feedback, rating, comment) they run the risk of having their users see their contribution "vanish" when Firefox serves them the stale copy of the page (the older copy obviously doesn't have the newly-added contribution.) The users then usually proceed to either immediately resubmit the contribution or complain about its disappearance.

Currently, the only workaround available to web developers is to disable caching entirely.

Thanks for reading.
Comment 18 Hubert Roksor 2008-03-07 23:57:32 PST
Created attachment 308112 [details]
Test case

I was re-reading this bug report the other day and I noticed some misunderstanding in the early comments from two years ago so I thought a smaller test-case could help visualizing the kind of nuisance that bug generates.

The name of the file is "bugzilla_327765.php", it is a PHP script that displays the current time and a submit button, and sets the page to expire one hour after the first access. When activated, the button POSTs to the page. In response, that page redirects to itself with a 302. We expect the content of the page to be updated whenever the button is activated. However, Mozilla Firefox does not revalidate the page and keeps displaying the stale copy instead.

It should be noted that this pattern (redirect after a successful POST) is very common on the web. In fact, it is used on this very page. The only reason we do not see a swarm of complaints about that bug is because most webmasters picked up the arguably bad habit of systematically disabling HTTP caching.
Comment 19 Bjarne (:bjarne) 2008-09-04 01:44:24 PDT
Created attachment 336812 [details] [diff] [review]
Patch to nsHttpChannel.cpp / ncHttpChannel.h

Lots of whitespaces (??) when doing the diff with -pU8 so I added -w to make it more reasonable...
Comment 20 Bjarne (:bjarne) 2008-09-04 02:10:48 PDT
Note that the patch below fixes the testcase added by Hubert, but does not pass the tests by Mark Nottingham.

The reason for this is a bit unclear at the moment, but one clue may be that nsXMLHttpRequest::Send() adds the LOAD_BYPASS_CACHE flag to all POST-requests, which causes nsHttpChannel::OpenCacheEntry() to return before the patch above is activated. It doesn't explain why PUT and DELETE fails to invalidate the cache, though, so there might be other factors involved as well.
Comment 21 Boris Zbarsky [:bz] 2008-09-04 07:17:12 PDT
Drive-by review comments:

1)  I'd like to see the actual patch being landed (without -w) in addition to the
    -w patch.
2)  It sounds like we need to either do this earlier in OpenCacheEntry or
    something, no?
3)  The nsICacheSession arg should just be nsICacheSession*, not a reference to
    the nsCOMPtr.
4)  This needs unit tests.
Comment 22 Bjarne (:bjarne) 2008-09-09 02:43:48 PDT
Created attachment 337627 [details] [diff] [review]
Better patch to nsHttpChannel.cpp and nsHttpChannel.h

With this patch, FF passes all the tests in the last section (Cache Invalidation) on the testpage by Mark Nottingham, as well as the simpler testcase for this bug provided by Hubert Roksor.

Note, however, that due to a subtle flaw in the the cache-invalidation-tests on the testpage, the patch also make FF pass the last test (Unknown Method Invalidation) even though the patch does not contain code to do so. The author of the testpage (and reporter of this bug) has been notified about this.
Comment 23 Boris Zbarsky [:bz] 2008-09-11 07:48:30 PDT
So...  There are a lot of unrelated whitespace changes in this patch.  Can you take them out, for ease of review?

Is there a reason to not just add the invalidate to the end of ProcessNormal() instead of just calling it after every ProcessNormal() call?  Does ProcesNormal() take early returns or something?  Do you still want to invalidate if ProcessNormal() returned error?

The Invalidate... method should be called MaybeInvalidate..., right?

dcamp or Honza need to review the application cache parts of Invalidate...; I'm not convinced those are right.  

If you need to do this on unrecognized HTTP methods, shouldn't you rather just skip doing it on GET/HEAD instead of testing for POST/etc?

Refactoring the cache-entry-getting code does seem to me like a good idea...

You probably want biesi to review the details, not me: he might actually be familiar with what the HTTP spec calls for here.
Comment 24 Honza Bambas (:mayhemer) 2008-09-11 09:12:33 PDT
We should not expire/invalidate entries in the application cache. Spec doesn't say anything about such behavior and it is not a classic HTTP cache as described at the HTTP RFC sec. 13.10. Application cache entries are fetched during cache update process while a manifest bound to a top-level document is traversed and all resources present in this manifest are by a relatively complicated algorithm fetched to the application cache bound to the manifest and also to the normal HTTP cache.

So, fix to this bug should not touch application cache entries but should always invalidate an entry in the HTTP cache.

Dave, can you confirm it?
Comment 25 Christian :Biesinger (don't email me, ping me on IRC) 2008-09-11 18:40:32 PDT
some quick comments-

comments should be indented like the code, not start in the first column

you should Doom() the cache entry instead of just setting its expiration time to zero
Comment 26 Mark Nottingham 2008-09-11 18:57:32 PDT
(In reply to comment #25)

> you should Doom() the cache entry instead of just setting its expiration time
> to zero

Not necessarily (unless the semantics of Doom are other than what one would think). 'invalidate' in HTTP doesn't mean that the cache entry needs to be cleared -- just that you have to check with the origin server before reusing it.

From 13.10:

In this section, the phrase "invalidate an entity" means that the cache will either remove all instances of that entity from its storage, or will mark these as "invalid" and in need of a mandatory revalidation before they can be returned in response to a subsequent request.
Comment 27 Christian :Biesinger (don't email me, ping me on IRC) 2008-09-11 19:32:07 PDT
That's what I get for not reading the spec before commenting... good point.
Comment 28 Bjarne (:bjarne) 2008-09-12 00:56:41 PDT
Created attachment 338275 [details] [diff] [review]
Improved patch

Improved patch with renamed method name and different indents
Comment 29 Bjarne (:bjarne) 2008-09-12 00:58:12 PDT
Created attachment 338276 [details] [diff] [review]
Above patch with -w to ignore whitespaces when computing the diff

For simpler review, as requested.
Comment 30 Bjarne (:bjarne) 2008-09-12 01:27:10 PDT
(In reply to comment #23)
> So...  There are a lot of unrelated whitespace changes in this patch.  Can you
> take them out, for ease of review?

Please see attachments above.

> Is there a reason to not just add the invalidate to the end of ProcessNormal()
> instead of just calling it after every ProcessNormal() call?  Does
> ProcesNormal() take early returns or something?  Do you still want to
> invalidate if ProcessNormal() returned error?

My understanding is that if we request a mutating operation (PUT/POST/DELETE) on a resource and the server replies with a status-code indicating that our request was fulfilled, we MUST invalidate our cache-entry since we can assume the resource was changed on the server. This is independent of how we process content of the response and the success or failure of this processing. Thus, I chose to invoke invalidation in selected cases of the switch in ProcessResponse().

(Of-course, there may be additional responses not implemented yet where invalidation should happen, and potentially responses where invalidation should not happen, but I think it makes fundamental sense to trigger invalidation based on the response-code from the server and not on FFs success or failure of processing the response-content.)

> The Invalidate... method should be called MaybeInvalidate..., right?

Ok...  see patch. :)

> dcamp or Honza need to review the application cache parts of Invalidate...; I'm
> not convinced those are right.  

See reply to Honzas comment below.

> If you need to do this on unrecognized HTTP methods, shouldn't you rather just
> skip doing it on GET/HEAD instead of testing for POST/etc?

There are 17 methods defined in nsHttpAtomList.h, but only 8 defined in Rfc2616 section 5.1.1. Do you think it would make sense to stick to the 8 defined in Rfc2616 and treat all others as unrecognized in this context?

> Refactoring the cache-entry-getting code does seem to me like a good idea...

Yes.
Comment 31 Bjarne (:bjarne) 2008-09-12 01:48:03 PDT
(In reply to comment #24)
> We should not expire/invalidate entries in the application cache. Spec doesn't
> say anything about such behavior and it is not a classic HTTP cache as
> described at the HTTP RFC sec. 13.10. Application cache entries are fetched
> during cache update process while a manifest bound to a top-level document is
> traversed and all resources present in this manifest are by a relatively
> complicated algorithm fetched to the application cache bound to the manifest
> and also to the normal HTTP cache.
> 
> So, fix to this bug should not touch application cache entries but should
> always invalidate an entry in the HTTP cache.
> 
> Dave, can you confirm it?

I must humbly admit that I am not familiar with the application-cache in Firefox... The basic issue it to make sure that if a mutating operation on a resource is successful, the client will not retrieve a cached copy of this resource on the next request. If this situation is irrelevant when using an application-cache, or if it's handled in some other way, we're in the clear. Otherwise, a careful explanation of why the application-cache should be allowed to circumvent the Http-spec at this point would be nice.

I've left code to invalidate the application-cache for now, but if there is consensus for disabling it I'll promptly change it accordingly.
Comment 32 Honza Bambas (:mayhemer) 2008-09-12 05:06:53 PDT
(In reply to comment #31)
> I've left code to invalidate the application-cache for now, but if there is
> consensus for disabling it I'll promptly change it accordingly.

The offline web application cache is not the HTTP cache even it is actually used as an HTTP cache when fetching resource from an application cache during normal browsing. Application caches have its own update cycle and algorithm very different from HTTP update and expire mechanism (as described above). In offline application spec from WHATWG is nothing said about invalidation/expiration of entries in an application cache when PUT/DELETE/POST or an unknown method is used on that resource.

And humanly, when I am online and PUT a new resource version on the server, then go offline and run my offline application (in a car on the road, for instance) I expect that resource to be still working and fetched from the offline cache. However, FF offers stall cache data when offline, but it is AFAIK FF specific.

Only when a resource from a manifest is to be fetched during an update process and (where there is already a previous - older - version of a cache bound to this manifest) then that older cache is used as an HTTP cache. Spec says exactly:

"Fetch the resource. If this is an upgrade attempt, then use cache [=the older version here] as _an HTTP cache_, and honor HTTP caching semantics (such as expiration, ETags, and so forth) with respect to that cache. User agents may also have other caches in place that are also honored."

WHATWG spec is vague in many details and this might be another one. There is not any other place in the spec that says "use the cache as a true HTTP cache". IMHO, this says we should expire the entry according to RFC 2616 13.10 but make this expiration has affect ONLY for next cache update process, what makes this fix much more complicated.

So, I would rather remove the code that expires the offline application cache entry from your patch and submit a follow-up bug that should be confirmed with people from WHATWG spec mailing list.
Comment 33 Honza Bambas (:mayhemer) 2008-09-12 05:11:15 PDT
And one more think: when we expire an entry in an offline application cache then this entry will not be accessible (will actually be deleted from the cache) until next update of the manifest (which might never occur) or until the whole cache is manually deleted by the user and populated again. This is IMO undesirable.
Comment 34 Boris Zbarsky [:bz] 2008-09-12 05:37:54 PDT
> Above patch with -w to ignore whitespaces when computing the diff

I guess I wasn't clear enough.  Please do not make gratuitous whitespace changes to these files, period.  Only change the things that need changing for your patch.

> Do you think it would make sense to stick to the 8 defined in
> Rfc2616 and treat all others as unrecognized in this context?

I have no idea, to be honest.  See the part about biesi being the one you want to review this, not me.
Comment 35 Mark Nottingham 2008-09-12 06:57:48 PDT
(In reply to comment #32)

> The offline web application cache is not the HTTP cache even it is actually
> used as an HTTP cache when fetching resource from an application cache during
> normal browsing. Application caches have its own update cycle and algorithm
> very different from HTTP update and expire mechanism (as described above). In
> offline application spec from WHATWG is nothing said about
> invalidation/expiration of entries in an application cache when PUT/DELETE/POST
> or an unknown method is used on that resource.

Now I'm curious -- is there a spec for how the application cache is supposed to operate? I wasn't aware that WHATWG/HTML5 was specifying a new kind of cache as well...
Comment 36 Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-09-12 07:12:34 PDT
(In reply to comment #35)
> Now I'm curious -- is there a spec for how the application cache is supposed to
> operate?

Yes, which is why people keep referring to the spec above. :)

> I wasn't aware that WHATWG/HTML5 was specifying a new kind of cache as
> well...

http://www.w3.org/TR/html5/offline.html#appcache
http://www.whatwg.org/specs/web-apps/current-work/#appcache
Comment 37 Bjarne (:bjarne) 2008-09-18 03:12:18 PDT
Created attachment 339222 [details] [diff] [review]
Yet another patch, now without annoying whitespaces

This patch fixes the cache-invalidation problem as reported in this bug. Entries in the application-cache are NOT invalidated (see comments #24, #32 and #33) and unknown methods are handled by limiting the known methods to those defined in Rfc2616 section 5.1.1 (see comment #30).

Unit tests will be attached later.
Comment 38 Bjarne (:bjarne) 2008-09-29 03:16:33 PDT
Created attachment 340899 [details]
Unit-test

Unit-test for this fix. See comments in script for further explanation.

Leaving this for now, awaiting review...
Comment 39 Christian :Biesinger (don't email me, ping me on IRC) 2008-09-30 12:43:49 PDT
Comment on attachment 339222 [details] [diff] [review]
Yet another patch, now without annoying whitespaces

         else
             rv = ProcessNormal();
+            MaybeInvalidateCacheEntryForSubsequentGet();


I would recommend curly braces around the body of the else :-)

         else {
             LOG(("ProcessRedirection failed [rv=%x]\n", rv));
             rv = ProcessNormal();
+            MaybeInvalidateCacheEntryForSubsequentGet();
         }

In this case-block, you are already calling MaybeInvalidateCacheEntryForSubsequentGet. You don't have to call it here again.

Is there a particular reason why you're not calling this in 304/401 branches?

+    // See Rfc2616 section 5.1.1. These are considered valid

Rfc2616 -> RFC 2616

+    if(
+       mRequestHead.Method() == nsHttp::Options ||

Style here is:
  if  (mRequestHead.Method() == nsHttp::Options ||

+       mRequestHead.Method() == nsHttp::Trace||

missing a space before the ||

+       mRequestHead.Method() == nsHttp::Connect
+       ) return;

and the ) should be on the same line as the nsHttp::Connect comparison

+    nsresult rv;
+    nsCAutoString tmpCacheKey;

can you move the rv declaration to right before the GetCacheSession line?

+    rv = gHttpHandler->GetCacheSession(storagePolicy,
+                                      getter_AddRefs(session));

incorrect indentation on the second line

+    rv = session->OpenCacheEntry(tmpCacheKey, nsICache::ACCESS_READ/*accessRequested*/, PR_FALSE,
+                                    getter_AddRefs(tmpCacheEntry));

Incorrect indentation on the second line. Also, the /*accessRequested*/ comment doesn't seem very useful, can you remove it?


+    // If we found the the entry or we got WAIT_FOR_VALIDATION, set
+    // its expiration-time = 0
+    // TODO can we let the cache-entry live in the latter case?

Isn't tmpCacheEntry NULL in the latter case?

+       tmpCacheEntry->SetExpirationTime(PRUint32(0));

You shouldn't need the cast

+    if(mLoadFlags & INHIBIT_PERSISTENT_CACHING)

Should have a space before the (

Also, please limit your lines to 80 characters.
Comment 40 Christian :Biesinger (don't email me, ping me on IRC) 2008-09-30 13:01:54 PDT
Comment on attachment 340899 [details]
Unit-test

    var nsCacheService = Components.classes["@mozilla.org/network/cache-service;1"];
    var service = nsCacheService.getService(Components.interfaces.nsICacheService);

hm, I don't like storing the class object in a variable named nsCacheService... I'd just do something like:

return Components.classes["@mozilla.org/network/cache-service;1"].
    getService(Components.interfaces.nsICacheService);

(align getService with classes, this textfield doesn't let me do that)

    var tmp = ios.newChannel("http://localhost:4444"+suffix, "", null);

spaces around the + operator

I think better names than tmp and chan would be chan and httpChan

Can you remove the dump lines instead of just commenting them out

        suffix="/put";

spaces around the = operator
(also on the other lines)

    return days[d.getDay()] + " " +
            months[d.getMonth()] + " " +
            d.getDate() + " 23:59:59 " +
            (d.getFullYear()+1);

I'd prefer if you used the first format from RFC 2616 3.3.1:
      Sun, 06 Nov 1994 08:49:37 GMT  ; RFC 822, updated by RFC 1123

Shouldn't you also change the POST method to PUT, etc, not just the suffix?
Comment 41 Bjarne (:bjarne) 2008-10-01 01:53:38 PDT
(In reply to comment #39)
> (From update of attachment 339222 [details] [diff] [review])
>          else
>              rv = ProcessNormal();
> +            MaybeInvalidateCacheEntryForSubsequentGet();
> 
> 
> I would recommend curly braces around the body of the else :-)

Very good point...  :-}
However, 206 should never be the response of a mutating method anyway so I remove the call altogether.

>          else {
>              LOG(("ProcessRedirection failed [rv=%x]\n", rv));
>              rv = ProcessNormal();
> +            MaybeInvalidateCacheEntryForSubsequentGet();
>          }
> 
> In this case-block, you are already calling
> MaybeInvalidateCacheEntryForSubsequentGet. You don't have to call it here
> again.

Good point (again)  :-} Removed second call.

> Is there a particular reason why you're not calling this in 304/401 branches?

My interpretation of status code 304 is that it may only come in response to a conditional GET, in which case invalidation is not pertinent. Thus, no call. As for 401/407, my understanding of the code is that ProcessResponse() will be called again after the new request with credentials has been sent, and invalidation may happen during that call.

(Please correct any of these views if they're wrong, as well as point out any responses which should cause invalidation.)

> +    // See Rfc2616 section 5.1.1. These are considered valid
> 
> Rfc2616 -> RFC 2616

Ok. Fixed.

> +    if(
> +       mRequestHead.Method() == nsHttp::Options ||
> 
> Style here is:
>   if  (mRequestHead.Method() == nsHttp::Options ||
> 
> +       mRequestHead.Method() == nsHttp::Trace||
> 
> missing a space before the ||
> 
> +       mRequestHead.Method() == nsHttp::Connect
> +       ) return;
> 
> and the ) should be on the same line as the nsHttp::Connect comparison

Very well...  fixed.

> +    nsresult rv;
> +    nsCAutoString tmpCacheKey;
> 
> can you move the rv declaration to right before the GetCacheSession line?

Done.

> +    rv = gHttpHandler->GetCacheSession(storagePolicy,
> +                                      getter_AddRefs(session));
> 
> incorrect indentation on the second line

Fixed.

> +    rv = session->OpenCacheEntry(tmpCacheKey,
> nsICache::ACCESS_READ/*accessRequested*/, PR_FALSE,
> +                                    getter_AddRefs(tmpCacheEntry));
> 
> Incorrect indentation on the second line. Also, the /*accessRequested*/ comment
> doesn't seem very useful, can you remove it?

Ok. Fixed.

> +    // If we found the the entry or we got WAIT_FOR_VALIDATION, set
> +    // its expiration-time = 0
> +    // TODO can we let the cache-entry live in the latter case?
> 
> Isn't tmpCacheEntry NULL in the latter case?

Yes, it may be NULL. Case and TODO removed.

> +       tmpCacheEntry->SetExpirationTime(PRUint32(0));
> 
> You shouldn't need the cast

Ok. Removed.

> +    if(mLoadFlags & INHIBIT_PERSISTENT_CACHING)
> 
> Should have a space before the (

Fixed.

> Also, please limit your lines to 80 characters.

I think that should be better now.
Comment 42 Bjarne (:bjarne) 2008-10-01 01:56:04 PDT
Created attachment 341238 [details] [diff] [review]
Patch honoring comments from reviewer
Comment 43 Bjarne (:bjarne) 2008-10-01 02:54:14 PDT
Created attachment 341244 [details] [diff] [review]
Previous patch was missing changes to nsHttpChannel.h
Comment 44 Bjarne (:bjarne) 2008-10-01 03:01:18 PDT
Created attachment 341246 [details] [diff] [review]
... and hopefully final version, taking into account in-flight changes to nsHttpChannel.cpp as well
Comment 45 Christian :Biesinger (don't email me, ping me on IRC) 2008-10-01 06:57:39 PDT
Comment on attachment 341246 [details] [diff] [review]
... and hopefully final version, taking into account in-flight changes to nsHttpChannel.cpp as well

OK. About 401/407, since the user can cancel the dialog, there may not be a second request with a 200 status. But in that case the request shouldn't have a side-effect so no need to invalidate the GET entry.

+    if(mRequestHead.Method() == nsHttp::Options ||

missing space before the (

With that change, looks good.
Comment 46 Christian :Biesinger (don't email me, ping me on IRC) 2008-10-01 07:02:00 PDT
Are you going to attach a new version of the unit test as well?

Thanks a lot for fixing this, btw!
Comment 47 Bjarne (:bjarne) 2008-10-01 08:11:39 PDT
(In reply to comment #40)
> (From update of attachment 340899 [details])
>     var nsCacheService =
> Components.classes["@mozilla.org/network/cache-service;1"];
>     var service =
> nsCacheService.getService(Components.interfaces.nsICacheService);
> 
> hm, I don't like storing the class object in a variable named nsCacheService...
> I'd just do something like:
> 
> return Components.classes["@mozilla.org/network/cache-service;1"].
>     getService(Components.interfaces.nsICacheService);
> 
> (align getService with classes, this textfield doesn't let me do that)

Ok. Fixed. (I borrowed it from TestAsyncCache.js, btw.  ;) )

>     var tmp = ios.newChannel("http://localhost:4444"+suffix, "", null);
> 
> spaces around the + operator

Ok. Fixed.

> I think better names than tmp and chan would be chan and httpChan

Ok. Fixed.

> Can you remove the dump lines instead of just commenting them out

Done.

> 
>         suffix="/put";
> 
> spaces around the = operator
> (also on the other lines)

Hopefully ok now.

>     return days[d.getDay()] + " " +
>             months[d.getMonth()] + " " +
>             d.getDate() + " 23:59:59 " +
>             (d.getFullYear()+1);
> 
> I'd prefer if you used the first format from RFC 2616 3.3.1:
>       Sun, 06 Nov 1994 08:49:37 GMT  ; RFC 822, updated by RFC 1123
> 

Shouldn't make a big difference, but ok. :) Fixed.

> Shouldn't you also change the POST method to PUT, etc, not just the suffix?

Ahh yes...  :-}  Bad mistake! Fixed.

Note the special handling of PUT in check2(). It turned out that if I didn't pass any payload with the PUT-request, something weird happens to the caching and I was stuck with a cache-entry with no content for the URL. I did not dig deeper into that one, but rather made the unit-test work properly by passing some payload.
Comment 48 Bjarne (:bjarne) 2008-10-01 08:13:16 PDT
Created attachment 341291 [details]
Improved unit-test, honoring comments from reviewer
Comment 49 Bjarne (:bjarne) 2008-10-01 08:18:36 PDT
Created attachment 341292 [details] [diff] [review]
...added space after the if
[Checkin: Comment 62]

Here we go... one more space. :)

I'd be happy for directions for how to proceed after the r+ ...
Comment 50 Christian :Biesinger (don't email me, ping me on IRC) 2008-10-01 11:08:26 PDT
Comment on attachment 341291 [details]
Improved unit-test, honoring comments from reviewer

    if (suffix == "/post")        method = "POST";

What you could have done was make suffix be just the "post" part and add the slash separately, then you wouldn't need these ifs :-)

but this is fine too.

        buffer.setData("uploaded data", 128);

Wow, this doesn't crash?

I think what you want is:
        var buffer =
            Components.classes["@mozilla.org/io/string-input-stream;1"].
                       createInstance(Components.interfaces.nsISupportsCString);
        buffer.data = "uploaded data";
        channel.setUploadStream(postData,"",postData.length);

missing spaces after the commas

you wouldn't need the MIMEInputStream if you passed a content type here
Comment 51 Christian :Biesinger (don't email me, ping me on IRC) 2008-10-01 11:12:10 PDT
(In reply to comment #49)
> I'd be happy for directions for how to proceed after the r+ ...

Unfortunately the timing isn't great, because until yesterday the patch could have been checked in right after review. Now though, the tree is frozen for beta1. I don't know what the rules are going to be after that, the patch may need approval.
Comment 52 Christian :Biesinger (don't email me, ping me on IRC) 2008-10-01 11:13:23 PDT
The review comment got formatted badly, there was supposed to be an empty line between the buffer.data assignment and the setUploadStream line, because setUploadStream wasn't part of the suggested code.
Comment 53 Brendan Eich [:brendan] 2008-10-01 11:23:30 PDT
We want this for Firefox 3.1 / Mozilla 1.9.1, I think. It could land right after the tree opens for beta2 checkins. Just add the checkin-needed keyword provided the patch is up-to-date.

/be
Comment 54 Bjarne (:bjarne) 2008-10-01 12:33:23 PDT
(In reply to comment #50)
> (From update of attachment 341291 [details])
>     if (suffix == "/post")        method = "POST";
> 
> What you could have done was make suffix be just the "post" part and add the
> slash separately, then you wouldn't need these ifs :-)
>
> but this is fine too.

I was considering this as the test evolved but my head was swimming badly enough with the mentioned PUT-issue so I decided to keep this part of the logic simple while figuring out what was happening to my PUT-requests. :)

I'm not proud of the code in the unit-test. It's a real patchwork, and I thank God for google and 'grep -i' with lots of existing testcases...  :) But I think it does what it is supposed to do, and it's fairly straightforward and readable.

>         buffer.setData("uploaded data", 128);
> 
> Wow, this doesn't crash?

Not yet, no...

> I think what you want is:
>         var buffer =
>             Components.classes["@mozilla.org/io/string-input-stream;1"].
>                       
> createInstance(Components.interfaces.nsISupportsCString);
>         buffer.data = "uploaded data";

A lot simpler, I agree. But see also comment above on 'grep -i'. :)

>         channel.setUploadStream(postData,"",postData.length);
> 
> missing spaces after the commas
> 
> you wouldn't need the MIMEInputStream if you passed a content type here

Ok.

Would you prefer me to fix the above issues and submit a new unit-test, or should I just keep these things in mind for future test-cases?
Comment 55 Christian :Biesinger (don't email me, ping me on IRC) 2008-10-02 00:36:50 PDT
You have to fix the setData thing at least. That has a really high chance of crashing. It tries to copy 128 bytes from a 14 byte string.

Where did you copy it from?
Comment 56 Bjarne (:bjarne) 2008-10-02 01:34:13 PDT
Created attachment 341412 [details]
Further improved unit-test, honoring latest comments from reviewer

Referring to comment #50

> you wouldn't need the MIMEInputStream if you passed a content type here

I actually tried this but it didn't work. See the "if (false)" block in check2(), and change it to "if (true)". This makes the test fail (for me at least) and it was part of the mentioned PUT-issue I was battling yesterday. I'd be very interested in understanding why this simpler version does not work.
Comment 57 Bjarne (:bjarne) 2008-10-02 01:35:49 PDT
(In reply to comment #55)

> Where did you copy it from?

Hehe...  that particular line is my own. See why I prefer copying...?  ;)
Comment 58 Bjarne (:bjarne) 2008-10-03 02:20:18 PDT
Btw, see comment at https://bugzilla.mozilla.org/show_bug.cgi?id=200948#c5 by the reporter of current bug.

Would someone with sufficient credentials verify statement and duplicate?
Comment 59 Christian :Biesinger (don't email me, ping me on IRC) 2008-10-03 04:45:11 PDT
Comment on attachment 341412 [details]
Further improved unit-test, honoring latest comments from reviewer

(I wish bugzilla let me view application/x-javascript attachments directly)
Comment 60 Christian :Biesinger (don't email me, ping me on IRC) 2008-10-03 04:50:20 PDT
Comment on attachment 341412 [details]
Further improved unit-test, honoring latest comments from reviewer

so, where does the length attribute come from? (for buffer.length and putData.length). I don't see it on the interfaces...

           Components.classes["@mozilla.org/io/string-input-stream;1"].
                       createInstance(Components.interfaces.nsIStringInputStream);

this is kind of confusing, because the data attribute is on nsISupportsCString, not on nsIStringInputStream... classinfo makes this work, but it might be better to just use createInstance() without specifying an interface.
Comment 61 Bjarne (:bjarne) 2008-10-10 04:30:13 PDT
Created attachment 342574 [details]
Improved testcase honoring latest comments from reviewer

This should be more aligned with defined interfaces. :)

Also added some thoughts about why the MIMEInputStream-approach behaves differently.
Comment 62 Serge Gautherie (:sgautherie) 2008-10-10 09:15:28 PDT
Comment on attachment 341292 [details] [diff] [review]
...added space after the if
[Checkin: Comment 62]

http://hg.mozilla.org/mozilla-central/rev/9dfd46cb4a99
Comment 63 Christian :Biesinger (don't email me, ping me on IRC) 2008-10-22 07:58:57 PDT
Comment on attachment 342574 [details]
Improved testcase honoring latest comments from reviewer

// Change below (false) to (true) and the test fails...

please indent your comments as well

// My guess is that when doing a PUT-request with text/plain content, the

I'd rather you didn't put guesses into comments. Can you file a bug instead to investigate this issue?

Also, you don't declare putData, please do.
Comment 64 Gordon P. Hemsley [:GPHemsley] 2009-02-21 00:52:01 PST
Fixed before 1.9.1 branching:
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/9dfd46cb4a99

Verified fixed in recent Shiretoko build:
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.1b3pre) Gecko/20090220 Shiretoko/3.1b3pre

(Also confirmed not fixed in Firefox 3.0.)

This still needs to be VERIFIED in a trunk build; I don't have one available at the moment.
Comment 65 Gordon P. Hemsley [:GPHemsley] 2009-02-23 21:09:22 PST
Since trunk was still pre-1.9.1, and it's verified in the 1.9.1 branch, then obviously this should be marked VERIFIED in trunk (at the time it was fixed).
Comment 66 Henrik Skupin (:whimboo) 2009-02-23 21:15:22 PST
I checked the given testcase and a lot of tests still fail. Do we have other bugs which cover the remaining parts?
Comment 67 Bjarne (:bjarne) 2009-02-24 01:17:23 PST
(In reply to comment #66)
> I checked the given testcase and a lot of tests still fail. Do we have other
> bugs which cover the remaining parts?

Assuming you mean the testpage from Mark Nottingham, please see bug #462243.
Comment 68 Henrik Skupin (:whimboo) 2009-02-25 19:12:40 PST
(In reply to comment #67)
> Assuming you mean the testpage from Mark Nottingham, please see bug #462243.

Yeah, I meant the testpage. Thanks for the link.

Note You need to log in before you can comment on or make changes to this bug.