Closed Bug 313733 Opened 19 years ago Closed 17 years ago

Cache/download problems when downloaded file changes server side

Categories

(Firefox :: General, defect)

1.0 Branch
x86
Windows XP
defect
Not set
major

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: marco, Unassigned)

Details

(Whiteboard: CLOSEME 07/14)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; it-IT; rv:1.7.12) Gecko/20050919 Firefox/1.0.7
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; it-IT; rv:1.7.12) Gecko/20050919 Firefox/1.0.7

CASE A. IF cache settings set to a large size value:

  1. I download foo.pdf from ABC server to my client;
  2. foo.pdf downloads correctly and I can see it;
  3. foo.pdf changes server-side (even in size): I delete the old foo.pdf on ABC server and I put a new completely different PDF file on that server, naming it "foo.pdf";
  4. when I download (the new) foo.pdf, Firefox displays the old one!

CASE B. IF cache settings set to 1KB:

  Firefox works well, as expected.


Reproducible: Always




For server-side applications that use dynamically generated PDF files, this can be seen as a "major feature broken".

Regards.
Can you point us to an example?  If a server is dynamically creating content, it should be sending HTTP headers stating not to cache the content.  On the other hand, if the headers say that the content can be cached, then Firefox is behaving correctly.  So we need to know what the server is doing to evaluate this bug report.
Component: Download Manager → General
QA Contact: download.manager → general
(In reply to comment #1)

> If a server is dynamically creating content,
> it should be sending HTTP headers stating not to cache the content.

Very often you do not own the server or the **HTML page**; you just have to update a single file in a single folder on that server...
Moreover, "foo.pdf" is simply the target of a Web link, nothing more.

> On the other hand, if the headers say that the content can be cached, 
> then Firefox is behaving correctly.

I don't think so, really: if the file changes in size, Firefox has to understand that the file IS NOT the same.

Another example: 

1. Go to "ABCServer/SomeDir/foo.pdf" by typing it directly into the address bar
2. See foo.pdf

3. Change foo.pdf server side, as you want
4. See the "new" foo.pdf. Oh my... ;-)

In this case the HTTP headers are sent by the WebServer directly, not by the script or the HTML page.


When "foo.pdf" is sent via HTTP, it has caching information associated with it, either explicitly or implicitly (if the server omits the info).  Based on that info, as well as the user's settings, Firefox may do one of a few things the next time "foo.pdf" is requested.

1) It can ask the server to send the file again, ignoring the cache (or if it is no longer cached)
2) It can ask the server if "foo.pdf" has changed since it last requested it.  It then either uses the cached copy or gets an updated copy from the server.
3) It can load the file from the cache and not ask the server anything.
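
The three behaviors above can be sketched as a small decision function. This is only an illustrative model, not Firefox's actual code: `CachedItem`, its fields, and `choose_action` are hypothetical names, and the freshness period is assumed to be already known.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedItem:
    """Hypothetical record of a cached response (not Firefox's real structure)."""
    stored_at: float        # time the copy was stored, in seconds
    fresh_for: float        # how long the copy counts as fresh, in seconds
    always_validate: bool   # True if the server asked for revalidation (e.g. no-cache)


def choose_action(item: Optional[CachedItem], now: float) -> str:
    """Return which of the three behaviors applies to a new request."""
    if item is None:
        return "fetch"        # 1) nothing cached: request the full file
    if item.always_validate or now - item.stored_at >= item.fresh_for:
        return "revalidate"   # 2) ask the server whether the file changed
    return "use_cache"        # 3) still fresh: answer from the cache, no request
```

In this model the reported symptom corresponds to the last branch: while the cached copy still counts as fresh, the server is never contacted, so a replaced "foo.pdf" goes unnoticed.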

> I don't think so, really: if the file changes in size, Firefox has to
> understand that the file IS NOT the same.

Firefox may not see the change in file size.  It learns the new size only after it has already decided, based on its settings and the item's cache info, to ignore the cached copy and ask the server about the item.

You can type in 'about:cache' as a URL to see what's in the cache, as well as the header info stored.
(In reply to comment #4)

> Firefox may do one of a few things the next time "foo.pdf" is requested.

> 1) It can ask the server to send the file again, ignoring the cache (or if it
> is no longer cached)
> 2) It can ask the server if "foo.pdf" has changed since it last requested it. 
> It then either uses the cached copy (A) or gets an updated copy from the 
> server (B).
> 3) It can load the file from the cache and not ask the server anything.


3): who needs that?

As far as I know, IE always asks the server and, if needed, gets an updated copy from it (see 2-B above).
IE always behaves correctly and gives the user the right copy of the PDF or HTML page, whatever you set in its prefs. It only makes mistakes with .js files, as we noticed.

"Sic rebus stantibus", I'll set 1 KB of cache memory in Firefox prefs.

Regards.
See http://www.mozilla.org/quality/networking/docs/netprefs.html for information on the browser.cache.check_doc_frequency preference.  It controls how often the server is consulted about potentially out-of-date items.  Again, there is interaction between this setting and what the server claims about an item.
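
As a rough illustration of that preference, the sketch below models its four values (0 = once per session, 1 = every time, 2 = never, 3 = when the page is out of date, which is the default). The function `contacts_server` and its parameters are hypothetical, and the model is deliberately simplified: it ignores server directives that can force or forbid validation regardless of the preference.

```python
# Values of the browser.cache.check_doc_frequency preference.
ONCE_PER_SESSION = 0
EVERY_TIME = 1
NEVER = 2
WHEN_OUT_OF_DATE = 3  # the default


def contacts_server(pref: int, is_fresh: bool, checked_this_session: bool) -> bool:
    """Simplified model: does the browser ask the server about a cached item?"""
    if pref == EVERY_TIME:
        return True
    if pref == NEVER:
        return False
    if pref == ONCE_PER_SESSION:
        return not checked_this_session
    if pref == WHEN_OUT_OF_DATE:
        # Only stale items trigger a check; fresh ones come straight from cache.
        return not is_fresh
    raise ValueError(f"unknown preference value: {pref}")
```

With the default value, a cached item that still counts as fresh is served without any request, which matches the behavior described in this report.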

>>3) It can load the file from the cache and not ask the server anything.
>3) who needs that?

Internet Explorer has the exact same functionality.  There are probably some differences in implementation, and you may have changed your settings (see Internet Options, General tab, and then Settings... in the Temporary Internet files section).  People with slow connections or who pay for usage find this useful.
(In reply to comment #6)

> >3) It can load the file from the cache and not ask the server anything.
> >3) who needs that?
> 
> People with slow connections or who pay for usage find this useful.

A few days ago I visited a Physics site - in this case, a normal HTML page with frames (not mine, with no images, say 5-10 KB in size). 
The same day the webmaster changed that page. 
The following day I visited the page again and I saw no changes. 

Emptying the cache, I realized that I missed something (important to me).
But if people never empty the cache and leave Firefox in its standard config... what happens???

--> People with slow connections or who pay for usage DO NOT find this useful.

This is the ONLY thing for which I, my customers and other programmers or engineers (like me) prefer IE's implementation. Really.
Sometimes we use it because, as we say, "it does not have the bug".

Please think about it.

Of course, apart from that, Firefox is a (better) new world of browsing.
The whole world thanks you for your hard work.

Bye, and thanks again.
I agree with ing.emmebì
We have the same problems
We just discovered this problem too.  It's quite dangerous.  I have set up a snapshot of our Subversion project on our web server.  Using "post_commit", I cause this snapshot to be updated whenever someone commits a file.  Note that I use an auto-generated dir listing (via apache) for this, which shows date last modified.

A user found this problem using Firefox and also reported that IE works correctly.  What is worse than the fact that the old file contents are cached is that the dir listing page shows the "last modified" time correctly!  In other words, the user thinks the file is the new version based on the time listed, but when clicking the file, he will get the old contents with no indication that it is stale info (file does not match mod time listed).  Misleading at best...

Another strange thing is that the "about:cache" shows the last mod time as correct too (i.e. the later time) even though the cached file is not the file matching that later time.

I implore someone to take this seriously, since it is definitely an issue.  I would expect the browser to at least check the server to see if a new version is available (not much network traffic is required for that, right?) and invalidate the cached copy if so.

I suspect that the user is using default behavior for both IE and Firefox and still gets this behavior.  Is this not a bug?

Thanks...!  Joe
> I would expect the browser to at least check the server to see if a new version
> is available (not much network traffic is required for that, right?) and
> invalidate the cached copy if so

That's not how the HTTP spec works though.  If an item is cacheable, and is fresh enough, the browser does not need to check the server.  Whether it does so anyway is a user-configurable option.  If you want to guarantee that only the latest version of the file is seen, you need to have the server mark the pages to not be cached or to always be considered expired.  That way every browser will always check if it has the latest version.
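
To make "fresh enough" concrete, here is a minimal sketch of how a cache might compute a freshness lifetime under HTTP/1.1. The function name and parameters are hypothetical, all times are in seconds, and the 10% rule is the heuristic the HTTP/1.1 specification mentions as typical; that Firefox used exactly this fraction is an assumption.

```python
from typing import Optional


def freshness_lifetime(
    max_age: Optional[float],
    expires: Optional[float],
    date: float,
    last_modified: Optional[float],
) -> float:
    """Seconds a stored response may be reused without asking the server."""
    if max_age is not None:
        return max(0.0, max_age)           # explicit Cache-Control: max-age
    if expires is not None:
        return max(0.0, expires - date)    # explicit Expires header
    if last_modified is not None:
        # No explicit expiry: guess from how long the file had been unchanged.
        return max(0.0, (date - last_modified) / 10.0)
    return 0.0                             # nothing to go on: treat as stale
```

Under these assumptions, a file served with only a Last-Modified header from ten days earlier would count as fresh for about a day, which is why a replaced file can keep coming from the cache. A server that sends `max-age=0` (or an Expires value that is not in the future) yields a lifetime of zero, so the browser has to check every time.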
I understand what you are saying.  Do you happen to know, off hand, how to cause Apache to specify "never cache" on automatically generated dir listings (the ones where no HTML exists)?

Also, one thing that makes me think something is amiss is the fact that the dir list itself gets updated with the new date of modification, whereas clicking the file link gives the old content.  This is especially misleading.  Could it be that Firefox is always refreshing the dir list attributes but not the file data?  It probably should be one way or the other but not both.

If this is working as specified in HTTP, maybe it's a bug in Apache (?).  Again, if anyone knows how to ensure Apache says, "don't cache," let me know.

Thanks, Joe
Reporter, do you still see this problem with the latest Firefox 2? If not, can you please close this bug as WORKSFORME. Thanks!
Whiteboard: CLOSEME 07/14
Version: unspecified → 1.0 Branch
(In reply to comment #12)
> Reporter, do you still see this problem with the latest Firefox 2? If not, can
> you please close this bug as WORKSFORME. Thanks!

I cannot see the problem with 2.x, so we should close.  Thanks!
Hmm, how do I close this?  This bugzilla interface seems to have no option for changing status.
Status: UNCONFIRMED → RESOLVED
Closed: 17 years ago
Resolution: --- → WORKSFORME