Closed Bug 209242 Opened 21 years ago Closed 21 years ago

[FIX]back button not using cached pages (when page headers say not to cache)

Categories

(Core :: Networking: Cache, defect, P1)

x86
All
defect

Tracking

()

RESOLVED FIXED
mozilla1.5beta

People

(Reporter: ehume, Assigned: bzbarsky)

References

()

Details

(Keywords: perf, regression)

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4b) Gecko/20030525 Mozilla Firebird/0.6
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4b) Gecko/20030526 Mozilla Firebird/0.6

On all FB builds starting 2003-05-26, pages that should be loading from cache
are loading from the web.

Reproducible: Always

Steps to Reproduce:
1. Open a web page
2. Navigate to a new page
3. Go back to first page.

Actual Results:  
Page loads from web rather than from cache.

Expected Results:  
Load page from cache

Using about:config, tried all settings from 0 to 3. Could not get page to load
from cache. 2003-05-25 works as expected.
For a blow-by-blow, here is the MZ thread:
http://forums.mozillazine.org/viewtopic.php?p=97331#97331
Ed, if you type about:cache into the URL bar, does it report "normal" looking
numbers, or are the numbers wrong?  If the numbers look like they're completely
wrong, this may be bug 193454.
I went back to the June 3 build (used profile from when I had that build in use)
and got normal looking numbers. Page still wasn't being stored in cache however. 
When I look at the numbers in about:config, they look fine: the check cache
frequency is 3, which is default.

I have discovered that some pages behave differently. For example,
http://www.techcentralstation.com/1051/ will not reload from cache after
2003-05-25. Yet http://story.news.yahoo.com/news?tmpl=index&cid=716 does not
show any loading from web activity on the status bar. This page, at any rate,
seems to load from cache. 
Ed, I meant about:cache, not about:config.
The cache numbers look OK: disk cache is 10000 k and memory cache is 18432 k
The server specified in the URL doesn't seem to generate any Cache-Control or
Expire headers, as such I would expect the browser to go back to the origin server

I guess it would be interesting to sit down with Ethereal and see the traffic
going back/forth
Granted that the server has left some things off. But the fact remains that
broswers up to 2003-05-25 behave differently from browsers after 2003-05-25, and
that difference slows the browser down.
Ed, I checked this with several SeaMonkey nightlies and I can reproduce this
problem there, too!

Now just to summarize: we have pages, which still load from cache and others,
which don't (comment #4) and the page given in the bug report seems to not care
about cache control or expire headers (comment #7) - so perhaps the logic
changed on how to handle pages _without_ cache control info / expire headers?

To be honest: in this case I would expect Moz.Firebird / SeaMonkey to re-fetch
the page if no expire information is given, since we can't decide if the already
fetched data is still up-2-date! 

Now with the given info from comment #7 the builds from 2003-05-25 and prior
seems to behave faulty and starting with 2003-05-26 the builds seem to handle
such pages with missing caching instructions correctly. Hence this bug seems to
be invalid if we don't find a page _with_ given cache control info which gets
still loaded from web regardless of the cache instructions... :) 

=> Added status "Invalid?" and tweaked summary a bit
Summary: cache not working → cache not working on given page
Whiteboard: Invalid?
if there's a change in Seamonkey as well, then this isn't a Firebird issue.

Seems that when moving back through several pages, we're now revalidating things
rather than just using the cached version always. That may or may not be a bug,
I'm not sure, but it does seem to be a change. I can't see any checkins on
25th/26th May which look like an intentional change to this...
Assignee: blaker → gordon
Status: UNCONFIRMED → NEW
Component: General → Networking: Cache
Ever confirmed: true
OS: Windows XP → All
Product: Phoenix → Browser
QA Contact: asa → cacheqa
Version: unspecified → Trunk
Re-#9: I don't think invalid is appropriate. Here's why:

If I was coming brand new to a page, I'd want the browser to check on the
currency of the page stored in cache, and load a new copy from the net if there
was a doubt. I remember that the View and Mail pages of Jerry pournelle's old
site (before he switched web hosts) did not reliably update on earlier builds of
Px, for example.

OTOH, when I am moving back and forth along my history trail, I don't want the
browser updating the pages. I want the browser to show me the page I was looking
at, not any changes to the page. If I want changes in a history string, I can
hit the refresh button.

Thus, I think the browser should behave differently when it is going Back or
Forward as opposed to coming new to a page via a link, via typed input on the
URI bar or via a bookmark.
The Mozillazine forums are showing the same behavior as the TCS website: with a
build up through 2003-05-25 one backstracks from cache. With a build later than
2003-05-25, to backtrack the previous page must be reloaded from network. This
can take a long time with a website as labored as MZ.
Re comment #12:  This is a major annoyance for those of us who use dialup
connections.  Going through the MZ forums is now excruciatingly slow because the
msg list stupidly and unnecessarily refreshes when I press the back button.

There is a user preference that should determine (on a scale of zero to three)
when to check for new versions of cached pages, but that setting no longer seems
to have any effect.

Re comment #9:  NO!!  The older behavior was correct and the newer one is wrong.



Having just discussed this on IRC, this is a valid bug. These pages shouldn't be
displayed from the cache in general (because they have headers which tell the
browser not to cache them), but the back button should always use the cache, as
it did previously.
Summary: cache not working on given page → back button not using cached pages (when page headers say not to cache)
Whiteboard: Invalid?
just for the record, the builds mentioned in the report are 2003052508 (caches
ok) and 2003052608 (reloads on hitting back)
Keywords: regression
I have done some detective-work and I think I have found the code that caused
this bug. 

The only change I have made to the code is in
mozilla\content\base\src\nsImageLoadingContent.cpp.
Line 374-
/*
* Non-interface methods
*/
nsresult
nsImageLoadingContent::ImageURIChanged(const nsACString& aNewURI)
{
if (!mLoadingEnabled) {
return NS_OK;
}
///////I have added this from an earlier version 1.11/////////////
if (aNewURI.IsEmpty()) {
// Do not take down the already loaded image... (for compat with
// the old code that loaded images from frames)
// XXXbz is this what we really want?
return NS_OK;
}
////////////// END /////////////////////////////////////////////////////

nsresult rv; // XXXbz Should failures in this method fire onerror?
----------

This seems to solve the problem with pages not loading from cache but maby
someting else is broken? Hope this helps to get rid of this very annoying bug.

Here is where and how I found it maybe this is more readable.
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=MozillaTinderboxAll&branch=HEAD&branchtype=match&dir=mozilla&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=05%2F24%2F2003&maxdate=05%2F26%2F2003&cvsroot=%2Fcvsroot

http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&root=/cvsroot&subdir=mozilla/content/base/src&command=DIFF_FRAMESET&root=/cvsroot&file=nsImageLoadingContent.cpp&rev1=1.11&rev2=1.12
Attached patch patchSplinter Review
Taking.  For those, who are interested, the markup that caused the problem is:

      <input type="image" src="http://images.paypal.com/images/x-click-but04.gif"
       name="submit" alt="PayPal &#8212; Donate">

What happened was that the <input> got created (and defaults to type="text"),
then the type attribute got set.  Without this patch, that causes an attempt to
load the current src, and since src is not set it causes us to load "", and with
the change mentioned in comment 16 (many thanks for tracking that down, Daniel!)
that leads to the input trying to load the page URI itself.  But that's got a
no-cache header, so we have to hit the network (which is odd; I'd think that
child loads of history loads would get the same flags....) and the rest is history.

This patch brings nsHTMLInputElement more in line with what nsHTMLImageElement
does nowadays.
Assignee: gordon → bzbarsky
Keywords: perf
Priority: -- → P1
Summary: back button not using cached pages (when page headers say not to cache) → [FIX]back button not using cached pages (when page headers say not to cache)
Target Milestone: --- → mozilla1.5beta
Comment on attachment 127992 [details] [diff] [review]
patch

We really need to fix this issue of how stuff is inserted in the document..
Attachment #127992 - Flags: superreview?(jst)
Attachment #127992 - Flags: review?(caillon)
Comment on attachment 127992 [details] [diff] [review]
patch

> NS_IMETHODIMP
> nsHTMLInputElement::SetDocument(nsIDocument* aDocument, PRBool aDeep,
>                                 PRBool aCompileEventHandlers)
> {
>+  PRBool documentChanging = (aDocument != mDocument);
>+
>   // SetDocument() sets the form and that takes care of form's WillRemove
>   // so we just have to take care of the case where we're removing from the
>   // document and we don't have a form
>-  if (!aDocument && !mForm && mType == NS_FORM_INPUT_RADIO) {
>+  if (documentChanging && !mForm && mType == NS_FORM_INPUT_RADIO) {

Before we'd get here if aDocument was null.  Don't you want |documentChanging|
to be something like:

  |PRBool documentChanging = aDocument && aDocument != mDocument|

(Also note the lack of parentheses, as they are unneeded).  r=caillon
otherwise.
Attachment #127992 - Flags: review?(caillon) → review+
Pardon my point of view on this, but why not simply back out the code that
caused the problem in the first place? Then the original problem it was supposed
to solve can be re-thought and a different solution tried.
Ed, "the code that caused the problem in the first place" is correct.  It just
exposed a bug in another set of code (basically, the input code was relying on a
bug in the image loader; biesi fixed the image loader bug, so the input code was
now causing undesired effects).  The attached patch fixes the input code to not
be broken.

caillon,

> Don't you want |documentChanging|
> to be something like:
>
>  |PRBool documentChanging = aDocument && aDocument != mDocument|

No, I do not.  I want to execute that first set of code (removal from radio
group) if I am being removed from the document or moved to a different document;
this means I want to execute it when aDocument != mDocument, regardless of
whether aDocument is null or not.

The insertion into a radio group should only happen if my document changed _and_
the new document is not null, on the other hand.
Thank you bzbarsky for the clear explanation.
Comment on attachment 127992 [details] [diff] [review]
patch

sr=jst
Attachment #127992 - Flags: superreview?(jst) → superreview+
Fix checked in.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
I've been struggling with this behavior and wanted to make sure I understand
that Mozilla/Firefox will not re-check a page when using the back button even
when that page issued the appropriate expires/no-cache headers?  Isn't that the
point of these headers?  To keep pages from caching, either by back button, or link?

In my view I would expect the browser to make a new request when going back to
an expired page.  In a web application environment, once a user logs out of the
application, I don't want them to go back in and see expired pages.  I also
don't want to have to worry about changing cache seetings on the client.  Even
though I shudder when I say this, IE performs as I expect, re-checking the page
regardless of the user's cache settings.
First, comment 26 is not related to this bug at all.

Second,

> In my view I would expect the browser to make a new request when going back to
> an expired page.

The HTTP specification very clearly says that your expectation is wrong.  It
clearly says that caching and session history are completely separate and that
content can and should be accessible from session history even if it's expired
from the HTTP cache.

Now we make an exception for secure sites -- if an https:// page requests no
caching, then we will not make it available from session history either.  This
is to handle the exact case you describe for cases that clearly are caring about
security.
Thanks for reply.  I was just about to check the HTTP spec (I know, should've
done that first), and I appreciate the clarification.  I apologize if this
wasn't related to this bug, but it may be helpful for people having the same
issue as me who stumble across this bug report as I did.
(In reply to comment #28)
> Thanks for reply.  I was just about to check the HTTP spec (I know, should've
> done that first), and I appreciate the clarification.  I apologize if this
> wasn't related to this bug, but it may be helpful for people having the same
> issue as me who stumble across this bug report as I did.

you want cache-control: no-store
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: