Closed Bug 112564 Opened 23 years ago Closed 23 years ago

Cache-Control: no-cache should not affect back/forward buttons

Categories

(Core :: Networking: HTTP, defect, P1)

Tracking

()

RESOLVED FIXED
mozilla0.9.8

People

(Reporter: seminar_, Assigned: darin.moz)

References

()

Details

(Keywords: dataloss, perf, topperf)

Attachments

(1 file, 1 obsolete file)

<meta HTTP-EQUIV="Pragma" CONTENT="no-cache"> allows Back

HERE IS WHAT darin WROTE IN THE NEWS GROUP:

Re: <meta HTTP-EQUIV="Pragma"  CONTENT="no-cache">
Date:  Wed, 28 Nov 2001 17:40:11 -0800
From: Darin Fisher <darin@netscape.com>
Newsgroups: netscape.public.mozilla.netlib

we had a long debate on this and the fact of the matter is that we must
honor 'cache-control: no-cache' on back/forward.  this is unfortunate
because the spec would allow us to not do this, but unfortunately many
important web servers depend on this behavior, and would block mozilla
from accessing their web servers if we didn't implement back/forward in
this manner.
Why do they depend on this and why would they block Mozilla?

I suggest a WONTFIX for this. According to RFC 2616, "Pragma: no-cache" and 
"Cache-Control: no-cache" SHOULD NOT affect the back/forward buttons. Currently 
"Pragma/Cache-Control: no-cache" will affect back/forward when sent as headers, 
but not when written as <meta> tags. This bug is about making <meta> tags work 
the same way as headers; I suggest we do the opposite - not letting no-cache 
affect the back/forward buttons at all. Why? For two reasons:

1) It's what the spec says. Mozilla is all about following the standards, right?

2) I find this behaviour *extremely* annoying, both as an end-user (why
doesn't the stupid back button work when I'm offline?!?) and as a
webmaster (I store some data in the session cookie when this page is
loaded, therefore we will get into trouble later if this page is
retrieved from cache, but I can't send a header telling the browser not
to cache it because then the back button won't work!!!).
So.. we want to make no-cache do the same thing as no-store?
Huh? Changing component to HTTP.
Assignee: gordon → darin
Component: Networking: Cache → Networking: HTTP
on netscape 4x and IE5 you have the same exact behavior as mozilla w.r.t.
back/forward validating 'no-cache' content with the server before displaying.

as i said this is a standard requirement of web browsers.  the authors of the
http spec clearly wish it were otherwise (since they added it as a SHOULD
recommendation).  but, we cannot change the standard simply by dictating that it
should be so.  as i've said existing web sites depend on this, and will block
mozilla if it does not comply with this recognized standard.  some online banks
believe that this is necessary to provide a level of security for their users. 
while one could argue this, we'd rather not have every online bank claim that
mozilla is an insecure browser.

so, for the immediate future it just makes sense to follow the standard laid
out by existing browsers.  long term i hope that we can provide a better
solution, since mozilla once did ignore 'no-cache' on back/forward.
Oooh, the revenge of bug 101832
Going against the suggestion in the spec is a bad idea. 

Back/forwards performance is being damaged for a significant percentage of pages
in order that a tiny percentage of (somehow "important") sites let Mozilla in.
I think damaging Mozilla's usability on a large number of sites for the sake of
a few is just brain dead. I don't see any reason to treat banks as special. What
tiny percentage of page views are done at banks?

If the module owners aren't willing to follow the spec's suggestion wholesale
for some political reasons then there must at least be some concession to
general usability. Something like:

a) Having a pref to switch between "silly bank friendly" and "user friendly"
modes for history, perhaps in "silly bank friendly" mode by default. I don't
like this idea, as most users won't know it's there and will still think Mozilla
is slow when they hit their back button.
b) Have the "silly bank friendly" mode enabled for SSL and "user friendly" for
normal sites. This is a horrendously crufty solution but ultimately will give
people a better browsing experience most of the time and make the silly banks
happy.
c) Something else.

If Mozilla is to compete it needs to offer the user a better experience. You
don't do that by copying errant behaviour. You do that by improving upon it.
Opera do, which is why I use it for 99% of my browsing (including at least one
of my two banks, the other using some odd java applet that only seems to load
randomly in IE in any case).
For banks and other sites that don't want the page to be stored at all, we
have Cache-Control: no-store. If some sites depend on wrong behavior, those
sites should be changed.

Mozilla is all about following the standards, not breaking the specs, always
doing the right thing. Let's not change that. (Nobody mention favicons. Or
innerHTML.)
ic your point, and i understand your concerns... i share your concerns, but i
also want to make sure that people don't frown at mozilla when it appears to
break existing web browser behavior.  people won't blame the website... they'll
blame the browser since in this case it'll look very much like a problem with
the browser.

consider this common website design:

1) user logs in
2) user checks their bank account
3) user clicks on logout button, which takes them to a new page saying "goodbye"
4) user presses back button

on IE5&6, Netscape4.7, Netscape6.2:
5) user is asked to login again

on mozilla before bug 101832 was fixed:
5) user is shown their bank account again... user grumbles something about the
fact that they thought they had logged out... user tries IE and then frowns at
mozilla for "getting it wrong."  user stops using mozilla when visiting secure
websites.  you get the point.

and online banks will block browsers if they behave this way... maybe not right
away, and maybe not all at once... but they will and they have.

in the future, we hope to provide two modes for mozilla... a kiosk mode and a
home user mode.  the home user mode would of course have no need to make
back/forward honor 'no-cache' headers.  but in kiosk mode it would be critical.
there are a number of features that could be enabled in a home user browser but
that should be disabled in kiosk applications.
<sigh> I still think we should do the Right Thing(tm).

You said in the newsgroup that using no-store instead of no-cache...

  »would solve this problem, but it then prevents the user from saving the
  page or viewing the source of the page without refetching the page from
  the server.  if the page is the result of a credit card transaction the
  user would be out of luck.«

Actually the spec says that we MAY keep a copy of the page for saving and
view-sourcing - just as long as we don't cache it. I don't see a problem with
that, as long as we delete it immediately when the user leaves the page.
Darin:
Please define "Common". I assume you are using it as a synonym for rare, because
a tiny percentage of sites would "rely" on this behaviour. I frankly don't see
what the problem is with us being locked out. It's been happening on many sites
during Mozilla's development. The others just get evang, why are we bending over
for these few sites?

FWIW both my banks 'solve' this 'problem' by creating a new window when you log
in and destroying it when you log out. 

Let's face it, if this isn't fixed the way it should be for the user by Mozilla 1
we're gonna be stuck with this for a long while.

And if Mozilla can't currently print/view source the current page without
refetching if it is sent with no-store that is a separate bug in Mozilla.
Clearly the current page should be in memory in a way that can be accessed for
such things.

Mozilla needs all the performance wins it can get. Reducing back/forwards
performance for a large percentage of sites for a small percentage of sites is
just bad maths.
please understand that we are talking about a very low percentage of the web
sites out there... we are only talking about websites that send a
'cache-control: no-cache' (or equivalent) header with their content.  most
websites don't set this header for the toplevel document.
See also:

bug 65947 "form password is remembered when hitting back button" (fixed as a 
form manager bug rather than as a cache bug).
bug 60877 "sign-out from hotmail and then clicking on back button lands up 
inside the user's mail account" (invalid)
bug 86264 "Content sent with 'Cache-Control: no-store' should _only_ be reused 
when browsing via history." (wontfix?)
bug 90288 "not honoring 'Pragma: no-cache' from HTTP-EQUIV". (fixed?)
bug 104162 META http-equiv for Pragma, Cache-Control with content no-cache,
no-store should remove Session History. (open)

I can't find a "fixed" bug about making no-cache break the back button.  Has
no-cache always broken Mozilla's back button, or am I just searching incorrectly?
The "Fixed" bug you speak of is bug 101832
Darin:
Is it only cache-control: no-cache (and equiv), or other caching directives too?
For example http://www.cnn.com/ sets:
Cache-Control: private,max-age=60
If I click on an article, take longer than 60 seconds to read it, then
click back, is a reload forced?
no-cache sites:
http://my.netscape.com/
http://my.aol.com/
http://slashdot.org/ (when logged in).
I'm sure I could go on......
paul: an expired page will not be reloaded.  back/forward behavior is equivalent
to the normal browsing behavior when "validate never" is selected in the cache
preferences panel.  this preference is overridden if the server sends one or more of:

Cache-control: no-cache
Cache-control: no-store
Cache-control: must-revalidate
Pragma: no-cache
Expires: <some date in the past>

(the 'must-revalidate' case is actually a bug, see bug 94121.)

jessy: see bug 101832.
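
The rule described above can be sketched as a small predicate. This is only an illustration of the listed conditions, not Necko code: `ResponseInfo` and `MustRevalidateOnHistoryNav` are hypothetical names, header values are assumed to be lower-cased already, and "a date in the past" is assumed to mean an Expires value at or before the response's Date. Plain substring matching is used for brevity, so it would also match variants such as `no-cache=foo`.

```cpp
#include <ctime>
#include <string>

// Hypothetical summary of the headers that matter for back/forward.
struct ResponseInfo {
    std::string cacheControl;   // Cache-Control value, lower-cased, may be empty
    std::string pragma;         // Pragma value, lower-cased, may be empty
    bool hasExpires = false;    // true if an Expires header was sent
    std::time_t expires = 0;    // parsed Expires value
    std::time_t date = 0;       // parsed Date value of the response
};

static bool Contains(const std::string &haystack, const char *needle) {
    return haystack.find(needle) != std::string::npos;
}

// Back/forward acts like "validate never" unless one of the listed
// headers is present, in which case the server is contacted first.
bool MustRevalidateOnHistoryNav(const ResponseInfo &r) {
    if (Contains(r.cacheControl, "no-cache")) return true;
    if (Contains(r.cacheControl, "no-store")) return true;
    // Described in the comment above as a bug (bug 94121).
    if (Contains(r.cacheControl, "must-revalidate")) return true;
    if (Contains(r.pragma, "no-cache")) return true;
    // Assumption: "Expires: <some date in the past>" means already
    // expired when the response was generated.
    if (r.hasExpires && r.expires <= r.date) return true;
    return false;
}
```

Under this sketch the cnn.com response quoted earlier (`private,max-age=60`) does not force a reload on back, even after the 60 seconds have passed, while any of the five listed headers does.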
*** This bug has been confirmed by popular vote. ***
Status: UNCONFIRMED → NEW
Ever confirmed: true
See also bug 105395, "Moving back in history doesn't remember scroll position 
sometimes", wontfixed.

Slashdot uses no-cache, probably to prevent remote (ISP) caches from 
interfering with user preferences regarding the display of comments.  I really 
doubt the site is trying to make the going "back" slower.  "Fixing" bug 101832 
caused a major perf regression at all Slash-based sites, and probably other 
sites as well.

As other users pointed out on bug 101832:
- Many banks tell the user to close the window on logout, or even close the 
window for the user.
- Banks are a small minority of sites.  They should find another solution 
rather than asking us to reduce speed and usability for all users (except those 
who habitually open a new window for each link) at a large number of sites.
- The http standard says browsers "should" ignore no-cache for session history 
purposes.  Violating an rfc "should" statement requires a good reason.
- Mozilla could honor no-cache only on https sites.  This would be kludgy but 
would satisfy most users.
i can see making this a https only behavior... perhaps that is the best compromise. 
Setting all/all, major, changing summary from "<meta HTTP-EQUIV="Pragma" 
CONTENT="no-cache"> allows Back" to "Cache-Control: no-cache should not affect 
back/forward buttons".

A different bug should be filed (if there isn't one already) about
Cache-Control: no-store not allowing the user to save the page or view the source.
Severity: normal → major
OS: Windows NT → All
Hardware: PC → All
Summary: <meta HTTP-EQUIV="Pragma" CONTENT="no-cache"> allows Back → Cache-Control: no-cache should not affect back/forward buttons
Adding dataloss, perf and topperf keywords. Nominating for 0.9.8 because I think
a bug this important should be fixed in good time before 1.0.

(BTW, will good old bug 40867 or one of the bugs depending on it fix the problem
with no-store?)
Oops... I accidentally added the 0.9.9 keyword instead of 0.9.8. Also forgot to
mention that I added the nsCatFood keyword. I am /really/ sorry about the spam. :-(
I just want to apply my statements in bug 101832 to this bug, because this
behavior (not to mention the standards violation) annoys me to no end.  Not only
is the speed hit painful, but when I go back in a slashdot-like discussion, the
reloaded thread may look very different due to new posts and changed scores, and
I have to reorient.  I am not happy with the new window/new tab workarounds
because they are too awkward and easy to forget.
I think I can accept using Cache-Control: no-store (and only that) to prevent
saving into history.  However, ideally I would prefer that

1. There be a preference to put all pages into the history buffer, regardless of
cache control directives.  From RFC 2616: "History mechanisms and caches are
different" and "History buffers MAY store [no-store] responses".
2. If a page is not stored in history due to no-store, mozilla should not
automatically reload when I go to the page in the history.  It should instead
pop up a dialog telling me it was not stored, with an option to reload or abort
(rather like the repost dialog).

Both of these make history more reliable to me as a user.
So this is why /. has slowed down... (I wasn't aware that this particular
behaviour had changed when I commented on this thread in the newsgroup).

darin: We need to back the original patch out. I don't recall a long debate on
this particular issue, and the closest I remember to a discussion on this was
that back/forward would stay as-is, precisely because of all the dynamic sites
which send no-cache headers.

Have there actually been complaints on the scenario you gave in comment #7? If a
bank's database system actually allows users to press the logout button, press
back, and then reuse the existing session to do transactions, then it is the bank
which is broken.

no-cache sites are not a minority; no-store is, and I have no problem in doing
this for no-store. I wouldn't object too strenuously to making this https only,
though. The current situation is however completely broken, and a major bug from
both a performance and usability point of view.

The fact that this means that scroll position isn't restored is _extremely_
annoying, although I realise that that is, strictly speaking, a separate bug.
I can understand Netscape's worry about being dismissed as broken when it
doesn't work like IE to most out there. I can also understand (and support) the
Mozilla team's desire to make the browser work the way it is SUPPOSED to work.

Can't this be solved via prefs without risking forks and patch wars? Then
Netscape can distribute a prefs.js file that has this broken behavior set on by
default and mozilla and other vendors can set it in their distro the way they want.

It doesn't have to have a corresponding UI element and I'd suggest it doesn't
anyway. Too many settings can overwhelm most people...
This must not be done via a pref. Partly because we don't need Yet Another Pref,
and partly because banks don't want any way of this happening, even via a
hidden pref.
Blocks: 115520
We need to get this bug targeted at 0.9.8 and fixed soon.  Darin, can you do it?
 If not, I'm sure there are others who will take it.  I like Paul McGarry's idea
(at least, I saw him propose it first in
http://bugzilla.mozilla.org/show_bug.cgi?id=101832#c19) of distinguishing https
from http, and breaking Back only for https responses with no-cache.  It sounds
like you and bbaetz find that a good compromise, too.  If so, let's go!

/be
No longer blocks: 115520
Blocks: 115520
Oh, and do we need evangelism bugs filed against bogo-banks?  Cc'ing bclary.

/be
No longer blocks: 115520
Blocks: 115520
i'll take this for 0.9.8.  we shouldn't hit too much resistance if we choose to
modify the behavior for HTTP only.

but, that being said, there are two separate issues to consider.  one is
document-caching, and the other is layout-state-history.  here's my thoughts on
both of these...


document-caching:

neither NS4x nor IE5/6x store no-cache documents in their cache.  this means
that most internet users and web site designers are accustomed to back/forward
hitting the net for pages marked with no-cache.  this is the "standard" despite
the fact that the spec allows a user-agent to store such responses in its cache
for the purposes of back/forward.  breaking the "standard" in an effort to
follow RFC2616 may significantly reduce effective page views for some sites, and
could legitimately be termed an affront on their business model (think ad revenue)
-- i don't like this at all, and though i may be reaching here, i somehow
wouldn't be surprised if something like this came up one day.  does mozilla want
to deal with this?  does the mozilla community think that this is a no-issue? 
(ever wonder why /. bothers to send a no-cache header?)

now for HTTPS connections, if we stop pinging the server on back/forward for
no-cache pages, then several large online banks will block mozilla.  of course,
we could evangelise them to send no-store instead, but beware that this means
that mozilla will have no means by which to File->SaveAs such content to disk
w/o hitting the server (eg. without transferring funds from account A to account
B *again*).  are banks going to be happy if users cannot save a HTML copy of
their transaction receipts (for example) to their own harddrive?  maybe banks
don't care... but as a user, wouldn't you care?  i would.

you might ask: how does NS4x handle this problem?  Well, NS4x ignores
Cache-control: no-store, since it is a HTTP/1.1 specific header.  Ok, well..
what about IE5/6x then.. it speaks HTTP/1.1.. so, what about it?  Well, it seems
that IE knows how to generate HTML from its content model.  Try saving a
no-cache document from IE... you'll notice that it isn't exactly what you put on
the server ;-)  perhaps mozilla needs to know how to do this as well.


layout-state-history:

this includes things like scroll position and form values.  NS4x and IE5/6x do
not "remember" form values when you return to a no-cache page via the
back/forward buttons.  in mozilla, i believe it was convenient to disable saving
all layout state in order to achieve the same effect.  however, in mozilla this
has the side-effect of not restoring scroll position on back/forward to a
no-cache page.  it seems to me that mozilla should make a distinction here
between saving scroll position and saving form values.  or, perhaps for HTTP
connections, we could simply save all layout state, and not save any layout
state for HTTPS connections (w/ a bug filed to get the more preferred behavior
of only discarding form values).  (also: please let me know if there is an api
for saving only portions of layout state.  afaik, there is only a boolean
all-or-nothing flag currently.)


so what is the solution??

for mozilla0.9.8, i'd suggest the following:
1) only disable saving layout state if page is served up with a no-store header
or if page is served up with a no-cache header and the URL is https://
2) keep caching behavior as is.
3) file bug(s) to make mozilla smart enough to generate HTML from its content model.
4) file bug(s) to make docshell only discard form values (and not other layout
state) on no-cache pages.

long term solution (once 3 & 4 from above are resolved):
1) only disable saving _all_ layout state if page is served up with a no-store
header.
2) disable saving form values if page is served up with a no-cache header and
the URL is https://
3) figure out if we "can" disable validating no-cache content on back/forward.

long term we might also want to provide a kiosk mozilla configuration that banks
can "trust" in multi-user environments (such as in a library), and provide a
separate "home" version that is optimized for the single-user.  such a thing,
however, is probably just pie-in-the-sky :P
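
The two proposals above can be compared side by side in a short sketch. The types and function names here (`PageInfo`, `LayoutStatePolicy`, `ShortTermPolicy`, `LongTermPolicy`) are hypothetical and only encode the rules as listed; they are not docshell APIs, and the long-term variant assumes items 3 and 4 of the 0.9.8 list have been resolved so that scroll position and form values can be saved independently.

```cpp
// Hypothetical description of the page being left via back/forward.
struct PageInfo {
    bool noStore;   // served with a no-store header
    bool noCache;   // served with a no-cache header
    bool isHttps;   // URL scheme is https://
};

// Hypothetical result: which parts of layout state get saved.
struct LayoutStatePolicy {
    bool saveScrollPosition;
    bool saveFormValues;
};

// Proposal for mozilla0.9.8: layout state is all-or-nothing, and it is
// discarded only for no-store pages or for no-cache pages over https.
LayoutStatePolicy ShortTermPolicy(const PageInfo &p) {
    bool discard = p.noStore || (p.noCache && p.isHttps);
    return LayoutStatePolicy{!discard, !discard};
}

// Long-term proposal: only no-store drops everything; no-cache over
// https drops form values but keeps scroll position.
LayoutStatePolicy LongTermPolicy(const PageInfo &p) {
    if (p.noStore)
        return LayoutStatePolicy{false, false};
    if (p.noCache && p.isHttps)
        return LayoutStatePolicy{true, false};
    return LayoutStatePolicy{true, true};
}
```

For a no-cache page served over plain http (the Slashdot case), both policies keep scroll position and form values. The two differ only for no-cache pages over https, where the long-term policy would still restore scroll position.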
Priority: -- → P1
Target Milestone: --- → mozilla0.9.8
A comment on solution #3.  We should make it clear that this is _not_ a request 
to generate the "original" html based on the content model when we file the bug.
Once #3 is done, save as could try to use cache and use the content model if 
that fails.  Darin, is that the basic idea?  This proposed course of action 
looks promising....
bbaetz:

the banks in question didn't depend on no-cache to prevent users from logging
out and then pressing back and then issuing new transactions on an old session.
 that is not the problem here.

banks blocked mozilla/netscape6 because: if a user pressed logout, and left
their computer, another user could come up to their computer and press back and
then see the previous user's bank account statement.

it's reasonable to think that a user familiar with IE or NS4x would not think to
protect themselves by closing the browser window.  after all, they would most
likely recall that pressing back results in a login prompt.  ah hah!  not so in
mozilla/netscape6 (prior to the patch for bug 101832)!  now my snooping
roommate knows that i <fill in the blank> ;-)
bz: yes, that is exactly what i am proposing :-)
I believe that the form stuff will be solved once it all moves into content,
rather than layout. Most, if not all, of this work has been done by jkeiser.
This may now be a non-issue, in fact. Besides, the form stuff should depend
solely on the autocomplete="off" attribute - the other stuff was added
piecemeal, as I recall.

jkeiser: what's the current state of this?

Does IE really hit the server with an unconditional request for the new document
when going back/forward, on a no-cache page?

/. has an image url which involves the current time, so the presence/absence of
no-cache shouldn't be an issue, unless we end up caching the entire document for
speed, but that's separate to this discussion. Or unless you have js enabled, I
guess - then it uses the server time. I guess we could only do this for the top
level document, but that sounds like too much of a hack.

Also, I presume that your first sentence meant to refer to https, not http
And the snooping roommate could just install a hacked mozilla version. Or open
the mail before the roommate, and read the bank statements the old fashioned way.

But yes, point taken for https.
Oh, and do we want to throw cache-control: private into the mix? (Yes, this
isn't what its meant for, I know. But neither is anything else)
> 3) file bug(s) to make mozilla smart enough to generate HTML from its content
> model.

No! If I want to save a page, I want to save the page with all the original 
HTML. I don't just want something that looks the same. If that was the case, I 
would take a screen shot.

We should ALWAYS store the page in memory so we can save it and view source on 
it.
Would it be a good idea to limit the discussion to the rolling back of the
previous change in bug 101832 (for plain http at least) and file other,
separate, bugs on:

a)Mozilla's apparent inability to print/view-source/save the current page if
no-store is used. The _current_ page should _always_ be available for these
functions; if it isn't then we have a bug.

b)Other potential back/forwards speed-ups (Opera appears to somehow store fully
rendered pages (an entire 'canvas'?), if you select some text, go back, select
some text and go forwards and back even the selections are remembered and the
whole thing is lightning fast too.)

c)Other potential items to make the new (or old, but not current ;)
functionality clearer to the user. (such as a warning dialogue if you attempt to
resubmit a form from a page in history).

d)A nice, "Clear private data"/"Logout" function which returns the browser to
its homepage and kills all cookies/cache files/history for kiosks, public
terminals, paranoid roomies.
Paul McGarry: Yeah, I think it would be a good idea. I filed bug 115610 on the
no-store thing.
ok, i thought about this problem a little bit more, and it turns out that
because of our current back/forward impl, it is trivial to enable caching of
no-store content (in the memory cache only).  no-store content would only be
retrievable when using the LOAD_FROM_CACHE load flags, which correspond to
File->SaveAs and View->PageSource.  AFAIK there is no problem with printing
no-store documents, because printing uses the content model directly.

jonas: ic that you filed bug 115610 for no-store issues, and i think that there
is no need for a separate bug.  i'd prefer to use this bug for _the_ 0.9.8 patch.

bbaetz: i tested IE5.5 under win2k just now, and it does indeed refetch the
toplevel document from slashdot when logged in.  however, unlike mozilla it
remembers layout state when going back/forward.  also, i don't believe that
autocomplete=off is a complete solution since NS4x didn't honor autocomplete=off.
NS4x triggers off of "Pragma: no-cache" and i think so should we.  it's also
good to hear that there may be hope for distinguishing saving form values from
saving scroll position in the short term :)  oh, and my second sentence of
comment #27 does indeed refer to HTTP and _not_ HTTPS.  my point being that
making changes which affect HTTPS are more likely to upset websites than making
changes to HTTP, since with HTTP there are far fewer security concerns.
darin: Ah, so you meant modify the behaviour back to what it was for https, so
that we only differ from 4.x in http ;)

ns4 didn't recall form data for no-cache; forms would be required to use both
autocomplete=off and pragma: no-cache. They need both anyway so that password
manager doesn't trigger - see my comments in those bugs about this issue (which
is separate)
*** Bug 115610 has been marked as a duplicate of this bug. ***
The form history is still being saved/restored in frames, in exactly the same
way (there is a bug to move this into content).  Even when we move it to
content, though, I don't expect *how* we store the data to change.  Fixing this
to honor no-cache for save/restore should be a fairly simple job--just make the
save state stuff avoid saving if no-cache is true.
> LOAD_FROM_CACHE load flags

As a matter of fact, this is not a valid flag on nsIWebNavigation, which is what 
view source has to use... Would using LOAD_FLAGS_NONE on nsIWebNavigation lead 
to LOAD_FROM_CACHE being set on the channel?
bz:

actually, view source is badly broken right now... it needs to not only load
from cache, but also it needs to know the cacheKey stored in session history. 
without the cacheKey it is impossible to view source the result of POST
transaction.  i don't think view source should be using nsIWebNavigation... why
doesn't it talk directly to necko?

and fwiw: nsIWebNavigation::LOAD_FLAGS_NONE does not correspond to
nsIRequest::LOAD_FROM_CACHE.
Attached patch v1.0 patch (obsolete)
ok, here's first draft patch for this bug.  it does the following:

- we'll now put no-store pages in the memory cache that can only be retrieved
with the LOAD_FROM_CACHE load flag.

- introduces nsIHttpChannel::IsNoStoreResponse() and nsIHttpChannel::
IsNoCacheResponse().  these methods simplify the docshell implementation.

- layout state discarded only if (no-store || (no-cache && ssl))

- fixes bug 94121 'must-revalidate means ignore VALIDATE_NEVER'

- fixes part of bug 103944 'no-cache != no-cache=foo'

- nsHttpResponseHead no longer stores strings using nsXPIDLCString to reduce
footprint ever so slightly.
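
The first item in the list above (no-store pages kept in the memory cache, but retrievable only with LOAD_FROM_CACHE) can be pictured with a toy cache. This is a sketch under stated assumptions, not the patch itself: `MemoryCacheSketch`, `Put`, and `Get` are hypothetical names, and the `loadFromCache` argument merely stands in for the LOAD_FROM_CACHE load flag.

```cpp
#include <map>
#include <string>

// Toy in-memory cache illustrating the no-store rule; hypothetical API.
class MemoryCacheSketch {
public:
    void Put(const std::string &key, const std::string &body, bool noStore) {
        Entry e;
        e.body = body;
        e.noStore = noStore;
        mEntries[key] = e;
    }

    // Returns the cached body, or nullptr if the caller has to go to the
    // network. no-store entries are visible only when loadFromCache is
    // set, modelling loads such as File->SaveAs and View->PageSource.
    const std::string *Get(const std::string &key, bool loadFromCache) const {
        std::map<std::string, Entry>::const_iterator it = mEntries.find(key);
        if (it == mEntries.end())
            return nullptr;
        if (it->second.noStore && !loadFromCache)
            return nullptr;
        return &it->second.body;
    }

private:
    struct Entry {
        std::string body;
        bool noStore;
    };
    std::map<std::string, Entry> mEntries;
};
```

In this model a normal load or a back/forward load of a no-store page misses the cache and refetches, while saving or viewing source can reuse the copy already in memory.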
Blocks: 103944
Blocks: 94121
Comment on attachment 62008
v1.0 patch

>Index: docshell/base/nsDocShell.cpp
>===================================================================

>     if (cacheToken) {
>         // Check if the page has expired from cache 
>-        nsCOMPtr<nsICacheEntryDescriptor> cacheEntryDesc(do_QueryInterface(cacheToken));
>-        if (cacheEntryDesc) {        
>+        nsCOMPtr<nsICacheEntryInfo> cacheEntryInfo(do_QueryInterface(cacheToken));
>+        if (cacheEntryInfo) {        
>             PRUint32 expTime;         
>-            cacheEntryDesc->GetExpirationTime(&expTime);         
>+            cacheEntryInfo->GetExpirationTime(&expTime);         
>             PRUint32 now = PRTimeToSeconds(PR_Now());                  
>             if (expTime <=  now)            
>                 expired = PR_TRUE;         

Why this change?

>Index: netwerk/protocol/http/public/nsIHttpChannel.idl
>===================================================================

Is hard coding this in the idl, and letting docshell manage this, a good idea?
Is it possible for http to handle the login internally?

>Index: netwerk/protocol/http/src/Makefile.in
>===================================================================
>RCS file: /cvsroot/mozilla/netwerk/protocol/http/src/Makefile.in,v
>retrieving revision 1.54
>diff -u -r1.54 Makefile.in
>--- netwerk/protocol/http/src/Makefile.in	16 Dec 2001 17:01:42 -0000	1.54
>+++ netwerk/protocol/http/src/Makefile.in	17 Dec 2001 21:39:14 -0000
>@@ -36,6 +36,7 @@
> 		  intl \
> 		  exthandler \
> 		  caps \
>+          timer \
> 		  $(NULL)

You need to use tabs here, for the Makefile, since ISTR some versions of make
complaining about mixing tabs and spaces. ICBW.

>Index: netwerk/protocol/http/src/nsHttpChannel.cpp
>===================================================================

>@@ -905,7 +905,9 @@
> end:
>     mCachedContentIsValid = !doValidation;
> 
>-    if (doValidation) {
>+    // add validation headers unless the cached response is marked no-store...
>+    // this'll force no-store content to be refetched each time from the server.
>+    if (doValidation && !mCachedResponseHead->NoStore()) {
>         const char *val;
>         // Add If-Modified-Since header if a Last-Modified was given
>         val = mCachedResponseHead->PeekHeader(nsHttp::Last_Modified);

I thought that we didn't want to do this for no-store because then people could
go to about:cache and get the data?


>+PRBool
>+nsHttpResponseHead::MustValidateIfExpired()
>+{
>+    // according to RFC2616, section 14.9.4:
>+    //
>+    //  When the must-revalidate directive is present in a response received by a   
>+    //  cache, that cache MUST NOT use the entry after it becomes stale to respond to 
>+    //  a subsequent request without first revalidating it with the origin server.
>+    //
>+    const char *val = PeekHeader(nsHttp::Cache_Control);
>+    return val && PL_strcasestr(val, "must-revalidate");
>+}
>+

This won't affect saving, and so on, will it? Will it affect session history?

>
>+void
>+nsHttpResponseHead::ParseCacheControl(const char *val)
>+{
>+    if (!val) {
>+        // clear no-cache flag
>+        mCacheControlNoCache = PR_FALSE;
>+        return;
>+    }
>+    else if (!*val)
>+        return;
>+
>+    const char *s = val;
>+
>+    // search header value for occurance(s) of "no-cache" but ignore
>+    // occurance(s) of "no-cache=blah"
>+    while (s = PL_strcasestr(s, "no-cache")) {
>+        s += (sizeof("no-cache") - 1);
>+        if (*s != '=')
>+            mCacheControlNoCache = PR_TRUE;
>+    }
>+
>+    // search header value for occurance of "no-store" 
>+    if (PL_strcasestr(val, "no-store"))
>+        mCacheControlNoStore = PR_TRUE;
>+}

This isn't a total fix for that other bug (Cache-Control: no-cache-please will
still be matched), but that's OK.

Fix these, and if you've tested well (including all of arun's test pages),
r=bbaetz
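
For comparison, the partial match noted in the review (no-cache-please) goes away if the header value is split into directives and each one is compared whole. The following is a minimal sketch using the standard library rather than NSPR; `CacheControlFlags` and `ParseCacheControlTokens` are hypothetical names, and splitting on commas is a simplification that ignores commas inside quoted strings.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical result type for the two directives of interest.
struct CacheControlFlags {
    bool noCache = false;
    bool noStore = false;
};

// Strip surrounding whitespace and lower-case the directive.
static std::string TrimLower(const std::string &s) {
    std::string::size_type b = 0, e = s.size();
    while (b < e && std::isspace(static_cast<unsigned char>(s[b]))) ++b;
    while (e > b && std::isspace(static_cast<unsigned char>(s[e - 1]))) --e;
    std::string out = s.substr(b, e - b);
    std::transform(out.begin(), out.end(), out.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return out;
}

// Split on ',' and compare each directive exactly, so that neither
// "no-cache=foo" nor "no-cache-please" sets the no-cache flag.
CacheControlFlags ParseCacheControlTokens(const std::string &value) {
    CacheControlFlags flags;
    std::istringstream in(value);
    std::string directive;
    while (std::getline(in, directive, ',')) {
        std::string d = TrimLower(directive);
        if (d == "no-cache")
            flags.noCache = true;
        else if (d == "no-store")
            flags.noStore = true;
    }
    return flags;
}
```

The trade-off is a little more work per header than a single substring search, in exchange for not treating unknown directives that merely begin with no-cache as the real thing.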
Attachment #62008 - Flags: review+
Status: NEW → ASSIGNED
bbaetz: thx for looking at the patch...

1- change from nsICacheEntryDescriptor to nsICacheEntryInfo because docshell
doesn't need to know that it is a descriptor.

2- i don't understand your question about nsIHttpChannel... http doesn't know
anything about session history, so i need to provide docshell with the necessary
information to let it decide what our layout-state-remembering policy should be.

3- whoops.. that wasn't meant for this patch... a side effect of updating _only_
http/src after pav's timer landing.

4- most users don't know anything of about:cache and moreover there is no UI for
it.  it is essentially only a development feature.  i think we can live with
allowing users to access no-store content saved in the memory cache via 
about:cache.  think of it as a type of session history ;-)

5- MustValidateIfExpired will not affect saving or view-source, but it will affect
back/forward.  so, if a site serves up a page with must-revalidate, and the user
presses the back button, then if the page is stale, we'll first validate the
page with the server before displaying it.  this is consistent with honoring
no-cache on back/forward.

6- right, i said that this patch provides "part" of the fix for bug 103944.
2) I'm just worried that we're giving docshell too much information, and it's
making assumptions which are only going to be valid for http. What if HTTP/1.2
introduces another token? Will we need to update all the callers of these? But
yes, it's probably not that much of a concern - http-specific code is everywhere.

4) Banks Are Paranoid - that's why we have this bug. Wasn't someone concerned
about charts which you could access if you knew the exact image url? Or was that
something else?

5) "this is consistent with honoring no-cache on back/forward." Yes... But we're
not honoring no-cache on back/forward any more - that's the point of this bug.
I'm fine with behaving differently for must-revalidate, though.
bbaetz:

2) i'm not sure we can do any better

4) i don't recall a discussion about charts, and i'm not convinced that it's
wrong to store no-store content in the memory cache for the lifetime of the
browser session.  the spec explicitly does not object to this at least ;-)

5) see my comments above... how can we disable honoring no-cache on back/forward?
fwiw: IE6 and NS4x honor no-cache on back/forward, but opera does not.  so, i
suppose we wouldn't be alone in dropping support for no-cache on back/forward!
4) Maybe Arun remembers, or maybe it was just a hypothetical someone mentioned.

5) Err. We _are_ disabling no-cache on back/forwards. That's the point of this
bug, and is what your patch does. What am I missing?
OK, so I spoke to darin on IRC, and looked at the patch again.

This is fine. It's not ideal, but we'll see if people still notice a difference.
In comment #50, Bradley Baetz wrote:
> OK, so I spoke to darin on IRC, and looked at the patch again.

Could one of you please clarify for the rest of us?  From Darin's
description of the patch (sorry, I don't have time to understand the patch
itself), I did not understand how it affects the central issue of this bug.

Later, Darin said: "this is consistent with honoring no-cache on
back/forward."  I thought everyone agreed on the opposite, at least for
HTTP.  I hope I'm just confused.

> This is fine. It's not ideal, but we'll see if people still notice a
> difference.

Please, none of this condescending "see if the users notice".  Users 
patently want a fast history.  If the patch only "fixes" certain cases, the
issue will recur as soon as some site uses different cache control headers,
causing back/forward to become slow again.
andrew:

please read comment #27

please consider what i've said there before blindly thinking that i'm trying to
pull a fast one on you ;-)

fwiw: the growing consensus from those i've discussed this bug with (offline)
seems to be that YES we should not worry about the points i mentioned in comment
#27, and for that reason i am preparing a patch that would indeed disable
honoring no-cache on back/forward for HTTP only (not HTTPS) as was originally
discussed (see comment #17).  my point was that we should carefully consider
this break from the norm.
Darin, the last messages from you and Bradley seemed to suggest that you
were nearing closure on this issue.  Sorry if I was too impatient.

I did read comment #27, but it only increased my uncertainty, because you
said at the end that your short term plan was to "keep caching behavior as
is". 

Regardless, thanks for the update on your plans, and I'll try not to 
distract you from your current patch.
bbaetz: i've started noticing makefiles not using tabs for those lines. afaik 
no port has gone red and these non tabs are scattered around the tree.  I think 
the magic is in the \ line continuation character. I think it causes those 
lines which we see as having tab critical whitespaces to collapse into a single 
line which is then used as a simple variable. From looking at the .m*k files, 
the variable is probably just iterated over.

cls: am I right?
so after disabling honoring no-cache on back/forward, i went to slashdot.org
(with a login cookie set) and noticed that on back/forward mozilla still hits
the server.  turns out this is due to the javascript on the page that
document.write's <img> tags (i recall bbaetz already mentioned this).  the best
part is that this corresponds to none other than the ad banner at the top of the
page.  what's even more fun is the fact that the resulting image from slashdot
is always the same.  (it probably changed before when we were refetching the
toplevel document each time.)  why can't slashdot just return a 304?  oh well...
timeless: yup you are correct, but it doesn't matter for this patch since the
timer module doesn't exist anymore.
Yay! Seeing the same advert again is progress. I can remember at least a couple of occasions where 
I've tried to back up to click on a particular advert and found myself looking at a different, 
uninteresting advert instead.

If we can make Mozilla pull out fully rendered documents from 
history rather than re-building and re-executing javascript we're onto a real speed win I'd have 
thought...

timeless: old make (v7 Unix, IIRC) required a leading tab for all command lines
following a dependency, where the dependency and its commands form a rule.  I
bet this tab requirement is still with us.  dbaron and I try to use tabs in Unix
Makefiles (Makefile.in sources, actually) only, never in .cpp or .h files. 
There is no dispute about how many spaces a tab expands to on Unix (or if there
is, it doesn't matter when editing makefiles, because only one tab is ever used
to indent command lines in rules).

There is no magic about tabs in lines continued by backslashes, unless those
lines form a single command-line in a rule, in which case (as for commands that
fit one per line in rules), you must have a leading tab.  Or so it was for years
on all make(1) programs I had to deal with on Irix and other Unixes.

/be
Attached patch: v1.1 patch
this patch disables honoring no-cache on back/forward for HTTP connections.
HTTPS connections will still honor no-cache on back/forward.
Attachment #62008 - Attachment is obsolete: true
*** Bug 116830 has been marked as a duplicate of this bug. ***
*** Bug 116570 has been marked as a duplicate of this bug. ***
Comment on attachment 62334
v1.1 patch

fix up that macro I mentioned, other than that, r=dougt.
Attachment #62334 - Flags: review+
Comment on attachment 62334
v1.1 patch

Tricky doValidation logic in the no-store case, but I'm not able to make it
clearer.  sr=brendan@mozilla.org.

/be
Attachment #62334 - Flags: superreview+
fixed-on-trunk :-)

ok, so this patch does the following:

1) don't save layout state if (no-store or (no-cache and ssl))

2) put no-store content in cache to be retrieved only for file->saveAs and
view->source.

3) honor 'Cache-control: must-revalidate'

4) honor 'Cache-control: no-cache=Some-Header'

5) back/forward no longer requires server hit for pages marked no-cache unless
SSL.  pages marked no-store will be refetched from server on back/forward.

this solution eliminates the need for mozilla to know how to generate HTML for
File->SaveAs since the cache will nearly always have a copy.
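
The policy in points 1, 3 and 5 can be sketched roughly as follows. This is
only an illustration of the rules as listed, not the code in the patch;
ResponseInfo, HistoryAction, ShouldSaveLayoutState and BackForwardAction are
hypothetical names.

```cpp
// Hypothetical summary of the response properties the policy depends on.
struct ResponseInfo {
    bool noStore = false;         // Cache-Control: no-store
    bool noCache = false;         // Cache-Control: no-cache (or Pragma: no-cache)
    bool mustRevalidate = false;  // Cache-Control: must-revalidate
    bool isSSL = false;           // page was loaded over HTTPS
    bool isStale = false;         // cached copy has expired
};

enum class HistoryAction {
    UseCache,  // show the cached copy without touching the network
    Validate,  // ask the server whether the cached copy is still good
    Refetch    // fetch a fresh copy from the server
};

// Point 1: don't save layout state if (no-store or (no-cache and ssl)).
bool ShouldSaveLayoutState(const ResponseInfo &r)
{
    return !(r.noStore || (r.noCache && r.isSSL));
}

// Points 3 and 5: what back/forward does for a given response.
HistoryAction BackForwardAction(const ResponseInfo &r)
{
    if (r.noStore)
        return HistoryAction::Refetch;   // no-store is always refetched
    if (r.noCache && r.isSSL)
        return HistoryAction::Validate;  // no-cache still costs a server hit over SSL
    if (r.mustRevalidate && r.isStale)
        return HistoryAction::Validate;  // must-revalidate applies once stale
    return HistoryAction::UseCache;      // includes no-cache over plain HTTP
}
```

File->Save As and View->Source are deliberately left out of the sketch, since
per point 2 they read the cached copy even for no-store content.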

so, i'm going to mark this bug fixed :-)
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Perhaps a doc writeup on how our Cache now works would be useful (to
developers/banks) and interesting (to powerusers)?

I'll grab tomorrow's build and make sure nothing bad is happening, but let me
make sure I understand how we're supposed to work, so I'm not verifying the
wrong things.

Normal page: back/forward/save/print/source don't touch the network, ever. Even
if the page is "expired". If it's been purged from the cache, back/forward would
automatically refetch, but if the page is still open in a webshell,
save/print/source will ALWAYS have a local copy available. All layout state
remembered.

page with no-cache, non-ssl: Same as above? Would no-cache with mozilla for http
only affect intervening proxies?

page with no-cache, SSL: Only cached for save/print/source use. No layout state
saved. Back/forward refetch (refetch, or compare?).

no-store, http or https: Same as above? Again, only affecting proxies?

How does this interact with the annoying POSTDATA warning? Are POST results
supposed to be cached for session history for normal and http no-cache?

Thanks.
> Perhaps a doc writeup on how our Cache now works would be useful (to
> developers/banks) and interesting (to powerusers)?

one of these days... yes ;-)

> Normal page: ... but if the page is still open in a webshell,

no... just because a page is open in a browser window that does not guarantee
that the page is going to be in the cache.  it most likely will be in the cache
if you recently loaded that page in that browser window, however.

> page with no-cache, non-ssl: Same as above? Would no-cache with mozilla for 
> http only affect intervening proxies?

same as above for back/forward/save/view-source/print.
not sure what you mean by "only affect intervening proxies"

> page with no-cache, SSL: Only cached for save/print/source use. No layout 
> state saved. Back/forward refetch (refetch, or compare?).

compare end-to-end, if given a Last-Modified response header.

> no-store, http or https: Same as above? Again, only affecting proxies?

always refetch end-to-end no-store response unless save/view-source/print.

> How does this interact with the annoying POSTDATA warning? Are POST results
> supposed to be cached for session history for normal and http no-cache?

caching of post data follows the same rules.  the only difference is that there
is a secondary part to the cache key to identify the particular POST instance
(you can make several POST requests to the same URL).  session history is
supposed to remember the secondary key, and use it when going back/forward.  it
is also supposed to be used for save/view-source/print, but there are definitely
bugs in the current implementation.
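
A minimal sketch of the two-part key idea, assuming the secondary key is just
a counter handed out per POST. PostAwareCache, CacheKey and the method names
are hypothetical and do not reflect the real cache interfaces.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical key: the URL plus a per-POST identifier (0 for non-POST loads).
using CacheKey = std::pair<std::string, std::uint32_t>;

class PostAwareCache {
public:
    // Store a POST result and return its secondary key; session history
    // would remember this value alongside the URL.
    std::uint32_t StorePost(const std::string &url, const std::string &body)
    {
        std::uint32_t postId = ++mLastPostId;
        mEntries[CacheKey(url, postId)] = body;
        return postId;
    }

    // Store an ordinary (non-POST) response under secondary key 0.
    void StoreGet(const std::string &url, const std::string &body)
    {
        mEntries[CacheKey(url, 0)] = body;
    }

    // Look up by URL and secondary key; returns nullptr when nothing matches.
    const std::string *Lookup(const std::string &url, std::uint32_t postId = 0) const
    {
        auto it = mEntries.find(CacheKey(url, postId));
        return it == mEntries.end() ? nullptr : &it->second;
    }

private:
    std::map<CacheKey, std::string> mEntries;
    std::uint32_t mLastPostId = 0;
};
```

Two POSTs to the same URL get distinct keys, so going back to the first one
finds the first result rather than whichever response was stored last.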
> Perhaps a doc writeup on how our Cache now works would be useful (to
> developers/banks) and interesting (to powerusers)?

From a developer's point of view, this would be extremely valuable.  I can
assure you that developers waste a lot of time being befuddled by the
caching behavior of browsers, and frequently misuse HTTP headers and URL's
in an attempt to achieve their aims.

Can I give this a shot?  You may freely ignore it if it's sufficiently
off-base or subject to future change; but I think it is at least a decent
template for a more complete and accurate document.  (Please tell me if
there is a better place to send this than to this bug log.)

    Mozilla stores locally (either in memory or on the filesystem) a variety
    of information about resources it loads.  [ Does mozilla treat all
    protocols the same way for this purpose?  Ie, are file:// URL's cached?
    Presumably, "internal" protocols are not cached. ]  Here, "resource"
    refers to any sort of data, but documents in HTML or XML format are
    important special cases, which I will refer to as "HTML pages" for
    short.

    The raw "source" of a resource is the most obvious example of such
    information.  Another is the rendered form of a resource, ie a data
    structure generated by processing the source into something more
    suitable for displaying.  For an HTML page, this is [ what?  I don't
    know anything about mozilla/gecko's internal data structures. ]; for an
    image, it is the pixel array [ guessing ].  [ Of course, in general
    there may be many levels of "cooked" forms of resources; I don't know
    what exists in mozilla, or what is relevant to this discussion. ]  For
    HTML pages, there is also the "layout state", containing information
    such as the scroll position and any form entries.

    This document attempts to describe fully what of this information is
    used when.  The format will be to consider every scenario in which
    information about a resource is wanted, and for each, explain which
    locally stored information may be used, and how.

    Note that HTML pages (and maybe other types) often include other
    resources, such as images, script files, stylesheets, and frames.  These
    will be referred to as child resources.  In most cases, what is said
    about the main resource applies also to child resources, so children
    will only be mentioned when they are handled differently.  [ I'm
    assuming here that all types of child resources are dealt with in the
    same way, though I suspect it's not true.  I'm not sure that this whole
    matter is relevant--see the reload question. ]

    Case 1:  Displaying the current resource of a window or tab.  Mozilla
    always retains the fully-rendered form of all current resources, so no
    information is ever re-loaded, re-rendered, or lost.

    We may as well note here that the fully-rendered form is never stored
    for any other purpose, so in all other scenarios, the resource must be
    re-rendered from its raw form.  [ Could this change in the future?
    People suggest that Opera caches a parsed or rendered version for
    performance. ]

    Case 2:  Resizing a currently displayed resource.  [ I mention this
    mostly because of Netscape 4's awful behavior in this case.  Does
    mozilla re-render from source, or do something smarter? ]

    Case 3:  Saving, viewing source, or printing of a currently displayed
    resource.  The raw sources of current resources are always available in
    the cache for this purpose.  [ Is this guaranteed?  Ie, is it possible
    to load a page, then do lots of browsing in another window, and find the
    source of the first page has fallen out of the cache when I try to view
    source? ]

    Case 4:  Using the back and forward buttons to go to a resource.  [ How
    is the Tasks|Tools|History dialog related, if at all?  I hesitate to use
    the word "history" in this section, for fear of confusion. ]  The raw
    source is unconditionally retrieved from cache (if available) unless
    either 1) the resource was served with a Cache-Control: no-store header,
    or 2) the resource was served over HTTPS with a Cache-Control: no-cache
    (or Pragma: no-cache) header; in these cases, the source is revalidated
    (if possible) or reloaded.  (The headers may be true HTTP headers, or
    <meta http-equiv=...> elements in an HTML document.)  The layout state
    is retrieved from cache (keyed off the back/forward entry) [ Is this
    true?  If I have viewed a page in different windows, or twice in the
    same window, is the layout state preserved independently for each? ] if
    the raw source was taken from cache (either directly, or as the result
    of revalidation).  [ Are there any cases where the source is reloaded,
    but the layout state is used?  Is the layout state used for anything
    besides back/forward? ]  Note that a back/forward entry contains a URL
    and a POST instance (if relevant), and that cache access is keyed off
    both values.  [ Hmm... that is really not right:  It should simply key
    off of the back/forward entry, because a single non-POST URL could have
    different contents at different times, but this is currently lost. ] [
    Does mozilla know to throw away cached POST's when the browser exits? ]

    Case 5:  Loading a resource by typing a URL, following a link, or via
    scripting (eg, Javascript window.location) [ Does scripting add any
    special cases to this discussion? ].  If a cached copy of the source is
    available, mozilla implements the caching behavior described in the HTTP
    specification, in that it always reloads and revalidates in cases where
    the spec requires it.  [ How close is mozilla to the spec? ] [ Perhaps
    the rules should be sketched here; the spec is fairly dense. ]  It may
    also revalidate (if possible) or reload when the spec doesn't require
    it, as controlled by the Advanced|Cache preferences.  [ I don't
    understand these preferences, especially "Automatically". ]  Otherwise,
    the cached source is used.

    Case 6:  Loading a resource via reload button or Javascript
    window.location.reload().  [ Is there a difference between reload and shift-reload?
    I thought that historically, shift-reload would reload images, but that
    plain reload would only reload the main page. ]

    Case 7:  Loading a resource via shift-reload.  The resource is reloaded
    (not revalidated).
instead of using this bug, let's create another bug for this.  the aim being to
post it to mozilla.org eventually.
*** Bug 111793 has been marked as a duplicate of this bug. ***
Everybody was against it when this braindead behavior was first introduced, and now ten years (and dozens of bug reports about this same issue) later, this _still_ isn't fixed.

Does Darin even still "work" for Mozilla. Can't we just yank this silliness out? Indeed in these early bugs about the issue (this one, and bug 101832) every single contributor (apart from Darin Fisher) was against this asinine behavior.

If any banks still want to block us over this, well, then JUST LET THEM.

> and online banks will block browsers if they behave this way... maybe not right away, and maybe not all at once... but they will and they have.

Hey, if they won't block us all at once, that's great news! That gives us, the customers, ample opportunity to send a clear signal by moving to a bank that is still customer friendly.

Question: it's now almost 10 years since this misfeature has been existing. However, thanks to "Work offline" it doesn't even meet its goal. Indeed, if you tick "Work offline" just before going back, then you still see your bank account. And, have many banks blocked us over this during these past ten years?
> Does Darin even still "work" for Mozilla.

No.  He works on Chrome.
I also don't understand the point of your anti-Darin rant, since what he changed in this bug was to make no-cache NOT affect back/forward in many cases when it previously used to affect it.
(In reply to comment 72)
> No.  He works on Chrome.

Great!

(In reply to comment 73)
> I also don't understand the point of your anti-Darin rant,

Well, it looks as if he was one of the main defenders of the current behavior (cf early comments on this bug). Worse: judging from bug 101832, it seems that he introduced the problem in the first place!

> since what he changed in this bug was to make no-cache NOT affect back/forward in many cases when it previously used to affect it.

Good, if that was indeed the case, but why not in _all_ cases? It's not as if this was a change touching hundreds of source files and subtle branches?
Still broken in 7.0.1 . We're nearing the 10th anniversary on this...
12th anniversary has passed.

Can we create a fund for fixing this bug? I'll put in $50 to start. I don't even care if I have to apply a patch and compile my own Firefox and maintain a fork from now on because upstream won't accept it. I just want it fixed. Or do the equivalent for Chrome and I'll use that. Seriously, if Chrome didn't have the same bug I would just delete Firefox immediately and be done with it.

To earn the reward, supply a patch to implement all of RFC2616 section 13.13 including the SHOULDs, using the MAY in section 14.9.2 paragraph 2 that allows the history mechanism to ignore the no-store directive, without breaking compliance with any other caching rules. The motivating sentence from the RFC is: "History mechanisms and caches are different." There are lots of kludgy workarounds, all of which ignore the root cause of this bug: failure to distinguish history from cache.

The test will be simple: back and forward buttons must show the page exactly as it was shown before, without generating any network traffic. Even if the server is https and sends every known back-button-thwarting header. If that works, and I can't find any other behavior that has changed as a side effect, you win.
Actually, I had posted a patch for this to bug #261312 (one of the many aliases for this issue...)

https://bug261312.bugzilla.mozilla.org/attachment.cgi?id=542004

not sure whether it is still good (i.e. applies to current versions) though, probably not...
So this is why back/forward is so slow.

It seems so many sites nowadays are being anti-cache, e.g. the bbc for whatever odd reason is forcing reloading of static content.

Please make it so back/forward loads 100% of page from cache.  Either via pref or default.
If worried about security, a suggestion is to make it only work if the cache is flushed upon browser close/restart so it cannot load cache from previous sessions.
it's almost 14 yrs now. is there any configuration to load from cache?