Closed Bug 569142 Opened 14 years ago Closed 13 years ago

Going "back" to a page not stored in session history should show an info (error) page rather than trying to reload the page

Categories

(Firefox :: General, defect)

x86
macOS
defect
Not set
major

Tracking

()

RESOLVED WONTFIX
Tracking Status
blocking2.0 --- -

People

(Reporter: me, Unassigned)

References

(Depends on 2 open bugs, Blocks 1 open bug)

Details

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_8; en-us) AppleWebKit/531.22.7 (KHTML, like Gecko) Version/4.0.5 Safari/531.22.7
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3

When the user is expecting to see what they saw before, Firefox shows something new.  Firefox also fails to provide any indication to the user that a reload occurred (unless the page is slow to load, or the reload fails), much less any indication of why it would perform a reload.

This is particularly bad when it invokes bug 160144, which brings up a modal dialog when reloading a POST request.  This bug can be invoked when going back to pages that are correctly not stored in session history, or by pages that are not stored due to another bug (e.g. bug 567365).

Reproducible: Always

Steps to Reproduce:
1. Note the time stamp at http://lll.lu/~aknaff/bug-reports/mozillaNoStore/no-store.cgi
2. Follow the "continue" link
3. Hit back, and see a different timestamp
Actual Results:  
The page is reloaded, showing updated content rather than the expected historical content

Expected Results:  
Either the historical content or an infopage explaining why the historical content is not shown
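
For anyone who wants to reproduce this locally, here is a minimal sketch of a stand-in for the test page. The source of no-store.cgi is not shown in this report, so everything here is an assumption: that the page sends `Cache-Control: no-store`, and the hypothetical names `render_page`, `response_headers`, `NoStoreHandler` and `serve`.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_page(now: float) -> bytes:
    """Return an HTML page whose visible content depends on the request time."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    return (
        "<html><body><p>Generated at %s UTC</p>"
        '<a href="/continue">continue</a></body></html>' % stamp
    ).encode("utf-8")


def response_headers(body: bytes) -> dict:
    """Headers for the page; the no-store directive is the assumed trigger."""
    return {
        "Content-Type": "text/html; charset=utf-8",
        "Content-Length": str(len(body)),
        "Cache-Control": "no-store",
    }


class NoStoreHandler(BaseHTTPRequestHandler):
    """Serves the timestamped page for every GET, whatever the path."""

    def do_GET(self):
        body = render_page(time.time())
        self.send_response(200)
        for name, value in response_headers(body).items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Run the test server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), NoStoreHandler).serve_forever()
```

Calling `serve()` and then following the steps above against http://127.0.0.1:8000/ should make any reload on Back visible as a changed timestamp.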

This is a spinoff from bug 160144, comments on which reflect user frustration from a combination of misbehaviors, of which this is one (see bug 160144 comment 161 and 162).  The demonstration URL is from bug 160144 comment 148 (more cases are available through http://lll.lu/~aknaff/bug-reports/mozillaNoStore ).  This bug can be invoked by bug 567365 (at least), and may be responsible for bug 496232 (back-to-redirect = failure to go back).
Status: UNCONFIRMED → NEW
blocking2.0: --- → ?
Ever confirmed: true
Summary: Going "back" to a page not stored in session history automatically reloads the page → Going "back" to a page not stored in session history should show an info (error) page rather than trying to reload the page
Neil posted a PoC patch in bug 160144 comment 171.
That patch only defers the postdata warning until you hit reload. This bug seems to be about caching pages in session history more aggressively.
Not after I changed the summary to disambiguate ;)
I intended this bug to be about reloading inappropriately, not about failing to cache in session history; failing to cache should be filed separately, e.g. bug 56735.  The patch is definitely relevant.
This infopage should inform the user that going one further back in the history should take them to the form that was POSTed, which is the thing to do if they want to resubmit with different form entries.

re bug 322984, this infopage should also have a button to GET the URL rather than resend the POST.

re bug 493438, this infopage should say why the page wasn't cached.

If this infopage includes all of the options provided by the UI that resolves bug 160144, bug 322984, and any other bugs about the reload-POST response, should the re-POST button on this page bypass that UI?

I thought I'd seen a bug about revealing and editing POST data prior to sending (or re-sending) a POST request, but now I can't find it; when this is available it could be embedded in this infopage as well.

This should block bug 160144, possibly also bug 28586.
Guys, the basic problem is that FF does not store what the page looked like on my monitor when it was displayed. Instead, it tries to re-render it from the source (HTML). Since re-rendering means re-loading it, and the source was not allowed to be "cached", there's no cached copy anywhere around, so it's downloaded again. This is the root of all the trouble.

I guess the only bugs that I have found pinpointing this basic deficiency are mine (bug 200208) & Dan B's (bug 288462). All the other people are contemplating what should take place in case the page cannot be displayed. I personally don't think this case should be allowed at all; if the history entry is there, the content should also be there - so there should be no barriers to redisplaying it. And then there would be nothing to contemplate - the case when "the page cannot be displayed" should be completely eliminated.
Re-rendering from html does not imply re-retrieving the html from the server.

There should be a difference between the "history" (back/forward only) and the "cache" (also used for new requests), and there should be more kept in history than cached (maybe the cache should be a subset of the history), but it is not possible to guarantee that nothing will ever be dumped from the history.  It should be possible to correct some cases where pages are dumped incorrectly (e.g. bug 567365), but there will always be the possibility of dumping historic pages when the buffer is full.  The browser's behavior when navigating the history to a dumped page must be well defined, and currently it is problematic.
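
To make the distinction concrete, here is a rough sketch of a session history that keeps its own bounded store of pages as viewed, separate from any HTTP cache, and reports explicitly when an entry has been dumped instead of refetching it. This is not how Firefox is implemented; `SessionHistory`, `HistoryResult` and `max_snapshots` are hypothetical names made up for illustration.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Optional


@dataclass
class HistoryResult:
    url: str
    content: Optional[str]  # None when the stored page was dumped
    evicted: bool


class SessionHistory:
    """Back/forward list with a bounded store of pages as they were viewed."""

    def __init__(self, max_snapshots: int):
        self.max_snapshots = max_snapshots
        self.urls = []                   # visited URLs, oldest first
        self.index = -1                  # current position in self.urls
        self.snapshots = OrderedDict()   # position -> content as viewed

    def visit(self, url: str, content: str) -> None:
        # A new navigation discards any forward entries.
        del self.urls[self.index + 1:]
        for pos in [p for p in self.snapshots if p > self.index]:
            del self.snapshots[pos]
        self.urls.append(url)
        self.index += 1
        self.snapshots[self.index] = content
        # When the buffer is full, dump the oldest stored page.
        while len(self.snapshots) > self.max_snapshots:
            self.snapshots.popitem(last=False)

    def back(self) -> HistoryResult:
        if self.index <= 0:
            raise IndexError("no earlier history entry")
        self.index -= 1
        content = self.snapshots.get(self.index)
        # The caller learns that the page was dumped; nothing is refetched here.
        return HistoryResult(self.urls[self.index], content, content is None)
```

The point of returning `evicted` rather than quietly reloading is that the caller (the UI) has to decide what to show, which is the "well defined" behavior asked for above.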

Personally, I'm not sold on the idea that everything should always be stored.  I don't see cases where data should self-destruct, but I don't work with sensitive information (medical records, trade secrets, troop positions...).  I agree that the headers to prevent history retention are overused, but I think it would be a mistake for the protocol to not have that feature (or, almost equivalently, for a browser to ignore that feature).

Adding dependencies:
Depends on 200208:  Clearly distinguish between history & cache
Blocks 288462: [meta] Don't reload without explicit request
Blocks: 288462
Depends on: 200208
blocking2.0: ? → -
this bug is still present in current release 4.0.1; try adding a review on https://addons.mozilla.org/ for example
Wontfix. In the vast, vast majority of cases, grabbing the URL again from the server is the next best thing to having it cached.

In the specific case of session history entries that require resubmitting POST data, we are going to use an error page (see recent comments in bug 160144).
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX
For the vast majority of sheeple, just using Internet Explorer is the next best thing to using a proper browser such as Firefox. And with this kind of developer's attitude, we can't even blame them for thinking that.

Why is fixing this almost ten year old bug (for which patches have been submitted!) being opposed so fiercely? Whose ego is being protected here? What's the problem exactly?

Every single argument against fixing it has been refuted:
- banks want this behavior ==> yet nobody could name one single bank who still insists on this
- it's technically hard to fix this ==> yet a patch has been submitted, but it is being ignored
- other browsers are doing it too ==> no, not in exactly the same way, and some actually aren't. And shouldn't the RFCs be the benchmark, rather than other browsers?
- it's to protect the user's privacy ==> yet, there is a workaround which does allow skipping back to the pages protected by this "feature", and any hypothetical pranksters walking around cybercafes clicking back on recently abandoned workstations would know about that workaround...
- the webmaster (or "somebody") _wanted_ the page to not be kept in the history ==> no, in some cases Firefox "second guesses" such a wish from the simultaneous occurrence of 2 unrelated settings that were meant for entirely different purposes and that are often set by 2 different people, neither of whom knows that the coincidence of these 2 unrelated settings would be interpreted in such a strange way.
- ...

====> no user, no webmaster, no web application developer and no bank wants this misfeature which is not being fixed in the official tree despite patches having been posted to the dozens of bug reports about this issue. So why exactly do we have to settle for "the next best thing"?
I didn't say you have to settle for "the next best thing". You're welcome to file bugs on improving our caching behavior so these situations happen less.
(In reply to comment #11)
> I didn't say you have to settle for "the next best thing". You're welcome to
> file bugs on improving our caching behavior so these situations happen less.

Jesse, the truth is, nobody wants a history of URLs visited. You don't want that either - even if you have not reached that conclusion yourself so far. What everybody wants is a history of page content visited. And that's close to impossible to reconstruct from the cache of source files loaded from the webservers. This bug is a simple artefact of the lack of a proper history implementation.

So, please implement a proper page history (instead of the current URL history), and voila, this bug will go away automagically, without any further work.

See my 8-year-old bug, 200208.
No longer blocks: 160144
Depends on: 160144
Re comment 9: Please share your data about "the vast, vast majority of cases".  I don't doubt that a reload is often good enough, but I'm not going to believe that the problem is trivial without some data.

The case that I'm thinking of is where the user is trying to find a page they remember viewing recently, but cannot find when going back & forth in any of their open tabs, because the page they remember was dumped from the history and is different when reloaded.  With an error page, it's at least clear that the remembered page may have been lost that way.  Reloading without informing the user is useful in very few cases of the user wondering where the hell that page was.

I have the impression that the history mechanism is not totally conflated with caching, but automatically reloading when browsing history does violate my expectation that the history contains stale pages as viewed in the past.  I agree somewhat with comment 12 that there should be clear distinction between history and cache, but my impression is that what needs to change is mostly in the user interface, where it needs to be clear to the user when a page is fresh (within caching rules) and when it is historical.

Showing an error page (this bug) is one way to clearly distinguish the history, not the only way, and not necessarily the best way. Automatically reloading a page when the user requests the historical page may not cause confusion frequently, but at best it fails to convey to the user that the browser recognizes any distinction between the past and the present.
> I didn't say you have to settle for "the next best thing".

Well, in that case, what is "Wontfix" supposed to mean?

> You're welcome to file bugs on improving our caching behavior so these situations happen less.

There are already a huge number of bugs about the issue, some over ten years old.

Let me divide them into several categories:

1. The "basic" problem: session history broken for POST requests
----------------------------------------------------------------
Bug 1718 - [webshell]Unable to reproduce page containing search info
Bug 55055 - Go Back broken for form post results
Bug 56346 - Need to cache *all* pages for session history purposes 
Bug 200208 - back/forward buttons complain "has expired from cache" instead of showing pages visited as shown previously 
Bug 294775 - form re-posted when going back with history.back() 

2. Bug reports analyzing the real root cause 
--------------------------------------------
(it's not about POST, but about certain cache-control headers which should have no bearing on session history according to RFC 2616)

Bug 261312 - Cache-control: no-store should not affect session history navigation (when memory cache is large enough) 
Bug 567365 - Cache-Control no-cache on https page disables history
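
As a rough illustration of the policy these two reports argue for (RFC 2616 section 13.13 treats history lists as distinct from caches), here is a sketch in which cache-control directives only constrain normal loads and never Back/Forward. It models the reporters' reading, not what Firefox does; `parse_cache_control`, `may_reuse_without_revalidation` and the `navigation` values "history" and "link" are hypothetical names for illustration.

```python
def parse_cache_control(header: str) -> set:
    """Split a Cache-Control header value into lowercase directive names."""
    directives = set()
    for part in header.split(","):
        name = part.strip().split("=", 1)[0].lower()
        if name:
            directives.add(name)
    return directives


def may_reuse_without_revalidation(navigation: str, cache_control: str) -> bool:
    """Decide whether a stored copy may be shown without contacting the server."""
    directives = parse_cache_control(cache_control)
    if navigation == "history":
        # Back/Forward: show what the user saw, whatever the cache directives say.
        return True
    # Normal load: no-store and no-cache both rule out silent reuse.
    return not ({"no-store", "no-cache"} & directives)
```

With this policy, `may_reuse_without_revalidation("history", "no-cache, no-store")` is true, while the same headers on an ordinary link click still force a trip to the server.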


3. UI change recommendation
---------------------------
(IMHO, while a good idea, the main effort should really be focused on preventing the problem in the first place, rather than spin-doctoring it after it has happened)

Bug 54492 - warning of repost of form data needs rewording 
Bug 160144 - Replace POSTDATA dialog with better UI (post form resubmit warning)
Bug 243534 - Page is expired do you want to re-post message should show info about the original and target page. 
Bug 493438 - POSTDATA warning should say why the page wasn't cached 

4. More technical bug reports (quoting the relevant RFCs... or otherwise analyzing the situation)
--------------------------------------------------------------------------------------------------------

Bug 68705 - Implement new cache design 
Bug 288462 - [Meta] Mozilla sometimes re-retrieves pages instead of re-using already-retrieved copy; violates RFC 2616 (Back, Send Page, etc.); 

5. Hints for implementing a fix 
----------------------------------
Bug 72519 - support for cacheTokens must be implemented 
Bug 75679 - nsIRequest:: new load flags for cache 

6. Other situations than Back where the same bug strikes (some possibly already resolved)
-----------------------------------------------------------------------------------------
There used to be other user actions than "Back" or "Forward" which revealed the bug: View source, Print, Send Page. Numerous bug reports exist about these as well, but they seem (fortunately...) to have been fixed. Could the same fix be applied to Back/Forward? As the "Work offline" workaround shows, the pages do seem to be stored in the session history (as they should), but it is the handler for back/forward which (for some reason) chooses not to use that cached copy.

Bug 6119 - View page source tries to reload page 
Bug 17889 - Changing character set reloads the page from web. 
Bug 40867 - Need means to reuse/reload current page without refetching from server 
Bug 55583 - view-source should show original page source (use cached source) 
Bug 64100 - view-source doesn't work for pages generated via forms with method=POST 
Bug 68412 - W3C CUAP: Keep track of completed HTTP POSTs 
Bug 84106 - [FIX]Not correctly retrieving post data when saving a page or frame generated from a form POST 
Bug 85128 - Printing don't work for PHP output 
Bug 86261 - Browser's "File -> Send Page" should use original cached source, not refetch. 
Bug 86835 - [FIX]Can't view source of dll cgi 
Bug 115174 - Save Page As Web Page, HTML Only will attempt to resubmit cgi to save instead of using cached or displayed content 
Bug 118487 - Save image download the image again 
Bug 120809 - Save as function refetches data or images that are in the cache 
Bug 136633 - View Source gets wrong source when same URL is open in two windows simultaneously with different content
Bug 159387 - [AltSS]alternate stylesheet should be in session history (keep for reload/back/forward) 
Bug 166786 - Can't save / view source POST query (e.g. ebay auction)
Bug 177329 - nsIWebBrowserPersist can't persist things whose cache entries have expired
Bug 182712 - never expire cache data for open document 
Bug 222989 - Link to View Source Window brings up incorrect page (page without POST DATA) 
Bug 214783 - going back/forward with groupmarks makes hits on servers when everything should be in the cache 
Bug 210065 - view frame source -> save : doesn't save what is viewed 
Bug 227714 - Extremely slow saving of .JPG images from cache to disk.
Bug 235142 - Unable to save a page that is probably expired 
Bug 251231 - View source does not show content of currently displayed page if this page was changed and loaded in another instance of browser 
Bug 384222 - When web designer writes <title> with 7bit ascii only before <meta ... charset=us-ascii>, HTTP GET is issued again due to character set change 
Bug 462378 - Video/Audio playback should download the resource to the cache 
Bug 487883 - Image not properly cached, in disk but have 0 bytes (page info or properties) 
Bug 499255 - save from <video> does not use cache 


So, how many more bug reports do we need to file before this issue is taken seriously?
And here seems to be the _real_ reason why this hasn't been fixed yet. It's not technical at all, but purely politics:

Bug 112564:

 Re: <meta HTTP-EQUIV="Pragma"  CONTENT="no-cache">
 Date:  Wed, 28 Nov 2001 17:40:11 -0800
 From: Darin Fisher <darin@netscape.com>
 Newsgroups: netscape.public.mozilla.netlib

 we had a long debate on this and the fact of the matter is that we must honor 'cache-control: no-cache' on back/forward.  this is unfortunate because the spec would allow us to not do this, but unfortunately many important web servers depend on this behavior, and would block mozilla from accessing their web servers if we didn't implement back/forward in this manner.

 ...
 long term i hope that we can provide a better solution, since mozilla once did ignore 'no-cache' on back/forward.


Mr Fisher, is almost ten years "long term" enough for you? If you think so, then please go over your list of banks, and start by striking off those that croaked in the financial crisis of 2008/2009. And if any are still left after this exercise, please contact them, and strike off any who changed their opinion since then. And if there are still some left after this second step, please publicly post the remaining list, so that we can boycott them until they meet the fate that they deserve.

I think by now, Mozilla has enough "pull in the real world" that we _can_  really fight this nonsense.
(In reply to comment #14)
> > I didn't say you have to settle for "the next best thing".
> 
> Well, in that case, what is "Wontfix" supposed to mean?

It means the specific change suggested by this bug's summary -- showing an error page on Back whenever we don't have something stored, regardless of GET-vs-POST and the reason it's not stored -- is not what we want to do.

It doesn't mean that we're rejecting fixes that would reduce the frequency of "page in session history isn't cached/pinned/available" scenarios.
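
To spell out the difference between what this bug's summary asks for and the outcome described in the WONTFIX comment, here is a small sketch of the two decision rules. The function `back_action`, the policy labels "requested" and "resolved", and the returned action names are all made up for illustration; they are not Firefox identifiers.

```python
def back_action(stored: bool, method: str, policy: str) -> str:
    """Return what Back does for a history entry under a given policy.

    policy "requested": the change asked for in this bug's summary.
    policy "resolved":  the behavior described when this bug was closed.
    """
    if stored:
        # Both policies agree: if the page is still held, show it.
        return "show_stored"
    if policy == "requested":
        # Never refetch silently, regardless of GET vs POST.
        return "info_page"
    # Resolved behavior: refetch a GET, but use an error page instead of
    # resubmitting POST data.
    return "info_page" if method.upper() == "POST" else "refetch"
```

The two policies only differ for a GET entry that is no longer stored: "requested" yields `info_page`, "resolved" yields `refetch`.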

In Bugzilla we have to stick to one-issue-per-bug-report and one-change-per-bug-report to maintain our sanity.  Otherwise it becomes impossible to remember what we have fixed and what we haven't fixed.

> > You're welcome to file bugs on improving our caching behavior so these situations happen less.
> 
> There are already a huge number of bugs about the issue, some over ten years
> old.

Thanks for gathering and organizing this list.  You can add bug 441226 to it ;)

Do you think we need some new metabugs or does bug 288462 suffice?

> So, how much more bug reports do we need to file until this issue will be
> taken seriously?

Which issue, exactly? ;)

* A bunch of cache addressing and pinning bugs, which mainly affect "View Source" and "Save" rather than session history.
** You can see the networking team's priorities at https://wiki.mozilla.org/Networking/TeamPriorities.

* No-cache session history policy.
** I'm trying to get bug 567365 fixed. The next step is holding a security review.

* No-store session history policy.
** It may be sensible to reopen bug 261312, as you suggest in comment 15. Or maybe you should go ahead and organize your boycott of banks that use no-store. FWIW, I feel like JavaScript and AJAX have partially mooted this issue.

* Improving UX for the remaining cases where cached pages are missing from session history.
** I'd say we're taking bug 160144 seriously. It's hard -- hard enough that we've had to file separate bugs on some of the suggested solutions!
Voters on this bug may want to vote for Bug 666076 - Inform user when performing automatic reload on back/forward
> Do you think we need some new metabugs or does bug 288462 suffice?

The list on 288462 is quite extensive and good already, thank you. But what we need now is some action! Maybe we could start with some acknowledgment by somebody (who has the power to fix this) that:
- this unasked reloading is indeed a problem
- somebody is moving to fix it. It can't be that hard.

If there is a technical reason why a fix can't be implemented easily, please briefly explain why. This would help to calm people down, and might even encourage some to contribute solutions. There are already patches floating around to fix the bug, but so far they have not been adopted, nor has any relevant explanation been posted of why they couldn't be used. Yes, one of the devs has posted a URL supposedly pointing to an explanation, but it turned out that it was just a page on the raising of pet rabbits (or a similar subject). Not sure what point the dev was trying to make there, but in any case that was not very productive.

If there is a non-technical reason why a fix can't be implemented, please also explain why, and don't hesitate to name and shame the "guilty" parties. Indeed, such politics should have no place in open-source software.

> It may be sensible to reopen bug 261312,

How can I do that? I thought only admins, or maybe the original poster can re-open bugs.

> Or maybe you should go ahead and organize your boycott of banks that use no-store.

Now, what is this supposed to mean?

> FWIW, I feel like JavaScript and AJAX have partially mooted this issue.

With these at least you can use NoScript if they become too annoying :-)

> Improving UX for the remaining cases where cached pages are missing from session history.

There are cases where pages are present in the session history, but still not shown without a reload. This can be demonstrated by switching on "Work Offline" before clicking back. Suddenly pages which were gone are not really gone. To me, it looks like the main bug is in the decision whether to use a cached page or not rather than in the cache eviction policy.
If this bug were fixed, having a button to go back to the nearest GET would almost satisfy bug 493437, and it would avoid confusing the user by skipping unexpectedly.