never expire cache data for open document

NEW
Unassigned

Status

--
major
16 years ago
7 years ago

People

(Reporter: jmd, Unassigned)

Tracking

(Depends on: 1 bug)

Trunk
x86
Linux

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

16 years ago
Leave a document open that was returned from a POST request.

Go about your daily surfing.

Try to view source on that page later. Receive 'expired' warning.

As long as the page is open, Mozilla should still be able to do any normal
operation with it. Print, edit, view source/info.
(Reporter)

Comment 1

16 years ago
Forgot to mention. When you rePOST, the new cache entry isn't associated with
the original page. Meaning, view source, get warning, click ok, close source,
view source again right away, get another warning.

Comment 2

16 years ago
Do you have a testcase?

Comment 3

16 years ago
Changing component.  I'm not sure if this should be Session History or Docshell,
since it only refers to a currently displayed page.  It's also very likely a dup.
Assignee: gordon → radha
Component: Networking: Cache → History: Session
QA Contact: tever → kasumi

Comment 4

Actually, this is in fact cache.  We just don't have any way to "pin" entries in
the cache.  Nor should we, in my opinion.  The original source is NOT actually
needed once the page has been rendered, except in the extreme edge case of
someone using "view source".  People who do that should realize that the source
won't always be available anymore.
Assignee: radha → darin
Component: History: Session → Networking: Cache
QA Contact: kasumi → core.networking.cache

Comment 5

(In reply to comment #4)
> Actually, this is in fact cache.  We just don't have any way to "pin" entries in
> the cache. 

please see
http://lxr.mozilla.org/seamonkey/source/netwerk/base/public/nsICachingChannel.idl#58

Comment 6

That doesn't work in general (specifically, for pages without POST data), last I
checked.

Note that SetCacheToken is not implemented, so even if it worked it wouldn't be
usable as things stand.

Comment 7

14 years ago
(In reply to comment #4)
> Actually, this is in fact cache.  We just don't have any way to "pin" entries in
> the cache.  Nor should we, in my opinion.  The original source is NOT actually
> needed once the page has been rendered, except in the extreme edge case of
> someone using "view source".  People who do that should realize that the source
> won't always be available anymore.

Why?  This is implementable, and as far as I can see, IS implemented in Internet
Explorer.  Please explain to me the DISadvantage of implementing this.  Note
that I'm not talking about coding difficulty of implementation - assuming this
could be implemented perfectly and instantly, what problem(s) would it cause, as
it would indeed fix the 'view source' problem (and view source is actually used
by a decent number of webpage designers who DO want the guarantee that the
source used to render any open webpage will always be locally cached, including me).

CCing self.

Comment 8

> what problem(s) would it cause

It'd cause the problem that when we need space in the cache for new content it's
not there because we can't evict the old page (note that if there is space
already, then there are no issues with view-source either, since the cache entry
will just stick about).

In other words, it makes the cache less efficient.
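
To make the trade-off above concrete, here is a minimal sketch of an LRU cache whose entries can be pinned. It is not Necko's cache and does not reflect its real interfaces; `PinnableLRUCache`, `pin`, and `unpin` are hypothetical names invented for illustration. It only shows the effect being described: once every slot is pinned, new content has nowhere to go.

```python
from collections import OrderedDict


class PinnableLRUCache:
    """Toy LRU cache; pinned keys are skipped during eviction (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # key -> value, least recently used first
        self.pinned = set()

    def pin(self, key):
        if key in self.entries:
            self.pinned.add(key)

    def unpin(self, key):
        self.pinned.discard(key)

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        """Store value; return False if no room could be made for it."""
        if key in self.entries:
            self.entries[key] = value
            self.entries.move_to_end(key)
            return True
        while len(self.entries) >= self.capacity:
            # Oldest entry that is not pinned, if there is one.
            victim = next((k for k in self.entries if k not in self.pinned), None)
            if victim is None:
                return False  # everything is pinned: new content cannot be cached
            del self.entries[victim]
        self.entries[key] = value
        return True


cache = PinnableLRUCache(capacity=2)
cache.put("page-a", "<html>a</html>")
cache.put("page-b", "<html>b</html>")
cache.pin("page-a")
cache.pin("page-b")
stored = cache.put("page-c", "<html>c</html>")  # both slots pinned, nothing to evict
```

If there is already free space, `put` succeeds without evicting anything, which matches the remark that an unexpired entry "will just stick about" in that case.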

Comment 9

Since bug 262350 now covers the (necko part of the) backend, I'll morph this
into a docshell bug.

I do think that fixing this would be desirable, btw... (and bz, you sounded like
you did too on irc recently?)
Assignee: darin → adamlock
Component: Networking: Cache → Embedding: Docshell
Depends on: 262350
QA Contact: core.networking.cache → adamlock

Comment 10

I think there are cases it could be useful, yes, as a general API thing.  I'd
have to see some data on how it does in the real world to see whether we want
session history to hold on to cache tokens...

Comment 11

maybe it should be docshell which holds onto the token, rather than shistory
(i.e. only store for the document being displayed currently)

Comment 12

That doesn't make much sense to me, frankly...  I use back and then view source
quite as much as I use view-source.
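
The two options being weighed here (only the docshell holds a token for the current document, or session history holds tokens for earlier documents as well) can be sketched roughly as follows. This is an illustration under assumptions, not Gecko code: `TokenHolder`, `keep_history`, and the use of a bare `object()` as a token are all hypothetical.

```python
class TokenHolder:
    """Hypothetical stand-in for whatever object keeps cache tokens alive."""

    def __init__(self, keep_history):
        self.keep_history = keep_history
        self.current = None   # (uri, token) for the displayed document
        self.history = []     # (uri, token) pairs for earlier documents

    def navigate(self, uri, token):
        if self.current is not None and self.keep_history:
            self.history.append(self.current)
        # Without keep_history the old token is simply dropped here,
        # so its cache entry becomes evictable again.
        self.current = (uri, token)

    def held_uris(self):
        held = [uri for uri, _ in self.history]
        if self.current is not None:
            held.append(self.current[0])
        return held


docshell_only = TokenHolder(keep_history=False)
with_history = TokenHolder(keep_history=True)
for uri in ["http://example.test/a", "http://example.test/b"]:
    docshell_only.navigate(uri, object())
    with_history.navigate(uri, object())
```

After both navigations, `docshell_only` protects only the second page, while `with_history` still protects the first, which is what "back and then view source" would need.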

Comment 13

14 years ago
Strange - I actually don't.  However, could you just require that all pages in
the current back history, as well as all opened pages, be cached too?  If that
sounds like a lot to cache, bear in mind that these are relatively small HTML
source files and a developer (that may need these pages) is likely to have an
enormous (50MB for me) amount of cache to play with.

Comment 14

(In reply to comment #12)
> That doesn't make much sense to me, frankly...  I use back and then view source
> quite as much as I use view-source.

people don't want to persist no-store content in shistory...
(Reporter)

Comment 15

14 years ago
As the original reporter here, I thought I'd weigh in on the recent debate. I am
personally only concerned with guaranteeing lossless access to all data currently
displayed. This is what I filed this bug for, and this is all it ever described,
as far as I know.

A document listed in the back/forward buttons is no different than any other
history entry in my mind. Just a quicker way to navigate to them. As I surf with
disk cache disabled, pinning every page in session history to memory seems
awfully wasteful. If however the DOM representation of one of these pages is
still in cache (rare for me; presumably common under default settings), then the
source should be as well. The fundamental problem here is that we're mutilating
the original source simply for a performance optimization, but then throwing
away the original document. If anything, we should be doing the opposite, as
having the original source rather than DOM is what really reduces network
access, the original reason for a browser's cache. (And with CPU performance
outpacing Internet connection speed even more today, I say it's even more
important). But whenever the DOM representation (sorry, I forget the technical
term for it) is available, the original document needs to be available as well,
for save and view source. Always. Always.

As far as the issue of too much memory being used here, that should only be the
case when a user has a large number of windows/tabs open. Using a lot of memory
(even exceeding the defined limits) to keep all of that cached is absolutely
expected behavior, in a web browser or any other application, and assuming
closing some of those docshells frees up the memory, I don't see what the
problem is. This shouldn't even be considered cache... it's the working dataset!

Comment 16

> If however the DOM representation of one of these pages is still in cache

We don't cache DOMs, ever.

> having the original source rather than DOM is what really reduces network
> access

Absolutely.  What does that have to do with this bug?  We only throw away the
source when we actually remove the entry from the cache.

Note that we do not cache regular HTTP documents anywhere but the disk cache,
last I checked, so turning off disk cache simply disables caching of HTTP
documents altogether.

> But whenever the DOM representation is available

This can be available in many cases when the user can't even try to view source
(eg XMLHttpRequest and company).

> This shouldn't even be considered cache... it's the working dataset!

That's the point. The original source is NOT part of the working dataset.  Once
it's been parsed, it's no longer needed to display that particular page, unless
the page is reloaded.  If the page is reloaded, we reget the source from cache.
If you've disabled cache, that won't help you, of course.

Note that biesi's proposed changes won't help you either, since you've just
disabled cache altogether.  So it won't matter whether we've expired the data
there or not, since the data is simply not there.

Comment 17

>Note that we do not cache regular HTTP documents anywhere but the disk cache,
>last I checked

do we use the mem cache if we have no disk cache? (e.g. TestProtocols) (hm...
this is starting to get somewhat unrelated to this bug)
(Reporter)

Comment 18

14 years ago
> We don't cache DOMs, ever.

I thought this was the cause of all of the view-source issues, with misplaced
quotes and so on.

> Note that we do not cache regular HTTP documents anywhere but the disk cache,

Arg. So what the heck have I been caching with my 20M of memory cache for the
last few years? Not only am I forced to disable disk cache because of the
performance hit on my slow PC, but after having so many problems with Mozilla's
cache years ago, I completely lost faith in it, and needed to ensure that during
web development, I could at least resort to a quit and restart to get the
current version of a web page.

> The original source is NOT part of the working dataset.

Then we fundamentally disagree. If my URL bar says:
"https://bugzilla.mozilla.org/show_bug.cgi?id=182712", then I feel the document
located at "https://bugzilla.mozilla.org/show_bug.cgi?id=182712" is part of my
working dataset. I should have full access to it, for as long as it's open. This
means being able to save it.

Comment 19

> then I feel the document located at
> "https://bugzilla.mozilla.org/show_bug.cgi?id=182712"

You say "the document", but part of the problem (and the only way to ever see
this issue without disabling cache altogether) is that there may be multiple
different documents located at that URI.

Comment 20

14 years ago
Um, right Boris, but that's the whole point, isn't it?  When you 'view source'
on an open page, what you almost certainly want (and if not you can manually
re-request the document) is the document that was served to your client, at the
time you visited that URI, with the parameters you passed to the server, etc. 
The only real way to ensure that is to grab a cache of the page at the time of
receipt, and use that repeatedly.  Hence all currently open pages *NEED* to be
guaranteed to be in the cache - re-requesting the page isn't good enough.
Assignee: adamlock → nobody
QA Contact: adamlock → docshell
Depends on: 136633
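
The exchange above about several different documents sharing one URI can be illustrated with a small sketch. It assumes a store that keeps, for each open view, the exact bytes received when that view was loaded, alongside an ordinary lookup keyed by URI. `SnapshotStore` and its methods are hypothetical names for illustration and do not correspond to any Mozilla API.

```python
class SnapshotStore:
    """Hypothetical store keeping the bytes each open view was rendered from."""

    def __init__(self):
        self._by_uri = {}    # uri -> latest response body (URI-keyed lookup)
        self._by_view = {}   # view id -> body captured at time of receipt
        self._next_view = 0

    def receive(self, uri, body):
        """Record a response and return an id for the view that displays it."""
        self._by_uri[uri] = body  # a later response for the same URI overwrites this
        view_id = self._next_view
        self._next_view += 1
        self._by_view[view_id] = body
        return view_id

    def source_by_uri(self, uri):
        return self._by_uri.get(uri)

    def source_for_view(self, view_id):
        return self._by_view.get(view_id)

    def close(self, view_id):
        # Closing the view releases its snapshot.
        self._by_view.pop(view_id, None)


store = SnapshotStore()
uri = "http://example.test/search"
first_view = store.receive(uri, "<html>results for cats</html>")
second_view = store.receive(uri, "<html>results for dogs</html>")
```

Looking up by URI returns whatever arrived last, whereas looking up by view returns what that view was actually served, and the snapshot goes away once the view is closed.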