Closed Bug 563297 Opened 14 years ago Closed 8 years ago

Exclude static content (images, ...) from refreshing

Categories

(Core :: Networking: Cache, enhancement)

Type: enhancement
Priority: Not set
Severity: normal

RESOLVED WONTFIX

People

(Reporter: karl156, Unassigned)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3

I don't really know if it is a bug or just "room for optimization", so I will file this as an enhancement.

This should be a very common situation: a dynamic HTML page with many static images, for example a forum.
When users hit refresh, they usually only want the HTML content of the forum to be refreshed (because they are checking whether there are new answers). But Firefox also tries to refresh all images/css/js every time, resulting in many "304 Not Modified" responses. This causes unnecessary load on the server and slows the refresh down.

My proposal: Don't revalidate content with very long expiration dates (set via an "Expires:" or "Cache-Control: max-age" header) when the user presses the normal Refresh button.

I think this should be covered by the HTTP/1.1 spec:
To mark a response as "never expires," an origin server sends an Expires date approximately one year from the time the response is sent. HTTP/1.1 servers SHOULD NOT send Expires dates more than one year in the future. 
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

You can see this on many sites including big ones like YouTube (when hitting refresh):
GET /yt/cssbin/www-core-vfl162531.css HTTP/1.1
Host: s.ytimg.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3
[...]
If-Modified-Since: Sat, 01 May 2010 00:22:23 GMT
If-None-Match: "3716124314"
Cache-Control: max-age=0

HTTP/1.0 304 Not Modified
Date: Mon, 03 May 2010 09:01:00 GMT
Expires: Sun, 26 Dec 2032 06:20:53 GMT
Last-Modified: Sat, 01 May 2010 00:22:23 GMT
Etag: "3716124314"
[...]

As you can see the year 2032 is far in the future. It is completely useless to revalidate this file because YouTube is doing their own revision control ("vfl162531").
You can see this behavior on many other sites (even addons.mozilla.org).
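
To make "far in the future" concrete, here is a minimal Python sketch that computes the explicit freshness lifetime of the 304 response quoted above. The function name explicit_lifetime and the dict-of-headers shape are assumptions made for illustration; this is not how Firefox's cache computes it.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime
from typing import Optional


def explicit_lifetime(headers: dict) -> Optional[timedelta]:
    """Return the explicit freshness lifetime, or None if the server set none."""
    # Cache-Control: max-age takes precedence over Expires when both are present.
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return timedelta(seconds=int(value))
    # Otherwise fall back to Expires minus Date.
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return expires - date
    return None


# Header values copied from the YouTube response in this report.
response = {
    "Date": "Mon, 03 May 2010 09:01:00 GMT",
    "Expires": "Sun, 26 Dec 2032 06:20:53 GMT",
}
lifetime = explicit_lifetime(response)
print(lifetime.days // 365, "years (approx.)")
```

For the headers above the lifetime comes out at more than two decades, well beyond the one-year ceiling the quoted spec text recommends.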

Risk: I don't think this would break other websites because no one sets an expiration date far in the future without knowing what he is doing.

Reproducible: Always
> When users hit refresh they usually only want the html content of the forum to
> be refreshed

Unfortunately, no.  They want whatever is there to be refreshed.  In many cases the images are in fact an integral part of the content and do need to be refreshed for the new content to make sense...
That statement was for my forum example.

I don't want to propose that all images should be excluded from refresh. Just the ones with very long expiration time. Normal images should be unaffected by this change.
But the whole point of refresh is to ignore expiration times and request revalidation of everything....
This could still be achieved by Ctrl+F5. The expiration times that servers normally send are 1-2 days. No one sets an expiration time of several _years_ without knowing exactly that this image will never change.

The guys who set such long expiration times always give their content a version in the filename. If they want to change their content they just change the id of the file.
Another example is social networking sites, where every uploaded image gets a unique id.

Otherwise it is completely useless to set such long expiration times.
The change could be extremely useful on sites with tons of little images.

I can't see an example where a server admin sets such long expiration times on images that will later change. Did I miss something?
Ctrl+F5 reloads unconditionally; it doesn't just revalidate.

> Otherwise it is completely useless to set such long expiration times.

Not at all.  It works great for loads that are not explicit requests to see whatever the current state is right now.

> Did I miss something?

Yes, all the cases when expiration times are not set at all but are computed from the Date and Last-Modified headers.  That can easily lead to long expiration times in situations where the image will in fact change, if it just hasn't changed in a while.
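
The computed (heuristic) expiration described here can be sketched roughly as follows. This is an illustration only: the 10% fraction is the example value mentioned in RFC 2616, section 13.2.4, the name heuristic_lifetime is made up, the Last-Modified value is hypothetical, and the thread does not say what fraction any particular browser uses.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# Assumption: the 10% example fraction from RFC 2616, section 13.2.4.
HEURISTIC_FRACTION = 0.1


def heuristic_lifetime(date_header: str, last_modified_header: str) -> timedelta:
    """Estimate a freshness lifetime when the server sent no explicit one."""
    date = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    # Guard against a Last-Modified that is later than Date.
    unchanged_for = max(date - last_modified, timedelta(0))
    return unchanged_for * HEURISTIC_FRACTION


# Hypothetical image that had not changed for two years when it was served.
estimate = heuristic_lifetime(
    "Mon, 03 May 2010 09:01:00 GMT",
    "Sat, 03 May 2008 09:01:00 GMT",
)
print(estimate)
```

The longer a resource has gone unchanged, the longer the estimate, even though the server never promised that the resource would stay the same.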
> Not at all.  It works great for loads that are not explicit requests to see
> whatever the current state is right now.
This could be achieved with expiration times of a few months too.

Objects with estimated expiration times really would be a problem. Maybe those should be excluded from this.

If you think this is not the right way to solve this problem, do you see another way for a server admin to indicate that the browser should never revalidate an object?
That really doesn't seem like something a server admin can reliably indicate, honestly.  Seems like in the end it should be under user control.  Users can, in fact, control that.
Last try (I'm really sorry for arguing so much):
You're completely right for normal files. But I did not see such long expiration times on normal files yet.
This only makes sense on versioned objects. And on versioned objects a server admin really can reliably indicate that a specific version of an object will not change anymore (because that is the definition of a version). As you can see below, all major websites do it.
I don't think that this would take control from the user. The user would benefit from shorter reload times. Another example is eBay: the user repeatedly hits refresh because he wants to see the latest bid on an auction. I am quite sure he is not looking for a new eBay site logo. (And even if he were, the website developer would ship it as a new version (= filename) of the logo, which the client does not have in its cache.)

When I think more about it, this can't take control from the user. If it is an evil website, the server admin can easily send 304 responses to the client every time, with the same result (even on outdated content).
In that case only Ctrl+F5 would help, and it would still work with my proposed change.

Summary of what I think should be changed:
IF the user hits reload (mLoadFlags & VALIDATE_ALWAYS)
AND it is not the toplevel document (!(mLoadFlags & LOAD_INITIAL_DOCUMENT_URI))
AND the cached object has an expiration time of more than 5 years
AND the cached object has an explicit expiration time (not an estimated one)
THEN use the cached object without revalidation.
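
The summary above can be written out as a small predicate. This is a Python sketch of the proposed rule, not Gecko code: the bit values of the two flags are placeholders, and proposed_rule_applies, FIVE_YEARS and the parameter names are hypothetical.

```python
from datetime import timedelta

# Placeholder bit values for illustration only; the real Gecko constants may differ.
VALIDATE_ALWAYS = 1 << 0
LOAD_INITIAL_DOCUMENT_URI = 1 << 1

FIVE_YEARS = timedelta(days=5 * 365)


def proposed_rule_applies(load_flags: int, lifetime: timedelta, is_explicit: bool) -> bool:
    """True if the cached object would be used without revalidation on reload."""
    is_reload = bool(load_flags & VALIDATE_ALWAYS)
    is_top_level = bool(load_flags & LOAD_INITIAL_DOCUMENT_URI)
    return (
        is_reload
        and not is_top_level          # the page itself is always revalidated
        and is_explicit               # estimated lifetimes never qualify
        and lifetime > FIVE_YEARS
    )


# A versioned stylesheet with a 22-year explicit lifetime, loaded as a subresource.
print(proposed_rule_applies(VALIDATE_ALWAYS, timedelta(days=22 * 365), True))
```

Anything that fails one of the four conditions falls through to the existing reload behavior, so top-level documents, short-lived objects and heuristically dated objects are unaffected.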

Some examples from the biggest websites worldwide:
Google (#1 on Alexa):
http://t1.gstatic.com/images?q=tbn:_9vdV0weKydEtM:http://techblick.de/wp-content/uploads/2010/01/mozilla.jpg
Facebook (#2 on Alexa):
http://static.ak.fbcdn.net/rsrc.php/z9Q0Q/hash/8yhim1ep.ico
Youtube (#3 on Alexa):
http://s.ytimg.com/yt/cssbin/www-core-vfl162531.css
Yahoo (#4 on Alexa):
http://l.yimg.com/a/i/ww/met/yahoo_logo_de_061509.png
and finally even Mozilla: :)
https://addons.mozilla.org/css/amo2009/style.min.css?64663
Status: UNCONFIRMED → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Depends on: 1267474
OS: Windows XP → All
Hardware: x86 → All