Bug 1267474 - Cache-Control: immutable
Status: RESOLVED FIXED
Whiteboard: [necko-active], DevAdvocacy-Facebook
Keywords: dev-doc-complete, DevAdvocacy
Product: Core
Classification: Components
Component: Networking: Cache
Version: Trunk
Platform: Unspecified / Unspecified
Importance: -- normal
Target Milestone: mozilla49
Assigned To: Patrick McManus [:mcmanus]
Mentors:
Depends on:
Blocks: 563297
Reported: 2016-04-25 16:48 PDT by Ben Maurer
Modified: 2017-03-17 21:01 PDT (History)
CC: 33 users


Attachments
- only revalidate strongly framed responses 1/3 (14.74 KB, patch) - 2016-05-05 14:21 PDT, Patrick McManus [:mcmanus] - honzab.moz: review+
- cache-control: immutable 2/3 (9.82 KB, patch) - 2016-05-05 14:22 PDT, Patrick McManus [:mcmanus] - no flags
- cache-control: immutable tests 3/3 (7.82 KB, patch) - 2016-05-05 14:22 PDT, Patrick McManus [:mcmanus] - honzab.moz: review+
- with-immutable.png (156.43 KB, image/png) - 2016-05-06 13:03 PDT, Patrick McManus [:mcmanus] - no flags
- without-immutable.png (162.53 KB, image/png) - 2016-05-06 13:03 PDT, Patrick McManus [:mcmanus] - no flags
- cache-control: immutable 2/3 (10.83 KB, patch) - 2016-05-06 13:27 PDT, Patrick McManus [:mcmanus] - honzab.moz: review+

Description Ben Maurer 2016-04-25 16:48:04 PDT
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.86 Safari/537.36

Steps to reproduce:

1) Go to facebook.com
2) Open developer tools and look at network requests
3) Reload the page (e.g., by pressing Command+R)


Actual results:

You'll see a bunch of requests for JS/CSS/images with If-Modified-Since headers that result in a 304. This wastes bandwidth and causes additional latency.


Expected results:

Revalidation requests should not be sent for subresources. Chrome is adopting this behavior because revalidation has been found to be a performance problem. See the following for reference on the Chrome change:

https://bugs.chromium.org/p/chromium/issues/detail?id=505048
https://crbug.com/600636
https://docs.google.com/document/d/1vwx8WiUASKyC2I-j2smNhaJaQQhcWREh7PC3HiIAQCo/edit#
Comment 1 Ben Kelly [not reviewing due to deadline][:bkelly] 2016-04-25 16:53:06 PDT
How does this change impact sites that don't use hashed resources with long cache times?

For example, github.io pages that get a 4H cache time by default?  My naive take is that you can somewhat easily get incoherent cache loads on these sites and ctrl-r revalidating subresources fixes the problem for users.  I don't think most users even know what a hard refresh is.
Comment 2 Ben Maurer 2016-04-25 17:11:36 PDT
(In reply to Ben Kelly [:bkelly] from comment #1)
> How does this change impact sites that don't use hashed resources with long
> cache times?
> 
> For example, github.io pages that get a 4H cache time by default?  My naive
> take is that you can somewhat easily get incoherent cache loads on these
> sites and ctrl-r revalidating subresources fixes the problem for users.  I
> don't think most users even know what a hard refresh is.



One design alternative we had proposed was that subresources would not be revalidated if they had an expiration time more than 1 week in the future. The idea is that it might be reasonable for something like github.io to have a 4H expiration time, and that a refresh might fix an incoherent cache. However, if the resource had a cache time of over 1 week, a site author would realize that not all of their users would figure out to use refresh, and would be forced to switch URLs.

This would make for the best of both worlds -- diligent sites that correctly use the cache save bandwidth (they will all have long expiration times). Sites like github get a "classic" refresh button.

That said, the Chrome team preferred the simplicity of the solution they proposed.
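As a sketch, the 1-week threshold heuristic Ben describes above could look like this (the function name is hypothetical; one week is the cutoff proposed in the comment):

```javascript
// Hypothetical sketch of the 1-week threshold heuristic: on a user-initiated
// refresh, skip revalidation only for fresh subresources whose freshness
// lifetime extends more than one week into the future.
const ONE_WEEK_SECONDS = 7 * 24 * 60 * 60;

function shouldRevalidateOnRefresh(isMainResource, secondsUntilExpiry) {
  if (isMainResource) return true;           // the document itself is always revalidated
  if (secondsUntilExpiry <= 0) return true;  // stale entries are revalidated as usual
  // Fresh subresource: revalidate only if it expires within a week.
  return secondsUntilExpiry <= ONE_WEEK_SECONDS;
}
```

Under this sketch, a github.io asset with a 4-hour lifetime still revalidates on refresh, while a hashed CDN asset with a 1-year lifetime is served from cache.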
Comment 3 David Baron :dbaron: ⌚️UTC-7 2016-04-25 17:22:21 PDT
Seems like there are two different use cases for reload that are conflicting a bit here:
 (1) get newer versions of resources whose cache validation times are still in the future, but that have (possibly) changed
 (2) fix problems resulting from flaky network connections causing some resources to fail to load

It seems like this proposed change will break many cases of (1), which is actually a thing users reload pages for.  (e.g., users reload to get "new data", and the new data may well be in a subresource rather than the toplevel page -- would be interesting to see if we could figure out how often this is the case, although figuring that out seems hard).

Another idea that would help with the (2) cases (although not the additional site load they cause) would be that if a reload results in a network error, but there was a valid cached copy, continue using the cached copy rather than having the behavior that results from a failed load.  (It might be tricky to define exactly what a network error is for this case, though.  It's also tricky for things that are partly loaded and then result in a network error, although I suppose we could continue to use the rest of cached copy as long as the incrementally-received data matches the beginning of the cached copy.)
Comment 4 Ben Kelly [not reviewing due to deadline][:bkelly] 2016-04-25 17:26:39 PDT
I would also note that large sophisticated sites can set their own policy with service workers.  We expose Request.cache starting in FF48.  The service worker could see that Request.cache is 'no-cache' and override the resulting fetch() with a 'default' cache policy.

I'm not saying service workers are a replacement for a sane default policy, but the primitives are at least exposed now.
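To illustrate the primitive Ben Kelly describes, here is a hedged sketch of such a service worker, not Facebook's actual code; `isLongLivedAsset` and its URL pattern are invented for the example:

```javascript
// Sketch of the escape hatch described above. When a reload marks a request
// as 'no-cache', re-issue the fetch with the default cache policy so fresh
// cached copies are used without a conditional revalidation round trip.
function isLongLivedAsset(url) {
  // Hypothetical convention: hashed static resources never change in place.
  return /\/static\/.*\.[0-9a-f]{8}\.(js|css|png)$/.test(url);
}

// Guarded so the helper above can also be exercised outside a worker context.
if (typeof self !== 'undefined' && typeof self.addEventListener === 'function') {
  self.addEventListener('fetch', event => {
    const req = event.request;
    if (req.cache === 'no-cache' && isLongLivedAsset(req.url)) {
      // Re-fetch with the default policy instead of revalidating.
      event.respondWith(fetch(req.url, { cache: 'default' }));
    }
    // Otherwise fall through to the browser's normal handling.
  });
}
```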
Comment 5 Ben Maurer 2016-04-25 17:36:36 PDT
(In reply to David Baron [:dbaron] ⌚️UTC-7 from comment #3)
> Seems like there are two different use cases for reload that are conflicting
> a bit here:
>  (1) get newer versions of resources whose cache validation times are still
> in the future, but that have (possibly) changed
>  (2) fix problems resulting from flaky network connections causing some
> resources to fail to load
> 
> It seems like this proposed change will break many cases of (1), which is
> actually a thing users reload pages for.  (e.g., users reload to get "new
> data", and the new data may well be in a subresource rather than the
> toplevel page -- would be interesting to see if we could figure out how
> often this is the case, although figuring that out seems hard).

My guess is that most sites have learned not to do this because many of their users haven't learned to use reload to update the data. As Chrome deploys this new policy, my guess is that sites will change their ways.

If we really want to support this use case, I think the threshold solution of 1 week preserves the existing use case.


> Another idea that would help with the (2) cases (although not the additional
> site load they cause) would be that if a reload results in a network error,
> but there was a valid cached copy, continue using the cached copy rather
> than having the behavior that results from a failed load.  (It might be
> tricky to define exactly what a network error is for this case, though. 
> It's also tricky for things that are partly loaded and then result in a
> network error, although I suppose we could continue to use the rest of
> cached copy as long as the incrementally-received data matches the beginning
> of the cached copy.)

The issue here is that if the user is on a poor-quality connection, issuing the revalidation requests is still likely to make the overall experience bad. How long would we allow revalidations to be outstanding: 10 seconds? That's already a terrible experience.

(In reply to Ben Kelly [:bkelly] from comment #4)
> I would also note that large sophisticated sites can set their own policy
> with service workers.  We expose Request.cache starting in FF48.  The
> service worker could see that Request.cache is 'no-cache' and override the
> resulting fetch() with a 'default' cache policy.
> 
> I'm not saying service workers are a replacement for a sane default policy,
> but the primitives are at least exposed now.

Fair enough. But long caching is such a common pattern that I think it's really important we get the defaults right. Facebook's data suggests that 10-20% of all requests to our CDN are spurious revalidations. We suspect that other sites are seeing similar rates. Even a 5% rate of revalidations is enough to be a big perf difference.
Comment 6 Ben Kelly [not reviewing due to deadline][:bkelly] 2016-04-25 17:55:27 PDT
Could we make this a meta tag?  Then the document could easily opt in to no revalidation for sub resources without hurting smaller, less sophisticated sites.
Comment 7 Karl Dubost :karlcow 2016-04-25 18:29:01 PDT
That's an interesting data source. 

(In reply to Ben Maurer from comment #5)
> Facebook's data suggests that
> 10-20% of all requests to our cdn are spurious revalidations. We suspect
> that other sites are seeing similar rates. Even a 5% rate of revalidations
> is enough to be a big perf difference

Have you been able to measure the perf difference?
How did you measure the differences?
I'm asking because that would be useful if Firefox were to change something.
Also as a comparison point for evaluating other sites in different country markets.

About:

> My guess is that most sites have learned not to do this because many of their users haven't learned to use reload to update the data. As chrome deploys this new policy my guess is that sites will change their ways.

This part makes me a bit nervous. From experience, sites don't necessarily update. There are tons of legacy issues, and sites do not behave the same depending on countries/servers. I agree that big active properties will probably update (with a lot of swearing at first from Web devs ;) ) but I'm less sure about the rest of the Web. So we really have to think about this carefully.

More data would be super useful.
Comment 8 Karl Dubost :karlcow 2016-04-25 18:38:30 PDT
What Ben proposed in an email:


> What we are proposing is that:
> 
> If (resource is expired) {
>   Send revalidation
> } else if (request is main resource && resource is not expired && user initiated refresh) {
>   Send revalidation
> } else {
>   Use response from cache
> }
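That pseudocode can be rendered as a small executable function (a sketch; the second branch reads as `else if`, and the names are illustrative):

```javascript
// Executable rendering of the quoted policy. Returns 'revalidate' when a
// conditional request should be sent, 'cache' when the cached response is used.
function cachePolicy({ isExpired, isMainResource, userInitiatedRefresh }) {
  if (isExpired) {
    return 'revalidate';
  } else if (isMainResource && userInitiatedRefresh) {
    // Fresh, but the user explicitly refreshed the main document.
    return 'revalidate';
  } else {
    // Fresh subresources are served from cache, even on refresh.
    return 'cache';
  }
}
```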
Comment 9 Ben Maurer 2016-04-25 18:39:58 PDT
(In reply to Ben Kelly [:bkelly] from comment #6)
> Could we make this a meta tag?  Then the document could easily opt in to no
> revalidation for sub resources without hurting smaller, less sophisticated
> sites.

We'd be happy with this solution. That said, I think the best outcome here is if we can respect long cache times by default so that people see the impact of this across the web instantly.

My personal preference for a cautious solution is "only skip revalidation for resources with a lifetime > 1 week".

(In reply to Karl Dubost :karlcow from comment #7)
> That's an interesting data source. 
> 
> (In reply to Ben Maurer from comment #5)
> > Facebook's data suggests that
> > 10-20% of all requests to our cdn are spurious revalidations. We suspect
> > that other sites are seeing similar rates. Even a 5% rate of revalidations
> > is enough to be a big perf difference
> 
> Have you been able to measure the perf difference?
> How did you measure the differences?
> I'm asking because that would be useful if Firefox was to change something. 
> Also as a comparison point to evaluate on other sites in different country
> markets. 

~2 years ago there was a Chrome bug that caused it to send even more revalidations than it does today. Due to the nature of the bug we were able to work around it on our side. We did an A/B test. There was a substantial increase in overall perf on Facebook for people who were in the workaround. In fact, the site was so much faster that we saw a number of our usage/engagement numbers increase. We suspect that if browsers fix the remaining sources of revalidations, we'll see another such bump.

The 10-20% number comes from our CDN logs -- about 10-20% (depending on browser) of requests to our CDN are revalidations. This means that if we address this issue, those requests will be served by the user's cache and not cause latency.
Comment 10 Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) 2016-04-25 19:14:55 PDT
If we do this, we'd certainly need to restrict this to things whose freshness lifetimes were explicitly set in the response headers, not things which have heuristic freshness lifetimes.

In other words, we'd need to change how necko handles some of the nsIRequest flags (or add new ones), not merely stop propagating to subresources the flags we propagate right now....
Comment 11 Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) 2016-04-25 19:17:35 PDT
Ben, I have a specific question about the Facebook setup.  How often are the resources with 1 year freshness actually updated?  Is the answer just "never, when we need to change one of those we give it a different URL"?
Comment 12 Ben Maurer 2016-04-25 19:19:22 PDT
(In reply to Boris Zbarsky [:bz] from comment #11)
> Ben, I have a specific question about the Facebook setup.  How often are the
> resources with 1 year freshness actually updated?  Is the answer just
> "never, when we need to change one of those we give it a different URL"?

We never change them. If we want a new resource we change the URL.
Comment 13 Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) 2016-04-25 19:27:58 PDT
Also, do you know what Chrome is doing for URLs with heuristic freshness?  I can't tell from their issue tracker...
Comment 14 Patrick McManus [:mcmanus] 2016-04-26 07:05:47 PDT
This is an interesting thread, thanks.

Up to 20% 304s is a good data point. For resources that never change, that's a significant waste.

However, I'm against adding more crazy heuristics to this space or changing historical behavior that other origins legitimately rely on.

So what we need is an explicit opt-in that we can implement and then standardize to get the same behavior across clients. Ben suggests a meta tag - that's a good start, but this is a property of the subresource, not of the HTML. This is what cache-control extensions were made for: https://tools.ietf.org/html/rfc7234#section-5.2.3

Cache-Control: max-age=31536000, never-revalidate

I would implement that so that Firefox would never send a conditional request for that cached resource. We would always use the cached copy as long as it was fresh and we weren't concerned about the integrity of the cached copy[*].

What do you think of that proposal? I'd be happy to work with Chrome on it.

[*] The web in general is full of broken crap, including broken HTTP responses that have things like bogus content-lengths. We need to be lax in rendering and caching this stuff or the web doesn't work (we've tried enforcing the rules to our own detriment in the past). But of course this looks just like a truncated load from a real network error - so in those cases we would probably ignore never-revalidate when reload was pressed.
Comment 15 Ben Kelly [not reviewing due to deadline][:bkelly] 2016-04-26 07:25:15 PDT
The never-revalidate cache-control seems like a great idea to me.  Let the server be explicit about its intent for the resource.
Comment 16 Ben Maurer 2016-04-26 07:58:01 PDT
(In reply to Boris Zbarsky [:bz] from comment #13)
> Also, do you know what Chrome is doing for URLs with heuristic freshness?  I
> can't tell from their issue tracker...

I asked the Chrome team to provide some input here.

(In reply to Patrick McManus [:mcmanus] from comment #14)
> This is an interesting thread, thanks.
> 
> up to 20% 304's is a good data point. For resources that never change,
> that's a significant waste.
> 
> however, I'm against adding more crazy heuristics to this space or changing
> historical behavior that other origins legitimately rely on.
> 
> So what we need is an explicit opt in that we can implement and then
> standardize and get the same behavior across clients. Ben suggests Meta -
> that's a good start, but this is a property of the subresource not of the
> HTML. This is what cache-control extensions were made for
> https://tools.ietf.org/html/rfc7234#section-5.2.3
> 
> Cache-Control: max-age=31536000, never-revalidate

I think that's a great idea!

https://www.ietf.org/mail-archive/web/httpbisa/current/msg25463.html

The response was somewhat lukewarm on this thread. I think people are understandably unexcited about an extension that essentially boils down to "no I *really* mean my expiration date". One question I got asked privately was "what is the use case for max-age=1 year without a never-revalidate header". I don't think there is one. Changing a resource with a 1 year max age is a terrible user experience, so bad that I truly doubt that people do it. 

Would you guys like to help champion this solution? We're happy with the header, a meta tag, or a change in browser behavior.
Comment 17 Ben Kelly [not reviewing due to deadline][:bkelly] 2016-04-26 08:22:49 PDT
(In reply to Ben Maurer from comment #16)
> The response was somewhat lukewarm on this thread. I think people are
> understandably unexcited about an extension that essentially boils down to
> "no I *really* mean my expiration date". One question I got asked privately
> was "what is the use case for max-age=1 year without a never-revalidate
> header". I don't think there is one. Changing a resource with a 1 year max
> age is a terrible user experience, so bad that I truly doubt that people do
> it. 

Yea, I can understand your frustration.

These questions seem weird to me, though.  There are plenty of servers out there that change resources under the same URL within the max-age.  It's trivial to reproduce this on github.io and probably other sites.

To me what the "never-revalidate" means is really "I'm using a unique URL for this resource and I will give you a new URL if it ever changes".  That's somewhat orthogonal to max-age setting.

Saying that there is no use case for a long max-age without unique URLs does not mean servers won't do that.  I don't know why we would want a magic inference from the browser here when the server could just be explicit.

Martin, do you have any opinions here?
Comment 18 Patrick McManus [:mcmanus] 2016-04-26 08:31:52 PDT
Thanks for unearthing the httpbis thread. A few thoughts based on that:

* I would change the name from never-revalidate to immutable. Mark makes the point that never-revalidate, as proposed, is really about browser behavior and therefore could live somewhere else. But we actually don't want this to be about prescribing behavior; we're trying to communicate a property of the resource - which is that it never changes. It's most applicable to caching policy, so a Cache-Control extension is a decent place to put it.

* As far as enthusiasm - nothing like running code and experience to focus the mind. That's why there is an extension mechanism. Just have to be open to modifying the approach based on evidence as it comes in.

* HTTP in general doesn't try to insist you use the most recent variant of a resource. There is nothing conceptually wrong with a resource that changes every day with a max-age of a year, and we shouldn't try to undermine that. But a new property like immutable that indicates this is not the case is certainly new information that will influence cache management policy.

Cache-Control: max-age=36500000, immutable

I think I'd work on that. We'd want to get an agreement with Chrome (if possible) and then get some deployment data, potentially leading to an I-D if we're all happy with it.

This would obviously also significantly improve page load time.
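A minimal sketch of how a client cache might honor the proposed directive on reload (the parsing is deliberately simplified and the function names are hypothetical, not Necko's):

```javascript
// Simplified Cache-Control parsing: only the max-age and immutable tokens
// relevant to this proposal are handled; real parsing is more involved.
function parseCacheControl(header) {
  const directives = {};
  for (const part of header.split(',')) {
    const [name, value] = part.trim().split('=');
    directives[name.toLowerCase()] = value !== undefined ? value : true;
  }
  return directives;
}

// On a user-initiated reload, skip conditional revalidation only for entries
// that are both still fresh and explicitly marked immutable.
function shouldRevalidateOnReload(cacheControlHeader, ageSeconds) {
  const cc = parseCacheControl(cacheControlHeader);
  const maxAge = parseInt(cc['max-age'], 10);
  const fresh = Number.isFinite(maxAge) && ageSeconds < maxAge;
  return !(fresh && cc.immutable === true);
}
```

Note that a stale immutable entry still revalidates: the directive only strengthens the freshness promise, it does not extend it.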
Comment 19 Ben Maurer 2016-04-26 08:40:08 PDT
We'd be happy to put in an immutable attribute. Sounds like the next step here is to get a patch that implements this so that we could test that out.

I'll talk with the chrome team about this idea. They're committed to solving this problem in some fashion.
Comment 20 Ben Kelly [not reviewing due to deadline][:bkelly] 2016-04-26 08:43:09 PDT
Ah, sounds like Patrick has things in hand here.  Sorry for roping you in Martin.
Comment 21 Jason Duell [:jduell] (needinfo me) 2016-04-26 10:57:27 PDT
I'm late to the thread here, but I agree with Patrick about adding "immutable".  (I'm torn--it would be really nice to have a solution that just rolls out w/o lots of server changes, like the "don't revalidate if expires in more than a week" heuristic.  But my gut says we'd break stuff, and in hard to debug ways.)
Comment 22 Martin Thomson [:mt:] 2016-04-26 17:57:56 PDT
No worries Ben.  I'm across the issue.  I've offered Ben M my assistance in getting specs written when that happens.

I think that we should look at shipping this, but we might want to ignore the immutability thing under certain circumstances.  We could require stronger integrity mechanisms rather than etag as a signal that the origin is serious: e.g., https://tools.ietf.org/html/draft-thomson-http-mice
Comment 23 :Ehsan Akhgari (busy) 2016-04-26 23:20:07 PDT
(In reply to Jason Duell [:jduell] (needinfo? me) from comment #21)
> I'm late to the thread here, but I agree with Patrick about adding
> "immutable".  (I'm torn--it would be really nice to have a solution that
> just rolls out w/o lots of server changes, like the "don't revalidate if
> expires in more than a week" heuristic.  But my gut says we'd break stuff,
> and in hard to debug ways.)

How about having an additional Cache-Control extension like "immutable-subresources" or some such which would mean that subresources for the response are all immutable?  And we can make it work with a <meta http-equiv> tag served in the main page.  With that, if the server wants to have only some subresources immutable it can be configured to use the immutable extension for those responses, otherwise you could just serve a <meta http-equiv="Cache-Control" content="max-age=36500000, immutable-subresources"> in their HTML markup, and that would satisfy the use case in comment 0 without any server configuration changes.
Comment 24 Martin Thomson [:mt:] 2016-04-26 23:30:12 PDT
(In reply to :Ehsan Akhgari (busy, don't ask for review please) from comment #23)
> How about having an additional Cache-Control extension like
> "immutable-subresources" [...]

I don't have data to back this (Ben M might), but my intuition is that there are almost no pages that have all immutable subresources, so this wouldn't help much.  I don't think that the server changes are a barrier worth optimizing around.  In this case, we want the server folks to have to spend at least some effort to get into this state; after all, it would be bad if this were applied accidentally.

If people are concerned about breaking things permanently, we can build out a system that revalidates with a server occasionally until the server shows that they really mean it.  I believe that the same approach was proposed for HSTS: cap the time that the pin is respected for until the site demonstrates availability over time.  That will delay benefits, but allow for accidents to be rolled back relatively quickly in case of bustage.

That's a lot more work though, and I tend to think that Patrick's suggestion is probably enough.

BTW, the need for paranoia is justified, but it's not all that bad.  In case of bustage, servers can put the resource up at a new URI.  Changing a bunch of links might suck, but it's still technically possible to avoid a permanent break.
Comment 25 Ben Maurer 2016-04-26 23:38:49 PDT
Saying all resources are immutable is a bit extreme. But I'd be OK with a meta tag that says please respect our expiration date for subresources regardless of whether the user is refreshing.
Comment 26 :Ehsan Akhgari (busy) 2016-04-26 23:48:59 PDT
Err, sorry, to clarify: what I meant was for "immutable-subresources" to mean that subresources with a max-age matching what's in the Cache-Control header (perhaps with validity >= the max-age in the Cache-Control header) are immutable. Subresources served with a lower max-age (or perhaps with no max-age at all) will be considered mutable.

IOW, if the main page is served with |Cache-Control: max-age=123456, immutable-subresources|, then responses for the subresources served with |Cache-Control: max-age=123456| and |Cache-Control: max-age=123457| etc will be considered immutable.

Does that make sense?
Comment 27 Martin Thomson [:mt:] 2016-04-26 23:54:09 PDT
Seems a bit indirect for my liking.  A more explicit method would be to tie this to SRI, something we should do without the directive, I think.  But I'm not sure that folks want to add SRI headers to things like that.
Comment 28 Ben Maurer 2016-04-27 00:05:22 PDT
SRI is not great here because we often don't know the hash of something (e.g. a photo on our CDN that we might dynamically resize).
Comment 29 Patrick McManus [:mcmanus] 2016-04-27 07:00:46 PDT
> of bustage, servers can put the resource up at a new URI.

Ack - or, more simply, Ctrl+Shift+reload still reloads things unconditionally. As Ben points out, that's what you have to do to fix corruption right now anyhow when the etag is right but the message body isn't (e.g. truncation), which is the common failure mode anyhow. I wouldn't expect an accidental immutable to be more common than that.

I'm pretty confident that immutable is a property of the resource, not of the page referencing it. This creates a consistent story for things other than HTML as well as third-party combinations.
Comment 30 Patrick McManus [:mcmanus] 2016-05-04 19:42:31 PDT
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5d5ef5660ace
Comment 31 Patrick McManus [:mcmanus] 2016-05-05 14:19:03 PDT
https://treeherder.mozilla.org/#/jobs?repo=try&revision=55767065fede
Comment 32 Patrick McManus [:mcmanus] 2016-05-05 14:21:54 PDT
Created attachment 8749412 [details] [diff] [review]
only revalidate strongly framed responses 1/3
Comment 33 Patrick McManus [:mcmanus] 2016-05-05 14:22:00 PDT
Created attachment 8749413 [details] [diff] [review]
cache-control: immutable 2/3
Comment 34 Patrick McManus [:mcmanus] 2016-05-05 14:22:06 PDT
Created attachment 8749415 [details] [diff] [review]
cache-control: immutable  tests 3/3
Comment 35 Patrick McManus [:mcmanus] 2016-05-06 13:03:24 PDT
Created attachment 8749792 [details]
with-immutable.png
Comment 36 Patrick McManus [:mcmanus] 2016-05-06 13:03:44 PDT
Created attachment 8749793 [details]
without-immutable.png
Comment 37 Patrick McManus [:mcmanus] 2016-05-06 13:04:32 PDT
I'm separately going to work on some documentation here.
Comment 38 Patrick McManus [:mcmanus] 2016-05-06 13:27:43 PDT
Created attachment 8749805 [details] [diff] [review]
cache-control: immutable 2/3
Comment 39 Patrick McManus [:mcmanus] 2016-05-06 13:28:44 PDT
Comment on attachment 8749412 [details] [diff] [review]
only revalidate strongly framed responses 1/3

Review of attachment 8749412 [details] [diff] [review]:
-----------------------------------------------------------------

Honza, the notion is that you revalidate for one of two reasons:

1] to get a newer representation of the resource
or 2] to fix corruption

This part of the patch is aimed at the fact that revalidation via If-Modified-Since or If-None-Match doesn't really help with #2. It just ensures you have the right etag/Date cached locally. Given that the common corruption is truncation, you just get back a 304 confirming your metadata is fine without confirming the data at all.

So if we don't have a good indication that corruption has not occurred (good indications include Content-Length headers that match, chunked encodings that terminate correctly, or end-stream bits found on h2/spdy... the patch calls this strongly framed), then skip revalidation and just transfer the whole thing.

This is a valid thing to do independent of immutable, but we only want to honor immutable in the case of strongly framed https responses, which is why it's in this patch series.
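The strong-framing test described here can be sketched as follows (field names are illustrative, not Necko's actual internals):

```javascript
// Sketch of the "strongly framed" idea: a cached response is trusted (and
// eligible for immutable handling) only if its framing proves the body
// arrived complete, so silent truncation would have been detected.
function isStronglyFramed(resp) {
  // h2/spdy: the stream carried an explicit end-stream marker.
  if (resp.h2EndStream) return true;
  // Chunked transfer coding: the terminating zero-length chunk was seen.
  if (resp.chunkedTerminated) return true;
  // Content-Length present and the received byte count matches it.
  if (resp.contentLength !== undefined &&
      resp.receivedLength === resp.contentLength) return true;
  // Otherwise (e.g. delimited by connection close) truncation is undetectable.
  return false;
}
```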
Comment 40 Honza Bambas (:mayhemer) 2016-05-08 04:36:36 PDT
(In reply to Patrick McManus [:mcmanus] from comment #39)
> Comment on attachment 8749412 [details] [diff] [review]
> only revalidate strongly framed responses 1/3
> 
> Review of attachment 8749412 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> honza, the notion is that you revalidate for one of two reasons
> 
> 1] to get a newer representation of the resource
> or 2] to fix corruption
> 
> This part of the patch is aimed at the fact that revalidation via
> If-Modified-Since or If-None-Match doesn't really help with #2.. it just
> ensures you have the right etag/Date cached locally.. given that the common
> corruption is truncation, you just get back a 304 confirming your meta data
> is fine without confirming the data at all.
> 
> So if we don't have a good indication that corruption has not occurred (good
> indications include Content-Length headers that match, or chunked encodings
> that terminate correctly, or end-stream bits found on h2/spdy... the patch
> calls this strongly framed) then skip revalidation and just transfer the
> whole thing.
> 
> This is a valid thing to do independent of immutable, but we only want to
> honor immutable in the case of strongly framed https responses which is why
> its in this patch series

I am not sure we need this.  When a _cached_ response is incomplete (content-length doesn't match the size of the cached data) it's always also resumable.  Then we always do a range request, for which immutable must be ignored - I haven't taken a look at whether these patches already contain that bypass.

If we end up with an incomplete _network_ response that is not resumable (the set of conditions like must have an etag, content-length and accept byte ranges + some others have not all been met) we doom the cached entry immediately in OnStopRequest.  Hence such a response can never be reused from the cache.

In other words, I think resumability of a response implies strong framing already.  But I need to double check on that first.
Comment 41 Patrick McManus [:mcmanus] 2016-05-08 23:41:32 PDT
It sounds like you're saying we wouldn't cache at all a response delimited via Connection: close. I don't *think* that's true - but I guess we should double check. And if we don't, shouldn't we? (I'm just talking about normal transaction hit reuse here - not resuming.)
Comment 42 Honza Bambas (:mayhemer) 2016-05-11 02:34:39 PDT
Comment on attachment 8749412 [details] [diff] [review]
only revalidate strongly framed responses 1/3

Review of attachment 8749412 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +3562,5 @@
> +        nsXPIDLCString buf;
> +        rv = entry->GetMetaDataElement("strongly-framed", getter_Copies(buf));
> +        // describe this in terms of explicitly weakly framed so as to be backwards
> +        // compatible with old cache contents which don't have strongly-framed markers
> +        bool weaklyFramed = NS_SUCCEEDED(rv) && buf.EqualsLiteral("0");

should this be the other way around?  I'd tend to treat content w/o this meta as weakly framed.  It should be NS_FAILED(rv) || buf.Equals('0')
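The two readings can be sketched outside of Mozilla's types. This is an illustration only: std::optional stands in for the NS_SUCCEEDED(rv)/buf pair, and the function names are invented for the sketch, not taken from nsHttpChannel.

```cpp
#include <optional>
#include <string>

// Illustration only; not the actual nsHttpChannel code.
// Patch as written: only an explicit "0" marker counts as weakly framed,
// so old cache entries with no marker are treated as strongly framed.
bool WeaklyFramedAsWritten(const std::optional<std::string>& marker) {
    return marker.has_value() && *marker == "0";
}

// Review suggestion: a missing marker is also treated as weakly framed
// (NS_FAILED(rv) || buf.Equals('0') in the original types).
bool WeaklyFramedSuggested(const std::optional<std::string>& marker) {
    return !marker.has_value() || *marker == "0";
}
```

The difference only shows up for entries written before the marker existed: the first reading trusts them, the second revalidates them.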

::: netwerk/protocol/http/nsHttpChannel.h
@@ +569,5 @@
> +
> +    // if the http transaction was performed (i.e. not cached) and
> +    // the result in OnStopRequest was known to be correctly delimited
> +    // by chunking, content-length, or h2 end-stream framing
> +    bool mStronglyFramed;

could this be added to the bit fields we keep here?

::: netwerk/protocol/http/nsHttpTransaction.h
@@ +282,5 @@
>      // after the main thread modifies it. To deal with raciness, only unsetting
>      // bitfields should be allowed: 'lost races' will thus err on the
>      // conservative side, e.g. by going ahead with a 2nd DNS refresh.
>      Atomic<uint32_t>                mCapsToClear;
> +    Atomic<bool>                    mResponseIsComplete;

I think you are OK with just ReleaseAcquire ordering here.
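In std::atomic terms, the suggested ordering looks like the sketch below. This uses the standard library rather than Gecko's Atomic<> template, and the function names are invented for illustration.

```cpp
#include <atomic>

// Sketch of the suggestion using std::atomic instead of Gecko's Atomic<>.
// A release store on the writer side pairs with an acquire load on the
// reader side, which is sufficient for a flag one thread publishes and
// another consumes; full sequential consistency is not needed here.
std::atomic<bool> responseIsComplete{false};

void Writer() {
    // ... finish parsing the response body ...
    responseIsComplete.store(true, std::memory_order_release);
}

bool Reader() {
    return responseIsComplete.load(std::memory_order_acquire);
}
```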
Comment 43 User image Honza Bambas (:mayhemer) 2016-05-11 03:01:22 PDT
Comment on attachment 8749415 [details] [diff] [review]
cache-control: immutable  tests 3/3

Review of attachment 8749415 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/xpcshell/moz-http2/moz-http2.js
@@ +715,5 @@
>      Serializer.prototype._transform = newTransform;
>    }
>  
> +  // for use with test_immutable.js
> +  else if (u.pathname === "/immutable-test1") {

some more descriptive names would be good here.
Comment 45 User image Carsten Book [:Tomcat] 2016-05-11 07:11:43 PDT
backed out for bustage like https://treeherder.mozilla.org/logviewer.html#?job_id=27658277&repo=mozilla-inbound
Comment 47 User image Patrick McManus [:mcmanus] 2016-05-11 07:22:00 PDT
https://treeherder.mozilla.org/#/jobs?repo=try&revision=53e68f10f34f
Comment 48 User image Patrick McManus [:mcmanus] 2016-05-11 07:28:34 PDT
The build failure was because I reordered a member variable per Honza's review comment; my smoketest compiler didn't flag that.
Comment 51 User image Kenji Baheux 2016-05-11 18:37:10 PDT
Sorry for the late reply.

On the Chrome side, we still want to continue exploring changing the reload behavior.

Namely, scope down to two user-facing reload behaviors (keeping the current options as-is for developers).

 - reload: validate the main resource, proceed with a regular load for everything else (i.e. respect max-age, don't proactively validate seemingly fine resources).

 - hard reload. Side note: it's currently missing on Chrome for Android; our UX team wants to avoid obscure gestures (e.g. long-press the reload button, pull harder to hard reload...) and instead just make it work.


We are thinking about metrics to determine how much breakage/frustration this leads to and have several options to consider in response. I'll be happy to share what we learn.

Our interest in Immutable is currently low but might change depending on what we all learn.
The main reason for the lack of interest is that the impact on the loading user experience would be inherently smaller.
Comment 52 User image Patrick McManus [:mcmanus] 2016-05-12 01:34:07 PDT
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bcd1d21d4e04
Comment 53 User image Patrick McManus [:mcmanus] 2016-05-12 01:34:58 PDT
The test failed in comment 50 because it depends on Node, and xpcshell.ini did not have the skip-if !hasNode annotation.
Comment 56 User image Florian Scholz [:fscholz] (MDN) 2016-06-22 10:33:45 PDT
Docs:
https://developer.mozilla.org/en-US/Firefox/Releases/49#HTTP
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control
(We are revamping HTTP docs, expect more soon)
Comment 57 User image ziyunfei 2016-07-03 21:51:42 PDT
Can someone please tell me why immutable cache entries don't do validation?

https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpChannel.cpp#3740
Comment 58 User image Karl Dubost :karlcow 2016-07-03 21:58:14 PDT
ziyunfei, see Comment #18
Comment 59 User image ziyunfei 2016-07-03 22:50:12 PDT
(In reply to Karl Dubost :karlcow from comment #58)
> ziyunfei, see Comment #18

What I want to ask is why an immutable cache entry doesn't send an If-Modified-Since/If-None-Match header after its max-age expires.

e.g. Cache-Control: max-age=3600, immutable

Within the hour, no HTTP requests are sent. After the hour, the response should be a 304, not a 200, I think.
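For reference, a hypothetical tokenizer (not Mozilla's actual parser) showing how a value like the one above splits into directives; `HasDirective` is an invented helper name:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: split a Cache-Control value on commas and check
// whether a directive is present, matching either "name" or "name=...".
bool HasDirective(const std::string& value, const std::string& name) {
    std::istringstream ss(value);
    std::string token;
    while (std::getline(ss, token, ',')) {
        // Trim leading spaces left over from ", " separators.
        size_t start = token.find_first_not_of(' ');
        if (start == std::string::npos) continue;
        token = token.substr(start);
        if (token == name || token.rfind(name + "=", 0) == 0)
            return true;
    }
    return false;
}
```

With `"max-age=3600, immutable"`, both `max-age` and `immutable` are found; with `"max-age=3600"` alone, `immutable` is not.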
Comment 60 User image ziyunfei 2016-07-04 21:53:01 PDT
(In reply to ziyunfei from comment #59)
> (In reply to Karl Dubost :karlcow from comment #58)
> > ziyunfei, see Comment #18
> 
> What I want to ask is why an immutable cache entry doesn't send an
> If-Modified-Since/If-None-Match header after its max-age expires.
> 
> e.g. Cache-Control: max-age=3600, immutable
> 
> Within the hour, no HTTP requests are sent. After the hour, the response
> should be a 304, not a 200, I think.

mcmanus, is this the intended behavior? I think immutable resources do need revalidation when they expire.
Comment 61 User image Patrick McManus [:mcmanus] 2016-07-05 04:59:41 PDT
(In reply to ziyunfei from comment #60)

> mcmanus, is this the intended behavior? I think immutable resources do need
> revalidation when they expire.

By being marked immutable the server has declared that the resources do not change, therefore we don't use conditional revalidation (what does if-modified-since mean for something that is never modified?). It's assumed the reload is being done for coherency/corruption purposes, so a full reload is done; a 304 tends to be destructive to coherency checks (i.e. the etag/last-modified is correct but the stored response body is not).

do you have a use case in mind? I would expect the standard immutable use would have a very large max-age.
Comment 62 User image ziyunfei 2016-07-05 21:19:17 PDT
(In reply to Patrick McManus [:mcmanus] from comment #61)
> By being marked immutable the server has declared that the resources do not change,
By being marked immutable the server has declared that the resources do not change *before the end of the declared max-age period*.

> I would expect the standard immutable use would have a very large max-age.
Well, I would also expect that: 1 year or even 10 years (longer-lived than the user's computer), because those are the use cases Cache-Control: immutable was invented for. If that's the case, then 200 or 304, whatever, it never expires anyway. Unfortunately a long max-age is not mandatory; people may need a short max-age like 1 day or 1 week. Maybe the W3C's site would use this kind of short max-age (https://github.com/w3c/tr-design/issues/101), and maybe other sites would too.

So I don't think "immutable" should "never-revalidate".
Comment 63 User image Patrick McManus [:mcmanus] 2016-07-06 04:05:26 PDT
(In reply to ziyunfei from comment #62)
> (In reply to Patrick McManus [:mcmanus] from comment #61)
> > By being marked immutable the server has declared that the resources do not change,
> By being marked immutable the server has declared that the resources do not
> change *before the end of the declared max-age period*.

no - that is not correct. They declare they do not change - they are immutable.
Comment 64 User image Martin Thomson [:mt:] 2016-07-10 19:33:52 PDT
I think that the confusion here is that the max-age is still being used.  But the point is that max-age is only being used to decide when to evict something from the cache.  Even at that point, there is still a promise that the content is the same; it's just no longer useful to retain that information.
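Putting the semantics discussed in the last few comments into a sketch (an assumed decision table distilled from this thread, not Firefox's actual revalidation code):

```cpp
// Assumed behavior on a user-triggered reload, per the discussion above.
enum class Action { UseCache, Revalidate, FullReload };

Action OnReload(bool fresh, bool immutable) {
    if (!fresh)
        return Action::FullReload;        // expired immutable entry: refetched in full, not 304-validated
    return immutable ? Action::UseCache   // fresh + immutable: no request at all
                     : Action::Revalidate; // fresh but mutable: conditional request on reload
}
```

So max-age still governs freshness and eviction; immutable only suppresses the conditional request for entries that are still fresh.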
Comment 65 User image Ideas-Creator 2016-11-10 06:10:37 PST
Hi, I have two questions about this topic.

1) Is it possible to use "immutable" while still honoring the resource's expiration time?
There will be a large max-age but, once it expires, can we discard the sub-resource?

2) How many sites are using this token besides Facebook?
I notice that some Facebook-affiliated sites (like newsroom.fb.com or insights.fb.com) aren't using this token for static objects like their icon.
Comment 66 User image ziyunfei 2016-12-19 19:22:14 PST
FYI: "Chrome no longer forces validation on reload, so immutable shouldn't be necessary." from https://bugs.chromium.org/p/chromium/issues/detail?id=611416#c12
Comment 67 User image Karl Dubost :karlcow 2016-12-19 20:54:45 PST
Comment 9 is interesting too on the Chromium bug.

> (With Firefox and `Cache-Control: immutable` on the 
> BBC Web site) we're now seeing ~31% fewer 304's served 
> from our CDN edge on a subset of our static assets 
> (which have immutable enabled).
https://bugs.chromium.org/p/chromium/issues/detail?id=611416#c9

That's a huge win.
Comment 68 User image randy 2017-01-04 14:00:29 PST
The fact that (during a reload) Firefox requests many resources it was previously told to cache is a problem specific to Firefox.  The problem should be solved in Firefox, and shouldn't require the whole Internet to adopt a new cache-control directive when the headers sites already return are sufficient.

Chrome solved this problem in the browser, without requiring every website to change.  When reloading, only resources that were cached according to a heuristic (e.g. the last-modified time) should be conditionally fetched.  Resources which a server *explicitly* indicated are still valid should be loaded from the cache.

Even if websites adopted this workaround, there are still broken scenarios.  For resources which periodically change, "immutable" wouldn't apply, and Firefox would still ignore the expiration indicated by the site during a reload.
Comment 69 User image ziyunfei 2017-01-27 17:10:39 PST
FYI: Safari just implemented Cache-Control: immutable, and also wants to implement Chrome's new reload behavior: https://bugs.webkit.org/show_bug.cgi?id=167497
Comment 70 User image m.kurz 2017-03-12 15:27:35 PDT
It seems this isn't working for me with Firefox 52 (64-bit) on Ubuntu 16.04. We added "immutable" to the Cache-Control header of static assets (which use unique URLs) in our app, but when hitting F5 it doesn't hit the cache; it hits the server instead, which returns a 304.

1) I open Firefox devtools

2) I open a page from our web application on localhost (http://127.0.0.1)

3) I hit CTRL-F5 to completely re-initialize the cache. That's the response headers for a PNG file on the page we get:
--------------------
Accept-Ranges: bytes
Cache-Control: public, max-age=31536000, immutable
Content-Length: 1061
Content-Type: image/png
Date: Sun, 12 Mar 2017 20:31:08 GMT
Etag: "8fe2e7043abc31ed723e8f2f5addcf79"
Last-Modified: Sat, 02 Jul 2016 00:09:04 GMT
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-Permitted-Cross-Domain-Policies: master-only
X-XSS-Protection: 1; mode=block
content-security-policy: default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline' https://fonts.googleapis.com; font-src 'self' https://fonts.gstatic.com
--------------------

4) A couple of seconds later I hit ONLY the F5 key. We now hit the server, which returns a 304, instead of just hitting the cache.

Why is that?
Could it be that
* Some response headers each other?
* or I have to wait some time before the mechanism takes effect and a F5 request hits the cache?
* or is it because I was testing against http://127.0.0.1?
* or could it be that we overwrite window.onunload?
* or the immutable flag isn't enabled for all in FF52 on Linux?
* or devtools lie to me?

Any help would be appreciated, thank you!
Comment 71 User image m.kurz 2017-03-12 15:30:27 PDT
Sorry I meant
> * Some response headers INTERFERE each other?
and 
> * or could it be BECAUSE we DO overwrite window.onunload?
Comment 72 User image J. Ryan Stinnett [:jryans] (use ni?) (on PTO until Mar. 28) 2017-03-12 16:42:10 PDT
(In reply to m.kurz from comment #70)
> It seems this isn't working for me using Firefox 52 (64 bit) on Ubuntu
> 16.04. We added "immutable" to the Cache-Control header to static assets
> (using unique urls) in our app but when hitting F5 it doesn't hit the cache
> but the server instead which returns a 304.
> 
> 1) I open Firefox devtools

Do you have the cache disabled in DevTools?  You'll probably want to ensure that DevTools -> gear icon -> Advanced -> Disable HTTP cache is unchecked for this test.
Comment 73 User image m.kurz 2017-03-12 16:54:02 PDT
I found out why it didn't work:

It looks like the immutable flag is only taken into account over https. http -> asks the server, https -> hits the cache.

Why does it (only) work via https but not via http?
Is that on purpose?
Comment 74 User image Ben Kelly [not reviewing due to deadline][:bkelly] 2017-03-12 17:00:34 PDT
(In reply to m.kurz from comment #73)
> Why does it (only) work via https but not via http?
> Is that on purpose?

See the end of comment 39.  Sounds like this is intentional because https reduces the chance that corrupt data is being stored in the cache.
Comment 75 User image m.kurz 2017-03-12 17:08:57 PDT
Got it, thank you very much for your help!
Comment 76 User image ziyunfei 2017-03-17 21:01:05 PDT
Add a reload policy where only expired subresources are revalidated:
https://bugs.webkit.org/show_bug.cgi?id=169756
