Closed Bug 1267474 Opened 8 years ago Closed 8 years ago

Cache-Control: immutable

Categories

Core :: Networking: Cache
Type: defect
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED FIXED
Target Milestone: mozilla49
Tracking Status:
platform-rel --- ?
firefox49 --- fixed

People

(Reporter: ben.maurer, Assigned: mcmanus)

References

Details

(Keywords: dev-doc-complete, DevAdvocacy, Whiteboard: [necko-active], DevAdvocacy-Facebook [platform-rel-Facebook])

Attachments

(5 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.86 Safari/537.36

Steps to reproduce:

1) Go to facebook.com
2) Open developer tools and look at the network requests
3) Reload the page (e.g. by pressing Command+R)


Actual results:

You'll see a bunch of requests for JS/CSS/images with If-Modified-Since headers that result in a 304. This wastes bandwidth and adds latency.


Expected results:

Revalidation requests should not be sent for subresources. Chrome is adopting this behavior because revalidation has been found to be a performance problem. See the following for reference on the Chrome change:

https://bugs.chromium.org/p/chromium/issues/detail?id=505048
https://crbug.com/600636
https://docs.google.com/document/d/1vwx8WiUASKyC2I-j2smNhaJaQQhcWREh7PC3HiIAQCo/edit#
How does this change impact sites that don't use hashed resources with long cache times?

For example, github.io pages that get a 4H cache time by default?  My naive take is that you can somewhat easily get incoherent cache loads on these sites and ctrl-r revalidating subresources fixes the problem for users.  I don't think most users even know what a hard refresh is.
Status: UNCONFIRMED → NEW
Component: General → Networking: Cache
Ever confirmed: true
(In reply to Ben Kelly [:bkelly] from comment #1)
> How does this change impact sites that don't use hashed resources with long
> cache times?
> 
> For example, github.io pages that get a 4H cache time by default?  My naive
> take is that you can somewhat easily get incoherent cache loads on these
> sites and ctrl-r revalidating subresources fixes the problem for users.  I
> don't think most users even know what a hard refresh is.



One design alternative we had proposed was to skip revalidation for a subresource only if it had an expiration time more than 1 week in the future. The idea is that it might be reasonable for something like github.io to have a 4H expiration time and that a refresh might fix an incoherent cache. However, if the resource had a cache time of over 1 week, a site author would realize that not all of their users would figure out to use refresh and would be forced to switch URLs.

This would make for the best of both worlds -- diligent sites that correctly use the cache save bandwidth (they will all have long expiration times). Sites like github get a "classic" refresh button.

That said, the Chrome team preferred the simplicity of the solution they proposed.
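A rough sketch of that 1-week-threshold alternative, in illustrative JavaScript (the entry/now names are assumptions for the example, not any browser's real cache API):

function skipRevalidationOnReload(entry, now) {
  const ONE_WEEK_MS = 7 * 24 * 60 * 60 * 1000;
  // Trust the cached copy across a user reload only when the site committed
  // to a long freshness lifetime; shorter lifetimes keep the classic reload
  // behavior of revalidating subresources.
  return entry.expirationTime - now > ONE_WEEK_MS;
}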
Seems like there are two different use cases for reload that are conflicting a bit here:
 (1) get newer versions of resources whose cache validation times are still in the future, but that have (possibly) changed
 (2) fix problems resulting from flaky network connections causing some resources to fail to load

It seems like this proposed change will break many cases of (1), which is actually a thing users reload pages for.  (e.g., users reload to get "new data", and the new data may well be in a subresource rather than the toplevel page -- would be interesting to see if we could figure out how often this is the case, although figuring that out seems hard).

Another idea that would help with the (2) cases (although not the additional site load they cause) would be that if a reload results in a network error, but there was a valid cached copy, continue using the cached copy rather than having the behavior that results from a failed load.  (It might be tricky to define exactly what a network error is for this case, though.  It's also tricky for things that are partly loaded and then result in a network error, although I suppose we could continue to use the rest of cached copy as long as the incrementally-received data matches the beginning of the cached copy.)
I would also note that large sophisticated sites can set their own policy with service workers.  We expose Request.cache starting in FF48.  The service worker could see that Request.cache is 'no-cache' and override the resulting fetch() with a 'default' cache policy.

I'm not saying service workers are a replacement for a sane default policy, but the primitives are at least exposed now.
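A minimal sketch of that idea in JavaScript, assuming the site already registers a service worker; isHashedAsset() is a hypothetical helper, not an existing API:

self.addEventListener('fetch', event => {
  const req = event.request;
  // On a normal reload the browser marks subresource requests 'no-cache'
  // (revalidate). For long-lived hashed assets, re-issue the request with the
  // 'default' cache mode so a fresh cached copy is used without a 304 round trip.
  if (req.cache === 'no-cache' && isHashedAsset(req.url)) {
    event.respondWith(fetch(new Request(req, { cache: 'default' })));
  }
});

// Hypothetical helper: recognize content-addressed asset URLs.
function isHashedAsset(url) {
  return /\/static\/.*-[0-9a-f]{8,}\./.test(new URL(url).pathname);
}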
(In reply to David Baron [:dbaron] ⌚️UTC-7 from comment #3)
> Seems like there are two different use cases for reload that are conflicting
> a bit here:
>  (1) get newer versions of resources whose cache validation times are still
> in the future, but that have (possibly) changed
>  (2) fix problems resulting from flaky network connections causing some
> resources to fail to load
> 
> It seems like this proposed change will break many cases of (1), which is
> actually a thing users reload pages for.  (e.g., users reload to get "new
> data", and the new data may well be in a subresource rather than the
> toplevel page -- would be interesting to see if we could figure out how
> often this is the case, although figuring that out seems hard).

My guess is that most sites have learned not to do this because many of their users haven't learned to use reload to update the data. As Chrome deploys this new policy, my guess is that sites will change their ways.

If we really want to support this use case, I think the threshold solution of 1 week preserves the existing use case.


> Another idea that would help with the (2) cases (although not the additional
> site load they cause) would be that if a reload results in a network error,
> but there was a valid cached copy, continue using the cached copy rather
> than having the behavior that results from a failed load.  (It might be
> tricky to define exactly what a network error is for this case, though. 
> It's also tricky for things that are partly loaded and then result in a
> network error, although I suppose we could continue to use the rest of
> cached copy as long as the incrementally-received data matches the beginning
> of the cached copy.)

The issue here is that if the user is on a poor-quality connection, issuing the revalidation requests is still likely to make the overall experience bad. How long would we allow revalidations to be outstanding: 10 seconds? That's already a terrible experience.

(In reply to Ben Kelly [:bkelly] from comment #4)
> I would also note that large sophisticated sites can set their own policy
> with service workers.  We expose Request.cache starting in FF48.  The
> service worker could see that Request.cache is 'no-cache' and override the
> resulting fetch() with a 'default' cache policy.
> 
> I'm not saying service workers are a replacement for a sane default policy,
> but the primitives are at least exposed now.

Fair enough. But long caching is such a common pattern that I think it's really important we get the defaults right. Facebook's data suggests that 10-20% of all requests to our CDN are spurious revalidations. We suspect that other sites are seeing similar rates. Even a 5% rate of revalidations is enough to be a big perf difference.
Could we make this a meta tag?  Then the document could easily opt in to no revalidation for sub resources without hurting smaller, less sophisticated sites.
That's an interesting data source. 

(In reply to Ben Maurer from comment #5)
> Facebook's data suggests that
> 10-20% of all requests to our cdn are spurious revalidations. We suspect
> that other sites are seeing similar rates. Even a 5% rate of revalidations
> is enough to be a big perf difference

Have you been able to measure the perf difference?
How did you measure the differences?
I'm asking because that would be useful if Firefox were to change something. 
Also as a comparison point to evaluate on other sites in different country markets. 

About:

> My guess is that most sites have learned not to do this because many of their users haven't learned to use reload to update the data. As chrome deploys this new policy my guess is that sites will change their ways.

This part makes me a bit nervous. From experience, sites don't necessarily update. There are tons of legacy issues, and sites do not behave the same depending on countries/servers. I agree that big active properties will probably update (with a lot of swearing at first from Web devs ;) ) but I'm less sure about the rest of the Web. So we really have to think about this carefully. 

More data would be super useful.
What Ben proposed in an email:


> What we are proposing is that:
> 
> If (resource is expired) {
>  Send revalidation
> } else if (request is main resource && resource is not expired && user initiated refresh) {
>  Send revalidation
> } else {
>  Use response from cache
> }
(In reply to Ben Kelly [:bkelly] from comment #6)
> Could we make this a meta tag?  Then the document could easily opt in to no
> revalidation for sub resources without hurting smaller, less sophisticated
> sites.

We'd be happy with this solution. That said, I think the best outcome here is if we can respect long cache times by default so that people see the impact of this across the web instantly.

My personal preference for a cautious solution is "only skip revalidation for resources with a lifetime > 1 week".

(In reply to Karl Dubost :karlcow from comment #7)
> That's an interesting data source. 
> 
> (In reply to Ben Maurer from comment #5)
> > Facebook's data suggests that
> > 10-20% of all requests to our cdn are spurious revalidations. We suspect
> > that other sites are seeing similar rates. Even a 5% rate of revalidations
> > is enough to be a big perf difference
> 
> Have you been able to measure the perf difference?
> How did you measure the differences?
> I'm asking because that would be useful if Firefox was to change something. 
> Also as a comparison point to evaluate on other sites in different country
> markets. 

~2 years ago there was a Chrome bug that caused them to send even more revalidations than they do today. Due to the nature of the bug we were able to work around it on our side. We did an A/B test. There was a substantial increase in overall perf on Facebook for people who got the workaround. In fact the site was so much faster that we saw a number of our usage/engagement numbers increase. We suspect that if browsers fix the remaining sources of revalidations we'll see another such bump.

The 10-20% number comes from our CDN logs -- about 10-20% (depending on browser) of requests to our CDN are revalidations. This means that if we address this issue, those requests will be served from the user's cache and won't cause latency.
If we do this, we'd certainly need to restrict this to things whose freshness lifetimes were explicitly set in the response headers, not things which have heuristic freshness lifetimes.

In other words, we'd need to change how necko handles some of the nsIRequest flags (or add new ones), not merely stop propagating to subresources the flags we propagate right now....
Ben, I have a specific question about the Facebook setup.  How often are the resources with 1 year freshness actually updated?  Is the answer just "never, when we need to change one of those we give it a different URL"?
Flags: needinfo?(ben.maurer)
(In reply to Boris Zbarsky [:bz] from comment #11)
> Ben, I have a specific question about the Facebook setup.  How often are the
> resources with 1 year freshness actually updated?  Is the answer just
> "never, when we need to change one of those we give it a different URL"?

We never change them. If we want a new resource we change the URL.
Flags: needinfo?(ben.maurer)
Also, do you know what Chrome is doing for URLs with heuristic freshness?  I can't tell from their issue tracker...
Flags: needinfo?(ben.maurer)
Summary: Do not send revalidation requests for subresources → Do not send revalidation requests for subresources on user initiated refresh
This is an interesting thread, thanks.

Up to 20% 304s is a good data point. For resources that never change, that's a significant waste.

However, I'm against adding more crazy heuristics to this space or changing historical behavior that other origins legitimately rely on.

So what we need is an explicit opt-in that we can implement, then standardize, and get the same behavior across clients. Ben suggests a meta tag - that's a good start, but this is a property of the subresource, not of the HTML. This is what Cache-Control extensions were made for: https://tools.ietf.org/html/rfc7234#section-5.2.3

Cache-Control: max-age=31536000, never-revalidate

I would implement that so that Firefox would never send a conditional request for that cached resource. We would always use the cached copy as long as it was fresh and we weren't concerned about the integrity of the cached copy[*].

What do you think of that proposal? I'd be happy to work with Chrome on it.

[*] The web in general is full of broken crap, including broken HTTP responses that have things like bogus content-lengths. We need to be lax in rendering and caching this stuff or the web doesn't work (we've tried enforcing the rules to our own detriment in the past). But of course this looks just like the truncated load from a real network error - so in those cases we would probably ignore never-revalidate when reload was pressed.
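A hedged sketch of the reload decision comment 14 describes, in illustrative JavaScript (the entry fields are placeholders, not the actual necko structures):

function shouldSendConditionalRequest(entry, userInitiatedReload) {
  if (entry.isExpired()) {
    return true;                 // stale: revalidate as usual
  }
  if (!userInitiatedReload) {
    return false;                // fresh, ordinary load: answer from the cache
  }
  // Fresh entry on a user reload: skip the conditional request only when the
  // response opted in and we have no reason to doubt the cached copy's
  // integrity (the [*] caveat above).
  return !(entry.cacheControl.includes('never-revalidate') && entry.looksIntact);
}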
The never-revalidate cache-control seems like a great idea to me.  Let the server be explicit about its intent for the resource.
(In reply to Boris Zbarsky [:bz] from comment #13)
> Also, do you know what Chrome is doing for URLs with heuristic freshness?  I
> can't tell from their issue tracker...

I asked the chrome team to provide some input here.

(In reply to Patrick McManus [:mcmanus] from comment #14)
> This is an interesting thread, thanks.
> 
> up to 20% 304's is a good data point. For resources that never change,
> that's a significant waste.
> 
> however, I'm against adding more crazy heuristics to this space or changing
> historical behavior that other origins legitimately rely on.
> 
> So what we need is an explicit opt in that we can implement and then
> standardize and get the same behavior across clients. Ben suggests Meta -
> that's a good start, but this is a property of the subresource not of the
> HTML. This is what cache-control extensions were made for
> https://tools.ietf.org/html/rfc7234#section-5.2.3
> 
> Cache-Control: max-age=31536000, never-revalidate

I think that's a great idea!

https://www.ietf.org/mail-archive/web/httpbisa/current/msg25463.html

The response was somewhat lukewarm on this thread. I think people are understandably unexcited about an extension that essentially boils down to "no I *really* mean my expiration date". One question I got asked privately was "what is the use case for max-age=1 year without a never-revalidate header". I don't think there is one. Changing a resource with a 1 year max age is a terrible user experience, so bad that I truly doubt that people do it. 

Would you guys like to help champion this solution? We're happy with the header, a meta tag, or a change in browser behavior.
(In reply to Ben Maurer from comment #16)
> The response was somewhat lukewarm on this thread. I think people are
> understandably unexcited about an extension that essentially boils down to
> "no I *really* mean my expiration date". One question I got asked privately
> was "what is the use case for max-age=1 year without a never-revalidate
> header". I don't think there is one. Changing a resource with a 1 year max
> age is a terrible user experience, so bad that I truly doubt that people do
> it. 

Yea, I can understand your frustration.

These questions seem weird to me, though.  There are plenty of servers out there that change resources under the same URL within the max-age.  It's trivial to reproduce this on github.io and probably other sites.

To me, what "never-revalidate" means is really "I'm using a unique URL for this resource and I will give you a new URL if it ever changes".  That's somewhat orthogonal to the max-age setting.

Saying that there is no use case for a long max-age without unique URLs does not mean servers won't do that.  I don't know why we would want a magic inference from the browser here when the server could just be explicit.

Martin, do you have any opinions here?
Flags: needinfo?(martin.thomson)
Thanks for unearthing the httpbis thread. A few thoughts based on that:

* I would change the name from never-revalidate to immutable. Mark makes the point that never-revalidate, as proposed, is really about browser behavior and therefore could live somewhere else... but we actually don't want this to be about prescribing behavior; we're trying to communicate a property of the resource - which is that it never changes. It's most applicable to caching policy, so a Cache-Control extension is a decent place to put it.

* As far as enthusiasm - nothing like running code and experience to focus the mind. That's why there is an extension mechanism. Just have to be open to modifying the approach based on evidence as it comes in.

* HTTP in general doesn't try and insist you use the most recent variant of a resource... there is nothing conceptually wrong with a resource that changes every day with a max-age of a year, and we shouldn't try and undermine that. But a new property like immutable that indicates this is not the case is certainly new information that will influence cache management policy.

Cache-Control: max-age=36500000, immutable

I think I'd work on that. We'd want to get an agreement with Chrome (if possible) and then get some deployment data, potentially leading to an I-D if we're all happy with it.

This would obviously also significantly improve page load time.
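For origins that want to experiment with this, a minimal Node.js sketch of emitting the header; the /static/ prefix and the one-year lifetime are illustrative assumptions, not part of this bug:

const http = require('http');

http.createServer((req, res) => {
  if (req.url.startsWith('/static/')) {
    // Content-addressed assets that never change under the same URL.
    res.setHeader('Cache-Control', 'public, max-age=31536000, immutable');
  }
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('hello');
}).listen(8080);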
We'd be happy to put in an immutable attribute. Sounds like the next step here is to get a patch that implements this so that we can test it out.

I'll talk with the chrome team about this idea. They're committed to solving this problem in some fashion.
Ah, sounds like Patrick has things in hand here.  Sorry for roping you in Martin.
Flags: needinfo?(martin.thomson)
I'm late to the thread here, but I agree with Patrick about adding "immutable".  (I'm torn--it would be really nice to have a solution that just rolls out w/o lots of server changes, like the "don't revalidate if expires in more than a week" heuristic.  But my gut says we'd break stuff, and in hard to debug ways.)
No worries Ben.  I'm across the issue.  I've offered Ben M my assistance in getting specs written when that happens.

I think that we should look at shipping this, but we might want to ignore the immutability thing under certain circumstances.  We could require stronger integrity mechanisms rather than etag as a signal that the origin is serious: e.g., https://tools.ietf.org/html/draft-thomson-http-mice
(In reply to Jason Duell [:jduell] (needinfo? me) from comment #21)
> I'm late to the thread here, but I agree with Patrick about adding
> "immutable".  (I'm torn--it would be really nice to have a solution that
> just rolls out w/o lots of server changes, like the "don't revalidate if
> expires in more than a week" heuristic.  But my gut says we'd break stuff,
> and in hard to debug ways.)

How about having an additional Cache-Control extension like "immutable-subresources" or some such which would mean that subresources for the response are all immutable?  And we can make it work with a <meta http-equiv> tag served in the main page.  With that, if the server wants to have only some subresources immutable it can be configured to use the immutable extension for those responses, otherwise you could just serve a <meta http-equiv="Cache-Control" content="max-age=36500000, immutable-subresources"> in their HTML markup, and that would satisfy the use case in comment 0 without any server configuration changes.
(In reply to :Ehsan Akhgari (busy, don't ask for review please) from comment #23)
> How about having an additional Cache-Control extension like
> "immutable-subresources" [...]

I don't have data to back this (Ben M might), but my intuition is that there are almost no pages that have all immutable subresources, so this wouldn't help much.  I don't think that the server changes are a barrier worth optimizing around.  In this case, we want the server folks to have to spend at least some effort to get into this state; after all, it would be bad if this were applied accidentally.

If people are concerned about breaking things permanently, we can build out a system that revalidates with a server occasionally until the server shows that they really mean it.  I believe that the same approach was proposed for HSTS: cap the time that the pin is respected for until the site demonstrates availability over time.  That will delay benefits, but allow for accidents to be rolled back relatively quickly in case of bustage.

That's a lot more work though, and I tend to think that Patrick's suggestion is probably enough.

BTW, the need for paranoia is justified, but it's not all that bad.  In case of bustage, servers can put the resource up at a new URI.  Changing a bunch of links might suck, but it's still technically possible to avoid a permanent break.
Saying all resources are immutable is a bit extreme. But I'd be OK with a meta tag that says "please respect our expiration date for subresources regardless of whether the user is refreshing".
Err, sorry, to clarify: what I meant was for "immutable-subresources" to mean that subresources with a max-age matching what's in the main page's Cache-Control header (perhaps with validity >= the max-age in that Cache-Control header) are considered immutable.  Subresources served with a lower max-age (or perhaps with no max-age at all) will be considered mutable.

IOW, if the main page is served with |Cache-Control: max-age=123456, immutable-subresources|, then responses for the subresources served with |Cache-Control: max-age=123456| and |Cache-Control: max-age=123457| etc will be considered immutable.

Does that make sense?
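A hedged sketch of the matching rule described just above, in illustrative JavaScript (the parsed-header shape is a placeholder, and the directive itself was only a proposal):

function subresourceIsImmutable(pageCacheControl, subresourceCacheControl) {
  // The main page must opt in with the proposed extension token.
  if (!pageCacheControl.directives.includes('immutable-subresources')) {
    return false;
  }
  // Subresources with a max-age at least as long as the page's are treated
  // as immutable; a shorter (or missing) max-age stays mutable.
  return typeof subresourceCacheControl.maxAge === 'number' &&
         subresourceCacheControl.maxAge >= pageCacheControl.maxAge;
}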
Seems a bit indirect for my liking.  A more explicit method would be to tie this to SRI, something we should do without the directive, I think.  But I'm not sure that folks want to add SRI headers to things like that.
SRI is not great here because we often don't know the hash of something (e.g. a photo on our CDN that we might dynamically resize)
> of bustage, servers can put the resource up at a new URI.

ack - or more simply, Ctrl+Shift+reload still reloads things unconditionally. As Ben points out, that's what you have to do to fix corruption right now anyhow when the etag is right but the message body isn't (e.g. truncation), which is the common failure mode anyhow. I wouldn't expect an accidental immutable to be more common than that.

I'm pretty confident that immutable is a property of the resource, not of the page referencing it. This creates a consistent story for things other than HTML, as well as for third-party combinations.
Assignee: nobody → mcmanus
Status: NEW → ASSIGNED
Summary: Do not send revalidation requests for subresources on user initiated refresh → Cache-Control: immutable
Whiteboard: [necko-active]
Flags: needinfo?(ben.maurer)
Attached patch cache-control: immutable 2/3 (obsolete) — Splinter Review
Attached image with-immutable.png
Attached image without-immutable.png
I'm separately going to work on some documentation here..
Attachment #8749412 - Flags: review?(honzab.moz)
Attachment #8749413 - Flags: review?(honzab.moz)
Attachment #8749415 - Flags: review?(honzab.moz)
Attachment #8749413 - Attachment is obsolete: true
Attachment #8749413 - Flags: review?(honzab.moz)
Attachment #8749805 - Flags: review?(honzab.moz)
Comment on attachment 8749412 [details] [diff] [review]
only revalidate strongly framed responses 1/3

Review of attachment 8749412 [details] [diff] [review]:
-----------------------------------------------------------------

honza, the notion is that you revalidate for one of two reasons

1] to get a newer representation of the resource
or 2] to fix corruption

This part of the patch is aimed at the fact that revalidation via If-Modified-Since or If-None-Match doesn't really help with #2.. it just ensures you have the right etag/Date cached locally.. given that the common corruption is truncation, you just get back a 304 confirming your meta data is fine without confirming the data at all.

So if we don't have a good indication that corruption has not occurred (good indications include Content-Length headers that match, or chunked encodings that terminate correctly, or end-stream bits found on h2/spdy... the patch calls this strongly framed) then skip revalidation and just transfer the whole thing.

This is a valid thing to do independent of immutable, but we only want to honor immutable in the case of strongly framed https responses, which is why it's in this patch series
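A hedged sketch of the "strongly framed" check described above, in illustrative JavaScript (field names are placeholders, not the real nsHttpTransaction members):

function isStronglyFramed(response) {
  // A Content-Length that matches the bytes actually received.
  if (response.contentLength !== undefined) {
    return response.contentLength === response.bytesReceived;
  }
  // A chunked transfer coding that saw its terminating zero-length chunk.
  if (response.transferEncoding === 'chunked') {
    return response.sawFinalChunk === true;
  }
  // An h2/spdy stream that carried its end-stream bit.
  if (response.h2EndStreamSeen === true) {
    return true;
  }
  // Anything else (e.g. delimited by connection close) is weakly framed,
  // so revalidation is skipped in favor of a full transfer.
  return false;
}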
(In reply to Patrick McManus [:mcmanus] from comment #39)
> Comment on attachment 8749412 [details] [diff] [review]
> only revalidate strongly framed responses 1/3
> 
> Review of attachment 8749412 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> honza, the notion is that you revalidate for one of two reasons
> 
> 1] to get a newer representation of the resource
> or 2] to fix corruption
> 
> This part of the patch is aimed at the fact that revalidation via
> If-Modified-Since or If-None-Match doesn't really help with #2.. it just
> ensures you have the right etag/Date cached locally.. given that the common
> corruption is truncation, you just get back a 304 confirming your meta data
> is fine without confirming the data at all.
> 
> So if we don't have a good indication that corruption has not occurred (good
> indications include Content-Length headers that match, or chunked encodings
> that terminate correctly, or end-stream bits found on h2/spdy... the patch
> calls this strongly framed) then skip revalidation and just transfer the
> whole thing.
> 
> This is a valid thing to do independent of immutable, but we only want to
> honor immutable in the case of strongly framed https responses, which is
> why it's in this patch series

I am not sure we need this.  When a _cached_ response is incomplete (content-length doesn't match the size of the cached data) it's always also resumable.  Then we always do a range request, for which immutable must be ignored - I haven't taken a look at whether these patches already contain that bypass.

If we end up with an incomplete _network_ response that is not resumable (the set of conditions like must have an etag, content-length and accept byte ranges + some others have not all been met) we doom the cached entry immediately in OnStopRequest.  Hence such a response can never be reused from the cache.

In other words, I think resumability of a response implies strong framing already.  But I need to double check on that first.
It sounds like you're saying we wouldn't cache a response delimited via Connection: close at all. I don't *think* that's true - but I guess we should double-check. And if we don't, shouldn't we? (I'm just talking about normal transaction hit reuse here - not resuming.)
Comment on attachment 8749412 [details] [diff] [review]
only revalidate strongly framed responses 1/3

Review of attachment 8749412 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +3562,5 @@
> +        nsXPIDLCString buf;
> +        rv = entry->GetMetaDataElement("strongly-framed", getter_Copies(buf));
> +        // describe this in terms of explicitly weakly framed so as to be backwards
> +        // compatible with old cache contents which dont have strongly-framed makers
> +        bool weaklyFramed = NS_SUCCEEDED(rv) && buf.EqualsLiteral("0");

should this be backwards?  I'd more tend to treat content w/o this meta as weakly framed.  Should be NS_FAILED(rv) || buf.Equals('0')

::: netwerk/protocol/http/nsHttpChannel.h
@@ +569,5 @@
> +
> +    // if the http transaction was performed (i.e. not cached) and
> +    // the result in OnStopRequest was known to be correctly delimited
> +    // by chunking, content-length, or h2 end-stream framing
> +    bool mStronglyFramed;

could this be added to the bit fields we keep here?

::: netwerk/protocol/http/nsHttpTransaction.h
@@ +282,5 @@
>      // after the main thread modifies it. To deal with raciness, only unsetting
>      // bitfields should be allowed: 'lost races' will thus err on the
>      // conservative side, e.g. by going ahead with a 2nd DNS refresh.
>      Atomic<uint32_t>                mCapsToClear;
> +    Atomic<bool>                    mResponseIsComplete;

I think you are OK with just ReleaseAcquire ordering here.
Attachment #8749412 - Flags: review?(honzab.moz) → review+
Attachment #8749805 - Flags: review?(honzab.moz) → review+
Comment on attachment 8749415 [details] [diff] [review]
cache-control: immutable  tests 3/3

Review of attachment 8749415 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/xpcshell/moz-http2/moz-http2.js
@@ +715,5 @@
>      Serializer.prototype._transform = newTransform;
>    }
>  
> +  // for use with test_immutable.js
> +  else if (u.pathname === "/immutable-test1") {

some more descriptive names would be good here.
Attachment #8749415 - Flags: review?(honzab.moz) → review+
The build failure was because I reordered a member variable per honza's review comment; my smoketest compiler didn't enforce that.
Flags: needinfo?(mcmanus)
Sorry for the late reply.

On the Chrome side, we still want to continue exploring changing the reload behavior.

Namely, scope down to 2 user facing reload behaviors (keep current options as is for developers).

 - reload: validate the main resource, proceed with a regular load for everything else (i.e. respect max-age, don't proactively validate seemingly fine resources).

 - hard reload. Sidenote: it's currently missing on Chrome for Android; our UX team wants to avoid obscure gestures (e.g. long-press the reload button, pull harder to hard reload...) and instead just make it work.


We are thinking about metrics to determine how much breakage/frustration this leads to and have several options to consider in response. I'll be happy to share what we learn.

Our interest in Immutable is currently low but might change depending on what we all learn.
The main reason for the lack of interest is that the impact on the loading user experience will be inherently smaller.
The test failed in comment 50 because it depends on node and xpcshell.ini did not have the skip-if !hasNode annotation.
Flags: needinfo?(mcmanus)
https://hg.mozilla.org/mozilla-central/rev/1c91c9a5fdba
https://hg.mozilla.org/mozilla-central/rev/06a3edfb00cd
https://hg.mozilla.org/mozilla-central/rev/546b3763d1b6
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
Keywords: DevAdvocacy
Whiteboard: [necko-active] → [necko-active], DevAdvocacy-Facebook
platform-rel: --- → ?
Whiteboard: [necko-active], DevAdvocacy-Facebook → [necko-active], DevAdvocacy-Facebook [platform-rel-Facebook]
Can someone please tell me why immutable caches don't do validations? 

https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpChannel.cpp#3740
ziyunfei, see Comment #18
(In reply to Karl Dubost :karlcow from comment #58)
> ziyunfei, see Comment #18

What I want to ask is why an immutable cache entry doesn't send an If-Modified-Since/If-None-Match header after its max-age expires? 

e.g. Cache-Control: max-age=3600, immutable

Within an hour, no HTTP requests are sent. After an hour, it should be a 304 response, not a 200, I think.
(In reply to ziyunfei from comment #59)
> (In reply to Karl Dubost :karlcow from comment #58)
> > ziyunfei, see Comment #18
> 
> What I want to ask is why an immutable cache entry doesn't send an
> If-Modified-Since/If-None-Match header after its max-age expires? 
> 
> e.g. Cache-Control: max-age=3600, immutable
> 
> Within an hour, no HTTP requests are sent. After an hour, it should be a
> 304 response, not a 200, I think.

mcmanus, is this the intended behavior? I think immutable resources do need revalidation when they expire.
Flags: needinfo?(mcmanus)
(In reply to ziyunfei from comment #60)

> mcmanus, is this intended behavior, I think immutable resources do need
> revalidation when they expire.

By being marked immutable the server has declared that the resources do not change, therefore we don't use conditional revalidation (what does If-Modified-Since mean for something that is never modified?). It's assumed the reload is being done for coherency/corruption purposes, so a full reload is done; a 304 tends to be destructive to coherency checks (i.e. the etag/Last-Modified is correct but the stored response body is not).

do you have a use case in mind? I would expect the standard immutable use would have a very large max-age.
Flags: needinfo?(mcmanus)
(In reply to Patrick McManus [:mcmanus] from comment #61)
> By being marked immutable the server has declared that the resources do not change,
By being marked immutable the server has declared that the resources do not change *before the end of the declared max-age period*.

> I would expect the standard immutable use would have a very large max-age.
Well, I would also expect that: 1 year, even 10 years (longer-lived than the user's computer), because these are the use cases Cache-Control: immutable was invented for. In that case it never expires anyway, so 200 or 304 doesn't matter. Unfortunately this is not mandatory; people may want a short max-age like 1 day or 1 week. Maybe W3C's site would use this kind of short max-age (https://github.com/w3c/tr-design/issues/101), and maybe other sites would too.

So I don't think "immutable" should "never-revalidate".
(In reply to ziyunfei from comment #62)
> (In reply to Patrick McManus [:mcmanus] from comment #61)
> > By being marked immutable the server has declared that the resources do not change,
> By being marked immutable the server has declared that the resources do not
> change *before the end of the declared max-age period*.

no - that is not correct. They declare they do not change - they are immutable.
I think that the confusion here is that the max-age is still being used.  But the point is that max-age is only being used to evict something from cache.  At that point, there is still a promise that the content is the same, it's just no longer useful to retain that information.
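A hedged restatement of the behavior described in the last few comments, in illustrative JavaScript (placeholder names, not the real necko code): max-age still governs freshness and eviction, while immutable only suppresses conditional revalidation.

function decideLoad(entry, userInitiatedReload) {
  if (!entry.isExpired() && (entry.immutable || !userInitiatedReload)) {
    return { source: 'cache' };                      // fresh copy used as-is
  }
  if (entry.immutable) {
    // Stale or reloaded immutable entry: fetch the whole body again rather
    // than sending If-None-Match/If-Modified-Since, since a 304 would only
    // vouch for the metadata, not the stored bytes.
    return { source: 'network', conditional: false };
  }
  return { source: 'network', conditional: true };   // normal revalidation
}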
Blocks: 563297
QA Whiteboard: https://bugs.chromium.org/p/chromium/issues/detail?id=654378
Hi, I have two questions about this topic.

1) Is it possible to use "immutable" and still take the resource's expiration time into account?
There will be a large max-age but, when it expires, can we discard the sub-resource?

2) How many sites are using this token beyond Facebook?
I notice that some of Facebook's affiliated sites (like newsroom.fb.com or insights.fb.com) aren't using this token for static objects like their icon.
FYI: "Chrome no longer forces validation on reload, so immutable shouldn't be necessary." from https://bugs.chromium.org/p/chromium/issues/detail?id=611416#c12
Comment 9 is interesting too on the Chromium bug.

> (With Firefox and `Cache-Control: immutable` on the 
> BBC Web site) we're now seeing ~31% fewer 304's served 
> from our CDN edge on a subset of our static assets 
> (which have immutable enabled).
https://bugs.chromium.org/p/chromium/issues/detail?id=611416#c9

That's a huge win.
See Also: → 1324644
The fact that (during a reload) Firefox requests many resources it was previously told to cache is a problem specific to Firefox.  The problem should be solved in Firefox, and shouldn't require the whole Internet to adopt a new cache-control header when the headers they already return are sufficient.

Chrome solved this problem in the browser, without requiring every website to change.  When reloading, only resources that were cached according to a heuristic (e.g. the last-modified time) should be conditionally fetched.  Resources which a server *explicitly* indicated are still valid should be loaded from the cache.

Even if websites adopted this workaround, there are still broken scenarios.  For resources which periodically change, "immutable" wouldn't apply, and Firefox will still ignore the expiration indicated by the site during a reload.
FYI: Safari just implemented Cache-control: immutable, also want to implement Chrome's new reload behavior https://bugs.webkit.org/show_bug.cgi?id=167497
It seems this isn't working for me using Firefox 52 (64-bit) on Ubuntu 16.04. We added "immutable" to the Cache-Control header of static assets (using unique URLs) in our app, but when hitting F5 it doesn't hit the cache; it hits the server instead, which returns a 304.

1) I open Firefox devtools

2) I open a page from our web application on localhost (http://127.0.0.1)

3) I hit CTRL-F5 to completely re-initialize the cache. These are the response headers we get for a PNG file on the page:
--------------------
Accept-Ranges: bytes
Cache-Control: public, max-age=31536000, immutable
Content-Length: 1061
Content-Type: image/png
Date: Sun, 12 Mar 2017 20:31:08 GMT
Etag: "8fe2e7043abc31ed723e8f2f5addcf79"
Last-Modified: Sat, 02 Jul 2016 00:09:04 GMT
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-Permitted-Cross-Domain-Policies: master-only
X-XSS-Protection: 1; mode=block
content-security-policy: default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline' https://fonts.googleapis.com; font-src 'self' https://fonts.gstatic.com
--------------------

4) A couple of seconds later I hit ONLY the F5 key. We now hit the server, which returns a 304, instead of just hitting the cache.

Why is that?
Could it be that
* Some response headers interfere with each other?
* or I have to wait some time before the mechanism takes effect and an F5 request hits the cache?
* or is it because I was testing against http://127.0.0.1?
* or could it be because we do overwrite window.onunload?
* or the immutable flag isn't enabled for all in FF52 on Linux?
* or devtools lie to me?

Any help would be appreciated, thank you!
(In reply to m.kurz from comment #70)
> It seems this isn't working for me using Firefox 52 (64 bit) on Ubuntu
> 16.04. We added "immutable" to the Cache-Control header to static assets
> (using unique urls) in our app but when hitting F5 it doesn't hit the cache
> but the server instead which returns a 304.
> 
> 1) I open Firefox devtools

Do you have the cache disabled in DevTools?  You'll probably want to ensure that DevTools -> gear icon -> Advanced -> Disable HTTP cache is unchecked for this test.
I found out why it didn't work:

It looks like the immutable flag is only taken into account over https. http -> asks the server, https -> hits the cache.

Why does it (only) work via https but not via http?
Is that on purpose?
(In reply to m.kurz from comment #73)
> Why does it (only) work via https but not via http?
> Is that on purpose?

See the end of comment 39.  Sounds like this is intentional because https reduces the chance that corrupt data is being stored in the cache.
Got it, thank you very much for your help!
Add a reload policy where only expired subresources are revalidated
https://bugs.webkit.org/show_bug.cgi?id=169756
Regressions: 1812505