Closed Bug 1083922 Opened 10 years ago Closed 10 years ago

Firefox 33 failing to load images, css, etc. from local cache based on cache-control max-age

Categories

(Core :: Networking: Cache, defect)

33 Branch
x86
All
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla36
Tracking Status
firefox33 --- wontfix
firefox34 + verified
firefox35 + verified
firefox36 + verified
firefox-esr31 --- unaffected
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed

People

(Reporter: adam.gresh, Assigned: michal)

References

()

Details

(Keywords: regression, testcase)

Attachments

(3 files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.101 Safari/537.36

Steps to reproduce:

Only tracked to server side, not yet able to reproduce on client.  


Actual results:

Monitoring for number of 200s vs. 304s shows number of 200s for same css has seen 5x increase correlated with adoption of FF 33 by users.  
Date | % FF33 Users | 304s | 200s
Tues. | .5% | 1851 | 283
Wed. | 4.6% | 2349 | 314
Thr. | 14.9% | 10067 | 299

This increase is only for user agents with Firefox/33.0, e.g. "Mozilla/5.0 (Windows NT 6.1; rv:33.0) Gecko/20100101 Firefox/33.0" in the load balancer logs.

Cache in our case is configured in web.xml and can be viewed at client using firebug...
HTTP/1.1 200 OK
Date: Thu, 16 Oct 2014 17:10:40 GMT
Server: Apache/2.2.23 (Amazon)
Last-Modified: Thu, 02 Oct 2014 00:10:46 GMT
Accept-Ranges: bytes
Content-Length: 1178
Cache-Control: max-age=2592000
Expires: Sat, 15 Nov 2014 17:10:40 GMT
Keep-Alive: timeout=15, max=96
Connection: Keep-Alive
Content-Type: image/png

	<filter>
		<filter-name>Cache</filter-name>
		<filter-class>com.m2.web.filters.CacheFilter</filter-class>
		<init-param>
			<param-name>cache</param-name>
			<param-value>public, max-age=2592000</param-value>
		</init-param>
	</filter>


Expected results:

css and image files should be served from local cache based on cache header rules
Could you attach HTTP log of a page with this issue, please.
https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Component: Untriaged → Networking: Cache
Flags: needinfo?(adam.gresh)
Product: Firefox → Core
Both logs were captured from firefox running on windows 7.
Jonathan Smith has attached the requested log files.

Steps to reproduce:  occurs when upgrading from FF32 to FF33 on windows machines.  A clean install off FF33 will not show the same behavior.

On a windows machine running FF32...

Visit any website that caches images and watch the network output.  You will see that on FF32 any given image will be retrieved from the server first with a 200 response and then will be served from the browser's cache for any subsequent requests.  Upgrade to FF33 and repeat.  Images will always be requested from the server.  Depending on server configuration first request will be sent image and a 200 response and subsequent requests will be sent a 304.  

There doesn't seem to be an upgrade path on Mac, only a new install so not able to reproduce on OSX.
Flags: needinfo?(adam.gresh)
Severity: normal → blocker
OS: Mac OS X → All
Priority: -- → P1
Please, don't play with the flags, this part is set by the devs. Ty.
Severity: blocker → normal
Priority: P1 → --
We're serving our static content from apache, so here is our cache config from httpd.conf...

#Expires Headers
ExpiresActive On
# expire images after a month in the client's cache
ExpiresByType image/gif A2592000
ExpiresByType image/png A2592000
ExpiresByType image/jpg A2592000
ExpiresByType image/jpeg A2592000
ExpiresByType image/x-icon A2592000
ExpiresByType image/vnd.microsoft.icon A2592000
ExpiresByType image/svg+xml A2592000
ExpiresByType text/css A2592000
ExpiresByType text/javascript A2592000
ExpiresByType application/javascript A2592000
ExpiresByType application/x-javascript A2592000
ExpiresByType application/json A2592000
Release 32 turned on new v2 cache...

https://bugzilla.mozilla.org/show_bug.cgi?id=913806

...could something in release 33 have had an impact? e.g....

https://bugzilla.mozilla.org/show_bug.cgi?id=1013587

...?
Could you provide a public testcase? like URL?
And if you can't, install the tool mozregression (see http://mozilla.github.io/mozregression/ for details) to find a potential regression range.
FF33 nightlies started in June 2014, so --good=2014-06-01 is a good start.
No need to bisec, jus stop when you get the pushlog in the console.
Flags: needinfo?(adam.gresh)
Here's more data to show this bug is happening.

Below I'm looking at my Apache logs for static images for today.   I'm grepping out a GET to our css file.  This file should be gotten once and then cached.  It should be cached locally and not getting to us after the first GET.

Below, I'm looking at the unique JSESSION ID's and counting them up, and only returning the count if the it happened more than 2 times.

For Chrome, this bug is not happening at all.

For Firefox/31, this bug is not happening at all.

For Firefox/32, this bug happened a couple times, but that could also be a user who's turned off caching.

For Firefox/33, this bug has happened 659 times.

[root@XXXXXXX httpd]# cat static-ssl-access_log.2014-10-17.txt |grep "Chrome" | grep "GET /ema/static/css/ema.css" | egrep -o "JSESSIONID=[^;]*;" | sort| uniq -c | awk -vthreshold=2 '$1 > threshold'  | wc -l
0

[root@XXXXXXX httpd]# cat static-ssl-access_log.2014-10-17.txt |grep "Firefox/31" | grep "GET /ema/static/css/ema.css" | egrep -o "JSESSIONID=[^;]*;" | sort| uniq -c | awk -vthreshold=2 '$1 > threshold'  | wc -l
0

[root@XXXXXXX httpd]# cat static-ssl-access_log.2014-10-17.txt |grep "Firefox/32" | grep "GET /ema/static/css/ema.css" | egrep -o "JSESSIONID=[^;]*;" | sort| uniq -c | awk -vthreshold=2 '$1 > threshold'  | wc -l
9

[root@XXXXXXX httpd]# cat static-ssl-access_log.2014-10-17.txt |grep "Firefox/33" | grep "GET /ema/static/css/ema.css" | egrep -o "JSESSIONID=[^;]*;" | sort| uniq -c | awk -vthreshold=2 '$1 > threshold'  | wc -l
659
Steps to reproduce with a live url.

Test url: https://pod02test.ema.md/ema/static/css/ema.css?build=4.0.0.3.5&rev=65558

On windows install fresh copy of 32.0.3 

Go to the test url and once loaded reload the same url several times.  After the first request the server will respond with 304 on subsequent requests.

Restart firefox to allow the upgrade to 33 to take place.

Again go to the test url.  Reload the file several times, each reload should get a 200 response code from the server.

Then clear the browser cache.

Again go to the test url.  Reload the file several times, after the first load server will respond with 304 response code.
We believe that the problem is cache corruption when upgrading from 32 to 33.
Flags: needinfo?(adam.gresh)
(In reply to Jonathan Smith from comment #12)
> Steps to reproduce with a live url.
> 
> Test url:
> https://pod02test.ema.md/ema/static/css/ema.css?build=4.0.0.3.5&rev=65558

This file is just plain text. Is it normal?
(In reply to Loic from comment #14)

The file is a css file from our application.  We chose this to eliminate any variability from javascript or anything else.  It very clearly shows the change in request behavior when firefox is upgraded from 32 to 33 and that if you then clear your cache firefox then returns to normal behavior.

Can I get anymore information for you from our side that may be helpful?

> (In reply to Jonathan Smith from comment #12)
> > Steps to reproduce with a live url.
> > 
> > Test url:
> > https://pod02test.ema.md/ema/static/css/ema.css?build=4.0.0.3.5&rev=65558
> 
> This file is just plain text. Is it normal?
The best would be to have a simple self-contained testcase, like HTML file with CSS sheets.
Hi Loic,

The use of a single text file takes all of the variability out of the problem and seems better suited to testing a caching issue.  By retrieving only a single file that has no other content it greatly reduces the complexity and number of possibilities for the cause and isolates it down to retrieving and caching a single file with no other dependencies.

Using this test case we were able to see that the client sends the initial request and gets a 200 for the single file from the server.  The server's response headers show that the file should be cached by the browser.  Entering the same address into the URL bar and hitting enter again will show that the client again goes to the server for the file and gets a 200.

Could you explain why you would like us to introduce more complexity into the use case so that we might better understand your needs?
(In reply to Loic from comment #18)

Video of the bug being reproduced.

http://youtu.be/4xTx3nq78Ao


> What does this URL show?
> https://pod02test.ema.md/ema/static/css/ema.css?build=4.0.0.3.5&rev=65558
The URL I needed is https://pod02test.ema.md/ema/Login.action :)
I'm gonna test on my side.
That testcase is laborious, the user needs to install each FF33 nightly and upgrade to the latest one (FF36 at the moment), then check the network activity. I'm too lazy to do that tonight.

If you want to do that, you can download each Nightly from the Mozilla's FTP (repo. mozilla-central).
FF33 Nighties started somewhere in mid-June: http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/06/
Cc'ing Alice, maybe my teammate has a better idea than me to test that easily and find a regression range
The easiest way to test this bug it to install r32.   Hit the css file listed above in this thread.  Let it cache.  Reload a couple times, then upgrade to r33.

Then try again, and you'll see you are getting 200's instead of cache hits (or 304's).  Every time you reload you'll get another 200 (full GET of the file).

Then if you clear your cache and try again, the first connection will give you a 200.  And everything thereafter will be a Cache or 304.

So the problem is for people who have upgraded from r32 to r33 and have not cleared their cache. Those users are not getting the benefit of any caching.
The issue is not to test the reproduction of the testcase, but to find the regression range by searching the 1st nightly where the issue is present. Each day Mozilla releases a Nightly with daily bugfixes.
(In reply to Loic from comment #24)

Will each nightly version upgrade to the next or will it automatically jump to the most recent?

> The issue is not to test the reproduction of the testcase, but to find the
> regression range by searching the 1st nightly where the issue is present.
> Each day Mozilla releases a Nightly with daily bugfixes.
I've started working on it will post back where i stop.
(In reply to Jonathan Smith from comment #26)
> I've started working on it will post back where i stop.

No luck with my approach went back to 31 but it always wants to update to 34 and it always has the issue... need to look at some other way of doing this.
[Tracking Requested - why for this release]: Regression, due to upgrade Firefox to version 33 from version 32


Regression window(m-i)
Good:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7231daefbb28
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0 ID:20140703232503
Bad:
https://hg.mozilla.org/integration/mozilla-inbound/rev/240a0b085725
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0 ID:20140703234337
Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=7231daefbb28&tochange=240a0b085725

Regressed by:
240a0b085725	Harsh Pathak — Bug 643041 - Merge nsIX509Cert2 and nsIX509Cert3 into nsIX509Cert, and merge nsIX509CertDB2 into nsIX509CertDB. r=keeler
Blocks: 643041
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(dkeeler)
Bug 643041 indirectly changed the representation of TransportSecurityInfo objects stored in the cache. When Firefox is upgraded, the new code can't read the old objects and should throw them away and re-fetch the associated resource. This should only happen once per resource, though. If the cache isn't storing new entries, that's definitely undesirable behavior.
Flags: needinfo?(dkeeler)
NI? on the Zack from bug 643041 to see if he can look into this regression, tracking up to Beta (34).
Flags: needinfo?(zackw)
I'm afraid I do not have time right now to do any Firefox work.

The observation that merging some C++ interfaces changes the on-disk representation of objects in the cache fills me with horror.  What on earth are we doing that would have that effect?
Flags: needinfo?(zackw)
Patrick - you're the module owner so over to you for triage and prioritizing.
Flags: needinfo?(mcmanus)
this is impt. honza is out of office for the rest of the week, but :michal also worked on the cache and in the team meeting today he agreed to take a look at this
Flags: needinfo?(mcmanus)
This is not an ideal solution because the first load just dooms the entry and a second load is needed to cache the resource. It seems to me that it's not easy to do this in a single load.
Assignee: nobody → michal.novotny
Attachment #8509135 - Flags: review?(mcmanus)
I did find a bug that feels related to this one, but goes in the opposite direction where requests to an API are being cached and not requested again for a long time. I discovered this while working on soundcloud and our acceptance tests failing in firefox. The too heavy caching starts with Firefox 33 and is still broken on 34 beta and 35 alpha. 32 and below are fine.

Steps to reproduce: 

* Go to https://soundcloud.com
* login - only then should the requests not be cached
* Open your network panel and filter XHR only to see best
* go to any user profile (just as an example, problem exists in general)
* First request has many requests to our API
* Consecutive request have almost no requests although server response indicates no caching (see my screenshots)

Screenshots: https://www.dropbox.com/sh/mh88v2csqxx437j/AADCH-E_I2eCOP0I1YpBn7hGa?dl=0

This is a very pressing issue. I imaging others are having it too. Is there anything on our side we can do?
(In reply to peter from comment #35)
> I did find a bug that feels related to this one, but goes in the opposite
> direction where requests to an API are being cached and not requested again
> for a long time.

Hi Peter - please file a new bug in this category for your issue. There would be no way to track the resolution of 2 different issues in one bug.
Comment on attachment 8509135 [details] [diff] [review]
patch v1 - doom the entry when security info cannot be parsed

Review of attachment 8509135 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm .. what versions is this needed on?
Attachment #8509135 - Flags: review?(mcmanus) → review+
Alright Patrick, I filed a new bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1087320
> lgtm .. what versions is this needed on?

Patch in bug 643041 landed on 33, so I guess we want to uplift this to aurora and beta.
Comment on attachment 8509135 [details] [diff] [review]
patch v1 - doom the entry when security info cannot be parsed

Approval Request Comment
[Feature/regressing bug #]: 643041
[User impact if declined]: reading from entries with security info in the old format fails, but the entries are not removed so we always need to fetch from the server
[Describe test coverage new/current, TBPL]: tested manually
[Risks and why]: probably none
[String/UUID change made/needed]: none
Attachment #8509135 - Flags: approval-mozilla-beta?
Attachment #8509135 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/5230ba4082a9
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
Comment on attachment 8509135 [details] [diff] [review]
patch v1 - doom the entry when security info cannot be parsed

Beta+
Aurora+
Attachment #8509135 - Flags: approval-mozilla-beta?
Attachment #8509135 - Flags: approval-mozilla-beta+
Attachment #8509135 - Flags: approval-mozilla-aurora?
Attachment #8509135 - Flags: approval-mozilla-aurora+
Flags: qe-verify+
I was able to reproduce the original issue on Win 7 x64, by updating 32.0.3 to 33.0.2 and reloading https://pod02test.ema.md/ema/static/css/ema.css?build=4.0.0.3.5&rev=65558. 200 was received upon every single reload of the page.

I've verified that 304 is received in the same scenario when updating 32.0.3 to:
- Firefox 34 beta 5 - BuildID=20141030172027
- Latest Firefox 35 Aurora - BuildID=20141031004003
- Latest Firefox 36 Nightly - BuildID=20141031030201

testing was done on Windows 7 x64 (I don't think testing is necessary on Linux and Mac as well).
I have the same bug...
Using this on my website dev sesssion to stop caching :
https://css-tricks.com/can-we-prevent-css-caching/
(In reply to vb078 from comment #47)
> I have the same bug...
> Using this on my website dev sesssion to stop caching :
> https://css-tricks.com/can-we-prevent-css-caching/

This bug is fixed, if you have a different issue, please, file a new bug with steps to reproduce.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: