Closed Bug 465529 Opened 16 years ago Closed 16 years ago

Netscaler flush for AMO

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

All
Other
task
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: wenzel, Assigned: oremj)

Details

Yesterday's push hasn't completely made it into the netscaler: Fashion Your Firefox's CSS file is out of date.

Please flush that for us, will you? If you can selectively flush, get AMO/css/*, otherwise all of AMO.

Thanks. It's being presented in NYC at the moment, so we should get this out ASAP.
Flushing but I'm surprised it didn't age itself out of the cache.  Max cache age is 3600 seconds unless you specified some crazy cache-age.

Not even sure how to test, assuming it's fixed.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Thanks a lot. To test:

https://addons.mozilla.org/en-US/firefox/fashionyourfirefox/

should look like

https://preview.addons.mozilla.org/en-US/firefox/fashionyourfirefox/

Which it doesn't :( The good news is, the NS was not at fault. Bad news: We don't know who is serving the old (pre-push) CSS file.
What makes it seem like the NS is that it looks right when you're logged in to AMO and have the AMOv3 cookie which bypasses the NS cache for a logged-in user.
Is there a specific file I can test against with curl?
Sorry, I need to reopen:

https://preview.addons.mozilla.org/css/style.min.css?19906

is served with the following headers:
https://addons.mozilla.org/css/style.min.css?19906



GET /css/style.min.css?19906 HTTP/1.1
Host: addons.mozilla.org
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.4) Gecko/2008102920 Firefox/3.0.4
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.7,de;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: UTF-8,*
Keep-Alive: 300
Connection: keep-alive
Cookie: __utma=150903082.174226280.1213422549.1226005181.1226005650.18; __utmz=150903082.1224270483.16.8.utmccn=(organic)|utmcsr=google|utmctr=get+firefox+3|utmcmd=organic; __utma=164683759.1806091917.1213674996.1226986837.1227019833.328; __utmz=164683759.1226720602.320.77.utmccn=(referral)|utmcsr=bugzilla.mozilla.org|utmcct=/show_bug.cgi|utmcmd=referral; dloadday=128.237.229.48.1214256165114786; __utmc=164683759; AMOappName=firefox; __utmb=164683759
Pragma: no-cache
Cache-Control: no-cache



HTTP/1.x 200 OK
Date: Tue, 18 Nov 2008 03:36:26 GMT
Expires: Fri, 16 Nov 2018 03:37:20 GMT
Cache-Control: max-age=315360000
Content-Length: 11572
Connection: Keep-Alive
Via: NS-CACHE-6.0:   4
Etag: "c7da"
Server: Apache/2.2.3 (Red Hat)
Last-Modified: Tue, 04 Nov 2008 21:30:08 GMT
Accept-Ranges: bytes
ntCoent-Length: 51162
Content-Type: text/css
Content-Encoding: gzip



The Last-Modified header is not supposed to be November 4th. Yes, it has a crazy-long max-age, but we need to get that file out of the cache somehow :-/
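For scale, the `Cache-Control: max-age=315360000` value in the response above works out to ten 365-day years, which is consistent with the 2018 date in the `Expires` header. A quick check in Python, using only the header values quoted above:

```python
from datetime import datetime, timedelta, timezone

MAX_AGE_SECONDS = 315360000  # from the Cache-Control header above

# Value of the Date header in the response quoted above.
served = datetime(2008, 11, 18, 3, 36, 26, tzinfo=timezone.utc)

lifetime = timedelta(seconds=MAX_AGE_SECONDS)
days = lifetime.days              # whole days in the freshness lifetime
years = days / 365                # the same span in 365-day years
fresh_until = served + lifetime   # until then a client may reuse its copy

print(days, years, fresh_until.date())
```

So any browser that fetched the stale file may keep reusing it without asking the server again until late 2018, unless the URL changes.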
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Oh, well, my comment is a little late. Specific test would be the Last-Modified header. If it's not November 4th, it worked.
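The Last-Modified test described above could be scripted roughly as follows. This is only a sketch: it works on a raw header block (for example, saved `curl -I` output), the function names are made up for illustration, and the push date is an assumption based on "yesterday's push".

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def last_modified(raw_headers: str):
    """Return the Last-Modified value of a raw header block, or None if absent."""
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "last-modified":
            return parsedate_to_datetime(value.strip())
    return None


def is_stale(raw_headers: str, pushed_at: datetime) -> bool:
    """True if the served file predates the push (or has no Last-Modified)."""
    modified = last_modified(raw_headers)
    return modified is None or modified < pushed_at


# Header values taken from the response quoted earlier in this bug.
STALE_RESPONSE = (
    "HTTP/1.1 200 OK\n"
    "Last-Modified: Tue, 04 Nov 2008 21:30:08 GMT\n"
    "Content-Type: text/css\n"
)
PUSH_TIME = datetime(2008, 11, 17, tzinfo=timezone.utc)  # assumed push date

print(is_stale(STALE_RESPONSE, PUSH_TIME))
```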
I am not sure if you did anything additional, but the Last-Modified header is showing November 18th now, and it is now magically working the way it is supposed to.

Thanks! For the future, though, we should find out what caused this, so we can avoid it going forward.
Status: REOPENED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
I think the ? parameter is causing unique entries to be cached.

root@nslb02# nscachemgr -a | grep addons.mozilla.org | grep style.min.css
0x000000018166571c5bd2  DEFAULT GET     //addons.mozilla.org:443/css/style.min.css?19906
0x00000003ea8c572ca27b  DEFAULT GET     //versioncheck.addons.mozilla.org:443/css/style.min.css?19906
0x00000007d8965736d540  DEFAULT GET     //addons.mozilla.org:443/css/style.min.css?17704
0x000000086a755709a4cd  DEFAULT GET     //addons.mozilla.org:443/css/style.min.css?18878
0x0000000ca86f573b48f9  DEFAULT GET     //addons.mozilla.org:443/css/style.min.css?19426
0x0000000e1f43572759df  DEFAULT GET     //addons.mozilla.org:443/css/style.min.css?19143
0x0000000e5856554df35c  DEFAULT GET     //addons.mozilla.org:443/css/style.min.css?19465

oremj, any idea how to make those non-unique?
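To make the listing above easier to read, here is a small Python sketch that groups the cached entries by host and path and collects the query strings seen for each. It assumes the fourth whitespace-separated column of the `nscachemgr -a` output is the cached URL, as it appears to be in the listing; `group_by_path` is a hypothetical helper name.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Entries copied from the nscachemgr output above.
LISTING = """\
0x000000018166571c5bd2  DEFAULT GET     //addons.mozilla.org:443/css/style.min.css?19906
0x000000086a755709a4cd  DEFAULT GET     //addons.mozilla.org:443/css/style.min.css?18878
0x00000003ea8c572ca27b  DEFAULT GET     //versioncheck.addons.mozilla.org:443/css/style.min.css?19906
0x00000007d8965736d540  DEFAULT GET     //addons.mozilla.org:443/css/style.min.css?17704
"""


def group_by_path(listing: str) -> dict:
    """Map (host, path) to the list of query strings cached for it."""
    groups = defaultdict(list)
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        url = urlsplit(fields[3])  # scheme-relative URL: //host:port/path?query
        groups[(url.hostname, url.path)].append(url.query)
    return dict(groups)


groups = group_by_path(LISTING)
for (host, path), revisions in groups.items():
    print(host, path, sorted(revisions))
```

Each revision number shows up as its own entry because the query string is part of the cached URL.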
Assignee: server-ops → oremj
Severity: blocker → normal
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Shouldn't they be unique?  Is addons still on the 15k?
It is on the 15k and I don't know.
Sorry, I am jumping in to this bug late.  What issues currently need to be addressed to fix this bug?
Can you un-mark this as a private bug, please? Then we can discuss how to avoid this in the future.
Group: infra
(In reply to comment #11)
> Sorry, I am jumping in to this bug late.  What issues currently need to be
> addressed to fix this bug?

What I think *seems* to have happened is:
- during push, the data is rsynced. The PHP files had already made it over, but the CSS files hadn't yet.
- a user performs a GET request for /css/style.min.css?19906 (due to the updated PHP files)
- the app node fetches the *OLD* CSS file from the file system, and it is cached for eternity.
- rest of the update is performed
...
- => all users get an old CSS file disguised as the new revision.
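The suspected sequence above can be reproduced with a toy model. This is not how the netscaler is implemented; `UrlCache` is a made-up stand-in for any cache that keys on the full URL and never expires entries (approximating the 10-year max-age).

```python
class UrlCache:
    """Toy cache keyed on the full URL; entries never expire."""

    def __init__(self, origin):
        self.origin = origin  # callable: path -> current file content
        self.store = {}

    def get(self, url):
        if url not in self.store:
            path = url.split("?", 1)[0]  # the origin ignores the revision number
            self.store[url] = self.origin(path)
        return self.store[url]


# File system of an app node halfway through the push: CSS not yet synced.
files = {"/css/style.min.css": "old css"}
cache = UrlCache(lambda path: files[path])

# The updated PHP already references revision 19906.
first = cache.get("/css/style.min.css?19906")

# The rest of the push arrives.
files["/css/style.min.css"] = "new css"

later = cache.get("/css/style.min.css?19906")   # still the cached old file
bumped = cache.get("/css/style.min.css?19907")  # a new revision number misses the cache

print(first, later, bumped)
```

In this model the only way to get the new file to clients is to change the URL, which is why bumping the revision number helps and a cache flush alone does not.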


To fix it, we need to make sure this does not happen again.
- One way would be to routinely perform a netscaler flush (for AMO anyway) after a push has happened completely. That'd also get rid of the old cached revisions that are not needed anymore.
- or we make a push atomic: That is, the rsync goes to a different directory, and only when all is in place on all app nodes, a switch is flipped (symlink switched, for example) to make all new files available at once.

Oremj: What do you think is the best (and least painful for IT) way to approach this?
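The atomic-push option above could look roughly like the following on a single app node. This is a sketch under assumptions: the release directory names and the `current` link are hypothetical, it relies on POSIX rename semantics, and it does not coordinate the switch across nodes.

```python
import os
import tempfile


def activate_release(release_dir: str, current_link: str) -> None:
    """Point current_link at release_dir with a single atomic rename (POSIX)."""
    tmp_link = current_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # clean up a leftover from an interrupted run
    os.symlink(release_dir, tmp_link)
    # rename(2) swaps the link itself, so readers see either the old
    # release or the new one, never a half-synced mix of both.
    os.replace(tmp_link, current_link)


# Demo with two throwaway release directories.
root = tempfile.mkdtemp()
for name, css in (("r19906", "old css"), ("r19907", "new css")):
    os.makedirs(os.path.join(root, name, "css"))
    with open(os.path.join(root, name, "css", "style.min.css"), "w") as fh:
        fh.write(css)

current = os.path.join(root, "current")
activate_release(os.path.join(root, "r19906"), current)
activate_release(os.path.join(root, "r19907"), current)

with open(os.path.join(current, "css", "style.min.css")) as fh:
    served = fh.read()
print(served)
```

The web server's document root would point at the `current` link, and the rsync would always target a fresh release directory that is not yet being served.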
(In reply to comment #13)
> To fix it, we need to make sure this does not happen again.
> - One way would be to routinely perform a netscaler flush (for AMO anyway)
> after a push has happened completely. That'd also get rid of the old cached
> revisions that are not needed anymore.

Wil made the point that this won't work b/c the users' caches will still hang on
to the files because the expiry time is 10 years.

Looks like atomic updates are the best solution?
Also, we need to push a new update soon, just to increment that CSS number, because all people who visited the site this morning are now stuck with the outdated file forever.
Since git is used to push out to the cluster, the updates are pretty atomic.  My guesses at what happened are:

* The auto commit script ran halfway through the svn up, so only a group of the files were synced out at first and then the rest 5 or so min later.
* The old CSS file was served for around one second, but was still cached by the netscaler/clients.
Status: REOPENED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Verified FIXED; Fashion Your Firefox and AMO are both fine, now.
Status: RESOLVED → VERIFIED
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard