Closed
Bug 562171
Opened 14 years ago
Closed 14 years ago
Drumbeat.org web site experiencing intermittent outages
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: mthompson2000, Assigned: justdave)
Details
* The Drumbeat.org site is presently down. As of approx 5:10 pm ET
* The site was also down earlier today from approx 1:10 pm ET to 1:20pm ET.
* Checked with developers at Trellon -- they said it looks like a server issue on our side.
Reporter
Comment 1•14 years ago
* It now appears to be back up as of 5:20pm ET
* We appear to be experiencing temporary outages today. Let us know if there's any testing, maintenance, or other potential cause.
Severity: critical → normal
Reporter
Updated•14 years ago
Summary: Drumbeat.org web site is down → Drumbeat.org web site experiencing intermittent outages
Reporter
Comment 2•14 years ago
* Drumbeat.org is down again at 10:15am ET
* Confirmed on http://downforeveryoneorjustme.com/drumbeat.org
* Trellon developers say this isn't an issue on their side -- allege it's a server issue
* This is obviously a big deal for us. Status report appreciated.
Severity: normal → critical
Updated•14 years ago
Assignee: server-ops → shyam
Comment 3•14 years ago
Seems like there's something overloading the box. I can't even log in. It's happened at around the same time over the last 2 days and it recovers on its own. Do you guys have a cron job or something running around this time?
Comment 4•14 years ago
Also, it seems like pm-drumbeat01 is a VM. What kind of traffic have you been seeing on the site of late?
Comment 5•14 years ago
Shyam: stats are here:
https://metrics.mozilla.com/awstats/bin/awstats.pl?config=drumbeat

Traffic went through the roof on the 23rd of April. We're now getting up to 600,000 hits a day. That, not coincidentally, is the day we started linking to drumbeat.org from a Firefox start page snippet.

Gerv
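For a rough sense of scale, the 600,000 hits a day quoted above can be converted to an average request rate. This is only a back-of-the-envelope sketch: it assumes hits are spread evenly over 24 hours, which real traffic never is, so peaks would be well above this average.

```python
# Back-of-the-envelope conversion of the daily hit count quoted above
# into an average per-second rate. Assumes a perfectly even spread
# over the day, which understates real peaks.
HITS_PER_DAY = 600_000          # figure quoted in this comment
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_hits_per_second(hits_per_day: int) -> float:
    """Return the mean request rate for a given daily hit count."""
    return hits_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    rate = average_hits_per_second(HITS_PER_DAY)
    print(f"{rate:.2f} hits/second on average")
```

That works out to roughly seven requests per second on average, every one of them hitting a single VM.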
Comment 6•14 years ago
That level of exposure makes the site going down a blocker, IMO.

Gerv
Severity: critical → blocker
Comment 7•14 years ago
No wonder. I'm seeing massive spikes on the CPU/Memory graphs on the VM. Does the site have searchable stuff? How scalable is that? It looks to me like someone is able to search for stuff and hammer the site/DB and bring the VM to its knees. Also, how scalable is the app? I don't think we can keep running this on a single VM for long if we are seeing legitimate traffic.
Comment 8•14 years ago
08:20:39 up 45 days, 17:15, 1 user, load average: 213.68, 218.22, 220.95
Mem: 2075392k total, 2025504k used, 49888k free, 1996k buffers
Swap: 2097144k total, 1977336k used, 119808k free, 117276k cached

I'm going to have to reboot to recover from this. The machine's used up all its RAM and swap and is practically dead.
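The figures above can be turned into utilisation percentages to show how close the VM was to exhaustion. The sketch below is illustrative only: it parses the two memory summary lines exactly as pasted in this comment, and the function name `parse_usage` is made up for the example.

```python
import re

# The two memory summary lines exactly as pasted in this comment.
SAMPLE = """\
Mem: 2075392k total, 2025504k used, 49888k free, 1996k buffers
Swap: 2097144k total, 1977336k used, 119808k free, 117276k cached
"""

# Captures the label, the total, and the used figure from each line.
LINE_RE = re.compile(r"^(Mem|Swap):\s+(\d+)k total,\s+(\d+)k used", re.MULTILINE)


def parse_usage(text: str) -> dict[str, float]:
    """Return the fraction used (0.0 to 1.0) for each of Mem and Swap."""
    usage = {}
    for label, total, used in LINE_RE.findall(text):
        usage[label] = int(used) / int(total)
    return usage


if __name__ == "__main__":
    for label, fraction in parse_usage(SAMPLE).items():
        print(f"{label}: {fraction:.1%} used")
```

With both RAM and swap above 90% used and a load average over 200, the machine had essentially nothing left to give, which matches the decision to reboot.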
Comment 9•14 years ago
Rebooted the VM, site is back online. Looking to see what the problem could have been on IRC.
Comment 10•14 years ago
pingers is looking into this. They're planning to apply the 6.16 update to Drupal and hope that will fix the issue here. Since they're taking point on this, nothing more from IT's side at this point.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Reporter
Updated•14 years ago
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 12•14 years ago
(In reply to comment #5)
> Shyam: stats are here:
> https://metrics.mozilla.com/awstats/bin/awstats.pl?config=drumbeat
>
> Traffic went through the roof on the 23rd of April. We're now getting up to
> 600,000 hits a day. That, not coincidentally, is the day we started linking to
> drumbeat.org from a Firefox start page snippet.

Why would we do that without notification to IT? Who approved doing that? drumbeat's not on any infrastructure that can handle load like that. It's a single server, no redundancy.

This has to be pulled from the start page now.
Comment 13•14 years ago
MRZ: good point, and my fault, as I authorized it. Clearly I didn't think through the limitations or downstream implications.

In terms of solutions, my understanding is we can't easily pull from the start page -- it takes days or weeks for Google to react to our requests. My proposal is that we create a page w/ similar content on www.mozilla.org and then redirect there. We've put snippets at mozilla.org in the past and it's been fine. Does this solve the problem from IT's perspective?

Matt and David, possible to get something up relatively quickly?

FWIW, it should be easy to avoid mistakes like this once Paul starts next week. We'll have someone on our team to push risks and limitations like this in front of my face before I pull the trigger.

In any case, sorry for creating this fire drill for you guys.
Assignee
Comment 14•14 years ago
(In reply to comment #13)
> In terms of solutions, my understanding is we can't easily pull from the start
> page -- takes days or weeks for google to react to our requests.

In my understanding, we have direct control over these because they're on our server and Google syncs them relatively frequently. jslater or pascalc would know for sure.
Reporter
Comment 15•14 years ago
David Boswell and I are working on creating that page for re-direct now. Should have an ETA shortly.
Comment 16•14 years ago
We have control over snippet creation, as they are stored in our SVN repo, but little control over the syncing with Google's servers. We can have snippets changed only once a month. I can change the snippet in our SVN repo in a matter of hours, but somebody in MV will have to make a special request to Google to have it published, since we already used our April update.

Another solution would be to redirect the requested page to a temporary static HTML page; that would probably remove a lot of the load (no MySQL, no PHP).
Reporter
Comment 17•14 years ago
* We are creating a temporary page at mozilla.org/causes/subtitles now
* ETA for having that page live is 1:30pm PT
* Will need a re-direct from drumbeat.org/universal-subtitles to mozilla.org/causes/subtitles
Severity: critical → blocker
Status: REOPENED → NEW
Reporter
Comment 18•14 years ago
* We have some temporary placeholder text up now at: http://www.mozilla.org/causes/subtitles.html "Universal Subtitles coming soon! please check back later."
* Please enable the re-direct now. Will have proper HTML page up there soon.

re-direct from: drumbeat.org/project/universal-subtitles
to: mozilla.org/causes/subtitles
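The requested redirect amounts to a single path-to-URL mapping. The sketch below expresses it as a tiny WSGI application using only Python's standard calling convention. It illustrates the rule and is not how the redirect was actually configured on the server (this bug does not show that configuration). The `REDIRECTS` table and the `app` function are names invented for the example.

```python
# Hypothetical rule table for this sketch: request path -> redirect target.
REDIRECTS = {
    "/project/universal-subtitles": "http://www.mozilla.org/causes/subtitles",
}


def app(environ, start_response):
    """Answer known paths with a 302 and everything else with a 404."""
    # Treat "/project/universal-subtitles/" the same as the bare path.
    path = environ.get("PATH_INFO", "").rstrip("/")
    target = REDIRECTS.get(path)
    if target is not None:
        start_response("302 Found", [("Location", target), ("Content-Length", "0")])
        return [b""]
    body = b"Not handled by this sketch\n"
    start_response("404 Not Found", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

For manual testing, `app` could be served locally with `wsgiref.simple_server.make_server`. A redirect answered this way never touches PHP or MySQL, which is why it takes the load off the box.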
Severity: blocker → critical
Comment 19•14 years ago
Done.

trinity:~ shyam$ curl -L -I -H "Host: www.drumbeat.org" http://pm-drumbeat01.mozilla.org/project/universal-subtitles
HTTP/1.1 302 Found
Date: Wed, 28 Apr 2010 19:31:52 GMT
Server: Apache
Location: http://www.mozilla.org/causes/subtitles
Content-Type: text/html; charset=iso-8859-1

HTTP/1.1 200 OK
Date: Wed, 28 Apr 2010 19:31:53 GMT
Server: Apache
Content-Location: subtitles.html
Vary: negotiate
TCN: choice
X-Powered-By: PHP/5.2.9
Cache-Control: max-age=900
Expires: Wed, 28 Apr 2010 19:46:53 GMT
X-Backend-Server: pm-web01
Content-Type: text/html; charset=UTF-8
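A check like the one above can be automated by parsing the header blocks that `curl -L -I` prints, one per hop. The sketch below is a hedged illustration: it works on saved output text rather than making a live request, and `parse_hops` is a name made up for this example.

```python
def parse_hops(raw: str) -> list[tuple[int, dict[str, str]]]:
    """Split `curl -L -I` output into (status_code, headers) per hop."""
    hops = []
    # curl prints CRLF line endings; normalise them before splitting.
    text = raw.replace("\r\n", "\n").strip()
    for block in text.split("\n\n"):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines or not lines[0].startswith("HTTP/"):
            continue
        status = int(lines[0].split()[1])
        headers = {}
        for line in lines[1:]:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        hops.append((status, headers))
    return hops


# Abbreviated copy of the output pasted in this comment.
SAMPLE = """\
HTTP/1.1 302 Found
Server: Apache
Location: http://www.mozilla.org/causes/subtitles

HTTP/1.1 200 OK
Server: Apache
Content-Type: text/html; charset=UTF-8
"""

if __name__ == "__main__":
    hops = parse_hops(SAMPLE)
    first_status, first_headers = hops[0]
    print(first_status, first_headers.get("location"))
    print("final status:", hops[-1][0])
```

Run periodically, a check along these lines would flag the moment the first hop stops returning a 302 to the expected `Location`.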
Status: NEW → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
Comment 20•14 years ago
If you need anything else, either file a new bug or reopen and re-assign to server-ops. Past 0330 here and I'm going to crash.
(In reply to comment #14)
> (In reply to comment #13)
> > In terms of solutions, my understanding is we can't easily pull from the start
> > page -- takes days or weeks for google to react to our requests.
>
> In my understanding, we have direct control over these because they're on our
> server and Google syncs them relatively frequently. jslater or pascalc would
> know for sure.

http://www.google.com/firefox is most certainly *not* on our servers; were you thinking of our first run or what's new pages?
Comment 22•14 years ago
FWIW, I'm not seeing the redirect yet. Not sure if it takes a while to propagate to a bunch of servers?
Reporter
Comment 23•14 years ago
* The re-direct seemed to be working earlier, but no longer appears to be working.
* When I click on the link in the start page snippet, still takes me to http://www.drumbeat.org/project/universal-subtitles/
* Still need the snippet to re-direct to http://www.mozilla.org/causes/subtitles
Assignee: shyam → server-ops
Severity: critical → normal
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee
Comment 24•14 years ago
(In reply to comment #21)
> (In reply to comment #14)
> > In my understanding, we have direct control over these because they're on
> > our server and Google syncs them relatively frequently. jslater or pascalc
> > would know for sure.
>
> http://www.google.com/firefox is most certainly *not* on our servers; were you
> thinking of our first run or what's new pages?

The directory that Google syncs them from is on our servers, and is what I was referring to. But it's moot, because Google still has to be told to sync them if it's not staged for the beginning of the month.

(In reply to comment #23)
> * The re-direct seemed to be working earlier, but no longer appears to be
> working.

Looks like he changed it locally on the box instead of in puppet, and puppet changed it back on the next pass.
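The revert described here is the expected behaviour of configuration management: on each run the agent compares the machine with the declared state and overwrites anything that differs, so a hand edit survives only until the next run. The toy model below illustrates that loop in Python. It is not Puppet code or Puppet's API; `declared`, `on_disk`, and `enforce` are invented for the example, and the file name and contents are placeholders.

```python
def enforce(declared: dict[str, str], on_disk: dict[str, str]) -> list[str]:
    """Overwrite any managed file whose content differs from the declared
    state, and return the names of the files that were changed."""
    changed = []
    for name, wanted in declared.items():
        if on_disk.get(name) != wanted:
            on_disk[name] = wanted
            changed.append(name)
    return changed


if __name__ == "__main__":
    # Central, declared state: no redirect has been added yet.
    declared = {"drumbeat.conf": "no redirect"}
    on_disk = dict(declared)

    # Someone edits the file by hand on the box.
    on_disk["drumbeat.conf"] = "redirect to mozilla.org"
    enforce(declared, on_disk)
    print(on_disk["drumbeat.conf"])  # the hand edit has been reverted

    # The durable fix is to change the declared state instead.
    declared["drumbeat.conf"] = "redirect to mozilla.org"
    enforce(declared, on_disk)
    print(on_disk["drumbeat.conf"])
```

The second half of the example mirrors the eventual fix in this bug: the change sticks once it is made in the managed source rather than on the box.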
Assignee: server-ops → justdave
Assignee
Comment 25•14 years ago
OK, redirect should be working again now (pushed via puppet this time).
Status: REOPENED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
Matt: is this fixed? Have you noticed any subsequent outages? Netcraft doesn't have uptime availability for drumbeat.org. Thanks!
Reporter
Comment 27•14 years ago
No outages since the re-direct took effect. Thanks guys! :)
Verified FIXED per comment 27.
Status: RESOLVED → VERIFIED
Updated•9 years ago
Product: mozilla.org → mozilla.org Graveyard