Closed
Bug 701489
Opened 14 years ago
Closed 14 years ago
Restart devmo wiki
Categories
(mozilla.org Graveyard :: Server Operations, task)
mozilla.org Graveyard
Server Operations
Tracking
(Not tracked)
RESOLVED
INCOMPLETE
People
(Reporter: sheppy, Assigned: nmaul)
Details
Looks like at least one of the hosts is broken; please restart them all and see if they come back to life. Thanks!
Comment 1 (Assignee) • 14 years ago
Done.
Assignee: server-ops → nmaul
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Comment 2 (Reporter) • 14 years ago
Something more serious must be afoot; it's not responding reliably again already. Lots of "Service unavailable" errors and broken connections. Someone needs to figure out what's wrong.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 3 (Assignee) • 14 years ago
I'm not having any luck replicating this.
Is it the django or deki portion that's failing? Always a certain URL that fails, or any element?
What kind of Service Unavailable page are you getting? Bold red lettering (Zeus), or the more normal Apache kind?
Comment 4 (Reporter) • 14 years ago
Bold red. It's not happened for about 10 minutes now.
Comment 5 (Assignee) • 14 years ago
We've added this cluster to our ganglia performance monitoring/graphing system. If this happens again we may have better data on it.
Severity: critical → major
Status: REOPENED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → INCOMPLETE
Comment 6 (Reporter) • 14 years ago
Having this happen again right now.
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Comment 7 (Reporter) • 14 years ago
This is coming and going in waves, where it'll not work at all for a few minutes, then work fine for a while, then stop working again. It's as if some service is dying and being restarted after a while (that's the feeling I get, not some special knowledge I have).
It's making getting work done very difficult, so bumping the urgency a bit here.
Severity: major → critical
Comment 8 (Assignee) • 14 years ago
In other bugs this was determined to be a problem with the database server tm-b01-master01. Its load got extremely high due to heavy disk I/O wait, caused by an unrelated database on the same server. Since this is more thoroughly documented in other bugs, I'll close this one back out.
The TL;DR is: we're investigating what can be done to mitigate this situation. One (highly recommended) improvement would be to make use of the slave database server(s) for this cluster for read queries. The slave was not affected by this issue, and would have been far faster to respond.
For the record, I don't see any significant issues reported by ganglia for the actual dekiwiki cluster, so I believe all is well there. This appears to have been purely a database concern.
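As a rough illustration of the read-query suggestion above, here is a minimal sketch of read/write splitting, assuming it would be applied to the Django portion of the site. The class follows the method names of Django's database-router interface as found in current Django releases; the class name and the aliases "default" (master) and "replica" (slave) are hypothetical and not taken from this cluster's actual configuration.

```python
import random


class ReadReplicaRouter:
    """Send reads to a replica and writes to the master.

    Hypothetical sketch: the aliases below are assumptions and would
    have to match entries in the application's database settings.
    """

    primary_alias = "default"        # assumed alias for the master
    replica_aliases = ("replica",)   # assumed alias(es) for the slave(s)

    def db_for_read(self, model, **hints):
        # Spread read queries across the configured replicas.
        return random.choice(self.replica_aliases)

    def db_for_write(self, model, **hints):
        # All writes go to the master so replication stays one-way.
        return self.primary_alias

    def allow_relation(self, obj1, obj2, **hints):
        # Master and replicas hold the same data, so relations are allowed.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Schema changes run on the master only and replicate from there.
        return db == self.primary_alias
```

In a Django project a router like this would be listed in the `DATABASE_ROUTERS` setting. One trade-off to keep in mind: a replica can lag behind the master, so a read issued immediately after a write may return stale data.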
Status: REOPENED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → INCOMPLETE
Updated • 11 years ago
Product: mozilla.org → mozilla.org Graveyard