Closed
Bug 432518
Opened 16 years ago
Closed 16 years ago
geodns deployment
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mrz, Assigned: mrz)
References
Details
tracking
Assignee
Updated • 16 years ago
Assignee: server-ops → nobody
Component: Server Operations → Server Operations: Projects
Flags: needs-downtime+
Whiteboard: Tuesday 2008/06/03 @ 8pm
Assignee
Updated • 16 years ago
Assignee: nobody → mrz
Assignee
Updated • 16 years ago
Whiteboard: Tuesday 2008/06/03 @ 8pm → Tuesday 2008/05/27 @ 8pm
Assignee
Comment 1 • 16 years ago
Punting to Dave for Nagios. We need checks for MySQL, replication (geodns02), and processes (named, mysql).
Assignee: mrz → justdave
Assignee
Comment 2 • 16 years ago
I bailed on this last night for a few reasons:
1. No Nagios monitoring.
2. Release_Lag monitor: not sure how it works or what it would do after changing RRs.
3. Didn't tell oncall this was happening, and didn't want to spring this change on oncall @ 10pm.
Component: Server Operations: Projects → Server Operations
Whiteboard: Tuesday 2008/05/27 @ 8pm → needs-monitoring
Assignee
Comment 3 • 16 years ago
Release_Lag monitor isn't a problem:
    my ($name, $aliases, $addrtype, $length, @addrs) = gethostbyname('releases.mozilla.org');
will continue to work and grab everything. We may need to change that at some point to make sure it grabs all the global mirrors, but that could be done as part of bug 406267.
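For illustration, the same kind of lookup can be sketched outside the monitor. This is a minimal Python sketch, not the Release_Lag code: `socket.gethostbyname_ex` is the standard-library counterpart of Perl's list-context `gethostbyname`, while the helper name `mirror_addresses` and its `resolver` parameter are hypothetical. It only sees the addresses the local resolver answers with, which is why a geo-aware zone might later need a different way to enumerate every mirror.

```python
import socket


def mirror_addresses(hostname, resolver=socket.gethostbyname_ex):
    """Return every IPv4 address the resolver hands back for hostname.

    `resolver` is a hypothetical injection point (so the lookup can be
    stubbed); by default it is the real standard-library call, which
    returns a (canonical_name, alias_list, ip_address_list) tuple.
    """
    _name, _aliases, addrs = resolver(hostname)
    # De-duplicate and sort so successive runs are easy to compare.
    return sorted(set(addrs))


if __name__ == "__main__":
    # Performs a live DNS lookup; the result depends on where it runs.
    for addr in mirror_addresses("releases.mozilla.org"):
        print(addr)
```

Passing a stub as `resolver` makes it possible to exercise the helper without touching DNS.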
Assignee
Comment 4 • 16 years ago
13:08 <@nagios> geodns01.sj:DNS is CRITICAL: CRITICAL - Plugin timed out after 4 seconds

The DNS check tries to resolve www.mozilla.com, but those two boxes are set with recursion off. Turn it on? Make a new check?
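One way to approach a new check is to ask each box for a name it is authoritative for, with recursion explicitly off. The sketch below is an illustration in Python under stated assumptions, not the Nagios plugin that was actually deployed: it shells out to `dig`, and the function names, the 4-second default, and the injectable `run` parameter are all hypothetical.

```python
import subprocess

OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def dig_command(server, name, timeout=4):
    """Build a non-recursive A-record query aimed at one specific server."""
    return ["dig", f"@{server}", name, "A",
            "+norecurse", "+short", f"+time={timeout}", "+tries=1"]


def check_authoritative(server, name, timeout=4, run=subprocess.run):
    """Return (exit_code, message) in the Nagios plugin convention.

    `run` is a hypothetical injection point so the check can be
    exercised without a live name server.
    """
    try:
        result = run(dig_command(server, name, timeout),
                     capture_output=True, text=True, timeout=timeout + 5)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return CRITICAL, f"CRITICAL - dig failed: {exc}"
    # dig prints diagnostics as ';;' comment lines; keep only real answers.
    answers = [line.strip() for line in result.stdout.splitlines()
               if line.strip() and not line.startswith(";")]
    if result.returncode != 0 or not answers:
        return CRITICAL, f"CRITICAL - {server} returned no A record for {name}"
    return OK, f"OK - {name} resolves to {', '.join(answers)}"
```

Because the query names a record in the server's own zone, the check does not depend on recursion being enabled.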
Comment 5 • 16 years ago
The test was changed to let us specify a host to test in the service definition. I've finished double-checking all of the service checks: everything that's still red in Nagios is a real problem, and the tests themselves are okay.

named on geodns02 is indeed not resolving releases.geo.mozilla.com, probably because MySQL isn't replicating, so it hasn't picked up the record yet. Nagios can't connect to the server to check MySQL because the ACLs that allow Nagios in haven't replicated either. I'm betting that fixing the replication problem will resolve the other two as well.
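Since both remaining alerts trace back to replication, a check on the slave's replication threads would point at the root cause directly. A minimal Python sketch, assuming the check is fed the text output of MySQL's `SHOW SLAVE STATUS\G`; the function names and the 300-second lag threshold are hypothetical, and this is not the check that was deployed here.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def parse_slave_status(text):
    """Parse the vertical (\\G) output of SHOW SLAVE STATUS into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_replication(text, max_lag=300):
    """Return (exit_code, message) in the Nagios plugin convention."""
    fields = parse_slave_status(text)
    if not fields:
        return CRITICAL, "CRITICAL - no slave status returned"
    io_thread = fields.get("Slave_IO_Running")
    sql_thread = fields.get("Slave_SQL_Running")
    if io_thread != "Yes" or sql_thread != "Yes":
        return CRITICAL, f"CRITICAL - IO thread: {io_thread}, SQL thread: {sql_thread}"
    lag = fields.get("Seconds_Behind_Master", "NULL")
    if not lag.isdigit():
        return CRITICAL, f"CRITICAL - replication lag unknown ({lag})"
    if int(lag) > max_lag:
        return WARNING, f"WARNING - slave is {lag}s behind master"
    return OK, f"OK - replication running, {lag}s behind master"
```

An empty result is treated as critical on purpose: a box that is supposed to be a slave but reports no slave status is not replicating.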
Assignee: justdave → mrz
Whiteboard: needs-monitoring
Comment 6 • 16 years ago
Restored replication on geodns02. It required a full MySQL re-init: it seems the geodns01 /etc/my.cnf somehow lost the log-bin line, so binary logging wasn't running anymore. Anyway, this is resolved and tested, and you can now set up Nagios alerts against it if you want, as it should all be good.
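A dropped log-bin line is easy to miss until the slave falls over, so a small config check could flag it early. The following is a hedged Python sketch, assuming the file starts with a section header and keeps its replication settings under `[mysqld]`; the function name is hypothetical. It inspects the file only, so the running server's `log_bin` variable remains the authoritative answer.

```python
import configparser


def binary_logging_enabled(my_cnf_text):
    """Return True if the [mysqld] section of a my.cnf sets log-bin.

    Assumes settings live under [mysqld]; options pulled in through
    include directives are not followed.
    """
    parser = configparser.ConfigParser(
        allow_no_value=True,   # my.cnf allows bare options such as "log-bin"
        strict=False,          # tolerate repeated sections and options
        interpolation=None,    # treat '%' in values literally
    )
    parser.read_string(my_cnf_text)
    if not parser.has_section("mysqld"):
        return False
    # MySQL accepts both log-bin and log_bin spellings.
    options = {name.replace("_", "-") for name in parser.options("mysqld")}
    return "log-bin" in options


if __name__ == "__main__":
    with open("/etc/my.cnf") as handle:
        print("log-bin set:", binary_logging_enabled(handle.read()))
```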
Comment 7 • 16 years ago
Nagios alerts were already there; they were just all red before. :) DNS is still red:
https://nagios.mozilla.org/nagios/cgi-bin/status.cgi?host=geodns02.nl
Is named talking to MySQL correctly?
Assignee
Comment 8 • 16 years ago
Deployed. Thanks!
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Updated • 9 years ago
Product: mozilla.org → mozilla.org Graveyard